Games for generating data
March 6th, 2007

On his new blog, Luke Biewald writes about the use of online games to generate data, as in ESP Game, Peekaboom, and Phetch. He lists two new games, Free-Association and Categorilla, that focus on generating data for natural language processing.

I would absolutely love it if approaches like David’s games started to generate large-scale, established corpora used in natural language processing research to train and evaluate models. There is something so much more satisfying about using a data set collected as a byproduct of a real task than the typical corpus hand-labeled by linguistics graduate students. But it’s hard to find a naturally occurring task where you can isolate a single AI problem, such as extracting specific semantic relationships. (Although the work by my friend Rion and others on extracting semantic relationships from raw text is very cool as well.)
