That's AMAZING! I was just thinking about using Neo4j to store some extracted n-grams. I previously did it with a SQLite database, but with a graph an application could perhaps move between nodes more efficiently.

One question: is it possible to download the Google n-gram corpus release (or at least part of it) for free (and legally, of course)? So far I've only found this page (http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2006T13), but it seems I would have to pay.

Cheers,
Jacopo Farina

2011/11/28 Peter Neubauer <peter.neuba...@neotechnology.com>

> Seriously cool stuff René!
>
> I would love to hear more as the project progresses! Also, maybe the
> dataset could be added to the example dataset collection for playing
> around with neo4j? WDYT?
>
> Cheers,
>
> /peter neubauer
>
> GTalk: neubauer.peter
> Skype peter.neubauer
> Phone +46 704 106975
> LinkedIn http://www.linkedin.com/in/neubauer
> Twitter http://twitter.com/peterneubauer
>
> http://www.neo4j.org - NOSQL for the Enterprise.
> http://startupbootcamp.org/ - Öresund - Innovation happens HERE.
>
> 2011/11/27 René Pickhardt <r.pickha...@googlemail.com>
>
> > Hey everyone,
> >
> > I am currently advising two high school students on a programming
> > project for a German student competition.
> >
> > They have inserted the German Google n-gram data set (several GB of
> > natural language) into a Neo4j database and used it to make sentence
> > predictions to improve typing speed.
> >
> > The entire project is far from complete, but there is some code
> > available showing how we modelled n-grams in Neo4j and what we used
> > for prediction.
> >
> > Both approaches are very basic and work as you would expect. Still,
> > they already work decently, showing again the power of Neo4j.
> >
> > We would be happy to receive feedback, thoughts, and suggestions for
> > further improvement. Find more info in my blog post:
> >
> > http://www.rene-pickhardt.de/download-google-n-gram-data-set-and-neo4j-source-code-for-storing-it/
> >
> > or in the source code:
> >
> > http://code.google.com/p/complet/source/browse/trunk/Completion_DataCollector/src/completion_datacollector/Main.java?spec=svn64&r=64
> >
> > By the way, even though the code is just hacked together, it uses
> > hash maps to store nodes in memory to increase insertion speed, and
> > builds the Lucene index later. Of course, it would be even better to
> > use the batch inserter.
> >
> > Best regards,
> > René
> >
> > --
> > mobile: +49 (0)176 6433 2481
> > Skype: +49 (0)6131 / 4958926
> > Skype: rene.pickhardt
> > www.rene-pickhardt.de
> > <http://www.beijing-china-blog.com>
> > _______________________________________________
> > Neo4j mailing list
> > User@lists.neo4j.org
> > https://lists.neo4j.org/mailman/listinfo/user
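
For readers following the thread, here is a minimal sketch of the general idea discussed above: words as nodes, n-gram counts as weighted edges, and hash maps holding the nodes in memory for fast insertion. It is not the code from the linked Main.java and does not use Neo4j at all; plain Python dicts stand in for nodes and relationships. The class name `NGramGraph`, the helper `load_rows`, and the sample rows are all hypothetical, and the two-column "n-gram, tab, count" row layout is a simplification chosen for illustration.

```python
from collections import defaultdict


class NGramGraph:
    """Toy in-memory word graph: nodes are words, weighted edges are pair counts."""

    def __init__(self):
        # Hash map from a word to its outgoing edges {next_word: count}.
        self.edges = defaultdict(lambda: defaultdict(int))

    def add_ngram(self, words, count):
        # Simplification: add `count` to every consecutive word pair in the n-gram.
        for current, following in zip(words, words[1:]):
            self.edges[current][following] += count

    def predict(self, word, k=3):
        # Return up to k most frequent followers; ties are broken alphabetically.
        followers = self.edges.get(word, {})
        ranked = sorted(followers.items(), key=lambda item: (-item[1], item[0]))
        return [w for w, _ in ranked[:k]]


# Made-up sample rows ("n-gram<TAB>count"), for illustration only.
SAMPLE_ROWS = [
    "guten morgen\t120",
    "guten tag\t300",
    "guten abend\t90",
    "tag der\t40",
]


def load_rows(rows):
    # Build the graph from an iterable of tab-separated rows.
    graph = NGramGraph()
    for row in rows:
        text, count = row.split("\t")
        graph.add_ngram(text.split(), int(count))
    return graph


graph = load_rows(SAMPLE_ROWS)
print(graph.predict("guten", k=2))  # prints ['tag', 'morgen']
```

In a real graph database the inner dict would become relationships carrying a count property, and prediction would be a traversal from the current word's node ordered by that property.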