Hey everyone,

I am currently advising two high school students on a programming project for a German student competition.

They have inserted the German Google n-gram data set (several GB of natural language) into a Neo4j database and used it for sentence prediction to improve typing speed. The project is far from complete, but there is already some code showing how we modelled n-grams in Neo4j and what we used for prediction. Both approaches are very basic and pretty much what you would expect; still, they already work decently, which once again shows the power of Neo4j. We would be happy for feedback, thoughts, and suggestions for further improvement.

You can find more information in my blog post:
http://www.rene-pickhardt.de/download-google-n-gram-data-set-and-neo4j-source-code-for-storing-it/
or in the source code:
http://code.google.com/p/complet/source/browse/trunk/Completion_DataCollector/src/completion_datacollector/Main.java?spec=svn64&r=64

By the way: even though the code is just hacked down, it uses hash maps to keep nodes in memory to increase insertion speed, and only builds the Lucene index afterwards. Of course it would be even better to use the batch inserter.

Best regards,
René

--
mobile: +49 (0)176 6433 2481
Skype: +49 (0)6131 / 4958926
Skype: rene.pickhardt
www.rene-pickhardt.de
<http://www.beijing-china-blog.com>

_______________________________________________
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user
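P.S. The in-memory caching idea mentioned above can be sketched roughly as follows. This is a minimal, Neo4j-free illustration under my own assumptions (all class and method names are made up, and node ids are simulated with a counter): while importing, word-to-node-id mappings live in a HashMap instead of being looked up through the Lucene index, so each node is created only once, and bigram counts are accumulated in memory before they would be written out as relationships.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only, not the actual importer code: node ids are
// faked with a counter; in the real project they would come from the
// (batch) inserter instead.
public class NgramImportSketch {
    private final Map<String, Long> nodeIds = new HashMap<>();      // word -> node id
    private final Map<String, Long> bigramCounts = new HashMap<>(); // "w1 w2" -> count
    private long nextId = 0;

    // Look the word up in memory; create a "node" only on first sight,
    // avoiding an index lookup per insert.
    public long getOrCreateNode(String word) {
        return nodeIds.computeIfAbsent(word, w -> nextId++);
    }

    // Record one observation of the bigram (w1, w2); in the graph this
    // would become a relationship between the two word nodes carrying
    // the count as a property.
    public void addBigram(String w1, String w2) {
        getOrCreateNode(w1);
        getOrCreateNode(w2);
        bigramCounts.merge(w1 + " " + w2, 1L, Long::sum);
    }

    public long nodeCount() {
        return nodeIds.size();
    }

    public long count(String w1, String w2) {
        return bigramCounts.getOrDefault(w1 + " " + w2, 0L);
    }
}
```

With this pattern the Lucene index can be built in one pass after all nodes exist, rather than being queried for every single n-gram during the insert.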
They have inserted the German google n-gram data set several GB of natural language to a neo4j data base and used this to make sentence prediction to improve typing speed. The entire project is far from being complete but there is some code available on how we modelled n-grams in neo4j and what we used for prediction Both approaches very basic and as you would expect them. Still they already work in a decent way showing again the power of neo4j. We would be happy for some feedback thoghts and suggestions for further improvement. Find more info in my blog post: http://www.rene-pickhardt.de/download-google-n-gram-data-set-and-neo4j-source-code-for-storing-it/ or in the source code: http://code.google.com/p/complet/source/browse/trunk/Completion_DataCollector/src/completion_datacollector/Main.java?spec=svn64&r=64 by the way. even though the code is just hacked down it uses hashmaps to store nodes in memory and increase inserting speed. and builds the lucene index later. Of course it would be even better to use the batch inserter. best regards René -- -- mobile: +49 (0)176 6433 2481 Skype: +49 (0)6131 / 4958926 Skype: rene.pickhardt www.rene-pickhardt.de <http://www.beijing-china-blog.com> _______________________________________________ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user