Hi,

I am Rishi Dua, a final-year undergraduate at the Indian Institute of
Technology Delhi (IIT Delhi), India, with a background in big data and
machine learning.

The project "DBPedia Live Scaling and User Interface" sounds the most
relevant to my experience, but I'd be open to exploring other relevant and
challenging projects. During my internships at NICTA (Australia) and Sony
(Japan), I worked with datasets containing billions of Tweets using the
search library Apache Lucene. After graduating, I'll be joining the Big
Data team at Sony in October, so I have no other commitments for the
duration of GSoC.

I started the warm-up tasks by improving the documentation. I added about
15 pages to the new wiki and edited a couple of existing ones. Most of the
content was copied from the old wiki (I wrote simple scripts to crawl the
old wiki and convert HTML to Markdown), with minor additions in some places.

An older thread on this topic suggested setting up DBpedia Live. I have
set up MediaWiki on my local system with the required extensions. Since the
Wikipedia dump is ~10 GB, it'll take another day or two for me to have the
local MediaWiki running. Additionally, I'm writing a tutorial and scripts to
automate the process, which I'll upload by Wednesday.

While I continue working on these two things, it'd be great if you could
suggest issues or other tasks I could take up.

Also, is the old wiki down for maintenance?

Thanks,
Rishi Dua
Dual Degree Student | IIT Delhi
http://www.rishidua.com