Hi,

I would recommend indexing the Wikipedia XML dump. Check out the DataImportHandler example of indexing Wikipedia (http://wiki.apache.org/solr/DataImportHandler#Example%3a_Indexing_wikipedia).

Thanks,
Vineet Yadav

On Sun, Jul 8, 2012 at 9:15 AM, kiran kumar <kirankumarsm...@gmail.com> wrote:
> Hi,
> In our office we have Wikipedia set up for the intranet. I want to index
> it. I have recently learned that all the wiki pages are stored in a
> database and that the schema follows the MediaWiki standard. I am also
> considering whether to use xmldumper to dump all the wiki pages into XML
> and index from there.
> Has anybody done something like this? If so, which way is more efficient
> and easier to implement?
> To me the DB schema looks quite complicated. Can somebody please help me
> understand which is the better implementation for this?
>
> Thanks,
> Kiran Bushireddy.
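
For reference, if you would rather process the XML dump yourself than go through DataImportHandler, a minimal Python sketch of streaming the pages out of a MediaWiki export could look like the following. The sample document, the function names iter_pages and local_name, and the choice of fields (id, title, text) are assumptions made for illustration; check the namespace and element layout against your actual dump.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(source):
    """Yield one dict per <page> element, streaming so the whole dump
    never has to be held in memory at once."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if local_name(elem.tag) != "page":
            continue
        doc = {"id": None, "title": None, "text": ""}
        # Only direct children of <page> are inspected here, so the page id
        # is not confused with the revision id.
        for child in elem:
            name = local_name(child.tag)
            if name == "id":
                doc["id"] = child.text
            elif name == "title":
                doc["title"] = child.text
            elif name == "revision":
                for rev_child in child:
                    if local_name(rev_child.tag) == "text":
                        doc["text"] = rev_child.text or ""
        yield doc
        elem.clear()  # release the finished page's subtree


# Hypothetical one-page dump, used only to exercise the parser.
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page>
    <title>Intranet Home</title>
    <id>1</id>
    <revision>
      <id>7</id>
      <text>Welcome to the wiki.</text>
    </revision>
  </page>
</mediawiki>"""

pages = list(iter_pages(io.BytesIO(SAMPLE)))
print(pages)
```

Each yielded dict could then be mapped onto your Solr schema fields and posted with whichever client you prefer. Passing a file path or an open file instead of the BytesIO object should work the same way for a real dump.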