Hi,

You can also use Luke to view the index. Just open the index folder in Luke and you will see all your indexed documents.

> From: [email protected]
> Subject: Re: converting nutch crawl output to human readable content
> Date: Tue, 15 Dec 2009 11:24:58 +0000
> To: [email protected]
>
> Hi,
>
> I would use the following command to dump out the crawl database in a human
> readable format:
>
> nutch readdb crawl/crawldb -dump fooDir -format csv
>
> I hope this helps,
>
> Mischa
> On 14 Dec 2009, at 22:30, Ted Yu wrote:
>
> > Hi,
> > I used crawl command of bin/nutch and obtained the following:
> >
> > ls crawl/crawldb/current/part-00000/
> > data .data.crc index .index.crc
> >
> > How do I convert the output to human readable format ?
> >
> > Thanks
>
> ___________________________________
> Mischa Tuffield
> Email: [email protected]
> Homepage - http://mmt.me.uk/
> Garlik Limited, 2 Sheen Road, Richmond, TW9 1AE, UK
> +44(0)20 8973 2465 http://www.garlik.com/
> Registered in England and Wales 535 7233 VAT # 849 0517 11
> Registered office: Thames House, Portsmouth Road, Esher, Surrey, KT10 9AD
