Hi,

You can also use Luke to inspect the index.

Just open the index folder in Luke and you will see all of your indexed
documents.
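
In case it helps, here is a rough shell sketch of both suggestions in this thread. It assumes the crawl was written to a directory called crawl, that the script is run from the Nutch install directory, and that a Lucene index directory can be recognised by its segments* files. The names CRAWL_DIR, DUMP_DIR and find_index_dir are made up for illustration and are not part of Nutch or Luke.

```shell
#!/bin/sh
# Sketch only: CRAWL_DIR and DUMP_DIR are assumed names, adjust to your setup.
CRAWL_DIR="${CRAWL_DIR:-crawl}"        # directory that holds crawldb, segments, ...
DUMP_DIR="${DUMP_DIR:-crawldb-dump}"   # hypothetical output directory for the text dump

# Hypothetical helper: print the first directory under $1 that looks like a
# Lucene index (assumption: such a directory contains segments* files).
find_index_dir() {
    for d in "$1/index" "$1"/indexes/*; do
        if ls "$d"/segments* >/dev/null 2>&1; then
            echo "$d"
            return 0
        fi
    done
    return 1
}

# Dump the crawl database as text, but only if bin/nutch and the crawldb exist.
if [ -x bin/nutch ] && [ -d "$CRAWL_DIR/crawldb" ]; then
    bin/nutch readdb "$CRAWL_DIR/crawldb" -dump "$DUMP_DIR"
fi

# Report which directory to open in Luke, if one was found.
if index_dir=$(find_index_dir "$CRAWL_DIR"); then
    echo "Open this directory in Luke: $index_dir"
fi
```

Note that the crawldb itself is not a Lucene index, so Luke is only useful on the index directory, while readdb is the tool for the crawldb.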

> From: [email protected]
> Subject: Re: converting nutch crawl output to human readable content
> Date: Tue, 15 Dec 2009 11:24:58 +0000
> To: [email protected]
> 
> Hi, 
> 
> I would use the following command to dump out the crawl database in a human 
> readable format:
> 
> nutch readdb crawl/crawldb -dump fooDir -format csv
> 
> I hope this helps, 
> 
> Mischa
> On 14 Dec 2009, at 22:30, Ted Yu wrote:
> 
> > Hi,
> > I used crawl command of bin/nutch and obtained the following:
> > 
> > ls crawl/crawldb/current/part-00000/
> > data        .data.crc   index       .index.crc
> > 
> > How do I convert the output to human readable format ?
> > 
> > Thanks
> 
> ___________________________________
> Mischa Tuffield
> Email: [email protected]
> Homepage - http://mmt.me.uk/
> Garlik Limited, 2 Sheen Road, Richmond, TW9 1AE, UK
> +44(0)20 8973 2465  http://www.garlik.com/
> Registered in England and Wales 535 7233 VAT # 849 0517 11
> Registered office: Thames House, Portsmouth Road, Esher, Surrey, KT10 9AD
> 