Kashif Khadim wrote:
Hi,
I am using SegmentMergeTool and it is taking a very long time. How long should I expect this tool to take to finish? After reading a segment with 200,000 entries it has just sat there for two days. Is this normal?

Definitely not normal... Is the process swapping? You can get a list of threads and their state by pressing Ctrl-\ in the terminal where the JVM is running (Ctrl-Break on Windows). If you get any info, it means the process is not hanging, just taking its time... ;-)
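
If the process is not attached to a terminal, sending it SIGQUIT (kill -QUIT <pid>) or running the JDK's jstack <pid> gives the same kind of thread list. The same information is also available from inside the JVM. Below is a minimal sketch, not Nutch code: the class name ThreadStates is made up for illustration, and it only sees the threads of the JVM it runs in, so it would have to be called from within the tool itself.

```java
import java.util.Map;

public class ThreadStates {
    /** Builds a plain-text report of every live thread: name, state, stack frames. */
    public static String dump() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            sb.append('"').append(t.getName()).append("\" state=").append(t.getState()).append('\n');
            for (StackTraceElement frame : e.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

Taking two dumps a minute apart and comparing the stack of the main thread is usually enough to tell whether it is moving or stuck in the same place.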


Probably you need to kill the process anyway. Can you throw in some LOG.info() here and there and see where it's hanging?
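
As a rough illustration of that kind of instrumentation, here is a small progress counter that logs every N records. It is only a sketch: ProgressLog, the stage name, and the interval are invented for the example and are not part of Nutch, and it uses java.util.logging directly rather than whatever LOG object the tool already has.

```java
import java.util.logging.Logger;

public class ProgressLog {
    private static final Logger LOG = Logger.getLogger(ProgressLog.class.getName());

    private final String stage;   // label for the processing stage (hypothetical)
    private final long every;     // log once per this many records; must be > 0
    private final long start = System.currentTimeMillis();
    private long count = 0;

    public ProgressLog(String stage, long every) {
        this.stage = stage;
        this.every = every;
    }

    /** Call once per processed record; returns true when a line was logged. */
    public boolean tick() {
        count++;
        if (count % every != 0) {
            return false;
        }
        long elapsed = System.currentTimeMillis() - start;
        LOG.info(stage + ": " + count + " records, " + elapsed + " ms elapsed");
        return true;
    }

    public long count() {
        return count;
    }

    public static void main(String[] args) {
        ProgressLog p = new ProgressLog("example stage", 1000);
        for (int i = 0; i < 5000; i++) {
            p.tick();
        }
    }
}
```

One counter per loop in the tool shows which stage stops producing output, and the elapsed times show whether a stage is merely slow.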

Oh, one more thing: SegmentMergeTool does a lot of random seeking in the last stage of processing. However, seeking on segments with partially truncated MapFile "index" files takes a LOT of time... If you suspect some of your "index" files are truncated (e.g. because of a crashed fetcher), it's better just to remove them from the offending directory and run SegmentReader -fix on the affected segments. Seeking will then be something like 20-50 times faster. Just make sure all your segments have correct "index" files (e.g. by running SegmentReader -list) and re-run the SegmentMergeTool.
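
For a quick first pass over many segments, something like the following can flag the obvious cases. This is a sketch under assumptions, not a replacement for SegmentReader: it assumes each MapFile is a directory holding a "data" file next to its "index" file, the class name IndexCheck is made up, and it only catches an "index" that is missing or zero-length. A partially truncated "index" will pass this check.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class IndexCheck {
    /** Returns directories under root that hold "data" but whose "index" is missing or empty. */
    public static List<Path> suspects(Path root) throws IOException {
        List<Path> dirs;
        try (Stream<Path> walk = Files.walk(root)) {
            dirs = walk.filter(Files::isDirectory).collect(Collectors.toList());
        }
        List<Path> bad = new ArrayList<>();
        for (Path dir : dirs) {
            Path data = dir.resolve("data");
            Path index = dir.resolve("index");
            if (!Files.isRegularFile(data)) {
                continue; // not a MapFile directory under the assumed layout
            }
            if (!Files.isRegularFile(index) || Files.size(index) == 0) {
                bad.add(dir);
            }
        }
        return bad;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: top-level directory containing the segments
        for (Path p : suspects(Paths.get(args[0]))) {
            System.out.println("suspect index: " + p);
        }
    }
}
```

Anything it reports is a candidate for the remove-and-fix procedure described above.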

--
Best regards,
Andrzej Bialecki
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com



_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
