Gal Nitzan wrote:
If I understand correctly, having one segment or a hundred is not important?

It depends. If you have hundreds of segments and try to search them all from a single JVM, you will probably run out of file handles.
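As a rough illustration of the file-handle concern, here is a minimal sketch that estimates how many files a single search process would hold open and compares that against the per-process descriptor limit. The figures are assumptions, not measured Nutch values: `files_per_segment` and `headroom` are hypothetical parameters, and the helper names are made up for this example. The `resource` module is only available on Unix-like systems.

```python
import resource


def estimate_open_files(num_segments, files_per_segment=10):
    """Rough count of files held open when every segment is searched at once.

    files_per_segment is an assumed average, not a measured Nutch figure.
    """
    return num_segments * files_per_segment


def current_soft_limit():
    """Return this process's soft limit on open file descriptors (Unix only)."""
    soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft_limit


def fits_in_limit(num_segments, soft_limit, files_per_segment=10, headroom=64):
    """True if the estimate plus some headroom stays within soft_limit.

    headroom is an assumed allowance for sockets, logs, jars and so on.
    """
    if soft_limit == resource.RLIM_INFINITY:
        return True
    needed = estimate_open_files(num_segments, files_per_segment) + headroom
    return needed <= soft_limit


if __name__ == "__main__":
    limit = current_soft_limit()
    for segments in (1, 100, 500):
        print(segments, "segments fit:", fits_in_limit(segments, limit))
```

If the estimate does not fit, the usual options are to search fewer segments per process or to raise the descriptor limit for that process.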

What happens when a page is fetched a second time? Is there something to deduplicate it?

The dedup command has not yet been implemented in the mapred branch. Coming soon.

Doug


_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
