[Nutch-general] Speed up indexing....

Briggs Wed, 30 May 2007 09:46:45 -0700

Does anyone have good configuration ideas for indexing/merging with Nutch 0.9
using Hadoop on a local filesystem?  Our segment merging is taking an
extremely long time compared with Nutch 0.7.  Currently, I am trying
to merge 300 segments, which amount to about 1 GB of data.  The merge
has been running for hours and is still not done.  This box has dual Xeon
2.8 GHz processors with 4 GB of RAM.


So, I figure there must be a better setup in mapred-default.xml
for a single machine.  Do I increase the size of the I/O buffers,
sort buffers, etc.?  Do I reduce the number of tasks or increase them?
I'm at a loss.

Any advice would be greatly appreciated.


-- 
"Conscious decisions by conscious minds are what make reality real"
