I'm running Nutch 0.8.1 on 3 servers. Everything works, but I'm
confused about some Fetcher behavior. I generate a list of 100k URLs
to fetch, and that works fine. However, only 1 server in the cluster
actually fetches a reasonable number of pages; the other 2 fetch at
most 20 pages. I have to believe I'm just missing some important
configuration settings.
Patrik
From my hadoop-site.xml:
<property>
<name>mapred.map.tasks</name>
<value>3</value>
<description>
define mapred.map.tasks to be the number of slave hosts
</description>
</property>
<property>
<name>mapred.reduce.tasks</name>
<value>3</value>
<description>
define mapred.reduce.tasks to be the number of slave hosts
</description>
</property>
From my mapred-default.xml:
<property>
<name>dfs.block.size</name>
<value>256000</value>
<description>
</description>
</property>