Hi everyone,

I'm new to Nutch and am having trouble getting a working Nutch cluster 
set up (Nutch 1.2).

My setup:
1 master (namenode, jobtracker, secondarynamenode)
2 nodes (datanode, tasktracker)

All machines are virtual machines with 500 MB of RAM each.

Nutch config:
- mapred.map.tasks 2
- mapred.reduce.tasks 2
- mapred.child.java.opts -Xmx256m
- fetcher.threads.fetch 20
- fetcher.server.delay 1.0
- fetcher.threads.per.host 3 
- replication 2

If I increase the number of map and reduce tasks, the crawl fails because the 
tasktracker dies (sometimes the datanode as well), so I changed both back 
to 2.
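
To make the memory side of this concrete, here is a rough sketch of a per-node budget. It is only back-of-the-envelope arithmetic, not a measurement: the 500 MB of RAM and the 256 MB child heap come from my setup above, while the number of tasks running at the same time on one node and the 100 MB per daemon (datanode, tasktracker) are placeholder assumptions. The function name node_memory_budget_mb is made up for this example.

```python
def node_memory_budget_mb(ram_mb, child_heap_mb, concurrent_tasks,
                          daemon_heap_mb, daemons):
    """Return (worst-case heap demand, headroom) in MB for one worker node.

    Only maximum heap sizes are counted; JVM overhead and the operating
    system itself are ignored, so real headroom would be smaller.
    """
    demand = concurrent_tasks * child_heap_mb + daemons * daemon_heap_mb
    return demand, ram_mb - demand


# From the setup above: 500 MB RAM per VM, 256 MB heap per child JVM.
# ASSUMPTION: 100 MB per daemon is a placeholder, not a measured value.
# ASSUMPTION: 1, 2 or 4 tasks running concurrently on a single node.
for tasks in (1, 2, 4):
    demand, headroom = node_memory_budget_mb(
        ram_mb=500, child_heap_mb=256, concurrent_tasks=tasks,
        daemon_heap_mb=100, daemons=2)
    print(f"{tasks} concurrent task(s): up to {demand} MB heap, "
          f"headroom {headroom} MB")
```

Under these assumptions, a negative headroom means the child JVMs could together ask for more heap than the VM has, which may be related to the daemons dying when I raise the task counts.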

But my questions are more general: does my setup/config look OK? And are there 
any best-practice hardware requirements for running a namenode or datanodes?

Thanks for your help. I appreciate any answers.

bart

