Hi everyone, I'm new to Nutch and I'm having trouble getting a well-working Nutch cluster set up (Nutch 1.2).
My setup:
1 master (namenode, jobtracker, secondarynamenode)
2 nodes (datanode, tasktracker)
All machines are virtual machines with 500 MB RAM each.

Nutch config:
- mapred.map.tasks 2
- mapred.reduce.tasks 2
- mapred.child.java.opts -Xmx256mb
- fetcher.threads.fetch 20
- fetcher.server.delay 1.0
- fetcher.threads.per.host 3
- replication 2

If I increase the number of map and reduce tasks, the crawl doesn't work because the tasktracker kills itself (sometimes the datanode does too), so I changed it back to 2.

But my questions are more general: does my setup/config look OK? And are there any best-practice hardware requirements for running a namenode or datanodes?

Thanks for the help, I appreciate all your answers.

Bart
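
For reference, here is a rough sketch of how the settings listed above would typically be written in the XML config files. The file names (conf/mapred-site.xml, conf/hdfs-site.xml, conf/nutch-site.xml) and the mapping of "replication" to dfs.replication are assumptions, not something confirmed in this post. The heap option is written in the JVM's usual -Xmx256m form, whereas the list above has it as "256mb". The three fragments belong in three separate files.

```xml
<!-- conf/mapred-site.xml (assumed file name) -->
<configuration>
  <property>
    <name>mapred.map.tasks</name>
    <value>2</value>
  </property>
  <property>
    <name>mapred.reduce.tasks</name>
    <value>2</value>
  </property>
  <property>
    <!-- heap limit for each child task JVM; "m" suffix assumed -->
    <name>mapred.child.java.opts</name>
    <value>-Xmx256m</value>
  </property>
</configuration>

<!-- conf/hdfs-site.xml (assumed file name and property name) -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>

<!-- conf/nutch-site.xml (assumed file name) -->
<configuration>
  <property>
    <name>fetcher.threads.fetch</name>
    <value>20</value>
  </property>
  <property>
    <name>fetcher.server.delay</name>
    <value>1.0</value>
  </property>
  <property>
    <name>fetcher.threads.per.host</name>
    <value>3</value>
  </property>
</configuration>
```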

