Hi all. I have noticed that during a large Nutch crawl (depth: 10, topN: 100,000) some threads hang in certain parts of the crawl cycle, for example during regex URL normalization and also while fetching URLs. I'm using Nutch 1.5.1 and Solr 3.6. RAM: 2 GB. CPU: Core i3. OS: Ubuntu 12.04 (server).
I have a question: how does Nutch manage threads during a crawl cycle? Are the generate, fetch, and parse steps multithreaded? P.S. Sorry for my English; it is not my native language.