Thanks, Markus, for your answer. I have always run Nutch from the console as one complete cycle:

bin/nutch crawl urls -dir crawl -depth 10 -topN 100000 -solr http://localhost:8080/solr

Could you explain how to run each job as a separate process? I have been reading the wiki, but I could not get it to work because I don't understand the commands. I also want to use Nutch in distributed mode; could you point me to good documentation for it?
_____________________________________________________________________
Ing. Eyeris Rodriguez Rueda
Phone: 837-3370
Universidad de las Ciencias Informáticas
_____________________________________________________________________

-----Original message-----
From: Markus Jelsma [mailto:markus.jel...@openindex.io]
Sent: Monday, December 3, 2012 1:42 PM
To: user@nutch.apache.org
Subject: RE: hung threads in big nutch crawl process

Hi - Hadoop organizes some threads but in Nutch the only job that uses threads is the fetcher. Parses are done using the executor service. It is very well possible that you have some regexes that are very complex and Nutch can take a long time processing those, especially if you parse in the fetcher job. You should run the Nutch jobs separately to find out which job is giving you trouble.

-----Original message-----
> From: Eyeris Rodriguez Rueda <eru...@uci.cu>
> Sent: Mon 03-Dec-2012 20:31
> To: user@nutch.apache.org
> Subject: hung threads in big nutch crawl process
>
> Hi all.
> I have noticed that in a big Nutch crawl process (depth: 10, topN: 100000) some
> threads hang in some parts of the crawl cycle, for example when normalizing by
> regex and also when fetching URLs.
> I'm using Nutch 1.5.1 and Solr 3.6.
> RAM: 2 GB
> CPU: Core i3
> OS: Ubuntu 12.04 (server)
>
> I have a question: how does Nutch handle threads during a crawl cycle?
> Are the generate, fetch, and parse steps multithreaded?
>
> P.S.: Sorry for my English. It is not my native language.