Thank you, Markus, for your answer.
I have always used Nutch from the console, running a complete cycle in one 
command:
bin/nutch crawl urls -dir crawl -depth 10 -topN 100000 -solr http://localhost:8080/solr
Could you explain how to run each job as a separate process? I read the wiki, 
but it did not work for me because I don't understand the commands. I also want 
to use Nutch in distributed mode; could you point me to good documentation for it?

_____________________________________________________________________
Ing. Eyeris Rodriguez Rueda
Phone: 837-3370
Universidad de las Ciencias Informáticas
_____________________________________________________________________

-----Original message-----
From: Markus Jelsma [mailto:markus.jel...@openindex.io] 
Sent: Monday, December 3, 2012 1:42 PM
To: user@nutch.apache.org
Subject: RE: hung threads in big nutch crawl process

Hi - Hadoop organizes some threads but in Nutch the only job that uses threads 
is the fetcher. Parses are done using the executor service.

It is quite possible that some of your regexes are very complex, and Nutch 
can take a long time processing those, especially if you parse in the 
fetcher job.

You should run the Nutch jobs separately to find out which job is giving you 
trouble.
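
As a rough illustration of running the jobs one at a time, the sketch below walks through a single round of the Nutch 1.x cycle using the individual `bin/nutch` commands instead of the all-in-one `crawl` command. The `urls` and `crawl` paths and the Solr URL are taken from the command quoted in this thread; the `run` helper, the `DRY_RUN` switch, and the `SEGMENT_NAME` placeholder are hypothetical names introduced only for this example. It assumes parsing is not already done inside the fetcher (otherwise the separate parse step should be skipped), and argument order may differ between Nutch versions, so check the usage text each command prints.

```shell
#!/bin/sh
# Sketch only: one round of the Nutch 1.x cycle, one job per command,
# so a hang can be traced to a single step.
# DRY_RUN defaults to 1 (commands are only printed); set DRY_RUN=0 to execute.
NUTCH="${NUTCH:-bin/nutch}"
CRAWL_DIR="${CRAWL_DIR:-crawl}"
SOLR_URL="${SOLR_URL:-http://localhost:8080/solr}"
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command, then run it unless in dry-run mode.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@" || { echo "FAILED: $*" >&2; exit 1; }
  fi
}

# Seed the crawl database, then select the URLs for this round.
run "$NUTCH" inject "$CRAWL_DIR/crawldb" urls
run "$NUTCH" generate "$CRAWL_DIR/crawldb" "$CRAWL_DIR/segments" -topN 100000

# The generate job creates a new timestamped segment; pick the newest one.
if [ "$DRY_RUN" = "0" ]; then
  SEGMENT=$(ls -d "$CRAWL_DIR"/segments/* | tail -1)
else
  SEGMENT="$CRAWL_DIR/segments/SEGMENT_NAME"  # placeholder for dry runs
fi

# Fetch and parse as separate jobs (assumes the fetcher is not parsing).
run "$NUTCH" fetch "$SEGMENT"
run "$NUTCH" parse "$SEGMENT"

# Merge the results back, build the link database, and index into Solr.
run "$NUTCH" updatedb "$CRAWL_DIR/crawldb" "$SEGMENT"
run "$NUTCH" invertlinks "$CRAWL_DIR/linkdb" -dir "$CRAWL_DIR/segments"
run "$NUTCH" solrindex "$SOLR_URL" "$CRAWL_DIR/crawldb" -linkdb "$CRAWL_DIR/linkdb" "$SEGMENT"
```

With the default `DRY_RUN=1` the script only prints the commands, which makes it easy to review them before running for real. For a crawl of depth 10, the steps from generate through updatedb would be repeated ten times; prefixing each `run` line with `time` is one simple way to see which job stalls.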

-----Original message-----
> From:Eyeris Rodriguez Rueda <eru...@uci.cu>
> Sent: Mon 03-Dec-2012 20:31
> To: user@nutch.apache.org
> Subject: hung threads in big nutch crawl process
> 
> Hi all.
> I have noticed that in a big Nutch crawl (depth: 10, topN: 100,000) some 
> threads hang at certain points of the crawl cycle, for example while 
> normalizing URLs by regex and while fetching URLs.
> I'm using Nutch 1.5.1 and Solr 3.6.
> RAM: 2 GB
> CPU: Core i3
> OS: Ubuntu 12.04 (server)
> 
> I have a question: how does Nutch manage threads during a crawl cycle?
> Are the generate, fetch, and parse steps multithreaded?
> 
> P.S.: Sorry for my English; it is not my native language.


