On Thu, Feb 7, 2013 at 12:47 PM, Eyeris Rodriguez Rueda <eru...@uci.cu> wrote:
> Thanks to all for your replies.
> If I want to change the default location for Hadoop jobs (/tmp), where can
> I do that? My nutch-site.xml does not include anything pointing to /tmp.

Add this property to nutch-site.xml with an appropriate value:

<property>
  <name>hadoop.tmp.dir</name>
  <value>XXXXXXXXXX</value>
</property>

> I have read about Nutch and Hadoop, but I am not sure I understand it all.
> Is it possible to use Nutch 1.5.1 in distributed mode?

Yes.

> In that case, what do I need to do? I would really appreciate your answer,
> because I can't find good documentation on this topic.

For distributed mode, Nutch is called from runtime/deploy. The conf files
should be modified in runtime/local/conf, not in $NUTCH_HOME/conf. So modify
runtime/local/conf/nutch-site.xml to set http.agent.name properly. I am
assuming that the Hadoop setup is in place and the Hadoop variables are
exported. Now, run the nutch commands from runtime/deploy.

Thanks,
Tejas Patil

> ----- Original message -----
> From: "Tejas Patil" <tejas.patil...@gmail.com>
> To: user@nutch.apache.org
> Sent: Thursday, February 7, 2013 14:04:26
> Subject: Re: Could not find any valid local directory for output/file.out
>
> Nutch jobs are executed by Hadoop. "/tmp" is the default location used by
> Hadoop to store temporary data required for a job. If you don't override
> hadoop.tmp.dir in any config file, it will use /tmp by default. In your
> case, /tmp doesn't have enough space left, so it is better to override that
> property and point it to some other location that has ample space.
>
> Thanks,
> Tejas Patil
>
> On Thu, Feb 7, 2013 at 10:38 AM, Eyeris Rodriguez Rueda <eru...@uci.cu>
> wrote:
>
> > Thanks, Lewis, for your answer.
> > My question is why /tmp keeps growing while the crawl is running, and why
> > Nutch uses that folder. I'm using Nutch 1.5.1 in single mode and my
> > nutch-site.xml does not have the hadoop.tmp.dir property.
> > I need to reduce the space used by that folder because I only have 40 GB
> > for the Nutch machine and 50 GB for the Solr machine. Any advice or
> > explanation is welcome.
> > Thanks for your time.
> >
> > ----- Original message -----
> > From: "Lewis John Mcgibbney" <lewis.mcgibb...@gmail.com>
> > To: user@nutch.apache.org
> > Sent: Thursday, February 7, 2013 13:06:11
> > Subject: Re: Could not find any valid local directory for output/file.out
> >
> > Hi,
> >
> > https://wiki.apache.org/nutch/NutchGotchas#DiskErrorException_while_fetching
> >
> > On Thursday, February 7, 2013, Eyeris Rodriguez Rueda <eru...@uci.cu>
> > wrote:
> > > Hi all.
> > > I have a problem when I crawl for a few hours or days. I'm using Nutch
> > > 1.5.1 and Solr 3.6, but the crawl process fails and I don't know how to
> > > fix it. I'm interested in running a crawl without limits, with 10
> > > cycles or more, but I run out of hard disk space. I have found that
> > > /etc/tmp has 29 GB used, which is not good for me. Can anybody help me
> > > or give some advice on configuring Nutch to complete at least one crawl
> > > without problems?
> > >
> > > Here are some features of my environment:
> > > RAM: 2 GB
> > > CPU: QuadCore (but I'm using only 2 cores)
> > > Hard disk: 40 GB
> > > Threads: 50
> > > db.fetch.interval.default=2 days
> > >
> > > This is part of my log file when Nutch fails:
> > >
> > > ****************************************************************
> > > 2013-02-06 18:45:25,961 INFO fetcher.Fetcher - fetching http://bibliodoc.uci.cu/TD/TD_03349_10.pdf
> > > 2013-02-06 18:45:25,964 INFO fetcher.Fetcher - fetching http://bibliodoc.uci.cu/TD/TD_0442_07.pdf
> > > 2013-02-06 18:45:25,977 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=49
> > > 2013-02-06 18:45:26,109 INFO fetcher.Fetcher - -activeThreads=49, spinWaiting=39, fetchQueues.totalSize=0
> > > 2013-02-06 18:45:26,180 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=48
> > > 2013-02-06 18:45:26,331 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=47
> > > 2013-02-06 18:45:26,331 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=46
> > > 2013-02-06 18:45:26,331 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=44
> > > 2013-02-06 18:45:26,331 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=45
> > > 2013-02-06 18:45:26,331 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=40
> > > 2013-02-06 18:45:26,331 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=39
> > > 2013-02-06 18:45:26,332 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=38
> > > 2013-02-06 18:45:26,332 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=37
> > > 2013-02-06 18:45:26,332 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=36
> > > 2013-02-06 18:45:26,332 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=35
> > > 2013-02-06 18:45:26,332 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=34
> > > 2013-02-06 18:45:26,333 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=33
> > > 2013-02-06 18:45:26,333 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=32
> > > 2013-02-06 18:45:26,333 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=31
> > > 2013-02-06 18:45:26,333 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=30
> > > 2013-02-06 18:45:26,333 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=29
> > > 2013-02-06 18:45:26,333 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=28
> > > 2013-02-06 18:45:26,334 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=27
> > > 2013-02-06 18:45:26,334 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=26
> > > 2013-02-06 18:45:26,334 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=25
> > > 2013-02-06 18:45:26,334 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=24
> > > 2013-02-06 18:45:26,334 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=23
> > > 2013-02-06 18:45:26,335 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=22
> > > 2013-02-06 18:45:26,335 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=21
> > > 2013-02-06 18:45:26,335 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=20
> > > 2013-02-06 18:45:26,335 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=19
> > > 2013-02-06 18:45:26,335 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=18
> > > 2013-02-06 18:45:26,331 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=41
> > > 2013-02-06 18:45:26,336 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=17
> > > 2013-02-06 18:45:26,336 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=15
> > > 2013-02-06 18:45:26,336 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=13
> > > 2013-02-06 18:45:26,336 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=12
> > > 2013-02-06 18:45:26,331 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=42
> > > 2013-02-06 18:45:26,331 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=43
> > > 2013-02-06 18:45:26,336 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=9
> > > 2013-02-06 18:45:26,336 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=10
> > > 2013-02-06 18:45:26,336 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=11
> > > 2013-02-06 18:45:26,336 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=14
> > > 2013-02-06 18:45:26,336 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=16
> > > 2013-02-06 18:45:26,404 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=8
> > > 2013-02-06 18:45:26,630 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=7
> > > 2013-02-06 18:45:27,069 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=6
> > > 2013-02-06 18:45:27,110 INFO fetcher.Fetcher - -activeThreads=6, spinWaiting=0, fetchQueues.totalSize=0
> > > 2013-02-06 18:45:27,129 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=5
> > > 2013-02-06 18:45:28,110 INFO fetcher.Fetcher - -activeThreads=5, spinWaiting=0, fetchQueues.totalSize=0
> > > 2013-02-06 18:45:28,502 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=4
> > > 2013-02-06 18:45:29,111 INFO fetcher.Fetcher - -activeThreads=4, spinWaiting=0, fetchQueues.totalSize=0
> > > 2013-02-06 18:45:30,123 INFO fetcher.Fetcher - -activeThreads=4, spinWaiting=0, fetchQueues.totalSize=0
> > > 2013-02-06 18:45:31,127 INFO fetcher.Fetcher - -activeThreads=4, spinWaiting=0, fetchQueues.totalSize=0
> > > 2013-02-06 18:45:31,187 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=3
> > > 2013-02-06 18:45:32,171 INFO fetcher.Fetcher - -activeThreads=3, spinWaiting=0, fetchQueues.totalSize=0
> > > 2013-02-06 18:45:32,206 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=2
> > > 2013-02-06 18:45:33,173 INFO fetcher.Fetcher - -activeThreads=2, spinWaiting=0, fetchQueues.totalSize=0
> > > 2013-02-06 18:45:34,173 INFO fetcher.Fetcher - -activeThreads=2, spinWaiting=0, fetchQueues.totalSize=0
> > > 2013-02-06 18:45:34,205 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=1
> > > 2013-02-06 18:45:34,457 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=0
> > > 2013-02-06 18:45:35,174 INFO fetcher.Fetcher - -activeThreads=0, spinWaiting=0, fetchQueues.totalSize=0
> > > 2013-02-06 18:45:35,174 INFO fetcher.Fetcher - -activeThreads=0
> > > 2013-02-06 18:45:35,742 WARN mapred.LocalJobRunner - job_local_0015
> > > org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for output/file.out
> > >     at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:381)
> > >     at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:146)
> > >     at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:127)
> > >     at org.apache.hadoop.mapred.MapOutputFile.getOutputFileForWrite(MapOutputFile.java:69)
> > >     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1640)
> > >     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1323)
> > >     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:437)
> > >     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
> > >     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
> > >
> >
> > --
> > *Lewis*
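
For anyone picking a value for hadoop.tmp.dir as discussed in this thread, here is a minimal Python sketch that may help. It is not part of Nutch or Hadoop. It only reports how much free space a candidate directory has and prints the corresponding property block. The helper names free_gib and tmp_dir_property are hypothetical and chosen just for this illustration.

```python
import shutil
from xml.sax.saxutils import escape


def free_gib(path):
    """Free space, in GiB, on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free / 2**30


def tmp_dir_property(path):
    """Render a hadoop.tmp.dir <property> block to paste into nutch-site.xml."""
    return (
        "<property>\n"
        "  <name>hadoop.tmp.dir</name>\n"
        # escape() guards against XML-special characters such as '&' in the path
        f"  <value>{escape(path)}</value>\n"
        "</property>"
    )
```

For example, free_gib("/tmp") can be compared with the value for another mount point before a crawl, and tmp_dir_property("/data/hadoop-tmp") prints the block for that directory. The path /data/hadoop-tmp is only a placeholder and should be replaced with a directory that really has ample space.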