On Thu, Feb 7, 2013 at 12:47 PM, Eyeris Rodriguez Rueda <eru...@uci.cu> wrote:

> Thanks to all for your replies.
> If I want to change the default location for Hadoop jobs (/tmp), where can I
> do that? My nutch-site.xml does not include anything pointing to /tmp.
>
Add this property to nutch-site.xml with an appropriate value:

<property>
<name>hadoop.tmp.dir</name>
<value>XXXXXXXXXX</value>
</property>
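
Before picking the value, it may help to check which partition has room. A minimal sketch, assuming a hypothetical target directory ($HOME/nutch-hadoop-tmp is only an example, any path with enough free space will do) and assuming Hadoop's default layout of /tmp/hadoop-<user> for its temporary data:

```shell
#!/bin/sh
# Hypothetical location for Hadoop's temporary data; replace it with any
# path on a partition that has enough free space.
NEW_TMP="${NEW_TMP:-$HOME/nutch-hadoop-tmp}"

# Create the directory (no error if it already exists).
mkdir -p "$NEW_TMP"

# Compare the free space on the partition holding the new directory
# with the free space on /tmp.
df -h "$NEW_TMP" /tmp

# Show how much space Hadoop's temporary data currently uses.
# Assumption: the default layout /tmp/hadoop-<user> is in use.
du -sh "/tmp/hadoop-$(id -un)" 2>/dev/null || echo "no Hadoop data found under /tmp"
```

The directory that ends up in NEW_TMP is what would go into the <value> element above.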



> I have read about Nutch and Hadoop, but I am not sure I understand it
> all. Is it possible to use Nutch 1.5.1 in distributed mode?

yes


> In that case, what do I need to do? I would really appreciate your answer
> because I can't find good documentation on this topic.
>
For distributed mode, Nutch is called from runtime/deploy. The conf files
should be modified in runtime/local/conf, not in $NUTCH_HOME/conf.
So modify runtime/local/conf/nutch-site.xml to set http.agent.name
properly. I am assuming that the Hadoop setup is in place and the Hadoop
variables are exported. Now, run the nutch commands from runtime/deploy.

Thanks,
Tejas Patil

>
>
>
> ----- Original Message -----
> From: "Tejas Patil" <tejas.patil...@gmail.com>
> To: user@nutch.apache.org
> Sent: Thursday, February 7, 2013 14:04:26
> Subject: Re: Could not find any valid local directory for output/file.out
>
> Nutch jobs are executed by Hadoop. "/tmp" is the default location used by
> Hadoop to store temporary data required for a job. If you don't override
> hadoop.tmp.dir in any config file, it will use /tmp by default. In your
> case, /tmp doesn't have enough space left, so it is better to override that
> property and point it to some other location which has ample space.
>
> Thanks,
> Tejas Patil
>
>
> On Thu, Feb 7, 2013 at 10:38 AM, Eyeris Rodriguez Rueda <eru...@uci.cu> wrote:
>
> > Thanks, Lewis, for your answer.
> > My question is why /tmp keeps growing while the crawl process is running,
> > and why Nutch uses that folder. I'm using Nutch 1.5.1 in single mode and my
> > nutch-site.xml does not have the property hadoop.tmp.dir. I need to reduce
> > the space used by that folder because I only have 40 GB for the Nutch
> > machine and 50 GB for the Solr machine. Any advice or explanation would be
> > appreciated.
> > Thanks for your time.
> >
> >
> >
> > ----- Original Message -----
> > From: "Lewis John Mcgibbney" <lewis.mcgibb...@gmail.com>
> > To: user@nutch.apache.org
> > Sent: Thursday, February 7, 2013 13:06:11
> > Subject: Re: Could not find any valid local directory for output/file.out
> >
> > Hi,
> >
> > https://wiki.apache.org/nutch/NutchGotchas#DiskErrorException_while_fetching
> >
> > On Thursday, February 7, 2013, Eyeris Rodriguez Rueda <eru...@uci.cu>
> > wrote:
> > > Hi all.
> > > I have a problem when I crawl for a few hours or days. I'm using Nutch
> > > 1.5.1 and Solr 3.6, but the crawl process fails and I don't know how to
> > > fix this problem. I'm interested in running a crawl process without
> > > limit, with 10 cycles or more, but I have a problem with hard disk space:
> > > I have detected that /etc/tmp has 29 GB used, which is not good for me.
> > > Can anybody help me or give some advice on configuring Nutch to complete
> > > at least one crawl process without problems?
> > >
> > > Here are some features of my environment:
> > > RAM: 2 GB
> > > CPU: Quad-core (but I'm using only 2 cores)
> > > Hard disk: 40 GB
> > > Threads: 50
> > > db.fetch.interval.default=2 days
> > >
> > >
> > >
> > > This is part of my log file when Nutch fails:
> > >
> > > ****************************************************************
> > > 2013-02-06 18:45:25,961 INFO  fetcher.Fetcher - fetching http://bibliodoc.uci.cu/TD/TD_03349_10.pdf
> > > 2013-02-06 18:45:25,964 INFO  fetcher.Fetcher - fetching http://bibliodoc.uci.cu/TD/TD_0442_07.pdf
> > > 2013-02-06 18:45:25,977 INFO  fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=49
> > > 2013-02-06 18:45:26,109 INFO  fetcher.Fetcher - -activeThreads=49, spinWaiting=39, fetchQueues.totalSize=0
> > > 2013-02-06 18:45:26,180 INFO  fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=48
> > > 2013-02-06 18:45:26,331 INFO  fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=47
> > > 2013-02-06 18:45:26,331 INFO  fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=46
> > > 2013-02-06 18:45:26,331 INFO  fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=44
> > > 2013-02-06 18:45:26,331 INFO  fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=45
> > > 2013-02-06 18:45:26,331 INFO  fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=40
> > > 2013-02-06 18:45:26,331 INFO  fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=39
> > > 2013-02-06 18:45:26,332 INFO  fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=38
> > > 2013-02-06 18:45:26,332 INFO  fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=37
> > > 2013-02-06 18:45:26,332 INFO  fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=36
> > > 2013-02-06 18:45:26,332 INFO  fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=35
> > > 2013-02-06 18:45:26,332 INFO  fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=34
> > > 2013-02-06 18:45:26,333 INFO  fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=33
> > > 2013-02-06 18:45:26,333 INFO  fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=32
> > > 2013-02-06 18:45:26,333 INFO  fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=31
> > > 2013-02-06 18:45:26,333 INFO  fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=30
> > > 2013-02-06 18:45:26,333 INFO  fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=29
> > > 2013-02-06 18:45:26,333 INFO  fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=28
> > > 2013-02-06 18:45:26,334 INFO  fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=27
> > > 2013-02-06 18:45:26,334 INFO  fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=26
> > > 2013-02-06 18:45:26,334 INFO  fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=25
> > > 2013-02-06 18:45:26,334 INFO  fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=24
> > > 2013-02-06 18:45:26,334 INFO  fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=23
> > > 2013-02-06 18:45:26,335 INFO  fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=22
> > > 2013-02-06 18:45:26,335 INFO  fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=21
> > > 2013-02-06 18:45:26,335 INFO  fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=20
> > > 2013-02-06 18:45:26,335 INFO  fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=19
> > > 2013-02-06 18:45:26,335 INFO  fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=18
> > > 2013-02-06 18:45:26,331 INFO  fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=41
> > > 2013-02-06 18:45:26,336 INFO  fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=17
> > > 2013-02-06 18:45:26,336 INFO  fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=15
> > > 2013-02-06 18:45:26,336 INFO  fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=13
> > > 2013-02-06 18:45:26,336 INFO  fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=12
> > > 2013-02-06 18:45:26,331 INFO  fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=42
> > > 2013-02-06 18:45:26,331 INFO  fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=43
> > > 2013-02-06 18:45:26,336 INFO  fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=9
> > > 2013-02-06 18:45:26,336 INFO  fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=10
> > > 2013-02-06 18:45:26,336 INFO  fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=11
> > > 2013-02-06 18:45:26,336 INFO  fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=14
> > > 2013-02-06 18:45:26,336 INFO  fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=16
> > > 2013-02-06 18:45:26,404 INFO  fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=8
> > > 2013-02-06 18:45:26,630 INFO  fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=7
> > > 2013-02-06 18:45:27,069 INFO  fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=6
> > > 2013-02-06 18:45:27,110 INFO  fetcher.Fetcher - -activeThreads=6, spinWaiting=0, fetchQueues.totalSize=0
> > > 2013-02-06 18:45:27,129 INFO  fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=5
> > > 2013-02-06 18:45:28,110 INFO  fetcher.Fetcher - -activeThreads=5, spinWaiting=0, fetchQueues.totalSize=0
> > > 2013-02-06 18:45:28,502 INFO  fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=4
> > > 2013-02-06 18:45:29,111 INFO  fetcher.Fetcher - -activeThreads=4, spinWaiting=0, fetchQueues.totalSize=0
> > > 2013-02-06 18:45:30,123 INFO  fetcher.Fetcher - -activeThreads=4, spinWaiting=0, fetchQueues.totalSize=0
> > > 2013-02-06 18:45:31,127 INFO  fetcher.Fetcher - -activeThreads=4, spinWaiting=0, fetchQueues.totalSize=0
> > > 2013-02-06 18:45:31,187 INFO  fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=3
> > > 2013-02-06 18:45:32,171 INFO  fetcher.Fetcher - -activeThreads=3, spinWaiting=0, fetchQueues.totalSize=0
> > > 2013-02-06 18:45:32,206 INFO  fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=2
> > > 2013-02-06 18:45:33,173 INFO  fetcher.Fetcher - -activeThreads=2, spinWaiting=0, fetchQueues.totalSize=0
> > > 2013-02-06 18:45:34,173 INFO  fetcher.Fetcher - -activeThreads=2, spinWaiting=0, fetchQueues.totalSize=0
> > > 2013-02-06 18:45:34,205 INFO  fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=1
> > > 2013-02-06 18:45:34,457 INFO  fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=0
> > > 2013-02-06 18:45:35,174 INFO  fetcher.Fetcher - -activeThreads=0, spinWaiting=0, fetchQueues.totalSize=0
> > > 2013-02-06 18:45:35,174 INFO  fetcher.Fetcher - -activeThreads=0
> > > 2013-02-06 18:45:35,742 WARN  mapred.LocalJobRunner - job_local_0015
> > > org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for output/file.out
> > >         at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:381)
> > >         at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:146)
> > >         at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:127)
> > >         at org.apache.hadoop.mapred.MapOutputFile.getOutputFileForWrite(MapOutputFile.java:69)
> > >         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1640)
> > >         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1323)
> > >         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:437)
> > >         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
> > >         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
> > >
> >
> > --
> > *Lewis*
> >
>
