This directory can get very large; in many cases I doubt it would fit on a
RAM disk.
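
A quick way to check is to watch the local dir on a worker node while a job
runs. A minimal sketch (the path here is a hypothetical stand-in for your
real mapred.local.dir):

```shell
# Hypothetical stand-in for the real mapred.local.dir; substitute the
# actual path on a worker node while a job is running.
MAPRED_LOCAL=/tmp/mapred-local-demo
mkdir -p "$MAPRED_LOCAL"
# Simulate a 64 KB spill file so du has something to report:
dd if=/dev/zero of="$MAPRED_LOCAL/spill0.out" bs=1024 count=64 2>/dev/null
# Peak usage over the life of the job is what has to fit on the RAM disk:
du -sk "$MAPRED_LOCAL"
```

Sampling this periodically (e.g. in a loop) during your largest job gives a
rough lower bound on the RAM disk size you would need.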

Also, RAM disks tend to help most with random reads/writes. Since Hadoop is
doing mostly sequential IO, you may not see a great benefit from the RAM disk.
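
That said, if you still want to try it, one thing worth checking besides
permissions is pointing mapred.local.dir at a subdirectory of the mount
rather than at the mount point itself, since the TaskTracker's startup
cleanup renames and recreates directories under that path. A rough sketch;
the mount point and the mapred:hadoop owner/group are assumptions,
substitute your cluster's values:

```shell
# On the real node, as root (assumed layout, not tested here):
#   mkdir -p /mnt/ramdisk
#   mount -t tmpfs -o size=128m tmpfs /mnt/ramdisk
#   mkdir -p /mnt/ramdisk/local
#   chown mapred:hadoop /mnt/ramdisk/local
#   chmod 775 /mnt/ramdisk/local
# Demo of the directory layout and permissions on a scratch path:
RAMDISK=/tmp/ramdisk-demo
mkdir -p "$RAMDISK/local"
chmod 775 "$RAMDISK/local"
ls -ld "$RAMDISK/local"
```

With that layout, mapred.local.dir would point at /mnt/ramdisk/local
instead of the bare mount point.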



On Mon, Oct 3, 2011 at 12:07 PM, Vinod Kumar Vavilapalli <
vino...@hortonworks.com> wrote:

> This must be related to some kind of permissions problem.
>
> It would help if you could paste the corresponding source code for
> FileUtil.copy(); it is hard to track down across different versions.
>
> Thanks,
> +Vinod
>
>
> On Mon, Oct 3, 2011 at 9:28 PM, Raj V <rajv...@yahoo.com> wrote:
>
> > Eric
> >
> > Yes. The owner is hdfs, the group is hadoop, and the directory is group
> > writable (775). This is the exact same configuration I have when I use
> > real disks, but let me give it another try to see if I overlooked
> > something. Thanks
> >
> > Raj
> >
> > >________________________________
> > >From: Eric Caspole <eric.casp...@amd.com>
> > >To: common-user@hadoop.apache.org
> > >Sent: Monday, October 3, 2011 8:44 AM
> > >Subject: Re: pointing mapred.local.dir to a ramdisk
> > >
> > >Are you sure you have chown'd/chmod'd the ramdisk directory to be
> > >writable by your hadoop user? I have played with this in the past, and
> > >it should basically work.
> > >
> > >
> > >On Oct 3, 2011, at 10:37 AM, Raj V wrote:
> > >
> > >> Sending it to the hadoop mailing list - I think this is a
> > >> hadoop-related problem, not specific to the Cloudera distribution.
> > >>
> > >> Raj
> > >>
> > >>
> > >> ----- Forwarded Message -----
> > >>> From: Raj V <rajv...@yahoo.com>
> > >>> To: CDH Users <cdh-u...@cloudera.org>
> > >>> Sent: Friday, September 30, 2011 5:21 PM
> > >>> Subject: pointing mapred.local.dir to a ramdisk
> > >>>
> > >>>
> > >>> Hi all
> > >>>
> > >>>
> > >>> I have been trying some experiments to improve performance. One of
> > >>> the experiments involved pointing mapred.local.dir to a RAM disk. To
> > >>> this end I created a 128 MB RAM disk (each of my map outputs is
> > >>> smaller than this), but I have not been able to get the task tracker
> > >>> to start.
> > >>>
> > >>>
> > >>> I am running CDH3B3 (hadoop-0.20.2+737), and here is the error
> > >>> message from the task tracker log.
> > >>>
> > >>>
> > >>> Tasktracker logs
> > >>>
> > >>>
> > >>> 2011-09-30 16:50:00,689 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
> > >>> 2011-09-30 16:50:00,930 INFO org.apache.hadoop.http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
> > >>> 2011-09-30 16:50:01,000 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50060
> > >>> 2011-09-30 16:50:01,023 INFO org.apache.hadoop.http.HttpServer: listener.getLocalPort() returned 50060 webServer.getConnectors()[0].getLocalPort() returned 50060
> > >>> 2011-09-30 16:50:01,024 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50060
> > >>> 2011-09-30 16:50:01,024 INFO org.mortbay.log: jetty-6.1.14
> > >>> 2011-09-30 16:50:02,388 INFO org.mortbay.log: Started SelectChannelConnector@0.0.0.0:50060
> > >>> 2011-09-30 16:50:02,400 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
> > >>> 2011-09-30 16:50:02,422 INFO org.apache.hadoop.mapred.TaskTracker: Starting tasktracker with owner as mapred
> > >>> 2011-09-30 16:50:02,493 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.lang.NullPointerException
> > >>>         at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:213)
> > >>>         at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:157)
> > >>>         at org.apache.hadoop.fs.RawLocalFileSystem.rename(RawLocalFileSystem.java:253)
> > >>>         at org.apache.hadoop.fs.ChecksumFileSystem.rename(ChecksumFileSystem.java:404)
> > >>>         at org.apache.hadoop.util.MRAsyncDiskService.moveAndDeleteRelativePath(MRAsyncDiskService.java:255)
> > >>>         at org.apache.hadoop.util.MRAsyncDiskService.cleanupAllVolumes(MRAsyncDiskService.java:311)
> > >>>         at org.apache.hadoop.mapred.TaskTracker.initialize(TaskTracker.java:618)
> > >>>         at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1351)
> > >>>         at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3504)
> > >>>
> > >>>
> > >>> 2011-09-30 16:50:02,497 INFO org.apache.hadoop.mapred.TaskTracker:
> > SHUTDOWN_MSG:
> > >>> /************************************************************
> > >>> SHUTDOWN_MSG: Shutting down TaskTracker at HADOOP52-4/10.52.1.5
> > >>>
> > >>>
> > >>> and here is my mapred-site.xml file
> > >>>
> > >>>
> > >>> <property>
> > >>>     <name>mapred.local.dir</name>
> > >>>     <value>/ramdisk1</value>
> > >>>   </property>
> > >>>
> > >>>
> > >>> If I have a regular directory on a regular drive, such as the one
> > >>> below, it works. If I don't mount the ramdisk, it also works.
> > >>>
> > >>>
> > >>> <property>
> > >>>     <name>mapred.local.dir</name>
> > >>>     <value>/hadoop-dsk0/local,/hadoop-dsk1/local</value>
> > >>>   </property>
> > >>>
> > >>>
> > >>> The NullPointerException does not tell me what the error is or how to
> > fix it.
> > >>>
> > >>>
> > >>> From the logs it looks like some disk-based operation failed, but I
> > >>> can't guess what. I must also confess that this is the first time I
> > >>> am using an ext2 file system.
> > >>>
> > >>>
> > >>> Any ideas?
> > >>>
> > >>>
> > >>>
> > >>>
> > >>> Raj
> > >>>
> > >
> > >
> > >
> > >
> > >
> >
>
