Re: pointing mapred.local.dir to a ramdisk

2011-10-05 Thread Raj V
Thanks Joey, Todd, Vinod, Edward

I have mixed news. The problem of the task tracker not starting was indeed 
permission related. Under /ramdisk there was a lost+found directory owned by 
root, even though /ramdisk itself was owned by mapred:hadoop. This was the 
cause of the problem. Now I will see if I can fix the error message to 
something better than a NullPointerException.
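
For anyone hitting the same thing: a quick way to spot a stray entry like that 
lost+found is to compare the owner of everything directly under the mount 
point with the uid the task tracker runs as. A minimal sketch in Python; the 
function name is made up for illustration, and it only looks one level deep, 
which is an assumption about where such entries show up.

```python
import os


def find_foreign_entries(root, expected_uid):
    """Return root and its direct children that are NOT owned by expected_uid.

    Hypothetical helper for illustration; it does not recurse.
    """
    candidates = [root] + [os.path.join(root, n) for n in sorted(os.listdir(root))]
    offenders = []
    for path in candidates:
        # lstat so a symlink is judged by its own owner, not its target's
        if os.lstat(path).st_uid != expected_uid:
            offenders.append(path)
    return offenders
```

On a node this could be called as find_foreign_entries("/ramdisk", 
pwd.getpwnam("mapred").pw_uid); an empty list means every top-level entry 
belongs to that user.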

Once I saw all my task trackers were UP I started with my TTT ( trusted teragen 
and terasort :-)).

I ran teragen with a data size of 10GB (100M records). I have 500 nodes and I 
wanted 2000 files.
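
As a sanity check on those numbers, here is the arithmetic spelled out. This 
is a rough sketch that assumes TeraGen's fixed 100-byte rows, decimal 
gigabytes, and data spread evenly over the output files; the function name is 
only illustrative.

```python
ROW_BYTES = 100  # TeraGen writes fixed-size 100-byte rows


def teragen_layout(total_bytes, num_files, row_bytes=ROW_BYTES):
    """Return (total rows, bytes per file, rows per file) for an even split."""
    rows = total_bytes // row_bytes
    return rows, total_bytes // num_files, rows // num_files


# 10 GB over 2000 files: 100 million rows, about 5 MB and 50,000 rows per file
rows, bytes_per_file, rows_per_file = teragen_layout(10 * 10**9, 2000)
```

Under those assumptions each output file is only about 5 MB, so each map task 
has very little data to write.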

It takes 19 minutes to complete, which is awfully slow. There is no swapping 
and the CPU is not pegged, so things look OK. I ran it a couple of times and 
it takes about 15-19 minutes. It is not the same nodes either. But that could 
be a problem with something local. We will ignore it for now.

TeraSort never completes.

I get the following errors on the majority of the nodes.

org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for output/spill0.out
        at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:376)
        at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:146)
        at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:127)
        at org.apache.hadoop.mapred.MapOutputFile.getSpillFileForWrite(MapOutputFile.java:121)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1247)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1155)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:392)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
        at org.apache.hadoop.mapred.Child.main

I know this indicates a lack of space, but I have df monitoring the disk space 
on all the nodes and all nodes have loads of disk space available. The 
ramdisk is never more than 25% full.
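
One possible explanation, offered as a guess rather than a diagnosis: the 
allocator named in the stack trace does not ask whether the directory is full. 
As far as I can tell it only accepts a directory whose free space exceeds the 
size requested for the write, so a large enough spill request can be refused 
on a small ramdisk that df shows as mostly empty. The sketch below mimics that 
rule in Python; it is not Hadoop's code, and the function name, the 96 MB free 
figure and the request sizes are all made up for illustration.

```python
import shutil


def pick_local_dir(candidate_dirs, requested_bytes, free_bytes=None):
    """Return the first directory with more free space than requested, else None.

    free_bytes is an optional callable (path -> free bytes) so the rule can be
    exercised without touching a real file system.
    """
    if free_bytes is None:
        free_bytes = lambda d: shutil.disk_usage(d).free
    for d in candidate_dirs:
        if free_bytes(d) > requested_bytes:
            return d
    return None  # roughly where "Could not find any valid local directory" would arise


MB = 1024 * 1024
fake_free = {"/ramdisk1": 96 * MB}.get  # e.g. a 128 MB ramdisk that is 25% full
rejected = pick_local_dir(["/ramdisk1"], 100 * MB, fake_free)  # None
accepted = pick_local_dir(["/ramdisk1"], 10 * MB, fake_free)   # "/ramdisk1"
```

If this is what is happening, comparing the spill size a map task asks for 
against the free space on the ramdisk should show it.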

So once again - any clues?

Raj



From: Raj V rajv...@yahoo.com
To: common-user@hadoop.apache.org common-user@hadoop.apache.org
Sent: Monday, October 3, 2011 12:31 PM
Subject: Re: pointing mapred.local.dir to a ramdisk

Joey

Thanks. I will try to upgrade to a newer version and check. I will also change 
the logs to debug and see if more information is available.

Raj




From: Joey Echeverria j...@cloudera.com
To: common-user@hadoop.apache.org; Raj V rajv...@yahoo.com
Sent: Monday, October 3, 2011 11:49 AM
Subject: Re: pointing mapred.local.dir to a ramdisk

Raj,

I just tried this on my CDH3u1 VM, and the ramdisk worked the first
time. So, it's possible you've hit a bug in CDH3b3 that was later
fixed. Can you enable debug logging in log4j.properties and then
repost your task tracker log? I think it might print more details that
will be helpful.

-Joey


Fw: pointing mapred.local.dir to a ramdisk

2011-10-03 Thread Raj V
Sending it to the hadoop mailing list - I think this is a hadoop related 
problem and not related to Cloudera distribution.

Raj


- Forwarded Message -
From: Raj V rajv...@yahoo.com
To: CDH Users cdh-u...@cloudera.org
Sent: Friday, September 30, 2011 5:21 PM
Subject: pointing mapred.local.dir to a ramdisk


Hi all


I have been trying some experiments to improve performance. One of the 
experiments involved pointing mapred.local.dir to a RAM disk. To this end I 
created a 128MB RAM disk (each of my map outputs is smaller than this) but I 
have not been able to get the task tracker to start.


I am running CDH3B3 (hadoop-0.20.2+737) and here is the error message from the 
task tracker log.


Tasktracker logs


2011-09-30 16:50:00,689 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2011-09-30 16:50:00,930 INFO org.apache.hadoop.http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
2011-09-30 16:50:01,000 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50060
2011-09-30 16:50:01,023 INFO org.apache.hadoop.http.HttpServer: listener.getLocalPort() returned 50060 webServer.getConnectors()[0].getLocalPort() returned 50060
2011-09-30 16:50:01,024 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50060
2011-09-30 16:50:01,024 INFO org.mortbay.log: jetty-6.1.14
2011-09-30 16:50:02,388 INFO org.mortbay.log: Started SelectChannelConnector@0.0.0.0:50060
2011-09-30 16:50:02,400 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2011-09-30 16:50:02,422 INFO org.apache.hadoop.mapred.TaskTracker: Starting tasktracker with owner as mapred
2011-09-30 16:50:02,493 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.lang.NullPointerException
        at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:213)
        at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:157)
        at org.apache.hadoop.fs.RawLocalFileSystem.rename(RawLocalFileSystem.java:253)
        at org.apache.hadoop.fs.ChecksumFileSystem.rename(ChecksumFileSystem.java:404)
        at org.apache.hadoop.util.MRAsyncDiskService.moveAndDeleteRelativePath(MRAsyncDiskService.java:255)
        at org.apache.hadoop.util.MRAsyncDiskService.cleanupAllVolumes(MRAsyncDiskService.java:311)
        at org.apache.hadoop.mapred.TaskTracker.initialize(TaskTracker.java:618)
        at org.apache.hadoop.mapred.TaskTracker.init(TaskTracker.java:1351)
        at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3504)


2011-09-30 16:50:02,497 INFO org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:
/
SHUTDOWN_MSG: Shutting down TaskTracker at HADOOP52-4/10.52.1.5


and here is my mapred-site.xml file


<property>
    <name>mapred.local.dir</name>
    <value>/ramdisk1</value>
  </property>


If I have a regular directory on a regular drive such as below - it works. If 
I don't mount the ramdisk - it works.


<property>
    <name>mapred.local.dir</name>
    <value>/hadoop-dsk0/local,/hadoop-dsk1/local</value>
  </property>
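
To double check which directories the task tracker is actually being given, 
the property can be read back out of the config file. A minimal sketch in 
Python, assuming the usual Hadoop layout of property elements under a 
configuration root; the helper name and the sample text are made up for 
illustration.

```python
import xml.etree.ElementTree as ET

# Sample standing in for mapred-site.xml; the <configuration> wrapper is assumed
SAMPLE = """
<configuration>
  <property>
    <name>mapred.local.dir</name>
    <value>/hadoop-dsk0/local,/hadoop-dsk1/local</value>
  </property>
</configuration>
"""


def local_dirs_from_xml(xml_text, key="mapred.local.dir"):
    """Return the comma-separated directories configured for key, or []."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == key:
            value = prop.findtext("value", "")
            return [d.strip() for d in value.split(",") if d.strip()]
    return []


dirs = local_dirs_from_xml(SAMPLE)
```

Each returned path can then be checked for ownership, permissions and free 
space on the node.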





The NullPointerException does not tell me what the error is or how to fix it.


From the logs it looks like some disk based operation failed, but I can't guess 
which. I must also confess that this is the first time I am using an ext2 file 
system.


Any ideas?




Raj









Re: pointing mapred.local.dir to a ramdisk

2011-10-03 Thread Eric Caspole
Are you sure you have chown'd/chmod'd the ramdisk directory to be  
writeable by your hadoop user? I have played with this in the past  
and it should basically work.




Re: pointing mapred.local.dir to a ramdisk

2011-10-03 Thread Raj V
Eric

Yes. The owner is hdfs, the group is hadoop, and the directory is group 
writable (775). This is the exact same configuration I have when I use real 
disks. But let me give it a try again to see if I overlooked something.
Thanks

Raj



Re: pointing mapred.local.dir to a ramdisk

2011-10-03 Thread Vinod Kumar Vavilapalli
Must be related to some kind of permissions problems.

It will help if you can paste the corresponding source code for
FileUtil.copy(). Hard to track it with different versions, so.

Thanks,
+Vinod



Re: pointing mapred.local.dir to a ramdisk

2011-10-03 Thread Edward Capriolo
This directory can get very large; in many cases I doubt it would fit on a
RAM disk.

Also, RAM disks tend to help most with random read/write. Since Hadoop is
doing mostly linear IO, you may not see a great benefit from the RAM disk.




Re: pointing mapred.local.dir to a ramdisk

2011-10-03 Thread Raj V
Vinod

I carefully checked everything again. The permissions are 775 and the owner is 
hdfs:hadoop. The task tracker creates a directory called toBeDeleted under 
/ramdisk, so things do not seem to be permission related. The task tracker 
starts happily if I don't mount the ramdisk and leave everything else the same.



Raj





Re: pointing mapred.local.dir to a ramdisk

2011-10-03 Thread Raj V
Edward

I understand the size limitations, but for my experiment the ramdisk I have 
created is large enough. 
I think there will be substantial benefits from putting the intermediate map 
outputs on a ramdisk, size permitting of course, but I can't provide any 
numbers to substantiate my claim given that I can't get it to run.

-best regards

Raj




From: Edward Capriolo edlinuxg...@gmail.com
To: common-user@hadoop.apache.org
Cc: Raj V rajv...@yahoo.com
Sent: Monday, October 3, 2011 10:36 AM
Subject: Re: pointing mapred.local.dir to a ramdisk

This directory can get very large, in many cases I doubt it would fit on a
ram disk.

Also RAM Disks tend to help most with random read/write, since hadoop is
doing mostly linear IO you may not see a great benefit from the RAM disk.



On Mon, Oct 3, 2011 at 12:07 PM, Vinod Kumar Vavilapalli 
vino...@hortonworks.com wrote:

 Must be related to some kind of permissions problems.

 It will help if you can paste the corresponding source code for
 FileUtil.copy(). Hard to track it with different versions, so.

 Thanks,
 +Vinod


 On Mon, Oct 3, 2011 at 9:28 PM, Raj V rajv...@yahoo.com wrote:

  Eric
 
  Yes. The owner is hdfs and group is hadoop and the directory is group
  writable (775). This is the exact same configuration I have when I use
  real disks. But let me give it a try again to see if I overlooked something.
  Thanks
 
  Raj
 
  
  From: Eric Caspole eric.casp...@amd.com
  To: common-user@hadoop.apache.org
  Sent: Monday, October 3, 2011 8:44 AM
  Subject: Re: pointing mapred.local.dir to a ramdisk
  
  Are you sure you have chown'd/chmod'd the ramdisk directory to be
  writeable by your hadoop user? I have played with this in the past and it
  should basically work.
  
  
  On Oct 3, 2011, at 10:37 AM, Raj V wrote:
  
   Sending it to the hadoop mailing list - I think this is a hadoop-related
   problem and not related to the Cloudera distribution.
  
   Raj
  
  
   - Forwarded Message -
   From: Raj V rajv...@yahoo.com
   To: CDH Users cdh-u...@cloudera.org
   Sent: Friday, September 30, 2011 5:21 PM
   Subject: pointing mapred.local.dir to a ramdisk
  
  
   Hi all
  
  
   I have been trying some experiments to improve performance. One of the
   experiments involved pointing mapred.local.dir to a RAM disk. To this end I
   created a 128MB RAM disk (each of my map outputs is smaller than this), but
   I have not been able to get the task tracker to start.
  
  
   I am running CDH3B3 (hadoop-0.20.2+737) and here is the error message
   from the task tracker log.
  
  
   Tasktracker logs
  
  
   2011-09-30 16:50:00,689 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
   2011-09-30 16:50:00,930 INFO org.apache.hadoop.http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
   2011-09-30 16:50:01,000 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50060
   2011-09-30 16:50:01,023 INFO org.apache.hadoop.http.HttpServer: listener.getLocalPort() returned 50060 webServer.getConnectors()[0].getLocalPort() returned 50060
   2011-09-30 16:50:01,024 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50060
   2011-09-30 16:50:01,024 INFO org.mortbay.log: jetty-6.1.14
   2011-09-30 16:50:02,388 INFO org.mortbay.log: Started SelectChannelConnector@0.0.0.0:50060
   2011-09-30 16:50:02,400 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
   2011-09-30 16:50:02,422 INFO org.apache.hadoop.mapred.TaskTracker: Starting tasktracker with owner as mapred
   2011-09-30 16:50:02,493 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.lang.NullPointerException
           at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:213)
           at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:157)
           at org.apache.hadoop.fs.RawLocalFileSystem.rename(RawLocalFileSystem.java:253)
           at org.apache.hadoop.fs.ChecksumFileSystem.rename(ChecksumFileSystem.java:404)
           at org.apache.hadoop.util.MRAsyncDiskService.moveAndDeleteRelativePath(MRAsyncDiskService.java:255)
           at org.apache.hadoop.util.MRAsyncDiskService.cleanupAllVolumes(MRAsyncDiskService.java:311)
           at org.apache.hadoop.mapred.TaskTracker.initialize(TaskTracker.java:618)
           at org.apache.hadoop.mapred.TaskTracker.init(TaskTracker.java:1351)
           at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3504)


   2011-09-30 16:50:02,497 INFO org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:
   /
   SHUTDOWN_MSG: Shutting down TaskTracker at HADOOP52-4/10.52.1.5
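
The ownership check that Eric and Vinod ask about above can be sketched as follows. This is only an illustration under stated assumptions: the function name is hypothetical, and it compares numeric user IDs rather than reproducing whatever checks the TaskTracker itself performs. It inspects the directory and everything beneath it, since a single entry owned by another user is enough to break cleanup of the whole tree.

```python
import os


def entries_not_owned_by(root, uid):
    """Return every path at or below `root` whose owner is not `uid`.

    Subdirectories that cannot be read are still reported themselves
    (they are listed by their parent), but os.walk silently skips
    descending into them.
    """
    mismatched = []
    if os.lstat(root).st_uid != uid:
        mismatched.append(root)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # lstat so that symlinks are checked themselves, not their targets
            if os.lstat(path).st_uid != uid:
                mismatched.append(path)
    return mismatched


if __name__ == "__main__":
    # Example: list anything under the current directory not owned by the caller.
    for path in entries_not_owned_by(".", os.getuid()):
        print(path)
```

The user to compare against should be the one the TaskTracker actually runs as ("mapred" according to the log above), which may differ from the owner of the directory on the regular disks.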

Re: pointing mapred.local.dir to a ramdisk

2011-10-03 Thread Joey Echeverria
Raj,

I just tried this on my CDH3u1 VM, and the ramdisk worked the first
time. So, it's possible you've hit a bug in CDH3b3 that was later
fixed. Can you enable debug logging in log4j.properties and then
repost your task tracker log? I think there might be more details that
it will print that will be helpful.

-Joey

On Mon, Oct 3, 2011 at 2:18 PM, Raj V rajv...@yahoo.com wrote:
 Edward

 I understand the size limitations, but for my experiment the ramdisk I have
 created is large enough.
 I think there will be substantial benefits from putting the intermediate map
 outputs on a ramdisk (size permitting, of course), but I can't provide any
 numbers to substantiate my claim given that I can't get it to run.

 -best regards

 Raj





Re: pointing mapred.local.dir to a ramdisk

2011-10-03 Thread Raj V
Joey

Thanks. I will try to upgrade to a newer version and check. I will also change
the log level to debug and see if more information is available.

Raj




From: Joey Echeverria j...@cloudera.com
To: common-user@hadoop.apache.org; Raj V rajv...@yahoo.com
Sent: Monday, October 3, 2011 11:49 AM
Subject: Re: pointing mapred.local.dir to a ramdisk

Raj,

I just tried this on my CDH3u1 VM, and the ramdisk worked the first
time. So, it's possible you've hit a bug in CDH3b3 that was later
fixed. Can you enable debug logging in log4j.properties and then
repost your task tracker log? I think there might be more details that
it will print that will be helpful.

-Joey
