Re: java.io.IOException: Could not get block locations. Aborting...

2009-02-09 Thread Scott Whitecross
I tried modifying the settings, and I'm still running into the same
issue.  I increased the xceivers count (dfs.datanode.max.xcievers) in
the hadoop-site.xml file.  I also checked to make sure the file
handle limits were increased, but they were fairly high to begin with.
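
For reference, that entry takes the usual hadoop-site.xml property form; the value below is only an illustration (it matches the 8000 Bryan mentions below in this thread), not a recommendation for any particular cluster:

  <property>
    <name>dfs.datanode.max.xcievers</name>
    <value>8000</value>
  </property>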


I don't think I'm dealing with anything out of the ordinary either.
I'm processing three large 'log' files, totaling around 5 GB, and
producing around 8000 output files after some data processing,
probably totaling 6 or 7 GB.  In the past, I've produced far fewer
files, and that has been fine.  When I change the process to output to
just a few files, there's no problem either.


Anything else beyond the limits?  Is HDFS creating a substantial
number of temp files as well?







On Feb 9, 2009, at 8:11 PM, Bryan Duxbury wrote:


Correct.

+1 to Jason's more unix file handles suggestion. That's a must-have.

-Bryan

On Feb 9, 2009, at 3:09 PM, Scott Whitecross wrote:

This would be an addition to the hadoop-site.xml file, to up  
dfs.datanode.max.xcievers?


Thanks.



On Feb 9, 2009, at 5:54 PM, Bryan Duxbury wrote:

Small files are bad for Hadoop.  You should avoid keeping a lot of
small files if possible.


That said, that error is something I've seen a lot. It usually  
happens when the number of xcievers hasn't been adjusted upwards  
from the default of 256. We run with 8000 xcievers, and that seems  
to solve our problems. I think that if you have a lot of open  
files, this problem happens a lot faster.


-Bryan

On Feb 9, 2009, at 1:01 PM, Scott Whitecross wrote:


Hi all -

I've been running into this error the past few days:
java.io.IOException: Could not get block locations. Aborting...
	at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2143)
	at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1400(DFSClient.java:1735)
	at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1889)


It seems to be related to trying to write too many files to HDFS.
I have a class extending
org.apache.hadoop.mapred.lib.MultipleOutputFormat, and if I output
to a few file names, everything works.  However, if I output to
thousands of small files, the above error occurs.  I'm having
trouble isolating the problem, as it unfortunately doesn't occur in
the debugger.


Is this a memory issue, or is there an upper limit to the number  
of files HDFS can hold?  Any settings to adjust?


Thanks.












Re: java.io.IOException: Could not get block locations. Aborting...

2009-02-09 Thread Scott Whitecross
This would be an addition to the hadoop-site.xml file, to up  
dfs.datanode.max.xcievers?


Thanks.



On Feb 9, 2009, at 5:54 PM, Bryan Duxbury wrote:

Small files are bad for Hadoop.  You should avoid keeping a lot of
small files if possible.


That said, that error is something I've seen a lot. It usually  
happens when the number of xcievers hasn't been adjusted upwards  
from the default of 256. We run with 8000 xcievers, and that seems  
to solve our problems. I think that if you have a lot of open files,  
this problem happens a lot faster.


-Bryan

On Feb 9, 2009, at 1:01 PM, Scott Whitecross wrote:


Hi all -

I've been running into this error the past few days:
java.io.IOException: Could not get block locations. Aborting...
	at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2143)
	at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1400(DFSClient.java:1735)
	at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1889)


It seems to be related to trying to write too many files to HDFS.  I
have a class extending
org.apache.hadoop.mapred.lib.MultipleOutputFormat, and if I output
to a few file names, everything works.  However, if I output to
thousands of small files, the above error occurs.  I'm having
trouble isolating the problem, as it unfortunately doesn't occur in
the debugger.


Is this a memory issue, or is there an upper limit to the number of  
files HDFS can hold?  Any settings to adjust?


Thanks.







java.io.IOException: Could not get block locations. Aborting...

2009-02-09 Thread Scott Whitecross

Hi all -

I've been running into this error the past few days:
java.io.IOException: Could not get block locations. Aborting...
	at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2143)
	at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1400(DFSClient.java:1735)
	at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1889)


It seems to be related to trying to write too many files to HDFS.  I
have a class extending
org.apache.hadoop.mapred.lib.MultipleOutputFormat, and if I output to a
few file names, everything works.  However, if I output to thousands
of small files, the above error occurs.  I'm having trouble isolating
the problem, as it unfortunately doesn't occur in the debugger.
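
For anyone trying to reproduce this, the kind of output format being described is roughly the following -- a minimal sketch using the MultipleTextOutputFormat convenience subclass; the class name and the key-to-path mapping are made up for illustration:

  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapred.lib.MultipleTextOutputFormat;

  // Hypothetical example: each record is routed to a file named after its
  // key, so a job with thousands of distinct keys opens thousands of files.
  public class KeyedOutputFormat extends MultipleTextOutputFormat<Text, Text> {
      @Override
      protected String generateFileNameForKeyValue(Text key, Text value, String name) {
          // "name" is the default part file name (e.g. part-00000); returning
          // a key-based path creates one output file per distinct key.
          return key.toString() + "/" + name;
      }
  }

Each file a task writes keeps an HDFS output stream (and a datanode xceiver) open, which is the kind of per-file resource the replies above point at.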


Is this a memory issue, or is there an upper limit to the number of  
files HDFS can hold?  Any settings to adjust?


Thanks.

Best practice for using third party libraries in MapReduce Jobs?

2008-12-03 Thread Scott Whitecross
What's the best way to use third-party libraries with Hadoop?  For
example, I want to run a job with both a jar file containing the job,
and also extra libraries.  I found a couple of solutions by searching,
but I'm hoping for something better:


- Merge the third-party jar libraries into the job jar
- Distribute the third-party libraries across the cluster on the local
boxes' classpath.


What I'd really like is a way to add an extra option to the hadoop jar  
command, such as hadoop/bin/hadoop jar myJar.jar myJobClass -classpath  
thirdpartyjar1.jar:jar2.jar:etc  args


Does anything like this exist?
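
One thing that looks close (treat this as a sketch -- I'm not certain it applies to every Hadoop version) is the generic -libjars option, which is honored when the job driver runs through ToolRunner, e.g.
bin/hadoop jar myJar.jar MyJobDriver -libjars thirdparty1.jar,thirdparty2.jar <args>
The driver class below is purely illustrative:

  import org.apache.hadoop.conf.Configured;
  import org.apache.hadoop.mapred.JobClient;
  import org.apache.hadoop.mapred.JobConf;
  import org.apache.hadoop.util.Tool;
  import org.apache.hadoop.util.ToolRunner;

  // Hypothetical driver: ToolRunner parses generic options (such as -libjars)
  // before run() sees the remaining application arguments.
  public class MyJobDriver extends Configured implements Tool {
      public int run(String[] args) throws Exception {
          JobConf conf = new JobConf(getConf(), MyJobDriver.class);
          // ... set mapper/reducer classes and input/output paths from args ...
          JobClient.runJob(conf);
          return 0;
      }

      public static void main(String[] args) throws Exception {
          System.exit(ToolRunner.run(new MyJobDriver(), args));
      }
  }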


Re: Hadoop+log4j

2008-11-23 Thread Scott Whitecross

Thanks Brian.  So you have had luck w/ log4j?

I haven't tried local mode.  I will try it tonight and see how it goes
for quick debugging.  More so, I wanted to be able to easily log and
watch events on a cluster, rather than digging through all the Hadoop
logging levels.  I've also read that you can attach scripts as well?



On Nov 23, 2008, at 10:18 PM, Brian Bockelman wrote:


Hey Scott,

I see nothing wrong offhand; have you tried to run in "local" mode?   
It'd be quicker to debug logging problems that way, as any bad  
misconfigurations (I think) should get printed out to stderr.


Brian

On Nov 23, 2008, at 9:01 PM, Scott Whitecross wrote:

Thanks Brian.  I've played w/ the log4j.properties a bit, and
haven't had any luck.  Can you share how you've set up log4j?  I am
probably missing the obvious, but here is what I set up:


log4j.logger.com.mycompany.hadoop=DEBUG,DX,console
log4j.appender.DX=org.apache.log4j.DailyRollingFileAppender
log4j.appender.DX.File=/opt/hadoop/myLogs/test.log
log4j.appender.DX.layout=org.apache.log4j.PatternLayout
log4j.appender.DX.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n

Within my map class:
private static final Logger logger = Logger.getLogger(MyMap.class);

and to log:

logger.info("entering map");

I also tried creating a logger in the code and attaching appenders
without luck.


Thanks.



On Nov 23, 2008, at 9:15 PM, Brian Bockelman wrote:


Hey Scott,

Have you tried configuring things from

$HADOOP_HOME/conf/log4j.properties

?

I'd just use my own logger and set up a separate syslog server.   
It's not an extremely elaborate setup (certainly, would quickly  
become a headache on a large cluster...), but it should be pretty  
easy to set up.
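
(A minimal sketch of what that could look like in log4j.properties, purely as an illustration -- the host name and facility here are placeholders, not anything from an actual setup:

  log4j.logger.com.mycompany.hadoop=INFO,SYSLOG
  log4j.appender.SYSLOG=org.apache.log4j.net.SyslogAppender
  log4j.appender.SYSLOG.SyslogHost=loghost.example.com
  log4j.appender.SYSLOG.Facility=LOCAL0
  log4j.appender.SYSLOG.layout=org.apache.log4j.PatternLayout
  log4j.appender.SYSLOG.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
)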


Brian

On Nov 23, 2008, at 7:58 PM, Scott Whitecross wrote:

I've looked around for a while, but it seems there isn't a way to  
log from Hadoop, without going through the logs/userlogs/ and the  
'attempt' directories?  That would mean that for logging I'm  
restricted to writing to System.out and System.err, then  
collecting via scripts?


Thanks.





On Nov 11, 2008, at 9:53 PM, Alex Loddengaard wrote:


Have you seen this:
<http://wiki.apache.org/hadoop/HowToDebugMapReducePrograms>

Alex

On Tue, Nov 11, 2008 at 6:03 PM, ZhiHong Fu <[EMAIL PROTECTED]>  
wrote:



Hello,

I'm very sorry to trouble you.  I'm developing a MapReduce
application, and I can get Log.INFO output in my InputFormat, but in
the Mapper or Reducer I can't get anything.  Now an error has occurred
in the reduce stage.  Because the code is a little complicated, I
can't find the mistake just from the thrown exception.  I want to use
log4j to log the intermediate info, but I have tried for a whole day
and still can't get anything.  Who can help me?  Thanks very much.

ddream.











Re: Hadoop+log4j

2008-11-23 Thread Scott Whitecross
Thanks Brian.  I've played w/ the log4j.properties a bit, and haven't
had any luck.  Can you share how you've set up log4j?  I am probably
missing the obvious, but here is what I set up:


log4j.logger.com.mycompany.hadoop=DEBUG,DX,console
log4j.appender.DX=org.apache.log4j.DailyRollingFileAppender
log4j.appender.DX.File=/opt/hadoop/myLogs/test.log
log4j.appender.DX.layout=org.apache.log4j.PatternLayout
log4j.appender.DX.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n

Within my map class:
private static final Logger logger = Logger.getLogger(MyMap.class);

and to log:

logger.info("entering map");

I also tried creating a logger in the code and attaching appenders
without luck.
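
For reference, an untested, minimal sketch of the other route -- logging through Commons Logging, the same facade Hadoop itself uses, so messages should land in the per-attempt syslog under logs/userlogs/ with the stock conf/log4j.properties.  The mapper below is purely illustrative:

  import java.io.IOException;
  import org.apache.commons.logging.Log;
  import org.apache.commons.logging.LogFactory;
  import org.apache.hadoop.io.LongWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapred.MapReduceBase;
  import org.apache.hadoop.mapred.Mapper;
  import org.apache.hadoop.mapred.OutputCollector;
  import org.apache.hadoop.mapred.Reporter;

  // Hypothetical mapper: LOG.info() goes through Hadoop's own logging setup,
  // so it lands in the task attempt's syslog file without a custom appender.
  public class MyMap extends MapReduceBase
          implements Mapper<LongWritable, Text, Text, Text> {
      private static final Log LOG = LogFactory.getLog(MyMap.class);

      public void map(LongWritable key, Text value,
                      OutputCollector<Text, Text> output, Reporter reporter)
              throws IOException {
          LOG.info("entering map");
          // ... actual map logic would go here ...
      }
  }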


Thanks.



On Nov 23, 2008, at 9:15 PM, Brian Bockelman wrote:


Hey Scott,

Have you tried configuring things from

$HADOOP_HOME/conf/log4j.properties

?

I'd just use my own logger and set up a separate syslog server.   
It's not an extremely elaborate setup (certainly, would quickly  
become a headache on a large cluster...), but it should be pretty  
easy to set up.


Brian

On Nov 23, 2008, at 7:58 PM, Scott Whitecross wrote:

I've looked around for a while, but it seems there isn't a way to  
log from Hadoop, without going through the logs/userlogs/ and the  
'attempt' directories?  That would mean that for logging I'm  
restricted to writing to System.out and System.err, then collecting  
via scripts?


Thanks.





On Nov 11, 2008, at 9:53 PM, Alex Loddengaard wrote:


Have you seen this:
<http://wiki.apache.org/hadoop/HowToDebugMapReducePrograms>

Alex

On Tue, Nov 11, 2008 at 6:03 PM, ZhiHong Fu <[EMAIL PROTECTED]>  
wrote:



Hello,

I'm very sorry to trouble you.  I'm developing a MapReduce
application, and I can get Log.INFO output in my InputFormat, but in
the Mapper or Reducer I can't get anything.  Now an error has occurred
in the reduce stage.  Because the code is a little complicated, I
can't find the mistake just from the thrown exception.  I want to use
log4j to log the intermediate info, but I have tried for a whole day
and still can't get anything.  Who can help me?  Thanks very much.

ddream.








Re: Hadoop+log4j

2008-11-23 Thread Scott Whitecross
I've looked around for a while, but it seems there isn't a way to log  
from Hadoop, without going through the logs/userlogs/ and the  
'attempt' directories?  That would mean that for logging I'm  
restricted to writing to System.out and System.err, then collecting  
via scripts?


Thanks.





On Nov 11, 2008, at 9:53 PM, Alex Loddengaard wrote:


Have you seen this:
<http://wiki.apache.org/hadoop/HowToDebugMapReducePrograms>

Alex

On Tue, Nov 11, 2008 at 6:03 PM, ZhiHong Fu <[EMAIL PROTECTED]>  
wrote:



Hello,

I'm very sorry to trouble you.  I'm developing a MapReduce
application, and I can get Log.INFO output in my InputFormat, but in
the Mapper or Reducer I can't get anything.  Now an error has occurred
in the reduce stage.  Because the code is a little complicated, I
can't find the mistake just from the thrown exception.  I want to use
log4j to log the intermediate info, but I have tried for a whole day
and still can't get anything.  Who can help me?  Thanks very much.

ddream.





Re: Debugging / Logging in Hadoop?

2008-10-30 Thread Scott Whitecross
Is the presentation online as well?  (Hard to see some of the slides  
in the video)


On Oct 30, 2008, at 1:34 PM, Alex Loddengaard wrote:

Arun gave a great talk about debugging and tuning at the Rapleaf  
event.

Take a look:
<http://www.vimeo.com/2085477>

Alex

On Thu, Oct 30, 2008 at 6:20 AM, Malcolm Matalka <
[EMAIL PROTECTED]> wrote:

I'm not sure of the correct way, but when I need to log a job I have it
print out with some unique identifier and then just do:

for i in <list of slave boxes>; do ssh $i 'grep -R PREFIX path/to/logs'; done > results

It works well in a pinch

-Original Message-
From: Scott Whitecross [mailto:[EMAIL PROTECTED]
Sent: Wednesday, October 29, 2008 22:14
To: core-user@hadoop.apache.org
Subject: Debugging / Logging in Hadoop?

I'm curious what the best method is for debugging and logging in
Hadoop.  I put together a small cluster today and a simple application
to process log files.  While it worked well, I had trouble trying to
get logging information out.  Is there any way to attach a debugger,
or get log4j to write a log file?  I tried setting up a Logger in the
class I used for the map/reduce, but I had no luck.

Thanks.







Re: Frustrated with Cluster Setup: Reduce Tasks Stop at 16% - could not find taskTracker/jobcache...

2008-10-30 Thread Scott Whitecross
Thanks for the answer.  It looks like the values are set up correctly.
I see the /mapred/local directory created successfully as well.  Do I
need to explicitly define a value in hadoop-site.xml?



hadoop-default.xml:

<property>
  <name>mapred.local.dir</name>
  <value>${hadoop.tmp.dir}/mapred/local</value>
</property>

and in hadoop-site.xml:

<property>
  <name>hadoop.tmp.dir</name>
  <value>/opt/hadoop-datastore</value>
  <description>A base for other temporary directories.</description>
</property>


On Oct 30, 2008, at 2:40 PM, pvvpr wrote:


Hi Scott,
Your reducer classes are unable to read the map outputs.  You may check
the "mapred.local.dir" property in your conf/hadoop-default.xml and
conf/hadoop-site.xml.  These directories should be valid directories on
your slaves.  You can give multiple folders as comma-separated values.


- Prasad.
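
(A minimal sketch of what such an explicit hadoop-site.xml entry might look like -- the paths are only illustrative, and whatever is listed has to exist and be writable on every slave:

<property>
  <name>mapred.local.dir</name>
  <value>/opt/hadoop-datastore/mapred/local,/opt/hadoop-datastore2/mapred/local</value>
</property>
)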


On Thursday 30 October 2008 11:32:02 pm Scott Whitecross wrote:

So it's not just at 16%, but depends on the task:

2008-10-30 13:58:29,702 INFO org.apache.hadoop.mapred.TaskTracker:
attempt_200810301345_0001_r_00_0 0.25675678% reduce > copy (57 of 74 at 13.58 MB/s) >

2008-10-30 13:58:29,357 WARN org.apache.hadoop.mapred.TaskTracker:
getMapOutput(attempt_200810301345_0001_m_48_0,0) failed :
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find
taskTracker/jobcache/job_200810301345_0001/attempt_200810301345_0001_m_48_0/output/file.out.index
in any of the configured local directories
	at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:359)
	at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:138)
	at org.apache.hadoop.mapred.TaskTracker$MapOutputServlet.doGet(TaskTracker.java:2402)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:689)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)

I'm out of thoughts on what the problem could be..

On Oct 30, 2008, at 12:35 PM, Scott Whitecross wrote:

I'm growing very frustrated with a simple cluster setup.  I can get
the cluster set up on two machines, but have trouble when trying to
extend the installation to 3 or more boxes.  I keep seeing the below
errors.  It seems the reduce tasks can't get access to the data.

I can't seem to figure out how to fix this error.  What amazes me
is that file-not-found issues appear on the master box, as well as
the slaves.  What causes the reduce tasks to not find the
information via localhost?

Setup/Errors:

My basic setup comes from:
http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_(Multi-Node_Cluster)
(Michael Noll's setup).  I've put the following in my /etc/hosts file:

127.0.0.1   localhost
10.1.1.12   master
10.1.1.10   slave
10.1.1.13   slave1

And I have set up transparent (passwordless) ssh to all boxes (and it
works).  All boxes can see each other, etc.

My base level hadoop-site.xml is:

<?xml version="1.0"?>
<configuration>

  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/hadoop-datastore</value>
  </property>

  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:54310</value>
  </property>

  <property>
    <name>mapred.job.tracker</name>
    <value>master:54311</value>
  </property>

  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>

</configuration>


Errors:

WARN org.apache.hadoop.mapred.TaskTracker:
getMapOutput(attempt_200810301206_0004_m_01_0,0) failed :
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find
taskTracker/jobcache/job_200810301206_0004/attempt_200810301206_0004_m_01_0/output/file.out.index
in any of the configured local directories
	at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:359)
	at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:138)...

and in the userlog of the attempt:

2008-10-30 12:28:00,806 WARN org.apache.hadoop.mapred.ReduceTask:
java.io.FileNotFoundException:
http://localhost:50060/mapOutput?job=job_200810301206_0004&map=attempt_200810301206_0004_m_01_0&reduce=0
	at sun.reflect.GeneratedConstructorAccessor3.newInstance(Unknown Source)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)











Re: Frustrated with Cluster Setup: Reduce Tasks Stop at 16% - could not find taskTracker/jobcache...

2008-10-30 Thread Scott Whitecross

So it's not just at 16%, but depends on the task:

2008-10-30 13:58:29,702 INFO org.apache.hadoop.mapred.TaskTracker:
attempt_200810301345_0001_r_00_0 0.25675678% reduce > copy (57 of 74 at 13.58 MB/s) >

2008-10-30 13:58:29,357 WARN org.apache.hadoop.mapred.TaskTracker:
getMapOutput(attempt_200810301345_0001_m_48_0,0) failed :
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find
taskTracker/jobcache/job_200810301345_0001/attempt_200810301345_0001_m_48_0/output/file.out.index
in any of the configured local directories
	at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:359)
	at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:138)
	at org.apache.hadoop.mapred.TaskTracker$MapOutputServlet.doGet(TaskTracker.java:2402)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:689)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)

I'm out of thoughts on what the problem could be..


On Oct 30, 2008, at 12:35 PM, Scott Whitecross wrote:

I'm growing very frustrated with a simple cluster setup.  I can get
the cluster set up on two machines, but have trouble when trying to
extend the installation to 3 or more boxes.  I keep seeing the below
errors.  It seems the reduce tasks can't get access to the data.

I can't seem to figure out how to fix this error.  What amazes me
is that file-not-found issues appear on the master box, as well as
the slaves.  What causes the reduce tasks to not find the
information via localhost?


Setup/Errors:

My basic setup comes from:
http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_(Multi-Node_Cluster)
(Michael Noll's setup).  I've put the following in my /etc/hosts file:


127.0.0.1   localhost
10.1.1.12   master
10.1.1.10   slave
10.1.1.13   slave1

And I have set up transparent (passwordless) ssh to all boxes (and it
works).  All boxes can see each other, etc.


My base level hadoop-site.xml is:

<?xml version="1.0"?>
<configuration>

  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/hadoop-datastore</value>
  </property>

  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:54310</value>
  </property>

  <property>
    <name>mapred.job.tracker</name>
    <value>master:54311</value>
  </property>

  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>

</configuration>



Errors:

WARN org.apache.hadoop.mapred.TaskTracker:
getMapOutput(attempt_200810301206_0004_m_01_0,0) failed :
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find
taskTracker/jobcache/job_200810301206_0004/attempt_200810301206_0004_m_01_0/output/file.out.index
in any of the configured local directories
	at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:359)
	at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:138)...


and in the userlog of the attempt:

2008-10-30 12:28:00,806 WARN org.apache.hadoop.mapred.ReduceTask:
java.io.FileNotFoundException:
http://localhost:50060/mapOutput?job=job_200810301206_0004&map=attempt_200810301206_0004_m_01_0&reduce=0
	at sun.reflect.GeneratedConstructorAccessor3.newInstance(Unknown Source)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)






Frustrated with Cluster Setup: Reduce Tasks Stop at 16%

2008-10-30 Thread Scott Whitecross
I'm growing very frustrated with a simple cluster setup.  I can get
the cluster set up on two machines, but have trouble when trying to
extend the installation to 3 or more boxes.  I keep seeing the below
errors.  It seems the reduce tasks can't get access to the data.

I can't seem to figure out how to fix this error.  What amazes me
is that file-not-found issues appear on the master box, as well as
the slaves.  What causes the reduce tasks to not find the
information via localhost?


Setup/Errors:

My basic setup comes from:
http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_(Multi-Node_Cluster)
(Michael Noll's setup).  I've put the following in my /etc/hosts file:


127.0.0.1   localhost
10.1.1.12   master
10.1.1.10   slave
10.1.1.13   slave1

And I have set up transparent (passwordless) ssh to all boxes (and it
works).  All boxes can see each other, etc.


My base level hadoop-site.xml is:

<?xml version="1.0"?>
<configuration>

  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/hadoop-datastore</value>
  </property>

  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:54310</value>
  </property>

  <property>
    <name>mapred.job.tracker</name>
    <value>master:54311</value>
  </property>

  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>

</configuration>



Errors:

WARN org.apache.hadoop.mapred.TaskTracker:
getMapOutput(attempt_200810301206_0004_m_01_0,0) failed :
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find
taskTracker/jobcache/job_200810301206_0004/attempt_200810301206_0004_m_01_0/output/file.out.index
in any of the configured local directories
	at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:359)
	at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:138)...


and in the userlog of the attempt:

2008-10-30 12:28:00,806 WARN org.apache.hadoop.mapred.ReduceTask:
java.io.FileNotFoundException: http://localhost:50060/mapOutput?job=job_200810301206_0004&map=attempt_200810301206_0004_m_01_0&reduce=0
	at sun.reflect.GeneratedConstructorAccessor3.newInstance(Unknown Source)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)




Debugging / Logging in Hadoop?

2008-10-29 Thread Scott Whitecross
I'm curious what the best method is for debugging and logging in
Hadoop.  I put together a small cluster today and a simple application
to process log files.  While it worked well, I had trouble trying to
get logging information out.  Is there any way to attach a debugger,
or get log4j to write a log file?  I tried setting up a Logger in the
class I used for the map/reduce, but I had no luck.
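
One thing that might help with the debugger question (a sketch only -- not verified on this cluster, and the property names are the pre-0.20 ones): forcing the job into local mode makes the whole job run inside the single client JVM via LocalJobRunner, so an ordinary IDE debugger can hit breakpoints in the map/reduce classes.  "MyJob" below is a placeholder class name:

  // Hypothetical driver snippet: run the job locally for debugging.
  JobConf conf = new JobConf(MyJob.class);
  conf.set("mapred.job.tracker", "local");   // use LocalJobRunner, no cluster
  conf.set("fs.default.name", "file:///");   // read/write the local filesystem
  // ... set mapper/reducer, input and output paths, then JobClient.runJob(conf)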


Thanks.