Re: java.io.IOException: Could not get block locations. Aborting...

2009-02-10 Thread Wu Wei
We hit the same problem as you when using MultipleOutputFormat, on both
hadoop 0.18 and 0.19. On hadoop 0.18, increasing the xceivers count did
not fix the problem. On hadoop 0.19, however, the datanode logs contained
many error messages complaining that xceiverCount exceeded the limit of
concurrent xcievers, and after we increased the xceivers count the
problem was gone.


I guess you are using hadoop 0.18. Please try 0.19.

Good luck.
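
For reference, those datanode complaints can be surfaced with something
like the following (a sketch: the log path assumes the stock layout, and
the exact message text varies by version):

    # look for xceiver-limit errors in the datanode logs (path illustrative)
    grep "exceeds the limit of concurrent xcievers" logs/hadoop-*-datanode-*.log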


Re: java.io.IOException: Could not get block locations. Aborting...

2009-02-09 Thread jason hadoop
You will have to increase the per-user file descriptor limit.
For most linux machines the file /etc/security/limits.conf controls this on
a per-user basis.
You will need to log in to a fresh shell session after making the changes to
see them; any login shells started before the change, and any processes
started by those shells, will keep the old limits.
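
A sketch of the limits.conf entries, assuming the jobs run as a user named
hadoop (both the user name and the 65536 value are illustrative, not from
this thread):

    # /etc/security/limits.conf -- per-user file descriptor limits
    hadoop  soft  nofile  65536
    hadoop  hard  nofile  65536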

If you are opening vast numbers of files you may also need to increase the
per-system limit, via the /etc/sysctl.conf file and the fs.file-max parameter.
This page seems to be a decent reference:
http://bloggerdigest.blogspot.com/2006/10/file-descriptors-vs-linux-performance.html
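
For the system-wide side, a hedged example (the value is illustrative):

    # /etc/sysctl.conf -- system-wide file descriptor ceiling
    fs.file-max = 327680

Apply it with sysctl -p, and verify the per-user limit from a fresh shell
with ulimit -n.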


On Mon, Feb 9, 2009 at 1:01 PM, Scott Whitecross sc...@dataxu.com wrote:

 Hi all -

 I've been running into this error the past few days:
 java.io.IOException: Could not get block locations. Aborting...
     at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2143)
     at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1400(DFSClient.java:1735)
     at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1889)

 It seems to be related to writing many files to HDFS.  I have a class
 extending org.apache.hadoop.mapred.lib.MultipleOutputFormat, and if I
 output to a few file names, everything works.  However, if I output to
 thousands of small files, the above error occurs.  I'm having trouble
 isolating the problem, as it unfortunately doesn't occur in the debugger.

 Is this a memory issue, or is there an upper limit to the number of files
 HDFS can hold?  Any settings to adjust?

 Thanks.


Re: java.io.IOException: Could not get block locations. Aborting...

2009-02-09 Thread Bryan Duxbury
Small files are bad for hadoop. You should avoid keeping a lot of  
small files if possible.


That said, that error is something I've seen a lot. It usually  
happens when the number of xcievers hasn't been adjusted upwards from  
the default of 256. We run with 8000 xcievers, and that seems to  
solve our problems. I think that if you have a lot of open files,  
this problem happens a lot faster.


-Bryan
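
For reference, the corresponding hadoop-site.xml entry would look roughly
like this (the property name keeps the project's historical misspelling;
8000 is the value Bryan cites, and the datanodes need a restart to pick
the change up):

    <property>
      <name>dfs.datanode.max.xcievers</name>
      <value>8000</value>
    </property>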


Re: java.io.IOException: Could not get block locations. Aborting...

2009-02-09 Thread Scott Whitecross
This would be an addition to the hadoop-site.xml file, to up  
dfs.datanode.max.xcievers?


Thanks.



Re: java.io.IOException: Could not get block locations. Aborting...

2009-02-09 Thread Bryan Duxbury

Correct.

+1 to Jason's more unix file handles suggestion. That's a must-have.

-Bryan

Re: java.io.IOException: Could not get block locations. Aborting...

2009-02-09 Thread Brian Bockelman


On Feb 9, 2009, at 7:50 PM, jason hadoop wrote:

The other issue you may run into with many files in your HDFS is that you
may end up with more than a few 100k worth of blocks on each of your
datanodes. At present this can lead to instability due to the way the
periodic block reports to the namenode are handled. The more blocks per
datanode, the larger the risk of congestion collapse in your hdfs.


Of course, if you stay below, say, 500k, you don't have much of a risk  
of congestion.


In our experience, 500k blocks or less is going to be fine with decent  
hardware.  Between 500k and 750k, you will hit a wall somewhere  
depending on your hardware.  Good luck getting anything above 750k.


The recommendation is that you keep this number as low as possible --  
and explore the limits of your system and hardware in testing before  
you discover them in production :)


Brian
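
A rough way to gauge where a cluster stands is to divide the total block
count by the number of datanodes (a sketch using that era's CLI; the exact
report wording varies by version):

    # total block count in the filesystem
    hadoop fsck / | grep -i 'Total blocks'
    # number of live datanodes
    hadoop dfsadmin -report | grep -i 'Datanodes available'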




Re: java.io.IOException: Could not get block locations. Aborting...

2009-02-09 Thread Scott Whitecross
I tried modifying the settings, and I'm still running into the same
issue.  I increased the xceivers count (dfs.datanode.max.xcievers) in
the hadoop-site.xml file.  I also checked to make sure the file
handles were increased, but they were fairly high to begin with.


I don't think I'm dealing with anything out of the ordinary either.
I'm processing three large 'log' files, totaling around 5 GB, and
producing around 8000 output files after some data processing,
probably totaling 6 or 7 GB.  In the past, I've produced a lot fewer
files, and that has been fine.  When I change the process to output to
just a few files, there's no problem.


Anything else beyond the limits?  Is HDFS creating a substantial
number of temp files as well?







Re: java.io.IOException: Could not get block locations. Aborting...

2008-08-12 Thread lohit
Piotr Kozikowski wrote:
The "Could not get block locations" exception was gone after a Hadoop
restart, but further down the road our job failed again. I checked the
logs for "discarding calls" and found a bunch of them, plus the namenode
appeared to have a load spike at that time, so it seems it is getting
overloaded. Do you know how we can prevent this? Currently the namenode
machine is not running anything but the namenode and the secondary
namenode, and the cluster only has 16 machines.

Typically the secondary namenode should run on a different machine; it
requires the same amount of resources as the namenode.
So if you have, say, an 8G RAM node and your namenode is taking 2-3G of
space, your secondary namenode would take up about as much again.
To cross-check, see whether the load spike on the namenode happened while
the secondary namenode was checkpointing (by looking at the secondary
namenode logs).

Thanks,
Lohit
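
One way to line the two up, assuming the stock log naming convention (the
exact checkpoint message wording varies by version):

    # find checkpoint activity in the secondary namenode log
    grep -i checkpoint logs/hadoop-*-secondarynamenode-*.log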




Re: java.io.IOException: Could not get block locations. Aborting...

2008-08-11 Thread Piotr Kozikowski
Hi again,

The "Could not get block locations" exception was gone after a Hadoop
restart, but further down the road our job failed again. I checked the
logs for "discarding calls" and found a bunch of them, plus the namenode
appeared to have a load spike at that time, so it seems it is getting
overloaded. Do you know how we can prevent this? Currently the namenode
machine is not running anything but the namenode and the secondary
namenode, and the cluster only has 16 machines.

Thank you

Piotr



Re: java.io.IOException: Could not get block locations. Aborting...

2008-08-08 Thread Steve Loughran

Piotr Kozikowski wrote:

Hi there:

We would like to know the most likely causes of this sort of error:

Exception closing file /data1/hdfs/tmp/person_url_pipe_59984_3405334/_temporary/_task_200807311534_0055_m_22_0/part-00022
java.io.IOException: Could not get block locations. Aborting...
    at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2080)
    at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1300(DFSClient.java:1702)
    at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1818)

Our map-reduce job does not fail completely, but over 50% of the map tasks
fail with this same error.

We recently migrated our cluster from 0.16.4 to 0.17.1; previously we didn't
have this problem using the same input data in a similar map-reduce job.

Thank you,

Piotr



When I see this, it's because the filesystem isn't completely up: there
are no locations for a specific file, meaning the client isn't getting
back the names of any datanodes holding the data from the namenode.


I've got a patch in JIRA that prints out the name of the file in 
question, as that could be useful.


Re: java.io.IOException: Could not get block locations. Aborting...

2008-08-08 Thread Alexander Aristov
I've come across the same issue, also with hadoop 0.17.1.

It would be interesting if someone could say what causes it.

Alex

-- 
Best Regards
Alexander Aristov


Re: java.io.IOException: Could not get block locations. Aborting...

2008-08-08 Thread Dhruba Borthakur
It is possible that your namenode is overloaded and is not able to
respond to RPC requests from clients. Please check the namenode logs
to see if you see lines of the form "discarding calls".

dhruba
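
A hedged sketch of that check, assuming the stock log layout (adjust the
path for your install; the phrase is the one Dhruba describes above):

    # look for dropped RPC calls in the namenode log
    grep -i "discarding call" logs/hadoop-*-namenode-*.log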


Re: java.io.IOException: Could not get block locations. Aborting...

2008-08-08 Thread Piotr Kozikowski
Thank you for the reply. Apparently whatever it was is now gone after a
hadoop restart, but I'll keep that in mind should it happen again.

Piotr
