Re: Error reading task output
Aaron Kimball wrote:
> Cam, This isn't Hadoop-specific; it's how Linux treats its network configuration. If you look at /etc/host.conf, you'll probably see a line that says "order hosts, bind" -- this tells Linux's DNS resolution library to first read your /etc/hosts file, then check an external DNS server. You could probably disable local hostfile checking, but that means that every time a program on your system queries the authoritative hostname for "localhost", it'll go out to the network. You'll probably see a big performance hit. The better solution, I think, is to get your nodes' /etc/hosts files squared away.

I agree.

> You only need to do so once :)

No, you need to detect whenever the Linux networking stack has decided to add new entries to resolv.conf or /etc/hosts, and detect when they are inappropriate. Which is a tricky thing to do, as there are some cases where you may actually be grateful that someone in the Debian codebase decided that adding the local hostname as 127.0.0.1 is a feature.

I ended up writing a new SmartFrog component that can be configured to fail to start if the network is a mess, which is something worth pushing out. As part of Hadoop diagnostics, this test would be one of the things to deal with, and at least warn on: "your hostname is local, you will not be visible over the network".

-steve
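The SmartFrog component itself isn't shown in the thread, but the core of the check Steve describes can be sketched in a few lines of Java. This is an illustrative assumption of how such a pre-flight test might look, not the actual component; the class and message wording are invented:

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical sketch of the diagnostic described above: a node whose
// hostname resolves to a loopback address (127.0.0.0/8) will register
// itself with an address other machines cannot reach, so fail fast
// rather than join the cluster silently broken.
public class HostnameSanityCheck {

    // True if this address could plausibly be reached by other machines.
    static boolean visibleOverNetwork(InetAddress addr) {
        return !addr.isLoopbackAddress();
    }

    public static void main(String[] args) throws UnknownHostException {
        InetAddress local = InetAddress.getLocalHost();
        if (!visibleOverNetwork(local)) {
            System.err.println("your hostname " + local.getHostName()
                    + " resolves to " + local.getHostAddress()
                    + "; you will not be visible over the network");
            System.exit(1);
        }
    }
}
```

Running this on a machine with the bad /etc/hosts entry discussed in the thread would trip the check, because InetAddress.getLocalHost() consults the same resolver chain (hosts file first, then DNS) that Aaron describes.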
Re: Error reading task output
Cam Macdonell wrote:
> Well, for future googlers, I'll answer my own post. Watch out for the hostname at the end of "localhost" lines on slaves. One of my slaves was registering itself as "localhost.localdomain" with the jobtracker.
>
> Is there a way that Hadoop could be made to not be so dependent on /etc/hosts, but on more dynamic hostname resolution?

DNS is trouble in Java; there are some (outstanding) bug reports/Hadoop patches on the topic, mostly showing up on a machine of mine with a bad hosts entry. I also encountered some fun last month with Ubuntu Linux adding the local hostname to /etc/hosts alongside the 127.0.0.1 entry, which is precisely what you don't want for a cluster of VMs with no DNS at all. This sounds like your problem too, in which case I have shared your pain:

http://www.1060.org/blogxter/entry?publicid=121ED68BB21DB8C060FE88607222EB52
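For future googlers, the problematic pattern being described looks roughly like this in /etc/hosts (the hostname "node1" is a made-up example, not from the thread):

```
# Distro-added line of the kind described above -- the trailing hostname is the problem:
127.0.0.1   localhost.localdomain localhost node1
```

With an entry like this, a daemon on node1 that asks the resolver for its own address gets 127.0.0.1 back, and reports itself to the jobtracker under a name ("localhost.localdomain") that every other machine resolves to itself.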
Re: Error reading task output
Cam, This isn't Hadoop-specific; it's how Linux treats its network configuration. If you look at /etc/host.conf, you'll probably see a line that says "order hosts, bind" -- this tells Linux's DNS resolution library to first read your /etc/hosts file, then check an external DNS server. You could probably disable local hostfile checking, but that means that every time a program on your system queries the authoritative hostname for "localhost", it'll go out to the network. You'll probably see a big performance hit.

The better solution, I think, is to get your nodes' /etc/hosts files squared away. You only need to do so once :)

-- Aaron

On Thu, Apr 16, 2009 at 11:31 AM, Cam Macdonell wrote:
> Cam Macdonell wrote:
>> Hi,
>>
>> I'm getting the following warning when running the simple wordcount and grep examples.
>>
>> 09/04/15 16:54:16 INFO mapred.JobClient: Task Id : attempt_200904151649_0001_m_19_0, Status : FAILED
>> Too many fetch-failures
>> 09/04/15 16:54:16 WARN mapred.JobClient: Error reading task output http://localhost.localdomain:50060/tasklog?plaintext=true&taskid=attempt_200904151649_0001_m_19_0&filter=stdout
>> 09/04/15 16:54:16 WARN mapred.JobClient: Error reading task output http://localhost.localdomain:50060/tasklog?plaintext=true&taskid=attempt_200904151649_0001_m_19_0&filter=stderr
>>
>> The only advice I could find from other posts with similar errors is to set up /etc/hosts with all slaves and the host IPs. I did this, but I still get the warning above. The output seems to come out all right, however (I guess that's why it is a warning).
>>
>> I tried running a wget on the http:// address in the warning message, and I get the following back:
>>
>> 2009-04-15 16:53:46 ERROR 400: Argument taskid is required.
>>
>> So perhaps the wrong task ID is being passed to the http request. Any ideas on what can get rid of these warnings?
>>
>> Thanks,
>> Cam
>
> Well, for future googlers, I'll answer my own post. Watch out for the hostname at the end of "localhost" lines on slaves. One of my slaves was registering itself as "localhost.localdomain" with the jobtracker.
>
> Is there a way that Hadoop could be made to not be so dependent on /etc/hosts, but on more dynamic hostname resolution?
>
> Cam
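A "squared away" /etc/hosts of the kind Aaron recommends might look like the following, identical on every node. The hostnames and private addresses below are invented for illustration, not taken from Cam's cluster:

```
# /etc/hosts -- same file on every node
127.0.0.1     localhost localhost.localdomain   # loopback only; no node hostname here
192.168.0.10  master
192.168.0.11  slave1
192.168.0.12  slave2
```

The key points are that each node's own hostname maps to its real, routable address, and that nothing but "localhost" names appear on the 127.0.0.1 line.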
Re: Error reading task output
Cam Macdonell wrote:

Hi,

I'm getting the following warning when running the simple wordcount and grep examples.

09/04/15 16:54:16 INFO mapred.JobClient: Task Id : attempt_200904151649_0001_m_19_0, Status : FAILED
Too many fetch-failures
09/04/15 16:54:16 WARN mapred.JobClient: Error reading task output http://localhost.localdomain:50060/tasklog?plaintext=true&taskid=attempt_200904151649_0001_m_19_0&filter=stdout
09/04/15 16:54:16 WARN mapred.JobClient: Error reading task output http://localhost.localdomain:50060/tasklog?plaintext=true&taskid=attempt_200904151649_0001_m_19_0&filter=stderr

The only advice I could find from other posts with similar errors is to set up /etc/hosts with all slaves and the host IPs. I did this, but I still get the warning above. The output seems to come out all right, however (I guess that's why it is a warning).

I tried running a wget on the http:// address in the warning message, and I get the following back:

2009-04-15 16:53:46 ERROR 400: Argument taskid is required.

So perhaps the wrong task ID is being passed to the http request. Any ideas on what can get rid of these warnings?

Thanks,
Cam

Well, for future googlers, I'll answer my own post. Watch out for the hostname at the end of "localhost" lines on slaves. One of my slaves was registering itself as "localhost.localdomain" with the jobtracker.

Is there a way that Hadoop could be made to not be so dependent on /etc/hosts, but on more dynamic hostname resolution?

Cam