Re: master trying fetch data from slave using "localhost" hostname :)
> What does /etc/hosts look like now? I hit some problems with Ubuntu and localhost last week: the hostname was set up in /etc/hosts not just to point to the loopback address, but to a different loopback address (127.0.1.1) from the normal value (127.0.0.1), which broke everything.
> http://www.1060.org/blogxter/entry?publicid=121ED68BB21DB8C060FE88607222EB52

"/etc/hosts" now on both machines:

192.168.0.28 master1
192.168.0.199 slave1
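One way to confirm that the new entries take effect is to ask the system resolver what each node name maps to and whether the answer is a loopback address. This is only a minimal sketch: the helper names `is_loopback` and `check_resolution` are made up for illustration, and "master1" and "slave1" are the node names used in this thread.

```python
import ipaddress
import socket


def is_loopback(address: str) -> bool:
    """Return True if the address lies in a loopback range (127.0.0.0/8 or ::1)."""
    return ipaddress.ip_address(address).is_loopback


def check_resolution(hostname: str) -> str:
    """Resolve hostname with the system resolver and describe the result."""
    try:
        address = socket.gethostbyname(hostname)
    except OSError as exc:
        # Covers socket.gaierror when the name does not resolve at all.
        return f"{hostname}: does not resolve ({exc})"
    kind = "loopback" if is_loopback(address) else "non-loopback"
    return f"{hostname} -> {address} ({kind})"


if __name__ == "__main__":
    # Node names from this thread, plus whatever this machine calls itself.
    for name in (socket.gethostname(), "master1", "slave1"):
        print(check_resolution(name))
```

Run on each node, every cluster hostname should come back as non-loopback; a node whose own name reports loopback is likely to advertise itself under an address that other machines cannot reach.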
Re: master trying fetch data from slave using "localhost" hostname :)
pavelkolo...@gmail.com wrote:
> On Fri, 06 Mar 2009 14:41:57 -, jason hadoop wrote:
>> I see that when the host name of the node is also on the localhost line in /etc/hosts.
>
> I erased all records with "localhost" from all "/etc/hosts" files and all is fine now :)
> Thank you :)

What does /etc/hosts look like now? I hit some problems with Ubuntu and localhost last week: the hostname was set up in /etc/hosts not just to point to the loopback address, but to a different loopback address (127.0.1.1) from the normal value (127.0.0.1), which broke everything.

http://www.1060.org/blogxter/entry?publicid=121ED68BB21DB8C060FE88607222EB52
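To spot this kind of entry without reading the file by eye, a hosts file can be scanned for cluster node names that sit on a loopback line. The following is a rough sketch under stated assumptions: `names_on_loopback` is a hypothetical helper, and `SAMPLE_HOSTS` is an invented example of the problematic layout, not the actual file from either machine in this thread.

```python
import ipaddress

# Hypothetical example of the problematic layout described above.
SAMPLE_HOSTS = """\
127.0.0.1      localhost
127.0.1.1      master1   # hostname bound to a second loopback address
192.168.0.199  slave1
"""


def names_on_loopback(hosts_text: str, node_names: set) -> list:
    """Return (name, address) pairs for node names listed on loopback lines."""
    hits = []
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        address, *names = line.split()
        try:
            loopback = ipaddress.ip_address(address).is_loopback
        except ValueError:
            continue  # first field is not an IP address; skip the line
        if loopback:
            hits.extend((name, address) for name in names if name in node_names)
    return hits


if __name__ == "__main__":
    for name, address in names_on_loopback(SAMPLE_HOSTS, {"master1", "slave1"}):
        print(f"{name} is mapped to loopback address {address}")
```

To check a real machine, pass the contents of its /etc/hosts instead of `SAMPLE_HOSTS`.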
Re: master trying fetch data from slave using "localhost" hostname :)
On Fri, 06 Mar 2009 14:41:57 -, jason hadoop wrote:
> I see that when the host name of the node is also on the localhost line in /etc/hosts.

I erased all records with "localhost" from all "/etc/hosts" files and all is fine now :)
Thank you :)
Re: master trying fetch data from slave using "localhost" hostname :)
I see that when the host name of the node is also on the localhost line in /etc/hosts.

On Fri, Mar 6, 2009 at 9:38 AM, wrote:
> I see the same strange behavior on a 2-node cluster with 0.18.3, 0.19.1 and svn's branch-0.20.0...
>
> 2 nodes:
> "master1" running NameNode, JobTracker, DataNode, TaskTracker.
> "slave1" running DataNode, TaskTracker.
>
> PROBLEM: "master" tries to fetch the data of an "attempt" that is running on the slave, BUT connects to "localhost" for an unknown reason:
>
> (master's console:)
> 09/03/06 17:15:01 WARN mapred.JobClient: Error reading task output http://localhost:50060/tasklog?plaintext=true&taskid=attempt_200903061711_0001_m_00_0&filter=stdout
>
> But I have found "attempt_200903061711_0001_m_00_0" in "logs/userlogs" on "slave"!
> "master" tries to fetch it, but connects to itself and, of course, can't find it (HTTP 410).
>
> wget "http://localhost:50060/tasklog?plaintext=true&taskid=attempt_200903061711_0001_m_00_0&filter=stdout"
>
> "Failed to retrieve stderr log for task: attempt_200903061711_0001_m_01_0"
>
> In "logs/userlogs" on master there are some other "attempt"s.
>
> (Of course, little by little all the work "migrates" to "master" and the whole job finishes successfully.)
master trying fetch data from slave using "localhost" hostname :)
I see the same strange behavior on a 2-node cluster with 0.18.3, 0.19.1 and svn's branch-0.20.0...

2 nodes:
"master1" running NameNode, JobTracker, DataNode, TaskTracker.
"slave1" running DataNode, TaskTracker.

PROBLEM: "master" tries to fetch the data of an "attempt" that is running on the slave, BUT connects to "localhost" for an unknown reason:

(master's console:)
09/03/06 17:15:01 WARN mapred.JobClient: Error reading task output http://localhost:50060/tasklog?plaintext=true&taskid=attempt_200903061711_0001_m_00_0&filter=stdout

But I have found "attempt_200903061711_0001_m_00_0" in "logs/userlogs" on "slave"!
"master" tries to fetch it, but connects to itself and, of course, can't find it (HTTP 410).

wget "http://localhost:50060/tasklog?plaintext=true&taskid=attempt_200903061711_0001_m_00_0&filter=stdout"

"Failed to retrieve stderr log for task: attempt_200903061711_0001_m_01_0"

In "logs/userlogs" on master there are some other "attempt"s.

(Of course, little by little all the work "migrates" to "master" and the whole job finishes successfully.)
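To check that the log really is served by the slave, the URL from the warning can be rebuilt with the slave's hostname in place of "localhost". This is a small sketch only: `tasklog_url` is a hypothetical helper, the port and query parameters are copied from the warning above, and it assumes the TaskTracker on "slave1" listens on the same port 50060.

```python
from urllib.parse import urlencode


def tasklog_url(host: str, task_id: str, log_filter: str = "stdout", port: int = 50060) -> str:
    """Build a tasklog URL in the shape shown in the JobClient warning."""
    query = urlencode({"plaintext": "true", "taskid": task_id, "filter": log_filter})
    return f"http://{host}:{port}/tasklog?{query}"


if __name__ == "__main__":
    task_id = "attempt_200903061711_0001_m_00_0"
    # What the master tried, versus the address where the attempt actually ran.
    print(tasklog_url("localhost", task_id))
    print(tasklog_url("slave1", task_id))
```

Fetching the second URL with wget from the master should return the task output if the attempt's logs are on the slave.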