I am able to browse the web UI and telnet/netcat the tasktracker host and port, so the connection is being established. Is there any way I can confirm whether it is really some kind of version conflict? The EOF when doing readInt() seems like a protocol incompatibility.
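A connection that telnet or netcat can open only shows that something accepts TCP connections on that port. It does not show that the peer keeps the connection open long enough to answer. As a minimal sketch of that distinction (the class `EofAfterConnect`, the method `probe`, and the throwaway local server are made up for illustration and are not part of Hadoop), the Java program below connects successfully and then gets an `EOFException` from `readInt()` because the other side closes the socket without sending any bytes:

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical demo class, not part of Hadoop.
class EofAfterConnect {

    /**
     * Connects to host:port, then tries to read a 4-byte int the way a client
     * waiting for a response would. Returns a short description of the outcome.
     */
    public static String probe(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            socket.setSoTimeout(timeoutMs); // do not block forever on the read
            DataInputStream in = new DataInputStream(socket.getInputStream());
            int value = in.readInt();
            return "connected, read int " + value;
        } catch (EOFException e) {
            // The TCP connection was established, but the peer closed it
            // before sending the 4 bytes readInt() needs.
            return "connected, then EOF";
        } catch (IOException e) {
            // Connection refused, timeouts, resets, and similar failures.
            return "I/O error: " + e;
        }
    }

    public static void main(String[] args) throws Exception {
        // Throwaway local server that accepts one connection and closes it at once.
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread closer = new Thread(() -> {
                try {
                    server.accept().close();
                } catch (IOException ignored) {
                    // Nothing to do in this demo.
                }
            });
            closer.start();
            String host = server.getInetAddress().getHostAddress();
            System.out.println(probe(host, server.getLocalPort(), 2000));
            closer.join();
        }
    }
}
```

This only reproduces the symptom. It does not speak the Hadoop IPC protocol, so on its own it can neither confirm nor rule out a version mismatch.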
By the way, the tasktracker is killed every time this happens, and I am left with some kind of JVM dump in a hs_err_*.log file. The tasktracker logs show nothing.

Some facts that may help find the problem are:
1) I am not running as a "hadoop" user, as is usually suggested in tutorials.
2) There is an older version of hadoop installed, which I am absolutely sure is not running, and even so, it is configured on different ports.

Thank you for your help and regards,
Caetano Sauer

On Wed, Aug 29, 2012 at 10:08 AM, Hemanth Yamijala <yhema...@gmail.com> wrote:

> Are you able to browse the web UI for the jobtracker? If not
> configured separately, it should be at hostname:50030. It would also
> help if you can telnet to the jobtracker server port and see if it is
> able to connect.
>
> Thanks
> hemanth
>
> On Tue, Aug 28, 2012 at 7:23 PM, Caetano Sauer <caetanosa...@gmail.com> wrote:
> > The host on top of the stack trace contains the host and port I defined on
> > mapred.job.tracker in mapred-site.xml
> >
> > Other than that, I don't know how to verify what you asked me. Any tips?
> >
> >
> > On Tue, Aug 28, 2012 at 3:47 PM, Harsh J <ha...@cloudera.com> wrote:
> >>
> >> Are you sure you're reaching the right port for your JobTracker?
> >>
> >> On Tue, Aug 28, 2012 at 7:15 PM, Caetano Sauer <caetanosa...@gmail.com> wrote:
> >> > Hello,
> >> >
> >> > I am getting the following error when trying to execute a hadoop job on a
> >> > 5-node cluster:
> >> >
> >> > Caused by: java.io.IOException: Call to *** failed on local exception: java.io.EOFException
> >> >     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1103)
> >> >     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
> >> >     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
> >> >     at org.apache.hadoop.mapred.$Proxy2.submitJob(Unknown Source)
> >> >     at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:921)
> >> >     at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
> >> >     at java.security.AccessController.doPrivileged(Native Method)
> >> >     at javax.security.auth.Subject.doAs(Subject.java:396)
> >> >     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
> >> >     at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:850)
> >> >     at org.apache.hadoop.mapreduce.Job.submit(Job.java:500)
> >> >     at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:530)
> >> >     ... 9 more
> >> > Caused by: java.io.EOFException
> >> >     at java.io.DataInputStream.readInt(DataInputStream.java:375)
> >> >     at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:800)
> >> >     at org.apache.hadoop.ipc.Client$Connection.run(Client.java:745)
> >> >
> >> > (My jobtracker host was substituted by ***)
> >> >
> >> > After 3 hours of searching, everything points to an incompatibility between
> >> > the hadoop versions of the client and the server, but this is not the case,
> >> > since I can run the job on a pseudo-distributed setup on a different
> >> > machine. Both are running the exact same version (same svn revision and
> >> > source checksum).
> >> >
> >> > Does anyone have a solution or a suggestion on how to find more debug
> >> > information?
> >> >
> >> > Thank you in advance,
> >> > Caetano Sauer
> >>
> >>
> >>
> >> --
> >> Harsh J