The stack trace indicates the job client is trying to submit a job to the
MR cluster and failing. Are you certain that the JobTracker is running
(on localhost:54312) at the time you submit the job?
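For what it's worth, the address the job client connects to at submission
time comes from mapred.job.tracker, not fs.default.name, so it may be worth
double-checking that entry. A minimal mapred-site.xml fragment, assuming a
JobTracker on port 54312 (adjust host/port to your setup):

```xml
<!-- mapred-site.xml: the JobClient reads mapred.job.tracker to find the
     JobTracker; fs.default.name only controls the default file system. -->
<property>
  <name>mapred.job.tracker</name>
  <value>localhost:54312</value>
</property>
```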

Regarding using a different file system - it depends a lot on which file
system you are using, and whether it meets the requirements of the
large-scale distributed processing that Hadoop MR offers. I suggest you be
very sure about this before taking that route.
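For context on how Hadoop wires in a custom scheme: FileSystem.get() looks
up fs.&lt;scheme&gt;.impl in the configuration and instantiates that class by
reflection. Here is a rough pure-Java stand-in for that lookup (not
Hadoop's actual code - the Map below replaces Hadoop's Configuration, and
the class name used in main() is just a placeholder; a real setup would
name a subclass of org.apache.hadoop.fs.FileSystem):

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

public class FsResolver {
    // Stand-in for Hadoop's Configuration: config key -> value.
    private final Map<String, String> conf = new HashMap<>();

    public void set(String key, String value) {
        conf.put(key, value);
    }

    // Roughly how FileSystem.get(uri, conf) picks an implementation:
    // look up fs.<scheme>.impl and load that class reflectively.
    public Class<?> resolve(URI uri) throws ClassNotFoundException {
        String key = "fs." + uri.getScheme() + ".impl";
        String className = conf.get(key);
        if (className == null) {
            throw new IllegalArgumentException(
                "No FileSystem configured for scheme: " + uri.getScheme());
        }
        return Class.forName(className);
    }

    public static void main(String[] args) throws Exception {
        FsResolver r = new FsResolver();
        // Placeholder class name for the demo only.
        r.set("fs.myexternalserver.impl", "java.util.ArrayList");
        Class<?> impl = r.resolve(
            URI.create("myexternalserver://host:9000/path"));
        System.out.println(impl.getName()); // java.util.ArrayList
    }
}
```

The point is that the scheme in the input URI, not fs.default.name alone,
selects the implementation class - but none of this affects where the job
is submitted.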

Thanks
Hemanth


On Tue, Mar 26, 2013 at 4:22 PM, Agarwal, Nikhil
<nikhil.agar...@netapp.com> wrote:

> Hi,
>
> Thanks for your reply. I do not know about Cascading; should I Google it
> as “cascading in hadoop”? Also, what I was thinking is to implement a file
> system that overrides the functions provided by the fs.FileSystem
> interface in Hadoop. I have written some portions of the file system (for
> my external server) so that it compiles successfully, but when I submit an
> MR job I get the following error:
>
> 13/03/26 06:09:10 INFO ipc.Client: Retrying connect to server: localhost/
> 127.0.0.1:54312. Already tried 0 time(s).
> 13/03/26 06:09:11 INFO ipc.Client: Retrying connect to server: localhost/
> 127.0.0.1:54312. Already tried 1 time(s).
> 13/03/26 06:09:12 INFO ipc.Client: Retrying connect to server: localhost/
> 127.0.0.1:54312. Already tried 2 time(s).
> 13/03/26 06:09:13 INFO ipc.Client: Retrying connect to server: localhost/
> 127.0.0.1:54312. Already tried 3 time(s).
> 13/03/26 06:09:14 INFO ipc.Client: Retrying connect to server: localhost/
> 127.0.0.1:54312. Already tried 4 time(s).
> 13/03/26 06:09:15 INFO ipc.Client: Retrying connect to server: localhost/
> 127.0.0.1:54312. Already tried 5 time(s).
> 13/03/26 06:09:16 INFO ipc.Client: Retrying connect to server: localhost/
> 127.0.0.1:54312. Already tried 6 time(s).
> 13/03/26 06:09:17 INFO ipc.Client: Retrying connect to server: localhost/
> 127.0.0.1:54312. Already tried 7 time(s).
> 13/03/26 06:09:18 INFO ipc.Client: Retrying connect to server: localhost/
> 127.0.0.1:54312. Already tried 8 time(s).
> 13/03/26 06:09:19 INFO ipc.Client: Retrying connect to server: localhost/
> 127.0.0.1:54312. Already tried 9 time(s).
> 13/03/26 06:10:20 ERROR security.UserGroupInformation:
> PriviledgedActionException as:nikhil cause:java.net.ConnectException: Call
> to localhost/127.0.0.1:54312 failed on connection exception:
> java.net.ConnectException: Connection refused
> java.net.ConnectException: Call to localhost/127.0.0.1:54312 failed on
> connection exception: java.net.ConnectException: Connection refused
>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1099)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1075)
>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>     at org.apache.hadoop.mapred.$Proxy2.getProtocolVersion(Unknown Source)
>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
>     at
> org.apache.hadoop.mapred.JobClient.createRPCProxy(JobClient.java:480)
>     at org.apache.hadoop.mapred.JobClient.init(JobClient.java:474)
>     at org.apache.hadoop.mapred.JobClient.<init>(JobClient.java:457)
>     at org.apache.hadoop.mapreduce.Job$1.run(Job.java:513)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:416)
>     at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
>     at org.apache.hadoop.mapreduce.Job.connect(Job.java:511)
>     at org.apache.hadoop.mapreduce.Job.submit(Job.java:499)
>     at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:530)
>     at org.apache.hadoop.examples.WordCount.main(WordCount.java:67)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:616)
>     at
> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>     at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>     at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:616)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> Caused by: java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
>     at
> org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>     at
> org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>     at
> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>     at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1206)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1050)
>     ... 27 more
>
> Basically, my JobTracker is running at localhost:54312, and I have set the
> value of the fs.default.name parameter to myexternalserver://ip:port and
> fs.myexternalserver.impl to the file system class I wrote. I cannot figure
> out why this error occurs, or why it is trying to connect to
> localhost:54312. Please suggest where I am going wrong.
>
> Also, if you feel Cascading would be better for this, please do let me
> know.
>
> Thanks & Regards,
>
> Nikhil
>
> *From:* Agarwal, Nikhil
> *Sent:* Tuesday, March 26, 2013 2:49 PM
> *To:* 'user@hadoop.apache.org'
> *Subject:* How to tell my Hadoop cluster to read data from an external
> server
>
> Hi,
>
> I have a Hadoop cluster up and running. I want to submit an MR job to it,
> but the input data is kept on an external server (outside the Hadoop
> cluster). Can anyone suggest how I can tell my Hadoop cluster to load the
> input data from the external server and then run MR on it?
>
> Thanks & Regards,
>
> Nikhil
>
