Re: documentation of hadoop implementation
There is someone else like me who had trouble finding it :-) I thought I was the only one with the problem, so I didn't send the link. http://developer.yahoo.com/blogs/hadoop/posts/2010/01/hadoop_bay_area_january_2010_u/
Best, Da

On 12/30/10 12:24 AM, Mark Kerzner wrote:
Da, where did you find it? Thank you, Mark

On Wed, Dec 29, 2010 at 11:22 PM, Da Zheng zhengda1...@gmail.com wrote:
Hi Todd, it's exactly what I was looking for. Thanks. Best, Da

On 12/29/10 5:02 PM, Todd Lipcon wrote:
Hi Da, Chris Douglas gave an excellent presentation at the Hadoop User Group last year on just this topic. Maybe you can find his slides or a recording on YDN/Google? -Todd

On Wed, Dec 29, 2010 at 10:20 AM, Da Zheng zhengda1...@gmail.com wrote:
Hello, is the implementation of Hadoop documented anywhere, especially the part where the output of mappers is partitioned, sorted, and spilled to disk? I tried to understand it, but it's rather complex. Is there any document that can help me understand it? Thanks, Da
Re: documentation of hadoop implementation
Thanks, Da, this makes you a better Googler, and an expert one. Cheers, Mark

On Thu, Dec 30, 2010 at 9:25 AM, Da Zheng zhengda1...@gmail.com wrote: [...]
Retrying connect to server
I run this:

./hadoop jar ../../hadoopjar/hd.jar org.postdirekt.hadoop.WordCount gutenberg gutenberg-output

and get the output below. Does anyone know why I get this error?

10/12/30 16:48:59 INFO security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=30
10/12/30 16:49:01 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9001. Already tried 0 time(s).
10/12/30 16:49:02 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9001. Already tried 1 time(s).
10/12/30 16:49:03 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9001. Already tried 2 time(s).
10/12/30 16:49:04 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9001. Already tried 3 time(s).
10/12/30 16:49:05 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9001. Already tried 4 time(s).
10/12/30 16:49:06 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9001. Already tried 5 time(s).
10/12/30 16:49:07 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9001. Already tried 6 time(s).
10/12/30 16:49:08 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9001. Already tried 7 time(s).
10/12/30 16:49:09 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9001. Already tried 8 time(s).
10/12/30 16:49:10 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9001. Already tried 9 time(s).
Exception in thread "main" java.net.ConnectException: Call to localhost/127.0.0.1:9001 failed on connection exception: java.net.ConnectException: Connection refused
	at org.apache.hadoop.ipc.Client.wrapException(Client.java:932)
	at org.apache.hadoop.ipc.Client.call(Client.java:908)
	at org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:198)
	at $Proxy0.getProtocolVersion(Unknown Source)
	at org.apache.hadoop.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:228)
	at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:224)
	at org.apache.hadoop.mapreduce.Cluster.createRPCProxy(Cluster.java:82)
	at org.apache.hadoop.mapreduce.Cluster.createClient(Cluster.java:94)
	at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:70)
	at org.apache.hadoop.mapreduce.Job.<init>(Job.java:129)
	at org.apache.hadoop.mapreduce.Job.<init>(Job.java:134)
	at org.postdirekt.hadoop.WordCount.main(WordCount.java:19)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:192)
Caused by: java.net.ConnectException: Connection refused
	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
	at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
	at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:373)
	at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:417)
	at org.apache.hadoop.ipc.Client$Connection.access$1900(Client.java:207)
	at org.apache.hadoop.ipc.Client.getConnection(Client.java:1025)
	at org.apache.hadoop.ipc.Client.call(Client.java:885)
	... 15 more
Re: Retrying connect to server
Hello Cavus, is your JobTracker running on localhost? It would be great if you could provide more information about your current Hadoop setup.

cheers, esteban.
estebangutierrez.com — twitter.com/esteban

2010/12/30 Cavus,M.,Fa. Post Direkt m.ca...@postdirekt.de: [...]
Re: Retrying connect to server
Hi Cavus, please check that the Hadoop JobTracker and the other daemons are running by typing jps. If one of JobTracker, TaskTracker, NameNode, or DataNode is missing, run stop-all.sh, then format the NameNode and run start-all.sh again. Maha

On Dec 30, 2010, at 7:52 AM, Cavus,M.,Fa. Post Direkt wrote: [...]
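Maha's jps check can be scripted; here is a minimal sketch, assuming a single-node setup where all four daemons should appear in the jps listing (jps ships with the JDK):

```shell
# Flag which Hadoop daemons are up; a "missing" JobTracker explains the
# "Connection refused" on port 9001. Prints one line per daemon.
for d in NameNode DataNode JobTracker TaskTracker; do
  if jps 2>/dev/null | grep -qw "$d"; then
    echo "$d: running"
  else
    echo "$d: missing"
  fi
done
```

If JobTracker turns up missing, restarting the daemons as described above (and checking the JobTracker log for why it died) is the next step.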
Flow of control
Hi,

(1) I declared a global static variable in my Hadoop main class which gets initialized in the 'run' function of that class. When I try to access it from the Mapper class, it appears to be uninitialized. Why is that? Is it because of Hadoop's parallel execution? But isn't the 'run' function supposed to run first and prepare all the job configuration before the maps even start?

(2) Fig 4.5 in http://developer.yahoo.com/hadoop/tutorial/module4.html shows the InputFormat running before the maps. My question is: on which node? The JobTracker node?

Thank you, Maha
Re: Retrying connect to server
Make sure your /etc/hosts file contains the correct IP/hostname pairs. This is very important.

2010/12/30 Cavus,M.,Fa. Post Direkt m.ca...@postdirekt.de: [...]

-李平
Re: Flow of control
On Fri, Dec 31, 2010 at 9:28 AM, maha m...@umail.ucsb.edu wrote:

> (1) I declared a global variable in my Hadoop main class which gets initialized in the 'run' function of that class. When I try to access this global static variable from the Mapper class, it appears to be uninitialized. Why is that?

The Mapper runs on a remote machine, in another JVM, so a variable you set in the main class cannot be shared with that other JVM.

> (2) Fig 4.5 in http://developer.yahoo.com/hadoop/tutorial/module4.html shows the InputFormat running before the maps. My question is: on which node? The JobTracker node?

I think it runs on the JobTracker. The InputFormat splits the input file, and each map function then reads its own split.

-李平
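The JVM-isolation point can be demonstrated with plain processes; a sketch (not Hadoop code): a variable set in a parent process is invisible to a freshly started child, just as a static field set in the driver is invisible to the mapper JVMs. The usual Hadoop fix is to pass the value through the job Configuration (conf.set(...) in the driver, context.getConfiguration().get(...) in the mapper's setup method), which plays the role that export plays below.

```shell
# A non-exported shell variable lives only in this process, like a static
# field that lives only in the driver JVM.
MY_GLOBAL=42
sh -c 'echo "child sees: ${MY_GLOBAL:-unset}"'   # prints: child sees: unset

# Exporting hands the value to the child explicitly, the way the job
# Configuration hands driver-side values to the mapper JVMs.
export MY_GLOBAL
sh -c 'echo "child sees: ${MY_GLOBAL:-unset}"'   # prints: child sees: 42
```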
Re: Retrying connect to server
Cavus,M.,Fa. Post Direkt wrote: [...]

This is one of the most common issues after configuring a Hadoop cluster. Reasons:

1. Your NameNode or JobTracker is not running. Verify through the web UI and the jps command.
2. DNS resolution. You must have IP/hostname entries for all nodes in the /etc/hosts file.

Best Regards, Adarsh Sharma
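The DNS point can be checked quickly from the shell (a sketch; which hostnames matter depends on your fs.default.name and mapred.job.tracker settings):

```shell
# Resolve names the way the Hadoop IPC client will: 'localhost' and every
# cluster hostname should map to the right IP in /etc/hosts.
grep -v '^#' /etc/hosts        # eyeball the IP/hostname pairs
getent hosts localhost         # confirm localhost resolves
```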
Re: Retrying connect to server
Or 3) the configuration (or lack thereof) on the machine you are trying to run this from has no idea where your DFS or JobTracker is :)

Cheers, James.

On 2010-12-30, at 8:53 PM, Adarsh Sharma wrote: [...]
Re: Flow of control
Very helpful :) Thanks, Ping. Maha

On Dec 30, 2010, at 6:13 PM, li ping wrote: [...]
how to build hadoop in Linux
Hello,

I need to build Hadoop on Linux because I need to make some small changes to the code, but I don't know the simplest way to build it. I googled and so far found only two places that describe how to build Hadoop. One is http://bigdata.wordpress.com/2010/05/27/hadoop-cookbook-3-how-to-build-your-own-hadoop-distribution/. I downloaded Apache Forrest and ran, as it suggests:

ant -Djava5.home=/usr/lib/jvm/java-1.5.0-gcj-4.4/ -Dforrest.home=/home/zhengda/apache-forrest-0.8 compile-core tar

and got this error:

[exec] BUILD FAILED
[exec] /home/zhengda/apache-forrest-0.8/main/targets/validate.xml:158: java.lang.NullPointerException

What does this error mean? It seems Apache Forrest is only used to build the Hadoop documentation, and I just want to rebuild the Hadoop Java code. Is there a way to rebuild only the Java code? I ran plain ant and it seemed to finish successfully, but I don't know whether it really compiled the code.

The other place I found shows how to build Hadoop with Eclipse. I use a MacBook and have to ssh to Linux boxes to work on Hadoop, so that's not a very good option even if it works.

Best, Da
Re: how to build hadoop in Linux
The Java5 dependency is about to go away from Hadoop; see HADOOP-7072. I will try to commit it first thing next year, so wait a couple of days and you'll be all right. Happy New Year everyone!

On Thu, Dec 30, 2010 at 22:08, Da Zheng zhengda1...@gmail.com wrote: [...]
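For the narrower question of rebuilding only the Java code without the Forrest documentation, something like this is usually enough (a sketch; the target names are from the 0.20/0.21-era build.xml, and the checkout path is hypothetical):

```shell
cd /path/to/hadoop-source   # your source checkout (hypothetical path)
ant compile-core            # compile only the core Java sources; no Forrest needed
ant jar                     # package the core jar, skipping the docs targets
```

The tar target is what pulls in Forrest, since it bundles the generated docs into the distribution tarball; dropping it avoids the java5.home/forrest.home requirement entirely.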
Re: Multiple Input Data Processing using MapReduce
I am faced with a similar problem. I want to process an entire set of bugs, including their full history, once; then incrementally process a combination of the latest output plus the changes since the last run. I hit upon a way of handling multiple outputs. Perhaps if there were something in the data format that told you where the data came from, the same mapper could process both and the reducer could merge them?

Michael Toback
SMTS, VMware
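The "tag by source" idea can be sketched outside Hadoop: prefix each record with the file it came from (what the mapper would emit), so one merge pass can tell the baseline from the delta. In Hadoop itself, MultipleInputs.addInputPath lets you bind a different mapper to each input path for the same effect. The file names and records below are made up for illustration:

```shell
# Toy inputs: a baseline record and a later change for the same bug id
printf 'bug1\topen\n'   > baseline.txt
printf 'bug1\tclosed\n' > delta.txt

# "Map": tag each record with its source file. "Reduce": group by key and
# let the later (delta) record win. Relies on "baseline" sorting before
# "delta" lexicographically -- an assumption of this sketch.
awk '{print $1 "\t" FILENAME "\t" $2}' baseline.txt delta.txt | sort \
  | awk -F'\t' '{latest[$1] = $3} END {for (k in latest) print k "\t" latest[k]}'
# prints: bug1	closed
```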