Re: documentation of hadoop implementation

2010-12-30 Thread Da Zheng
There is someone else like me who had trouble finding it :-) I thought I was
the only one who had the problem, so I didn't send the link.
http://developer.yahoo.com/blogs/hadoop/posts/2010/01/hadoop_bay_area_january_2010_u/

Best,
Da

On 12/30/10 12:24 AM, Mark Kerzner wrote:
 Da, where did you find it?
 
 Thank you,
 Mark
 
 On Wed, Dec 29, 2010 at 11:22 PM, Da Zheng zhengda1...@gmail.com wrote:
 
 Hi Todd,

 It's exactly what I was looking for. Thanks.

 Best,
 Da

 On 12/29/10 5:02 PM, Todd Lipcon wrote:
 Hi Da,

 Chris Douglas had an excellent presentation at the Hadoop User Group last
 year on just this topic. Maybe you can find his slides or a recording on
 YDN/google?

 -Todd

 On Wed, Dec 29, 2010 at 10:20 AM, Da Zheng zhengda1...@gmail.com
 wrote:

 Hello,

 Is the implementation of Hadoop documented somewhere? Especially the part
 where the output of the mappers is partitioned, sorted, and spilled to disk.
 I tried to understand it, but it's rather complex. Is there any document
 that can help me understand it?

 Thanks,
 Da
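The reducer assignment asked about above is decided by the job's Partitioner before any sorting or spilling happens. A minimal sketch mirroring the default HashPartitioner's formula (the actual buffer, sort, and spill machinery in MapTask is much larger and is omitted here):

```java
// Sketch of how a map output record is assigned to a reducer.
// Mirrors the formula in Hadoop's default HashPartitioner; the
// surrounding spill logic (MapTask.MapOutputBuffer) is not shown.
public class PartitionSketch {
    // Mask off the sign bit so the result is non-negative, then take
    // the remainder by the number of reduce tasks.
    static int getPartition(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        System.out.println("key 'hadoop' -> partition "
                + getPartition("hadoop", 4));
    }
}
```

Every record with the same key lands in the same partition, which is what lets a reducer see all values for a key.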






 



Re: documentation of hadoop implementation

2010-12-30 Thread Mark Kerzner
Thanks, Da, this makes you a better Googler, and an expert one.

Cheers,
Mark

On Thu, Dec 30, 2010 at 9:25 AM, Da Zheng zhengda1...@gmail.com wrote:

 There is someone else like me who had trouble finding it :-) I thought I
 was the only one who had the problem, so I didn't send the link.

 http://developer.yahoo.com/blogs/hadoop/posts/2010/01/hadoop_bay_area_january_2010_u/

 Best,
 Da





Retrying connect to server

2010-12-30 Thread Cavus,M.,Fa. Post Direkt
I run this:

./hadoop jar ../../hadoopjar/hd.jar org.postdirekt.hadoop.WordCount gutenberg 
gutenberg-output

I get this:
Does anyone know why I get this error?

10/12/30 16:48:59 INFO security.Groups: Group mapping 
impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=30
10/12/30 16:49:01 INFO ipc.Client: Retrying connect to server: 
localhost/127.0.0.1:9001. Already tried 0 time(s).
10/12/30 16:49:02 INFO ipc.Client: Retrying connect to server: 
localhost/127.0.0.1:9001. Already tried 1 time(s).
10/12/30 16:49:03 INFO ipc.Client: Retrying connect to server: 
localhost/127.0.0.1:9001. Already tried 2 time(s).
10/12/30 16:49:04 INFO ipc.Client: Retrying connect to server: 
localhost/127.0.0.1:9001. Already tried 3 time(s).
10/12/30 16:49:05 INFO ipc.Client: Retrying connect to server: 
localhost/127.0.0.1:9001. Already tried 4 time(s).
10/12/30 16:49:06 INFO ipc.Client: Retrying connect to server: 
localhost/127.0.0.1:9001. Already tried 5 time(s).
10/12/30 16:49:07 INFO ipc.Client: Retrying connect to server: 
localhost/127.0.0.1:9001. Already tried 6 time(s).
10/12/30 16:49:08 INFO ipc.Client: Retrying connect to server: 
localhost/127.0.0.1:9001. Already tried 7 time(s).
10/12/30 16:49:09 INFO ipc.Client: Retrying connect to server: 
localhost/127.0.0.1:9001. Already tried 8 time(s).
10/12/30 16:49:10 INFO ipc.Client: Retrying connect to server: 
localhost/127.0.0.1:9001. Already tried 9 time(s).
Exception in thread "main" java.net.ConnectException: Call to 
localhost/127.0.0.1:9001 failed on connection exception: 
java.net.ConnectException: Connection refused
at org.apache.hadoop.ipc.Client.wrapException(Client.java:932)
at org.apache.hadoop.ipc.Client.call(Client.java:908)
at 
org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:198)
at $Proxy0.getProtocolVersion(Unknown Source)
at 
org.apache.hadoop.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:228)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:224)
at org.apache.hadoop.mapreduce.Cluster.createRPCProxy(Cluster.java:82)
at org.apache.hadoop.mapreduce.Cluster.createClient(Cluster.java:94)
at org.apache.hadoop.mapreduce.Cluster.init(Cluster.java:70)
at org.apache.hadoop.mapreduce.Job.init(Job.java:129)
at org.apache.hadoop.mapreduce.Job.init(Job.java:134)
at org.postdirekt.hadoop.WordCount.main(WordCount.java:19)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:192)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
at 
org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:373)
at 
org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:417)
at org.apache.hadoop.ipc.Client$Connection.access$1900(Client.java:207)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1025)
at org.apache.hadoop.ipc.Client.call(Client.java:885)
... 15 more
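The root cause at the bottom of this trace, "Connection refused", simply means nothing accepted a TCP connection on localhost:9001. A small stand-alone probe (not part of Hadoop; the host and port come from the log above) makes that check concrete:

```java
// Quick probe: is anything actually listening on the JobTracker port?
// A refused connection here reproduces exactly what the Hadoop IPC
// client is reporting in the stack trace.
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortProbe {
    static boolean isListening(String host, int port, int timeoutMs) {
        try (Socket s = new Socket()) {
            s.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            return false; // ConnectException (refused) lands here
        }
    }

    public static void main(String[] args) {
        System.out.println("localhost:9001 listening: "
                + isListening("localhost", 9001, 500));
    }
}
```

If this prints false while the cluster is supposedly up, the JobTracker either is not running or is bound to a different host/port than the client expects.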


Re: Retrying connect to server

2010-12-30 Thread Esteban Gutierrez Moguel
Hello Cavus,

Is your JobTracker running on localhost? It would be great if you could
provide more information about your current Hadoop setup.

cheers,
esteban.


estebangutierrez.com — twitter.com/esteban





Re: Retrying connect to server

2010-12-30 Thread maha
Hi Cavus,

   Please check that the Hadoop JobTracker and the other daemons are running by
typing jps. If one of (JobTracker, TaskTracker, NameNode, DataNode) is missing,
then you need to 'stop-all', then format the NameNode and 'start-all' again.
 Maha




Flow of control

2010-12-30 Thread maha
Hi,

  (1) I declared a global variable in my hadoop mainClass which gets 
initialized in the 'run' function of this mainClass. When I try to access this 
global static variable from the MapperClass, it appears to be uninitialized. 

Why is that? Is it because of the parallel execution of Hadoop
functions? But isn't the 'run' function supposed to be the one to run first
and prepare all the job configurations before the maps even start?

  (2) Fig 4.5 in http://developer.yahoo.com/hadoop/tutorial/module4.html  shows 
the InputFormat to be the one running before the maps. My question is: on which
node? The JobTracker node?

  Thank you,
  Maha

Re: Retrying connect to server

2010-12-30 Thread li ping
Make sure your /etc/hosts file contains the correct IP/hostname pairs. This
is very important.





-- 
-李平


Re: Flow of control

2010-12-30 Thread li ping
On Fri, Dec 31, 2010 at 9:28 AM, maha m...@umail.ucsb.edu wrote:

 Hi,

  (1) I declared a global variable in my hadoop mainClass which gets
 initialized in the 'run' function of this mainClass. When I try to access
 this global static variable from the MapperClass, it appears to be
 uninitialized.

    Why is that? Is it because of the parallel execution of Hadoop
 functions? But isn't the 'run' function supposed to be the one to run
 first and prepare all the job configurations before the maps even start?

The Mapper runs on a remote machine in another JVM, so a variable you
set in the main class cannot be shared with that JVM.
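The supported channel for driver-to-task values is the job Configuration: the driver calls conf.set(...), the framework serializes the Configuration and ships it to every task JVM, and the mapper reads it back in setup() via context.getConfiguration().get(...). A self-contained sketch of that round trip, with java.util.Properties standing in for the Hadoop Configuration (the key name is illustrative, not from the thread):

```java
// Models why conf.set()/get() works where a static field does not:
// the value survives serialization to another process, a static does not.
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Properties;

public class ConfRoundTrip {
    // Serialize then deserialize one key/value pair -- the moral
    // equivalent of the framework shipping the job Configuration.
    // Driver side: conf.set(key, value)
    // Task side:   context.getConfiguration().get(key) in setup()
    static String roundTrip(String key, String value) {
        try {
            Properties conf = new Properties();
            conf.setProperty(key, value);
            ByteArrayOutputStream wire = new ByteArrayOutputStream();
            conf.store(wire, null); // "shipped" to the task JVM
            Properties taskSide = new Properties();
            taskSide.load(new ByteArrayInputStream(wire.toByteArray()));
            return taskSide.getProperty(key);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical key, mirroring the usual WordCount config idiom.
        System.out.println(roundTrip("wordcount.case.sensitive", "false"));
    }
}
```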


  (2) Fig 4.5 in http://developer.yahoo.com/hadoop/tutorial/module4.html shows 
 the inputFormat to be the one running before the maps. My question is
 in which node? The JobTracker node?
 I think the splitting (getSplits) actually runs on the client when the job
 is submitted; each map task then uses the InputFormat's RecordReader to
 read its own split.
  Thank you,
   Maha




-- 
-李平


Re: Retrying connect to server

2010-12-30 Thread Adarsh Sharma


This is the most common issue that occurs after configuring a Hadoop cluster.

Reasons:

1. Your NameNode or JobTracker is not running. Verify through the web UI and
the jps command.
2. DNS resolution. You must have IP/hostname entries for all nodes in the
/etc/hosts file.




Best Regards

Adarsh Sharma


Re: Retrying connect to server

2010-12-30 Thread James Seigel
Or
3) The configuration (or lack thereof) on the machine you are trying to
run this on has no idea where your DFS or JobTracker is :)

Cheers
James.
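For point 3), the place clients of this era look up the JobTracker address is mapred.job.tracker in mapred-site.xml (newer releases rename the property). A minimal, hypothetical pseudo-distributed example:

```xml
<!-- mapred-site.xml: tells clients where the JobTracker listens.
     Hypothetical minimal example; adjust host/port to your cluster. -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>
```

If this file is missing or empty on the submitting machine, the client falls back to defaults and you get exactly the retry loop shown in the logs.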




Re: Flow of control

2010-12-30 Thread maha
Very helpful :) thanks Ping.

Maha




how to build hadoop in Linux

2010-12-30 Thread Da Zheng
Hello,

I need to build Hadoop on Linux, as I need to make some small changes to the
code, but I don't know the simplest way to build it. I googled and so far have
found only two places that explain how to build Hadoop. One is
http://bigdata.wordpress.com/2010/05/27/hadoop-cookbook-3-how-to-build-your-own-hadoop-distribution/.
I downloaded Apache Forrest and ran, as it suggests:
ant -Djava5.home=/usr/lib/jvm/java-1.5.0-gcj-4.4/
-Dforrest.home=/home/zhengda/apache-forrest-0.8 compile-core tar
and got an error:
 [exec] BUILD FAILED
 [exec] /home/zhengda/apache-forrest-0.8/main/targets/validate.xml:158:
java.lang.NullPointerException
What does this error mean? It seems Apache Forrest is only used to build the
Hadoop documentation, and I just want to rebuild the Hadoop Java code. Is
there a way to rebuild just the Java code? I ran plain ant and it seemed to
finish successfully, but I don't know whether it really compiled the code.

The other place I found shows how to build Hadoop with Eclipse. I use a
MacBook and have to ssh to Linux boxes to work on Hadoop, so that's not a
very good option even if it really works.

Best,
Da


Re: how to build hadoop in Linux

2010-12-30 Thread Konstantin Boudnik
The Java 5 dependency is about to be removed from Hadoop; see HADOOP-7072. I
will try to commit it first thing next year. So wait a couple of days
and you'll be all right.

Happy New Year everyone!





Re: Multiple Input Data Processing using MapReduce

2010-12-30 Thread Michael Toback

I am faced with a similar problem.

I want to process an entire set of bugs, including their entire history,
once, and then incrementally process a combination of the latest output plus
the changes since the last run.

I hit upon a way of handling multiple outputs. Perhaps if there were
something in the data format that told you where the data came from, the
same mapper could process both and the reducer could merge them?
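The source-tagging idea in this message can be sketched end to end: the mapper prefixes each value with which stream it came from, the shuffle groups by key, and the reducer prefers the newer record. (A self-contained toy; the base/delta names and the "delta wins" merge rule are illustrative, not from the thread.)

```java
// Toy reduce-side merge of two input streams distinguished by a tag.
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class TaggedMerge {
    // "Map" side: tag each value with its source stream.
    static String[] map(String source, String key, String value) {
        return new String[] { key, source + ":" + value };
    }

    // "Reduce" side: for one key, prefer a delta record over a base one.
    static String reduce(List<String> taggedValues) {
        String chosen = taggedValues.get(0);
        for (String v : taggedValues) {
            if (v.startsWith("delta:")) { chosen = v; break; }
        }
        return chosen.substring(chosen.indexOf(':') + 1);
    }

    public static void main(String[] args) {
        // Shuffle stand-in: group tagged values by key.
        Map<String, List<String>> groups = new TreeMap<>();
        for (String[] kv : new String[][] {
                map("base",  "BUG-1", "status=open"),
                map("delta", "BUG-1", "status=closed") }) {
            groups.computeIfAbsent(kv[0], k -> new ArrayList<>()).add(kv[1]);
        }
        for (Map.Entry<String, List<String>> e : groups.entrySet())
            System.out.println(e.getKey() + " -> " + reduce(e.getValue()));
        // prints: BUG-1 -> status=closed
    }
}
```

In a real job the tag would ride inside the value type (for example a prefixed Text), and the grouping would be done by the framework's shuffle rather than a TreeMap.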

Michael Toback
SMTS, VMWare
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Multiple-Input-Data-Processing-using-MapReduce-tp1701199p2165027.html
Sent from the Hadoop lucene-users mailing list archive at Nabble.com.