Hey Edson,

Unfortunately I'm not sure what's going on here - for whatever reason, the
kernel isn't allowing Java NIO to use epoll, so Hadoop's IPC framework
can't work correctly. I don't think this is a Hadoop-specific bug.
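
If you want to confirm it's not Hadoop at all, a bare NIO call should fail
the same way. Here's a minimal sketch (the class name EpollCheck is mine,
not anything from Hadoop): on Linux, Selector.open() goes through
sun.nio.ch.EPollSelectorProvider, so it should throw the same "Function not
implemented" IOException if epoll_create is the problem.

```java
import java.io.IOException;
import java.nio.channels.Selector;

// Minimal reproducer: Selector.open() is backed by epoll on Linux,
// so if epoll_create(2) fails, this throws the same IOException
// ("Function not implemented") seen in the DataNode log.
public class EpollCheck {
    public static void main(String[] args) {
        try {
            Selector sel = Selector.open();
            System.out.println("Selector OK: " + sel.getClass().getName());
            sel.close();
        } catch (IOException e) {
            System.out.println("Selector failed: " + e);
        }
    }
}
```

If this fails outside Hadoop too, that points squarely at the kernel/JVM
combination rather than anything in your Hadoop config.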

Does this issue occur on all of the nodes?
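
In the meantime, one possible workaround - untested on your setup, so treat
it as a guess: the JDK lets you override the selector implementation via the
standard java.nio.channels.spi.SelectorProvider system property, and
sun.nio.ch.PollSelectorProvider uses poll(2) instead of epoll. Something
like this in conf/hadoop-env.sh:

```shell
# Untested workaround: force the Sun JVM to use poll(2) instead of epoll
# by overriding the NIO selector provider (a standard system property).
export HADOOP_OPTS="$HADOOP_OPTS -Djava.nio.channels.spi.SelectorProvider=sun.nio.ch.PollSelectorProvider"
```

That would only mask the problem, though - worth figuring out why epoll is
missing from the kernel in the first place.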

-Todd

On Mon, Mar 29, 2010 at 2:26 PM, Edson Ramiro <erlfi...@gmail.com> wrote:

> I'm not involved with the Debian community :(
>
> ram...@h02:~/hadoop$ cat /proc/sys/fs/epoll/max_user_watches
> 3373957
>
> and the Java is not OpenJDK.
> The version is:
>
> ram...@lcpad:/usr/lib/jvm/java-6-sun$ java -version
> java version "1.6.0_17"
> Java(TM) SE Runtime Environment (build 1.6.0_17-b04)
> Java HotSpot(TM) 64-Bit Server VM (build 14.3-b01, mixed mode)
>
> Edson Ramiro
>
>
> On 29 March 2010 17:14, Todd Lipcon <t...@cloudera.com> wrote:
>
> > Hi Edson,
> >
> > It looks like for some reason your kernel does not have epoll enabled.
> > It's very strange, since your kernel is very recent (in fact, bleeding edge!)
> >
> > Can you check the contents of /proc/sys/fs/epoll/max_user_watches?
> >
> > Are you involved with the Debian community? This sounds like a general
> > Java bug. Can you also please verify that you're using the Sun JVM and
> > not OpenJDK (the Debian folks like OpenJDK, but it has subtle issues
> > with Hadoop). You'll have to add a non-free repository and install
> > sun-java6-jdk.
> >
> > -Todd
> >
> > On Mon, Mar 29, 2010 at 1:05 PM, Edson Ramiro <erlfi...@gmail.com> wrote:
> >
> > > I'm using
> > >
> > > Linux h02 2.6.32.9 #2 SMP Sat Mar 6 19:09:13 BRT 2010 x86_64 GNU/Linux
> > >
> > > ram...@h02:~/hadoop$ cat /etc/debian_version
> > > squeeze/sid
> > >
> > > Thanks for the reply.
> > >
> > > Edson Ramiro
> > >
> > >
> > > On 29 March 2010 16:56, Todd Lipcon <t...@cloudera.com> wrote:
> > >
> > > > Hi Edson,
> > > >
> > > > What operating system are you on? What kernel version?
> > > >
> > > > Thanks
> > > > -Todd
> > > >
> > > > On Mon, Mar 29, 2010 at 12:01 PM, Edson Ramiro <erlfi...@gmail.com> wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > I'm trying to install Hadoop on a cluster, but I'm getting this error.
> > > > >
> > > > > I'm using java version "1.6.0_17" and hadoop-0.20.1+169.56.tar.gz from Cloudera.
> > > > >
> > > > > It's running on an NFS home directory shared between the nodes and masters.
> > > > >
> > > > > The NameNode works well, but all the nodes try to connect and fail.
> > > > >
> > > > > Any idea?
> > > > >
> > > > > Thanks in advance.
> > > > >
> > > > > ==> logs/hadoop-ramiro-datanode-a05.log <==
> > > > > 2010-03-29 15:56:00,168 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: lcpad/192.168.1.51:9000. Already tried 0 time(s).
> > > > > 2010-03-29 15:56:01,172 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: lcpad/192.168.1.51:9000. Already tried 1 time(s).
> > > > > 2010-03-29 15:56:02,176 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: lcpad/192.168.1.51:9000. Already tried 2 time(s).
> > > > > 2010-03-29 15:56:03,180 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: lcpad/192.168.1.51:9000. Already tried 3 time(s).
> > > > > 2010-03-29 15:56:04,184 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: lcpad/192.168.1.51:9000. Already tried 4 time(s).
> > > > > 2010-03-29 15:56:05,188 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: lcpad/192.168.1.51:9000. Already tried 5 time(s).
> > > > > 2010-03-29 15:56:06,192 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: lcpad/192.168.1.51:9000. Already tried 6 time(s).
> > > > > 2010-03-29 15:56:07,196 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: lcpad/192.168.1.51:9000. Already tried 7 time(s).
> > > > > 2010-03-29 15:56:08,200 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: lcpad/192.168.1.51:9000. Already tried 8 time(s).
> > > > > 2010-03-29 15:56:09,204 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: lcpad/192.168.1.51:9000. Already tried 9 time(s).
> > > > > 2010-03-29 15:56:09,204 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Call to lcpad/192.168.1.51:9000 failed on local exception: java.io.IOException: Function not implemented
> > > > >        at org.apache.hadoop.ipc.Client.wrapException(Client.java:775)
> > > > >        at org.apache.hadoop.ipc.Client.call(Client.java:743)
> > > > >        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
> > > > >        at $Proxy4.getProtocolVersion(Unknown Source)
> > > > >        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:359)
> > > > >        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:346)
> > > > >        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:383)
> > > > >        at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:314)
> > > > >        at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:291)
> > > > >        at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:278)
> > > > >        at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:225)
> > > > >        at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1309)
> > > > >        at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1264)
> > > > >        at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1272)
> > > > >        at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1394)
> > > > > Caused by: java.io.IOException: Function not implemented
> > > > >        at sun.nio.ch.EPollArrayWrapper.epollCreate(Native Method)
> > > > >        at sun.nio.ch.EPollArrayWrapper.<init>(EPollArrayWrapper.java:68)
> > > > >        at sun.nio.ch.EPollSelectorImpl.<init>(EPollSelectorImpl.java:52)
> > > > >        at sun.nio.ch.EPollSelectorProvider.openSelector(EPollSelectorProvider.java:18)
> > > > >        at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.get(SocketIOWithTimeout.java:407)
> > > > >        at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:322)
> > > > >        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:203)
> > > > >        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:407)
> > > > >        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:304)
> > > > >        at org.apache.hadoop.ipc.Client$Connection.access$1700(Client.java:176)
> > > > >        at org.apache.hadoop.ipc.Client.getConnection(Client.java:860)
> > > > >        at org.apache.hadoop.ipc.Client.call(Client.java:720)
> > > > >        ... 13 more
> > > > >
> > > > > Edson Ramiro
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Todd Lipcon
> > > > Software Engineer, Cloudera
> > > >
> > >
> >
> >
> >
> > --
> > Todd Lipcon
> > Software Engineer, Cloudera
> >
>



-- 
Todd Lipcon
Software Engineer, Cloudera
