It actually failed on another node, brick5, too. I do have write permission on the /tmp dir. But the weirdest thing is this: the snappy .so file is generated every time I run the example, yet its owner is another user who once ran Hama on the same machine, even when I manually rm and re-generate the file! And I don't have permission to rm it!
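For anyone hitting the same thing, here is a minimal sketch (not part of Hama or snappy-java) for checking who owns the extracted library compared with the user the JVM runs as. The class name and helper methods are made up for illustration; the default path is the one from the stack trace below. One thing worth knowing: when a directory has the sticky bit set, as /tmp usually does, write permission on the directory is not enough to remove a file owned by someone else.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical diagnostic helper, for illustration only.
public class SnappyTempFileCheck {

    /** Returns the owner name of the file, or null if the file does not exist. */
    static String ownerOf(Path file) {
        if (!Files.exists(file)) {
            return null;
        }
        try {
            return Files.getOwner(file).getName();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** True if the file exists and is owned by someone other than the given user. */
    static boolean ownedBySomeoneElse(Path file, String user) {
        String owner = ownerOf(file);
        return owner != null && !owner.equals(user);
    }

    public static void main(String[] args) {
        // Default path taken from the IOException message in this thread.
        Path lib = Paths.get(args.length > 0
                ? args[0]
                : "/tmp/snappy-1.0.4.1-libsnappyjava.so");
        String me = System.getProperty("user.name");

        System.out.println("JVM runs as:  " + me);
        System.out.println("file owner:   " + ownerOf(lib));
        System.out.println("foreign file: " + ownedBySomeoneElse(lib, me));
    }
}
```

If this reports a foreign owner, it may mean the process that extracts the library is running under the other account rather than mine. As far as I can tell from the SnappyLoader source, snappy-java reads the `org.xerial.snappy.tempdir` system property and falls back to `java.io.tmpdir`, so pointing it at a per-user directory might avoid the clash; I have not verified how to pass that to the Hama task JVM.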

2012/9/11 Thomas Jungblut <[email protected]>

> Oh, or just remove the snappy compression from the configuration ;)
>
> 2012/9/11 Thomas Jungblut <[email protected]>
>
>> Can you give him permission to delete in /tmp/?
>> Why is it not failing on the other hosts?
>> Otherwise you have to manually add the snappy library to the lib path of
>> the task.
>>
>> 2012/9/11 Sandy Ding <[email protected]>
>>
>>> the IOException stack:
>>>
>>> java.io.IOException: failed to remove existing native library file: /tmp/snappy-1.0.4.1-libsnappyjava.so
>>>     at org.xerial.snappy.SnappyLoader.extractLibraryFile(SnappyLoader.java:376)
>>>     at org.xerial.snappy.SnappyLoader.findNativeLibrary(SnappyLoader.java:446)
>>>     at org.xerial.snappy.SnappyLoader.loadNativeLibrary(SnappyLoader.java:308)
>>>     at org.xerial.snappy.SnappyLoader.load(SnappyLoader.java:219)
>>>     at org.xerial.snappy.Snappy.<clinit>(Snappy.java:44)
>>>     at org.xerial.snappy.SnappyOutputStream.<init>(SnappyOutputStream.java:79)
>>>     at org.xerial.snappy.SnappyOutputStream.<init>(SnappyOutputStream.java:66)
>>>     at org.apache.hama.bsp.message.compress.SnappyCompressor.compressBundle(SnappyCompressor.java:43)
>>>     at org.apache.hama.bsp.message.AvroMessageManagerImpl.serializeMessage(AvroMessageManagerImpl.java:135)
>>>     at org.apache.hama.bsp.message.AvroMessageManagerImpl.transfer(AvroMessageManagerImpl.java:79)
>>>     at org.apache.hama.bsp.BSPPeerImpl.sync(BSPPeerImpl.java:328)
>>>     at org.apache.hama.examples.PiEstimator$MyEstimator.bsp(PiEstimator.java:69)
>>>     at org.apache.hama.bsp.BSPTask.runBSP(BSPTask.java:166)
>>>     at org.apache.hama.bsp.BSPTask.run(BSPTask.java:143)
>>>     at org.apache.hama.bsp.GroomServer$BSPPeerChild.main(GroomServer.java:1158)
>>> java.lang.reflect.InvocationTargetException
>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>>     at org.xerial.snappy.SnappyLoader.loadNativeLibrary(SnappyLoader.java:317)
>>>     at org.xerial.snappy.SnappyLoader.load(SnappyLoader.java:219)
>>>     at org.xerial.snappy.Snappy.<clinit>(Snappy.java:44)
>>>     at org.xerial.snappy.SnappyOutputStream.<init>(SnappyOutputStream.java:79)
>>>     at org.xerial.snappy.SnappyOutputStream.<init>(SnappyOutputStream.java:66)
>>>     at org.apache.hama.bsp.message.compress.SnappyCompressor.compressBundle(SnappyCompressor.java:43)
>>>     at org.apache.hama.bsp.message.AvroMessageManagerImpl.serializeMessage(AvroMessageManagerImpl.java:135)
>>>     at org.apache.hama.bsp.message.AvroMessageManagerImpl.transfer(AvroMessageManagerImpl.java:79)
>>>     at org.apache.hama.bsp.BSPPeerImpl.sync(BSPPeerImpl.java:328)
>>>     at org.apache.hama.examples.PiEstimator$MyEstimator.bsp(PiEstimator.java:69)
>>>     at org.apache.hama.bsp.BSPTask.runBSP(BSPTask.java:166)
>>>     at org.apache.hama.bsp.BSPTask.run(BSPTask.java:143)
>>>     at org.apache.hama.bsp.GroomServer$BSPPeerChild.main(GroomServer.java:1158)
>>> Caused by: java.lang.UnsatisfiedLinkError: no snappyjava in java.library.path
>>>     at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1738)
>>>     at java.lang.Runtime.loadLibrary0(Runtime.java:823)
>>>     at java.lang.System.loadLibrary(System.java:1028)
>>>     at org.xerial.snappy.SnappyNativeLoader.loadLibrary(SnappyNativeLoader.java:52)
>>>     ... 17 more
>>> 12/09/11 21:27:14 ERROR bsp.BSPTask: Error running bsp setup and bsp function.
>>> java.lang.NullPointerException
>>>     at org.apache.hama.bsp.message.compress.SnappyCompressor.compressBundle(SnappyCompressor.java:56)
>>>     at org.apache.hama.bsp.message.AvroMessageManagerImpl.serializeMessage(AvroMessageManagerImpl.java:135)
>>>     at org.apache.hama.bsp.message.AvroMessageManagerImpl.transfer(AvroMessageManagerImpl.java:79)
>>>     at org.apache.hama.bsp.BSPPeerImpl.sync(BSPPeerImpl.java:328)
>>>
>>> the user that I currently run hama doesn't have permissions to remove
>>> /tmp/. files.
>>> but I cannot run hama using sudo.
>>> doesn't what to do...
>>>
>>> 2012/9/11 Thomas Jungblut <[email protected]>
>>>
>>>> If you don't have additional stack information it is difficult to guess.
>>>> According to
>>>> http://docs.oracle.com/javase/7/docs/api/java/io/File.html#delete()
>>>> it should throw an IOException that contains the underlying error. If the
>>>> shared object can't be deleted, I would try to start by checking rights or
>>>> what on the other machines is not like on the machine that failed.
>>>>
>>>> 2012/9/11 Sandy Ding <[email protected]>
>>>>
>>>>> Any possible reasons for that?
>>>>>
>>>>> 2012/9/11 Thomas Jungblut <[email protected]>
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> that is actually a good question.
>>>>>> According to the exception and the code [1] it was unable to delete the
>>>>>> file.
>>>>>> Actually this could have various reasons, what operating system are you
>>>>>> running?
>>>>>>
>>>>>> [1]
>>>>>> https://github.com/xerial/snappy-java/blob/develop/src/main/java/org/xerial/snappy/SnappyLoader.java#L374
>>>>>>
>>>>>> 2012/9/11 Sandy Ding <[email protected]>
>>>>>>
>>>>>>> What actually happened in brick4 (the node that refused r910(master)'s
>>>>>>> connection) is that it
>>>>>>> " failed to remove existing native library file: /tmp/snappy-1.0.4.1-libsnappyjava.so
>>>>>>>     at org.xerial.snappy.SnappyLoader.extractLibraryFile(SnappyLoader.java:376)"
>>>>>>> then
>>>>>>> java.lang.NullPointerException
>>>>>>>     at org.apache.hama.bsp.message.compress.SnappyCompressor.compressBundle(SnappyCompressor.java:56)
>>>>>>>
>>>>>>> But I actually run in the Hama directory and snappy-java-1.0.4.1.jar is
>>>>>>> included in hama/lib dir.
>>>>>>> I even include the hama/lib/snappy-java-1.0.4.1.jar in CLASSPATH.
>>>>>>> What happened?
>>>>>>>
>>>>>>> 2012/9/11 Sandy Ding <[email protected]>
>>>>>>>
>>>>>>>> Hi, all,
>>>>>>>>
>>>>>>>> I newly set up a 3-node hama cluster following the HamaInstallationGuide.
>>>>>>>> But I got some confusing errors when running the pi examples, the tasklog
>>>>>>>> is as follows:
>>>>>>>>
>>>>>>>> 12/09/11 19:23:39 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=brick4:21810,r910:21810 sessionTimeout=1200000 watcher=org.apache.hama.bsp.sync.ZooKeeperSyncClientImpl@e61a35
>>>>>>>> 12/09/11 19:23:39 INFO zookeeper.ClientCnxn: Opening socket connection to server brick4/10.131.201.14:21810
>>>>>>>> 12/09/11 19:23:39 INFO sync.ZooKeeperSyncClientImpl: Start connecting to Zookeeper! At r910.ppi/10.131.201.90:61001
>>>>>>>> 12/09/11 19:23:39 INFO zookeeper.ClientCnxn: Socket connection established to brick4/10.131.201.14:21810, initiating session
>>>>>>>> 12/09/11 19:23:39 INFO zookeeper.ClientCnxn: Session establishment complete on server brick4/10.131.201.14:21810, sessionid = 0x139b47286bf000c, negotiated timeout = 1200000
>>>>>>>> 12/09/11 19:23:40 INFO ipc.NettyTransceiver: Connecting to brick4.r715/10.131.201.14:61003
>>>>>>>> 12/09/11 19:23:40 INFO ipc.NettyTransceiver: [id: 0x0118278a] OPEN
>>>>>>>> java.net.ConnectException: Connection refused
>>>>>>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>>>>>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
>>>>>>>>     at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.connect(NioClientSocketPipelineSink.java:384)
>>>>>>>>     at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.processSelectedKeys(NioClientSocketPipelineSink.java:354)
>>>>>>>>     at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.run(NioClientSocketPipelineSink.java:276)
>>>>>>>>     at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
>>>>>>>>     at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:44)
>>>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>>>>>>>     at java.lang.Thread.run(Thread.java:662)
>>>>>>>> 12/09/11 19:23:40 INFO ipc.NettyTransceiver: [id: 0x0118278a] CLOSED
>>>>>>>> 12/09/11 19:23:40 INFO ipc.NettyTransceiver: Remote peer brick4.r715/10.131.201.14:61003 closed connection.
>>>>>>>> 12/09/11 19:23:40 ERROR bsp.BSPTask: Error running bsp setup and bsp function.
>>>>>>>> java.io.IOException: Error connecting to brick4.r715/10.131.201.14:61003
>>>>>>>>     at org.apache.avro.ipc.NettyTransceiver.getChannel(NettyTransceiver.java:163)
>>>>>>>>     at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:128)
>>>>>>>>     at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:91)
>>>>>>>>     at org.apache.hama.bsp.message.AvroMessageManagerImpl.transfer(AvroMessageManagerImpl.java:83)
>>>>>>>>     at org.apache.hama.bsp.BSPPeerImpl.sync(BSPPeerImpl.java:328)
>>>>>>>>     at org.apache.hama.examples.PiEstimator$MyEstimator.bsp(PiEstimator.java:69)
>>>>>>>>     at org.apache.hama.bsp.BSPTask.runBSP(BSPTask.java:166)
>>>>>>>>     at org.apache.hama.bsp.BSPTask.run(BSPTask.java:143)
>>>>>>>>     at org.apache.hama.bsp.GroomServer$BSPPeerChild.main(GroomServer.java:1158)
>>>>>>>> Caused by: java.net.ConnectException: Connection refused
>>>>>>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>>>>>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
>>>>>>>>     at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.connect(NioClientSocketPipelineSink.java:384)
>>>>>>>>     at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.processSelectedKeys(NioClientSocketPipelineSink.java:354)
>>>>>>>>     at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.run(NioClientSocketPipelineSink.java:276)
>>>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>>>>>>>     at java.lang.Thread.run(Thread.java:662)
>>>>>>>> 12/09/11 19:23:40 WARN ipc.NettyTransceiver: Unexpected exception from downstream.
>>>>>>>> java.net.ConnectException: Connection refused
>>>>>>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>>>>>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
>>>>>>>>     at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.connect(NioClientSocketPipelineSink.java:384)
>>>>>>>>     at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.processSelectedKeys(NioClientSocketPipelineSink.java:354)
>>>>>>>>     at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.run(NioClientSocketPipelineSink.java:276)
>>>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>>>>>>>     at java.lang.Thread.run(Thread.java:662)
>>>>>>>> 12/09/11 19:23:40 INFO zookeeper.ZooKeeper: Session: 0x139b47286bf000c closed
>>>>>>>> 12/09/11 19:23:40 INFO zookeeper.ClientCnxn: EventThread shut down
>>>>>>>> 12/09/11 19:23:40 ERROR bsp.BSPTask: Shutting down ping service.
>>>>>>>> 12/09/11 19:23:40 FATAL bsp.GroomServer: Error running child
>>>>>>>> java.io.IOException: Error connecting to brick4.r715/10.131.201.14:61003
>>>>>>>>     at org.apache.avro.ipc.NettyTransceiver.getChannel(NettyTransceiver.java:163)
>>>>>>>>     at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:128)
>>>>>>>>     at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:91)
>>>>>>>>     at org.apache.hama.bsp.message.AvroMessageManagerImpl.transfer(AvroMessageManagerImpl.java:83)
>>>>>>>>     at org.apache.hama.bsp.BSPPeerImpl.sync(BSPPeerImpl.java:328)
>>>>>>>>     at org.apache.hama.examples.PiEstimator$MyEstimator.bsp(PiEstimator.java:69)
>>>>>>>>     at org.apache.hama.bsp.BSPTask.runBSP(BSPTask.java:166)
>>>>>>>>     at org.apache.hama.bsp.BSPTask.run(BSPTask.java:143)
>>>>>>>>     at org.apache.hama.bsp.GroomServer$BSPPeerChild.main(GroomServer.java:1158)
>>>>>>>> Caused by: java.net.ConnectException: Connection refused
>>>>>>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>>>>>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
>>>>>>>>     at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.connect(NioClientSocketPipelineSink.java:384)
>>>>>>>>     at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.processSelectedKeys(NioClientSocketPipelineSink.java:354)
>>>>>>>>     at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.run(NioClientSocketPipelineSink.java:276)
>>>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>>>>>>>     at java.lang.Thread.run(Thread.java:662)
>>>>>>>> java.io.IOException: Error connecting to brick4.r715/10.131.201.14:61003
>>>>>>>>     at org.apache.avro.ipc.NettyTransceiver.getChannel(NettyTransceiver.java:163)
>>>>>>>>     at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:128)
>>>>>>>>     at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:91)
>>>>>>>>     at org.apache.hama.bsp.message.AvroMessageManagerImpl.transfer(AvroMessageManagerImpl.java:83)
>>>>>>>>     at org.apache.hama.bsp.BSPPeerImpl.sync(BSPPeerImpl.java:328)
>>>>>>>>     at org.apache.hama.examples.PiEstimator$MyEstimator.bsp(PiEstimator.java:69)
>>>>>>>>     at org.apache.hama.bsp.BSPTask.runBSP(BSPTask.java:166)
>>>>>>>>     at org.apache.hama.bsp.BSPTask.run(BSPTask.java:143)
>>>>>>>>     at org.apache.hama.bsp.GroomServer$BSPPeerChild.main(GroomServer.java:1158)
>>>>>>>> Caused by: java.net.ConnectException: Connection refused
>>>>>>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>>>>>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
>>>>>>>>     at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.connect(NioClientSocketPipelineSink.java:384)
>>>>>>>>     at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.processSelectedKeys(NioClientSocketPipelineSink.java:354)
>>>>>>>>     at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.run(NioClientSocketPipelineSink.java:276)
>>>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>>>>>>>     .....
>>>>>>>>
>>>>>>>> Can anybody help? I am really desparate...
>>>>>>>>
>>>>>>>> Thanks in advance,
>>>>>>>>
>>>>>>>> Sandy
