Hi Ming,

This looks like a bug. Feel free to dig in and try to fix it :)
The cross-region stuff in hedwig was never tested extensively, so there are
probably quite a few bugs in there.

Regards
Ivan

On Mon, Jan 12, 2015 at 7:42 PM, Ming Chen <[email protected]> wrote:

> FYI, the cross-region communication is working now after I used the
> latest code from git and enabled SSL in conf.
>
> Even though there seems to be an infinite loop when I do "sub mytopic
> myid1-1 2" in "hedwig console":
> [hedwig: (reg1) 164] sub mytopic myid1-1 2
> SUB DONE AND RECEIVE
> Finished 0.031 s.
> [hedwig: (reg1) 165] Received message from topic mytopic for subscriber myid1-1 : neeeeew-msg-from-reg2
> Received message from topic mytopic for subscriber myid1-1 : mysg-1-2
> Received message from topic mytopic for subscriber myid1-1 : abs-new-msg-from-reg1
> Received message from topic mytopic for subscriber myid1-1 : mysg-1-2
> Received message from topic mytopic for subscriber myid1-1 : neeeeew-msg-from-reg2
> Received message from topic mytopic for subscriber myid1-1 : msg-2-1
> Received message from topic mytopic for subscriber myid1-1 : abs-new-msg-from-reg1
> Received message from topic mytopic for subscriber myid1-1 : mysg-1-2
> Received message from topic mytopic for subscriber myid1-1 : neeeeew-msg-from-reg2
> ...
>
> Thanks,
> Ming
>
> On Thu, Jan 8, 2015 at 11:24 AM, Ming Chen <[email protected]>
> wrote:
>
>> Hi Ivan,
>>
>> Thanks for the heads-up. Sorry that I didn't make it clear, but I did
>> set the region option in hw_server.conf to "reg1" and "reg2" for the two
>> regions, respectively.
>>
>> I tried some more experiments, and got some error message with the
>> following operations on just one region:
>> (1) format
>> (2) show topics # it throws an IOException, which is probably okay as we did not have any topic to show
>> (3) pub mytopic1 hello-topic1
>> (4) sub mytopic1 myid1 2
>>
>> [hedwig: (reg1) 88] format
>> You ask to format hedwig metadata stored in
>> org.apache.hedwig.server.meta.ZkMetadataManagerFactory.
>> Press <Return> to continue, or Q to cancel ...
>> 2015-01-08 00:09:45,752 - INFO - [main:HedwigAdmin@541] - Formatted Hedwig metadata successfully.
>> 2015-01-08 00:09:45,757 - INFO - [main:HedwigAdmin@544] - Removed old factory layout.
>> 2015-01-08 00:09:45,770 - INFO - [main:HedwigAdmin@548] - Created new factory layout.
>> Formatted hedwig metadata successfully.
>> Finished 2.352 s.
>> [hedwig: (reg1) 89] show topics
>> Unable to fetch the list of topics
>> java.io.IOException: Failed to get topics list :
>>     at org.apache.hedwig.server.meta.ZkMetadataManagerFactory.getTopics(ZkMetadataManagerFactory.java:98)
>>     at org.apache.hedwig.admin.HedwigAdmin.getTopics(HedwigAdmin.java:331)
>>     at org.apache.hedwig.admin.console.HedwigConsole$ShowCmd.showTopics(HedwigConsole.java:588)
>>     at org.apache.hedwig.admin.console.HedwigConsole$ShowCmd.runCmd(HedwigConsole.java:564)
>>     at org.apache.hedwig.admin.console.HedwigConsole.processCmd(HedwigConsole.java:966)
>>     at org.apache.hedwig.admin.console.HedwigConsole.executeLine(HedwigConsole.java:937)
>>     at org.apache.hedwig.admin.console.HedwigConsole.run(HedwigConsole.java:1021)
>>     at org.apache.hedwig.admin.console.HedwigConsole.main(HedwigConsole.java:1036)
>> Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /hedwig/reg1/topics
>>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
>>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>>     at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1472)
>>     at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1500)
>>     at org.apache.hedwig.server.meta.ZkMetadataManagerFactory.getTopics(ZkMetadataManagerFactory.java:96)
>>     ... 7 more
>> Finished 0.015 s.
>> [hedwig: (reg1) 90] pub mytopic1 hello-topic1
>> PUB DONE
>> Finished 0.472 s.
>> [hedwig: (reg1) 91] sub mytopic1 myid1 2
>> 2015-01-08 00:13:38,021 - INFO - [New I/O worker #6:HChannelHandler@228] - Channel [id: 0x50aa85e6, /127.0.0.1:52095 :> localhost/127.0.0.1:4080] was disconnected to host localhost/127.0.0.1:4080.
>> 2015-01-08 00:13:38,022 - INFO - [New I/O worker #6:AbstractHChannelManager@357] - NonSubscription Channel [id: 0x50aa85e6, /127.0.0.1:52095 :> localhost/127.0.0.1:4080] to localhost/127.0.0.1:4080 disconnected.
>> 2015-01-08 00:13:38,030 - INFO - [New I/O worker #7:HChannelHandler@228] - Channel [id: 0x9615a67b, /127.0.0.1:52098 :> localhost/127.0.0.1:4080] was disconnected to host localhost/127.0.0.1:4080.
>> 2015-01-08 00:13:38,031 - INFO - [New I/O worker #7:SimpleHChannelManager@191] - Subscription Channel [id: 0x9615a67b, /127.0.0.1:52098 :> localhost/127.0.0.1:4080] disconnected from localhost/127.0.0.1:4080.
>> 2015-01-08 00:13:38,037 - ERROR - [main:HedwigSubscriber@130] - Unexpected PubSubException thrown:
>> org.apache.hedwig.exceptions.PubSubException$UncertainStateException: Server ack response never received before server connection disconnected!
>>     at org.apache.hedwig.client.netty.impl.HChannelHandler.channelDisconnected(HChannelHandler.java:252)
>>     at org.jboss.netty.channel.SimpleChannelHandler.handleUpstream(SimpleChannelHandler.java:120)
>>     at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
>>     at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
>>     at org.jboss.netty.handler.codec.oneone.OneToOneDecoder.handleUpstream(OneToOneDecoder.java:60)
>>     at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
>>     at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
>>     at org.jboss.netty.handler.codec.frame.FrameDecoder.cleanup(FrameDecoder.java:493)
>>     at org.jboss.netty.handler.codec.frame.FrameDecoder.channelDisconnected(FrameDecoder.java:365)
>>     at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:102)
>>     at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
>>     at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
>>     at org.jboss.netty.channel.Channels.fireChannelDisconnected(Channels.java:396)
>>     at org.jboss.netty.channel.socket.nio.AbstractNioWorker.close(AbstractNioWorker.java:360)
>>     at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:93)
>>     at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
>>     at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
>>     at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
>>     at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
>>     at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
>>     at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>     at java.lang.Thread.run(Thread.java:745)
>> SUB FAILED
>> org.apache.hedwig.exceptions.PubSubException$ServiceDownException: org.apache.hedwig.exceptions.PubSubException$UncertainStateException: Server ack response never received before server connection disconnected!
>>     at org.apache.hedwig.client.netty.HedwigSubscriber.subUnsub(HedwigSubscriber.java:133)
>>     at org.apache.hedwig.client.netty.HedwigSubscriber.subscribe(HedwigSubscriber.java:194)
>>     at org.apache.hedwig.client.netty.HedwigSubscriber.subscribe(HedwigSubscriber.java:181)
>>     at org.apache.hedwig.admin.console.HedwigConsole$SubCmd.runCmd(HedwigConsole.java:291)
>>     at org.apache.hedwig.admin.console.HedwigConsole.processCmd(HedwigConsole.java:966)
>>     at org.apache.hedwig.admin.console.HedwigConsole.executeLine(HedwigConsole.java:937)
>>     at org.apache.hedwig.admin.console.HedwigConsole.run(HedwigConsole.java:1021)
>>     at org.apache.hedwig.admin.console.HedwigConsole.main(HedwigConsole.java:1036)
>> Caused by: org.apache.hedwig.exceptions.PubSubException$UncertainStateException: Server ack response never received before server connection disconnected!
>>     at org.apache.hedwig.client.netty.impl.HChannelHandler.channelDisconnected(HChannelHandler.java:252)
>>     at org.jboss.netty.channel.SimpleChannelHandler.handleUpstream(SimpleChannelHandler.java:120)
>>     at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
>>     at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
>>     at org.jboss.netty.handler.codec.oneone.OneToOneDecoder.handleUpstream(OneToOneDecoder.java:60)
>>     at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
>>     at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
>>     at org.jboss.netty.handler.codec.frame.FrameDecoder.cleanup(FrameDecoder.java:493)
>>     at org.jboss.netty.handler.codec.frame.FrameDecoder.channelDisconnected(FrameDecoder.java:365)
>>     at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:102)
>>     at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
>>     at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
>>     at org.jboss.netty.channel.Channels.fireChannelDisconnected(Channels.java:396)
>>     at org.jboss.netty.channel.socket.nio.AbstractNioWorker.close(AbstractNioWorker.java:360)
>>     at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:93)
>>     at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
>>     at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
>>     at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
>>     at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
>>     at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
>>     at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>     at java.lang.Thread.run(Thread.java:745)
>>
>> Thanks,
>> Ming
>>
>>
>> On Thu, Jan 8, 2015 at 6:05 AM, Ivan Kelly <[email protected]> wrote:
>> > Hi Ming,
>> >
>> > It's been a long time since I looked at the region stuff in hedwig, but I
>> > think it could be that you don't seem to be setting the region identifier in
>> > hw_server.conf. You need to change "region" in hw_server to some identifier,
>> > like reg1 and reg2 for your example.
>> >
>> > Hope this helps,
>> > Ivan
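
A note on the repeating "Received message" output reported in this thread: one quick way to check whether the same bodies are being redelivered, rather than newly published, is to count deliveries per message in a saved copy of the console output. The Python sketch below is a hypothetical helper and not part of Hedwig. The names count_deliveries and redelivered are made up for illustration, and it assumes message bodies contain no whitespace, as in the pub examples here.

```python
import re
from collections import Counter

# Matches hedwig console lines such as:
#   Received message from topic mytopic for subscriber myid1-1 : mysg-1-2
# Mail clients may wrap the line, so any whitespace (including a newline)
# is accepted before the subscriber id and around the colon.
# Assumption: the message body contains no whitespace.
RECEIVED = re.compile(
    r"Received message from topic (\S+) for subscriber\s+(\S+)\s*:\s*(\S+)"
)


def count_deliveries(console_output: str) -> Counter:
    """Count deliveries per (topic, subscriber, body) triple."""
    # Drop leading mail quote markers ("> ", ">> ") so that text pasted
    # from a quoted email parses the same way as raw console output.
    cleaned = re.sub(r"^[> ]+", "", console_output, flags=re.MULTILINE)
    return Counter(RECEIVED.findall(cleaned))


def redelivered(console_output: str) -> dict:
    """Return only the triples that were delivered more than once."""
    counts = count_deliveries(console_output)
    return {key: n for key, n in counts.items() if n > 1}
```

In the excerpt Ming posted, mysg-1-2 and neeeeew-msg-from-reg2 each appear three times and abs-new-msg-from-reg1 twice, which is the kind of pattern this helper would flag. It only confirms the redelivery. It says nothing about where in the cross-region path the loop originates.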
