[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

Vasu Mariyala (JIRA) Fri, 16 Aug 2013 09:35:18 -0700

    [ https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742367#comment-13742367 ]


Vasu Mariyala commented on HBASE-7709:
--------------------------------------

I ran all the test cases on my local machine with the trunk patch and they 
passed. But every time the patch is run on Jenkins, the build fails with:

FATAL: Unable to delete script file /tmp/hudson5964600500647866956.sh
hudson.util.IOException2: remote file operation failed: /tmp/hudson5964600500647866956.sh at hudson.remoting.Channel@5ce45886:hadoop1
        at hudson.FilePath.act(FilePath.java:902)
        at hudson.FilePath.act(FilePath.java:879)
        at hudson.FilePath.delete(FilePath.java:1288)
        at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:101)
        at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:60)
        at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
        at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:804)
        at hudson.model.Build$BuildExecution.build(Build.java:199)
        at hudson.model.Build$BuildExecution.doRun(Build.java:160)
        at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:586)
        at hudson.model.Run.execute(Run.java:1597)
        at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
        at hudson.model.ResourceController.execute(ResourceController.java:88)
        at hudson.model.Executor.run(Executor.java:247)
Caused by: hudson.remoting.ChannelClosedException: channel is already closed
        at hudson.remoting.Channel.send(Channel.java:516)
        at hudson.remoting.Request.call(Request.java:129)
        at hudson.remoting.Channel.call(Channel.java:714)
        at hudson.FilePath.act(FilePath.java:895)
        ... 13 more
Caused by: java.io.IOException: Unexpected termination of the channel
        at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:50)
Caused by: java.io.EOFException
        at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2596)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1316)
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
        at hudson.remoting.Command.readFrom(Command.java:92)
        at hudson.remoting.ClassicCommandTransport.read(ClassicCommandTransport.java:72)
        at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
FATAL: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
hudson.remoting.RequestAbortedException: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
        at hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:41)
        at hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:34)
        at hudson.remoting.Request.call(Request.java:174)
        at hudson.remoting.Channel.call(Channel.java:714)
        at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:167)
        at com.sun.proxy.$Proxy40.join(Unknown Source)
        at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:925)
        at hudson.Launcher$ProcStarter.join(Launcher.java:360)
        at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:91)
        at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:60)
        at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
        at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:804)
        at hudson.model.Build$BuildExecution.build(Build.java:199)
        at hudson.model.Build$BuildExecution.doRun(Build.java:160)
        at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:586)
        at hudson.model.Run.execute(Run.java:1597)
        at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
        at hudson.model.ResourceController.execute(ResourceController.java:88)
        at hudson.model.Executor.run(Executor.java:247)
Caused by: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
        at hudson.remoting.Request.abort(Request.java:299)
        at hudson.remoting.Channel.terminate(Channel.java:774)
        at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:69)
Caused by: java.io.IOException: Unexpected termination of the channel
        at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:50)
Caused by: java.io.EOFException
        at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2596)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1316)
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
        at hudson.remoting.Command.readFrom(Command.java:92)
        at hudson.remoting.ClassicCommandTransport.read(ClassicCommandTransport.java:72)
        at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)

https://builds.apache.org/job/PreCommit-HBASE-Build/6784/console
https://builds.apache.org/job/PreCommit-HBASE-Build/6781/console

Can anyone please let me know how I can resolve this issue?

                
> Infinite loop possible in Master/Master replication
> ---------------------------------------------------
>
>                 Key: HBASE-7709
>                 URL: https://issues.apache.org/jira/browse/HBASE-7709
>             Project: HBase
>          Issue Type: Bug
>          Components: Replication
>    Affects Versions: 0.94.6, 0.95.1
>            Reporter: Lars Hofhansl
>             Fix For: 0.98.0, 0.94.12, 0.96.0
>
>         Attachments: HBASE-7709-095-trunk.patch, HBASE-7709.patch, 
> HBASE-7709-rev1.patch, HBASE-7709-rev2.patch
>
>
>  We just discovered the following scenario:
> # Clusters A and B are set up in master/master replication.
> # By accident we had Cluster C replicate to Cluster A.
> Now all edits originating from C will be bouncing between A and B. Forever!
> The reason is that when the edits come in from C the cluster ID is already set 
> and won't be reset.
> We have a couple of options here:
> # Optionally only support master/master (not cycles of more than two 
> clusters). In that case we can always reset the cluster ID in the 
> ReplicationSource. That means that now cycles > 2 will have the data cycle 
> forever. This is the only option that requires no changes in the HLog format.
> # Instead of a single cluster id per edit maintain an (unordered) set of 
> cluster ids that have seen this edit. Then in ReplicationSource we drop any 
> edit that the sink has seen already. This is the cleanest approach, but it 
> might need a lot of data stored per edit if there are many clusters involved.
> # Maintain a configurable counter of the maximum cycle size we want to 
> support. Could default to 10 (even maybe even just). Store a hop-count in the 
> WAL and the ReplicationSource increases that hop-count on each hop. If we're 
> over the max, just drop the edit.
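
For illustration only, the second and third options in the quoted description
could be sketched roughly as below. This is not the attached patch and not
HBase code: ReplicationLoopSketch, Edit, markSeen, recordHop,
shouldShipBySeenSet and shouldShipByHopCount are hypothetical names, and
cluster ids are assumed to be plain UUIDs.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

/** Hypothetical sketch; none of these names are HBase APIs. */
final class ReplicationLoopSketch {

    /** An edit carrying the loop-prevention metadata of options 2 and 3. */
    static final class Edit {
        final Set<UUID> seenClusterIds = new HashSet<>(); // option 2
        int hopCount = 0;                                  // option 3
    }

    /** Marks the edit as seen by a cluster without counting a hop (e.g. its origin). */
    static void markSeen(Edit edit, UUID clusterId) {
        edit.seenClusterIds.add(clusterId);
    }

    /** Records that the edit was replicated to a cluster: one more hop, one more id. */
    static void recordHop(Edit edit, UUID clusterId) {
        edit.hopCount++;
        edit.seenClusterIds.add(clusterId);
    }

    /** Option 2: ship only if the sink cluster has not seen this edit already. */
    static boolean shouldShipBySeenSet(Edit edit, UUID sinkClusterId) {
        return !edit.seenClusterIds.contains(sinkClusterId);
    }

    /** Option 3: drop the edit once its hop count is over the configured maximum. */
    static boolean shouldShipByHopCount(Edit edit, int maxHops) {
        return edit.hopCount <= maxHops;
    }

    public static void main(String[] args) {
        UUID a = UUID.randomUUID();
        UUID b = UUID.randomUUID();
        UUID c = UUID.randomUUID();

        Edit edit = new Edit();
        markSeen(edit, c);                                  // edit originates on C
        System.out.println(shouldShipBySeenSet(edit, a));   // C -> A
        recordHop(edit, a);
        System.out.println(shouldShipBySeenSet(edit, b));   // A -> B
        recordHop(edit, b);
        System.out.println(shouldShipBySeenSet(edit, a));   // B -> A, A has seen it
    }
}
```

In the A/B/C scenario from the description, the seen-set check would stop the
edit when B tries to send it back to A, at the cost of storing one id per
cluster with each edit. The hop-count check stores a single integer but lets
the edit circulate until the limit is passed.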

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
