[jira] [Commented] (TEPHRA-179) Tephra transaction manager breaks on zookeeper restart

2016-09-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TEPHRA-179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485968#comment-15485968
 ] 

ASF GitHub Bot commented on TEPHRA-179:
---

Github user poornachandra commented on a diff in the pull request:

https://github.com/apache/incubator-tephra/pull/10#discussion_r78485992
  
--- Diff: 
tephra-core/src/main/java/org/apache/tephra/runtime/DefaultTransactionManagerProvider.java
 ---
@@ -0,0 +1,71 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.tephra.runtime;
+
+import com.google.common.annotations.VisibleForTesting;
+import com.google.inject.AbstractModule;
+import com.google.inject.Guice;
+import com.google.inject.Inject;
+import com.google.inject.Injector;
+import com.google.inject.Module;
+import com.google.inject.Provider;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.tephra.TransactionManager;
+import org.apache.twill.zookeeper.ZKClient;
+import org.apache.twill.zookeeper.ZKClientService;
+
+/**
+ * A provider for {@link TransactionManager} that provides a new instance every time.
+ */
+public class DefaultTransactionManagerProvider implements Provider<TransactionManager> {
+  private final Configuration conf;
+  private final ZKClientService zkClientService;
+
+  @Inject
+  public DefaultTransactionManagerProvider(Configuration conf, ZKClientService zkClientService) {
+    this.conf = conf;
+    this.zkClientService = zkClientService;
+  }
+
+  @Override
+  public TransactionManager get() {
+    // Create a new injector every time since Guice services cannot be restarted TEPHRA-179
+    Injector injector = Guice.createInjector(
--- End diff --

This will require more decoupling. We'll need to break the class hierarchy 
into HA Transaction Service, Transaction Service and Transaction Manager. For 
now, I have added some getter methods to help with testing in PR 
https://github.com/apache/incubator-tephra/pull/11


> Tephra transaction manager breaks on zookeeper restart
> --
>
> Key: TEPHRA-179
> URL: https://issues.apache.org/jira/browse/TEPHRA-179
> Project: Tephra
>  Issue Type: Bug
>  Components: manager
>Affects Versions: 0.8.0-incubating
> Environment: OpenJDK 8 (JDK) on Alpine Linux 3.4 in Docker
>Reporter: Francis Chuang
>Assignee: Ali Anwar
> Fix For: 0.9.0-incubating
>
>
> I am running HBase 1.2.2 with Phoenix 4.8.0 and the Tephra transaction 
> server in one Docker container. In another Docker container, I have ZooKeeper 
> 3.4.8 managed by Netflix Exhibitor.
> When everything first starts, I am able to create transactional tables and run 
> transactional queries.
> However, once Exhibitor restarts ZooKeeper and Tephra reconnects to 
> ZooKeeper, it no longer works correctly. 
> Running transactional queries results in this error:
> {code}
> Error: Error -1 (0) : Error while executing SQL "CREATE TABLE my_table321 
> (k BIGINT PRIMARY KEY, v VARCHAR) TRANSACTIONAL=true": Remote driver error: 
> RuntimeException: java.lang.Exception: Thrift error for 
> org.apache.tephra.distributed.TransactionServiceClient$2@2361d7ab: Internal 
> error processing startShort -> Exception: Thrift error for 
> org.apache.tephra.distributed.TransactionServiceClient$2@2361d7ab: Internal 
> error processing startShort -> TApplicationException: Internal error 
> processing startShort
> SQLState:  0
> ErrorCode: -1
> {code}
> This is the full log:
> {code}
> Fri Sep  2 00:26:50 UTC 2016 Starting tephra service on 
> m9edd51-hmaster1.m9edd51
> -f: file size (blocks) unlimited
> -t: cpu time (seconds) unlimited
> -d: data seg size (kb) unlimited
> -s: stack size (kb) 8192
> -c: core file size (blocks) unlimited
> -m: resident set size (kb) unlimited
> -l: locked memory (kb) 

[jira] [Commented] (TEPHRA-179) Tephra transaction manager breaks on zookeeper restart

2016-09-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TEPHRA-179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15484575#comment-15484575
 ] 

ASF GitHub Bot commented on TEPHRA-179:
---

Github user chtyim commented on a diff in the pull request:

https://github.com/apache/incubator-tephra/pull/10#discussion_r78407204
  
--- Diff: 
tephra-core/src/main/java/org/apache/tephra/runtime/DefaultTransactionManagerProvider.java
 ---
@@ -0,0 +1,71 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.tephra.runtime;
+
+import com.google.common.annotations.VisibleForTesting;
+import com.google.inject.AbstractModule;
+import com.google.inject.Guice;
+import com.google.inject.Inject;
+import com.google.inject.Injector;
+import com.google.inject.Module;
+import com.google.inject.Provider;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.tephra.TransactionManager;
+import org.apache.twill.zookeeper.ZKClient;
+import org.apache.twill.zookeeper.ZKClientService;
+
+/**
+ * A provider for {@link TransactionManager} that provides a new instance every time.
+ */
+public class DefaultTransactionManagerProvider implements Provider<TransactionManager> {
+  private final Configuration conf;
+  private final ZKClientService zkClientService;
+
+  @Inject
+  public DefaultTransactionManagerProvider(Configuration conf, ZKClientService zkClientService) {
+    this.conf = conf;
+    this.zkClientService = zkClientService;
+  }
+
+  @Override
+  public TransactionManager get() {
+    // Create a new injector every time since Guice services cannot be restarted TEPHRA-179
+    Injector injector = Guice.createInjector(
--- End diff --

This is quite hacky: a provider doesn't usually create its instance with a 
different injector. If all we need is a new instance, why not construct it 
directly with `new` here?
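
The suggestion above can be sketched in plain Java, without Guice or Guava. Both `OneShotService` and `FreshInstanceProvider` are hypothetical stand-ins, not Tephra classes. The first mimics a service that cannot be started again once stopped, which is the reason the code comment in the diff gives for needing a new instance. The second hands out a directly constructed instance on every `get()`.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for a service that can be started only once.
class OneShotService {
  private enum State { NEW, RUNNING, STOPPED }

  private State state = State.NEW;

  synchronized void start() {
    if (state != State.NEW) {
      throw new IllegalStateException("Service cannot be restarted, state: " + state);
    }
    state = State.RUNNING;
  }

  synchronized void stop() {
    state = State.STOPPED;
  }

  synchronized boolean isRunning() {
    return state == State.RUNNING;
  }
}

// Hypothetical provider that constructs a new instance directly on every call,
// instead of building a separate injector to obtain one.
class FreshInstanceProvider implements Supplier<OneShotService> {
  @Override
  public OneShotService get() {
    return new OneShotService();
  }
}

class FreshInstanceDemo {
  public static void main(String[] args) {
    FreshInstanceProvider provider = new FreshInstanceProvider();

    OneShotService first = provider.get();
    first.start();
    first.stop();

    // The stopped instance cannot be reused, so ask the provider for a new one.
    OneShotService second = provider.get();
    second.start();
    System.out.println("second instance running: " + second.isRunning());
  }
}
```

Direct construction only works this simply when the provider already holds everything the constructor needs. A real `TransactionManager` may have further dependencies, which this sketch does not model.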



[jira] [Commented] (TEPHRA-179) Tephra transaction manager breaks on zookeeper restart

2016-09-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TEPHRA-179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15484568#comment-15484568
 ] 

ASF GitHub Bot commented on TEPHRA-179:
---

Github user chtyim commented on a diff in the pull request:

https://github.com/apache/incubator-tephra/pull/10#discussion_r78406517
  
--- Diff: 
tephra-core/src/main/java/org/apache/tephra/distributed/TransactionService.java 
---
@@ -153,4 +193,31 @@ protected void internalStop() {
   }
 }
   }
+
+  private void undoRegister() {
+    if (cancelDiscovery != null) {
+      cancelDiscovery.cancel();
+    }
+  }
+
+  private void doRegister() {
--- End diff --

`register`?
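
A minimal sketch of the register / undo-register pairing discussed here, using the shorter `register` name. Everything in it is an assumption made for illustration: `Cancellable`, `InMemoryDiscovery` and `DiscoverableService` are hypothetical types, and only the null check before `cancel()` mirrors the diff.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical handle returned by a registration; calling cancel() undoes it.
interface Cancellable {
  void cancel();
}

// Hypothetical in-memory registry standing in for service discovery.
class InMemoryDiscovery {
  private final List<String> endpoints = new ArrayList<>();

  Cancellable register(String endpoint) {
    endpoints.add(endpoint);
    return () -> endpoints.remove(endpoint);
  }

  List<String> endpoints() {
    return new ArrayList<>(endpoints);
  }
}

// Hypothetical service that announces itself and can withdraw the announcement.
class DiscoverableService {
  private final InMemoryDiscovery discovery;
  private final String endpoint;
  private Cancellable cancelDiscovery;  // null until register() has been called

  DiscoverableService(InMemoryDiscovery discovery, String endpoint) {
    this.discovery = discovery;
    this.endpoint = endpoint;
  }

  void register() {
    cancelDiscovery = discovery.register(endpoint);
  }

  void undoRegister() {
    if (cancelDiscovery != null) {
      cancelDiscovery.cancel();
      // Clearing the handle is an addition in this sketch so that a second
      // undoRegister() call is a no-op; the diff does not show this step.
      cancelDiscovery = null;
    }
  }
}

class DiscoveryDemo {
  public static void main(String[] args) {
    InMemoryDiscovery discovery = new InMemoryDiscovery();
    DiscoverableService service = new DiscoverableService(discovery, "host:1234");

    service.register();
    System.out.println("registered: " + discovery.endpoints());

    service.undoRegister();
    System.out.println("after undo: " + discovery.endpoints());
  }
}
```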


> Tephra transaction manager breaks on zookeeper restart
> --
>
> Key: TEPHRA-179
> URL: https://issues.apache.org/jira/browse/TEPHRA-179
> Project: Tephra
>  Issue Type: Bug
>  Components: manager
>Affects Versions: 0.8.0-incubating
> Environment: OpenJDK 8 (JDK) on Alpine Linux 3.4 in Docker
>Reporter: Francis Chuang
>Assignee: Ali Anwar
> Fix For: 0.9.0-incubating
>
>
> I am running HBase 1.2.2 with Phoenix 4.8.0 and the Tephra transaction 
> server in one Docker container. In another Docker container, I have ZooKeeper 
> 3.4.8 managed by Netflix Exhibitor.
> When everything first starts, I am able to create transactional tables and run 
> transactional queries.
> However, once Exhibitor restarts ZooKeeper and Tephra reconnects to 
> ZooKeeper, it no longer works correctly. 
> Running transactional queries results in this error:
> {code}
> Error: Error -1 (0) : Error while executing SQL "CREATE TABLE my_table321 
> (k BIGINT PRIMARY KEY, v VARCHAR) TRANSACTIONAL=true": Remote driver error: 
> RuntimeException: java.lang.Exception: Thrift error for 
> org.apache.tephra.distributed.TransactionServiceClient$2@2361d7ab: Internal 
> error processing startShort -> Exception: Thrift error for 
> org.apache.tephra.distributed.TransactionServiceClient$2@2361d7ab: Internal 
> error processing startShort -> TApplicationException: Internal error 
> processing startShort
> SQLState:  0
> ErrorCode: -1
> {code}
> This is the full log:
> {code}
> Fri Sep  2 00:26:50 UTC 2016 Starting tephra service on 
> m9edd51-hmaster1.m9edd51
> -f: file size (blocks) unlimited
> -t: cpu time (seconds) unlimited
> -d: data seg size (kb) unlimited
> -s: stack size (kb) 8192
> -c: core file size (blocks) unlimited
> -m: resident set size (kb) unlimited
> -l: locked memory (kb) 64
> -p: processes  unlimited
> -n: file descriptors   65536
> -v: address space (kb) unlimited
> -w: locks  unlimited
> -e: scheduling priority 0
> -r: real-time priority 0
> Command:  /usr/lib/jvm/java-1.8-openjdk/bin/java -XX:+UseConcMarkSweepGC -cp 
> 

[jira] [Commented] (TEPHRA-179) Tephra transaction manager breaks on zookeeper restart

2016-09-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TEPHRA-179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15484555#comment-15484555
 ] 

ASF GitHub Bot commented on TEPHRA-179:
---

Github user chtyim commented on a diff in the pull request:

https://github.com/apache/incubator-tephra/pull/10#discussion_r78405767
  
--- Diff: 
tephra-core/src/main/java/org/apache/tephra/distributed/TransactionService.java 
---
@@ -42,28 +45,64 @@
 import java.util.concurrent.ExecutionException;
 import java.util.concurrent.TimeUnit;
 import java.util.concurrent.TimeoutException;
+import javax.annotation.Nullable;
 
 /**
  *
  */
-public final class TransactionService extends InMemoryTransactionService {
+public class TransactionService extends AbstractService {
--- End diff --

Add a Javadoc for this class describing what it does and when it should 
be used.



[jira] [Commented] (TEPHRA-179) Tephra transaction manager breaks on zookeeper restart

2016-09-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TEPHRA-179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15478777#comment-15478777
 ] 

ASF GitHub Bot commented on TEPHRA-179:
---

GitHub user poornachandra opened a pull request:

https://github.com/apache/incubator-tephra/pull/10

TEPHRA-179 Transaction service high availability changes

Restructuring the Transaction Service classes to allow for HA restart while 
binding Transaction Manager and other classes as singletons

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/poornachandra/incubator-tephra feature/tx-service-ha

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-tephra/pull/10.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #10


commit 4b2bfe6a6733440fa73c7f5e00ff499662387911
Author: poorna 
Date:   2016-09-09T23:44:22Z

TEPHRA-179 Transaction service high availability changes

commit 7efff83009675a512f39cf8b0e94d1c6a1cde20a
Author: poorna 
Date:   2016-09-10T00:24:26Z

Add HA test





[jira] [Commented] (TEPHRA-179) Tephra transaction manager breaks on zookeeper restart

2016-09-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TEPHRA-179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15471730#comment-15471730
 ] 

ASF GitHub Bot commented on TEPHRA-179:
---

Github user poornachandra commented on a diff in the pull request:

https://github.com/apache/incubator-tephra/pull/2#discussion_r77899348
  
--- Diff: 
tephra-core/src/main/java/org/apache/tephra/runtime/TransactionDistributedModule.java
 ---
@@ -41,14 +41,15 @@
 
   @Override
   protected void configure() {
+    // some of these classes need to be non-singleton in order to create a new instance during leader() in
+    // TransactionService
     bind(SnapshotCodecProvider.class).in(Singleton.class);
-    bind(TransactionStateStorage.class).annotatedWith(Names.named("persist"))
-      .to(HDFSTransactionStateStorage.class).in(Singleton.class);
-    bind(TransactionStateStorage.class).toProvider(TransactionStateStorageProvider.class).in(Singleton.class);
+    bind(TransactionStateStorage.class).annotatedWith(Names.named("persist")).to(HDFSTransactionStateStorage.class);
+    bind(TransactionStateStorage.class).toProvider(TransactionStateStorageProvider.class);
 
-    bind(TransactionManager.class).in(Singleton.class);
-    bind(TransactionSystemClient.class).to(TransactionServiceClient.class).in(Singleton.class);
--- End diff --

Since `TransactionServiceClient` can contain a pool of Thrift clients, it is 
better to keep it as a singleton.
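
To illustrate the concern in plain Java, without Guice: if each lookup constructs a new client, each lookup also allocates a new pool. `PooledClient` and `SingletonSupplier` below are hypothetical names used only for this sketch. The counter stands in for the cost of building a pool of Thrift clients, and the memoizing supplier approximates what a singleton binding gives you.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical client that owns a pool of connections; not the Tephra class.
class PooledClient {
  static final AtomicInteger POOLS_CREATED = new AtomicInteger();

  PooledClient() {
    // Stands in for allocating a pool of Thrift clients.
    POOLS_CREATED.incrementAndGet();
  }
}

// Memoizing supplier: creates the instance on first use and then reuses it,
// approximating a singleton-scoped binding.
class SingletonSupplier<T> implements Supplier<T> {
  private final Supplier<T> delegate;
  private T instance;

  SingletonSupplier(Supplier<T> delegate) {
    this.delegate = delegate;
  }

  @Override
  public synchronized T get() {
    if (instance == null) {
      instance = delegate.get();
    }
    return instance;
  }
}

class SingletonDemo {
  public static void main(String[] args) {
    // Unscoped: every lookup builds a new client and therefore a new pool.
    Supplier<PooledClient> unscoped = PooledClient::new;
    unscoped.get();
    unscoped.get();
    System.out.println("pools after unscoped lookups: " + PooledClient.POOLS_CREATED.get());

    // Singleton: repeated lookups share one client and one pool.
    Supplier<PooledClient> singleton = new SingletonSupplier<>(PooledClient::new);
    PooledClient a = singleton.get();
    PooledClient b = singleton.get();
    System.out.println("same instance: " + (a == b));
    System.out.println("pools after singleton lookups: " + PooledClient.POOLS_CREATED.get());
  }
}
```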

