[jira] [Commented] (HBASE-24623) SIGSEGV v ~StubRoutines::jbyte_disjoint_arraycopy
[ https://issues.apache.org/jira/browse/HBASE-24623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17348030#comment-17348030 ]

Andrew Kyle Purtell commented on HBASE-24623:
---------------------------------------------

Not sure. The idea didn’t pan out, so if it hasn’t been committed the issue can just be closed.

> SIGSEGV v ~StubRoutines::jbyte_disjoint_arraycopy
> -------------------------------------------------
>
> Key: HBASE-24623
> URL: https://issues.apache.org/jira/browse/HBASE-24623
> Project: HBase
> Issue Type: Bug
> Affects Versions: 2.3.0
> Reporter: Michael Stack
> Priority: Major
>
> In testing, 1% of a decent cluster went down with this seg fault in the vm:
> {code}
> # A fatal error has been detected by the Java Runtime Environment:
> #
> # SIGSEGV (0xb) at pc=0x7f6659052410, pid=37208, tid=0x7f3c89453700
> #
> # JRE version: OpenJDK Runtime Environment (8.0_232-b09) (build 1.8.0_232-b09)
> # Java VM: OpenJDK 64-Bit Server VM (25.232-b09 mixed mode linux-amd64 )
> # Problematic frame:
> # v ~StubRoutines::jbyte_disjoint_arraycopy
> {code}
> Looking in the hs_err log, the crash happens in the same area. Here are a few of the stack traces:
> {code}
> Stack: [0x7f3c89353000,0x7f3c89454000], sp=0x7f3c89452110, free space=1020k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
> v ~StubRoutines::jbyte_disjoint_arraycopy
> J 17674 C2 org.apache.hadoop.hbase.util.ByteBufferUtils.copyFromBufferToArray([BLjava/nio/ByteBuffer;III)V (69 bytes) @ 0x7f665af000d1 [0x7f665aefffe0+0xf1]
> J 17732 C1 org.apache.hadoop.hbase.CellUtil.copyQualifierTo(Lorg/apache/hadoop/hbase/Cell;[BI)I (59 bytes) @ 0x7f665bc440dc [0x7f665bc43b80+0x55c]
> j org.apache.hadoop.hbase.CellUtil.cloneQualifier(Lorg/apache/hadoop/hbase/Cell;)[B+12
> J 22278 C2 org.apache.hadoop.hbase.ByteBufferKeyValue.getQualifierArray()[B (5 bytes) @ 0x7f6659bd4784 [0x7f6659bd4760+0x24]
> j org.apache.hadoop.hbase.CellUtil.getCellKeyAsString(Lorg/apache/hadoop/hbase/Cell;Ljava/util/function/Function;)Ljava/lang/String;+97
> j org.apache.hadoop.hbase.CellUtil.getCellKeyAsString(Lorg/apache/hadoop/hbase/Cell;)Ljava/lang/String;+6
> j org.apache.hadoop.hbase.CellUtil.toString(Lorg/apache/hadoop/hbase/Cell;Z)Ljava/lang/String;+16
> j org.apache.hadoop.hbase.ByteBufferKeyValue.toString()Ljava/lang/String;+2
> j org.apache.hadoop.hbase.client.Mutation.add(Lorg/apache/hadoop/hbase/Cell;)Lorg/apache/hadoop/hbase/client/Mutation;+28
> J 22605 C2 org.apache.hadoop.hbase.client.Put.add(Lorg/apache/hadoop/hbase/Cell;)Lorg/apache/hadoop/hbase/client/Put; (8 bytes) @ 0x7f665a982a04 [0x7f665a9829e0+0x24]
> J 22112 C2 org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.toPut(Lorg/apache/hadoop/hbase/shaded/protobuf/generated/ClientProtos$MutationProto;Lorg/apache/hadoop/hbase/CellScanner;)Lorg/apache/hadoop/hbase/client/Put; (910 bytes) @ 0x7f665c706700 [0x7f665c706000+0x700]
> J 24084 C2 org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(Lorg/apache/hadoop/hbase/shaded/protobuf/generated/ClientProtos$RegionActionResult$Builder;Lorg/apache/hadoop/hbase/regionserver/HRegion;Lorg/apache/hadoop/hbase/quotas/OperationQuota;Ljava/util/List;Lorg/apache/hadoop/hbase/CellScanner;Lorg/apache/hadoop/hbase/quotas/ActivePolicyEnforcement;Z)V (646 bytes) @ 0x7f665cc21100 [0x7f665cc20c80+0x480]
> J 14696 C2 org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(Lorg/apache/hadoop/hbase/regionserver/HRegion;Lorg/apache/hadoop/hbase/quotas/OperationQuota;Lorg/apache/hadoop/hbase/shaded/protobuf/generated/ClientProtos$RegionAction;Lorg/apache/hadoop/hbase/CellScanner;Lorg/apache/hadoop/hbase/shaded/protobuf/generated/ClientProtos$RegionActionResult$Builder;Ljava/util/List;JLorg/apache/hadoop/hbase/regionserver/RSRpcServices$RegionScannersCloseCallBack;Lorg/apache/hadoop/hbase/ipc/RpcCallContext;Lorg/apache/hadoop/hbase/quotas/ActivePolicyEnforcement;)Ljava/util/List; (901 bytes) @ 0x7f665b722148 [0x7f665b7218e0+0x868]
> {code}
> Here's another:
> {code}
> Stack: [0x7edd015e2000,0x7edd016e3000], sp=0x7edd016e11b0, free space=1020k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
> v ~StubRoutines::jbyte_disjoint_arraycopy
> J 18255 C2 org.apache.hadoop.hbase.util.ByteBufferUtils.copyFromBufferToArray([BLjava/nio/ByteBuffer;III)V (69 bytes) @ 0x7f06d2593551 [0x7f06d2593460+0xf1]
> j org.apache.hadoop.hbase.PrivateCellUtil.copyTagsTo(Lorg/apache/hadoop/hbase/Cell;[BI)I+31
> j org.apache.hadoop.hbase.CellUtil.cloneTags(Lorg/apache/hadoop/hbase/Cell;)[B+12
> j org.apache.hadoop.hbase.ByteBufferKeyValue.getTagsArray()[B+1
> j
[jira] [Commented] (HBASE-24623) SIGSEGV v ~StubRoutines::jbyte_disjoint_arraycopy
[ https://issues.apache.org/jira/browse/HBASE-24623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17348028#comment-17348028 ]

Anoop Sam John commented on HBASE-24623:
----------------------------------------

So your jira was never committed, Andy?
[jira] [Commented] (HBASE-24623) SIGSEGV v ~StubRoutines::jbyte_disjoint_arraycopy
[ https://issues.apache.org/jira/browse/HBASE-24623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17348027#comment-17348027 ]

Anoop Sam John commented on HBASE-24623:
----------------------------------------

Oh, thanks for those details, Andy. Now I remember your comments about it being only at the client end. I see. Yes, perf-wise it will for sure be a problem. I was thinking about turning it off temporarily at the server end and watching for a few days. So it seems the way forward is to tune the cluster size and configs so as not to make the RS memory-heavy. We saw issues on RSs where memory usage was well above 95%!
[jira] [Commented] (HBASE-24623) SIGSEGV v ~StubRoutines::jbyte_disjoint_arraycopy
[ https://issues.apache.org/jira/browse/HBASE-24623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17347793#comment-17347793 ]

Andrew Kyle Purtell commented on HBASE-24623:
---------------------------------------------

{quote}There was a jira that [~andrew.purt...@gmail.com] did for turning off the usage of Unsafe. {quote}
We did this for the client side. The thought was our application server, running on Java 11, embedding the HBase client did not need to use Unsafe there, so Unsafe was an unnecessary risk. Well, turns out I was wrong, even on the client side we need Unsafe for performance. As soon as we tried it we were dinged for a significant performance regression. Turning off Unsafe on the server would be a nonstarter, in terms of performance loss.
[jira] [Commented] (HBASE-24623) SIGSEGV v ~StubRoutines::jbyte_disjoint_arraycopy
[ https://issues.apache.org/jira/browse/HBASE-24623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17347682#comment-17347682 ]

Anoop Sam John commented on HBASE-24623:
----------------------------------------

So this is not the case of early release of a BB (which was the case Duo mentioned). That won't cause a JVM crash with a memory copy problem. I too faced this issue last week. I believe this happens when there is heavy memory usage on the RS side and lots of GC activity. I could see the RS memory was >95%.

On the replication sink side, one RS received the data to be replicated in a replicateWALEntry() call. This is received into an offheap BB (Netty's, as NettyRpcServer is the default. We won't copy from there to onheap for creating the CellScanner, right [~zhangduo]?). The ReplicationSink will then act like an HBase client and issue a table.batch() call to write the replicated rows. As part of this, we create CellBlocks, which includes writing/encoding the Cells to KVCodec#Encoder. So here we will have a copy of data from offheap to offheap (yes, the cellblock build will use DBBs on the RS side), and we will use the Unsafe memory copy API. My guess is we might be hitting some JDK bug with this Unsafe copy when there is heavy memory usage and GC activity. Thoughts?

There was a jira that [~andrew.purt...@gmail.com] did for turning off the usage of Unsafe. I am not able to remember that Jira id though. I need to use it on my cluster and see whether we still see the issue.
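
For readers following the thread: the crashing frame in the quoted traces is a copy out of a ByteBuffer into a byte[], and the comment above suspects the Unsafe-based copy path. Unsafe memory copies perform no bounds or liveness checks, so a bad address ends in a SIGSEGV instead of a Java exception. The sketch below is not HBase's implementation; it is a minimal, assumed alternative using only the public java.nio.ByteBuffer API. The class name SafeBufferCopy is hypothetical, and the parameter order (destination array, source buffer, source offset, destination offset, length) is only inferred from the method descriptor in the trace.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not HBase code: a bounds-checked copy from a
// (possibly direct/offheap) ByteBuffer into a byte[] using only public API.
final class SafeBufferCopy {
    private SafeBufferCopy() {}

    /**
     * Copies length bytes from src, starting at absolute index srcOffset,
     * into dest starting at destOffset. The position and limit of src are
     * left untouched because the copy goes through a duplicate view.
     */
    static void copyFromBufferToArray(byte[] dest, ByteBuffer src,
            int srcOffset, int destOffset, int length) {
        // Explicit range checks: a bad offset surfaces as an exception
        // rather than as an out-of-range native memory access.
        if (srcOffset < 0 || destOffset < 0 || length < 0
                || srcOffset > src.limit() - length
                || destOffset > dest.length - length) {
            throw new IndexOutOfBoundsException(
                "srcOffset=" + srcOffset + ", destOffset=" + destOffset
                + ", length=" + length + ", src.limit=" + src.limit()
                + ", dest.length=" + dest.length);
        }
        // duplicate() shares the content but has its own position and limit.
        ByteBuffer view = src.duplicate();
        view.position(srcOffset);
        view.get(dest, destOffset, length);
    }
}
```

As the earlier comment in this thread notes, avoiding Unsafe cost a significant amount of performance when it was tried on the client side, so a path like this is at best a diagnostic aid for narrowing down the crash, not a proposed fix. It also cannot detect a buffer whose native memory has already been freed or recycled.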
[jira] [Commented] (HBASE-24623) SIGSEGV v ~StubRoutines::jbyte_disjoint_arraycopy
[ https://issues.apache.org/jira/browse/HBASE-24623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17147477#comment-17147477 ]

Michael Stack commented on HBASE-24623:
---------------------------------------

bq. IIRC we've fixed a bug related to this area before, where we release the byte buffer before actually writing it out, which causes WAL splitting to fail.

Did it cause a JVM crash?
[jira] [Commented] (HBASE-24623) SIGSEGV v ~StubRoutines::jbyte_disjoint_arraycopy
[ https://issues.apache.org/jira/browse/HBASE-24623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17146861#comment-17146861 ]

Anoop Sam John commented on HBASE-24623:
----------------------------------------

bq. So a broken WAL file?

Seems not, as Stack says "If heavy-reads and no Replication, all is fine too." This means that somewhere we miss the accounting part of the BBs. A BB is getting reused before it's actually released. And it looks like the leak is in the Replication area.
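
To make the "accounting" hypothesis above concrete, here is a minimal sketch of reference counting around a pooled buffer. It is an illustration under assumptions, not HBase's or Netty's actual classes: RefCountedBuffer and its recycler callback are hypothetical names. The point is that every holder calls retain() before use and release() after, the buffer only returns to the pool when the count reaches zero, and a read after that fails with an exception instead of silently seeing recycled memory.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical wrapper, not an HBase or Netty class.
final class RefCountedBuffer {
    private final ByteBuffer buf;
    private final Runnable recycler;                              // returns buf to its pool
    private final AtomicInteger refCnt = new AtomicInteger(1);    // creator holds one reference

    RefCountedBuffer(ByteBuffer buf, Runnable recycler) {
        this.buf = buf;
        this.recycler = recycler;
    }

    /** Takes an additional reference; fails if the buffer was already released. */
    RefCountedBuffer retain() {
        for (;;) {
            int c = refCnt.get();
            if (c <= 0) {
                throw new IllegalStateException("retain after release");
            }
            if (refCnt.compareAndSet(c, c + 1)) {
                return this;
            }
        }
    }

    /** Drops one reference; returns true if this call recycled the buffer. */
    boolean release() {
        for (;;) {
            int c = refCnt.get();
            if (c <= 0) {
                throw new IllegalStateException("release of an already released buffer");
            }
            if (refCnt.compareAndSet(c, c - 1)) {
                if (c == 1) {
                    recycler.run();
                    return true;
                }
                return false;
            }
        }
    }

    /** Reads one byte, refusing access once the buffer has been recycled. */
    byte get(int index) {
        if (refCnt.get() <= 0) {
            throw new IllegalStateException("buffer used after release");
        }
        return buf.get(index);
    }
}
```

The check in get() is a check-then-use and can still race with a concurrent release(), so it only helps catch accounting mistakes during debugging; correctness still depends on every code path pairing retain() with release().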
[jira] [Commented] (HBASE-24623) SIGSEGV v ~StubRoutines::jbyte_disjoint_arraycopy
[ https://issues.apache.org/jira/browse/HBASE-24623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17146736#comment-17146736 ]

Duo Zhang commented on HBASE-24623:

So a broken WAL file? IIRC we've fixed a bug related to this area before, where we release the byte buffer before actually writing it out, which causes WAL splitting to fail.
[jira] [Commented] (HBASE-24623) SIGSEGV v ~StubRoutines::jbyte_disjoint_arraycopy
[ https://issues.apache.org/jira/browse/HBASE-24623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17146698#comment-17146698 ]

Michael Stack commented on HBASE-24623:

bq. Coming to your case, so when you see the issue in write, this comes while replication?

Yeah. With replication alone, all is fine. If the same cluster also gets really heavy reads, then we see the issue. With heavy reads and no replication, all is fine too. We are running an RSRpcServices.multi call provoked by a Replication #replay from somewhere else in the cluster (hard to confirm).
[jira] [Commented] (HBASE-24623) SIGSEGV v ~StubRoutines::jbyte_disjoint_arraycopy
[ https://issues.apache.org/jira/browse/HBASE-24623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17146151#comment-17146151 ]

Anoop Sam John commented on HBASE-24623:

That will come into the picture when we read from HFiles. Previously we would read into an on-heap byte[] when we had to read from HDFS or from the BC over SSD. From 2.3.0, for this also, we use the BBs from the pool.

Coming to your case, so when you see the issue in write, this comes while replication? I can see the Mutation#add cells are backed by off heap. Those are ByteBuffer-backed KVs. When the src cluster calls RPC to the destination cluster, at the destination it will be the Netty RPC server in the picture. So the req bytes will be read into DBBs from its pool, right?
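For anyone following along, a minimal sketch of what "copying a field out of a ByteBuffer-backed KV into an on-heap byte[]" amounts to, using only the plain JDK `java.nio.ByteBuffer` API. The class and the `copyToArray` helper are hypothetical names for illustration; this is not HBase's `ByteBufferUtils.copyFromBufferToArray`, and the plain JDK path shown here is bounds-checked, so it cannot reproduce the crash itself.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class DirectBufferCopySketch {
    // Hypothetical helper (an assumption for illustration): copies `length` bytes that start
    // at absolute offset `srcOffset` of `src` into `dst` at `dstOffset`.
    // It works on a duplicate so the caller's position and limit are left untouched.
    static void copyToArray(ByteBuffer src, int srcOffset, byte[] dst, int dstOffset, int length) {
        ByteBuffer view = src.duplicate();   // shares the memory, has its own position/limit
        view.limit(srcOffset + length);      // set the limit first so the position below is legal
        view.position(srcOffset);
        view.get(dst, dstOffset, length);    // bulk copy from off-heap memory to the heap array
    }

    public static void main(String[] args) {
        // A direct (off-heap) buffer standing in for a pooled request buffer.
        ByteBuffer direct = ByteBuffer.allocateDirect(32);
        direct.put("row1:cf:qual".getBytes(StandardCharsets.US_ASCII));

        byte[] qualifier = new byte[4];
        copyToArray(direct, 8, qualifier, 0, 4);
        System.out.println(new String(qualifier, StandardCharsets.US_ASCII)); // prints "qual"
    }
}
```

The point of the sketch is only that the heap array is a private copy, while the offset and length used for the copy come from whatever the buffer currently holds.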
[jira] [Commented] (HBASE-24623) SIGSEGV v ~StubRoutines::jbyte_disjoint_arraycopy
[ https://issues.apache.org/jira/browse/HBASE-24623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17146019#comment-17146019 ]

Michael Stack commented on HBASE-24623:

No offheap BC, no offheap write path, all defaults. I do notice though that hbase-2.3.0 is the first release with ByteBufferAllocator vs ByteBufferPool; the former does refcounting where the latter did not.
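To make the refcounting point concrete, here is a minimal single-threaded sketch of why a reader that keeps a view of a pooled buffer without holding a reference can end up looking at recycled memory. `Pool`, `PooledBuffer`, `retain`, `release` and `staleRead` are all hypothetical names invented for this example; this is not the HBase allocator, and nothing in this thread confirms that this is the actual cause of the crash.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;

public class RefCountSketch {
    /** Hypothetical pool: hands out direct buffers and takes them back when the count hits zero. */
    static final class Pool {
        final ArrayDeque<ByteBuffer> free = new ArrayDeque<>();

        PooledBuffer allocate(int size) {
            ByteBuffer b = free.poll();               // reuse a returned buffer if one is available
            if (b == null || b.capacity() < size) {
                b = ByteBuffer.allocateDirect(size);
            }
            b.clear();
            return new PooledBuffer(b, this);
        }
    }

    /** Hypothetical ref-counted wrapper; a plain int is enough for a single-threaded sketch. */
    static final class PooledBuffer {
        final ByteBuffer data;
        private final Pool owner;
        private int refCnt = 1;

        PooledBuffer(ByteBuffer data, Pool owner) {
            this.data = data;
            this.owner = owner;
        }

        PooledBuffer retain() {
            if (refCnt <= 0) throw new IllegalStateException("already released");
            refCnt++;
            return this;
        }

        void release() {
            if (refCnt <= 0) throw new IllegalStateException("released too many times");
            if (--refCnt == 0) {
                owner.free.push(data);                // memory becomes available for reuse
            }
        }
    }

    /** Returns the byte a reader sees through its view after the owner has released the buffer. */
    static byte staleRead(boolean readerRetains) {
        Pool pool = new Pool();
        PooledBuffer first = pool.allocate(16);
        first.data.put(0, (byte) 'A');
        ByteBuffer readerView = first.data.duplicate(); // reader keeps a view of the same memory
        if (readerRetains) {
            first.retain();                             // reader takes its own reference
        }
        first.release();                                // owner is done with the buffer

        PooledBuffer second = pool.allocate(16);        // may or may not get the same memory back
        second.data.put(0, (byte) 'B');

        byte seen = readerView.get(0);
        if (readerRetains) {
            first.release();                            // reader drops its reference when finished
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("reader did not retain: " + (char) staleRead(false)); // prints B
        System.out.println("reader retained: " + (char) staleRead(true));        // prints A
    }
}
```

In this sketch the stale reader only sees someone else's bytes, because the memory stays mapped and JDK access is bounds-checked. It is meant to show the class of bug being discussed (a release that happens before the last reader is done), not to reproduce the SIGSEGV.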
[jira] [Commented] (HBASE-24623) SIGSEGV v ~StubRoutines::jbyte_disjoint_arraycopy
[ https://issues.apache.org/jira/browse/HBASE-24623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144663#comment-17144663 ]

Michael Stack commented on HBASE-24623:

bq. Is this with BB Pool enabled and with offheap ?

This is default hbase-2.3.0RC0 so... that means these are on by default? What's the issue [~ram_krish]? For netty I can turn on its leak detector, but we don't have anything like that for our internal BB impl?

Replication is enabled. This looks like replay of edits received from the remote being forwarded around the local cluster, and the crash happens when the terminal RegionServer is undoing the edits it got from the replay. Concurrently there is heavy read load.

Will try some more experiments tomorrow. Thanks for taking a look.
[jira] [Commented] (HBASE-24623) SIGSEGV v ~StubRoutines::jbyte_disjoint_arraycopy
[ https://issues.apache.org/jira/browse/HBASE-24623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144652#comment-17144652 ]

ramkrishna.s.vasudevan commented on HBASE-24623:

bq. We might be taking the WrongRowIOException path because the Cell is 'corrupt'

That is exactly what I meant. Is this with BB Pool enabled and with offheap?
[jira] [Commented] (HBASE-24623) SIGSEGV v ~StubRoutines::jbyte_disjoint_arraycopy
[ https://issues.apache.org/jira/browse/HBASE-24623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144479#comment-17144479 ]

Michael Stack commented on HBASE-24623:

We might be taking the WrongRowIOException path because the Cell is 'corrupt'... then the rows wouldn't match.
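A rough sketch of the shape of that path, as suggested by the stack traces (Mutation.add calling toString on the cell, which then copies the qualifier out of the buffer): the row check fails, and building the exception message is what reads further fields from the buffer. Everything here is hypothetical (the `BufferCell` layout, `encode`, `add`, and the use of `IllegalStateException` in place of WrongRowIOException); it is not the HBase code, and plain `ByteBuffer` reads are bounds-checked, so the sketch can only show wrong data, not a crash.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class WrongRowPathSketch {
    /** Hypothetical cell over a buffer laid out as [rowLen][row bytes][qualLen][qualifier bytes]. */
    static final class BufferCell {
        final ByteBuffer buf;

        BufferCell(ByteBuffer buf) {
            this.buf = buf;
        }

        byte[] row() {
            return copy(1, buf.get(0) & 0xff);
        }

        byte[] qualifier() {
            int qualLenPos = 1 + (buf.get(0) & 0xff);
            return copy(qualLenPos + 1, buf.get(qualLenPos) & 0xff);
        }

        // Offsets and lengths are taken from whatever the buffer holds right now.
        private byte[] copy(int offset, int length) {
            byte[] out = new byte[length];
            for (int i = 0; i < length; i++) {
                out[i] = buf.get(offset + i);
            }
            return out;
        }

        @Override
        public String toString() {
            return new String(row(), StandardCharsets.US_ASCII) + "/"
                + new String(qualifier(), StandardCharsets.US_ASCII);
        }
    }

    /** Builds a direct buffer holding one encoded cell (ASCII row and qualifier, each under 256 bytes). */
    static ByteBuffer encode(String row, String qualifier) {
        byte[] r = row.getBytes(StandardCharsets.US_ASCII);
        byte[] q = qualifier.getBytes(StandardCharsets.US_ASCII);
        ByteBuffer b = ByteBuffer.allocateDirect(2 + r.length + q.length);
        b.put((byte) r.length).put(r).put((byte) q.length).put(q);
        return b;
    }

    /** Stand-in for an add-with-row-check: the rarely taken branch stringifies the cell. */
    static String add(byte[] mutationRow, BufferCell cell) {
        if (!Arrays.equals(mutationRow, cell.row())) {
            throw new IllegalStateException("The row in " + cell + " doesn't match the original one "
                + new String(mutationRow, StandardCharsets.US_ASCII));
        }
        return "added";
    }

    public static void main(String[] args) {
        byte[] row = "r1".getBytes(StandardCharsets.US_ASCII);
        ByteBuffer buf = encode("r1", "q");
        BufferCell cell = new BufferCell(buf);
        System.out.println(add(row, cell));          // rows match, normal path

        buf.put(1, (byte) 'z');                      // simulate the bytes changing under the cell
        try {
            add(row, cell);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());      // message was built from the changed buffer
        }
    }
}
```

If the cell's bytes are no longer what they were when the request arrived, the row comparison is the first thing to notice, and the error path then touches more of the same buffer.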
[jira] [Commented] (HBASE-24623) SIGSEGV v ~StubRoutines::jbyte_disjoint_arraycopy
[ https://issues.apache.org/jira/browse/HBASE-24623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144286#comment-17144286 ]

Michael Stack commented on HBASE-24623:
---

Looking at a bunch of machines, I notice that the hotspot error file always includes:
{code}
Deoptimization events (10 events):
Event: 712292.464 Thread 0x7f3c27293800 Uncommon trap: reason=unstable_if action=reinterpret pc=0x7f65f10a889c method=org.apache.hadoop.hbase.client.Mutation.add(Lorg/apache/hadoop/hbase/Cell;)Lorg/apache/hadoop/hbase/client/Mutation; @ 8
{code}
I think it is because this path inside Mutation is almost never taken -- WrongRowIOException -- so the deoptimization gets queued.
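To make the deoptimization comment above concrete, here is a minimal sketch of the shape the stack traces suggest: a row check whose mismatch branch builds an exception message by stringifying the cell. This is not the HBase source. `ColdPathSketch`, `FakeCell`, and this `add` are hypothetical names invented for illustration, and `IllegalArgumentException` stands in for `WrongRowIOException`.

```java
import java.util.Arrays;

public class ColdPathSketch {

    /** Hypothetical stand-in for a cell; counts how often it is stringified. */
    public static final class FakeCell {
        public final byte[] row;
        public int toStringCalls = 0;

        public FakeCell(byte[] row) {
            this.row = row;
        }

        @Override
        public String toString() {
            toStringCalls++;
            return "FakeCell(row=" + Arrays.toString(row) + ")";
        }
    }

    /**
     * Hypothetical row check. The cell is stringified only on the mismatch
     * branch, which a JIT may treat as cold if it has never been taken.
     */
    public static void add(byte[] mutationRow, FakeCell cell) {
        if (!Arrays.equals(mutationRow, cell.row)) {
            // Cold branch: the string concatenation calls cell.toString().
            throw new IllegalArgumentException(
                "The row in " + cell + " doesn't match " + Arrays.toString(mutationRow));
        }
        // Hot path: the cell is neither formatted nor copied here.
    }
}
```

If the real code has this shape, the first time the mismatch branch runs it would both trigger the uncommon trap and walk the cell's fields through `toString()`, which matches the frames between `Mutation.add` and `copyFromBufferToArray` in the traces.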
[jira] [Commented] (HBASE-24623) SIGSEGV v ~StubRoutines::jbyte_disjoint_arraycopy
[ https://issues.apache.org/jira/browse/HBASE-24623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17143536#comment-17143536 ]

Michael Stack commented on HBASE-24623:
---

I can't tell from the stack trace what sort of Cells are in flight here, or what the backing buffer is. Will try to get more info on this tomorrow.
[jira] [Commented] (HBASE-24623) SIGSEGV v ~StubRoutines::jbyte_disjoint_arraycopy
[ https://issues.apache.org/jira/browse/HBASE-24623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17143523#comment-17143523 ]

Michael Stack commented on HBASE-24623:
---

[~ram_krish]
bq. Oh the incoming cell is already corrupted then right.

Where is the corruption? Or how do we guard against it? As is, it crashes the JVM. Maybe I can manufacture the crash?
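On the question of how to guard against a corrupt cell, one possible direction is to validate offsets and lengths before copying, so that a bad length surfaces as a Java exception instead of a fault inside the copy stub. The sketch below is only an illustration under that assumption. `CheckedCopySketch` and `checkedCopy` are hypothetical names; the parameter order mirrors the `([BLjava/nio/ByteBuffer;III)V` signature seen in the traces, but this is not the HBase implementation.

```java
import java.nio.ByteBuffer;

public class CheckedCopySketch {

    /**
     * Hypothetical guarded copy from a ByteBuffer into a byte array.
     * Rejects ranges that fall outside the buffer's limit or the array.
     */
    public static void checkedCopy(byte[] out, ByteBuffer in, int sourceOffset,
                                   int destinationOffset, int length) {
        if (length < 0 || sourceOffset < 0 || destinationOffset < 0
                || sourceOffset > in.limit() - length
                || destinationOffset > out.length - length) {
            throw new IndexOutOfBoundsException(
                "sourceOffset=" + sourceOffset + ", destinationOffset=" + destinationOffset
                    + ", length=" + length + ", limit=" + in.limit()
                    + ", out.length=" + out.length);
        }
        // Copy through a duplicate so the caller's position and limit are untouched.
        ByteBuffer dup = in.duplicate();
        dup.position(sourceOffset);
        dup.get(out, destinationOffset, length);
    }
}
```

A check like this only catches inconsistent offsets and lengths. It would not help if the memory behind the buffer had already been released or reused, which the thread has not ruled out.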
[jira] [Commented] (HBASE-24623) SIGSEGV v ~StubRoutines::jbyte_disjoint_arraycopy
[ https://issues.apache.org/jira/browse/HBASE-24623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17143520#comment-17143520 ]

Michael Stack commented on HBASE-24623:
---

Been seen before? https://community.cloudera.com/t5/Support-Questions/RegionServer-Crashes-With-Java-Error/td-p/215219

A bug closed as a Lucene mmap problem is here: https://bugs.openjdk.java.net/browse/JDK-8179671

Perhaps of use: the jbyte_disjoint_arraycopy reference is an intrinsic for arraycopy: https://stackoverflow.com/questions/52318540/what-is-stubroutinesjbyte-disjoint-arraycopy

Cassandra had an issue that looked the same but it passed -- they weren't sure what the fix was.

Attached a patch that moves stuff around (we've just read the row to do the compare ... perhaps this will work).
[jira] [Commented] (HBASE-24623) SIGSEGV v ~StubRoutines::jbyte_disjoint_arraycopy
[ https://issues.apache.org/jira/browse/HBASE-24623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17143514#comment-17143514 ]

ramkrishna.s.vasudevan commented on HBASE-24623:

[~stack] - Oh, the incoming cell is already corrupted then, right?