[ https://issues.apache.org/jira/browse/CASSANDRA-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Brandon Williams updated CASSANDRA-19668:
-----------------------------------------
    Fix Version/s: 4.1.x
                   5.0-rc
                   5.x

> SIGSEV originating in Paxos V2 Scheduled Task
> ---------------------------------------------
>
>                 Key: CASSANDRA-19668
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19668
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Feature/Lightweight Transactions
>            Reporter: Jon Haddad
>            Assignee: Jon Haddad
>            Priority: Urgent
>             Fix For: 4.1.x, 5.0-rc, 5.x
>
>
> I haven't gotten to the root cause of this yet. Several 4.1 nodes have
> crashed in production. I'm not sure if this is related to Paxos v2 or
> not, but it is enabled. offheap_objects is also enabled.
> I'm not sure yet whether this affects 5.0.
> Most of the crashes don't have a stack trace; they only reference this:
> {noformat}
> Stack: [0x00007fabf4c34000,0x00007fabf4d34000], sp=0x00007fabf4d31f00, free space=1015k
> Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
> v ~StubRoutines::jint_disjoint_arraycopy
> {noformat}
> They are all in the {{ScheduledTasks}} thread.
> However, one node does have this in the crash log:
> {noformat}
> --------------- T H R E A D ---------------
> Current thread (0x000078b375eac800): JavaThread "ScheduledTasks:1" daemon [_thread_in_Java, id=151791, stack(0x000078b34b780000,0x000078b34b880000)]
> Stack: [0x000078b34b780000,0x000078b34b880000], sp=0x000078b34b87c350, free space=1008k
> Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
> J 29467 c2 org.apache.cassandra.db.rows.AbstractCell.clone(Lorg/apache/cassandra/utils/memory/ByteBufferCloner;)Lorg/apache/cassandra/db/rows/Cell; (50 bytes) @ 0x000078b3dd40a42f [0x000078b3dd409de0+0x000000000000064f]
> J 17669 c2 org.apache.cassandra.db.rows.Cell.clone(Lorg/apache/cassandra/utils/memory/Cloner;)Lorg/apache/cassandra/db/rows/ColumnData; (6 bytes) @ 0x000078b3dc54edc0 [0x000078b3dc54ed40+0x0000000000000080]
> J 17816 c2 org.apache.cassandra.db.rows.BTreeRow$$Lambda$845.apply(Ljava/lang/Object;)Ljava/lang/Object; (12 bytes) @ 0x000078b3dbed01a4 [0x000078b3dbed0120+0x0000000000000084]
> J 17828 c2 org.apache.cassandra.utils.btree.BTree.transform([Ljava/lang/Object;Ljava/util/function/Function;)[Ljava/lang/Object; (194 bytes) @ 0x000078b3dc5f35f0 [0x000078b3dc5f34a0+0x0000000000000150]
> J 35096 c2 org.apache.cassandra.db.rows.BTreeRow.clone(Lorg/apache/cassandra/utils/memory/Cloner;)Lorg/apache/cassandra/db/rows/Row; (37 bytes) @ 0x000078b3dda9111c [0x000078b3dda90fe0+0x000000000000013c]
> J 30500 c2 org.apache.cassandra.utils.memory.EnsureOnHeap$CloneToHeap.applyToRow(Lorg/apache/cassandra/db/rows/Row;)Lorg/apache/cassandra/db/rows/Row; (16 bytes) @ 0x000078b3dd59b91c [0x000078b3dd59b8c0+0x000000000000005c]
> J 26498 c2 org.apache.cassandra.db.transform.BaseRows.hasNext()Z (215 bytes) @ 0x000078b3dcf1c454 [0x000078b3dcf1c180+0x00000000000002d4]
> J 30775 c2 org.apache.cassandra.utils.MergeIterator$OneToOne.computeNext()Ljava/lang/Object; (49 bytes) @ 0x000078b3dc789020 [0x000078b3dc788fc0+0x0000000000000060]
> J 9082 c2 org.apache.cassandra.utils.AbstractIterator.hasNext()Z (80 bytes) @ 0x000078b3dbb3c544 [0x000078b3dbb3c440+0x0000000000000104]
> J 35593 c2 org.apache.cassandra.service.paxos.uncommitted.PaxosRows$PaxosMemtableToKeyStateIterator.computeNext()Lorg/apache/cassandra/service/paxos/uncommitted/PaxosKeyState; (126 bytes) @ 0x000078b3dc7ceeec [0x000078b3dc7cee20+0x00000000000000cc]
> J 35591 c2 org.apache.cassandra.service.paxos.uncommitted.PaxosRows$PaxosMemtableToKeyStateIterator.computeNext()Ljava/lang/Object; (5 bytes) @ 0x000078b3dc7d09e4 [0x000078b3dc7d09a0+0x0000000000000044]
> J 9082 c2 org.apache.cassandra.utils.AbstractIterator.hasNext()Z (80 bytes) @ 0x000078b3dbb3c544 [0x000078b3dbb3c440+0x0000000000000104]
> J 34146 c2 com.google.common.collect.Iterators.addAll(Ljava/util/Collection;Ljava/util/Iterator;)Z (41 bytes) @ 0x000078b3dd9197e8 [0x000078b3dd919680+0x0000000000000168]
> J 38256 c1 org.apache.cassandra.service.paxos.uncommitted.PaxosRows.toIterator(Lorg/apache/cassandra/db/partitions/UnfilteredPartitionIterator;Lorg/apache/cassandra/schema/TableId;Z)Lorg/apache/cassandra/utils/CloseableIterator; (49 bytes) @ 0x000078b3d6b677ac [0x000078b3d6b672e0+0x00000000000004cc]
> J 34823 c1 org.apache.cassandra.service.paxos.uncommitted.PaxosUncommittedIndex.repairIterator(Lorg/apache/cassandra/schema/TableId;Ljava/util/Collection;)Lorg/apache/cassandra/utils/CloseableIterator; (212 bytes) @ 0x000078b3d5675e0c [0x000078b3d5673be0+0x000000000000222c]
> J 38259 c1 org.apache.cassandra.service.paxos.uncommitted.PaxosUncommittedTracker.uncommittedKeyIterator(Lorg/apache/cassandra/schema/TableId;Ljava/util/Collection;)Lorg/apache/cassandra/utils/CloseableIterator; (116 bytes) @ 0x000078b3d6b6bc54 [0x000078b3d6b6b7e0+0x0000000000000474]
> J 38257 c1 org.apache.cassandra.service.StorageService.autoRepairPaxos(Lorg/apache/cassandra/schema/TableId;)Lorg/apache/cassandra/utils/concurrent/Future; (57 bytes) @ 0x000078b3d6b6902c [0x000078b3d6b68e00+0x000000000000022c]
> j org.apache.cassandra.service.paxos.uncommitted.PaxosUncommittedTracker.schedulePaxosAutoRepairs()V+146
> j org.apache.cassandra.service.paxos.uncommitted.PaxosUncommittedTracker$$Lambda$1773.run()V+4
> J 39703 c1 org.apache.cassandra.service.paxos.uncommitted.PaxosUncommittedTracker.runAndLogException(Ljava/lang/String;Ljava/lang/Runnable;)V (39 bytes) @ 0x000078b3d435adfc [0x000078b3d435ad00+0x00000000000000fc]
> j org.apache.cassandra.service.paxos.uncommitted.PaxosUncommittedTracker.maintenance()V+19
> j org.apache.cassandra.service.paxos.uncommitted.PaxosUncommittedTracker$$Lambda$1534.run()V+4
> J 30376 c2 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run()V java.base@11.0.22 (57 bytes) @ 0x000078b3dd56543c [0x000078b3dd565100+0x000000000000033c]
> J 27255% c2 java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V java.base@11.0.22 (187 bytes) @ 0x000078b3dd114d58 [0x000078b3dd114ac0+0x0000000000000298]
> j java.util.concurrent.ThreadPoolExecutor$Worker.run()V+5 java.base@11.0.22
> j io.netty.util.concurrent.FastThreadLocalRunnable.run()V+4
> j java.lang.Thread.run()V+11 java.base@11.0.22
> v ~StubRoutines::call_stub
> V [libjvm.so+0x877453] JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*)+0x373
> V [libjvm.so+0x875a96] JavaCalls::call_virtual(JavaValue*, Handle, Klass*, Symbol*, Symbol*, Thread*)+0x186
> V [libjvm.so+0x925653] thread_entry(JavaThread*, Thread*)+0xa3
> V [libjvm.so+0xe41391] JavaThread::thread_main_inner()+0x131
> V [libjvm.so+0xe3d790] Thread::call_run()+0x140
> V [libjvm.so+0xbf97de] thread_native_entry(Thread*)+0xee
> {noformat}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)