[ https://issues.apache.org/jira/browse/CASSANDRA-15295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16931810#comment-16931810 ]
Jordan West edited comment on CASSANDRA-15295 at 9/18/19 5:10 PM:
------------------------------------------------------------------

Happy to [~gzh1992n]. A few comments:
* I verified it does not affect the 3.0 branch, because {{CommitLogSegmentManager#start}} exits immediately after starting {{managerThread}} instead of calling {{advanceAllocatingFrom(null)}}.
* The database doesn’t start, which causes many tests to fail as well, because there is no default commit log segment manager factory set. Test runs: https://circleci.com/gh/jrwest/cassandra/tree/bug-commitlog-deadlock
* CommitLogInitWithExpcetionTest#L63 - should check prior to this call that initThread is not null.

Minor naming nits (do with them what you please):
* Rename KillerHook => OnKillHook, and onKill => execute.
* Drop the “I*” naming for the CommitLogSegmentMgrFactoryInterface. Consider renaming it CommitLogSegmentManagerFactory.


> Running into deadlock when do CommitLog initialization
> ------------------------------------------------------
>
>         Key: CASSANDRA-15295
>         URL: https://issues.apache.org/jira/browse/CASSANDRA-15295
>     Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Commit Log
>    Reporter: Zephyr Guo
>    Assignee: Zephyr Guo
>    Priority: Normal
> Attachments: jstack.log, pstack.log, screenshot-1.png,
>              screenshot-2.png, screenshot-3.png
>
>
> Recently, I found a Cassandra (3.11.4) node stuck in STARTING status for a
> long time.
> I used jstack to see what happened. The main thread was stuck in
> *AbstractCommitLogSegmentManager.awaitAvailableSegment*
> !screenshot-1.png!
> The strange thing is that the COMMIT-LOG-ALLOCATOR thread state was runnable,
> but the thread was not actually running.
> !screenshot-2.png!
> I then used pstack to troubleshoot and found COMMIT-LOG-ALLOCATOR blocked on
> Java class initialization.
> !screenshot-3.png!
> This is clearly a deadlock. CommitLog waits for a CommitLogSegment while it is
> initializing. At this moment, the CommitLog class is not yet initialized and
> the main thread holds the class lock. COMMIT-LOG-ALLOCATOR then hits an
> exception while creating a CommitLogSegment and calls
> *CommitLog.handleCommitError* (a static method). COMMIT-LOG-ALLOCATOR blocks
> on this line because the CommitLog class is still initializing.
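
For illustration, here is a minimal Java sketch of the class-initialization deadlock described in the issue. It is not Cassandra code: ClassInitDeadlockDemo, Log, Allocator and handleError are hypothetical stand-ins for the main thread, CommitLog, COMMIT-LOG-ALLOCATOR and CommitLog.handleCommitError. The wait inside the static initializer is bounded at 500 ms, so the demo terminates instead of hanging the way the real node did.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class ClassInitDeadlockDemo {
    // Counted down by the worker once its static call into Log has returned.
    static final CountDownLatch WORKER_DONE = new CountDownLatch(1);

    // Hypothetical stand-in for the COMMIT-LOG-ALLOCATOR thread.
    static class Allocator implements Runnable {
        @Override
        public void run() {
            // Invoking a static method requires Log to be initialized. Another
            // thread is already initializing it, so this call blocks until
            // that initialization completes.
            Log.handleError();
            WORKER_DONE.countDown();
        }
    }

    // Hypothetical stand-in for CommitLog: its static initializer starts a
    // worker and then waits for that worker.
    static class Log {
        static final boolean WORKER_FINISHED_DURING_INIT;

        static {
            Thread worker = new Thread(new Allocator(), "ALLOCATOR");
            worker.setDaemon(true);
            worker.start();
            boolean finished = false;
            try {
                // Bounded wait (assumption for the demo). An unbounded wait
                // here would never return, which is the reported deadlock.
                finished = WORKER_DONE.await(500, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            WORKER_FINISHED_DURING_INIT = finished;
        }

        static void handleError() {
            // Stand-in for the static error handler; intentionally empty.
        }
    }

    // First use of Log: runs its static initializer on the calling thread.
    public static boolean workerFinishedDuringInit() {
        return Log.WORKER_FINISHED_DURING_INIT;
    }

    // Call after workerFinishedDuringInit(): once Log is initialized, the
    // worker is released and can complete.
    public static boolean workerFinishedAfterInit() {
        try {
            return WORKER_DONE.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("finished during init: " + workerFinishedDuringInit());
        System.out.println("finished after init:  " + workerFinishedAfterInit());
    }
}
```

Run as written, this should report false for the first line and true for the second: the worker cannot get past the static call while the initializer is still running, and it proceeds as soon as initialization ends. In this sketch the cycle goes away if the static initializer does not wait on a thread that needs the class, or if the worker's error path avoids static members of the class being initialized.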