[jira] [Comment Edited] (CASSANDRA-13740) Orphan hint file gets created while node is being removed from cluster
[ https://issues.apache.org/jira/browse/CASSANDRA-13740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16124249#comment-16124249 ]

Jaydeepkumar Chovatia edited comment on CASSANDRA-13740 at 8/11/17 11:56 PM:
-----------------------------------------------------------------------------

Hi [~iamaleksey]

I have modified the code as per your review comments; please find it attached as "13740-3.0.15.txt".
The same patch is also available here: https://github.com/jaydeepkumar1984/cassandra/commit/173fce0362246595d26b24196d6690223d132d5e
I will create a patch for 3.11 and run {{circleci}} after receiving your review comments.

Jaydeep

was (Author: chovatia.jayd...@gmail.com):
Hi [~iamaleksey]

I have modified the code as per your review comments; please find it attached as "13740_3.0.15.txt".
The same patch is also available here: https://github.com/jaydeepkumar1984/cassandra/commit/173fce0362246595d26b24196d6690223d132d5e
I will create a patch for 3.11 and run {{circleci}} after receiving your review comments.

Jaydeep

> Orphan hint file gets created while node is being removed from cluster
> ----------------------------------------------------------------------
>
>                 Key: CASSANDRA-13740
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13740
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Jaydeepkumar Chovatia
>            Assignee: Jaydeepkumar Chovatia
>            Priority: Minor
>             Fix For: 3.0.x, 3.11.x
>
>         Attachments: 13740-3.0.15.txt, gossip_hang_test.py
>
>
> I found this issue during testing: whenever a node is being removed, a hint
> file for that node gets written and stays in the hints directory forever.
> I debugged the code and found that this is due to a race condition between
> [HintsWriteExecutor.java::flush | https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsWriteExecutor.java#L195]
> and [HintsWriteExecutor.java::closeWriter | https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsWriteExecutor.java#L106].
>
> *Time t1* The node is down; as a result, hints are being written by
> [HintsWriteExecutor.java::flush | https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsWriteExecutor.java#L195]
> *Time t2* The node is removed from the cluster, which calls
> [HintsService.java::exciseStore | https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsService.java#L327],
> which removes the hint files for the node being removed
> *Time t3* The mutation stage keeps pumping hints through
> [HintsService.java::write | https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsService.java#L145],
> which again calls
> [HintsWriteExecutor.java::flush | https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsWriteExecutor.java#L215],
> and a new orphan file gets created
>
> I was writing a new dtest for {CASSANDRA-13562, CASSANDRA-13308} and that
> helped me reproduce this new bug. I will submit a patch for this new dtest
> later.
>
> I also tried the following to check how this orphan hint file responds:
> 1. I tried {{nodetool truncatehints }}, but it fails because the node is no
> longer part of the ring
> 2. I then tried {{nodetool truncatehints}}; that still doesn’t remove the hint
> file because it is not yet included in the
> [dispatchDequeue | https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsStore.java#L53]
>
> Steps to reproduce:
> Please find the attached dtest Python file {{gossip_hang_test.py}}, which
> reproduces this bug.
>
> Solution:
> This is due to the race condition described above. Since
> {{HintsWriteExecutor.java}} creates a thread pool with only 1 worker, the
> solution becomes fairly simple. Whenever we
> [HintsService.java::excise | https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsService.java#L303]
> a host, store it in memory, and check for an already-evicted host inside
> [HintsWriteExecutor.java::flush | https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsWriteExecutor.java#L215].
> If an already-evicted host is found, ignore the hints.
>
> Jaydeep

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
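The solution proposed in the issue description above (remember excised hosts in memory and have flush skip them) could look roughly like the following. This is only a minimal sketch, not the attached patch and not Cassandra's actual code: the class `HintFlushSketch`, its methods, and the in-memory map standing in for hint files are all hypothetical, and it assumes, as the description does, that both calls are serialized on the single writer thread.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

// Hypothetical stand-in for the hints write path; none of these names come from Cassandra.
// Assumes excise() and flush() both run on one single-threaded executor, so no locking is shown.
final class HintFlushSketch
{
    // Hosts that have been excised; flush() must never create a file for them again.
    private final Set<UUID> excisedHosts = new HashSet<>();
    // Stand-in for on-disk hint files: host id -> hints written so far.
    private final Map<UUID, List<String>> hintFiles = new HashMap<>();

    // Called when a node is removed from the cluster.
    void excise(UUID hostId)
    {
        excisedHosts.add(hostId);   // remember the host first, so any later flush sees it
        hintFiles.remove(hostId);   // then delete what has already been written
    }

    // Called for each buffered hint; returns true if the hint was written.
    boolean flush(UUID hostId, String hint)
    {
        if (excisedHosts.contains(hostId))
            return false;           // drop the hint instead of creating an orphan file
        hintFiles.computeIfAbsent(hostId, id -> new ArrayList<>()).add(hint);
        return true;
    }

    boolean hasHintFile(UUID hostId)
    {
        return hintFiles.containsKey(hostId);
    }
}
```

With this ordering, a hint that arrives at time t3 for a host excised at t2 is discarded rather than written to a new file.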
[jira] [Comment Edited] (CASSANDRA-13740) Orphan hint file gets created while node is being removed from cluster
[ https://issues.apache.org/jira/browse/CASSANDRA-13740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16124249#comment-16124249 ]

Jaydeepkumar Chovatia edited comment on CASSANDRA-13740 at 8/11/17 11:56 PM:
-----------------------------------------------------------------------------

Hi [~iamaleksey]

I have modified code as per your review comments, please find it attached "13740_3.0.15.txt"
Also please find same patch here: https://github.com/jaydeepkumar1984/cassandra/commit/173fce0362246595d26b24196d6690223d132d5e
I will create patch for 3.11 as well as will run {{circleci}} after receiving your review comments.

Jaydeep

was (Author: chovatia.jayd...@gmail.com):
Hi [~iamaleksey]

I have modified code as per your review comments, please find it attached "13740-2_3.0.15.txt"
Also please find same patch here: https://github.com/jaydeepkumar1984/cassandra/commit/173fce0362246595d26b24196d6690223d132d5e
I will create patch for 3.11 as well as will run {{circleci}} after receiving your review comments.

Jaydeep
[jira] [Updated] (CASSANDRA-13740) Orphan hint file gets created while node is being removed from cluster
[ https://issues.apache.org/jira/browse/CASSANDRA-13740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jaydeepkumar Chovatia updated CASSANDRA-13740:
----------------------------------------------
    Attachment:     (was: 13740-3.0.15.txt)
[jira] [Updated] (CASSANDRA-13740) Orphan hint file gets created while node is being removed from cluster
[ https://issues.apache.org/jira/browse/CASSANDRA-13740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jaydeepkumar Chovatia updated CASSANDRA-13740:
----------------------------------------------
    Attachment: 13740-3.0.15.txt
[jira] [Commented] (CASSANDRA-13740) Orphan hint file gets created while node is being removed from cluster
[ https://issues.apache.org/jira/browse/CASSANDRA-13740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16124249#comment-16124249 ]

Jaydeepkumar Chovatia commented on CASSANDRA-13740:
---------------------------------------------------

Hi [~iamaleksey]

I have modified code as per your review comments, please find it attached "13740-2_3.0.15.txt"
Also please find same patch here: https://github.com/jaydeepkumar1984/cassandra/commit/173fce0362246595d26b24196d6690223d132d5e
I will create patch for 3.11 as well as will run {{circleci}} after receiving your review comments.

Jaydeep
[jira] [Updated] (CASSANDRA-13758) Incremental repair sessions shouldn't be deleted if they still have sstables
[ https://issues.apache.org/jira/browse/CASSANDRA-13758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Blake Eggleston updated CASSANDRA-13758:
----------------------------------------
    Reviewer: Marcus Eriksson
      Status: Patch Available  (was: Open)

[trunk|https://github.com/bdeggleston/cassandra/tree/13758]
[utest|https://circleci.com/gh/bdeggleston/cassandra/87]

> Incremental repair sessions shouldn't be deleted if they still have sstables
> ----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-13758
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13758
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Blake Eggleston
>            Assignee: Blake Eggleston
>
> The incremental session cleanup doesn't verify that there are no remaining
> sstables marked as part of the repair before deleting it. Deleting a
> successful repair session which still has outstanding sstables will cause
> those sstables to be demoted to unrepaired, creating an inconsistency.
>
> This typically wouldn't be an issue, since we'd expect the sstables to long
> since have been promoted / demoted. However, I've seen a few ref leak issues
> which can cause sstables to get stuck. Those have been fixed, but we should
> still protect against that edge case to prevent inconsistencies caused by
> future (or currently unknown) bugs.
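The guard this issue asks for (do not delete a repair session while any sstable is still marked as belonging to it) can be illustrated with a small sketch. Everything here is hypothetical and simplified for illustration: `RepairSessionCleanupSketch`, its nested `SSTable` type, and the `expired` flag are assumptions, not the classes or conditions used in the actual patch.

```java
import java.util.Collection;
import java.util.UUID;

// Hypothetical illustration only; these types are not Cassandra's.
final class RepairSessionCleanupSketch
{
    // Minimal stand-in for an sstable: only the pending-repair marker matters here.
    static final class SSTable
    {
        final UUID pendingRepair; // null when the sstable is not part of any repair session

        SSTable(UUID pendingRepair)
        {
            this.pendingRepair = pendingRepair;
        }
    }

    // True if at least one live sstable is still marked as part of the given session.
    static boolean hasOutstandingSSTables(UUID sessionId, Collection<SSTable> live)
    {
        for (SSTable sstable : live)
            if (sessionId.equals(sstable.pendingRepair))
                return true;
        return false;
    }

    // A session is only eligible for deletion once it is old enough (the assumed
    // 'expired' flag) AND nothing on disk still refers to it.
    static boolean canDelete(UUID sessionId, boolean expired, Collection<SSTable> live)
    {
        return expired && !hasOutstandingSSTables(sessionId, live);
    }
}
```

The point of the extra check is that a stuck sstable keeps its session alive instead of being silently demoted to unrepaired.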
[jira] [Updated] (CASSANDRA-13758) Incremental repair sessions shouldn't be deleted if they still have sstables
[ https://issues.apache.org/jira/browse/CASSANDRA-13758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Blake Eggleston updated CASSANDRA-13758:
----------------------------------------
    Fix Version/s: 4.0
[jira] [Commented] (CASSANDRA-11483) Enhance sstablemetadata
[ https://issues.apache.org/jira/browse/CASSANDRA-11483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16124103#comment-16124103 ]

Joel Knighton commented on CASSANDRA-11483:
-------------------------------------------

The dtest fix was committed in [CASSANDRA-13755] - thanks everyone.

> Enhance sstablemetadata
> -----------------------
>
>                 Key: CASSANDRA-11483
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11483
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Observability
>            Reporter: Chris Lohfink
>            Assignee: Chris Lohfink
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: CASSANDRA-11483.txt, CASSANDRA-11483v2.txt, CASSANDRA-11483v3.txt, CASSANDRA-11483v4.txt, CASSANDRA-11483v5.txt, Screen Shot 2016-04-03 at 11.40.32 PM.png
>
>
> sstablemetadata provides quite a bit of useful information, but there are a
> few hiccups I would like to see addressed:
> * Does not use client mode
> * Units are not provided (or anything for that matter). There is data in
> micros, millis, seconds as durations and timestamps from epoch. But there is
> no way to tell what one is without a non-trivial code dive
> * In general, pretty frustrating to parse
[jira] [Created] (CASSANDRA-13758) Incremental repair sessions shouldn't be deleted if they still have sstables
Blake Eggleston created CASSANDRA-13758:
-------------------------------------------

             Summary: Incremental repair sessions shouldn't be deleted if they still have sstables
                 Key: CASSANDRA-13758
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13758
             Project: Cassandra
          Issue Type: Bug
            Reporter: Blake Eggleston
            Assignee: Blake Eggleston
[jira] [Commented] (CASSANDRA-13594) Use an ExecutorService for repair commands instead of new Thread(..).start()
[ https://issues.apache.org/jira/browse/CASSANDRA-13594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16123748#comment-16123748 ]

Ariel Weisberg commented on CASSANDRA-13594:
--------------------------------------------

Unrelated, but I did manage to reproduce short_read_test failing after letting it run many times.

> Use an ExecutorService for repair commands instead of new Thread(..).start()
> ----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-13594
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13594
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Marcus Eriksson
>            Assignee: Marcus Eriksson
>             Fix For: 4.x
>
>         Attachments: 13594.png
>
>
> Currently when starting a new repair, we create a new Thread and start it
> immediately.
> It would be nice to be able to 1) limit the number of threads and 2) reject
> starting new repair commands if we are already running too many.
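The two goals in this issue (cap the number of repair threads, and reject new commands once the cap is reached) map naturally onto a `ThreadPoolExecutor` with a `SynchronousQueue` and the `AbortPolicy` rejection handler. The following is a sketch under that assumption only; the class and method names are hypothetical, and the issue does not say that the committed change is configured this way.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of a bounded, rejecting pool for repair commands.
final class RepairCommandPoolSketch
{
    // At most maxThreads commands run at once. SynchronousQueue has no capacity,
    // so nothing is queued: a submission either gets a thread or is rejected.
    static ThreadPoolExecutor newPool(int maxThreads)
    {
        return new ThreadPoolExecutor(maxThreads, maxThreads,
                                      60L, TimeUnit.SECONDS,
                                      new SynchronousQueue<>(),
                                      new ThreadPoolExecutor.AbortPolicy());
    }

    // Returns false instead of throwing when the pool is saturated.
    static boolean trySubmit(ExecutorService pool, Runnable repairCommand)
    {
        try
        {
            pool.execute(repairCommand);
            return true;
        }
        catch (RejectedExecutionException e)
        {
            return false; // too many repair commands already running
        }
    }
}
```

Compared with `new Thread(..).start()`, the caller gets an explicit signal when the node is already running its maximum number of repairs.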
[jira] [Updated] (CASSANDRA-13594) Use an ExecutorService for repair commands instead of new Thread(..).start()
[ https://issues.apache.org/jira/browse/CASSANDRA-13594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ariel Weisberg updated CASSANDRA-13594:
---------------------------------------
    Status: Ready to Commit  (was: Patch Available)
[jira] [Commented] (CASSANDRA-9989) Optimise BTree.Buider
[ https://issues.apache.org/jira/browse/CASSANDRA-9989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16123714#comment-16123714 ]

Jay Zhuang commented on CASSANDRA-9989:
---------------------------------------

[~Anthony Grasso] Would you please review the patch?

> Optimise BTree.Buider
> ---------------------
>
>                 Key: CASSANDRA-9989
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9989
>             Project: Cassandra
>          Issue Type: Sub-task
>            Reporter: Benedict
>            Assignee: Jay Zhuang
>            Priority: Minor
>             Fix For: 4.x
>
>         Attachments: 9989-trunk.txt
>
>
> BTree.Builder could reduce its copying, and exploit toArray more efficiently,
> with some work. It's not very important right now because we don't make as
> much use of its bulk-add methods as we otherwise might, however over time
> this work will become more useful.
[jira] [Updated] (CASSANDRA-13688) Anticompaction race can leak sstables/txn
[ https://issues.apache.org/jira/browse/CASSANDRA-13688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Blake Eggleston updated CASSANDRA-13688:
----------------------------------------
    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Finally got a good dtest run. Committed as {{e9cc805db1133982c022657f8cab86cd24b3686f}}

> Anticompaction race can leak sstables/txn
> -----------------------------------------
>
>                 Key: CASSANDRA-13688
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13688
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Blake Eggleston
>            Assignee: Blake Eggleston
>             Fix For: 4.0
>
>
> At the top of {{CompactionManager#performAntiCompaction}}, the parent repair
> session is loaded; if the session can't be found, a RuntimeException is
> thrown. This can happen if a participant is evicted after the IR prepare
> message is received, but before the anticompaction starts. This exception is
> thrown outside of the try/finally block that guards the sstable and lifecycle
> transaction, causing them to leak, and preventing the sstables from ever
> being removed from View.compacting.
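The leak described in this issue comes down to where the failing lookup sits relative to the try/finally that releases the reserved sstables. A minimal sketch of the safe ordering follows; it is an assumption-laden illustration, not the committed change: `AntiCompactionGuardSketch`, the string-named sstables, and the `sessions` map are hypothetical stand-ins for the real types.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

// Hypothetical illustration only; not CompactionManager's real signature or types.
final class AntiCompactionGuardSketch
{
    // Stand-in for View.compacting: sstables currently reserved by some operation.
    final Set<String> compacting = new HashSet<>();

    void performAntiCompaction(Set<String> sstables, UUID sessionId, Map<UUID, String> sessions)
    {
        compacting.addAll(sstables);            // the sstables are now reserved
        try
        {
            // The lookup that can fail lives INSIDE the try block, so the
            // finally clause below still runs when the session is missing.
            String session = sessions.get(sessionId);
            if (session == null)                // e.g. the participant was evicted in the meantime
                throw new RuntimeException("Unknown repair session " + sessionId);
            // ... anticompaction work would happen here ...
        }
        finally
        {
            compacting.removeAll(sstables);     // always release, even on the exception path
        }
    }
}
```

If the lookup were placed above the `try`, the exception would escape before the `finally` was ever armed, which is the leak the issue describes.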
cassandra git commit: Fix race / ref leak in anticompaction
Repository: cassandra
Updated Branches:
  refs/heads/trunk f4da90aca -> e9cc805db


Fix race / ref leak in anticompaction

Patch by Blake Eggleston; Reviewed by Ariel Weisberg for CASSANDRA-13688


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/e9cc805d
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/e9cc805d
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/e9cc805d

Branch: refs/heads/trunk
Commit: e9cc805db1133982c022657f8cab86cd24b3686f
Parents: f4da90a
Author: Blake Eggleston
Authored: Wed Jul 12 14:47:48 2017 -0700
Committer: Blake Eggleston
Committed: Fri Aug 11 10:24:17 2017 -0700

----------------------------------------------------------------------
 CHANGES.txt                                     |   1 +
 .../db/compaction/AbstractCompactionTask.java   |  40 +
 .../db/compaction/CompactionManager.java        |  45 +++---
 .../db/compaction/PendingRepairManager.java     |  43 ++---
 .../db/compaction/AntiCompactionTest.java       |  40 +
 .../db/compaction/CompactionTaskTest.java       | 157 +++
 .../consistent/PendingAntiCompactionTest.java   |  23 +++
 7 files changed, 312 insertions(+), 37 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/cassandra/blob/e9cc805d/CHANGES.txt
----------------------------------------------------------------------
diff --git a/CHANGES.txt b/CHANGES.txt
index 988f93d..7c9d79a 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 4.0
+ * Fix race / ref leak in anticompaction (CASSANDRA-13688)
  * Expose tasks queue length via JMX (CASSANDRA-12758)
  * Fix race / ref leak in PendingRepairManager (CASSANDRA-13751)
  * Enable ppc64le runtime as unsupported architecture (CASSANDRA-13615)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/e9cc805d/src/java/org/apache/cassandra/db/compaction/AbstractCompactionTask.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/db/compaction/AbstractCompactionTask.java b/src/java/org/apache/cassandra/db/compaction/AbstractCompactionTask.java
index 430c916..c542a51 100644
--- a/src/java/org/apache/cassandra/db/compaction/AbstractCompactionTask.java
+++ b/src/java/org/apache/cassandra/db/compaction/AbstractCompactionTask.java
@@ -17,7 +17,11 @@
  */
 package org.apache.cassandra.db.compaction;

+import java.util.Iterator;
 import java.util.Set;
+import java.util.UUID;
+
+import com.google.common.base.Preconditions;

 import org.apache.cassandra.db.ColumnFamilyStore;
 import org.apache.cassandra.db.Directories;
@@ -49,6 +53,42 @@ public abstract class AbstractCompactionTask extends WrappedRunnable
         Set<SSTableReader> compacting = transaction.tracker.getCompacting();
         for (SSTableReader sstable : transaction.originals())
             assert compacting.contains(sstable) : sstable.getFilename() + " is not correctly marked compacting";
+
+        validateSSTables(transaction.originals());
+    }
+
+    /**
+     * Confirm that we're not attempting to compact repaired/unrepaired/pending repair sstables together
+     */
+    private void validateSSTables(Set<SSTableReader> sstables)
+    {
+        // do not allow to be compacted together
+        if (!sstables.isEmpty())
+        {
+            Iterator<SSTableReader> iter = sstables.iterator();
+            SSTableReader first = iter.next();
+            boolean isRepaired = first.isRepaired();
+            UUID pendingRepair = first.getPendingRepair();
+            while (iter.hasNext())
+            {
+                SSTableReader next = iter.next();
+                Preconditions.checkArgument(isRepaired == next.isRepaired(),
+                                            "Cannot compact repaired and unrepaired sstables");
+
+                if (pendingRepair == null)
+                {
+                    Preconditions.checkArgument(!next.isPendingRepair(),
+                                                "Cannot compact pending repair and non-pending repair sstables");
+                }
+                else
+                {
+                    Preconditions.checkArgument(next.isPendingRepair(),
+                                                "Cannot compact pending repair and non-pending repair sstables");
+                    Preconditions.checkArgument(pendingRepair.equals(next.getPendingRepair()),
+                                                "Cannot compact sstables from different pending repairs");
+                }
+            }
+        }
     }

     /**

http://git-wip-us.apache.org/repos/asf/cassandra/blob/e9cc805d/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
index b
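The validation that this commit adds to AbstractCompactionTask can be restated as a small standalone sketch. The `SSTableInfo` class below is a hypothetical stand-in for `SSTableReader` that carries only the repaired flag and the pending-repair session id, and plain `IllegalArgumentException`s replace Guava's `Preconditions.checkArgument` so that the example has no dependencies. It is an illustration of the rule under those assumptions, not the committed code.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.UUID;

class SSTableValidationSketch
{
    /** Hypothetical stand-in for SSTableReader: only the two fields the check needs. */
    static final class SSTableInfo
    {
        final boolean repaired;
        final UUID pendingRepair; // null means "not part of a pending repair"

        SSTableInfo(boolean repaired, UUID pendingRepair)
        {
            this.repaired = repaired;
            this.pendingRepair = pendingRepair;
        }
    }

    /** Throws IllegalArgumentException if the sstables must not be compacted together. */
    static void validate(Iterable<SSTableInfo> sstables)
    {
        Iterator<SSTableInfo> iter = sstables.iterator();
        if (!iter.hasNext())
            return; // nothing to compare

        // every other sstable is compared against the first one
        SSTableInfo first = iter.next();
        while (iter.hasNext())
        {
            SSTableInfo next = iter.next();
            if (first.repaired != next.repaired)
                throw new IllegalArgumentException("Cannot compact repaired and unrepaired sstables");
            if ((first.pendingRepair == null) != (next.pendingRepair == null))
                throw new IllegalArgumentException("Cannot compact pending repair and non-pending repair sstables");
            if (first.pendingRepair != null && !first.pendingRepair.equals(next.pendingRepair))
                throw new IllegalArgumentException("Cannot compact sstables from different pending repairs");
        }
    }

    /** Convenience wrapper for callers that prefer a boolean result. */
    static boolean isValid(List<SSTableInfo> sstables)
    {
        try
        {
            validate(sstables);
            return true;
        }
        catch (IllegalArgumentException e)
        {
            return false;
        }
    }

    public static void main(String[] args)
    {
        UUID session = UUID.randomUUID();
        List<SSTableInfo> sameSession = Arrays.asList(new SSTableInfo(false, session), new SSTableInfo(false, session));
        List<SSTableInfo> mixed = Arrays.asList(new SSTableInfo(true, null), new SSTableInfo(false, null));
        System.out.println("same pending repair session valid: " + isValid(sameSession));
        System.out.println("repaired + unrepaired valid: " + isValid(mixed));
    }
}
```

Comparing every sstable against the first one is sufficient because each of the three properties must be identical across the whole set.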
[jira] [Commented] (CASSANDRA-10726) Read repair inserts should not be blocking
[ https://issues.apache.org/jira/browse/CASSANDRA-10726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16123604#comment-16123604 ]

Xiaolong Jiang commented on CASSANDRA-10726:
--------------------------------------------

[~krummas] Thanks for the review. Yes, your change looks good to me. I saw that the tests in CircleCI passed. Dtest is still running. Please go ahead and merge when dtest passes. I think we can target 4.0 only for the open source version.

> Read repair inserts should not be blocking
> ------------------------------------------
>
> Key: CASSANDRA-10726
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10726
> Project: Cassandra
> Issue Type: Improvement
> Components: Coordination
> Reporter: Richard Low
> Assignee: Xiaolong Jiang
> Fix For: 4.x
>
>
> Today, if there’s a digest mismatch in a foreground read repair, the insert
> to update out-of-date replicas is blocking. This means that, if it fails, the
> read fails with a timeout. If a node is dropping writes (maybe it is
> overloaded or the mutation stage is backed up for some other reason), all
> reads to a replica set could fail. Further, replicas dropping writes get more
> out of sync, so they will require more read repair.
> The comment on the code for why the writes are blocking is:
> {code}
> // wait for the repair writes to be acknowledged, to minimize impact on any replica that's
> // behind on writes in case the out-of-sync row is read multiple times in quick succession
> {code}
> but the bad side effect is that reads time out. Either the writes should not
> be blocking or we should return success for the read even if the write times
> out.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
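The trade-off described in the ticket above can be illustrated with a minimal sketch. Everything here is hypothetical: `ReadRepairSketch`, its `read` method, and the `failReadOnRepairTimeout` flag are invented for illustration and do not correspond to Cassandra's read path. The sketch only contrasts the current behaviour (a repair-write timeout fails the read) with the second option proposed in the description (the read still succeeds).

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class ReadRepairSketch
{
    /**
     * Returns the already-resolved value after waiting up to timeoutMillis for
     * the repair writes to be acknowledged.
     *
     * failReadOnRepairTimeout = true  models the blocking behaviour from the ticket;
     * failReadOnRepairTimeout = false models "return success even if the write times out".
     */
    static String read(String resolvedValue,
                       CompletableFuture<Void> repairAcks,
                       long timeoutMillis,
                       boolean failReadOnRepairTimeout)
    {
        try
        {
            repairAcks.get(timeoutMillis, TimeUnit.MILLISECONDS);
        }
        catch (TimeoutException e)
        {
            if (failReadOnRepairTimeout)
                throw new IllegalStateException("read timed out waiting for repair writes", e);
            // otherwise the slow replica stays out of date, but the read succeeds
        }
        catch (InterruptedException e)
        {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        catch (ExecutionException e)
        {
            throw new IllegalStateException(e);
        }
        return resolvedValue;
    }

    public static void main(String[] args)
    {
        // a future that is never completed stands in for a replica that drops the write
        CompletableFuture<Void> droppedWrite = new CompletableFuture<>();
        System.out.println(read("row", droppedWrite, 10, false));
    }
}
```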
[jira] [Updated] (CASSANDRA-12758) Expose tasks queue length via JMX
[ https://issues.apache.org/jira/browse/CASSANDRA-12758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Shuler updated CASSANDRA-12758:
---------------------------------------
Resolution: Fixed
Fix Version/s: 4.0 (was: 4.x)
Status: Resolved (was: Ready to Commit)

Committed to trunk - f4da90a
Thanks Romain!

> Expose tasks queue length via JMX
> ---------------------------------
>
> Key: CASSANDRA-12758
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12758
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Romain Hardouin
> Assignee: Michael Shuler
> Priority: Minor
> Fix For: 4.0
>
>
> CASSANDRA-11363 introduced {{cassandra.max_queued_native_transport_requests}}
> to set the NTR queue length.
> Currently Cassandra lacks a JMX MBean that exposes this value, which would
> allow us to:
>
> 1. Be sure this value has been set
> 2. Plot this value in a monitoring application to make correlations with
> other graphs when we make changes
cassandra git commit: Add MBean to monitor max queued tasks
Repository: cassandra Updated Branches: refs/heads/trunk d68357a44 -> f4da90aca Add MBean to monitor max queued tasks patch by Romain Hardouin; reviewed by Michael Shuler for CASSANDRA-12758 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/f4da90ac Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/f4da90ac Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/f4da90ac Branch: refs/heads/trunk Commit: f4da90aca0e79664ea06212283f6cd5f9288d441 Parents: d68357a Author: Romain Hardouin Authored: Thu Oct 6 22:36:07 2016 +0200 Committer: Michael Shuler Committed: Fri Aug 11 08:36:36 2017 -0500 -- CHANGES.txt | 1 + doc/source/operating/metrics.rst | 1 + src/java/org/apache/cassandra/concurrent/SEPExecutor.java | 2 +- src/java/org/apache/cassandra/metrics/SEPMetrics.java | 10 ++ 4 files changed, 13 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/f4da90ac/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index efd6716..988f93d 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 4.0 + * Expose tasks queue length via JMX (CASSANDRA-12758) * Fix race / ref leak in PendingRepairManager (CASSANDRA-13751) * Enable ppc64le runtime as unsupported architecture (CASSANDRA-13615) * Improve sstablemetadata output (CASSANDRA-11483) http://git-wip-us.apache.org/repos/asf/cassandra/blob/f4da90ac/doc/source/operating/metrics.rst -- diff --git a/doc/source/operating/metrics.rst b/doc/source/operating/metrics.rst index a38d7c1..cfdd584 100644 --- a/doc/source/operating/metrics.rst +++ b/doc/source/operating/metrics.rst @@ -193,6 +193,7 @@ CompletedTasksCounterNumber of tasks completed. TotalBlockedTasks CounterNumber of tasks that were blocked due to queue saturation. CurrentlyBlockedTask CounterNumber of tasks that are currently blocked due to queue saturation but on retry will become unblocked. 
MaxPoolSize Gauge The maximum number of threads in this pool. +MaxTasksQueuedGauge The maximum number of tasks queued before a task get blocked. = == === The following thread pools can be monitored. http://git-wip-us.apache.org/repos/asf/cassandra/blob/f4da90ac/src/java/org/apache/cassandra/concurrent/SEPExecutor.java -- diff --git a/src/java/org/apache/cassandra/concurrent/SEPExecutor.java b/src/java/org/apache/cassandra/concurrent/SEPExecutor.java index c87614b..add850a 100644 --- a/src/java/org/apache/cassandra/concurrent/SEPExecutor.java +++ b/src/java/org/apache/cassandra/concurrent/SEPExecutor.java @@ -35,7 +35,7 @@ public class SEPExecutor extends AbstractLocalAwareExecutorService public final int maxWorkers; public final String name; -private final int maxTasksQueued; +public final int maxTasksQueued; private final SEPMetrics metrics; // stores both a set of work permits and task permits: http://git-wip-us.apache.org/repos/asf/cassandra/blob/f4da90ac/src/java/org/apache/cassandra/metrics/SEPMetrics.java -- diff --git a/src/java/org/apache/cassandra/metrics/SEPMetrics.java b/src/java/org/apache/cassandra/metrics/SEPMetrics.java index 35f02b4..dd1d2d6 100644 --- a/src/java/org/apache/cassandra/metrics/SEPMetrics.java +++ b/src/java/org/apache/cassandra/metrics/SEPMetrics.java @@ -41,6 +41,8 @@ public class SEPMetrics public final Gauge pendingTasks; /** Maximum number of threads before it will start queuing tasks */ public final Gauge maxPoolSize; +/** Maximum number of tasks queued before a task get blocked */ +public final Gauge maxTasksQueued; private MetricNameFactory factory; @@ -85,6 +87,13 @@ public class SEPMetrics return executor.maxWorkers; } }); +maxTasksQueued = Metrics.register(factory.createMetricName("MaxTasksQueued"), new Gauge() +{ +public Integer getValue() +{ +return executor.maxTasksQueued; +} +}); } public void release() @@ -95,5 +104,6 @@ public class SEPMetrics Metrics.remove(factory.createMetricName("TotalBlockedTasks")); 
Metrics.remove(factory.createMetricName("CurrentlyBlockedTasks")); Metrics.remove(factory.createMetricName("MaxPoolSize")); +Metrics.remove(factory.createMetricName("MaxTasksQueued")); } }
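To show how a fixed limit such as maxTasksQueued becomes readable over JMX, here is a minimal, dependency-free sketch. It uses a standard MBean registered with the platform MBean server instead of the Metrics registry and `Gauge` used in the commit, and the class names and the ObjectName passed in are hypothetical. In particular, the name is not Cassandra's actual MBean name.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

class MaxTasksQueuedJmxSketch
{
    /** Standard MBean interface; the getter getValue() is exposed as the attribute "Value". */
    public interface QueueLimitMBean
    {
        int getValue();
    }

    /** Hypothetical stand-in for an executor's immutable queue bound. */
    public static final class QueueLimit implements QueueLimitMBean
    {
        private final int maxTasksQueued;

        public QueueLimit(int maxTasksQueued)
        {
            this.maxTasksQueued = maxTasksQueued;
        }

        @Override
        public int getValue()
        {
            return maxTasksQueued;
        }
    }

    /** Registers the bean, reads the attribute back the way a JMX client would, then unregisters it. */
    static int registerAndRead(String objectName, int maxTasksQueued)
    {
        try
        {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName name = new ObjectName(objectName);
            server.registerMBean(new QueueLimit(maxTasksQueued), name);
            try
            {
                return (Integer) server.getAttribute(name, "Value");
            }
            finally
            {
                // mirrors the release() step: remove what was registered
                server.unregisterMBean(name);
            }
        }
        catch (Exception e)
        {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args)
    {
        // illustrative ObjectName only
        System.out.println(registerAndRead("sketch.metrics:type=ThreadPools,name=MaxTasksQueued", 128));
    }
}
```

A monitoring tool would perform only the `getAttribute` step, typically against a remote connection rather than the in-process server used here.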
[jira] [Updated] (CASSANDRA-12758) Expose tasks queue length via JMX
[ https://issues.apache.org/jira/browse/CASSANDRA-12758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Shuler updated CASSANDRA-12758: --- Status: Ready to Commit (was: Patch Available) > Expose tasks queue length via JMX > - > > Key: CASSANDRA-12758 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12758 > Project: Cassandra > Issue Type: Improvement >Reporter: Romain Hardouin >Assignee: Michael Shuler >Priority: Minor > Fix For: 4.x > > > CASSANDRA-11363 introduced {{cassandra.max_queued_native_transport_requests}} > to set the NTR queue length. > Currently Cassandra lacks of a JMX Mbean which exposes this value which would > allow to: > > 1. Be sure this value has been set > 2. Plot this value in a monitoring application to make correlations with > other graphs when we make changes -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-12758) Expose tasks queue length via JMX
[ https://issues.apache.org/jira/browse/CASSANDRA-12758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Shuler updated CASSANDRA-12758: --- Fix Version/s: (was: 3.11.x) (was: 3.0.x) > Expose tasks queue length via JMX > - > > Key: CASSANDRA-12758 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12758 > Project: Cassandra > Issue Type: Improvement >Reporter: Romain Hardouin >Assignee: Michael Shuler >Priority: Minor > Fix For: 4.x > > > CASSANDRA-11363 introduced {{cassandra.max_queued_native_transport_requests}} > to set the NTR queue length. > Currently Cassandra lacks of a JMX Mbean which exposes this value which would > allow to: > > 1. Be sure this value has been set > 2. Plot this value in a monitoring application to make correlations with > other graphs when we make changes -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-12758) Expose tasks queue length via JMX
[ https://issues.apache.org/jira/browse/CASSANDRA-12758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16123351#comment-16123351 ] Michael Shuler edited comment on CASSANDRA-12758 at 8/11/17 3:58 PM: - Thanks! I had trouble with CircleCI completing a test run yesterday, so I pulled your patch to run through internal CI to see if this causes any test issues. {{ant test-all}} passed 100% {{cassandra-dtest}} run looks good - just one failure on CASSANDRA-13576 was (Author: mshuler): Thanks! I had trouble with CircleCI completing a test run yesterday, so I pulled your patch to run through internal CI to see if this causes any test issues. {{ant test-all}} passed 100% {{cassandra-dtest}} run in progress > Expose tasks queue length via JMX > - > > Key: CASSANDRA-12758 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12758 > Project: Cassandra > Issue Type: Improvement >Reporter: Romain Hardouin >Assignee: Michael Shuler >Priority: Minor > Fix For: 4.x > > > CASSANDRA-11363 introduced {{cassandra.max_queued_native_transport_requests}} > to set the NTR queue length. > Currently Cassandra lacks of a JMX Mbean which exposes this value which would > allow to: > > 1. Be sure this value has been set > 2. Plot this value in a monitoring application to make correlations with > other graphs when we make changes -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-12758) Expose tasks queue length via JMX
[ https://issues.apache.org/jira/browse/CASSANDRA-12758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16123351#comment-16123351 ] Michael Shuler edited comment on CASSANDRA-12758 at 8/11/17 3:26 PM: - Thanks! I had trouble with CircleCI completing a test run yesterday, so I pulled your patch to run through internal CI to see if this causes any test issues. {{ant test-all}} passed 100% {{cassandra-dtest}} run in progress was (Author: mshuler): Thanks! I had trouble with CircleCI completing a test run yesterday, so I pulled your patch to run through internal CI to see if this causes any test issues. > Expose tasks queue length via JMX > - > > Key: CASSANDRA-12758 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12758 > Project: Cassandra > Issue Type: Improvement >Reporter: Romain Hardouin >Assignee: Michael Shuler >Priority: Minor > Fix For: 3.0.x, 3.11.x, 4.x > > > CASSANDRA-11363 introduced {{cassandra.max_queued_native_transport_requests}} > to set the NTR queue length. > Currently Cassandra lacks of a JMX Mbean which exposes this value which would > allow to: > > 1. Be sure this value has been set > 2. Plot this value in a monitoring application to make correlations with > other graphs when we make changes -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13433) RPM distribution improvements and known issues
[ https://issues.apache.org/jira/browse/CASSANDRA-13433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16123418#comment-16123418 ]

Michael Shuler commented on CASSANDRA-13433:
--------------------------------------------

Reuploaded gpg-signed rpms for 2.2.10 just now, so this should be fixed.

> RPM distribution improvements and known issues
> ----------------------------------------------
>
> Key: CASSANDRA-13433
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13433
> Project: Cassandra
> Issue Type: Improvement
> Components: Packaging
> Reporter: Stefan Podkowinski
> Assignee: Stefan Podkowinski
>
> Starting with CASSANDRA-13252, new releases will be provided as both official
> RPM and Debian packages. While the Debian packages are already well
> established with our user base, the RPMs have just been released for the
> first time and still require some attention.
> Feel free to discuss RPM related issues in this ticket and open a sub-task to
> file a bug report.
> Please note that native systemd support will be implemented with
> CASSANDRA-13148 and this is not strictly an RPM specific issue. We still
> intend to offer non-systemd support based on the already working init scripts
> that we ship. Therefore the first step is to make use of systemd backward
> compatibility for SysV/LSB scripts, so we can provide RPMs for both systemd
> and non-systemd environments.
svn commit: r20930 - in /release/cassandra/redhat/22x: ./ repodata/
Author: mshuler Date: Fri Aug 11 14:23:19 2017 New Revision: 20930 Log: Reupload gpg-signed rpms for Apache Cassandra 2.2.10 Added: release/cassandra/redhat/22x/repodata/29ede3ea14a1c5bee9a7b0f26fd9ad0f0a2ac2879a850d741c935083ba4d5914-primary.sqlite.bz2 (with props) release/cassandra/redhat/22x/repodata/377176d209e1a2a3c4b616ca2f3fdae4eaae0a604134d9d3da3f0447b4b612d4-other.sqlite.bz2 (with props) release/cassandra/redhat/22x/repodata/6e04fc3eddacd4121403039f1c37a828cea279d2b6f09a5d49122f82611fa3b9-filelists.sqlite.bz2 (with props) release/cassandra/redhat/22x/repodata/86429aeadd7294922dc5e76b5e0a3da99162c24e9f12724835fae172cbb00790-filelists.xml.gz (with props) release/cassandra/redhat/22x/repodata/98688c5e74b32ca63c4da0bad59e2cd05ac53ea4c327329f9e75c7df40f2210b-primary.xml.gz (with props) release/cassandra/redhat/22x/repodata/f791d6311e120a8470481817193a3bcf2d210292d9b7127320ff3173f3546168-other.xml.gz (with props) Removed: release/cassandra/redhat/22x/repodata/3236f5a391cbf37fd0e70cba6ec4633b6d466f68064061a4b40443c002199304-filelists.xml.gz release/cassandra/redhat/22x/repodata/401603460b42f33ace08526f74b13526d3ebfa0aa1ac53b15fe0f6f1f4feae55-primary.sqlite.bz2 release/cassandra/redhat/22x/repodata/4b99211eb9721f495aa9e2aa16ed21e7d68ca0c17cc2135485d40a2dc2fb5dca-other.sqlite.bz2 release/cassandra/redhat/22x/repodata/5f42f64ab725593d48892f07ad9bf58ff3b452c298cd4ab6dba33b9ca60dbc6b-other.xml.gz release/cassandra/redhat/22x/repodata/813f5fd0236564d9fca0fa1a5913f7ce6672a0509732fb9a6a064d2fe2bbf447-filelists.sqlite.bz2 release/cassandra/redhat/22x/repodata/eba2ada18598c21e51c5e3294ad9414bae2cdbc65a36d4e03153bf3fe1a35571-primary.xml.gz Modified: release/cassandra/redhat/22x/cassandra-2.2.10-1.noarch.rpm release/cassandra/redhat/22x/cassandra-2.2.10-1.src.rpm release/cassandra/redhat/22x/cassandra-tools-2.2.10-1.noarch.rpm release/cassandra/redhat/22x/repodata/repomd.xml release/cassandra/redhat/22x/repodata/repomd.xml.asc Modified: 
release/cassandra/redhat/22x/cassandra-2.2.10-1.noarch.rpm == Binary files - no diff available. Modified: release/cassandra/redhat/22x/cassandra-2.2.10-1.src.rpm == Binary files - no diff available. Modified: release/cassandra/redhat/22x/cassandra-tools-2.2.10-1.noarch.rpm == Binary files - no diff available. Added: release/cassandra/redhat/22x/repodata/29ede3ea14a1c5bee9a7b0f26fd9ad0f0a2ac2879a850d741c935083ba4d5914-primary.sqlite.bz2 == Binary file - no diff available. Propchange: release/cassandra/redhat/22x/repodata/29ede3ea14a1c5bee9a7b0f26fd9ad0f0a2ac2879a850d741c935083ba4d5914-primary.sqlite.bz2 -- svn:mime-type = application/octet-stream Added: release/cassandra/redhat/22x/repodata/377176d209e1a2a3c4b616ca2f3fdae4eaae0a604134d9d3da3f0447b4b612d4-other.sqlite.bz2 == Binary file - no diff available. Propchange: release/cassandra/redhat/22x/repodata/377176d209e1a2a3c4b616ca2f3fdae4eaae0a604134d9d3da3f0447b4b612d4-other.sqlite.bz2 -- svn:mime-type = application/octet-stream Added: release/cassandra/redhat/22x/repodata/6e04fc3eddacd4121403039f1c37a828cea279d2b6f09a5d49122f82611fa3b9-filelists.sqlite.bz2 == Binary file - no diff available. Propchange: release/cassandra/redhat/22x/repodata/6e04fc3eddacd4121403039f1c37a828cea279d2b6f09a5d49122f82611fa3b9-filelists.sqlite.bz2 -- svn:mime-type = application/octet-stream Added: release/cassandra/redhat/22x/repodata/86429aeadd7294922dc5e76b5e0a3da99162c24e9f12724835fae172cbb00790-filelists.xml.gz == Binary file - no diff available. Propchange: release/cassandra/redhat/22x/repodata/86429aeadd7294922dc5e76b5e0a3da99162c24e9f12724835fae172cbb00790-filelists.xml.gz -- svn:mime-type = application/octet-stream Added: release/cassandra/redhat/22x/repodata/98688c5e74b32ca63c4da0bad59e2cd05ac53ea4c327329f9e75c7df40f2210b-primary.xml.gz == Binary file - no diff available. Propchange: release/cassandra/redhat/22x/repodata/98688c5e74b32ca63c4da0bad59e2cd05ac53ea4c327329f9e75c7df40f2210b-primary.xml.gz -
[jira] [Commented] (CASSANDRA-13433) RPM distribution improvements and known issues
[ https://issues.apache.org/jira/browse/CASSANDRA-13433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16123380#comment-16123380 ]

Michael Shuler commented on CASSANDRA-13433:
--------------------------------------------

Looks like I missed a package signature on 2.2.10, but the repository signature looks good. I believe setting gpgcheck=0 while leaving repo_gpgcheck=1 will allow installation, and we'll get this right on the next upload.
{noformat}
$ rpm -K *.rpm
cassandra-2.1.18-1.noarch.rpm: RSA sha1 ((MD5) PGP) md5 NOT OK (MISSING KEYS: (MD5) PGP#fe4b2bda)
cassandra-2.2.10-1.noarch.rpm: sha1 md5 OK
cassandra-3.0.14-1.noarch.rpm: RSA sha1 ((MD5) PGP) md5 NOT OK (MISSING KEYS: (MD5) PGP#fe4b2bda)
cassandra-3.11.0-1.noarch.rpm: RSA sha1 ((MD5) PGP) md5 NOT OK (MISSING KEYS: (MD5) PGP#fe4b2bda)
{noformat}

> RPM distribution improvements and known issues
> ----------------------------------------------
>
> Key: CASSANDRA-13433
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13433
> Project: Cassandra
> Issue Type: Improvement
> Components: Packaging
> Reporter: Stefan Podkowinski
> Assignee: Stefan Podkowinski
>
> Starting with CASSANDRA-13252, new releases will be provided as both official
> RPM and Debian packages. While the Debian packages are already well
> established with our user base, the RPMs have just been released for the
> first time and still require some attention.
> Feel free to discuss RPM related issues in this ticket and open a sub-task to
> file a bug report.
> Please note that native systemd support will be implemented with
> CASSANDRA-13148 and this is not strictly an RPM specific issue. We still
> intend to offer non-systemd support based on the already working init scripts
> that we ship. Therefore the first step is to make use of systemd backward
> compatibility for SysV/LSB scripts, so we can provide RPMs for both systemd
> and non-systemd environments.
[jira] [Commented] (CASSANDRA-12758) Expose tasks queue length via JMX
[ https://issues.apache.org/jira/browse/CASSANDRA-12758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16123351#comment-16123351 ] Michael Shuler commented on CASSANDRA-12758: Thanks! I had trouble with CircleCI completing a test run yesterday, so I pulled your patch to run through internal CI to see if this causes any test issues. > Expose tasks queue length via JMX > - > > Key: CASSANDRA-12758 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12758 > Project: Cassandra > Issue Type: Improvement >Reporter: Romain Hardouin >Assignee: Michael Shuler >Priority: Minor > Fix For: 3.0.x, 3.11.x, 4.x > > > CASSANDRA-11363 introduced {{cassandra.max_queued_native_transport_requests}} > to set the NTR queue length. > Currently Cassandra lacks of a JMX Mbean which exposes this value which would > allow to: > > 1. Be sure this value has been set > 2. Plot this value in a monitoring application to make correlations with > other graphs when we make changes -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-12758) Expose tasks queue length via JMX
[ https://issues.apache.org/jira/browse/CASSANDRA-12758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Romain Hardouin updated CASSANDRA-12758: Attachment: (was: 12758-trunk.patch) > Expose tasks queue length via JMX > - > > Key: CASSANDRA-12758 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12758 > Project: Cassandra > Issue Type: Improvement >Reporter: Romain Hardouin >Assignee: Michael Shuler >Priority: Minor > Fix For: 3.0.x, 3.11.x, 4.x > > > CASSANDRA-11363 introduced {{cassandra.max_queued_native_transport_requests}} > to set the NTR queue length. > Currently Cassandra lacks of a JMX Mbean which exposes this value which would > allow to: > > 1. Be sure this value has been set > 2. Plot this value in a monitoring application to make correlations with > other graphs when we make changes -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-12758) Expose tasks queue length via JMX
[ https://issues.apache.org/jira/browse/CASSANDRA-12758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Romain Hardouin updated CASSANDRA-12758: Attachment: (was: 12758-3.0.patch) > Expose tasks queue length via JMX > - > > Key: CASSANDRA-12758 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12758 > Project: Cassandra > Issue Type: Improvement >Reporter: Romain Hardouin >Assignee: Michael Shuler >Priority: Minor > Fix For: 3.0.x, 3.11.x, 4.x > > > CASSANDRA-11363 introduced {{cassandra.max_queued_native_transport_requests}} > to set the NTR queue length. > Currently Cassandra lacks of a JMX Mbean which exposes this value which would > allow to: > > 1. Be sure this value has been set > 2. Plot this value in a monitoring application to make correlations with > other graphs when we make changes -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-12758) Expose tasks queue length via JMX
[ https://issues.apache.org/jira/browse/CASSANDRA-12758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16123323#comment-16123323 ]

Romain Hardouin commented on CASSANDRA-12758:
---------------------------------------------

Rebased on trunk, build successful: https://circleci.com/gh/rhardouin/cassandra/18
Let's include it in trunk only; it's trivial to backport anyway if someone needs it in production.

> Expose tasks queue length via JMX
> ---------------------------------
>
> Key: CASSANDRA-12758
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12758
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Romain Hardouin
> Assignee: Michael Shuler
> Priority: Minor
> Fix For: 3.0.x, 3.11.x, 4.x
>
> Attachments: 12758-3.0.patch, 12758-trunk.patch
>
>
> CASSANDRA-11363 introduced {{cassandra.max_queued_native_transport_requests}}
> to set the NTR queue length.
> Currently Cassandra lacks a JMX MBean that exposes this value, which would
> allow us to:
>
> 1. Be sure this value has been set
> 2. Plot this value in a monitoring application to make correlations with
> other graphs when we make changes
[jira] [Updated] (CASSANDRA-13757) Cassandra 3.5.0 JVM Segfault Problem While Repair Job is Running
[ https://issues.apache.org/jira/browse/CASSANDRA-13757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Serhat Rıfat Demircan updated CASSANDRA-13757:
----------------------------------------------
Description:
We got the following error while a repair job was running on our cluster. One of the nodes stopped due to a segmentation fault in the JVM and the repair job failed. We could not reproduce this problem in our test and staging environments (the main difference is data size).
{code:java}
#
# SIGSEGV (0xb) at pc=0x7fd80a399e70, pid=1305, tid=0x7fd7ee7c4700
#
# JRE version: Java(TM) SE Runtime Environment (8.0_131-b11) (build 1.8.0_131-b11)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.131-b11 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C [liblz4-java3580121503903465201.so+0x5e70] LZ4_decompress_fast+0xd0
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# If you would like to submit a bug report, please visit:
# http://bugreport.java.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
# --- T H R E A D --- Current thread (0x7fce32dad1b0): JavaThread "CompactionExecutor:9798" daemon [_thread_in_native, id=16879, stack(0x7fd7ee784000,0x7fd7ee7c5000)] siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x7fd450c4d000 Registers: RAX=0x7fcde6560d32, RBX=0x7fd450c4cff9, RCX=0x7fcde6560c7a, RDX=0x7fcde6560d3e RSP=0x7fd7ee7c3160, RBP=0x7fd450c44ae6, RSI=0x7fcde6562ff8, RDI=0x00c2 R8 =0x7fcde6562ff4, R9 =0x7fcde6563000, R10=0x, R11=0x R12=0x000c, R13=0x7fd4501cd000, R14=0x7fcde6562ff7, R15=0x7fcde6562ffb RIP=0x7fd80a399e70, EFLAGS=0x00010283, CSGSFS=0x0033, ERR=0x0004 TRAPNO=0x000e Top of Stack: (sp=0x7fd7ee7c3160) 0x7fd7ee7c3160: 0008 7fd81e21c3d0 0x7fd7ee7c3170: 0004 0001 0x7fd7ee7c3180: 0002 0001 0x7fd7ee7c3190: 0004 0004 0x7fd7ee7c31a0: 0004 0004 0x7fd7ee7c31b0: 0x7fd7ee7c31c0: 0x7fd7ee7c31d0: 0001 0x7fd7ee7c31e0: 0002 0003 0x7fd7ee7c31f0: 7fd7ee7c32b8 7fce32dad3a8 0x7fd7ee7c3200: 0x7fd7ee7c3210: 7fd4501cd000 7fcde6553000 0x7fd7ee7c3220: 00a77ae6 7fd80a39659d 0x7fd7ee7c3230: dcb8fc9b 0x7fd7ee7c3240: 7fd7ee7c32d0 0x7fd7ee7c3250: 0006e5c7e4d8 7fd7ee7c32b8 0x7fd7ee7c3260: 7fce32dad1b0 7fd81df2099d 0x7fd7ee7c3270: 7fd7ee7c32a8 0x7fd7ee7c3280: 0001 0x7fd7ee7c3290: 0006e5c7e528 7fd81d74df10 0x7fd7ee7c32a0: 0006e5c7e4d8 0x7fd7ee7c32b0: 0006f6c7fbf8 0006f6e957f0 0x7fd7ee7c32c0: 0006e5c7e350 7fd87fff 0x7fd7ee7c32d0: 0006e5c7e528 7fd81fa867e0 0x7fd7ee7c32e0: 00a77ae20001 00a77ae2 0x7fd7ee7c32f0: 0006e5c7e488 0112d5f1 0x7fd7ee7c3300: dcb8fc9b99ce 000100a77ae6 0x7fd7ee7c3310: 00a814b000a814b4 0006e5c7e4d8 0x7fd7ee7c3320: 0006e5c7e4d8 0006f6a4df38 0x7fd7ee7c3330: 00060001 00067fff 0x7fd7ee7c3340: 008971582c8a 0006189d87852057 0x7fd7ee7c3350: e5244e71 Instructions: (pc=0x7fd80a399e70) 0x7fd80a399e50: e4 0f 49 83 fc 0f 0f 84 94 00 00 00 4a 8d 14 20 0x7fd80a399e60: 48 39 f2 0f 87 c0 00 00 00 0f 1f 80 00 00 00 00 0x7fd80a399e70: 48 8b 0b 48 83 c3 08 48 89 08 48 83 c0 08 48 39 0x7fd80a399e80: c2 77 ed 48 29 d0 48 89 d1 48 29 c3 0f b7 03 48 Register to 
memory mapping: RAX=0x7fcde6560d32 is an unknown value RBX=0x7fd450c4cff9 is an unknown value RCX=0x7fcde6560c7a is an unknown value RDX=0x7fcde6560d3e is an unknown value RSP=0x7fd7ee7c3160 is pointing into the stack for thread: 0x7fce32dad1b0 RBP=0x7fd450c44ae6 is an unknown value RSI=0x7fcde6562ff8 is an unknown value RDI=0x00c2 is an unknown value R8 =0x7fcde6562ff4 is an unknown value R9 =0x7fcde6563000 is an unknown value R10=0x is an unknown value R11=0x is an unknown value R12=0x000c is an unknown value R13=0x7fd4501cd000 is an unknown value R14=0x7fcde6562ff7 is an unknown value R15=0x7fcde6562ffb is an unknown value Stack: [0x7fd7ee784000,0x000
[jira] [Updated] (CASSANDRA-13757) Cassandra 3.5.0 JVM Segfault Problem While Repair Job is Running
[ https://issues.apache.org/jira/browse/CASSANDRA-13757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Serhat Rıfat Demircan updated CASSANDRA-13757:
----------------------------------------------
Description:
We got the following error while a repair job was running on our cluster. One of the nodes stopped due to a segmentation fault in the JVM and the repair job failed. We could not reproduce this problem (the main difference is data size) in our test and staging environments.
{code:java}
#
# SIGSEGV (0xb) at pc=0x7fd80a399e70, pid=1305, tid=0x7fd7ee7c4700
#
# JRE version: Java(TM) SE Runtime Environment (8.0_131-b11) (build 1.8.0_131-b11)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.131-b11 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C [liblz4-java3580121503903465201.so+0x5e70] LZ4_decompress_fast+0xd0
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# If you would like to submit a bug report, please visit:
# http://bugreport.java.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
[jira] [Updated] (CASSANDRA-13757) Cassandra 3.5.0 JVM Segfault Problem While Repair Job is Running
[ https://issues.apache.org/jira/browse/CASSANDRA-13757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Serhat Rıfat Demircan updated CASSANDRA-13757: -- Description: We got the following error while a repair job was running on our cluster. One of the nodes stopped due to a segmentation fault in the JVM, and the repair job failed. {code:java} # # SIGSEGV (0xb) at pc=0x7fd80a399e70, pid=1305, tid=0x7fd7ee7c4700 # # JRE version: Java(TM) SE Runtime Environment (8.0_131-b11) (build 1.8.0_131-b11) # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.131-b11 mixed mode linux-amd64 compressed oops) # Problematic frame: # C [liblz4-java3580121503903465201.so+0x5e70] LZ4_decompress_fast+0xd0 # # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again # # If you would like to submit a bug report, please visit: # http://bugreport.java.com/bugreport/crash.jsp # The crash happened outside the Java Virtual Machine in native code. # See problematic frame for where to report the bug. 
[jira] [Updated] (CASSANDRA-13757) Cassandra 3.5.0 JVM Segfault Problem While Repair Job is Running
[ https://issues.apache.org/jira/browse/CASSANDRA-13757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Serhat Rıfat Demircan updated CASSANDRA-13757: -- Description: We got the following error while a repair job was running on our cluster. One of the nodes stopped due to a segmentation fault in the JVM. {code:java} # # SIGSEGV (0xb) at pc=0x7fd80a399e70, pid=1305, tid=0x7fd7ee7c4700 # # JRE version: Java(TM) SE Runtime Environment (8.0_131-b11) (build 1.8.0_131-b11) # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.131-b11 mixed mode linux-amd64 compressed oops) # Problematic frame: # C [liblz4-java3580121503903465201.so+0x5e70] LZ4_decompress_fast+0xd0 # # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again # # If you would like to submit a bug report, please visit: # http://bugreport.java.com/bugreport/crash.jsp # The crash happened outside the Java Virtual Machine in native code. # See problematic frame for where to report the bug. 
# --- T H R E A D --- Current thread (0x7fce32dad1b0): JavaThread "CompactionExecutor:9798" daemon [_thread_in_native, id=16879, stack(0x7fd7ee784000,0x7fd7ee7c5000)] siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x7fd450c4d000 Registers: RAX=0x7fcde6560d32, RBX=0x7fd450c4cff9, RCX=0x7fcde6560c7a, RDX=0x7fcde6560d3e RSP=0x7fd7ee7c3160, RBP=0x7fd450c44ae6, RSI=0x7fcde6562ff8, RDI=0x00c2 R8 =0x7fcde6562ff4, R9 =0x7fcde6563000, R10=0x, R11=0x R12=0x000c, R13=0x7fd4501cd000, R14=0x7fcde6562ff7, R15=0x7fcde6562ffb RIP=0x7fd80a399e70, EFLAGS=0x00010283, CSGSFS=0x0033, ERR=0x0004 TRAPNO=0x000e Top of Stack: (sp=0x7fd7ee7c3160) 0x7fd7ee7c3160: 0008 7fd81e21c3d0 0x7fd7ee7c3170: 0004 0001 0x7fd7ee7c3180: 0002 0001 0x7fd7ee7c3190: 0004 0004 0x7fd7ee7c31a0: 0004 0004 0x7fd7ee7c31b0: 0x7fd7ee7c31c0: 0x7fd7ee7c31d0: 0001 0x7fd7ee7c31e0: 0002 0003 0x7fd7ee7c31f0: 7fd7ee7c32b8 7fce32dad3a8 0x7fd7ee7c3200: 0x7fd7ee7c3210: 7fd4501cd000 7fcde6553000 0x7fd7ee7c3220: 00a77ae6 7fd80a39659d 0x7fd7ee7c3230: dcb8fc9b 0x7fd7ee7c3240: 7fd7ee7c32d0 0x7fd7ee7c3250: 0006e5c7e4d8 7fd7ee7c32b8 0x7fd7ee7c3260: 7fce32dad1b0 7fd81df2099d 0x7fd7ee7c3270: 7fd7ee7c32a8 0x7fd7ee7c3280: 0001 0x7fd7ee7c3290: 0006e5c7e528 7fd81d74df10 0x7fd7ee7c32a0: 0006e5c7e4d8 0x7fd7ee7c32b0: 0006f6c7fbf8 0006f6e957f0 0x7fd7ee7c32c0: 0006e5c7e350 7fd87fff 0x7fd7ee7c32d0: 0006e5c7e528 7fd81fa867e0 0x7fd7ee7c32e0: 00a77ae20001 00a77ae2 0x7fd7ee7c32f0: 0006e5c7e488 0112d5f1 0x7fd7ee7c3300: dcb8fc9b99ce 000100a77ae6 0x7fd7ee7c3310: 00a814b000a814b4 0006e5c7e4d8 0x7fd7ee7c3320: 0006e5c7e4d8 0006f6a4df38 0x7fd7ee7c3330: 00060001 00067fff 0x7fd7ee7c3340: 008971582c8a 0006189d87852057 0x7fd7ee7c3350: e5244e71 Instructions: (pc=0x7fd80a399e70) 0x7fd80a399e50: e4 0f 49 83 fc 0f 0f 84 94 00 00 00 4a 8d 14 20 0x7fd80a399e60: 48 39 f2 0f 87 c0 00 00 00 0f 1f 80 00 00 00 00 0x7fd80a399e70: 48 8b 0b 48 83 c3 08 48 89 08 48 83 c0 08 48 39 0x7fd80a399e80: c2 77 ed 48 29 d0 48 89 d1 48 29 c3 0f b7 03 48 Register to 
memory mapping: RAX=0x7fcde6560d32 is an unknown value RBX=0x7fd450c4cff9 is an unknown value RCX=0x7fcde6560c7a is an unknown value RDX=0x7fcde6560d3e is an unknown value RSP=0x7fd7ee7c3160 is pointing into the stack for thread: 0x7fce32dad1b0 RBP=0x7fd450c44ae6 is an unknown value RSI=0x7fcde6562ff8 is an unknown value RDI=0x00c2 is an unknown value R8 =0x7fcde6562ff4 is an unknown value R9 =0x7fcde6563000 is an unknown value R10=0x is an unknown value R11=0x is an unknown value R12=0x000c is an unknown value R13=0x7fd4501cd000 is an unknown value R14=0x7fcde6562ff7 is an unknown value R15=0x7fcde6562ffb is an unknown value Stack: [0x7fd7ee784000,0x7fd7ee7c5000], sp=0x7fd7ee7c3160, free space=252k Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=
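A note on reading the dump above. The bytes at the faulting pc (48 8b 0b) appear to decode to an 8-byte load from the address held in RBX, and SEGV_MAPERR means the faulting address was not mapped. ERR=0x0004 with TRAPNO=0x000e reads as a user-mode read of a non-present page. The sketch below checks the arithmetic: the last byte of an 8-byte load starting at RBX is exactly si_addr, the first byte of the next page. It assumes 4 KiB pages; the class and method names are hypothetical and are not part of Cassandra or lz4-java.

```java
// Sketch only: register values are copied from the hs_err dump above.
// The 4 KiB page size is an assumption (x86-64 Linux default).
final class FaultAddressCheck {
    static final long PAGE_SIZE = 4096L;

    /** Address of the last byte touched by a load of width bytes starting at base. */
    static long lastByte(long base, int width) {
        return base + width - 1;
    }

    /** Start address of the page containing addr. */
    static long pageStart(long addr) {
        return addr & ~(PAGE_SIZE - 1);
    }

    /** True if the load touches two different pages. */
    static boolean crossesPage(long base, int width) {
        return pageStart(base) != pageStart(lastByte(base, width));
    }

    public static void main(String[] args) {
        long rbx = 0x7fd450c4cff9L;    // RBX from the register dump
        long siAddr = 0x7fd450c4d000L; // si_addr from siginfo
        int width = 8;                 // 48 8b 0b: 8-byte load from [rbx]

        System.out.printf("last byte read:  0x%x%n", lastByte(rbx, width));
        System.out.printf("crosses a page:  %b%n", crossesPage(rbx, width));
        System.out.printf("equals si_addr:  %b%n", lastByte(rbx, width) == siAddr);
    }
}
```

This is consistent with the native decompressor reading past the end of a mapped input region, but the dump alone does not establish why that happened.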
[jira] [Created] (CASSANDRA-13757) Cassandra 3.5.0 JVM Segfault Problem While Repair Job is Running
Serhat Rıfat Demircan created CASSANDRA-13757: - Summary: Cassandra 3.5.0 JVM Segfault Problem While Repair Job is Running Key: CASSANDRA-13757 URL: https://issues.apache.org/jira/browse/CASSANDRA-13757 Project: Cassandra Issue Type: Bug Environment: Operating System: Debian Jessie Java: Oracle JDK 1.8.0_131 Cassandra: 3.5.0 Reporter: Serhat Rıfat Demircan We got the following error while a repair job was running on our cluster. One of the nodes stopped due to the following error. {code:java} # # SIGSEGV (0xb) at pc=0x7fd80a399e70, pid=1305, tid=0x7fd7ee7c4700 # # JRE version: Java(TM) SE Runtime Environment (8.0_131-b11) (build 1.8.0_131-b11) # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.131-b11 mixed mode linux-amd64 compressed oops) # Problematic frame: # C [liblz4-java3580121503903465201.so+0x5e70] LZ4_decompress_fast+0xd0 # # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again # # If you would like to submit a bug report, please visit: # http://bugreport.java.com/bugreport/crash.jsp # The crash happened outside the Java Virtual Machine in native code. # See problematic frame for where to report the bug. 
[jira] [Commented] (CASSANDRA-13594) Use an ExecutorService for repair commands instead of new Thread(..).start()
[ https://issues.apache.org/jira/browse/CASSANDRA-13594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16123276#comment-16123276 ] Marcus Eriksson commented on CASSANDRA-13594: - https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/170/ ([^13594.png]) > Use an ExecutorService for repair commands instead of new Thread(..).start() > > > Key: CASSANDRA-13594 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13594 > Project: Cassandra > Issue Type: Bug >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson > Fix For: 4.x > > Attachments: 13594.png > > > Currently when starting a new repair, we create a new Thread and start it > immediately > It would be nice to be able to 1) limit the number of threads and 2) reject > starting new repair commands if we are already running too many. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-13594) Use an ExecutorService for repair commands instead of new Thread(..).start()
[ https://issues.apache.org/jira/browse/CASSANDRA-13594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcus Eriksson updated CASSANDRA-13594: Attachment: 13594.png > Use an ExecutorService for repair commands instead of new Thread(..).start() > > > Key: CASSANDRA-13594 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13594 > Project: Cassandra > Issue Type: Bug >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson > Fix For: 4.x > > Attachments: 13594.png > > > Currently when starting a new repair, we create a new Thread and start it > immediately > It would be nice to be able to 1) limit the number of threads and 2) reject > starting new repair commands if we are already running too many.
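The two goals in the CASSANDRA-13594 description (limit the number of threads, reject new repair commands when too many are running) can be illustrated with a plain java.util.concurrent pool. This is a minimal sketch and not the committed patch: the class name, the method names, and the limit of 2 are assumptions chosen for illustration. A SynchronousQueue holds no tasks, so once all workers are busy the AbortPolicy rejects further submissions.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; not taken from the Cassandra code base.
final class BoundedRepairExecutor {
    /** Pool with at most maxThreads workers and no queue: excess submissions are rejected. */
    static ExecutorService create(int maxThreads) {
        return new ThreadPoolExecutor(
                0, maxThreads,
                60L, TimeUnit.SECONDS,
                new SynchronousQueue<Runnable>(),
                new ThreadPoolExecutor.AbortPolicy());
    }

    /** Returns true if the command was accepted, false if too many are already running. */
    static boolean trySubmit(ExecutorService executor, Runnable repairCommand) {
        try {
            executor.execute(repairCommand);
            return true;
        } catch (RejectedExecutionException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ExecutorService executor = create(2);
        CountDownLatch release = new CountDownLatch(1);
        // Stand-in for a long-running repair: blocks until released.
        Runnable blockedRepair = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        System.out.println(trySubmit(executor, blockedRepair)); // accepted
        System.out.println(trySubmit(executor, blockedRepair)); // accepted
        System.out.println(trySubmit(executor, blockedRepair)); // rejected: both workers are busy
        release.countDown();
        executor.shutdown();
    }
}
```

Compared with new Thread(..).start(), the pool gives one place to set the limit and a well-defined failure mode for callers.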
[jira] [Commented] (CASSANDRA-13664) RangeFetchMapCalculator should not try to optimise 'trivial' ranges
[ https://issues.apache.org/jira/browse/CASSANDRA-13664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16123270#comment-16123270 ] Marcus Eriksson commented on CASSANDRA-13664: - rerunning tests https://circleci.com/gh/krummas/cassandra/68 https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/176 I'll try to grab screen shots once they are finished > RangeFetchMapCalculator should not try to optimise 'trivial' ranges > --- > > Key: CASSANDRA-13664 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13664 > Project: Cassandra > Issue Type: Bug >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson > Fix For: 4.x > > > RangeFetchMapCalculator (CASSANDRA-4650) tries to make the number of streams > out of each node as even as possible. > In a typical multi-dc ring the nodes in the dcs are setup using token + 1, > creating many tiny ranges. If we only try to optimise over the number of > streams, it is likely that the amount of data streamed out of each node is > unbalanced. > We should ignore those trivial ranges and only optimise the big ones, then > share the tiny ones over the nodes.
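The idea in the CASSANDRA-13664 description (leave the tiny ranges out of the optimisation, then share them over the nodes) can be sketched as follows. Everything here is hypothetical: the Range type, the size threshold that defines "trivial", and the round-robin sharing are assumptions for illustration, and the sketch ignores that in a real ring each range has its own set of candidate source nodes. It is not the logic of RangeFetchMapCalculator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; names and the threshold rule are assumptions.
final class TrivialRangeSketch {
    /** A non-wrapping token range (left, right]. */
    static final class Range {
        final long left;
        final long right;

        Range(long left, long right) {
            this.left = left;
            this.right = right;
        }

        long size() {
            return right - left;
        }

        @Override
        public String toString() {
            return "(" + left + "," + right + "]";
        }
    }

    /** Assumed rule: a range is trivial if it spans at most threshold tokens. */
    static boolean isTrivial(Range range, long threshold) {
        return range.size() <= threshold;
    }

    /** Shares the trivial ranges over the nodes round-robin; big ranges are skipped here. */
    static Map<String, List<Range>> shareTrivial(List<Range> ranges, List<String> nodes, long threshold) {
        Map<String, List<Range>> assignment = new LinkedHashMap<>();
        for (String node : nodes) {
            assignment.put(node, new ArrayList<Range>());
        }
        int next = 0;
        for (Range range : ranges) {
            if (!isTrivial(range, threshold)) {
                continue; // big ranges would go to the stream-count optimiser instead
            }
            assignment.get(nodes.get(next % nodes.size())).add(range);
            next++;
        }
        return assignment;
    }

    public static void main(String[] args) {
        // Tokens placed at "token + 1" produce the size-1 ranges below.
        List<Range> ranges = Arrays.asList(
                new Range(0, 1), new Range(1, 1000),
                new Range(1000, 1001), new Range(1001, 2000),
                new Range(2000, 2001));
        System.out.println(shareTrivial(ranges, Arrays.asList("A", "B"), 1L));
    }
}
```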
[Cassandra Wiki] Update of "Committers" by AlekseyYeschenko
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for change notification.

The "Committers" page has been changed by AlekseyYeschenko:
https://wiki.apache.org/cassandra/Committers?action=diff&rev1=75&rev2=76

  ||Blake Eggleston ||February 2017 ||Apple || ||
  ||Alex Petrov ||February 2017 ||Datastax || ||
  ||Joel Knighton ||February 2017 || Datastax || ||
-
+ ||Philip Thompson ||June 2017 || Datastax || ||
  {{https://c.statcounter.com/9397521/0/fe557aad/1/|stats}}
[jira] [Commented] (CASSANDRA-10726) Read repair inserts should not be blocking
[ https://issues.apache.org/jira/browse/CASSANDRA-10726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16123171#comment-16123171 ] Marcus Eriksson commented on CASSANDRA-10726: - btw, I think this should go to 4.0 only, do you agree? > Read repair inserts should not be blocking > -- > > Key: CASSANDRA-10726 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10726 > Project: Cassandra > Issue Type: Improvement > Components: Coordination >Reporter: Richard Low >Assignee: Xiaolong Jiang > Fix For: 4.x > > > Today, if there’s a digest mismatch in a foreground read repair, the insert > to update out of date replicas is blocking. This means, if it fails, the read > fails with a timeout. If a node is dropping writes (maybe it is overloaded or > the mutation stage is backed up for some other reason), all reads to a > replica set could fail. Further, replicas dropping writes get more out of > sync so will require more read repair. > The comment on the code for why the writes are blocking is: > {code} > // wait for the repair writes to be acknowledged, to minimize impact on any > replica that's > // behind on writes in case the out-of-sync row is read multiple times in > quick succession > {code} > but the bad side effect is that reads timeout. Either the writes should not > be blocking or we should return success for the read even if the write times > out.
[jira] [Updated] (CASSANDRA-10726) Read repair inserts should not be blocking
[ https://issues.apache.org/jira/browse/CASSANDRA-10726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcus Eriksson updated CASSANDRA-10726: Fix Version/s: (was: 3.0.x) 4.x > Read repair inserts should not be blocking > -- > > Key: CASSANDRA-10726 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10726 > Project: Cassandra > Issue Type: Improvement > Components: Coordination >Reporter: Richard Low >Assignee: Xiaolong Jiang > Fix For: 4.x > > > Today, if there’s a digest mismatch in a foreground read repair, the insert > to update out of date replicas is blocking. This means, if it fails, the read > fails with a timeout. If a node is dropping writes (maybe it is overloaded or > the mutation stage is backed up for some other reason), all reads to a > replica set could fail. Further, replicas dropping writes get more out of > sync so will require more read repair. > The comment on the code for why the writes are blocking is: > {code} > // wait for the repair writes to be acknowledged, to minimize impact on any > replica that's > // behind on writes in case the out-of-sync row is read multiple times in > quick succession > {code} > but the bad side effect is that reads timeout. Either the writes should not > be blocking or we should return success for the read even if the write times > out.
[jira] [Commented] (CASSANDRA-10726) Read repair inserts should not be blocking
[ https://issues.apache.org/jira/browse/CASSANDRA-10726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16123166#comment-16123166 ] Marcus Eriksson commented on CASSANDRA-10726: - This LGTM, pushed a branch with some small nits fixed here: https://github.com/krummas/cassandra/tree/xiaolong/10726 (please have a look) running tests: https://circleci.com/gh/krummas/cassandra/67 https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/175 Will commit if the tests look good and you think my nits are ok > Read repair inserts should not be blocking > -- > > Key: CASSANDRA-10726 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10726 > Project: Cassandra > Issue Type: Improvement > Components: Coordination >Reporter: Richard Low >Assignee: Xiaolong Jiang > Fix For: 3.0.x > > > Today, if there’s a digest mismatch in a foreground read repair, the insert > to update out of date replicas is blocking. This means, if it fails, the read > fails with a timeout. If a node is dropping writes (maybe it is overloaded or > the mutation stage is backed up for some other reason), all reads to a > replica set could fail. Further, replicas dropping writes get more out of > sync so will require more read repair. > The comment on the code for why the writes are blocking is: > {code} > // wait for the repair writes to be acknowledged, to minimize impact on any > replica that's > // behind on writes in case the out-of-sync row is read multiple times in > quick succession > {code} > but the bad side effect is that reads timeout. Either the writes should not > be blocking or we should return success for the read even if the write times > out.
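The second option in the CASSANDRA-10726 description (return success for the read even if the repair write times out) can be sketched with a CompletableFuture standing in for the repair acknowledgements. This is an illustration under stated assumptions, not the patch under review: the class, the method names, and the use of CompletableFuture are all hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch; not the CASSANDRA-10726 patch.
final class ReadRepairSketch {
    /** Waits up to timeoutMillis for the repair writes; returns true only if they were acknowledged. */
    static boolean awaitRepair(CompletableFuture<?> repairAcks, long timeoutMillis) {
        try {
            repairAcks.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException | ExecutionException e) {
            return false; // repair write timed out or failed
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Returns the resolved read result whether or not the repair writes were acknowledged in time. */
    static <T> T readWithBestEffortRepair(T resolved, CompletableFuture<?> repairAcks, long timeoutMillis) {
        awaitRepair(repairAcks, timeoutMillis); // outcome deliberately does not fail the read
        return resolved;
    }

    public static void main(String[] args) {
        // A future that never completes models a replica that is dropping writes.
        CompletableFuture<Void> neverAcked = new CompletableFuture<>();
        System.out.println(readWithBestEffortRepair("resolved-row", neverAcked, 10));
    }
}
```

The wait is kept so that a healthy replica is still repaired before the read returns, which is the intent of the quoted code comment; only the failure path changes.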