[jira] [Comment Edited] (CASSANDRA-13740) Orphan hint file gets created while node is being removed from cluster

2017-08-11 Thread Jaydeepkumar Chovatia (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16124249#comment-16124249
 ] 

Jaydeepkumar Chovatia edited comment on CASSANDRA-13740 at 8/11/17 11:56 PM:
-

Hi [~iamaleksey]

I have modified the code as per your review comments; please find it attached 
as "13740-3.0.15.txt".
Please also find the same patch here: 
https://github.com/jaydeepkumar1984/cassandra/commit/173fce0362246595d26b24196d6690223d132d5e

I will create a patch for 3.11 and run {{circleci}} after receiving your 
review comments.

Jaydeep


was (Author: chovatia.jayd...@gmail.com):
Hi [~iamaleksey]

I have modified code as per your review comments, please find it attached 
"13740_3.0.15.txt"
Also please find same patch here: 
https://github.com/jaydeepkumar1984/cassandra/commit/173fce0362246595d26b24196d6690223d132d5e

I will create patch for 3.11 as well as will run {{circleci}} after receiving 
your review comments.

Jaydeep

> Orphan hint file gets created while node is being removed from cluster
> --
>
> Key: CASSANDRA-13740
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13740
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Jaydeepkumar Chovatia
>Assignee: Jaydeepkumar Chovatia
>Priority: Minor
> Fix For: 3.0.x, 3.11.x
>
> Attachments: 13740-3.0.15.txt, gossip_hang_test.py
>
>
> I have found this new issue during my testing: whenever a node is being 
> removed, a hint file for that node gets written and stays inside the hints 
> directory forever. I debugged the code and found that it is due to a race 
> condition between [HintsWriteExecutor.java::flush | 
> https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsWriteExecutor.java#L195]
>  and [HintsWriteExecutor.java::closeWriter | 
> https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsWriteExecutor.java#L106].
>  
> *Time t1* The node is down; as a result, hints are being written by 
> [HintsWriteExecutor.java::flush | 
> https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsWriteExecutor.java#L195]
> *Time t2* The node is removed from the cluster; as a result it calls 
> [HintsService.java::exciseStore | 
> https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsService.java#L327]
>  which removes the hint files for the node being removed
> *Time t3* The mutation stage keeps pumping hints through 
> [HintsService.java::write | 
> https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsService.java#L145]
>  which again calls [HintsWriteExecutor.java::flush | 
> https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsWriteExecutor.java#L215]
>  and a new orphan file gets created
> I was writing a new dtest for CASSANDRA-13562 and CASSANDRA-13308, and it 
> helped me reproduce this new bug. I will submit a patch for this new dtest 
> later.
> I also tried the following to check how this orphan hint file responds:
> 1. I tried {{nodetool truncatehints }} but it fails because the node is no 
> longer part of the ring
> 2. I then tried {{nodetool truncatehints}}, which still doesn't remove the 
> hint file because it is not yet included in the [dispatchDequeue | 
> https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsStore.java#L53]
> Reproducible steps:
> Please find the dtest python file {{gossip_hang_test.py}} attached, which 
> reproduces this bug.
> Solution:
> This is due to the race condition mentioned above. Since 
> {{HintsWriteExecutor.java}} creates a thread pool with only 1 worker, the 
> solution becomes fairly simple: whenever we [HintsService.java::excise | 
> https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsService.java#L303]
>  a host, just store it in memory, and check for already-evicted hosts inside 
> [HintsWriteExecutor.java::flush | 
> https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsWriteExecutor.java#L215].
>  If an already-evicted host is found, ignore its hints.
> Jaydeep
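The proposed solution can be sketched as a small in-memory tracker consulted by the single-threaded flush path — an illustrative stand-in, not the attached patch (the class and method names {{ExcisedHostTracker}}, {{markExcised}}, and {{shouldFlush}} are mine):

```java
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: remember hosts that have been excised so a racing flush can
// drop their buffered hints instead of re-creating an orphan hint file.
public class ExcisedHostTracker
{
    private final Set<UUID> excised = ConcurrentHashMap.newKeySet();

    // Called from the excise path (HintsService.excise) before the
    // store's hint files are deleted.
    public void markExcised(UUID hostId)
    {
        excised.add(hostId);
    }

    // Consulted from the flush path (HintsWriteExecutor.flush): hints
    // for an already-evicted host are ignored rather than written.
    public boolean shouldFlush(UUID hostId)
    {
        return !excised.contains(hostId);
    }
}
```

Because {{HintsWriteExecutor}} runs on a single worker, the check and the write cannot interleave with each other, which is what keeps this simple set sufficient.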



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-13740) Orphan hint file gets created while node is being removed from cluster

2017-08-11 Thread Jaydeepkumar Chovatia (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16124249#comment-16124249
 ] 

Jaydeepkumar Chovatia edited comment on CASSANDRA-13740 at 8/11/17 11:56 PM:
-

Hi [~iamaleksey]

I have modified the code as per your review comments; please find it attached 
as "13740_3.0.15.txt".
Please also find the same patch here: 
https://github.com/jaydeepkumar1984/cassandra/commit/173fce0362246595d26b24196d6690223d132d5e

I will create a patch for 3.11 and run {{circleci}} after receiving your 
review comments.

Jaydeep


was (Author: chovatia.jayd...@gmail.com):
Hi [~iamaleksey]

I have modified code as per your review comments, please find it attached 
"13740-2_3.0.15.txt"
Also please find same patch here: 
https://github.com/jaydeepkumar1984/cassandra/commit/173fce0362246595d26b24196d6690223d132d5e

I will create patch for 3.11 as well as will run {{circleci}} after receiving 
your review comments.

Jaydeep




[jira] [Updated] (CASSANDRA-13740) Orphan hint file gets created while node is being removed from cluster

2017-08-11 Thread Jaydeepkumar Chovatia (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jaydeepkumar Chovatia updated CASSANDRA-13740:
--
Attachment: (was: 13740-3.0.15.txt)




[jira] [Updated] (CASSANDRA-13740) Orphan hint file gets created while node is being removed from cluster

2017-08-11 Thread Jaydeepkumar Chovatia (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jaydeepkumar Chovatia updated CASSANDRA-13740:
--
Attachment: 13740-3.0.15.txt




[jira] [Commented] (CASSANDRA-13740) Orphan hint file gets created while node is being removed from cluster

2017-08-11 Thread Jaydeepkumar Chovatia (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16124249#comment-16124249
 ] 

Jaydeepkumar Chovatia commented on CASSANDRA-13740:
---

Hi [~iamaleksey]

I have modified the code as per your review comments; please find it attached 
as "13740-2_3.0.15.txt".
Please also find the same patch here: 
https://github.com/jaydeepkumar1984/cassandra/commit/173fce0362246595d26b24196d6690223d132d5e

I will create a patch for 3.11 and run {{circleci}} after receiving your 
review comments.

Jaydeep




[jira] [Updated] (CASSANDRA-13758) Incremental repair sessions shouldn't be deleted if they still have sstables

2017-08-11 Thread Blake Eggleston (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-13758:

Reviewer: Marcus Eriksson
  Status: Patch Available  (was: Open)

[trunk|https://github.com/bdeggleston/cassandra/tree/13758]

[utest|https://circleci.com/gh/bdeggleston/cassandra/87]

> Incremental repair sessions shouldn't be deleted if they still have sstables
> 
>
> Key: CASSANDRA-13758
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13758
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>
> The incremental session cleanup doesn't verify that there are no remaining 
> sstables marked as part of the repair before deleting a session. Deleting a 
> successful repair session which still has outstanding sstables will cause 
> those sstables to be demoted to unrepaired, creating an inconsistency.
> This typically wouldn't be an issue, since we'd expect the sstables to have 
> long since been promoted / demoted. However, I've seen a few ref-leak issues 
> which can cause sstables to get stuck. Those have been fixed, but we should 
> still protect against that edge case to prevent inconsistencies caused by 
> future (or currently unknown) bugs.
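The guard described above could look roughly like this — a hedged sketch in which {{hasReferencingSSTables}} stands in for the real repaired-state lookup, and the session map is simplified to a plain map (not the committed API):

```java
import java.util.Map;
import java.util.Set;
import java.util.UUID;
import java.util.function.Predicate;

// Sketch of the cleanup guard: a session is only eligible for deletion
// once no sstable is still marked with its id; otherwise deleting it
// would demote those sstables back to unrepaired.
public class SessionCleanup
{
    public static int purgeCompleted(Map<UUID, ?> sessions,
                                     Predicate<UUID> hasReferencingSSTables)
    {
        int removed = 0;
        for (UUID id : Set.copyOf(sessions.keySet()))
        {
            if (hasReferencingSSTables.test(id))
                continue; // still-referenced session: skip, retry later
            sessions.remove(id);
            removed++;
        }
        return removed;
    }
}
```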






[jira] [Updated] (CASSANDRA-13758) Incremental repair sessions shouldn't be deleted if they still have sstables

2017-08-11 Thread Blake Eggleston (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-13758:

Fix Version/s: 4.0

> Incremental repair sessions shouldn't be deleted if they still have sstables
> 
>
> Key: CASSANDRA-13758
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13758
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
> Fix For: 4.0






[jira] [Commented] (CASSANDRA-11483) Enhance sstablemetadata

2017-08-11 Thread Joel Knighton (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16124103#comment-16124103
 ] 

Joel Knighton commented on CASSANDRA-11483:
---

The dtest fix was committed in [CASSANDRA-13755] - thanks everyone.

> Enhance sstablemetadata
> ---
>
> Key: CASSANDRA-11483
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11483
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Minor
> Fix For: 4.0
>
> Attachments: CASSANDRA-11483.txt, CASSANDRA-11483v2.txt, 
> CASSANDRA-11483v3.txt, CASSANDRA-11483v4.txt, CASSANDRA-11483v5.txt, Screen 
> Shot 2016-04-03 at 11.40.32 PM.png
>
>
> sstablemetadata provides quite a bit of useful information, but there are a 
> few hiccups I would like to see addressed:
> * It does not use client mode
> * Units are not provided (or anything for that matter). There is data in 
> micros, millis, and seconds, as durations and as timestamps from epoch, but 
> there is no way to tell which is which without a non-trivial code dive
> * In general, the output is pretty frustrating to parse






[jira] [Created] (CASSANDRA-13758) Incremental repair sessions shouldn't be deleted if they still have sstables

2017-08-11 Thread Blake Eggleston (JIRA)
Blake Eggleston created CASSANDRA-13758:
---

 Summary: Incremental repair sessions shouldn't be deleted if they 
still have sstables
 Key: CASSANDRA-13758
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13758
 Project: Cassandra
  Issue Type: Bug
Reporter: Blake Eggleston
Assignee: Blake Eggleston








[jira] [Commented] (CASSANDRA-13594) Use an ExecutorService for repair commands instead of new Thread(..).start()

2017-08-11 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16123748#comment-16123748
 ] 

Ariel Weisberg commented on CASSANDRA-13594:


Unrelated, but I did manage to reproduce the short_read_test failure after 
letting it run many times.

> Use an ExecutorService for repair commands instead of new Thread(..).start()
> 
>
> Key: CASSANDRA-13594
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13594
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
> Fix For: 4.x
>
> Attachments: 13594.png
>
>
> Currently, when starting a new repair, we create a new Thread and start it 
> immediately.
> It would be nice to be able to 1) limit the number of threads and 2) reject 
> new repair commands if we are already running too many.
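Both requirements can be met at once by a bounded {{ThreadPoolExecutor}} with a direct handoff: up to N repairs run concurrently, and any further submission is rejected immediately. This is a sketch of the idea, not the committed change; the factory name and pool parameters are illustrative:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Sketch: replace new Thread(..).start() with a bounded pool whose
// rejection policy refuses new repair commands once too many are running.
public class RepairCommandExecutor
{
    public static ExecutorService create(int maxConcurrentRepairs)
    {
        return new ThreadPoolExecutor(
            1, maxConcurrentRepairs,
            1, TimeUnit.MINUTES,
            new SynchronousQueue<>(),               // no queueing: run now or reject
            new ThreadPoolExecutor.AbortPolicy());  // reject -> RejectedExecutionException
    }
}
```

With a {{SynchronousQueue}} there is no backlog: a submitted repair either gets a thread immediately (up to the maximum) or fails fast with {{RejectedExecutionException}}, which the caller can surface as "too many repairs running".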






[jira] [Updated] (CASSANDRA-13594) Use an ExecutorService for repair commands instead of new Thread(..).start()

2017-08-11 Thread Ariel Weisberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ariel Weisberg updated CASSANDRA-13594:
---
Status: Ready to Commit  (was: Patch Available)







[jira] [Commented] (CASSANDRA-9989) Optimise BTree.Builder

2017-08-11 Thread Jay Zhuang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16123714#comment-16123714
 ] 

Jay Zhuang commented on CASSANDRA-9989:
---

[~Anthony Grasso] Would you please review the patch?

> Optimise BTree.Builder
> -
>
> Key: CASSANDRA-9989
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9989
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Benedict
>Assignee: Jay Zhuang
>Priority: Minor
> Fix For: 4.x
>
> Attachments: 9989-trunk.txt
>
>
> BTree.Builder could reduce its copying and exploit toArray more efficiently 
> with some work. It's not very important right now, because we don't make as 
> much use of its bulk-add methods as we otherwise might; however, over time 
> this work will become more useful.
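The copy-reduction idea can be illustrated with a toy builder whose bulk-add performs one {{System.arraycopy}} per source array instead of one capacity check and append per element — a simplified illustration only, not the actual {{BTree.Builder}} code:

```java
import java.util.Arrays;

// Toy builder: addAll() grows the backing array at most once per call
// and copies the whole source in a single System.arraycopy.
public class BulkBuilder
{
    private Object[] values = new Object[8];
    private int count;

    public void addAll(Object[] src)
    {
        if (count + src.length > values.length)
            values = Arrays.copyOf(values,
                                   Math.max(values.length * 2, count + src.length));
        System.arraycopy(src, 0, values, count, src.length);
        count += src.length;
    }

    public Object[] toArray()
    {
        return Arrays.copyOf(values, count); // single trailing right-size copy
    }
}
```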






[jira] [Updated] (CASSANDRA-13688) Anticompaction race can leak sstables/txn

2017-08-11 Thread Blake Eggleston (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-13688:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Finally got a good dtest run. Committed as 
{{e9cc805db1133982c022657f8cab86cd24b3686f}}

> Anticompaction race can leak sstables/txn
> -
>
> Key: CASSANDRA-13688
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13688
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
> Fix For: 4.0
>
>
> At the top of {{CompactionManager#performAntiCompaction}}, the parent repair 
> session is loaded; if the session can't be found, a RuntimeException is 
> thrown. This can happen if a participant is evicted after the IR prepare 
> message is received, but before the anticompaction starts. The exception is 
> thrown outside of the try/finally block that guards the sstable refs and the 
> lifecycle transaction, causing them to leak and preventing the sstables from 
> ever being removed from {{View.compacting}}.
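The shape of the fix can be illustrated as follows — a simplified stand-in in which {{AutoCloseable}} placeholders replace the real sstable refs and lifecycle transaction, and a {{Runnable}} stands in for the session lookup; the point is only that the lookup happens inside the guard:

```java
// Sketch: acquire the parent session INSIDE the try/finally that guards
// the sstable refs and the lifecycle txn, so a missing session still
// releases both resources instead of leaking them.
public class AntiCompactionGuard
{
    public static void performAntiCompaction(AutoCloseable sstableRefs,
                                             AutoCloseable txn,
                                             Runnable loadParentSession)
    {
        try
        {
            loadParentSession.run(); // may throw if a participant was evicted
            // ... anticompaction work would go here ...
        }
        finally
        {
            // Release unconditionally, whether or not the session was found.
            try { txn.close(); sstableRefs.close(); }
            catch (Exception e) { throw new RuntimeException(e); }
        }
    }
}
```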






cassandra git commit: Fix race / ref leak in anticompaction

2017-08-11 Thread bdeggleston
Repository: cassandra
Updated Branches:
  refs/heads/trunk f4da90aca -> e9cc805db


Fix race / ref leak in anticompaction

Patch by Blake Eggleston; Reviewed by Ariel Weisberg for CASSANDRA-13688


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/e9cc805d
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/e9cc805d
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/e9cc805d

Branch: refs/heads/trunk
Commit: e9cc805db1133982c022657f8cab86cd24b3686f
Parents: f4da90a
Author: Blake Eggleston 
Authored: Wed Jul 12 14:47:48 2017 -0700
Committer: Blake Eggleston 
Committed: Fri Aug 11 10:24:17 2017 -0700

--
 CHANGES.txt |   1 +
 .../db/compaction/AbstractCompactionTask.java   |  40 +
 .../db/compaction/CompactionManager.java|  45 +++---
 .../db/compaction/PendingRepairManager.java |  43 ++---
 .../db/compaction/AntiCompactionTest.java   |  40 +
 .../db/compaction/CompactionTaskTest.java   | 157 +++
 .../consistent/PendingAntiCompactionTest.java   |  23 +++
 7 files changed, 312 insertions(+), 37 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/e9cc805d/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 988f93d..7c9d79a 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 4.0
+ * Fix race / ref leak in anticompaction (CASSANDRA-13688)
  * Expose tasks queue length via JMX (CASSANDRA-12758)
  * Fix race / ref leak in PendingRepairManager (CASSANDRA-13751)
  * Enable ppc64le runtime as unsupported architecture (CASSANDRA-13615)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/e9cc805d/src/java/org/apache/cassandra/db/compaction/AbstractCompactionTask.java
--
diff --git 
a/src/java/org/apache/cassandra/db/compaction/AbstractCompactionTask.java 
b/src/java/org/apache/cassandra/db/compaction/AbstractCompactionTask.java
index 430c916..c542a51 100644
--- a/src/java/org/apache/cassandra/db/compaction/AbstractCompactionTask.java
+++ b/src/java/org/apache/cassandra/db/compaction/AbstractCompactionTask.java
@@ -17,7 +17,11 @@
  */
 package org.apache.cassandra.db.compaction;
 
+import java.util.Iterator;
 import java.util.Set;
+import java.util.UUID;
+
+import com.google.common.base.Preconditions;
 
 import org.apache.cassandra.db.ColumnFamilyStore;
 import org.apache.cassandra.db.Directories;
@@ -49,6 +53,42 @@ public abstract class AbstractCompactionTask extends 
WrappedRunnable
Set<SSTableReader> compacting = transaction.tracker.getCompacting();
 for (SSTableReader sstable : transaction.originals())
 assert compacting.contains(sstable) : sstable.getFilename() + " is 
not correctly marked compacting";
+
+validateSSTables(transaction.originals());
+}
+
+/**
+ * Confirm that we're not attempting to compact repaired/unrepaired/pending repair sstables together
+ */
+private void validateSSTables(Set<SSTableReader> sstables)
+{
+// do not allow sstables in different repair states to be compacted together
+if (!sstables.isEmpty())
+{
+Iterator<SSTableReader> iter = sstables.iterator();
+SSTableReader first = iter.next();
+boolean isRepaired = first.isRepaired();
+UUID pendingRepair = first.getPendingRepair();
+while (iter.hasNext())
+{
+SSTableReader next = iter.next();
+Preconditions.checkArgument(isRepaired == next.isRepaired(),
+        "Cannot compact repaired and unrepaired sstables");
+
+if (pendingRepair == null)
+{
+Preconditions.checkArgument(!next.isPendingRepair(),
+        "Cannot compact pending repair and non-pending repair sstables");
+}
+else
+{
+Preconditions.checkArgument(next.isPendingRepair(),
+        "Cannot compact pending repair and non-pending repair sstables");
+Preconditions.checkArgument(pendingRepair.equals(next.getPendingRepair()),
+        "Cannot compact sstables from different pending repairs");
+}
+}
+}
 }
 
 /**

http://git-wip-us.apache.org/repos/asf/cassandra/blob/e9cc805d/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
--
diff --git a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java 
b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
index b

[jira] [Commented] (CASSANDRA-10726) Read repair inserts should not be blocking

2017-08-11 Thread Xiaolong Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16123604#comment-16123604
 ] 

Xiaolong Jiang commented on CASSANDRA-10726:


[~krummas] Thanks for the review. Yes, your change looks good to me. I saw the 
tests in CircleCI passed. Dtest is still running. Please go ahead and merge 
when dtest passes. 
I think we can go to 4.0 only for the open source version. 

> Read repair inserts should not be blocking
> --
>
> Key: CASSANDRA-10726
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10726
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination
>Reporter: Richard Low
>Assignee: Xiaolong Jiang
> Fix For: 4.x
>
>
> Today, if there’s a digest mismatch in a foreground read repair, the insert 
> to update out-of-date replicas is blocking. This means that if it fails, the read 
> fails with a timeout. If a node is dropping writes (maybe it is overloaded or 
> the mutation stage is backed up for some other reason), all reads to a 
> replica set could fail. Further, replicas dropping writes get more out of 
> sync, so they will require more read repair.
> The comment on the code for why the writes are blocking is:
> {code}
> // wait for the repair writes to be acknowledged, to minimize impact on any 
> replica that's
> // behind on writes in case the out-of-sync row is read multiple times in 
> quick succession
> {code}
> but the bad side effect is that reads time out. Either the writes should not 
> be blocking, or we should return success for the read even if the write times 
> out.
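The trade-off in this ticket can be sketched with plain futures (illustrative types only, not Cassandra's actual read repair code): blocking on repair-write acks ties the read's fate to the slowest replica, while letting the writes complete in the background allows the read to succeed regardless.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Sketch of blocking vs. non-blocking read repair writes.
public class ReadRepairSketch {
    // Blocking variant: a repair write that is never acked turns into a read timeout.
    static String readBlocking(List<Future<?>> repairAcks, long timeoutMs) throws Exception {
        for (Future<?> ack : repairAcks)
            ack.get(timeoutMs, TimeUnit.MILLISECONDS); // throws TimeoutException if a replica drops the write
        return "row";
    }

    // Async variant: repair writes continue in the background; the read returns immediately.
    static String readAsync(List<Future<?>> repairAcks) {
        return "row";
    }

    public static void main(String[] args) throws Exception {
        // A replica that drops writes: its ack future never completes.
        List<Future<?>> acks = List.of(new CompletableFuture<Void>());
        System.out.println(readAsync(acks)); // succeeds immediately
        try {
            readBlocking(acks, 50);
        } catch (TimeoutException e) {
            System.out.println("read timed out waiting on repair write");
        }
    }
}
```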






[jira] [Updated] (CASSANDRA-12758) Expose tasks queue length via JMX

2017-08-11 Thread Michael Shuler (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Shuler updated CASSANDRA-12758:
---
   Resolution: Fixed
Fix Version/s: (was: 4.x)
   4.0
   Status: Resolved  (was: Ready to Commit)

Committed to trunk - f4da90a
Thanks Romain!

> Expose tasks queue length via JMX
> -
>
> Key: CASSANDRA-12758
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12758
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Romain Hardouin
>Assignee: Michael Shuler
>Priority: Minor
> Fix For: 4.0
>
>
> CASSANDRA-11363 introduced {{cassandra.max_queued_native_transport_requests}} 
> to set the NTR queue length.
> Currently Cassandra lacks a JMX MBean exposing this value, which would 
> allow one to:
>  
> 1. Be sure this value has been set
> 2. Plot this value in a monitoring application to make correlations with 
> other graphs when we make changes






cassandra git commit: Add MBean to monitor max queued tasks

2017-08-11 Thread mshuler
Repository: cassandra
Updated Branches:
  refs/heads/trunk d68357a44 -> f4da90aca


Add MBean to monitor max queued tasks

patch by Romain Hardouin; reviewed by Michael Shuler for CASSANDRA-12758


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/f4da90ac
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/f4da90ac
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/f4da90ac

Branch: refs/heads/trunk
Commit: f4da90aca0e79664ea06212283f6cd5f9288d441
Parents: d68357a
Author: Romain Hardouin 
Authored: Thu Oct 6 22:36:07 2016 +0200
Committer: Michael Shuler 
Committed: Fri Aug 11 08:36:36 2017 -0500

--
 CHANGES.txt   |  1 +
 doc/source/operating/metrics.rst  |  1 +
 src/java/org/apache/cassandra/concurrent/SEPExecutor.java |  2 +-
 src/java/org/apache/cassandra/metrics/SEPMetrics.java | 10 ++
 4 files changed, 13 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/f4da90ac/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index efd6716..988f93d 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 4.0
+ * Expose tasks queue length via JMX (CASSANDRA-12758)
  * Fix race / ref leak in PendingRepairManager (CASSANDRA-13751)
  * Enable ppc64le runtime as unsupported architecture (CASSANDRA-13615)
  * Improve sstablemetadata output (CASSANDRA-11483)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f4da90ac/doc/source/operating/metrics.rst
--
diff --git a/doc/source/operating/metrics.rst b/doc/source/operating/metrics.rst
index a38d7c1..cfdd584 100644
--- a/doc/source/operating/metrics.rst
+++ b/doc/source/operating/metrics.rst
@@ -193,6 +193,7 @@
 CompletedTasks        Counter    Number of tasks completed.
 TotalBlockedTasks     Counter    Number of tasks that were blocked due to queue saturation.
 CurrentlyBlockedTask  Counter    Number of tasks that are currently blocked due to queue saturation but on retry will become unblocked.
 MaxPoolSize           Gauge      The maximum number of threads in this pool.
+MaxTasksQueued        Gauge      The maximum number of tasks queued before a task gets blocked.
 ===================== ========== ==========
 
 The following thread pools can be monitored.

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f4da90ac/src/java/org/apache/cassandra/concurrent/SEPExecutor.java
--
diff --git a/src/java/org/apache/cassandra/concurrent/SEPExecutor.java 
b/src/java/org/apache/cassandra/concurrent/SEPExecutor.java
index c87614b..add850a 100644
--- a/src/java/org/apache/cassandra/concurrent/SEPExecutor.java
+++ b/src/java/org/apache/cassandra/concurrent/SEPExecutor.java
@@ -35,7 +35,7 @@ public class SEPExecutor extends 
AbstractLocalAwareExecutorService
 
 public final int maxWorkers;
 public final String name;
-private final int maxTasksQueued;
+public final int maxTasksQueued;
 private final SEPMetrics metrics;
 
 // stores both a set of work permits and task permits:

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f4da90ac/src/java/org/apache/cassandra/metrics/SEPMetrics.java
--
diff --git a/src/java/org/apache/cassandra/metrics/SEPMetrics.java 
b/src/java/org/apache/cassandra/metrics/SEPMetrics.java
index 35f02b4..dd1d2d6 100644
--- a/src/java/org/apache/cassandra/metrics/SEPMetrics.java
+++ b/src/java/org/apache/cassandra/metrics/SEPMetrics.java
@@ -41,6 +41,8 @@ public class SEPMetrics
 public final Gauge pendingTasks;
 /** Maximum number of threads before it will start queuing tasks */
 public final Gauge maxPoolSize;
+/** Maximum number of tasks queued before a task gets blocked */
+public final Gauge maxTasksQueued;
 
 private MetricNameFactory factory;
 
@@ -85,6 +87,13 @@ public class SEPMetrics
 return executor.maxWorkers;
 }
 });
+maxTasksQueued =  
Metrics.register(factory.createMetricName("MaxTasksQueued"), new 
Gauge<Integer>()
+{
+public Integer getValue()
+{
+return executor.maxTasksQueued;
+}
+});
 }
 
 public void release()
@@ -95,5 +104,6 @@ public class SEPMetrics
 Metrics.remove(factory.createMetricName("TotalBlockedTasks"));
 Metrics.remove(factory.createMetricName("CurrentlyBlockedTasks"));
 Metrics.remove(factory.createMetricName("MaxPoolSize"));
+Metrics.remove(factory.createMetricName("MaxTasksQueued"));
 }
 }
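The patch above registers the gauge through the codebase's Metrics facade. For readers unfamiliar with the mechanism, a stdlib-only sketch of exposing such a value over JMX (class and attribute names here are illustrative, not the ones Cassandra uses):

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Minimal standard-MBean example: expose a read-only config value so a
// monitoring agent can read (and plot) it remotely over JMX.
public class QueueLengthMBeanSketch {
    // Standard MBean naming convention: interface name = class name + "MBean".
    public interface QueueConfigMBean {
        int getMaxTasksQueued();
    }

    public static class QueueConfig implements QueueConfigMBean {
        private final int max;
        public QueueConfig(int max) { this.max = max; }
        public int getMaxTasksQueued() { return max; }
    }

    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName("org.example:type=QueueConfig");
        server.registerMBean(new QueueConfig(1024), name);
        // A JMX client (jconsole, a metrics agent, ...) can now read the attribute:
        System.out.println(server.getAttribute(name, "MaxTasksQueued")); // prints 1024
    }
}
```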



[jira] [Updated] (CASSANDRA-12758) Expose tasks queue length via JMX

2017-08-11 Thread Michael Shuler (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Shuler updated CASSANDRA-12758:
---
Status: Ready to Commit  (was: Patch Available)

> Expose tasks queue length via JMX
> -
>
> Key: CASSANDRA-12758
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12758
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Romain Hardouin
>Assignee: Michael Shuler
>Priority: Minor
> Fix For: 4.x
>
>
> CASSANDRA-11363 introduced {{cassandra.max_queued_native_transport_requests}} 
> to set the NTR queue length.
> Currently Cassandra lacks a JMX MBean exposing this value, which would 
> allow one to:
>  
> 1. Be sure this value has been set
> 2. Plot this value in a monitoring application to make correlations with 
> other graphs when we make changes






[jira] [Updated] (CASSANDRA-12758) Expose tasks queue length via JMX

2017-08-11 Thread Michael Shuler (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Shuler updated CASSANDRA-12758:
---
Fix Version/s: (was: 3.11.x)
   (was: 3.0.x)

> Expose tasks queue length via JMX
> -
>
> Key: CASSANDRA-12758
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12758
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Romain Hardouin
>Assignee: Michael Shuler
>Priority: Minor
> Fix For: 4.x
>
>
> CASSANDRA-11363 introduced {{cassandra.max_queued_native_transport_requests}} 
> to set the NTR queue length.
> Currently Cassandra lacks a JMX MBean exposing this value, which would 
> allow one to:
>  
> 1. Be sure this value has been set
> 2. Plot this value in a monitoring application to make correlations with 
> other graphs when we make changes






[jira] [Comment Edited] (CASSANDRA-12758) Expose tasks queue length via JMX

2017-08-11 Thread Michael Shuler (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16123351#comment-16123351
 ] 

Michael Shuler edited comment on CASSANDRA-12758 at 8/11/17 3:58 PM:
-

Thanks! I had trouble with CircleCI completing a test run yesterday, so I 
pulled your patch to run through internal CI to see if this causes any test 
issues.

{{ant test-all}} passed 100%
{{cassandra-dtest}} run looks good - just one failure on CASSANDRA-13576


was (Author: mshuler):
Thanks! I had trouble with CircleCI completing a test run yesterday, so I 
pulled your patch to run through internal CI to see if this causes any test 
issues.

{{ant test-all}} passed 100%
{{cassandra-dtest}} run in progress

> Expose tasks queue length via JMX
> -
>
> Key: CASSANDRA-12758
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12758
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Romain Hardouin
>Assignee: Michael Shuler
>Priority: Minor
> Fix For: 4.x
>
>
> CASSANDRA-11363 introduced {{cassandra.max_queued_native_transport_requests}} 
> to set the NTR queue length.
> Currently Cassandra lacks a JMX MBean exposing this value, which would 
> allow one to:
>  
> 1. Be sure this value has been set
> 2. Plot this value in a monitoring application to make correlations with 
> other graphs when we make changes






[jira] [Comment Edited] (CASSANDRA-12758) Expose tasks queue length via JMX

2017-08-11 Thread Michael Shuler (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16123351#comment-16123351
 ] 

Michael Shuler edited comment on CASSANDRA-12758 at 8/11/17 3:26 PM:
-

Thanks! I had trouble with CircleCI completing a test run yesterday, so I 
pulled your patch to run through internal CI to see if this causes any test 
issues.

{{ant test-all}} passed 100%
{{cassandra-dtest}} run in progress


was (Author: mshuler):
Thanks! I had trouble with CircleCI completing a test run yesterday, so I 
pulled your patch to run through internal CI to see if this causes any test 
issues.

> Expose tasks queue length via JMX
> -
>
> Key: CASSANDRA-12758
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12758
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Romain Hardouin
>Assignee: Michael Shuler
>Priority: Minor
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> CASSANDRA-11363 introduced {{cassandra.max_queued_native_transport_requests}} 
> to set the NTR queue length.
> Currently Cassandra lacks a JMX MBean exposing this value, which would 
> allow one to:
>  
> 1. Be sure this value has been set
> 2. Plot this value in a monitoring application to make correlations with 
> other graphs when we make changes






[jira] [Commented] (CASSANDRA-13433) RPM distribution improvements and known issues

2017-08-11 Thread Michael Shuler (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16123418#comment-16123418
 ] 

Michael Shuler commented on CASSANDRA-13433:


Reuploaded gpg-signed rpms for 2.2.10 just now, so this should be fixed.

> RPM distribution improvements and known issues
> --
>
> Key: CASSANDRA-13433
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13433
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Packaging
>Reporter: Stefan Podkowinski
>Assignee: Stefan Podkowinski
>
> Starting with CASSANDRA-13252, new releases will be provided as both official 
> RPM and Debian packages. While the Debian packages are already well 
> established with our user base, the RPMs have just been released for the first 
> time and still require some attention. 
> Feel free to discuss RPM-related issues in this ticket and open a sub-task to 
> file a bug report. 
> Please note that native systemd support will be implemented with 
> CASSANDRA-13148 and this is not strictly an RPM-specific issue. We still 
> intend to offer non-systemd support based on the already working init scripts 
> that we ship. Therefore the first step is to make use of systemd backward 
> compatibility for SysV/LSB scripts, so we can provide RPMs for both systemd 
> and non-systemd environments.






svn commit: r20930 - in /release/cassandra/redhat/22x: ./ repodata/

2017-08-11 Thread mshuler
Author: mshuler
Date: Fri Aug 11 14:23:19 2017
New Revision: 20930

Log:
Reupload gpg-signed rpms for Apache Cassandra 2.2.10

Added:

release/cassandra/redhat/22x/repodata/29ede3ea14a1c5bee9a7b0f26fd9ad0f0a2ac2879a850d741c935083ba4d5914-primary.sqlite.bz2
   (with props)

release/cassandra/redhat/22x/repodata/377176d209e1a2a3c4b616ca2f3fdae4eaae0a604134d9d3da3f0447b4b612d4-other.sqlite.bz2
   (with props)

release/cassandra/redhat/22x/repodata/6e04fc3eddacd4121403039f1c37a828cea279d2b6f09a5d49122f82611fa3b9-filelists.sqlite.bz2
   (with props)

release/cassandra/redhat/22x/repodata/86429aeadd7294922dc5e76b5e0a3da99162c24e9f12724835fae172cbb00790-filelists.xml.gz
   (with props)

release/cassandra/redhat/22x/repodata/98688c5e74b32ca63c4da0bad59e2cd05ac53ea4c327329f9e75c7df40f2210b-primary.xml.gz
   (with props)

release/cassandra/redhat/22x/repodata/f791d6311e120a8470481817193a3bcf2d210292d9b7127320ff3173f3546168-other.xml.gz
   (with props)
Removed:

release/cassandra/redhat/22x/repodata/3236f5a391cbf37fd0e70cba6ec4633b6d466f68064061a4b40443c002199304-filelists.xml.gz

release/cassandra/redhat/22x/repodata/401603460b42f33ace08526f74b13526d3ebfa0aa1ac53b15fe0f6f1f4feae55-primary.sqlite.bz2

release/cassandra/redhat/22x/repodata/4b99211eb9721f495aa9e2aa16ed21e7d68ca0c17cc2135485d40a2dc2fb5dca-other.sqlite.bz2

release/cassandra/redhat/22x/repodata/5f42f64ab725593d48892f07ad9bf58ff3b452c298cd4ab6dba33b9ca60dbc6b-other.xml.gz

release/cassandra/redhat/22x/repodata/813f5fd0236564d9fca0fa1a5913f7ce6672a0509732fb9a6a064d2fe2bbf447-filelists.sqlite.bz2

release/cassandra/redhat/22x/repodata/eba2ada18598c21e51c5e3294ad9414bae2cdbc65a36d4e03153bf3fe1a35571-primary.xml.gz
Modified:
release/cassandra/redhat/22x/cassandra-2.2.10-1.noarch.rpm
release/cassandra/redhat/22x/cassandra-2.2.10-1.src.rpm
release/cassandra/redhat/22x/cassandra-tools-2.2.10-1.noarch.rpm
release/cassandra/redhat/22x/repodata/repomd.xml
release/cassandra/redhat/22x/repodata/repomd.xml.asc

Modified: release/cassandra/redhat/22x/cassandra-2.2.10-1.noarch.rpm
==
Binary files - no diff available.

Modified: release/cassandra/redhat/22x/cassandra-2.2.10-1.src.rpm
==
Binary files - no diff available.

Modified: release/cassandra/redhat/22x/cassandra-tools-2.2.10-1.noarch.rpm
==
Binary files - no diff available.

Added: 
release/cassandra/redhat/22x/repodata/29ede3ea14a1c5bee9a7b0f26fd9ad0f0a2ac2879a850d741c935083ba4d5914-primary.sqlite.bz2
==
Binary file - no diff available.

Propchange: 
release/cassandra/redhat/22x/repodata/29ede3ea14a1c5bee9a7b0f26fd9ad0f0a2ac2879a850d741c935083ba4d5914-primary.sqlite.bz2
--
svn:mime-type = application/octet-stream

Added: 
release/cassandra/redhat/22x/repodata/377176d209e1a2a3c4b616ca2f3fdae4eaae0a604134d9d3da3f0447b4b612d4-other.sqlite.bz2
==
Binary file - no diff available.

Propchange: 
release/cassandra/redhat/22x/repodata/377176d209e1a2a3c4b616ca2f3fdae4eaae0a604134d9d3da3f0447b4b612d4-other.sqlite.bz2
--
svn:mime-type = application/octet-stream

Added: 
release/cassandra/redhat/22x/repodata/6e04fc3eddacd4121403039f1c37a828cea279d2b6f09a5d49122f82611fa3b9-filelists.sqlite.bz2
==
Binary file - no diff available.

Propchange: 
release/cassandra/redhat/22x/repodata/6e04fc3eddacd4121403039f1c37a828cea279d2b6f09a5d49122f82611fa3b9-filelists.sqlite.bz2
--
svn:mime-type = application/octet-stream

Added: 
release/cassandra/redhat/22x/repodata/86429aeadd7294922dc5e76b5e0a3da99162c24e9f12724835fae172cbb00790-filelists.xml.gz
==
Binary file - no diff available.

Propchange: 
release/cassandra/redhat/22x/repodata/86429aeadd7294922dc5e76b5e0a3da99162c24e9f12724835fae172cbb00790-filelists.xml.gz
--
svn:mime-type = application/octet-stream

Added: 
release/cassandra/redhat/22x/repodata/98688c5e74b32ca63c4da0bad59e2cd05ac53ea4c327329f9e75c7df40f2210b-primary.xml.gz
==
Binary file - no diff available.

Propchange: 
release/cassandra/redhat/22x/repodata/98688c5e74b32ca63c4da0bad59e2cd05ac53ea4c327329f9e75c7df40f2210b-primary.xml.gz
-

[jira] [Commented] (CASSANDRA-13433) RPM distribution improvements and known issues

2017-08-11 Thread Michael Shuler (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16123380#comment-16123380
 ] 

Michael Shuler commented on CASSANDRA-13433:


Looks like I missed a package signature on 2.2.10, but the repository signature 
looks good. I believe setting gpgcheck=0 while leaving repo_gpgcheck=1 will 
allow installation, and we'll get this right on the next upload.
{noformat}
$ rpm -K *.rpm
cassandra-2.1.18-1.noarch.rpm: RSA sha1 ((MD5) PGP) md5 NOT OK (MISSING KEYS: 
(MD5) PGP#fe4b2bda) 
cassandra-2.2.10-1.noarch.rpm: sha1 md5 OK
cassandra-3.0.14-1.noarch.rpm: RSA sha1 ((MD5) PGP) md5 NOT OK (MISSING KEYS: 
(MD5) PGP#fe4b2bda) 
cassandra-3.11.0-1.noarch.rpm: RSA sha1 ((MD5) PGP) md5 NOT OK (MISSING KEYS: 
(MD5) PGP#fe4b2bda)
{noformat}
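With {{gpgcheck=0}} and {{repo_gpgcheck=1}}, the repo definition would look roughly like this (a sketch; the repo id and URLs are typical examples, not taken from this thread):

```ini
# Illustrative yum repo definition: per-package GPG check disabled as a
# temporary workaround, repository metadata signature check kept enabled.
[cassandra-22x]
name=Apache Cassandra 2.2.x
baseurl=https://www.apache.org/dist/cassandra/redhat/22x/
enabled=1
gpgcheck=0
repo_gpgcheck=1
gpgkey=https://www.apache.org/dist/cassandra/KEYS
```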

> RPM distribution improvements and known issues
> --
>
> Key: CASSANDRA-13433
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13433
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Packaging
>Reporter: Stefan Podkowinski
>Assignee: Stefan Podkowinski
>
> Starting with CASSANDRA-13252, new releases will be provided as both official 
> RPM and Debian packages. While the Debian packages are already well 
> established with our user base, the RPMs have just been released for the first 
> time and still require some attention. 
> Feel free to discuss RPM-related issues in this ticket and open a sub-task to 
> file a bug report. 
> Please note that native systemd support will be implemented with 
> CASSANDRA-13148 and this is not strictly an RPM-specific issue. We still 
> intend to offer non-systemd support based on the already working init scripts 
> that we ship. Therefore the first step is to make use of systemd backward 
> compatibility for SysV/LSB scripts, so we can provide RPMs for both systemd 
> and non-systemd environments.






[jira] [Commented] (CASSANDRA-12758) Expose tasks queue length via JMX

2017-08-11 Thread Michael Shuler (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16123351#comment-16123351
 ] 

Michael Shuler commented on CASSANDRA-12758:


Thanks! I had trouble with CircleCI completing a test run yesterday, so I 
pulled your patch to run through internal CI to see if this causes any test 
issues.

> Expose tasks queue length via JMX
> -
>
> Key: CASSANDRA-12758
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12758
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Romain Hardouin
>Assignee: Michael Shuler
>Priority: Minor
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> CASSANDRA-11363 introduced {{cassandra.max_queued_native_transport_requests}} 
> to set the NTR queue length.
> Currently Cassandra lacks a JMX MBean exposing this value, which would 
> allow one to:
>  
> 1. Be sure this value has been set
> 2. Plot this value in a monitoring application to make correlations with 
> other graphs when we make changes






[jira] [Updated] (CASSANDRA-12758) Expose tasks queue length via JMX

2017-08-11 Thread Romain Hardouin (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Romain Hardouin updated CASSANDRA-12758:

Attachment: (was: 12758-trunk.patch)

> Expose tasks queue length via JMX
> -
>
> Key: CASSANDRA-12758
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12758
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Romain Hardouin
>Assignee: Michael Shuler
>Priority: Minor
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> CASSANDRA-11363 introduced {{cassandra.max_queued_native_transport_requests}} 
> to set the NTR queue length.
> Currently Cassandra lacks a JMX MBean exposing this value, which would 
> allow one to:
>  
> 1. Be sure this value has been set
> 2. Plot this value in a monitoring application to make correlations with 
> other graphs when we make changes






[jira] [Updated] (CASSANDRA-12758) Expose tasks queue length via JMX

2017-08-11 Thread Romain Hardouin (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Romain Hardouin updated CASSANDRA-12758:

Attachment: (was: 12758-3.0.patch)

> Expose tasks queue length via JMX
> -
>
> Key: CASSANDRA-12758
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12758
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Romain Hardouin
>Assignee: Michael Shuler
>Priority: Minor
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> CASSANDRA-11363 introduced {{cassandra.max_queued_native_transport_requests}} 
> to set the NTR queue length.
> Currently Cassandra lacks a JMX MBean exposing this value, which would 
> allow one to:
>  
> 1. Be sure this value has been set
> 2. Plot this value in a monitoring application to make correlations with 
> other graphs when we make changes






[jira] [Commented] (CASSANDRA-12758) Expose tasks queue length via JMX

2017-08-11 Thread Romain Hardouin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16123323#comment-16123323
 ] 

Romain Hardouin commented on CASSANDRA-12758:
-

Rebased on trunk, build successful: 
https://circleci.com/gh/rhardouin/cassandra/18
Let's include it in trunk only; anyway, it's trivial to backport if someone needs 
it in production.

> Expose tasks queue length via JMX
> -
>
> Key: CASSANDRA-12758
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12758
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Romain Hardouin
>Assignee: Michael Shuler
>Priority: Minor
> Fix For: 3.0.x, 3.11.x, 4.x
>
> Attachments: 12758-3.0.patch, 12758-trunk.patch
>
>
> CASSANDRA-11363 introduced {{cassandra.max_queued_native_transport_requests}} 
> to set the NTR queue length.
> Currently Cassandra lacks a JMX MBean exposing this value, which would 
> allow one to:
>  
> 1. Be sure this value has been set
> 2. Plot this value in a monitoring application to make correlations with 
> other graphs when we make changes






[jira] [Updated] (CASSANDRA-13757) Cassandra 3.5.0 JVM Segfault Problem While Repair Job is Running

2017-08-11 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Serhat Rıfat Demircan updated CASSANDRA-13757:
--
Description: 
We got the following error while a repair job was running on our cluster. One of the 
nodes stopped due to a segmentation fault in the JVM and the repair job failed.

We could not reproduce this problem in our test and staging environments (the main 
difference is data size).

{code:java}
#
#  SIGSEGV (0xb) at pc=0x7fd80a399e70, pid=1305, tid=0x7fd7ee7c4700
#
# JRE version: Java(TM) SE Runtime Environment (8.0_131-b11) (build 
1.8.0_131-b11)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.131-b11 mixed mode linux-amd64 
compressed oops)
# Problematic frame:
# C  [liblz4-java3580121503903465201.so+0x5e70]  LZ4_decompress_fast+0xd0
#
# Failed to write core dump. Core dumps have been disabled. To enable core 
dumping, try "ulimit -c unlimited" before starting Java again
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#

---  T H R E A D  ---

Current thread (0x7fce32dad1b0):  JavaThread "CompactionExecutor:9798" 
daemon [_thread_in_native, id=16879, 
stack(0x7fd7ee784000,0x7fd7ee7c5000)]

siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 
0x7fd450c4d000

Registers:
RAX=0x7fcde6560d32, RBX=0x7fd450c4cff9, RCX=0x7fcde6560c7a, 
RDX=0x7fcde6560d3e
RSP=0x7fd7ee7c3160, RBP=0x7fd450c44ae6, RSI=0x7fcde6562ff8, 
RDI=0x00c2
R8 =0x7fcde6562ff4, R9 =0x7fcde6563000, R10=0x, 
R11=0x
R12=0x000c, R13=0x7fd4501cd000, R14=0x7fcde6562ff7, 
R15=0x7fcde6562ffb
RIP=0x7fd80a399e70, EFLAGS=0x00010283, CSGSFS=0x0033, 
ERR=0x0004
  TRAPNO=0x000e

Top of Stack: (sp=0x7fd7ee7c3160)
0x7fd7ee7c3160:   0008 7fd81e21c3d0
0x7fd7ee7c3170:   0004 0001
0x7fd7ee7c3180:   0002 0001
0x7fd7ee7c3190:   0004 0004
0x7fd7ee7c31a0:   0004 0004
0x7fd7ee7c31b0:    
0x7fd7ee7c31c0:    
0x7fd7ee7c31d0:    0001
0x7fd7ee7c31e0:   0002 0003
0x7fd7ee7c31f0:   7fd7ee7c32b8 7fce32dad3a8
0x7fd7ee7c3200:    
0x7fd7ee7c3210:   7fd4501cd000 7fcde6553000
0x7fd7ee7c3220:   00a77ae6 7fd80a39659d
0x7fd7ee7c3230:    dcb8fc9b
0x7fd7ee7c3240:   7fd7ee7c32d0 
0x7fd7ee7c3250:   0006e5c7e4d8 7fd7ee7c32b8
0x7fd7ee7c3260:   7fce32dad1b0 7fd81df2099d
0x7fd7ee7c3270:   7fd7ee7c32a8 
0x7fd7ee7c3280:   0001 
0x7fd7ee7c3290:   0006e5c7e528 7fd81d74df10
0x7fd7ee7c32a0:    0006e5c7e4d8
0x7fd7ee7c32b0:   0006f6c7fbf8 0006f6e957f0
0x7fd7ee7c32c0:   0006e5c7e350 7fd87fff
0x7fd7ee7c32d0:   0006e5c7e528 7fd81fa867e0
0x7fd7ee7c32e0:   00a77ae20001 00a77ae2
0x7fd7ee7c32f0:   0006e5c7e488 0112d5f1
0x7fd7ee7c3300:   dcb8fc9b99ce 000100a77ae6
0x7fd7ee7c3310:   00a814b000a814b4 0006e5c7e4d8
0x7fd7ee7c3320:   0006e5c7e4d8 0006f6a4df38
0x7fd7ee7c3330:   00060001 00067fff
0x7fd7ee7c3340:   008971582c8a 0006189d87852057
0x7fd7ee7c3350:    e5244e71
Instructions: (pc=0x7fd80a399e70)
0x7fd80a399e50:   e4 0f 49 83 fc 0f 0f 84 94 00 00 00 4a 8d 14 20
0x7fd80a399e60:   48 39 f2 0f 87 c0 00 00 00 0f 1f 80 00 00 00 00
0x7fd80a399e70:   48 8b 0b 48 83 c3 08 48 89 08 48 83 c0 08 48 39
0x7fd80a399e80:   c2 77 ed 48 29 d0 48 89 d1 48 29 c3 0f b7 03 48

Register to memory mapping:

RAX=0x7fcde6560d32 is an unknown value
RBX=0x7fd450c4cff9 is an unknown value
RCX=0x7fcde6560c7a is an unknown value
RDX=0x7fcde6560d3e is an unknown value
RSP=0x7fd7ee7c3160 is pointing into the stack for thread: 0x7fce32dad1b0
RBP=0x7fd450c44ae6 is an unknown value
RSI=0x7fcde6562ff8 is an unknown value
RDI=0x00c2 is an unknown value
R8 =0x7fcde6562ff4 is an unknown value
R9 =0x7fcde6563000 is an unknown value
R10=0x is an unknown value
R11=0x is an unknown value
R12=0x000c is an unknown value
R13=0x7fd4501cd000 is an unknown value
R14=0x7fcde6562ff7 is an unknown value
R15=0x7fcde6562ffb is an unknown value


Stack: [0x7fd7ee784000,0x000




[jira] [Created] (CASSANDRA-13757) Cassandra 3.5.0 JVM Segfault Problem While Repair Job is Running

2017-08-11 Thread JIRA
Serhat Rıfat Demircan created CASSANDRA-13757:
-

 Summary: Cassandra 3.5.0 JVM Segfault Problem While Repair Job is 
Running
 Key: CASSANDRA-13757
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13757
 Project: Cassandra
  Issue Type: Bug
 Environment: Operation System: Debian Jessie
Java: Oracle JDK 1.8.0_131
Cassandra: 3.5.0
Reporter: Serhat Rıfat Demircan


We got the following error while a repair job was running on our cluster. One of the 
nodes stopped with the SIGSEGV in LZ4_decompress_fast shown in the crash report above.

[jira] [Commented] (CASSANDRA-13594) Use an ExecutorService for repair commands instead of new Thread(..).start()

2017-08-11 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16123276#comment-16123276
 ] 

Marcus Eriksson commented on CASSANDRA-13594:
-

https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/170/
 ([^13594.png])

> Use an ExecutorService for repair commands instead of new Thread(..).start()
> 
>
> Key: CASSANDRA-13594
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13594
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
> Fix For: 4.x
>
> Attachments: 13594.png
>
>
> Currently when starting a new repair, we create a new Thread and start it 
> immediately
> It would be nice to be able to 1) limit the number of threads and 2) reject 
> starting new repair commands if we are already running too many.
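
As a rough sketch of what the ticket asks for (a bounded pool that rejects new repair commands once too many are already running), something like the following could work; the pool size, class, and method names here are illustrative, not Cassandra's actual code:

```java
import java.util.concurrent.*;

public class RepairExecutorSketch {
    // Bounded pool: at most maxConcurrent repair threads, no queueing, reject when full.
    static ThreadPoolExecutor newRepairExecutor(int maxConcurrent) {
        return new ThreadPoolExecutor(
                0, maxConcurrent,
                60, TimeUnit.SECONDS,              // idle repair threads die after a minute
                new SynchronousQueue<>(),          // no queue: hand off to a thread or reject
                new ThreadPoolExecutor.AbortPolicy()); // throw instead of silently queueing
    }

    // Returns true if the repair was accepted, false if too many are already running.
    static boolean submitRepair(ExecutorService executor, Runnable repairTask) {
        try {
            executor.execute(repairTask);
            return true;
        } catch (RejectedExecutionException e) {
            return false;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadPoolExecutor executor = newRepairExecutor(4);
        CountDownLatch release = new CountDownLatch(1);
        int accepted = 0;
        for (int i = 0; i < 6; i++)
            if (submitRepair(executor, () -> {
                try { release.await(); } catch (InterruptedException ignored) {}
            })) accepted++;
        System.out.println("accepted=" + accepted); // 4 accepted, 2 rejected
        release.countDown();
        executor.shutdown();
    }
}
```

A SynchronousQueue means a task is either handed to a thread immediately or rejected, so callers get fast feedback instead of unbounded thread creation or queueing.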



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-13594) Use an ExecutorService for repair commands instead of new Thread(..).start()

2017-08-11 Thread Marcus Eriksson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-13594:

Attachment: 13594.png

> Use an ExecutorService for repair commands instead of new Thread(..).start()
> 
>
> Key: CASSANDRA-13594
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13594
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
> Fix For: 4.x
>
> Attachments: 13594.png
>
>
> Currently when starting a new repair, we create a new Thread and start it 
> immediately
> It would be nice to be able to 1) limit the number of threads and 2) reject 
> starting new repair commands if we are already running too many.






[jira] [Commented] (CASSANDRA-13664) RangeFetchMapCalculator should not try to optimise 'trivial' ranges

2017-08-11 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16123270#comment-16123270
 ] 

Marcus Eriksson commented on CASSANDRA-13664:
-

rerunning tests
https://circleci.com/gh/krummas/cassandra/68
https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/176

I'll try to grab screenshots once they are finished

> RangeFetchMapCalculator should not try to optimise 'trivial' ranges
> ---
>
> Key: CASSANDRA-13664
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13664
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
> Fix For: 4.x
>
>
> RangeFetchMapCalculator (CASSANDRA-4650) tries to make the number of streams 
> out of each node as even as possible.
> In a typical multi-dc ring the nodes in the dcs are setup using token + 1, 
> creating many tiny ranges. If we only try to optimise over the number of 
> streams, it is likely that the amount of data streamed out of each node is 
> unbalanced.
> We should ignore those trivial ranges and only optimise the big ones, then 
> share the tiny ones over the nodes.
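
A minimal sketch of that idea, assuming a size threshold below which a range counts as trivial; the types, threshold, and greedy assignment here are illustrative and much simpler than the real RangeFetchMapCalculator:

```java
import java.util.*;

public class TrivialRangeSplitSketch {
    // Split ranges into "trivial" (below a size threshold) and "big"; balance only
    // the big ones by streamed size, then round-robin the trivial ones across nodes.
    static Map<String, List<Long>> assign(List<Long> rangeSizes, List<String> nodes,
                                          long trivialThreshold) {
        Map<String, List<Long>> assignment = new HashMap<>();
        for (String n : nodes) assignment.put(n, new ArrayList<>());

        List<Long> trivial = new ArrayList<>(), big = new ArrayList<>();
        for (long r : rangeSizes) (r < trivialThreshold ? trivial : big).add(r);

        // Balance the big ranges: give each, largest first, to the least-loaded node.
        big.sort(Comparator.reverseOrder());
        for (long r : big) {
            String least = Collections.min(nodes, Comparator.comparingLong(
                    n -> assignment.get(n).stream().mapToLong(Long::longValue).sum()));
            assignment.get(least).add(r);
        }
        // Share the trivial ranges round-robin, ignoring their (negligible) sizes.
        for (int i = 0; i < trivial.size(); i++)
            assignment.get(nodes.get(i % nodes.size())).add(trivial.get(i));
        return assignment;
    }
}
```

With two big ranges and many token+1 slivers, each node ends up with one big range plus an even share of slivers, instead of the slivers skewing the stream counts.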






[Cassandra Wiki] Update of "Committers" by AlekseyYeschenko

2017-08-11 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "Committers" page has been changed by AlekseyYeschenko:
https://wiki.apache.org/cassandra/Committers?action=diff&rev1=75&rev2=76

  ||Blake Eggleston ||February 2017 ||Apple || ||
  ||Alex Petrov ||February 2017 ||Datastax || ||
  ||Joel Knighton ||February 2017 || Datastax || ||
- 
+ ||Philip Thompson ||June 2017 || Datastax || ||
  
  {{https://c.statcounter.com/9397521/0/fe557aad/1/|stats}}
  




[jira] [Commented] (CASSANDRA-10726) Read repair inserts should not be blocking

2017-08-11 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16123171#comment-16123171
 ] 

Marcus Eriksson commented on CASSANDRA-10726:
-

btw, I think this should go to 4.0 only, do you agree?

> Read repair inserts should not be blocking
> --
>
> Key: CASSANDRA-10726
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10726
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination
>Reporter: Richard Low
>Assignee: Xiaolong Jiang
> Fix For: 4.x
>
>
> Today, if there’s a digest mismatch in a foreground read repair, the insert 
> to update out of date replicas is blocking. This means, if it fails, the read 
> fails with a timeout. If a node is dropping writes (maybe it is overloaded or 
> the mutation stage is backed up for some other reason), all reads to a 
> replica set could fail. Further, replicas dropping writes get more out of 
> sync so will require more read repair.
> The comment on the code for why the writes are blocking is:
> {code}
> // wait for the repair writes to be acknowledged, to minimize impact on any 
> replica that's
> // behind on writes in case the out-of-sync row is read multiple times in 
> quick succession
> {code}
> but the bad side effect is that reads timeout. Either the writes should not 
> be blocking or we should return success for the read even if the write times 
> out.
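
A minimal sketch of the non-blocking alternative, where the repair write is fired on the mutation stage and the read returns immediately; all names here are illustrative, not Cassandra's actual code:

```java
import java.util.concurrent.*;

public class AsyncReadRepairSketch {
    // Apply the repair mutation in the background instead of blocking the read on its
    // ack. If the write fails or times out, the read has already been answered; we
    // only log the failure.
    static <T> T readWithAsyncRepair(T freshValue, Runnable repairWrite, Executor mutationStage) {
        CompletableFuture.runAsync(repairWrite, mutationStage)
                .exceptionally(t -> {
                    System.err.println("background read repair failed: " + t);
                    return null;
                });
        return freshValue; // returned without waiting for the repair write
    }

    public static void main(String[] args) throws Exception {
        ExecutorService stage = Executors.newSingleThreadExecutor();
        CountDownLatch applied = new CountDownLatch(1);
        String answer = readWithAsyncRepair("row@latest", applied::countDown, stage);
        System.out.println("read returned: " + answer);
        applied.await();           // the repair still happens, just not on the read path
        stage.shutdown();
    }
}
```

The trade-off is the one the quoted code comment worries about: a replica that is behind may serve the stale row again if re-read before the background write lands.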






[jira] [Updated] (CASSANDRA-10726) Read repair inserts should not be blocking

2017-08-11 Thread Marcus Eriksson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-10726:

Fix Version/s: (was: 3.0.x)
   4.x

> Read repair inserts should not be blocking
> --
>
> Key: CASSANDRA-10726
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10726
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination
>Reporter: Richard Low
>Assignee: Xiaolong Jiang
> Fix For: 4.x
>
>
> Today, if there’s a digest mismatch in a foreground read repair, the insert 
> to update out of date replicas is blocking. This means, if it fails, the read 
> fails with a timeout. If a node is dropping writes (maybe it is overloaded or 
> the mutation stage is backed up for some other reason), all reads to a 
> replica set could fail. Further, replicas dropping writes get more out of 
> sync so will require more read repair.
> The comment on the code for why the writes are blocking is:
> {code}
> // wait for the repair writes to be acknowledged, to minimize impact on any 
> replica that's
> // behind on writes in case the out-of-sync row is read multiple times in 
> quick succession
> {code}
> but the bad side effect is that reads timeout. Either the writes should not 
> be blocking or we should return success for the read even if the write times 
> out.






[jira] [Commented] (CASSANDRA-10726) Read repair inserts should not be blocking

2017-08-11 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16123166#comment-16123166
 ] 

Marcus Eriksson commented on CASSANDRA-10726:
-

This LGTM, pushed a branch with some small nits fixed here: 
https://github.com/krummas/cassandra/tree/xiaolong/10726 (please have a look)

running tests:
https://circleci.com/gh/krummas/cassandra/67
https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/175

Will commit if the tests look good and you think my nits are ok

> Read repair inserts should not be blocking
> --
>
> Key: CASSANDRA-10726
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10726
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination
>Reporter: Richard Low
>Assignee: Xiaolong Jiang
> Fix For: 3.0.x
>
>
> Today, if there’s a digest mismatch in a foreground read repair, the insert 
> to update out of date replicas is blocking. This means, if it fails, the read 
> fails with a timeout. If a node is dropping writes (maybe it is overloaded or 
> the mutation stage is backed up for some other reason), all reads to a 
> replica set could fail. Further, replicas dropping writes get more out of 
> sync so will require more read repair.
> The comment on the code for why the writes are blocking is:
> {code}
> // wait for the repair writes to be acknowledged, to minimize impact on any 
> replica that's
> // behind on writes in case the out-of-sync row is read multiple times in 
> quick succession
> {code}
> but the bad side effect is that reads timeout. Either the writes should not 
> be blocking or we should return success for the read even if the write times 
> out.


