[jira] [Updated] (CASSANDRA-15289) bad merge reverted CASSANDRA-14993

2019-08-23 Thread Blake Eggleston (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15289:

Test and Documentation Plan: circle
 Status: Patch Available  (was: In Progress)

|[3.0|https://github.com/bdeggleston/cassandra/tree/15289-3.0]|[circle|https://circleci.com/gh/bdeggleston/workflows/cassandra/tree/15289-3.0]|
|[3.11|https://github.com/bdeggleston/cassandra/tree/15289-3.11]|[circle|https://circleci.com/gh/bdeggleston/workflows/cassandra/tree/15289-3.11]|
|[trunk|https://github.com/bdeggleston/cassandra/tree/15289-trunk]|[circle|https://circleci.com/gh/bdeggleston/workflows/cassandra/tree/15289-trunk]|

> bad merge reverted CASSANDRA-14993
> --
>
> Key: CASSANDRA-15289
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15289
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/SSTable
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15289) bad merge reverted CASSANDRA-14993

2019-08-23 Thread Blake Eggleston (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15289:

 Bug Category: Parent values: Degradation(12984), Level 1 values: Other Exception(12998)
   Complexity: Low Hanging Fruit
  Component/s: Local/SSTable
Discovered By: User Report
Reviewers: Stefan Podkowinski
 Severity: Normal
 Assignee: Blake Eggleston
   Status: Open  (was: Triage Needed)

> bad merge reverted CASSANDRA-14993
> --
>
> Key: CASSANDRA-15289
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15289
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/SSTable
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
>







[jira] [Created] (CASSANDRA-15289) bad merge reverted CASSANDRA-14993

2019-08-23 Thread Blake Eggleston (Jira)
Blake Eggleston created CASSANDRA-15289:
---

 Summary: bad merge reverted CASSANDRA-14993
 Key: CASSANDRA-15289
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15289
 Project: Cassandra
  Issue Type: Bug
Reporter: Blake Eggleston









[jira] [Updated] (CASSANDRA-15279) Remove overly conservative check breaking VirtualTable unit test

2019-08-14 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15279:

Status: Ready to Commit  (was: Review In Progress)

+1

> Remove overly conservative check breaking VirtualTable unit test
> 
>
> Key: CASSANDRA-15279
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15279
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Normal
>
> CASSANDRA-15194 introduced a check to make it easier to debug bad values 
> being passed to SimpleDataSet but it was too aggressive and actually blocks 
> valid cases which are shown in unit tests.






[jira] [Updated] (CASSANDRA-15279) Remove overly conservative check breaking VirtualTable unit test

2019-08-14 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15279:

Reviewers: Blake Eggleston, Blake Eggleston  (was: Blake Eggleston)
   Status: Review In Progress  (was: Patch Available)
Reviewers: Blake Eggleston, Blake Eggleston

> Remove overly conservative check breaking VirtualTable unit test
> 
>
> Key: CASSANDRA-15279
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15279
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Normal
>
> CASSANDRA-15194 introduced a check to make it easier to debug bad values 
> being passed to SimpleDataSet but it was too aggressive and actually blocks 
> valid cases which are shown in unit tests.






[jira] [Updated] (CASSANDRA-15259) Selecting Index by Lowest Mean Column Count Selects Random Index

2019-08-06 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15259:

Source Control Link: 
https://github.com/apache/cassandra/commit/da8d41f497efedf57e335ec2664680da583a3aba
 Status: Resolved  (was: Ready to Commit)
 Resolution: Fixed

+1, committed to 3.0 as 
[da8d41f497efedf57e335ec2664680da583a3aba|https://github.com/apache/cassandra/commit/da8d41f497efedf57e335ec2664680da583a3aba]
 and merged up to trunk. Thanks!

> Selecting Index by Lowest Mean Column Count Selects Random Index
> 
>
> Key: CASSANDRA-15259
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15259
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/2i Index
>Reporter: Jordan West
>Assignee: Jordan West
>Priority: Urgent
> Fix For: 3.0.19, 4.0, 3.11.x
>
>
> {{CassandraIndex}} uses 
> [{{ColumnFamilyStore#getMeanColumns}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/index/internal/CassandraIndex.java#L273],
>  average columns per partition, which always returns the same answer for 
> index CFs because they contain no regular columns and clustering columns 
> aren't included in the count in Cassandra 3.0+.
>  
>  
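To illustrate why a constant estimate makes the choice effectively arbitrary, here is a minimal sketch. The names IndexCandidate and IndexChooser are hypothetical and this is not Cassandra's selection code; it only assumes that the index with the lowest estimate wins. When every candidate reports the same value, the winner is simply whichever one is iterated first.

```java
import java.util.List;

// Hypothetical stand-in for an index plus the per-partition estimate used to rank it.
final class IndexCandidate {
    final String name;
    final long estimate;

    IndexCandidate(String name, long estimate) {
        this.name = name;
        this.estimate = estimate;
    }
}

final class IndexChooser {
    // Returns the candidate with the strictly lowest estimate.
    // On a tie the first candidate encountered is kept, so the result
    // depends on iteration order rather than on selectivity.
    static IndexCandidate lowestEstimate(List<IndexCandidate> candidates) {
        IndexCandidate best = null;
        for (IndexCandidate candidate : candidates) {
            if (best == null || candidate.estimate < best.estimate) {
                best = candidate;
            }
        }
        return best;
    }
}
```

With distinct estimates the result is the same for any input order; with identical estimates it changes whenever the order does.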






[jira] [Updated] (CASSANDRA-15259) Selecting Index by Lowest Mean Column Count Selects Random Index

2019-08-06 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15259:

Status: Ready to Commit  (was: Changes Suggested)

> Selecting Index by Lowest Mean Column Count Selects Random Index
> 
>
> Key: CASSANDRA-15259
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15259
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/2i Index
>Reporter: Jordan West
>Assignee: Jordan West
>Priority: Urgent
> Fix For: 3.0.19, 4.0, 3.11.x
>
>
> {{CassandraIndex}} uses 
> [{{ColumnFamilyStore#getMeanColumns}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/index/internal/CassandraIndex.java#L273],
>  average columns per partition, which always returns the same answer for 
> index CFs because they contain no regular columns and clustering columns 
> aren't included in the count in Cassandra 3.0+.
>  
>  






[jira] [Commented] (CASSANDRA-15259) Selecting Index by Lowest Mean Column Count Selects Random Index

2019-08-06 Thread Blake Eggleston (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16901202#comment-16901202
 ] 

Blake Eggleston commented on CASSANDRA-15259:
-

I’m not 100% sure, but I think the math in both methods reduces to the same 
calculation, in which case I’d prefer {{getMeanRowCount}} for its simplicity.

I do agree this should move into {{CassandraIndex}} though, since it’s pretty 
specific to that use case.

> Selecting Index by Lowest Mean Column Count Selects Random Index
> 
>
> Key: CASSANDRA-15259
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15259
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/2i Index
>Reporter: Jordan West
>Assignee: Jordan West
>Priority: Urgent
> Fix For: 3.0.19, 4.0, 3.11.x
>
>
> {{CassandraIndex}} uses 
> [{{ColumnFamilyStore#getMeanColumns}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/index/internal/CassandraIndex.java#L273],
>  average columns per partition, which always returns the same answer for 
> index CFs because they contain no regular columns and clustering columns 
> aren't included in the count in Cassandra 3.0+.
>  
>  






[jira] [Updated] (CASSANDRA-15259) Selecting Index by Lowest Mean Column Count Selects Random Index

2019-08-05 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15259:

Status: Changes Suggested  (was: Review In Progress)

The {{totalRows}} metadata element was added to the sstable format in 3.0, so 
we’ll still need to use the old method for 2.x sstables. Looks good otherwise.
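
A rough sketch of the fallback being asked for, under stated assumptions: SSTableStatsSketch and MeanRowEstimate are hypothetical names, a negative totalRows stands for "this sstable format has no such field", and legacyEstimate is a placeholder for whatever the older calculation yields. None of this is the actual patch.

```java
import java.util.List;

// Hypothetical per-sstable statistics; not Cassandra's metadata classes.
final class SSTableStatsSketch {
    final long partitions;
    final long totalRows;      // negative when the format predates the field
    final long legacyEstimate; // placeholder for the older estimate

    SSTableStatsSketch(long partitions, long totalRows, long legacyEstimate) {
        this.partitions = partitions;
        this.totalRows = totalRows;
        this.legacyEstimate = legacyEstimate;
    }
}

final class MeanRowEstimate {
    // Mean rows per partition across sstables, using totalRows where it is
    // available and the legacy estimate where it is not.
    static long meanPerPartition(List<SSTableStatsSketch> sstables) {
        long rows = 0;
        long partitions = 0;
        for (SSTableStatsSketch stats : sstables) {
            rows += stats.totalRows >= 0 ? stats.totalRows : stats.legacyEstimate;
            partitions += stats.partitions;
        }
        return partitions == 0 ? 0 : rows / partitions;
    }
}
```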

> Selecting Index by Lowest Mean Column Count Selects Random Index
> 
>
> Key: CASSANDRA-15259
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15259
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/2i Index
>Reporter: Jordan West
>Assignee: Jordan West
>Priority: Urgent
> Fix For: 3.0.19, 4.0, 3.11.x
>
>
> {{CassandraIndex}} uses 
> [{{ColumnFamilyStore#getMeanColumns}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/index/internal/CassandraIndex.java#L273],
>  average columns per partition, which always returns the same answer for 
> index CFs because they contain no regular columns and clustering columns 
> aren't included in the count in Cassandra 3.0+.
>  
>  






[jira] [Updated] (CASSANDRA-15259) Selecting Index by Lowest Mean Column Count Selects Random Index

2019-08-05 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15259:

Status: Review In Progress  (was: Patch Available)

> Selecting Index by Lowest Mean Column Count Selects Random Index
> 
>
> Key: CASSANDRA-15259
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15259
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/2i Index
>Reporter: Jordan West
>Assignee: Jordan West
>Priority: Urgent
> Fix For: 3.0.19, 4.0, 3.11.x
>
>
> {{CassandraIndex}} uses 
> [{{ColumnFamilyStore#getMeanColumns}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/index/internal/CassandraIndex.java#L273],
>  average columns per partition, which always returns the same answer for 
> index CFs because they contain no regular columns and clustering columns 
> aren't included in the count in Cassandra 3.0+.
>  
>  






[jira] [Updated] (CASSANDRA-15259) Selecting Index by Lowest Mean Column Count Selects Random Index

2019-08-05 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15259:

Reviewers: Blake Eggleston

> Selecting Index by Lowest Mean Column Count Selects Random Index
> 
>
> Key: CASSANDRA-15259
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15259
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/2i Index
>Reporter: Jordan West
>Assignee: Jordan West
>Priority: Urgent
> Fix For: 3.0.19, 4.0, 3.11.x
>
>
> {{CassandraIndex}} uses 
> [{{ColumnFamilyStore#getMeanColumns}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/index/internal/CassandraIndex.java#L273],
>  average columns per partition, which always returns the same answer for 
> index CFs because they contain no regular columns and clustering columns 
> aren't included in the count in Cassandra 3.0+.
>  
>  






[jira] [Updated] (CASSANDRA-15123) Avoid keeping sstables marked compacting forever when user defined compaction gets interrupted

2019-07-31 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15123:

Status: Ready to Commit  (was: Changes Suggested)

+1

> Avoid keeping sstables marked compacting forever when user defined compaction 
> gets interrupted
> --
>
> Key: CASSANDRA-15123
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15123
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Normal
>
> When we have both repaired + unrepaired data on a node, we create multiple 
> compaction tasks and run them serially. If one of those tasks gets 
> interrupted or throws an exception, we will keep sstables in the other tasks 
> marked as compacting forever.






[jira] [Updated] (CASSANDRA-15123) Avoid keeping sstables marked compacting forever when user defined compaction gets interrupted

2019-07-30 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15123:

Status: Changes Suggested  (was: Review In Progress)

For trunk, I made a few adjustments to the CompactionTaskCollection class in a 
branch 
[here|https://github.com/bdeggleston/cassandra/tree/marcuse/15123-trunk], let 
me know what you think. The main motivation was improving how empty collections 
are created / identified to eliminate potential problems with instances created 
with empty / null collections, but I also renamed the class to be more 
consistent with our other extended collection classes.

For 3.11, we should either catch Throwable, or always call 
{{LifecycleTransaction#close}} in a finally block since it’s a noop on 
committed and aborted txns (my preference).
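
The finally-block option could look roughly like the sketch below. SketchTxn and SerialTaskRunner are hypothetical stand-ins, not Cassandra classes; the only behaviour modelled is the one stated above, that close is a noop once a transaction has been committed or aborted. The sketch also assumes one transaction per task and lists of equal length.

```java
import java.util.List;

// Hypothetical transaction: close() aborts only if still in progress,
// so calling it after commit() or abort() changes nothing.
final class SketchTxn implements AutoCloseable {
    enum State { IN_PROGRESS, COMMITTED, ABORTED }

    private State state = State.IN_PROGRESS;

    State state() { return state; }
    void commit() { state = State.COMMITTED; }
    void abort() { state = State.ABORTED; }

    @Override
    public void close() {
        if (state == State.IN_PROGRESS) {
            abort();
        }
    }
}

final class SerialTaskRunner {
    interface Task { void run(SketchTxn txn); }

    // Runs tasks one after another. If any task throws, the finally block
    // still closes every transaction, including those of tasks that never ran.
    static void runAll(List<Task> tasks, List<SketchTxn> txns) {
        try {
            for (int i = 0; i < tasks.size(); i++) {
                tasks.get(i).run(txns.get(i));
                txns.get(i).commit();
            }
        } finally {
            for (SketchTxn txn : txns) {
                txn.close();
            }
        }
    }
}
```

Because the cleanup sits in finally, it also runs for Errors and other Throwables without having to catch them explicitly.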

> Avoid keeping sstables marked compacting forever when user defined compaction 
> gets interrupted
> --
>
> Key: CASSANDRA-15123
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15123
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Normal
>
> When we have both repaired + unrepaired data on a node, we create multiple 
> compaction tasks and run them serially. If one of those tasks gets 
> interrupted or throws an exception, we will keep sstables in the other tasks 
> marked as compacting forever.






[jira] [Updated] (CASSANDRA-15123) Avoid keeping sstables marked compacting forever when user defined compaction gets interrupted

2019-07-30 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15123:

Status: Review In Progress  (was: Patch Available)

> Avoid keeping sstables marked compacting forever when user defined compaction 
> gets interrupted
> --
>
> Key: CASSANDRA-15123
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15123
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Normal
>
> When we have both repaired + unrepaired data on a node, we create multiple 
> compaction tasks and run them serially. If one of those tasks gets 
> interrupted or throws an exception, we will keep sstables in the other tasks 
> marked as compacting forever.






[jira] [Updated] (CASSANDRA-15123) Avoid keeping sstables marked compacting forever when user defined compaction gets interrupted

2019-07-30 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15123:

Reviewers: Blake Eggleston

> Avoid keeping sstables marked compacting forever when user defined compaction 
> gets interrupted
> --
>
> Key: CASSANDRA-15123
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15123
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Normal
>
> When we have both repaired + unrepaired data on a node, we create multiple 
> compaction tasks and run them serially. If one of those tasks gets 
> interrupted or throws an exception, we will keep sstables in the other tasks 
> marked as compacting forever.






[jira] [Updated] (CASSANDRA-15198) Preventing RuntimeException when the username or password is empty

2019-07-08 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15198:

Source Control Link: 
https://github.com/apache/cassandra/commit/177a8e91e3f0ef85e2bc3f64b0e566ace6330071
  Since Version: 3.0.0
 Status: Resolved  (was: Ready to Commit)
 Resolution: Fixed

Committed to 3.0 as 
[177a8e91e3f0ef85e2bc3f64b0e566ace6330071|https://github.com/apache/cassandra/commit/177a8e91e3f0ef85e2bc3f64b0e566ace6330071]
 and merged up to trunk. Thanks for the patch [~gzh1992n] and nice work on the 
tests!

> Preventing RuntimeException when the username or password is empty
> --
>
> Key: CASSANDRA-15198
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15198
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Authorization
>Reporter: Zephyr Guo
>Assignee: Zephyr Guo
>Priority: Normal
>  Labels: pull-request-available
> Fix For: 4.0, 3.0.x, 3.11.x
>
> Attachments: CASSANDRA-15198-v1.patch, empty_username_error.jpg
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
>  !empty_username_error.jpg! 
> Although this does not affect the service, it's necessary to improve code 
> robustness.






[jira] [Updated] (CASSANDRA-15198) Preventing RuntimeException when the username or password is empty

2019-07-08 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15198:

Status: Ready to Commit  (was: Review In Progress)

> Preventing RuntimeException when the username or password is empty
> --
>
> Key: CASSANDRA-15198
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15198
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Authorization
>Reporter: Zephyr Guo
>Assignee: Zephyr Guo
>Priority: Normal
>  Labels: pull-request-available
> Fix For: 4.0, 3.0.x, 3.11.x
>
> Attachments: CASSANDRA-15198-v1.patch, empty_username_error.jpg
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
>  !empty_username_error.jpg! 
> Although this does not affect the service, it's necessary to improve code 
> robustness.






[jira] [Updated] (CASSANDRA-15198) Preventing RuntimeException when the username or password is empty

2019-07-08 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15198:

Status: Review In Progress  (was: Patch Available)

> Preventing RuntimeException when the username or password is empty
> --
>
> Key: CASSANDRA-15198
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15198
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Authorization
>Reporter: Zephyr Guo
>Assignee: Zephyr Guo
>Priority: Normal
>  Labels: pull-request-available
> Fix For: 4.0, 3.0.x, 3.11.x
>
> Attachments: CASSANDRA-15198-v1.patch, empty_username_error.jpg
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
>  !empty_username_error.jpg! 
> Although this does not affect the service, it's necessary to improve code 
> robustness.






[jira] [Assigned] (CASSANDRA-15198) Preventing RuntimeException when the username or password is empty

2019-07-08 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston reassigned CASSANDRA-15198:
---

Assignee: Zephyr Guo  (was: Blake Eggleston)

> Preventing RuntimeException when the username or password is empty
> --
>
> Key: CASSANDRA-15198
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15198
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Authorization
>Reporter: Zephyr Guo
>Assignee: Zephyr Guo
>Priority: Normal
>  Labels: pull-request-available
> Fix For: 4.0, 3.0.x, 3.11.x
>
> Attachments: CASSANDRA-15198-v1.patch, empty_username_error.jpg
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
>  !empty_username_error.jpg! 
> Although this does not affect the service, it's necessary to improve code 
> robustness.






[jira] [Assigned] (CASSANDRA-15198) Preventing RuntimeException when the username or password is empty

2019-07-08 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston reassigned CASSANDRA-15198:
---

Assignee: Blake Eggleston  (was: Zephyr Guo)

> Preventing RuntimeException when the username or password is empty
> --
>
> Key: CASSANDRA-15198
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15198
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Authorization
>Reporter: Zephyr Guo
>Assignee: Blake Eggleston
>Priority: Normal
>  Labels: pull-request-available
> Fix For: 4.0, 3.0.x, 3.11.x
>
> Attachments: CASSANDRA-15198-v1.patch, empty_username_error.jpg
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
>  !empty_username_error.jpg! 
> Although this does not affect the service, it's necessary to improve code 
> robustness.






[jira] [Updated] (CASSANDRA-15198) Preventing RuntimeException when the username or password is empty

2019-07-03 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15198:

Fix Version/s: 3.11.x
   3.0.x
   4.0

> Preventing RuntimeException when the username or password is empty
> --
>
> Key: CASSANDRA-15198
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15198
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Authorization
>Reporter: Zephyr Guo
>Assignee: Zephyr Guo
>Priority: Normal
>  Labels: pull-request-available
> Fix For: 4.0, 3.0.x, 3.11.x
>
> Attachments: CASSANDRA-15198-v1.patch, empty_username_error.jpg
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
>  !empty_username_error.jpg! 
> Although this does not affect the service, it's necessary to improve code 
> robustness.






[jira] [Commented] (CASSANDRA-15198) Preventing RuntimeException when the username or password is empty

2019-07-03 Thread Blake Eggleston (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16878143#comment-16878143
 ] 

Blake Eggleston commented on CASSANDRA-15198:
-

|[3.0|https://github.com/bdeggleston/cassandra/tree/15198-3.0]|[circle|https://circleci.com/gh/bdeggleston/workflows/cassandra/tree/15198-3.0]|
|[3.11|https://github.com/bdeggleston/cassandra/tree/15198-3.11]|[circle|https://circleci.com/gh/bdeggleston/workflows/cassandra/tree/15198-3.11]|
|[trunk|https://github.com/bdeggleston/cassandra/tree/15198-trunk]|[circle|https://circleci.com/gh/bdeggleston/workflows/cassandra/tree/15198-trunk]|

> Preventing RuntimeException when the username or password is empty
> --
>
> Key: CASSANDRA-15198
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15198
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Authorization
>Reporter: Zephyr Guo
>Assignee: Zephyr Guo
>Priority: Normal
>  Labels: pull-request-available
> Attachments: CASSANDRA-15198-v1.patch, empty_username_error.jpg
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
>  !empty_username_error.jpg! 
> Although this does not affect the service, it's necessary to improve code 
> robustness.






[jira] [Updated] (CASSANDRA-15198) Preventing RuntimeException when the username or password is empty

2019-07-03 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15198:

Reviewers: Blake Eggleston

> Preventing RuntimeException when the username or password is empty
> --
>
> Key: CASSANDRA-15198
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15198
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Authorization
>Reporter: Zephyr Guo
>Assignee: Zephyr Guo
>Priority: Normal
>  Labels: pull-request-available
> Attachments: CASSANDRA-15198-v1.patch, empty_username_error.jpg
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
>  !empty_username_error.jpg! 
> Although this does not affect the service, it's necessary to improve code 
> robustness.






[jira] [Updated] (CASSANDRA-15198) Preventing RuntimeException when the username or password is empty

2019-07-03 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15198:

Impacts: Security  (was: None)
Test and Documentation Plan: unit testing new validation
 Status: Patch Available  (was: Open)

> Preventing RuntimeException when the username or password is empty
> --
>
> Key: CASSANDRA-15198
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15198
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Authorization
>Reporter: Zephyr Guo
>Assignee: Zephyr Guo
>Priority: Normal
>  Labels: pull-request-available
> Attachments: CASSANDRA-15198-v1.patch, empty_username_error.jpg
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
>  !empty_username_error.jpg! 
> Although this does not affect the service, it's necessary to improve code 
> robustness.






[jira] [Commented] (CASSANDRA-15176) Fix PagingState deserialization when the state was serialized using protocol version different from current session's

2019-06-25 Thread Blake Eggleston (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16872606#comment-16872606
 ] 

Blake Eggleston commented on CASSANDRA-15176:
-

+1

> Fix PagingState deserialization when the state was serialized using protocol 
> version different from current session's
> -
>
> Key: CASSANDRA-15176
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15176
> Project: Cassandra
>  Issue Type: Bug
>  Components: Messaging/Client
>Reporter: Aleksey Yeschenko
>Assignee: Aleksey Yeschenko
>Priority: Normal
>
> 3.0 and native protocol V4 introduced a change to how {{PagingState}} is 
> serialized. Unfortunately that can break requests during upgrades: since 
> paging states are opaque, it's possible for a client to receive a paging 
> state encoded as V3 on a 2.1 node, and then send it to a 3.0 node on a V4 
> session. The version of the current session will be used to deserialize the 
> paging state, instead of the actual version used to serialize it, and the 
> request will fail.
> This is obviously sub-optimal, but also avoidable. This JIRA fixes one half 
> of the problem: 3.0 failing to deserialize 'mislabeled' paging states. We can 
> do this by inspecting the byte buffer to verify if it's been indeed 
> serialized with the protocol version used by the session, and if not, use the 
> other method of deserialization.
> It should be noted that we list this as a 'known limitation' somewhere, but 
> really this is an upgrade-blocking bug for some users of C*.
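
The "inspect the byte buffer" idea can be sketched as follows. The two layouts here are toy formats invented for illustration (a LEGACY layout with 2-byte length prefixes and a 4-byte trailer, a MODERN layout with 1-byte length prefixes and an 8-byte trailer); they are assumptions, not the real PagingState encodings, and PagingStateSketch is a hypothetical name. The point is only the technique: walk the buffer under the session's layout, and if it does not fit exactly, try the other one.

```java
import java.nio.ByteBuffer;

final class PagingStateSketch {
    enum Format { LEGACY, MODERN }

    // Toy LEGACY layout: [short len][key][short len][mark][int remaining]
    static ByteBuffer encodeLegacy(byte[] key, byte[] mark, int remaining) {
        ByteBuffer out = ByteBuffer.allocate(2 + key.length + 2 + mark.length + 4);
        out.putShort((short) key.length).put(key);
        out.putShort((short) mark.length).put(mark);
        out.putInt(remaining);
        out.flip();
        return out;
    }

    // Toy MODERN layout: [byte len][key][byte len][mark][int remaining][int remainingInPartition]
    static ByteBuffer encodeModern(byte[] key, byte[] mark, int remaining, int remainingInPartition) {
        ByteBuffer out = ByteBuffer.allocate(1 + key.length + 1 + mark.length + 8);
        out.put((byte) key.length).put(key);
        out.put((byte) mark.length).put(mark);
        out.putInt(remaining).putInt(remainingInPartition);
        out.flip();
        return out;
    }

    // True if the buffer can be walked end to end under the given layout:
    // both length-prefixed fields fit, and exactly the trailer is left over.
    static boolean walks(ByteBuffer state, Format format) {
        ByteBuffer in = state.duplicate(); // leave the caller's position untouched
        int lengthBytes = format == Format.LEGACY ? 2 : 1;
        int trailer = format == Format.LEGACY ? 4 : 8;
        for (int field = 0; field < 2; field++) {
            if (in.remaining() < lengthBytes) return false;
            int len = lengthBytes == 2 ? in.getShort() & 0xFFFF : in.get() & 0xFF;
            if (in.remaining() < len) return false;
            in.position(in.position() + len);
        }
        return in.remaining() == trailer;
    }

    // Prefer the session's layout; fall back to the other one if it does not fit.
    static Format detect(ByteBuffer state, Format sessionFormat) {
        if (walks(state, sessionFormat)) return sessionFormat;
        Format other = sessionFormat == Format.LEGACY ? Format.MODERN : Format.LEGACY;
        if (walks(state, other)) return other;
        throw new IllegalArgumentException("paging state matches neither layout");
    }
}
```

A structural check like this is a heuristic: some byte sequences could be valid under both layouts, which is why the session's own layout is tried first.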






[jira] [Commented] (CASSANDRA-15108) Support building Cassandra with JDK 11

2019-06-20 Thread Blake Eggleston (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16868710#comment-16868710
 ] 

Blake Eggleston commented on CASSANDRA-15108:
-

bq. The change here removed the additional source folder for Java 11 specific 
stuff, to be able to add support for example for Direct-I/O (CASSANDRA-14466) 
or use the new ByteBuffer.mismatch().

We can add separate source directories in the future if they’re needed, 
although I think we should avoid them. CASSANDRA-14466 won’t be making it into 
4.0, and it seems plausible that the next major release will only support 
java11+.

bq. C* built w/ Java 11 doesn't work on Java 8 ((Byte)Buffer in particular)

The intent is that a java 8 build would run on both java 8 and java 11 without 
requiring java 11 during compilation. Then if you wanted to build with java 11 
you could, but it would only run on java 11+.

> Support building Cassandra with JDK 11
> --
>
> Key: CASSANDRA-15108
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15108
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Build
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 4.0
>
>
> With the changes in java 8 support and licensing, we should be able to build 
> and run Cassandra with java 11.






[jira] [Updated] (CASSANDRA-14516) filter sstables by min/max clustering bounds during reads

2019-05-20 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-14516:

Resolution: Fixed
Status: Resolved  (was: Open)

Thanks for looking into this [~n.v.harikrishna]. Sorry for the false alarm.

> filter sstables by min/max clustering bounds during reads
> -
>
> Key: CASSANDRA-14516
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14516
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Local Write-Read Paths
>Reporter: Blake Eggleston
>Assignee: Venkata Harikrishna Nukala
>Priority: Normal
> Fix For: 4.0
>
>
> In SinglePartitionReadCommand, we don't filter out sstables whose min/max 
> clustering bounds don't intersect with the clustering bounds being queried. 
> This causes us to do extra work on the read path.
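
The filtering described here comes down to an interval-overlap test. A minimal sketch in Python, assuming plain integers stand in for clustering values and that both bounds are inclusive; `SSTableStats` and the function names are hypothetical, and the real read path has further considerations (deletions, for example) that this ignores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SSTableStats:
    name: str
    min_clustering: int  # smallest clustering value stored in the sstable
    max_clustering: int  # largest clustering value stored in the sstable


def intersects(sstable: SSTableStats, query_start: int, query_end: int) -> bool:
    """True if the sstable's [min, max] overlaps the inclusive query range."""
    return (sstable.min_clustering <= query_end
            and query_start <= sstable.max_clustering)


def sstables_to_read(sstables, query_start, query_end):
    """Drop sstables that cannot contain any row in the queried range."""
    return [s for s in sstables if intersects(s, query_start, query_end)]
```

For sstables covering 0..10, 20..30 and 5..25, a query for 12..18 would only need to touch the third one.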






[jira] [Commented] (CASSANDRA-14459) DynamicEndpointSnitch should never prefer latent nodes

2019-05-15 Thread Blake Eggleston (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16840757#comment-16840757
 ] 

Blake Eggleston commented on CASSANDRA-14459:
-

I finished my first round of review of the implementation. I still need to 
spend some time looking at the tests and docs.

Some general notes:
 * I don't think we need a dynamicsnitch package, the 2 implementations could 
just live in the locator package.
 * I can't find anywhere that we validate the various dynamic snitch values? I 
didn't look super hard, but if we're not, it would be good to check people are 
using sane values on startup and jmx update.
 * It's probably a good time to rebase onto current trunk.

ScheduledExecutors:
 * {{getOrCreateSharedExecutor}} could be private
 * shutdown times accumulate in the wait section, i.e. if we have 2 executors 
with a shutdown time of 2 seconds, we'll wait up to 4 seconds for them to stop.
 * we should validate shutdown hasn't been called in 
{{getOrCreateSharedExecutor}}
 * I don't see any benefit to making this class lock free. Making the 
{{getOrCreateSharedExecutor}} and {{shutdownAndWait}} methods synchronized and 
just using a normal hash map for {{executors}} would make this class easier to 
reason about. As it stands, you could argue there's some raciness with creating 
and shutting down at the same time, although it's unlikely to be a problem in 
practice. I do think I might be borderline nitpicking here, though.
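
One way to keep the total wait bounded, rather than letting per-executor timeouts add up, is to compute a single deadline and give each wait only the time that is left. A rough Python sketch follows; the `waiters` interface (a callable that takes a timeout in seconds and returns whether its executor stopped) is an assumption made for illustration, not the API of the class under review.

```python
import time


def wait_all(waiters, total_timeout, clock=time.monotonic):
    """Wait for every waiter against one shared deadline.

    Each waiter is a callable taking a timeout in seconds and returning
    True if its executor stopped in time (hypothetical interface).
    """
    deadline = clock() + total_timeout
    all_stopped = True
    for wait in waiters:
        # Only the time left before the shared deadline is handed out,
        # so two 2-second waits cannot add up to 4 seconds.
        remaining = max(0.0, deadline - clock())
        if not wait(remaining):
            all_stopped = False
    return all_stopped
```

The `clock` parameter is only there so the behaviour can be checked with a fake clock.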

StorageService
 * {{doLocalReadTest}} method is unused

MessagingService
 * Fix class declaration indentation
 * remove unused import

DynamicEndpointSnitch
 * I don't think we need to check {{logger.isTraceEnabled}} before calling 
{{logger.trace()}}? At least I don't see any toString computation that would 
happen in the calls.
 * The class hierarchy could be improved a bit. There's code in 
{{DynamicEndpointSnitchHistogram}} that's also used in the legacy snitch, and 
code in {{DynamicEndpointSnitch}} that's only used in 
{{DynamicEndpointSnitchHistogram}}. The boundary between DynamicEndpointSnitch 
and DynamicEndpointSnitchHistogram in particular feels kind of arbitrary.

DynamicEndpointLegacySnitch
 * If we're going to keep the old behavior around as a failsafe (and we 
should), I think we should avoid improving it by changing the reset behavior. 
Only resetting under some situations intuitively feels like the right thing to 
do, but it would suck if there were unforeseen problems with it that made it a 
regression from the 3.x behavior.

> DynamicEndpointSnitch should never prefer latent nodes
> --
>
> Key: CASSANDRA-14459
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14459
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/Coordination
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Low
>  Labels: 4.0-feature-freeze-review-requested, 
> pull-request-available
> Fix For: 4.x
>
>  Time Spent: 25.5h
>  Remaining Estimate: 0h
>
> The DynamicEndpointSnitch has two unfortunate behaviors that allow it to 
> provide latent hosts as replicas:
>  # Loses all latency information when Cassandra restarts
>  # Clears latency information entirely every ten minutes (by default), 
> allowing global queries to be routed to _other datacenters_ (and local 
> queries cross racks/azs)
> This means that the first few queries after restart/reset could be quite slow 
> compared to average latencies. I propose we solve this by resetting to the 
> minimum observed latency instead of completely clearing the samples and 
> extending the {{isLatencyForSnitch}} idea to a three state variable instead 
> of two, in particular {{YES}}, {{NO}}, {{MAYBE}}. This extension allows 
> {{EchoMessages}} and {{PingMessages}} to send {{MAYBE}} indicating that the 
> DS should use those measurements if it only has one or fewer samples for a 
> host. This fixes both problems because on process restart we send out 
> {{PingMessages}} / {{EchoMessages}} as part of startup, and we would reset to 
> effectively the RTT of the hosts (also at that point normal gossip 
> {{EchoMessages}} have an opportunity to add an additional latency 
> measurement).
> This strategy also nicely deals with the "a host got slow but now it's fine" 
> problem that the DS resets were (afaik) designed to stop because the 
> {{EchoMessage}} ping latency will count only after the reset for that host. 
> Ping latency is a more reasonable lower bound on host latency (as opposed to 
> status quo of zero).
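
The two proposed changes (a three-state flag and resetting to the minimum observed latency) can be sketched as below. This is an illustrative Python model only; `LatencyUse` and `LatencyTracker` are hypothetical names, and latencies are kept as a plain list per host rather than whatever sampling structure the snitch actually uses.

```python
from enum import Enum


class LatencyUse(Enum):
    YES = "yes"      # always record the measurement
    NO = "no"        # never record the measurement
    MAYBE = "maybe"  # record only if the host has one or fewer samples


class LatencyTracker:
    def __init__(self):
        self.samples = {}  # host -> list of latencies in ms

    def record(self, host, latency_ms, use):
        current = self.samples.setdefault(host, [])
        if use is LatencyUse.YES or (use is LatencyUse.MAYBE and len(current) <= 1):
            current.append(latency_ms)

    def reset(self):
        """Keep each host's minimum observed latency instead of clearing."""
        for host, current in self.samples.items():
            if current:
                self.samples[host] = [min(current)]
```

After a reset each host holds a single sample, so the next `MAYBE` measurement (a ping or echo) is accepted and the host never starts again from zero.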




[jira] [Updated] (CASSANDRA-15108) Support building Cassandra with JDK 11

2019-05-10 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15108:

Fix Version/s: 4.0
   Status: Resolved  (was: Ready to Commit)
   Resolution: Fixed

Thanks, committed to trunk as 
[aa762c6d5253e0cc2947d3bf2b6149197e106036|https://github.com/apache/cassandra/commit/aa762c6d5253e0cc2947d3bf2b6149197e106036],
 and dtests master as 
[b1167bef169d657dbd7d1eb09c4d4a6fa8ecf6a9|https://github.com/apache/cassandra-dtest/commit/b1167bef169d657dbd7d1eb09c4d4a6fa8ecf6a9]

> Support building Cassandra with JDK 11
> --
>
> Key: CASSANDRA-15108
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15108
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Build
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 4.0
>
>
> With the changes in java 8 support and licensing, we should be able to build 
> and run Cassandra with java 11.






[jira] [Commented] (CASSANDRA-15108) Support building Cassandra with JDK 11

2019-05-10 Thread Blake Eggleston (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16837436#comment-16837436
 ] 

Blake Eggleston commented on CASSANDRA-15108:
-

[~djoshi3] {{computeIfAbsent}} doesn't work for this situation. Since JMX 
metrics register themselves in their ctor, we need to create the metric exactly 
once, otherwise we'll get duplicate name exceptions (the problem the 
synchronization is solving). Although computeIfAbsent is thread safe in the 
context of the map, it uses compare and swap to add the computed value to the 
map. This means it eagerly allocates new metric instances, which can cause the 
jmx name collision we're trying to avoid if multiple calls interleave.
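
The "create exactly once" requirement can be illustrated with a plain lock around the check and the construction. This is a language-neutral sketch written in Python, not the patch itself: `Metric`, `MetricRegistry` and the module-level name set (a stand-in for the JMX name registry) are all hypothetical.

```python
import threading


class DuplicateNameError(Exception):
    pass


_registered_names = set()  # stand-in for the JMX name registry
_names_lock = threading.Lock()


class Metric:
    """Registers its name in the constructor, so it must be built once."""

    def __init__(self, name):
        with _names_lock:
            if name in _registered_names:
                raise DuplicateNameError(name)
            _registered_names.add(name)
        self.name = name


class MetricRegistry:
    def __init__(self):
        self._metrics = {}
        self._lock = threading.Lock()

    def get_or_create(self, name):
        # The lookup and the construction happen under one lock, so the
        # constructor runs at most once per name even under contention.
        with self._lock:
            metric = self._metrics.get(name)
            if metric is None:
                metric = Metric(name)
                self._metrics[name] = metric
            return metric
```

A construct-then-swap approach would instead build a candidate in every racing caller and discard the losers, which is exactly where a constructor with side effects causes trouble.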

> Support building Cassandra with JDK 11
> --
>
> Key: CASSANDRA-15108
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15108
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Build
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
>
> With the changes in java 8 support and licensing, we should be able to build 
> and run Cassandra with java 11.






[jira] [Commented] (CASSANDRA-15108) Support building Cassandra with JDK 11

2019-05-09 Thread Blake Eggleston (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836725#comment-16836725
 ] 

Blake Eggleston commented on CASSANDRA-15108:
-

That would make sense. I half reverted that 
[here|https://github.com/bdeggleston/cassandra/commit/52d0903eed4c7eac8ab1a487c0110032aba9b81b].
 Now we do realclean on the initial build, and do no cleans on the test runs. 
According to the comments, clean was only added to prevent trying to run java 
11 compiled tests in a java 8 environment, which should no longer be an issue 
with the split workspace in this patch.

|[trunk|https://github.com/bdeggleston/cassandra/tree/15108]|[j8 circle|https://circleci.com/workflow-run/f983d849-3678-4490-9029-45c7af3283ac]|[j11 circle|https://circleci.com/workflow-run/b7ebddba-b528-4e7e-a723-591a9159ca48]|

 

> Support building Cassandra with JDK 11
> --
>
> Key: CASSANDRA-15108
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15108
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Build
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
>
> With the changes in java 8 support and licensing, we should be able to build 
> and run Cassandra with java 11.






[jira] [Commented] (CASSANDRA-15108) Support building Cassandra with JDK 11

2019-05-09 Thread Blake Eggleston (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836692#comment-16836692
 ] 

Blake Eggleston commented on CASSANDRA-15108:
-

What kind of failures was realclean causing? I'd added that because random test 
runs were using the wrong byteman version and failing.

> Support building Cassandra with JDK 11
> --
>
> Key: CASSANDRA-15108
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15108
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Build
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
>
> With the changes in java 8 support and licensing, we should be able to build 
> and run Cassandra with java 11.






[jira] [Assigned] (CASSANDRA-14883) Let Cassandra support the new JVM, Eclipse Openj9.

2019-05-08 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston reassigned CASSANDRA-14883:
---

Assignee: (was: Blake Eggleston)

> Let Cassandra support the new JVM, Eclipse Openj9.
> --
>
> Key: CASSANDRA-14883
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14883
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Packaging
> Environment: jdk8u192-b12_openj9-0.11.0
> cassandra 4.0.0_beta_20181109_build
>Reporter: Lee Sangboo
>Priority: Normal
> Fix For: 4.0.x
>
> Attachments: jamm-0.3.2.jar, jamm.zip
>
>
> Cassandra does not currently support the new JVM, Eclipse Openj9. In internal 
> testing, Openj9 outperforms Hotspot. I have deployed a modified jamm library 
> that has a problem with the current startup, but when I started Cassandra, I 
> got a log message saying "Non-Oracle JVM detected. Some features, such as 
> unimported compact SSTables, may not work as intended." If there is no 
> problem, I would also like to delete the above message.






[jira] [Assigned] (CASSANDRA-14883) Let Cassandra support the new JVM, Eclipse Openj9.

2019-05-08 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston reassigned CASSANDRA-14883:
---

Assignee: Blake Eggleston

> Let Cassandra support the new JVM, Eclipse Openj9.
> --
>
> Key: CASSANDRA-14883
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14883
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Packaging
> Environment: jdk8u192-b12_openj9-0.11.0
> cassandra 4.0.0_beta_20181109_build
>Reporter: Lee Sangboo
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 4.0.x
>
> Attachments: jamm-0.3.2.jar, jamm.zip
>
>
> Cassandra does not currently support the new JVM, Eclipse Openj9. In internal 
> testing, Openj9 outperforms Hotspot. I have deployed a modified jamm library 
> that has a problem with the current startup, but when I started Cassandra, I 
> got a log message saying "Non-Oracle JVM detected. Some features, such as 
> unimported compact SSTables, may not work as intended." If there is no 
> problem, I would also like to delete the above message.






[jira] [Comment Edited] (CASSANDRA-14883) Let Cassandra support the new JVM, Eclipse Openj9.

2019-05-08 Thread Blake Eggleston (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16835874#comment-16835874
 ] 

Blake Eggleston edited comment on CASSANDRA-14883 at 5/8/19 8:12 PM:
-

[~avermeerbergen] do you see this log message (logged at INFO) when starting 
your Cassandra 3.x install with OpenJ9?

 
{quote}Cannot initialize un-mmaper. (Are you using a non-Oracle JVM?) Compacted 
data files will not be removed promptly. Consider using an Oracle JVM or using 
standard disk access mode
{quote}


was (Author: bdeggleston):
[~avermeerbergen] do you see this message when starting your Cassandra 3.x 
install with OpenJ9?

 
{quote}Cannot initialize un-mmaper. (Are you using a non-Oracle JVM?) Compacted 
data files will not be removed promptly. Consider using an Oracle JVM or using 
standard disk access mode{quote}

> Let Cassandra support the new JVM, Eclipse Openj9.
> --
>
> Key: CASSANDRA-14883
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14883
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Packaging
> Environment: jdk8u192-b12_openj9-0.11.0
> cassandra 4.0.0_beta_20181109_build
>Reporter: Lee Sangboo
>Priority: Normal
> Fix For: 4.0.x
>
> Attachments: jamm-0.3.2.jar, jamm.zip
>
>
> Cassandra does not currently support the new JVM, Eclipse Openj9. In internal 
> testing, Openj9 outperforms Hotspot. I have deployed a modified jamm library 
> that has a problem with the current startup, but when I started Cassandra, I 
> got a log message saying "Non-Oracle JVM detected. Some features, such as 
> unimported compact SSTables, may not work as intended." If there is no 
> problem, I would also like to delete the above message.






[jira] [Commented] (CASSANDRA-14883) Let Cassandra support the new JVM, Eclipse Openj9.

2019-05-08 Thread Blake Eggleston (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16835874#comment-16835874
 ] 

Blake Eggleston commented on CASSANDRA-14883:
-

[~avermeerbergen] do you see this message when starting your Cassandra 3.x 
install with OpenJ9?

 
{quote}Cannot initialize un-mmaper. (Are you using a non-Oracle JVM?) Compacted 
data files will not be removed promptly. Consider using an Oracle JVM or using 
standard disk access mode{quote}

> Let Cassandra support the new JVM, Eclipse Openj9.
> --
>
> Key: CASSANDRA-14883
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14883
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Packaging
> Environment: jdk8u192-b12_openj9-0.11.0
> cassandra 4.0.0_beta_20181109_build
>Reporter: Lee Sangboo
>Priority: Normal
> Fix For: 4.0.x
>
> Attachments: jamm-0.3.2.jar, jamm.zip
>
>
> Cassandra does not currently support the new JVM, Eclipse Openj9. In internal 
> testing, Openj9 outperforms Hotspot. I have deployed a modified jamm library 
> that has a problem with the current startup, but when I started Cassandra, I 
> got a log message saying "Non-Oracle JVM detected. Some features, such as 
> unimported compact SSTables, may not work as intended." If there is no 
> problem, I would also like to delete the above message.






[jira] [Updated] (CASSANDRA-14609) Update circleci builds/env to support java 11

2019-05-02 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-14609:

Resolution: Duplicate
Status: Resolved  (was: Open)

This was done as part of CASSANDRA-14806

> Update circleci builds/env to support java 11
> -
>
> Key: CASSANDRA-14609
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14609
> Project: Cassandra
>  Issue Type: Task
>  Components: Legacy/Testing
>Reporter: Jason Brown
>Assignee: Sumanth Pasupuleti
>Priority: Normal
>  Labels: Java11
> Fix For: 4.0
>
>
> CASSANDRA-9608 introduced java 11 support, and it needs to be added to the 
> circleci testing environment. This is a place marker for that work.






[jira] [Updated] (CASSANDRA-15108) Support building Cassandra with JDK 11

2019-05-02 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15108:

Test and Documentation Plan: I've just been testing with circle. Once it's 
committed, some more detailed performance testing would be useful.
 Status: Patch Available  (was: Open)

|[trunk|https://github.com/bdeggleston/cassandra/tree/15108]|[dtests|https://github.com/bdeggleston/cassandra-dtest/tree/jdk11]|[j8 circle|https://circleci.com/workflow-run/4fd48a55-24cc-4920-bdb4-6e8a1c8ff90e]|[j11 circle|https://circleci.com/workflow-run/0655865d-ff5b-4bc8-a1ff-b705c63a40ed]|

This changes how build.xml handles different JDKs. The JDK you want to 
build with is selected with $JAVA_HOME. JDK8 will work as before, but 
attempting to build with JDK11 will fail unless you signal it’s intentional by 
setting the flag -Duse.jdk=11, the env var $CASSANDRA_USE_JDK11, or both. 
Attempting to build with JDK8 will fail if either of these is set.
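
The selection rule described above amounts to a small decision table. The sketch below models it in Python purely for illustration; it is not the Ant logic in build.xml, the function and parameter names are made up, and rejecting any JDK other than 8 or 11 is an assumption of the example rather than something stated here.

```python
def check_jdk_selection(jdk_major, use_jdk11_flag=False, use_jdk11_env=False):
    """Return the JDK major version to build with, or raise if the
    detected JDK and the opt-in signals disagree.

    jdk_major      -- major version of the JDK that $JAVA_HOME points at
    use_jdk11_flag -- True if the JDK 11 build flag was passed
    use_jdk11_env  -- True if the JDK 11 env var is set
    """
    opted_in = use_jdk11_flag or use_jdk11_env
    if jdk_major == 11:
        if not opted_in:
            raise RuntimeError("JDK 11 detected; opt in explicitly to build with it")
        return 11
    if jdk_major == 8:
        if opted_in:
            raise RuntimeError("JDK 11 build requested but JDK 8 detected")
        return 8
    # Assumption for this sketch: anything else is rejected outright.
    raise RuntimeError(f"unsupported JDK: {jdk_major}")
```

The point of the explicit opt-in is that a JDK 11 build is never produced by accident just because $JAVA_HOME happens to point at one.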

The commits are divided into about 3 parts. First is the actual build.xml 
changes to support jdk11 build (and update intellij files). Second is various 
adjustments to libraries and command flags to make C* and its tests work 
properly with jdk11. Third is changes to circle ci configs to support building 
and testing jdk8 and jdk11.

This also fixes several dtests that were failing since CASSANDRA-9608


> Support building Cassandra with JDK 11
> --
>
> Key: CASSANDRA-15108
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15108
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Build
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
>
> With the changes in java 8 support and licensing, we should be able to build 
> and run Cassandra with java 11.






[jira] [Updated] (CASSANDRA-15108) Support building Cassandra with JDK 11

2019-05-02 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15108:

 Complexity: Normal
Change Category: Operability
 Status: Open  (was: Triage Needed)

> Support building Cassandra with JDK 11
> --
>
> Key: CASSANDRA-15108
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15108
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Build
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
>
> With the changes in java 8 support and licensing, we should be able to build 
> and run Cassandra with java 11.






[jira] [Updated] (CASSANDRA-15108) Support building Cassandra with JDK 11

2019-05-02 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15108:

Reviewers: Dinesh Joshi, Sam Tunnicliffe

> Support building Cassandra with JDK 11
> --
>
> Key: CASSANDRA-15108
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15108
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Build
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
>
> With the changes in java 8 support and licensing, we should be able to build 
> and run Cassandra with java 11.






[jira] [Created] (CASSANDRA-15108) Support building Cassandra with JDK 11

2019-05-02 Thread Blake Eggleston (JIRA)
Blake Eggleston created CASSANDRA-15108:
---

 Summary: Support building Cassandra with JDK 11
 Key: CASSANDRA-15108
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15108
 Project: Cassandra
  Issue Type: Improvement
  Components: Build
Reporter: Blake Eggleston
Assignee: Blake Eggleston


With the changes in java 8 support and licensing, we should be able to build 
and run Cassandra with java 11.






[jira] [Updated] (CASSANDRA-15059) Gossiper#markAlive can race with Gossiper#markDead

2019-04-25 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15059:

Fix Version/s: 4.0
   3.11.5
   3.0.19
Since Version: 3.0.0
   Status: Resolved  (was: Ready to Commit)
   Resolution: Fixed

Committed to 3.0 as 
[c3ce32e239b1ba41faf1d58a942465b9bf45b986|https://github.com/apache/cassandra/commit/c3ce32e239b1ba41faf1d58a942465b9bf45b986]
 and merged up. Thanks!

> Gossiper#markAlive can race with Gossiper#markDead
> --
>
> Key: CASSANDRA-15059
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15059
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Gossip
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 3.0.19, 3.11.5, 4.0
>
>
> The Gossiper class is not threadsafe and assumes all state changes happen in 
> a single thread (the gossip stage). Gossiper#convict, however, can be called 
> from the GossipTasks thread. This creates a race where calls to 
> Gossiper#markAlive and Gossiper#markDead can interleave, corrupting gossip 
> state. Gossiper#assassinateEndpoint has a similar problem, being called from 
> the mbean server thread.
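
A common way to deal with this kind of race is to funnel every state change through the one thread that owns the state. The Python sketch below shows that pattern in general terms; it is not a description of the committed fix, and `EndpointStates` and its methods are hypothetical names.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class EndpointStates:
    """Toy stand-in for gossip state; only mutated on the stage thread."""

    def __init__(self):
        # A single worker thread plays the role of the gossip stage.
        self._stage = ThreadPoolExecutor(max_workers=1,
                                         thread_name_prefix="GossipStage")
        self.live = set()
        self.dead = set()
        self.mutator_threads = set()  # records which threads mutate state

    def _mark_alive(self, endpoint):
        self.mutator_threads.add(threading.current_thread().name)
        self.dead.discard(endpoint)
        self.live.add(endpoint)

    def _mark_dead(self, endpoint):
        self.mutator_threads.add(threading.current_thread().name)
        self.live.discard(endpoint)
        self.dead.add(endpoint)

    def mark_alive(self, endpoint):
        # Callers on any thread hand the mutation to the stage thread
        # and block until it has been applied.
        self._stage.submit(self._mark_alive, endpoint).result()

    def convict(self, endpoint):
        self._stage.submit(self._mark_dead, endpoint).result()

    def close(self):
        self._stage.shutdown(wait=True)
```

Because both mutations run on the same thread they can no longer interleave. Note that the blocking `result()` call must not be made from the stage thread itself, or it would wait on its own queue.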






[jira] [Updated] (CASSANDRA-15059) Gossiper#markAlive can race with Gossiper#markDead

2019-04-25 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15059:

Reviewers: Ariel Weisberg  (was: Ariel Weisberg, Jordan West)
   Status: Review In Progress  (was: Patch Available)

> Gossiper#markAlive can race with Gossiper#markDead
> --
>
> Key: CASSANDRA-15059
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15059
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Gossip
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
>
> The Gossiper class is not threadsafe and assumes all state changes happen in 
> a single thread (the gossip stage). Gossiper#convict, however, can be called 
> from the GossipTasks thread. This creates a race where calls to 
> Gossiper#markAlive and Gossiper#markDead can interleave, corrupting gossip 
> state. Gossiper#assassinateEndpoint has a similar problem, being called from 
> the mbean server thread.






[jira] [Updated] (CASSANDRA-15059) Gossiper#markAlive can race with Gossiper#markDead

2019-04-25 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15059:

Status: Ready to Commit  (was: Review In Progress)

> Gossiper#markAlive can race with Gossiper#markDead
> --
>
> Key: CASSANDRA-15059
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15059
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Gossip
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
>
> The Gossiper class is not threadsafe and assumes all state changes happen in 
> a single thread (the gossip stage). Gossiper#convict, however, can be called 
> from the GossipTasks thread. This creates a race where calls to 
> Gossiper#markAlive and Gossiper#markDead can interleave, corrupting gossip 
> state. Gossiper#assassinateEndpoint has a similar problem, being called from 
> the mbean server thread.






[jira] [Updated] (CASSANDRA-15072) Incomplete range results during 2.X -> 3.11.4 upgrade

2019-04-24 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15072:

Fix Version/s: 3.11.5
   3.0.19
Since Version: 3.0.0
   Status: Resolved  (was: Ready to Commit)
   Resolution: Fixed

Committed to 3.0 as 
[d27c3ad0d2d006a5f156f0a2f2a24286d31c5069|https://github.com/apache/cassandra/commit/d27c3ad0d2d006a5f156f0a2f2a24286d31c5069]
 and merged up. Thanks!

> Incomplete range results during 2.X -> 3.11.4 upgrade
> -
>
> Key: CASSANDRA-15072
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15072
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Coordination
>Reporter: Muir Manders
>Assignee: Blake Eggleston
>Priority: High
> Fix For: 3.0.19, 3.11.5
>
> Attachments: eriksw-repro.sh
>
>
> Hello
> During an upgrade from 2.1.17 to 3.11.4, our application started getting 
> back incomplete results for range queries. When all nodes were upgraded 
> (before upgrading sstables), we stopped getting incomplete results. I was 
> able to reproduce it and listed steps below. It seems to require the random 
> partitioner and compact storage to reproduce reliably. It also reproduces 
> coming from 2.1.21 and 2.2.14. You seem to get the bad behavior when an old 
> node is your coordinator and it has to talk to an upgraded replica.
> {noformat}
> ccm create test -v 2.1.17 -n 3
> ccm updateconf 'partitioner: org.apache.cassandra.dht.RandomPartitioner'
> ccm node1 updateconf 'initial_token: 0'
> ccm node2 updateconf 'initial_token: 56713727820156410577229101238628035242'
> ccm node3 updateconf 'initial_token: 113427455640312821154458202477256070484'
> ccm start
> ccm node1 cqlsh <<SCHEMA
> CREATE KEYSPACE test WITH REPLICATION = {'class': 'SimpleStrategy', 
> 'replication_factor': 3};
> CREATE COLUMNFAMILY test.test (
>   id text,
>   foo text,
>   bar text,
>   PRIMARY KEY (id)
> ) WITH COMPACT STORAGE;
> CONSISTENCY QUORUM;
> INSERT INTO test.test (id, foo, bar) values ('1', 'hi', 'there');
> INSERT INTO test.test (id, foo, bar) values ('2', 'hi', 'there');
> SCHEMA
> ccm node1 stop
> ccm node1 setdir -v 3.11.4
> ccm node1 start
> ccm node2 stop
> ccm node2 setdir -v 3.11.4
> ccm node2 start
> # here I use 3.X cqlsh to connect to 2.X node so I can lower the page size (to
> # allow for simpler test setup)
> cqlsh 127.0.0.3 <<QUERY
> CONSISTENCY QUORUM;
> PAGING 2;
> select * from test.test;
> QUERY
> {noformat}
> This results in:
> {noformat}
> Page size: 2
>  id | bar   | foo
> +---+-
>   2 | there |  hi
> (1 rows)
> {noformat}
> Running it against the upgraded node (node1):
> {noformat}
> Page size: 2
>  id | bar   | foo
> +---+-
>   2 | there |  hi
>   1 | there |  hi
> (2 rows)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15072) Incomplete range results during 2.X -> 3.11.4 upgrade

2019-04-24 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15072:

Status: Ready to Commit  (was: Review In Progress)

> Incomplete range results during 2.X -> 3.11.4 upgrade
> -
>
> Key: CASSANDRA-15072
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15072
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Coordination
>Reporter: Muir Manders
>Assignee: Blake Eggleston
>Priority: High
> Attachments: eriksw-repro.sh
>
>
> Hello
> During an upgrade from 2.1.17 to 3.11.4, our application started getting 
> back incomplete results for range queries. When all nodes were upgraded 
> (before upgrading sstables), we stopped getting incomplete results. I was 
> able to reproduce it and listed steps below. It seems to require the random 
> partitioner and compact storage to reproduce reliably. It also reproduces 
> coming from 2.1.21 and 2.2.14. You seem to get the bad behavior when an old 
> node is your coordinator and it has to talk to an upgraded replica.
> {noformat}
> ccm create test -v 2.1.17 -n 3
> ccm updateconf 'partitioner: org.apache.cassandra.dht.RandomPartitioner'
> ccm node1 updateconf 'initial_token: 0'
> ccm node2 updateconf 'initial_token: 56713727820156410577229101238628035242'
> ccm node3 updateconf 'initial_token: 113427455640312821154458202477256070484'
> ccm start
> ccm node1 cqlsh <<SCHEMA
> CREATE KEYSPACE test WITH REPLICATION = {'class': 'SimpleStrategy', 'replication_factor': 3};
> CREATE COLUMNFAMILY test.test (
>   id text,
>   foo text,
>   bar text,
>   PRIMARY KEY (id)
> ) WITH COMPACT STORAGE;
> CONSISTENCY QUORUM;
> INSERT INTO test.test (id, foo, bar) values ('1', 'hi', 'there');
> INSERT INTO test.test (id, foo, bar) values ('2', 'hi', 'there');
> SCHEMA
> ccm node1 stop
> ccm node1 setdir -v 3.11.4
> ccm node1 start
> ccm node2 stop
> ccm node2 setdir -v 3.11.4
> ccm node2 start
> # here I use 3.X cqlsh to connect to 2.X node so I can lower the page size (to
> # allow for simpler test setup)
> cqlsh 127.0.0.3 <<QUERY
> CONSISTENCY QUORUM;
> PAGING 2;
> select * from test.test;
> QUERY
> {noformat}
> This results in:
> {noformat}
> Page size: 2
>  id | bar   | foo
> +---+-
>   2 | there |  hi
> (1 rows)
> {noformat}
> Running it against the upgraded node (node1):
> {noformat}
> Page size: 2
>  id | bar   | foo
> +---+-
>   2 | there |  hi
>   1 | there |  hi
> (2 rows)
> {noformat}






[jira] [Updated] (CASSANDRA-15078) Support cross version messaging in in-jvm upgrade dtests

2019-04-24 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15078:

Status: Resolved  (was: Ready to Commit)
Resolution: Fixed

Committed as [7d2c3c215f65ee41f86886304257647fc24b1f70 
|https://github.com/apache/cassandra/commit/7d2c3c215f65ee41f86886304257647fc24b1f70],
 thanks.

> Support cross version messaging in in-jvm upgrade dtests
> 
>
> Key: CASSANDRA-15078
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15078
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 2.2.15, 3.0.19, 3.11.5, 4.0
>
>







[jira] [Commented] (CASSANDRA-15059) Gossiper#markAlive can race with Gossiper#markDead

2019-04-22 Thread Blake Eggleston (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16823466#comment-16823466
 ] 

Blake Eggleston commented on CASSANDRA-15059:
-

bq. quarantineEndpoint and replacementQuarantine are private, but maybe check 
there as well? I don't feel strongly about it, but it's slightly safer when 
they are called indirectly.

{{quarantineEndpoint}} (and by extension, {{replacementQuarantine}}) modifies 
the {{justRemovedEndpoints}} map. This map is also modified in the GossipTask 
thread, and is done correctly as far as I can tell. That's why I didn't add the 
assertion there.

bq. assassinateEndpoint asserts on the thread inside the lambda for run in 
Gossip stage. Harmless, but is it too much?

Probably overkill, removed.

bq. notifyFailureDetector seems like it could tolerate having this assertion 
since it is called from VerbHandlers in the gossip stage?

Nothing happens in that method that we'd need to assert the thread for. I could 
see adding an assertion with the interface refactor, but I wouldn't want to make 
noise in people's logs if we're not actually doing anything worth complaining 
about.

bq. applyNewStates is also private, but maybe check the assertion there?

Same as above

I also switched to using the NoSpamLogger for the assertion.

> Gossiper#markAlive can race with Gossiper#markDead
> --
>
> Key: CASSANDRA-15059
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15059
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Gossip
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
>
> The Gossiper class is not threadsafe and assumes all state changes happen in 
> a single thread (the gossip stage). Gossiper#convict, however, can be called 
> from the GossipTasks thread. This creates a race where calls to 
> Gossiper#markAlive and Gossiper#markDead can interleave, corrupting gossip 
> state. Gossiper#assassinateEndpoint has a similar problem, being called from 
> the mbean server thread.
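
To make the race described above concrete, here is a minimal Java sketch of one common way to keep state changes on a single thread: callers on other threads hand their mutation to a single-threaded executor instead of touching the state directly. This is an illustration only, not the patch attached to this ticket; the class name, method bodies, and the executor standing in for the gossip stage are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for single-threaded gossip state; not Cassandra code. */
final class SingleThreadedState {
    // Stand-in for the gossip stage: exactly one thread runs every task.
    private final ExecutorService stage = Executors.newSingleThreadExecutor();
    // A plain HashMap is acceptable only because every access runs on the stage thread.
    private final Map<String, Boolean> alive = new HashMap<>();

    // Safe to call from any thread: the mutation itself is queued onto the stage.
    void markAlive(String endpoint) {
        stage.execute(() -> alive.put(endpoint, Boolean.TRUE));
    }

    void markDead(String endpoint) {
        stage.execute(() -> alive.put(endpoint, Boolean.FALSE));
    }

    // Reads are queued too, so they observe all previously submitted mutations.
    boolean isAlive(String endpoint) {
        Callable<Boolean> read = () -> alive.getOrDefault(endpoint, Boolean.FALSE);
        try {
            return stage.submit(read).get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    void shutdown() {
        stage.shutdown();
        try {
            stage.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        SingleThreadedState state = new SingleThreadedState();
        state.markAlive("10.0.0.1");
        state.markDead("10.0.0.1");
        System.out.println("alive? " + state.isAlive("10.0.0.1"));
        state.shutdown();
    }
}
```

Because tasks run in submission order on one thread, a markAlive and a markDead for the same endpoint can no longer interleave partway through each other.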






[jira] [Commented] (CASSANDRA-15059) Gossiper#markAlive can race with Gossiper#markDead

2019-04-22 Thread Blake Eggleston (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16823290#comment-16823290
 ] 

Blake Eggleston commented on CASSANDRA-15059:
-

I opened CASSANDRA-15095 as a follow-on JIRA.

> Gossiper#markAlive can race with Gossiper#markDead
> --
>
> Key: CASSANDRA-15059
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15059
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Gossip
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
>
> The Gossiper class is not threadsafe and assumes all state changes happen in 
> a single thread (the gossip stage). Gossiper#convict, however, can be called 
> from the GossipTasks thread. This creates a race where calls to 
> Gossiper#markAlive and Gossiper#markDead can interleave, corrupting gossip 
> state. Gossiper#assassinateEndpoint has a similar problem, being called from 
> the mbean server thread.






[jira] [Created] (CASSANDRA-15095) Split Gossiper into separate interfaces

2019-04-22 Thread Blake Eggleston (JIRA)
Blake Eggleston created CASSANDRA-15095:
---

 Summary: Split Gossiper into separate interfaces
 Key: CASSANDRA-15095
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15095
 Project: Cassandra
  Issue Type: Improvement
Reporter: Blake Eggleston


As [~aweisberg] suggests 
[here|https://issues.apache.org/jira/browse/CASSANDRA-15059?focusedCommentId=16802986&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16802986],
 a better way to encourage Gossiper threadsafety would be to split it into 
interfaces depending on which methods are safe to be called from where. At 
minimum, one for outside the gossip stage, one for inside.
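
As a rough illustration of the split suggested above, the following Java sketch separates a read-only view, safe from any thread, from the mutating operations meant only for the gossip stage. All interface, class, and method names here are hypothetical and chosen for the example; the actual refactor may look quite different.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of an interface split; none of these names exist in Cassandra. */
final class GossiperSplit {
    /** Operations assumed safe to call from outside the gossip stage. */
    interface GossipReader {
        boolean isAlive(String endpoint);
    }

    /** Operations assumed to be called only from inside the gossip stage. */
    interface GossipStageOps extends GossipReader {
        void markAlive(String endpoint);
        void markDead(String endpoint);
    }

    /** One implementation, handed out under whichever interface the caller should see. */
    static final class ExampleGossiper implements GossipStageOps {
        private final Map<String, Boolean> alive = new ConcurrentHashMap<>();

        @Override
        public boolean isAlive(String endpoint) {
            return alive.getOrDefault(endpoint, Boolean.FALSE);
        }

        @Override
        public void markAlive(String endpoint) {
            alive.put(endpoint, Boolean.TRUE);
        }

        @Override
        public void markDead(String endpoint) {
            alive.put(endpoint, Boolean.FALSE);
        }
    }

    public static void main(String[] args) {
        ExampleGossiper gossiper = new ExampleGossiper();
        GossipStageOps stageView = gossiper;   // given to gossip-stage code only
        GossipReader outsideView = gossiper;   // given to everything else
        stageView.markAlive("10.0.0.1");
        System.out.println("alive? " + outsideView.isAlive("10.0.0.1"));
    }
}
```

Code holding only a GossipReader cannot call markAlive or markDead without an explicit cast, so a misuse shows up at compile time or in review rather than as a race at runtime.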






[jira] [Commented] (CASSANDRA-15059) Gossiper#markAlive can race with Gossiper#markDead

2019-04-18 Thread Blake Eggleston (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16821569#comment-16821569
 ] 

Blake Eggleston commented on CASSANDRA-15059:
-

I've updated my branches with the runtime checking, could you take a look? The 
checks discovered a few more places where we were mutating Gossip state from 
the wrong thread, which I fixed.

> Gossiper#markAlive can race with Gossiper#markDead
> --
>
> Key: CASSANDRA-15059
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15059
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Gossip
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
>
> The Gossiper class is not threadsafe and assumes all state changes happen in 
> a single thread (the gossip stage). Gossiper#convict, however, can be called 
> from the GossipTasks thread. This creates a race where calls to 
> Gossiper#markAlive and Gossiper#markDead can interleave, corrupting gossip 
> state. Gossiper#assassinateEndpoint has a similar problem, being called from 
> the mbean server thread.






[jira] [Updated] (CASSANDRA-15089) CassandraNetworkAuthorizer::authorize should get role details from Roles, not directly from IRoleManager

2019-04-17 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15089:

Status: Ready to Commit  (was: Review In Progress)

> CassandraNetworkAuthorizer::authorize should get role details from Roles, not 
> directly from IRoleManager
> 
>
> Key: CASSANDRA-15089
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15089
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Authorization
>Reporter: Sam Tunnicliffe
>Assignee: Sam Tunnicliffe
>Priority: Normal
> Fix For: 4.0
>
>
> If the network permissions cache doesn't contain any entry for a role, the 
> authorize method is invoked on the configured INetworkAuthorizer. In the case 
> of CassandraNetworkAuthorizer, this immediately checks whether the role in 
> question has the LOGIN privilege set. It does this using the configured 
> IRoleManager directly, which causes a read from the underlying table in 
> system_auth. It should fetch the flag from Roles::canLogin, which uses the 
> RolesCache, falling back to the IRoleManager if necessary.
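
The lookup order described above (cache first, backing store only on a miss) can be sketched in Java as follows. This is a simplified illustration under stated assumptions: the class and field names are invented, the Predicate stands in for a role manager's table read, and the real Roles and RolesCache classes have their own APIs and expiry behaviour that are not modelled here.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;

/** Hypothetical cache-then-fallback login check; not the Cassandra implementation. */
final class CachedLoginCheck {
    private final Map<String, Boolean> cache = new ConcurrentHashMap<>();
    // Stand-in for the expensive read from the underlying auth table.
    private final Predicate<String> backingStore;
    // Counts how often the backing store was actually consulted.
    final AtomicInteger storeReads = new AtomicInteger();

    CachedLoginCheck(Predicate<String> backingStore) {
        this.backingStore = backingStore;
    }

    boolean canLogin(String role) {
        // Only a cache miss reaches the backing store; later calls reuse the cached flag.
        return cache.computeIfAbsent(role, r -> {
            storeReads.incrementAndGet();
            return backingStore.test(r);
        });
    }

    public static void main(String[] args) {
        CachedLoginCheck check = new CachedLoginCheck(role -> role.equals("alice"));
        System.out.println(check.canLogin("alice"));
        System.out.println(check.canLogin("alice"));
        System.out.println("store reads: " + check.storeReads.get());
    }
}
```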






[jira] [Updated] (CASSANDRA-15089) CassandraNetworkAuthorizer::authorize should get role details from Roles, not directly from IRoleManager

2019-04-17 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15089:

Status: Review In Progress  (was: Patch Available)

> CassandraNetworkAuthorizer::authorize should get role details from Roles, not 
> directly from IRoleManager
> 
>
> Key: CASSANDRA-15089
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15089
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Authorization
>Reporter: Sam Tunnicliffe
>Assignee: Sam Tunnicliffe
>Priority: Normal
> Fix For: 4.0
>
>
> If the network permissions cache doesn't contain any entry for a role, the 
> authorize method is invoked on the configured INetworkAuthorizer. In the case 
> of CassandraNetworkAuthorizer, this immediately checks whether the role in 
> question has the LOGIN privilege set. It does this using the configured 
> IRoleManager directly, which causes a read from the underlying table in 
> system_auth. It should fetch the flag from Roles::canLogin, which uses the 
> RolesCache, falling back to the IRoleManager if necessary.






[jira] [Commented] (CASSANDRA-15089) CassandraNetworkAuthorizer::authorize should get role details from Roles, not directly from IRoleManager

2019-04-17 Thread Blake Eggleston (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16820518#comment-16820518
 ] 

Blake Eggleston commented on CASSANDRA-15089:
-

+1

> CassandraNetworkAuthorizer::authorize should get role details from Roles, not 
> directly from IRoleManager
> 
>
> Key: CASSANDRA-15089
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15089
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Authorization
>Reporter: Sam Tunnicliffe
>Assignee: Sam Tunnicliffe
>Priority: Normal
> Fix For: 4.0
>
>
> If the network permissions cache doesn't contain any entry for a role, the 
> authorize method is invoked on the configured INetworkAuthorizer. In the case 
> of CassandraNetworkAuthorizer, this immediately checks whether the role in 
> question has the LOGIN privilege set. It does this using the configured 
> IRoleManager directly, which causes a read from the underlying table in 
> system_auth. It should fetch the flag from Roles::canLogin, which uses the 
> RolesCache, falling back to the IRoleManager if necessary.






[jira] [Commented] (CASSANDRA-15078) Support cross version messaging in in-jvm upgrade dtests

2019-04-15 Thread Blake Eggleston (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16818427#comment-16818427
 ] 

Blake Eggleston commented on CASSANDRA-15078:
-

Thanks [~ifesdjeen]. Pushed up requested changes. I ended up using 
{{callsOnInstance}} for the version getter instead of {{appliesOnInstance}}

> Support cross version messaging in in-jvm upgrade dtests
> 
>
> Key: CASSANDRA-15078
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15078
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 2.2.15, 3.0.19, 3.11.5, 4.0
>
>







[jira] [Comment Edited] (CASSANDRA-15059) Gossiper#markAlive can race with Gossiper#markDead

2019-04-11 Thread Blake Eggleston (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16815537#comment-16815537
 ] 

Blake Eggleston edited comment on CASSANDRA-15059 at 4/11/19 4:03 PM:
--

Maybe we could compromise a bit here. There’s definitely value in rooting out 
places where we violate the assumptions gossiper makes about concurrency, but I 
worry that if we don’t catch all of them we'll turn a race that _may_ cause a 
problem into a hard failure. What if we logged an error by default, but threw 
an exception if a system property was set? This way we can get feedback and 
scary messages in the logs, but we haven't made things any less stable, and our 
tests will fail if we’re doing something we shouldn’t be. I have no problem 
putting that in 3.x and up.

I would like to punt the refactor to 4.next though. I do think it’s valuable, 
but I think it’s a bit late in the game to add it to 4.0.

WDYT?


was (Author: bdeggleston):
Maybe we could compromise a bit here. There’s definitely value in rooting out 
places where we violate the assumptions gossiper makes about concurrency, but I 
worry that if we don’t catch all of them we'll turn a relatively low impact 
race (low enough that we haven't caught it) into a hard failure. What if we 
logged an error by default, but threw an exception if a system property was 
set? This way we can get feedback and scary messages in the logs, but we 
haven't made things any less stable, and our tests will fail if we’re doing 
something we shouldn’t be. I have no problem putting that in 3.x and up.

I would like to punt the refactor to 4.next though. I do think it’s valuable, 
but I think it’s a bit late in the game to add it to 4.0.

WDYT?

> Gossiper#markAlive can race with Gossiper#markDead
> --
>
> Key: CASSANDRA-15059
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15059
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Gossip
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
>
> The Gossiper class is not threadsafe and assumes all state changes happen in 
> a single thread (the gossip stage). Gossiper#convict, however, can be called 
> from the GossipTasks thread. This creates a race where calls to 
> Gossiper#markAlive and Gossiper#markDead can interleave, corrupting gossip 
> state. Gossiper#assassinateEndpoint has a similar problem, being called from 
> the mbean server thread.






[jira] [Commented] (CASSANDRA-15059) Gossiper#markAlive can race with Gossiper#markDead

2019-04-11 Thread Blake Eggleston (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16815537#comment-16815537
 ] 

Blake Eggleston commented on CASSANDRA-15059:
-

Maybe we could compromise a bit here. There’s definitely value in rooting out 
places where we violate the assumptions gossiper makes about concurrency, but I 
worry that if we don’t catch all of them we'll turn a relatively low impact 
race (low enough that we haven't caught it) into a hard failure. What if we 
logged an error by default, but threw an exception if a system property was 
set? This way we can get feedback and scary messages in the logs, but we 
haven't made things any less stable, and our tests will fail if we’re doing 
something we shouldn’t be. I have no problem putting that in 3.x and up.

I would like to punt the refactor to 4.next though. I do think it’s valuable, 
but I think it’s a bit late in the game to add it to 4.0.

WDYT?
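
A minimal Java sketch of the compromise proposed above: record an error by default, and throw only when a system property is set. The property name, thread name, and the in-memory list standing in for a logger are assumptions made for this example and are not taken from the actual patch.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical thread check; names and property are illustrative only. */
final class GossipThreadCheck {
    // Hypothetical system property that turns the check into a hard failure.
    static final String STRICT_PROPERTY = "example.gossip.strict_thread_check";
    // Hypothetical name of the thread that is allowed to mutate gossip state.
    static final String GOSSIP_THREAD_NAME = "GossipStage-1";
    // Stand-in for a real logger so the sketch has no dependencies.
    static final List<String> loggedErrors = new CopyOnWriteArrayList<>();

    static void checkOnGossipThread(String method) {
        String current = Thread.currentThread().getName();
        if (GOSSIP_THREAD_NAME.equals(current)) {
            return; // called from the expected thread, nothing to report
        }
        String message = method + " called from thread " + current
                + " instead of " + GOSSIP_THREAD_NAME;
        if (Boolean.getBoolean(STRICT_PROPERTY)) {
            // Strict mode, e.g. enabled in tests: fail fast.
            throw new IllegalStateException(message);
        }
        // Default mode: complain loudly but keep running.
        loggedErrors.add(message);
    }

    public static void main(String[] args) {
        checkOnGossipThread("markAlive");
        System.out.println("logged: " + loggedErrors);

        System.setProperty(STRICT_PROPERTY, "true");
        try {
            checkOnGossipThread("markDead");
        } catch (IllegalStateException e) {
            System.out.println("strict mode threw: " + e.getMessage());
        }
    }
}
```

Reading the property on every call keeps the sketch short; a production check would more likely read it once at startup and rate-limit the log message.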

> Gossiper#markAlive can race with Gossiper#markDead
> --
>
> Key: CASSANDRA-15059
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15059
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Gossip
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
>
> The Gossiper class is not threadsafe and assumes all state changes happen in 
> a single thread (the gossip stage). Gossiper#convict, however, can be called 
> from the GossipTasks thread. This creates a race where calls to 
> Gossiper#markAlive and Gossiper#markDead can interleave, corrupting gossip 
> state. Gossiper#assassinateEndpoint has a similar problem, being called from 
> the mbean server thread.






[jira] [Commented] (CASSANDRA-15059) Gossiper#markAlive can race with Gossiper#markDead

2019-04-09 Thread Blake Eggleston (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813615#comment-16813615
 ] 

Blake Eggleston commented on CASSANDRA-15059:
-

I think both of these changes are worth considering, but I don’t think it would 
be appropriate to include them as part of this ticket. They are both out of scope 
and too risky to be putting in 3.x, and I’d argue the same for 4.0 at this 
point as well.

Gossiper is very brittle, critical to the operation of a cluster, and has 
little to no test coverage. Any bugs that aren't caught by a manual review or 
raise any red flags in the dtests will become production issues at some point.

> Gossiper#markAlive can race with Gossiper#markDead
> --
>
> Key: CASSANDRA-15059
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15059
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Gossip
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
>
> The Gossiper class is not threadsafe and assumes all state changes happen in 
> a single thread (the gossip stage). Gossiper#convict, however, can be called 
> from the GossipTasks thread. This creates a race where calls to 
> Gossiper#markAlive and Gossiper#markDead can interleave, corrupting gossip 
> state. Gossiper#assassinateEndpoint has a similar problem, being called from 
> the mbean server thread.






[jira] [Commented] (CASSANDRA-15059) Gossiper#markAlive can race with Gossiper#markDead

2019-04-08 Thread Blake Eggleston (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16812769#comment-16812769
 ] 

Blake Eggleston commented on CASSANDRA-15059:
-

{quote}
Future wise this doesn't do anything to address the underlying fragility in how 
Gossiper doesn't document what is safe to call from outside the Gossip thread 
and what isn't. It also doesn't validate the correct thread is running a given 
method.
{quote}

I’ve been thinking about this a lot, and I think it would be safer if we didn’t 
do this.

Adding some preconditions isn’t going to fix the underlying fragility of 
Gossiper. Given the “realities” of the Gossiper class, I think it would end up 
causing more harm than good. Just starting to pull on that thread reveals at 
least one situation where we modify gossip state out of the gossip stage that 
makes sense (on startup). There are probably one or two more (at least), and 
I’d hate to break a nodetool command or something.

> Gossiper#markAlive can race with Gossiper#markDead
> --
>
> Key: CASSANDRA-15059
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15059
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Gossip
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
>
> The Gossiper class is not threadsafe and assumes all state changes happen in 
> a single thread (the gossip stage). Gossiper#convict, however, can be called 
> from the GossipTasks thread. This creates a race where calls to 
> Gossiper#markAlive and Gossiper#markDead can interleave, corrupting gossip 
> state. Gossiper#assassinateEndpoint has a similar problem, being called from 
> the mbean server thread.






[jira] [Comment Edited] (CASSANDRA-15072) Incomplete range results during 2.X -> 3.11.4 upgrade

2019-04-05 Thread Blake Eggleston (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16810407#comment-16810407
 ] 

Blake Eggleston edited comment on CASSANDRA-15072 at 4/5/19 4:48 PM:
-

|[3.0|https://github.com/bdeggleston/cassandra/tree/15072-3.0]|[tests|https://circleci.com/gh/bdeggleston/workflows/cassandra/tree/15072-3.0]|
|[3.11|https://github.com/bdeggleston/cassandra/tree/15072-3.11]|[tests|https://circleci.com/workflow-run/4567dbed-be97-49e5-8c82-66e320e074ca]|

[~beobal] do you have time to review this? It seems to be related to 
CASSANDRA-11087.

A few notes:
 * From what I can tell, returning a row per cell is the right thing to do in 
this case, so I'm using a modified result counter to only count entire 
partitions in this specific case. However, I'm not familiar enough with all the 
dark corners of the 2.1 storage engine to be sure that's appropriate, or won't 
break something else.
 * Doing a point read with the partition key also returns a row per cell, but 
works correctly because the 2.2 coordinator seems to just discard the limit in 
that case.
 * If you're not familiar with the in-jvm dtests yet, and want to run the one 
in this patch, you'll want to run {{ant dtest-jar}} on this branch and [this 
2.2 branch|https://github.com/bdeggleston/cassandra/tree/15078-2.2], and put 
the 2.2 dtest jar in the 3.0 build directory.
 * -CircleCI seems to be behind picking up new branches to test, but I'll 
update this with links to the workflows once it catches up.-
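
To illustrate the counting difference described in the first note, here is a small Java sketch. It is one reading of the reported symptom, not the code from the patch: it assumes each partition arrives as one row per cell, grouped by partition key, and compares a limit applied per row with a limit applied per partition. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical comparison of two ways to apply a limit; not the actual patch. */
final class LimitCounting {
    /** Counts every row (one per cell) against the limit. */
    static Set<String> limitByRow(List<String[]> rows, int limit) {
        Set<String> partitions = new LinkedHashSet<>();
        int counted = 0;
        for (String[] row : rows) {
            if (counted == limit) {
                break; // limit reached, possibly before all partitions were seen
            }
            partitions.add(row[0]);
            counted++;
        }
        return partitions;
    }

    /** Counts each partition once, however many rows it contributes. */
    static Set<String> limitByPartition(List<String[]> rows, int limit) {
        Set<String> partitions = new LinkedHashSet<>();
        for (String[] row : rows) {
            boolean startsNewPartition = !partitions.contains(row[0]);
            if (startsNewPartition && partitions.size() == limit) {
                break; // a further partition would exceed the limit
            }
            partitions.add(row[0]);
        }
        return partitions;
    }

    public static void main(String[] args) {
        // Two partitions with two cells each, as in the reproduction steps: {partition key, column}.
        List<String[]> rows = Arrays.asList(
                new String[] {"2", "bar"}, new String[] {"2", "foo"},
                new String[] {"1", "bar"}, new String[] {"1", "foo"});
        System.out.println("per-row limit 2:       " + limitByRow(rows, 2));
        System.out.println("per-partition limit 2: " + limitByPartition(rows, 2));
    }
}
```

With a limit of 2, the per-row counter stops after the two cells of the first partition, which mirrors the single-row result in the report, while the per-partition counter returns both partitions.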


was (Author: bdeggleston):
|[3.0|https://github.com/bdeggleston/cassandra/tree/15072-3.0]|[tests|https://circleci.com/gh/bdeggleston/workflows/cassandra/tree/15072-3.0]|
|[3.11|https://github.com/bdeggleston/cassandra/tree/15072-3.11]|[tests|https://circleci.com/workflow-run/4567dbed-be97-49e5-8c82-66e320e074ca]|

[~beobal] do you have time to review this? It seems to be related to 
CASSANDRA-11087.

A few notes:
 * From what I can tell, returning a row per cell is the right thing to do in 
this case, so I'm using a modified result counter to only count entire 
partitions in this specific case. However, I'm not familiar enough with all the 
dark corners of the 2.1 storage engine to be sure that's appropriate, or won't 
break something else.
 * Doing a point read with the partition key also returns a row per cell, but 
works correctly because the 2.2 coordinator seems to just discard the limit in 
that case.
 * If you're not familiar with the in-jvm dtests yet, and want to run the one 
in this patch, you'll want to run {{ant dtest-jar}} on this branch and [this 
2.2 branch|https://github.com/bdeggleston/cassandra/tree/15078-2.2], and put 
the 2.2 dtest jar in the 3.0 build directory.
 * CircleCI seems to be behind picking up new branches to test, but I'll update 
this with links to the workflows once it catches up.

> Incomplete range results during 2.X -> 3.11.4 upgrade
> -
>
> Key: CASSANDRA-15072
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15072
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Coordination
>Reporter: Muir Manders
>Assignee: Blake Eggleston
>Priority: High
> Attachments: eriksw-repro.sh
>
>
> Hello
> During an upgrade from 2.1.17 to 3.11.4, our application started getting 
> back incomplete results for range queries. When all nodes were upgraded 
> (before upgrading sstables), we stopped getting incomplete results. I was 
> able to reproduce it and listed steps below. It seems to require the random 
> partitioner and compact storage to reproduce reliably. It also reproduces 
> coming from 2.1.21 and 2.2.14. You seem to get the bad behavior when an old 
> node is your coordinator and it has to talk to an upgraded replica.
> {noformat}
> ccm create test -v 2.1.17 -n 3
> ccm updateconf 'partitioner: org.apache.cassandra.dht.RandomPartitioner'
> ccm node1 updateconf 'initial_token: 0'
> ccm node2 updateconf 'initial_token: 56713727820156410577229101238628035242'
> ccm node3 updateconf 'initial_token: 113427455640312821154458202477256070484'
> ccm start
> ccm node1 cqlsh <<SCHEMA
> CREATE KEYSPACE test WITH REPLICATION = {'class': 'SimpleStrategy', 'replication_factor': 3};
> CREATE COLUMNFAMILY test.test (
>   id text,
>   foo text,
>   bar text,
>   PRIMARY KEY (id)
> ) WITH COMPACT STORAGE;
> CONSISTENCY QUORUM;
> INSERT INTO test.test (id, foo, bar) values ('1', 'hi', 'there');
> INSERT INTO test.test (id, foo, bar) values ('2', 'hi', 'there');
> SCHEMA
> ccm node1 stop
> ccm node1 setdir -v 3.11.4
> ccm node1 start
> ccm node2 stop
> ccm node2 setdir -v 3.11.4
> ccm node2 start
> # here I use 3.X cqlsh to connect to 2.X node so I can lower the page size (to
> # allow for simpler test setup)
> cqlsh 127.0.0.3 < CONS

[jira] [Comment Edited] (CASSANDRA-15078) Support cross version messaging in in-jvm upgrade dtests

2019-04-05 Thread Blake Eggleston (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16810373#comment-16810373
 ] 

Blake Eggleston edited comment on CASSANDRA-15078 at 4/5/19 4:47 PM:
-

|[2.2|https://github.com/bdeggleston/cassandra/tree/15078-2.2]|[tests|https://circleci.com/gh/bdeggleston/workflows/cassandra/tree/15078-2.2]|
|[3.0|https://github.com/bdeggleston/cassandra/tree/15078-3.0]|[tests|https://circleci.com/gh/bdeggleston/workflows/cassandra/tree/15078-3.0]|
|[3.11|https://github.com/bdeggleston/cassandra/tree/15078-3.11]|[tests|https://circleci.com/gh/bdeggleston/workflows/cassandra/tree/15072-3.11]|
|[trunk|https://github.com/bdeggleston/cassandra/tree/15078-trunk]|[tests|https://circleci.com/gh/bdeggleston/workflows/cassandra/tree/15078-trunk]|

[~ifesdjeen] / [~benedict] would one of you mind taking a look at this? It's 
just adding and using hooks for getting/setting messaging versions. It also 
fixes a minor issue where the {{setup}} method wasn't getting called. These 
changes are in support of a test case for CASSANDRA-15072, feel free to take a 
look and give feedback on that as well. -Circle is behind picking up new 
branches, but I'll update with the workflow links once they're available.-


was (Author: bdeggleston):
|[2.2|https://github.com/bdeggleston/cassandra/tree/15078-2.2]|[tests|https://circleci.com/gh/bdeggleston/workflows/cassandra/tree/15078-2.2]|
|[3.0|https://github.com/bdeggleston/cassandra/tree/15078-3.0]|[tests|https://circleci.com/gh/bdeggleston/workflows/cassandra/tree/15078-3.0]|
|[3.11|https://github.com/bdeggleston/cassandra/tree/15078-3.11]|[tests|https://circleci.com/gh/bdeggleston/workflows/cassandra/tree/15072-3.11]|
|[trunk|https://github.com/bdeggleston/cassandra/tree/15078-trunk]|[tests|https://circleci.com/gh/bdeggleston/workflows/cassandra/tree/15078-trunk]|

[~ifesdjeen] / [~benedict] would one of you mind taking a look at this? It's 
just adding and using hooks for getting/setting messaging versions. It also 
fixes a minor issue where the {{setup}} method wasn't getting called. These 
changes are in support of a test case for CASSANDRA-15072, feel free to take a 
look and give feedback on that as well. Circle is behind picking up new 
branches, but I'll update with the workflow links once they're available.

> Support cross version messaging in in-jvm upgrade dtests
> 
>
> Key: CASSANDRA-15078
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15078
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 2.2.15, 3.0.19, 3.11.5, 4.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-15078) Support cross version messaging in in-jvm upgrade dtests

2019-04-05 Thread Blake Eggleston (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16810373#comment-16810373
 ] 

Blake Eggleston edited comment on CASSANDRA-15078 at 4/5/19 4:47 PM:
-

|[2.2|https://github.com/bdeggleston/cassandra/tree/15078-2.2]|[tests|https://circleci.com/gh/bdeggleston/workflows/cassandra/tree/15078-2.2]|
|[3.0|https://github.com/bdeggleston/cassandra/tree/15078-3.0]|[tests|https://circleci.com/gh/bdeggleston/workflows/cassandra/tree/15078-3.0]|
|[3.11|https://github.com/bdeggleston/cassandra/tree/15078-3.11]|[tests|https://circleci.com/gh/bdeggleston/workflows/cassandra/tree/15072-3.11]|
|[trunk|https://github.com/bdeggleston/cassandra/tree/15078-trunk]|[tests|https://circleci.com/gh/bdeggleston/workflows/cassandra/tree/15078-trunk]|

[~ifesdjeen] / [~benedict] would one of you mind taking a look at this? It's 
just adding and using hooks for getting/setting messaging versions. It also 
fixes a minor issue where the {{setup}} method wasn't getting called. These 
changes are in support of a test case for CASSANDRA-15072, feel free to take a 
look and give feedback on that as well. Circle is behind picking up new 
branches, but I'll update with the workflow links once they're available.


was (Author: bdeggleston):
|[2.2|https://github.com/bdeggleston/cassandra/tree/15078-2.2]|
|[3.0|https://github.com/bdeggleston/cassandra/tree/15078-3.0]|
|[3.11|https://github.com/bdeggleston/cassandra/tree/15078-3.11]|
|[trunk|https://github.com/bdeggleston/cassandra/tree/15078-trunk]|

[~ifesdjeen] / [~benedict] would one of you mind taking a look at this? It's 
just adding and using hooks for getting/setting messaging versions. It also 
fixes a minor issue where the {{setup}} method wasn't getting called. These 
changes are in support of a test case for CASSANDRA-15072, feel free to take a 
look and give feedback on that as well. Circle is behind picking up new 
branches, but I'll update with the workflow links once they're available.

> Support cross version messaging in in-jvm upgrade dtests
> 
>
> Key: CASSANDRA-15078
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15078
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 2.2.15, 3.0.19, 3.11.5, 4.0
>
>







[jira] [Comment Edited] (CASSANDRA-15072) Incomplete range results during 2.X -> 3.11.4 upgrade

2019-04-05 Thread Blake Eggleston (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16810407#comment-16810407
 ] 

Blake Eggleston edited comment on CASSANDRA-15072 at 4/5/19 4:46 PM:
-

|[3.0|https://github.com/bdeggleston/cassandra/tree/15072-3.0]|[tests|https://circleci.com/gh/bdeggleston/workflows/cassandra/tree/15072-3.0]|
|[3.11|https://github.com/bdeggleston/cassandra/tree/15072-3.11]|[tests|https://circleci.com/workflow-run/4567dbed-be97-49e5-8c82-66e320e074ca]|

[~beobal] do you have time to review this? It seems to be related to 
CASSANDRA-11087.

A few notes:
 * From what I can tell, returning a row per cell is the right thing to do in 
this case, so I'm using a modified result counter to count only entire 
partitions in this specific case. However, I'm not familiar enough with all the 
dark corners of the 2.1 storage engine to be sure that's appropriate, or won't 
break something else.
 * Doing a point read with the partition key also returns a row per cell, but 
works correctly because the 2.2 coordinator seems to just discard the limit in 
that case.
 * If you're not familiar with the in-jvm dtests yet, and want to run the one 
in this patch, you'll want to run {{ant dtest-jar}} on this branch and [this 
2.2 branch|https://github.com/bdeggleston/cassandra/tree/15078-2.2], and put 
the 2.2 dtest jar in the 3.0 build directory.
 * CircleCI seems to be behind picking up new branches to test, but I'll update 
this with links to the workflows once it catches up.
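
For anyone following along, here is a rough sketch of what "count only entire partitions" could look like. It is a toy model in Python, not the code from the patch: the {{PARTITIONS}} data and the {{limit_by_partition}} helper are hypothetical names made up for illustration, and the partition order is simply copied from the output in the report.

```python
# Hypothetical model, not Cassandra code: (partition key, {column: value}).
PARTITIONS = [
    ("2", {"bar": "there", "foo": "hi"}),
    ("1", {"bar": "there", "foo": "hi"}),
]


def limit_by_partition(partitions, limit):
    """Count each complete partition once toward the limit."""
    results = []
    for key, cells in partitions:
        if len(results) >= limit:
            break
        # The partition is emitted whole, however many cells it holds.
        results.append((key, dict(cells)))
    return results


if __name__ == "__main__":
    for key, cells in limit_by_partition(PARTITIONS, 2):
        print(key, cells)
```

With a limit of 2 this model returns both partitions in full, which is the behavior the report expects. Whether counting this way is safe for every compact storage layout is the open question raised in the first note.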


was (Author: bdeggleston):
|[3.0|https://github.com/bdeggleston/cassandra/tree/15072-3.0]|
|[3.11|https://github.com/bdeggleston/cassandra/tree/15072-3.11]|

[~beobal] do you have time to review this? It seems to be related to 
CASSANDRA-11087.

A few notes:
 * From what I can tell, returning a row per cell is the right thing to do in 
this case, so I'm using a modified result counter to count only entire 
partitions in this specific case. However, I'm not familiar enough with all the 
dark corners of the 2.1 storage engine to be sure that's appropriate, or won't 
break something else.
 * Doing a point read with the partition key also returns a row per cell, but 
works correctly because the 2.2 coordinator seems to just discard the limit in 
that case.
 * If you're not familiar with the in-jvm dtests yet, and want to run the one 
in this patch, you'll want to run {{ant dtest-jar}} on this branch and [this 
2.2 branch|https://github.com/bdeggleston/cassandra/tree/15078-2.2], and put 
the 2.2 dtest jar in the 3.0 build directory.
 * CircleCI seems to be behind picking up new branches to test, but I'll update 
this with links to the workflows once it catches up.

> Incomplete range results during 2.X -> 3.11.4 upgrade
> -
>
> Key: CASSANDRA-15072
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15072
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Coordination
>Reporter: Muir Manders
>Assignee: Blake Eggleston
>Priority: High
> Attachments: eriksw-repro.sh
>
>
> Hello
> During an upgrade from 2.1.17 to 3.11.4, our application started getting 
> back incomplete results for range queries. When all nodes were upgraded 
> (before upgrading sstables), we stopped getting incomplete results. I was 
> able to reproduce it and listed steps below. It seems to require the random 
> partitioner and compact storage to reproduce reliably. It also reproduces 
> coming from 2.1.21 and 2.2.14. You seem to get the bad behavior when an old 
> node is your coordinator and it has to talk to an upgraded replica.
> {noformat}
> ccm create test -v 2.1.17 -n 3
> ccm updateconf 'partitioner: org.apache.cassandra.dht.RandomPartitioner'
> ccm node1 updateconf 'initial_token: 0'
> ccm node2 updateconf 'initial_token: 56713727820156410577229101238628035242'
> ccm node3 updateconf 'initial_token: 113427455640312821154458202477256070484'
> ccm start
> ccm node1 cqlsh <<SCHEMA
> CREATE KEYSPACE test WITH REPLICATION = {'class': 'SimpleStrategy', 
> 'replication_factor': 3};
> CREATE COLUMNFAMILY test.test (
>   id text,
>   foo text,
>   bar text,
>   PRIMARY KEY (id)
> ) WITH COMPACT STORAGE;
> CONSISTENCY QUORUM;
> INSERT INTO test.test (id, foo, bar) values ('1', 'hi', 'there');
> INSERT INTO test.test (id, foo, bar) values ('2', 'hi', 'there');
> SCHEMA
> ccm node1 stop
> ccm node1 setdir -v 3.11.4
> ccm node1 start
> ccm node2 stop
> ccm node2 setdir -v 3.11.4
> ccm node2 start
> # here I use 3.X cqlsh to connect to 2.X node so I can lower the page size (to
> # allow for simpler test setup)
> cqlsh 127.0.0.3 <<QUERY
> CONSISTENCY QUORUM;
> PAGING 2;
> select * from test.test;
> QUERY
> {noformat}
> This results in:
> {noformat}
> Page size: 2
>  id | bar   | foo
> +---+--

[jira] [Updated] (CASSANDRA-15072) Incomplete range results during 2.X -> 3.11.4 upgrade

2019-04-04 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15072:

Test and Documentation Plan: circleci / in jvm upgrade dtests
 Status: Patch Available  (was: Open)

|[3.0|https://github.com/bdeggleston/cassandra/tree/15072-3.0]|
|[3.11|https://github.com/bdeggleston/cassandra/tree/15072-3.11]|

[~beobal] do you have time to review this? It seems to be related to 
CASSANDRA-11087.

A few notes:
 * From what I can tell, returning a row per cell is the right thing to do in 
this case, so I'm using a modified result counter to count only entire 
partitions in this specific case. However, I'm not familiar enough with all the 
dark corners of the 2.1 storage engine to be sure that's appropriate, or won't 
break something else.
 * Doing a point read with the partition key also returns a row per cell, but 
works correctly because the 2.2 coordinator seems to just discard the limit in 
that case.
 * If you're not familiar with the in-jvm dtests yet, and want to run the one 
in this patch, you'll want to run {{ant dtest-jar}} on this branch and [this 
2.2 branch|https://github.com/bdeggleston/cassandra/tree/15078-2.2], and put 
the 2.2 dtest jar in the 3.0 build directory.
 * CircleCI seems to be behind picking up new branches to test, but I'll update 
this with links to the workflows once it catches up.

> Incomplete range results during 2.X -> 3.11.4 upgrade
> -
>
> Key: CASSANDRA-15072
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15072
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Coordination
>Reporter: Muir Manders
>Assignee: Blake Eggleston
>Priority: High
> Attachments: eriksw-repro.sh
>
>
> Hello
> During an upgrade from 2.1.17 to 3.11.4, our application started getting 
> back incomplete results for range queries. When all nodes were upgraded 
> (before upgrading sstables), we stopped getting incomplete results. I was 
> able to reproduce it and listed steps below. It seems to require the random 
> partitioner and compact storage to reproduce reliably. It also reproduces 
> coming from 2.1.21 and 2.2.14. You seem to get the bad behavior when an old 
> node is your coordinator and it has to talk to an upgraded replica.
> {noformat}
> ccm create test -v 2.1.17 -n 3
> ccm updateconf 'partitioner: org.apache.cassandra.dht.RandomPartitioner'
> ccm node1 updateconf 'initial_token: 0'
> ccm node2 updateconf 'initial_token: 56713727820156410577229101238628035242'
> ccm node3 updateconf 'initial_token: 113427455640312821154458202477256070484'
> ccm start
> ccm node1 cqlsh <<SCHEMA
> CREATE KEYSPACE test WITH REPLICATION = {'class': 'SimpleStrategy', 
> 'replication_factor': 3};
> CREATE COLUMNFAMILY test.test (
>   id text,
>   foo text,
>   bar text,
>   PRIMARY KEY (id)
> ) WITH COMPACT STORAGE;
> CONSISTENCY QUORUM;
> INSERT INTO test.test (id, foo, bar) values ('1', 'hi', 'there');
> INSERT INTO test.test (id, foo, bar) values ('2', 'hi', 'there');
> SCHEMA
> ccm node1 stop
> ccm node1 setdir -v 3.11.4
> ccm node1 start
> ccm node2 stop
> ccm node2 setdir -v 3.11.4
> ccm node2 start
> # here I use 3.X cqlsh to connect to 2.X node so I can lower the page size (to
> # allow for simpler test setup)
> cqlsh 127.0.0.3 <<QUERY
> CONSISTENCY QUORUM;
> PAGING 2;
> select * from test.test;
> QUERY
> {noformat}
> This results in:
> {noformat}
> Page size: 2
>  id | bar   | foo
> +---+-
>   2 | there |  hi
> (1 rows)
> {noformat}
> Running it against the upgraded node (node1):
> {noformat}
> Page size: 2
>  id | bar   | foo
> +---+-
>   2 | there |  hi
>   1 | there |  hi
> (2 rows)
> {noformat}






[jira] [Updated] (CASSANDRA-15078) Support cross version messaging in in-jvm upgrade dtests

2019-04-04 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15078:

Test and Documentation Plan: circle ci once available
 Status: Patch Available  (was: Open)

|[2.2|https://github.com/bdeggleston/cassandra/tree/15078-2.2]|
|[3.0|https://github.com/bdeggleston/cassandra/tree/15078-3.0]|
|[3.11|https://github.com/bdeggleston/cassandra/tree/15078-3.11]|
|[trunk|https://github.com/bdeggleston/cassandra/tree/15078-trunk]|

[~ifesdjeen] / [~benedict] would one of you mind taking a look at this? It's 
just adding and using hooks for getting/setting messaging versions. It also 
fixes a minor issue where the {{setup}} method wasn't getting called. These 
changes are in support of a test case for CASSANDRA-15072, feel free to take a 
look and give feedback on that as well. Circle is behind picking up new 
branches, but I'll update with the workflow links once they're available.

> Support cross version messaging in in-jvm upgrade dtests
> 
>
> Key: CASSANDRA-15078
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15078
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 2.2.15, 3.0.19, 3.11.5, 4.0
>
>







[jira] [Updated] (CASSANDRA-15078) Support cross version messaging in in-jvm upgrade dtests

2019-04-04 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15078:

 Complexity: Normal
Change Category: Quality Assurance
 Status: Open  (was: Triage Needed)

> Support cross version messaging in in-jvm upgrade dtests
> 
>
> Key: CASSANDRA-15078
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15078
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 2.2.15, 3.0.19, 3.11.5, 4.0
>
>







[jira] [Updated] (CASSANDRA-15072) Incomplete range results during 2.X -> 3.11.4 upgrade

2019-04-04 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15072:

   Severity: Normal
 Complexity: Normal
Component/s: Legacy/Coordination
 Status: Open  (was: Triage Needed)

> Incomplete range results during 2.X -> 3.11.4 upgrade
> -
>
> Key: CASSANDRA-15072
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15072
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Coordination
>Reporter: Muir Manders
>Assignee: Blake Eggleston
>Priority: High
> Attachments: eriksw-repro.sh
>
>
> Hello
> During an upgrade from 2.1.17 to 3.11.4, our application started getting 
> back incomplete results for range queries. When all nodes were upgraded 
> (before upgrading sstables), we stopped getting incomplete results. I was 
> able to reproduce it and listed steps below. It seems to require the random 
> partitioner and compact storage to reproduce reliably. It also reproduces 
> coming from 2.1.21 and 2.2.14. You seem to get the bad behavior when an old 
> node is your coordinator and it has to talk to an upgraded replica.
> {noformat}
> ccm create test -v 2.1.17 -n 3
> ccm updateconf 'partitioner: org.apache.cassandra.dht.RandomPartitioner'
> ccm node1 updateconf 'initial_token: 0'
> ccm node2 updateconf 'initial_token: 56713727820156410577229101238628035242'
> ccm node3 updateconf 'initial_token: 113427455640312821154458202477256070484'
> ccm start
> ccm node1 cqlsh <<SCHEMA
> CREATE KEYSPACE test WITH REPLICATION = {'class': 'SimpleStrategy', 
> 'replication_factor': 3};
> CREATE COLUMNFAMILY test.test (
>   id text,
>   foo text,
>   bar text,
>   PRIMARY KEY (id)
> ) WITH COMPACT STORAGE;
> CONSISTENCY QUORUM;
> INSERT INTO test.test (id, foo, bar) values ('1', 'hi', 'there');
> INSERT INTO test.test (id, foo, bar) values ('2', 'hi', 'there');
> SCHEMA
> ccm node1 stop
> ccm node1 setdir -v 3.11.4
> ccm node1 start
> ccm node2 stop
> ccm node2 setdir -v 3.11.4
> ccm node2 start
> # here I use 3.X cqlsh to connect to 2.X node so I can lower the page size (to
> # allow for simpler test setup)
> cqlsh 127.0.0.3 <<QUERY
> CONSISTENCY QUORUM;
> PAGING 2;
> select * from test.test;
> QUERY
> {noformat}
> This results in:
> {noformat}
> Page size: 2
>  id | bar   | foo
> +---+-
>   2 | there |  hi
> (1 rows)
> {noformat}
> Running it against the upgraded node (node1):
> {noformat}
> Page size: 2
>  id | bar   | foo
> +---+-
>   2 | there |  hi
>   1 | there |  hi
> (2 rows)
> {noformat}






[jira] [Assigned] (CASSANDRA-15078) Support cross version messaging in in-jvm upgrade dtests

2019-04-04 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston reassigned CASSANDRA-15078:
---

Assignee: Blake Eggleston

> Support cross version messaging in in-jvm upgrade dtests
> 
>
> Key: CASSANDRA-15078
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15078
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 2.2.15, 3.0.19, 3.11.5, 4.0
>
>







[jira] [Updated] (CASSANDRA-15078) Support cross version messaging in in-jvm upgrade dtests

2019-04-04 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15078:

Fix Version/s: 4.0
   3.11.5
   3.0.19
   2.2.15

> Support cross version messaging in in-jvm upgrade dtests
> 
>
> Key: CASSANDRA-15078
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15078
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest
>Reporter: Blake Eggleston
>Priority: Normal
> Fix For: 2.2.15, 3.0.19, 3.11.5, 4.0
>
>







[jira] [Updated] (CASSANDRA-15078) Support cross version messaging in in-jvm upgrade dtests

2019-04-04 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15078:

Component/s: Test/dtest

> Support cross version messaging in in-jvm upgrade dtests
> 
>
> Key: CASSANDRA-15078
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15078
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest
>Reporter: Blake Eggleston
>Priority: Normal
>







[jira] [Created] (CASSANDRA-15078) Support cross version messaging in in-jvm upgrade dtests

2019-04-04 Thread Blake Eggleston (JIRA)
Blake Eggleston created CASSANDRA-15078:
---

 Summary: Support cross version messaging in in-jvm upgrade dtests
 Key: CASSANDRA-15078
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15078
 Project: Cassandra
  Issue Type: Improvement
Reporter: Blake Eggleston









[jira] [Commented] (CASSANDRA-15072) Incomplete range results during 2.X -> 3.11.4 upgrade

2019-04-02 Thread Blake Eggleston (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16808193#comment-16808193
 ] 

Blake Eggleston commented on CASSANDRA-15072:
-

No problem. Yes, mixed mode just means you're upgrading your cluster.

I don't know the exact cause, but you've summarized what I think is probably 
happening. Specifically, the legacy read path on the 3.0 nodes is probably 
always interpreting single cells as rows for compact storage tables, even ones 
without clustering columns.

> Incomplete range results during 2.X -> 3.11.4 upgrade
> -
>
> Key: CASSANDRA-15072
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15072
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Muir Manders
>Assignee: Blake Eggleston
>Priority: High
> Attachments: eriksw-repro.sh
>
>
> Hello
> During an upgrade from 2.1.17 to 3.11.4, our application started getting 
> back incomplete results for range queries. When all nodes were upgraded 
> (before upgrading sstables), we stopped getting incomplete results. I was 
> able to reproduce it and listed steps below. It seems to require the random 
> partitioner and compact storage to reproduce reliably. It also reproduces 
> coming from 2.1.21 and 2.2.14. You seem to get the bad behavior when an old 
> node is your coordinator and it has to talk to an upgraded replica.
> {noformat}
> ccm create test -v 2.1.17 -n 3
> ccm updateconf 'partitioner: org.apache.cassandra.dht.RandomPartitioner'
> ccm node1 updateconf 'initial_token: 0'
> ccm node2 updateconf 'initial_token: 56713727820156410577229101238628035242'
> ccm node3 updateconf 'initial_token: 113427455640312821154458202477256070484'
> ccm start
> ccm node1 cqlsh <<SCHEMA
> CREATE KEYSPACE test WITH REPLICATION = {'class': 'SimpleStrategy', 
> 'replication_factor': 3};
> CREATE COLUMNFAMILY test.test (
>   id text,
>   foo text,
>   bar text,
>   PRIMARY KEY (id)
> ) WITH COMPACT STORAGE;
> CONSISTENCY QUORUM;
> INSERT INTO test.test (id, foo, bar) values ('1', 'hi', 'there');
> INSERT INTO test.test (id, foo, bar) values ('2', 'hi', 'there');
> SCHEMA
> ccm node1 stop
> ccm node1 setdir -v 3.11.4
> ccm node1 start
> ccm node2 stop
> ccm node2 setdir -v 3.11.4
> ccm node2 start
> # here I use 3.X cqlsh to connect to 2.X node so I can lower the page size (to
> # allow for simpler test setup)
> cqlsh 127.0.0.3 <<QUERY
> CONSISTENCY QUORUM;
> PAGING 2;
> select * from test.test;
> QUERY
> {noformat}
> This results in:
> {noformat}
> Page size: 2
>  id | bar   | foo
> +---+-
>   2 | there |  hi
> (1 rows)
> {noformat}
> Running it against the upgraded node (node1):
> {noformat}
> Page size: 2
>  id | bar   | foo
> +---+-
>   2 | there |  hi
>   1 | there |  hi
> (2 rows)
> {noformat}






[jira] [Commented] (CASSANDRA-15072) Incomplete range results during 2.X -> 3.11.4 upgrade

2019-04-02 Thread Blake Eggleston (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16808114#comment-16808114
 ] 

Blake Eggleston commented on CASSANDRA-15072:
-

Huh, I did not know that. I guess that makes sense though. So then this is just 
an upgrade bug.

> Incomplete range results during 2.X -> 3.11.4 upgrade
> -
>
> Key: CASSANDRA-15072
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15072
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Muir Manders
>Assignee: Blake Eggleston
>Priority: High
> Attachments: eriksw-repro.sh
>
>
> Hello
> During an upgrade from 2.1.17 to 3.11.4, our application started getting 
> back incomplete results for range queries. When all nodes were upgraded 
> (before upgrading sstables), we stopped getting incomplete results. I was 
> able to reproduce it and listed steps below. It seems to require the random 
> partitioner and compact storage to reproduce reliably. It also reproduces 
> coming from 2.1.21 and 2.2.14. You seem to get the bad behavior when an old 
> node is your coordinator and it has to talk to an upgraded replica.
> {noformat}
> ccm create test -v 2.1.17 -n 3
> ccm updateconf 'partitioner: org.apache.cassandra.dht.RandomPartitioner'
> ccm node1 updateconf 'initial_token: 0'
> ccm node2 updateconf 'initial_token: 56713727820156410577229101238628035242'
> ccm node3 updateconf 'initial_token: 113427455640312821154458202477256070484'
> ccm start
> ccm node1 cqlsh <<SCHEMA
> CREATE KEYSPACE test WITH REPLICATION = {'class': 'SimpleStrategy', 
> 'replication_factor': 3};
> CREATE COLUMNFAMILY test.test (
>   id text,
>   foo text,
>   bar text,
>   PRIMARY KEY (id)
> ) WITH COMPACT STORAGE;
> CONSISTENCY QUORUM;
> INSERT INTO test.test (id, foo, bar) values ('1', 'hi', 'there');
> INSERT INTO test.test (id, foo, bar) values ('2', 'hi', 'there');
> SCHEMA
> ccm node1 stop
> ccm node1 setdir -v 3.11.4
> ccm node1 start
> ccm node2 stop
> ccm node2 setdir -v 3.11.4
> ccm node2 start
> # here I use 3.X cqlsh to connect to 2.X node so I can lower the page size (to
> # allow for simpler test setup)
> cqlsh 127.0.0.3 <<QUERY
> CONSISTENCY QUORUM;
> PAGING 2;
> select * from test.test;
> QUERY
> {noformat}
> This results in:
> {noformat}
> Page size: 2
>  id | bar   | foo
> +---+-
>   2 | there |  hi
> (1 rows)
> {noformat}
> Running it against the upgraded node (node1):
> {noformat}
> Page size: 2
>  id | bar   | foo
> +---+-
>   2 | there |  hi
>   1 | there |  hi
> (2 rows)
> {noformat}






[jira] [Commented] (CASSANDRA-15072) Incomplete range results during 2.X -> 3.11.4 upgrade

2019-04-02 Thread Blake Eggleston (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16808019#comment-16808019
 ] 

Blake Eggleston commented on CASSANDRA-15072:
-

This is a great repro script, thanks. 

A couple of observations:
 * test.test has 2 columns, and uses compact storage, which shouldn’t be 
possible
 * node1 & node3 are the replicas of the missing partition (we’re querying from 
the un-upgraded node2, for those following along).
 * doing a point read ({{select * from test.test where id='1';}}) returns the 
expected partition
 * using LIMIT 2 instead of PAGING 2 has the same problem
 * LIMIT 3 returns a partial row: {{1 | there | null}}
 * LIMIT 4 returns the entire row: {{1 | there |  hi}}

Tables with compact storage can only have a single column, so you shouldn’t be 
able to create a compact storage table with 2 columns. Instead of throwing an 
error though, it seems like it just silently treats the table as a normal 
table. This might be why no one has noticed that our ddl validation is broken.

It looks like the mixed mode read path is treating the table as a proper 
compact storage table though, and treating each cell as a row, which is why you 
see partial rows start to appear as you increase the limit. If you remove 
compact storage from the ddl, or only use a single column, everything works 
normally.

I'll think on the best way to address this.
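
To make the LIMIT observations above concrete, here is a small Python model of a read path that charges one "row" against the limit for every cell. It is only an illustration of the suspected behavior, not Cassandra code: {{PARTITIONS}} and {{read_with_cell_limit}} are hypothetical names, the partition order is copied from the output in the report, and cells are assumed to be visited in column-name order.

```python
# Hypothetical model, not Cassandra code: (partition key, {column: value}).
PARTITIONS = [
    ("2", {"bar": "there", "foo": "hi"}),
    ("1", {"bar": "there", "foo": "hi"}),
]


def read_with_cell_limit(partitions, limit):
    """Apply LIMIT as if every cell counted as one row."""
    results = []
    remaining = limit
    for key, cells in partitions:
        if remaining <= 0:
            break
        kept = {}
        # Assumption: cells are visited in column-name order (bar, then foo).
        for name in sorted(cells):
            if remaining <= 0:
                break
            kept[name] = cells[name]
            remaining -= 1
        # Columns cut off by the limit show up as None (null in cqlsh).
        results.append((key, kept.get("bar"), kept.get("foo")))
    return results


if __name__ == "__main__":
    for limit in (2, 3, 4):
        print(limit, read_with_cell_limit(PARTITIONS, limit))
```

Under these assumptions a limit of 2 yields only partition 2, a limit of 3 adds the partial row for partition 1 (bar set, foo null), and a limit of 4 returns both partitions in full, matching the three LIMIT cases listed above.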

> Incomplete range results during 2.X -> 3.11.4 upgrade
> -
>
> Key: CASSANDRA-15072
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15072
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Muir Manders
>Assignee: Blake Eggleston
>Priority: High
> Attachments: eriksw-repro.sh
>
>
> Hello
> During an upgrade from 2.1.17 to 3.11.4, our application started getting 
> back incomplete results for range queries. When all nodes were upgraded 
> (before upgrading sstables), we stopped getting incomplete results. I was 
> able to reproduce it and listed steps below. It seems to require the random 
> partitioner and compact storage to reproduce reliably. It also reproduces 
> coming from 2.1.21 and 2.2.14. You seem to get the bad behavior when an old 
> node is your coordinator and it has to talk to an upgraded replica.
> {noformat}
> ccm create test -v 2.1.17 -n 3
> ccm updateconf 'partitioner: org.apache.cassandra.dht.RandomPartitioner'
> ccm node1 updateconf 'initial_token: 0'
> ccm node2 updateconf 'initial_token: 56713727820156410577229101238628035242'
> ccm node3 updateconf 'initial_token: 113427455640312821154458202477256070484'
> ccm start
> ccm node1 cqlsh <<SCHEMA
> CREATE KEYSPACE test WITH REPLICATION = {'class': 'SimpleStrategy', 
> 'replication_factor': 3};
> CREATE COLUMNFAMILY test.test (
>   id text,
>   foo text,
>   bar text,
>   PRIMARY KEY (id)
> ) WITH COMPACT STORAGE;
> CONSISTENCY QUORUM;
> INSERT INTO test.test (id, foo, bar) values ('1', 'hi', 'there');
> INSERT INTO test.test (id, foo, bar) values ('2', 'hi', 'there');
> SCHEMA
> ccm node1 stop
> ccm node1 setdir -v 3.11.4
> ccm node1 start
> ccm node2 stop
> ccm node2 setdir -v 3.11.4
> ccm node2 start
> # here I use 3.X cqlsh to connect to 2.X node so I can lower the page size (to
> # allow for simpler test setup)
> cqlsh 127.0.0.3 <<QUERY
> CONSISTENCY QUORUM;
> PAGING 2;
> select * from test.test;
> QUERY
> {noformat}
> This results in:
> {noformat}
> Page size: 2
>  id | bar   | foo
> +---+-
>   2 | there |  hi
> (1 rows)
> {noformat}
> Running it against the upgraded node (node1):
> {noformat}
> Page size: 2
>  id | bar   | foo
> +---+-
>   2 | there |  hi
>   1 | there |  hi
> (2 rows)
> {noformat}






[jira] [Commented] (CASSANDRA-15072) Incomplete range results during 2.X -> 3.11.4 upgrade

2019-04-01 Thread Blake Eggleston (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16807327#comment-16807327
 ] 

Blake Eggleston commented on CASSANDRA-15072:
-

Ok, I can repro your issue with the updated script. It looks like you’re 
hitting a commit log bug that was introduced in 2.1 and fixed in 3.0 
(CASSANDRA-13987). If you drain nodes 1 & 2 before shutting them down, this 
should stop happening.

I’d also expected that putting a sleep longer than the commit log sync interval 
before shutting down node 1 would fix the problem, but it didn’t. I’m still 
looking at why that is.

When you say:
{quote}When all nodes were upgraded (before upgrading sstables), we stopped 
getting incomplete results
{quote}
do you mean data you'd inserted before the upgrade reappeared?
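
In case it helps, the drain suggestion could be folded into the repro script roughly as sketched below. This Python helper only builds and prints the command lines; it does not run them. The stop/setdir/start commands are taken from the script in the description, while {{upgrade_commands}} is a hypothetical name and the {{ccm <node> nodetool drain}} form assumes ccm forwards nodetool subcommands that way.

```python
def upgrade_commands(node, version, drain_first=True):
    """Return the ccm commands for upgrading one node, in order."""
    commands = []
    if drain_first:
        # Assumption: ccm forwards nodetool subcommands as
        # `ccm <node> nodetool <cmd>`; drain flushes before shutdown.
        commands.append(["ccm", node, "nodetool", "drain"])
    # These three steps mirror the repro script in the issue description.
    commands.append(["ccm", node, "stop"])
    commands.append(["ccm", node, "setdir", "-v", version])
    commands.append(["ccm", node, "start"])
    return commands


if __name__ == "__main__":
    for node in ("node1", "node2"):
        for cmd in upgrade_commands(node, "3.11.4"):
            print(" ".join(cmd))
```

Printing the commands rather than executing them keeps the sketch safe to run anywhere; piping the output into a shell would be one way to apply it to a test cluster.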

> Incomplete range results during 2.X -> 3.11.4 upgrade
> -
>
> Key: CASSANDRA-15072
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15072
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Muir Manders
>Priority: High
>
> Hello
> During an upgrade from 2.1.17 to 3.11.4, our application started getting 
> back incomplete results for range queries. When all nodes were upgraded 
> (before upgrading sstables), we stopped getting incomplete results. I was 
> able to reproduce it and listed steps below. It seems to require the random 
> partitioner and compact storage to reproduce reliably. It also reproduces 
> coming from 2.1.21 and 2.2.14. You seem to get the bad behavior when an old 
> node is your coordinator and it has to talk to an upgraded replica.
> {noformat}
> ccm create test -v 2.1.17 -n 3
> ccm updateconf 'partitioner: org.apache.cassandra.dht.RandomPartitioner'
> ccm node1 updateconf 'initial_token: 0'
> ccm node2 updateconf 'initial_token: 56713727820156410577229101238628035242'
> ccm node3 updateconf 'initial_token: 113427455640312821154458202477256070484'
> ccm start
> ccm node1 cqlsh <<SCHEMA
> CREATE KEYSPACE test WITH REPLICATION = {'class': 'SimpleStrategy', 'replication_factor': 3};
> CREATE COLUMNFAMILY test.test (
>   id text,
>   foo text,
>   bar text,
>   PRIMARY KEY (id)
> ) WITH COMPACT STORAGE;
> CONSISTENCY QUORUM;
> INSERT INTO test.test (id, foo, bar) values ('1', 'hi', 'there');
> INSERT INTO test.test (id, foo, bar) values ('2', 'hi', 'there');
> SCHEMA
> ccm node1 stop
> ccm node1 setdir -v 3.11.4
> ccm node1 start
> ccm node2 stop
> ccm node2 setdir -v 3.11.4
> ccm node2 start
> # here I use 3.X cqlsh to connect to 2.X node so I can lower the page size (to
> # allow for simpler test setup)
> cqlsh 127.0.0.3 <<QUERY
> CONSISTENCY QUORUM;
> PAGING 2;
> select * from test.test;
> QUERY
> {noformat}
> This results in:
> {noformat}
> Page size: 2
>  id | bar   | foo
> ----+-------+-----
>   2 | there |  hi
> (1 rows)
> {noformat}
> Running it against the upgraded node (node1):
> {noformat}
> Page size: 2
>  id | bar   | foo
> ----+-------+-----
>   2 | there |  hi
>   1 | there |  hi
> (2 rows)
> {noformat}






[jira] [Assigned] (CASSANDRA-15072) Incomplete range results during 2.X -> 3.11.4 upgrade

2019-04-01 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston reassigned CASSANDRA-15072:
---

Assignee: Blake Eggleston

> Incomplete range results during 2.X -> 3.11.4 upgrade
> -
>
> Key: CASSANDRA-15072
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15072
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Muir Manders
>Assignee: Blake Eggleston
>Priority: High
>
> Hello
> During an upgrade from 2.1.17 to 3.11.4, our application started getting 
> back incomplete results for range queries. When all nodes were upgraded 
> (before upgrading sstables), we stopped getting incomplete results. I was 
> able to reproduce it and listed steps below. It seems to require the random 
> partitioner and compact storage to reproduce reliably. It also reproduces 
> coming from 2.1.21 and 2.2.14. You seem to get the bad behavior when an old 
> node is your coordinator and it has to talk to an upgraded replica.
> {noformat}
> ccm create test -v 2.1.17 -n 3
> ccm updateconf 'partitioner: org.apache.cassandra.dht.RandomPartitioner'
> ccm node1 updateconf 'initial_token: 0'
> ccm node2 updateconf 'initial_token: 56713727820156410577229101238628035242'
> ccm node3 updateconf 'initial_token: 113427455640312821154458202477256070484'
> ccm start
> ccm node1 cqlsh <<SCHEMA
> CREATE KEYSPACE test WITH REPLICATION = {'class': 'SimpleStrategy', 'replication_factor': 3};
> CREATE COLUMNFAMILY test.test (
>   id text,
>   foo text,
>   bar text,
>   PRIMARY KEY (id)
> ) WITH COMPACT STORAGE;
> CONSISTENCY QUORUM;
> INSERT INTO test.test (id, foo, bar) values ('1', 'hi', 'there');
> INSERT INTO test.test (id, foo, bar) values ('2', 'hi', 'there');
> SCHEMA
> ccm node1 stop
> ccm node1 setdir -v 3.11.4
> ccm node1 start
> ccm node2 stop
> ccm node2 setdir -v 3.11.4
> ccm node2 start
> # here I use 3.X cqlsh to connect to 2.X node so I can lower the page size (to
> # allow for simpler test setup)
> cqlsh 127.0.0.3 <<QUERY
> CONSISTENCY QUORUM;
> PAGING 2;
> select * from test.test;
> QUERY
> {noformat}
> This results in:
> {noformat}
> Page size: 2
>  id | bar   | foo
> ----+-------+-----
>   2 | there |  hi
> (1 rows)
> {noformat}
> Running it against the upgraded node (node1):
> {noformat}
> Page size: 2
>  id | bar   | foo
> ----+-------+-----
>   2 | there |  hi
>   1 | there |  hi
> (2 rows)
> {noformat}






[jira] [Commented] (CASSANDRA-15072) Incomplete range results during 2.X -> 3.11.4 upgrade

2019-04-01 Thread Blake Eggleston (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16807185#comment-16807185
 ] 

Blake Eggleston commented on CASSANDRA-15072:
-

Are you seeing incomplete results like this in a real cluster? If so, what 
consistency level are you reading and writing at?

The ccm script you have here _does_ return incomplete results, but it’s also 
writing and reading at CL ONE (the cqlsh default), so that’s not unexpected. I 
modified the script here to read and write at QUORUM, and haven't gotten any 
incomplete results.
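
The CL ONE versus QUORUM point above comes down to replica-set overlap. As a rough sketch (the class and method names below are illustrative only, not Cassandra APIs), a read is only guaranteed to see a prior write when the number of replicas read plus the number written exceeds the replication factor:

```java
// Sketch only: ConsistencyMath and its methods are hypothetical helpers,
// not part of Cassandra.
final class ConsistencyMath {
    private ConsistencyMath() {}

    /** Replicas required for QUORUM at replication factor rf: floor(rf / 2) + 1. */
    static int quorum(int rf) {
        return rf / 2 + 1;
    }

    /** True when every read set must intersect every write set (R + W > RF). */
    static boolean readsSeeWrites(int readReplicas, int writeReplicas, int rf) {
        return readReplicas + writeReplicas > rf;
    }
}
```

With the replication factor of 3 used in the script, ONE/ONE gives 1 + 1 = 2, which does not exceed 3, so a range read may legitimately miss a row. QUORUM/QUORUM gives 2 + 2 = 4, which does.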

> Incomplete range results during 2.X -> 3.11.4 upgrade
> -
>
> Key: CASSANDRA-15072
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15072
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Muir Manders
>Priority: High
>
> Hello
> During an upgrade from 2.1.17 to 3.11.4, our application started getting 
> back incomplete results for range queries. When all nodes were upgraded 
> (before upgrading sstables), we stopped getting incomplete results. I was 
> able to reproduce it and listed steps below. It seems to require the random 
> partitioner and compact storage to reproduce reliably. It also reproduces 
> coming from 2.1.21 and 2.2.14.
> {noformat}
> ccm create test -v 2.1.17 -n 3
> ccm updateconf 'partitioner: org.apache.cassandra.dht.RandomPartitioner'
> ccm node1 updateconf 'initial_token: 0'
> ccm node2 updateconf 'initial_token: 56713727820156410577229101238628035242'
> ccm node3 updateconf 'initial_token: 113427455640312821154458202477256070484'
> ccm start
> ccm node1 cqlsh <<SCHEMA
> CREATE KEYSPACE test WITH REPLICATION = {'class': 'SimpleStrategy', 'replication_factor': 3};
> CREATE COLUMNFAMILY test.test (
>   id text,
>   foo text,
>   bar text,
>   PRIMARY KEY (id)
> ) WITH COMPACT STORAGE;
> INSERT INTO test.test (id, foo, bar) values ('1', 'hi', 'there');
> INSERT INTO test.test (id, foo, bar) values ('2', 'hi', 'there');
> SCHEMA
> ccm node1 stop
> ccm node1 setdir -v 3.11.4
> ccm node1 start
> # need to use new cqlsh so we can configure page size
> cqlsh 127.0.0.2 <<QUERY
> PAGING 2;
> select * from test.test;
> QUERY
> {noformat}
> This results in:
> {noformat}
> Page size: 2
>  id | bar   | foo
> ----+-------+-----
>   2 | there |  hi
> (1 rows)
> {noformat}
> Running it against the upgraded node (node1):
> {noformat}
> Page size: 2
>  id | bar   | foo
> ----+-------+-----
>   2 | there |  hi
>   1 | there |  hi
> (2 rows)
> {noformat}






[jira] [Commented] (CASSANDRA-10726) Read repair inserts should not be blocking

2019-03-28 Thread Blake Eggleston (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-10726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16804130#comment-16804130
 ] 

Blake Eggleston commented on CASSANDRA-10726:
-

4.0 only, I'm afraid. It's built on a refactor that's too invasive to backport 
to 3.11.

> Read repair inserts should not be blocking
> --
>
> Key: CASSANDRA-10726
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10726
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/Coordination
>Reporter: Richard Low
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 4.0
>
>
> Today, if there’s a digest mismatch in a foreground read repair, the insert 
> to update out of date replicas is blocking. This means, if it fails, the read 
> fails with a timeout. If a node is dropping writes (maybe it is overloaded or 
> the mutation stage is backed up for some other reason), all reads to a 
> replica set could fail. Further, replicas dropping writes get more out of 
> sync so will require more read repair.
> The comment on the code for why the writes are blocking is:
> {code}
> // wait for the repair writes to be acknowledged, to minimize impact on any 
> replica that's
> // behind on writes in case the out-of-sync row is read multiple times in 
> quick succession
> {code}
> but the bad side effect is that reads time out. Either the writes should not 
> be blocking or we should return success for the read even if the write times 
> out.
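
One of the two options in the description (return success for the read even if the repair write times out) could look roughly like the sketch below. This is not the 4.0 implementation; the class, method, and parameter names are all hypothetical, and the repair write is modelled as a plain CompletableFuture.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch: a read that waits briefly for its repair write but
// never fails because of it.
final class ReadRepairSketch {
    private ReadRepairSketch() {}

    static <T> T readWithBestEffortRepair(T mergedResult,
                                          CompletableFuture<Void> repairWrite,
                                          long timeoutMillis) {
        try {
            // Give the out-of-date replicas a bounded chance to acknowledge.
            repairWrite.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException | ExecutionException e) {
            // Assumed policy: a slow or failed repair write is tolerated;
            // the replica stays out of sync until a later repair.
        } catch (InterruptedException e) {
            // Preserve the interrupt for the caller, still return the data.
            Thread.currentThread().interrupt();
        }
        return mergedResult;
    }
}
```

The trade-off named in the code comment quoted above still applies: a replica that misses the repair write may trigger another digest mismatch if the same row is read again soon after.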






[jira] [Commented] (CASSANDRA-15059) Gossiper#markAlive can race with Gossiper#markDead

2019-03-27 Thread Blake Eggleston (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16803112#comment-16803112
 ] 

Blake Eggleston commented on CASSANDRA-15059:
-

{quote}
Going forward, this doesn't address the underlying fragility: Gossiper doesn't 
document what is safe to call from outside the Gossip thread and what isn't, 
and it doesn't validate that the correct thread is running a given method.
I think we want to add two (or three?) interfaces that Gossiper can implement. 
One is for methods that are non-mutating and safe to call from any thread. The 
other is for methods that should only be called from the Gossip stage 
thread. And maybe a third which contains methods that mutate, but block on the 
Gossip stage. So Gossiper.instance would go away and you would have a reference 
to the Gossiper via one of the 2-3 interfaces.
Additionally, any method in the Gossip-stage-only interface should 
have a Preconditions.checkState checking that it's actually running in the 
Gossip stage.
{quote}

Do you mean as part of this ticket, or as future improvements to gossip?
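
For reference, the interface split proposed in the quote could be sketched as below. Everything here is hypothetical (the interface names, the class, and the string endpoints are made up for illustration), the optional third "mutate but block on the stage" interface is left out, and a plain IllegalStateException stands in for Guava's Preconditions.checkState so the example has no dependencies.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical: non-mutating methods that are safe to call from any thread.
interface GossipView {
    boolean isAlive(String endpoint);
}

// Hypothetical: mutating methods that must only run on the gossip stage thread.
interface GossipStageOps {
    void markAlive(String endpoint);
    void markDead(String endpoint);
}

final class ThreadCheckedGossip implements GossipView, GossipStageOps {
    private final Thread stageThread;
    private final Set<String> live = ConcurrentHashMap.newKeySet();

    ThreadCheckedGossip(Thread stageThread) {
        this.stageThread = stageThread;
    }

    // Equivalent in spirit to Preconditions.checkState(onGossipStage).
    private void checkOnStage() {
        if (Thread.currentThread() != stageThread)
            throw new IllegalStateException("must run on the gossip stage thread");
    }

    @Override public boolean isAlive(String endpoint) { return live.contains(endpoint); }

    @Override public void markAlive(String endpoint) { checkOnStage(); live.add(endpoint); }

    @Override public void markDead(String endpoint) { checkOnStage(); live.remove(endpoint); }
}
```

Callers that only need to read state would hold a GossipView reference, so a call to markAlive from the wrong thread would fail at compile time for most code and at runtime for the rest.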

> Gossiper#markAlive can race with Gossiper#markDead
> --
>
> Key: CASSANDRA-15059
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15059
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Gossip
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
>
> The Gossiper class is not threadsafe and assumes all state changes happen in 
> a single thread (the gossip stage). Gossiper#convict, however, can be called 
> from the GossipTasks thread. This creates a race where calls to 
> Gossiper#markAlive and Gossiper#markDead can interleave, corrupting gossip 
> state. Gossiper#assassinateEndpoint has a similar problem, being called from 
> the mbean server thread.






[jira] [Updated] (CASSANDRA-14607) Explore optimizations in AbstractBtreePartiton, java 11 variant

2019-03-25 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-14607:

Status: Resolved  (was: Ready to Commit)

Committed to trunk as {{9063ceabc9a41825a09db811f74821537dfe726b}}, thanks!

> Explore optimizations in AbstractBtreePartiton, java 11 variant
> ---
>
> Key: CASSANDRA-14607
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14607
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/Core
>Reporter: Jason Brown
>Assignee: Blake Eggleston
>Priority: Normal
>  Labels: Java11
> Fix For: 4.0
>
>
> In CASSANDRA-9608, we discussed some way to optimize the java 11 
> implementation of {{AbstractBTreePartition}}. This ticket serves that 
> purpose, as well as a "note to selves" to ensure the java 11 version does not 
> have a performance regression.






[jira] [Updated] (CASSANDRA-14607) Explore optimizations in AbstractBtreePartiton, java 11 variant

2019-03-25 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-14607:

Status: Review In Progress  (was: Patch Available)

> Explore optimizations in AbstractBtreePartiton, java 11 variant
> ---
>
> Key: CASSANDRA-14607
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14607
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/Core
>Reporter: Jason Brown
>Assignee: Blake Eggleston
>Priority: Normal
>  Labels: Java11
> Fix For: 4.0
>
>
> In CASSANDRA-9608, we discussed some way to optimize the java 11 
> implementation of {{AbstractBTreePartition}}. This ticket serves that 
> purpose, as well as a "note to selves" to ensure the java 11 version does not 
> have a performance regression.






[jira] [Updated] (CASSANDRA-14607) Explore optimizations in AbstractBtreePartiton, java 11 variant

2019-03-25 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-14607:

Status: Ready to Commit  (was: Review In Progress)

> Explore optimizations in AbstractBtreePartiton, java 11 variant
> ---
>
> Key: CASSANDRA-14607
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14607
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/Core
>Reporter: Jason Brown
>Assignee: Blake Eggleston
>Priority: Normal
>  Labels: Java11
> Fix For: 4.0
>
>
> In CASSANDRA-9608, we discussed some way to optimize the java 11 
> implementation of {{AbstractBTreePartition}}. This ticket serves that 
> purpose, as well as a "note to selves" to ensure the java 11 version does not 
> have a performance regression.






[jira] [Updated] (CASSANDRA-15059) Gossiper#markAlive can race with Gossiper#markDead

2019-03-22 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15059:

Test and Documentation Plan: CircleCI runs look good
 Status: Patch Available  (was: Open)

|[3.0|https://github.com/bdeggleston/cassandra/tree/15059-3.0]|[circle|https://circleci.com/workflow-run/a863b2af-db01-4f7a-b059-a0ed2496f286]|
|[3.11|https://github.com/bdeggleston/cassandra/tree/15059-3.11]|[circle|https://circleci.com/workflow-run/c271d33b-92c4-4579-80ce-cab20f4aac6c]|
|[trunk|https://github.com/bdeggleston/cassandra/tree/15059-trunk]|[circle|https://circleci.com/workflow-run/d476de41-c374-40f1-bdae-b464149b703a]|

> Gossiper#markAlive can race with Gossiper#markDead
> --
>
> Key: CASSANDRA-15059
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15059
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Gossip
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
>
> The Gossiper class is not threadsafe and assumes all state changes happen in 
> a single thread (the gossip stage). Gossiper#convict, however, can be called 
> from the GossipTasks thread. This creates a race where calls to 
> Gossiper#markAlive and Gossiper#markDead can interleave, corrupting gossip 
> state. Gossiper#assassinateEndpoint has a similar problem, being called from 
> the mbean server thread.






[jira] [Updated] (CASSANDRA-15059) Gossiper#markAlive can race with Gossiper#markDead

2019-03-22 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15059:

     Severity: Low
   Complexity: Normal
Discovered By: User Report
 Bug Category: Parent values: Degradation(12984); Level 1 values: Other Exception(12998)
  Component/s: Cluster/Gossip
       Status: Open  (was: Triage)

> Gossiper#markAlive can race with Gossiper#markDead
> --
>
> Key: CASSANDRA-15059
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15059
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Gossip
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
>
> The Gossiper class is not threadsafe and assumes all state changes happen in 
> a single thread (the gossip stage). Gossiper#convict, however, can be called 
> from the GossipTasks thread. This creates a race where calls to 
> Gossiper#markAlive and Gossiper#markDead can interleave, corrupting gossip 
> state. Gossiper#assassinateEndpoint has a similar problem, being called from 
> the mbean server thread.






[jira] [Updated] (CASSANDRA-14459) DynamicEndpointSnitch should never prefer latent nodes

2019-03-21 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-14459:

Reviewers: Ariel Weisberg, Blake Eggleston  (was: Ariel Weisberg)

> DynamicEndpointSnitch should never prefer latent nodes
> --
>
> Key: CASSANDRA-14459
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14459
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/Coordination
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Low
>  Labels: 4.0-feature-freeze-review-requested, 
> pull-request-available
> Fix For: 4.x
>
>  Time Spent: 23h
>  Remaining Estimate: 0h
>
> The DynamicEndpointSnitch has two unfortunate behaviors that allow it to 
> provide latent hosts as replicas:
>  # Loses all latency information when Cassandra restarts
>  # Clears latency information entirely every ten minutes (by default), 
> allowing global queries to be routed to _other datacenters_ (and local 
> queries cross racks/azs)
> This means that the first few queries after restart/reset could be quite slow 
> compared to average latencies. I propose we solve this by resetting to the 
> minimum observed latency instead of completely clearing the samples and 
> extending the {{isLatencyForSnitch}} idea to a three state variable instead 
> of two, in particular {{YES}}, {{NO}}, {{MAYBE}}. This extension allows 
> {{EchoMessages}} and {{PingMessages}} to send {{MAYBE}} indicating that the 
> DS should use those measurements if it only has one or fewer samples for a 
> host. This fixes both problems because on process restart we send out 
> {{PingMessages}} / {{EchoMessages}} as part of startup, and we would reset to 
> effectively the RTT of the hosts (also at that point normal gossip 
> {{EchoMessages}} have an opportunity to add an additional latency 
> measurement).
> This strategy also nicely deals with the "a host got slow but now it's fine" 
> problem that the DS resets were (afaik) designed to stop because the 
> {{EchoMessage}} ping latency will count only after the reset for that host. 
> Ping latency is a more reasonable lower bound on host latency (as opposed to 
> status quo of zero).
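
The two ideas in the proposal (a three-state flag, and resetting to the minimum observed latency rather than clearing) can be sketched as follows. This is an illustration under assumptions, not the DynamicEndpointSnitch code: the enum, class, and method names are hypothetical, and hosts are plain strings.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Three-state replacement for a boolean "is this latency for the snitch?" flag.
enum LatencyUse { YES, NO, MAYBE }

// Hypothetical sample store illustrating the proposal.
final class LatencySamplesSketch {
    private final Map<String, List<Long>> samplesByHost = new HashMap<>();

    synchronized void record(String host, long latencyMicros, LatencyUse use) {
        if (use == LatencyUse.NO)
            return;
        List<Long> samples = samplesByHost.computeIfAbsent(host, h -> new ArrayList<>());
        // MAYBE measurements (pings, echoes) only count while the host has
        // one or fewer samples.
        if (use == LatencyUse.MAYBE && samples.size() > 1)
            return;
        samples.add(latencyMicros);
    }

    // Periodic reset: keep each host's minimum observed latency instead of
    // dropping all information about it.
    synchronized void reset() {
        for (List<Long> samples : samplesByHost.values()) {
            if (samples.isEmpty())
                continue;
            long min = Collections.min(samples);
            samples.clear();
            samples.add(min);
        }
    }

    synchronized int sampleCount(String host) {
        List<Long> samples = samplesByHost.get(host);
        return samples == null ? 0 : samples.size();
    }
}
```

After a reset each host is left with exactly one sample, so the next MAYBE measurement is accepted, which is how ping latency becomes the lower bound described above.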






[jira] [Created] (CASSANDRA-15059) Gossiper#markAlive can race with Gossiper#markDead

2019-03-21 Thread Blake Eggleston (JIRA)
Blake Eggleston created CASSANDRA-15059:
---

 Summary: Gossiper#markAlive can race with Gossiper#markDead
 Key: CASSANDRA-15059
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15059
 Project: Cassandra
  Issue Type: Bug
Reporter: Blake Eggleston
Assignee: Blake Eggleston


The Gossiper class is not threadsafe and assumes all state changes happen in a 
single thread (the gossip stage). Gossiper#convict, however, can be called from 
the GossipTasks thread. This creates a race where calls to Gossiper#markAlive 
and Gossiper#markDead can interleave, corrupting gossip state. 
Gossiper#assassinateEndpoint has a similar problem, being called from the mbean 
server thread.
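
One general way to close this kind of race is to funnel every state change through the single stage thread, so that a convict call from another thread is queued rather than applied directly. The sketch below shows that pattern only; it is not the patch for this ticket, and the class name, method names, and string endpoints are hypothetical.

```java
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch: all mutations are serialized on one executor thread.
final class SerializedGossipSketch implements AutoCloseable {
    private final ExecutorService stage = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "gossip-stage-sketch");
        t.setDaemon(true);
        return t;
    });
    // Concurrent set so that reads from other threads see published updates.
    private final Set<String> live = ConcurrentHashMap.newKeySet();

    // Safe to call from any thread: the change is applied on the stage thread.
    CompletableFuture<Void> markAlive(String endpoint) {
        return CompletableFuture.runAsync(() -> live.add(endpoint), stage);
    }

    // Safe to call from any thread, e.g. a failure detector task.
    CompletableFuture<Void> convict(String endpoint) {
        return CompletableFuture.runAsync(() -> live.remove(endpoint), stage);
    }

    boolean isAlive(String endpoint) {
        return live.contains(endpoint);
    }

    @Override
    public void close() {
        stage.shutdown();
    }
}
```

Because the executor has a single thread, the mark-alive and convict tasks run one after another in submission order and cannot interleave partway through.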






[jira] [Updated] (CASSANDRA-14607) Explore optimizations in AbstractBtreePartiton, java 11 variant

2019-03-20 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-14607:

Reviewers: Benedict

> Explore optimizations in AbstractBtreePartiton, java 11 variant
> ---
>
> Key: CASSANDRA-14607
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14607
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/Core
>Reporter: Jason Brown
>Assignee: Blake Eggleston
>Priority: Normal
>  Labels: Java11
> Fix For: 4.0
>
>
> In CASSANDRA-9608, we discussed some way to optimize the java 11 
> implementation of {{AbstractBTreePartition}}. This ticket serves that 
> purpose, as well as a "note to selves" to ensure the java 11 version does not 
> have a performance regression.






[jira] [Updated] (CASSANDRA-14607) Explore optimizations in AbstractBtreePartiton, java 11 variant

2019-03-20 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-14607:

Test and Documentation Plan: 
CircleCI run looks good: 
https://circleci.com/workflow-run/2680b5b1-e7a3-4ef8-92ae-ba238ea6a1bc

all j11 failures are also failing in trunk
 Status: Patch Available  (was: Open)

|[trunk|https://github.com/bdeggleston/cassandra/tree/14607-trunk-2]|[circle|https://circleci.com/workflow-run/2680b5b1-e7a3-4ef8-92ae-ba238ea6a1bc]|

> Explore optimizations in AbstractBtreePartiton, java 11 variant
> ---
>
> Key: CASSANDRA-14607
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14607
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/Core
>Reporter: Jason Brown
>Assignee: Blake Eggleston
>Priority: Normal
>  Labels: Java11
> Fix For: 4.0
>
>
> In CASSANDRA-9608, we discussed some way to optimize the java 11 
> implementation of {{AbstractBTreePartition}}. This ticket serves that 
> purpose, as well as a "note to selves" to ensure the java 11 version does not 
> have a performance regression.






[jira] [Commented] (CASSANDRA-15053) Fix handling FS errors on writing and reading flat files - LogTransaction and hints

2019-03-15 Thread Blake Eggleston (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16793750#comment-16793750
 ] 

Blake Eggleston commented on CASSANDRA-15053:
-

+1

> Fix handling FS errors on writing and reading flat files - LogTransaction and 
> hints
> ---
>
> Key: CASSANDRA-15053
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15053
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Hints
>Reporter: Jeff Jirsa
>Assignee: Jeff Jirsa
>Priority: Normal
> Fix For: 4.0, 3.0.x, 3.11.x
>
>
> We currently fail to handle and propagate IO errors when dealing with 
> transaction log and hints.  It's trivial to fix this behaviour to ensure that 
> disk failure policy is properly invoked in error scenarios.






[jira] [Assigned] (CASSANDRA-14607) Explore optimizations in AbstractBtreePartiton, java 11 variant

2019-03-13 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston reassigned CASSANDRA-14607:
---

Assignee: Blake Eggleston

> Explore optimizations in AbstractBtreePartiton, java 11 variant
> ---
>
> Key: CASSANDRA-14607
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14607
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/Core
>Reporter: Jason Brown
>Assignee: Blake Eggleston
>Priority: Normal
>  Labels: Java11
> Fix For: 4.0
>
>
> In CASSANDRA-9608, we discussed some way to optimize the java 11 
> implementation of {{AbstractBTreePartition}}. This ticket serves that 
> purpose, as well as a "note to selves" to ensure the java 11 version does not 
> have a performance regression.






[jira] [Comment Edited] (CASSANDRA-14607) Explore optimizations in AbstractBtreePartiton, java 11 variant

2019-03-12 Thread Blake Eggleston (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16791278#comment-16791278
 ] 

Blake Eggleston edited comment on CASSANDRA-14607 at 3/13/19 4:21 AM:
--

I think we could get the java 8 behavior in java 11 by moving the update logic 
of {{addAllWithSizeDelta}} into a private method, shuffling the 'shouldLock' 
logic around, and just using a {{synchronized}} block. This would work in both 
java 8 and java 11, so we wouldn't need the java8 and java11 source directories.

I pushed up a branch showing how this would work here: 
[https://github.com/bdeggleston/cassandra/tree/14607]

I may be missing some nuance in the usage of {{monitorEnter/Exit}} (I couldn't 
find a lot of documentation on the method, or explanations for why we were 
using it), but I think this would be functionally identical to what we were 
doing with {{monitorEnter/Exit}}.

WDYT [~benedict] & [~jasobrown]?

Edit: That branch is just for illustration, it's not a finished patch. At a 
minimum it still needs to fix the handling of {{inputDeletionInfoCopy}}, 
condense the {{AbstractBTreePartition}} class structure, and remove the java8 
and java11 source directories.


was (Author: bdeggleston):
I think we could get the java 8 behavior in java 11 by moving the update logic 
of {{addAllWithSizeDelta}} into a private method, shuffling the 'shouldLock' 
logic around, and just using a {{synchronized}} block. This would work in both 
java 8 and java 11, so we wouldn't need the java8 and java11 source directories.

I pushed up a branch showing how this would work here: 
[https://github.com/bdeggleston/cassandra/tree/14607]

I may be missing some nuance in the usage of {{monitorEnter/Exit}} (I couldn't 
find a lot of documentation on the method, or explanations for why we were 
using it), but I think this would be functionally identical to what we were 
doing with {{monitorEnter/Exit}}.

WDYT [~benedict] & [~jasobrown]?

> Explore optimizations in AbstractBtreePartiton, java 11 variant
> ---
>
> Key: CASSANDRA-14607
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14607
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/Core
>Reporter: Jason Brown
>Priority: Normal
>  Labels: Java11
> Fix For: 4.0
>
>
> In CASSANDRA-9608, we discussed some way to optimize the java 11 
> implementation of {{AbstractBTreePartition}}. This ticket serves that 
> purpose, as well as a "note to selves" to ensure the java 11 version does not 
> have a performance regression.






[jira] [Commented] (CASSANDRA-14607) Explore optimizations in AbstractBtreePartiton, java 11 variant

2019-03-12 Thread Blake Eggleston (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16791278#comment-16791278
 ] 

Blake Eggleston commented on CASSANDRA-14607:
-

I think we could get the java 8 behavior in java 11 by moving the update logic 
of {{addAllWithSizeDelta}} into a private method, shuffling the 'shouldLock' 
logic around, and just using a {{synchronized}} block. This would work in both 
java 8 and java 11, so we wouldn't need the java8 and java11 source directories.

I pushed up a branch showing how this would work here: 
[https://github.com/bdeggleston/cassandra/tree/14607]

I may be missing some nuance in the usage of {{monitorEnter/Exit}} (I couldn't 
find a lot of documentation on the method, or explanations for why we were 
using it), but I think this would be functionally identical to what we were 
doing with {{monitorEnter/Exit}}.

WDYT [~benedict] & [~jasobrown]?
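
The shape of that change, reduced to a toy, might look like the following. This is only a sketch of the pattern (update logic in a private method, with the caller choosing whether to wrap it in a synchronized block); it is not the code on the branch, and the class, the lock object, and the AtomicLong standing in for the partition state are all assumptions.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a partition whose update may or may not take a lock.
final class LockedUpdateSketch {
    private final Object lock = new Object();
    // Stand-in for the real partition state; atomic so the unlocked path is
    // still safe in this toy.
    private final AtomicLong size = new AtomicLong();

    long addAllWithSizeDelta(long delta, boolean shouldLock) {
        if (shouldLock) {
            // A plain synchronized block instead of explicit monitor enter/exit
            // calls; this compiles and runs the same way on Java 8 and Java 11.
            synchronized (lock) {
                return applyUpdate(delta);
            }
        }
        return applyUpdate(delta);
    }

    // The update logic lives in one private method shared by both paths.
    private long applyUpdate(long delta) {
        return size.addAndGet(delta);
    }
}
```

Since the monitor is released automatically when the block exits, even on an exception, there is no separate exit call to pair up by hand.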

> Explore optimizations in AbstractBtreePartiton, java 11 variant
> ---
>
> Key: CASSANDRA-14607
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14607
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/Core
>Reporter: Jason Brown
>Priority: Normal
>  Labels: Java11
> Fix For: 4.0
>
>
> In CASSANDRA-9608, we discussed some way to optimize the java 11 
> implementation of {{AbstractBTreePartition}}. This ticket serves that 
> purpose, as well as a "note to selves" to ensure the java 11 version does not 
> have a performance regression.






[jira] [Updated] (CASSANDRA-14482) ZSTD Compressor support in Cassandra

2019-02-22 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-14482:

Resolution: Fixed
Status: Resolved  (was: Ready to Commit)

Committed to trunk as 
[dccf53061a61e7c632669c60cd94626e405518e9|https://github.com/apache/cassandra/commit/dccf53061a61e7c632669c60cd94626e405518e9],
 thanks!

> ZSTD Compressor support in Cassandra
> 
>
> Key: CASSANDRA-14482
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14482
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Dependencies, Feature/Compression
>Reporter: Sushma A Devendrappa
>Assignee: Dinesh Joshi
>Priority: Major
>  Labels: performance, pull-request-available
> Fix For: 4.x
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> Zstandard offers a good tradeoff between speed and compression ratio. 
> It is an open-source compression algorithm from Facebook.
> More about ZSTD
> [https://github.com/facebook/zstd]
> https://code.facebook.com/posts/1658392934479273/smaller-and-faster-data-compression-with-zstandard/
>  






[jira] [Updated] (CASSANDRA-15027) Handle IR prepare phase failures less race prone by waiting for all results

2019-02-22 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15027:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to trunk as 9bde713ee8883f70d130efb6290ec0e6daea524f, thanks

> Handle IR prepare phase failures less race prone by waiting for all results
> ---
>
> Key: CASSANDRA-15027
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15027
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair, Local/Compaction
>Reporter: Stefan Podkowinski
>Assignee: Stefan Podkowinski
>Priority: Major
> Fix For: 4.x
>
>
> Handling incremental repairs as a coordinator begins by sending a 
> {{PrepareConsistentRequest}} message to all participants, which may also 
> include the coordinator itself. Participants will run anti-compactions upon 
> receiving such a message and report the result of the operation back to the 
> coordinator.
> Once we receive a failure response from any of the participants, we fail-fast 
> in {{CoordinatorSession.handlePrepareResponse()}}, which will in turn 
> complete the {{prepareFuture}} that {{RepairRunnable}} is blocking on. Then 
> the repair command will terminate with an error status, as expected.
> The issue is that when the node is both coordinator and participant, 
> we may end up with a local session and submitted anti-compactions, which will 
> be executed without any coordination with the coordinator session (on the same 
> node). As a result, running repair commands right after one another may 
> cause overlapping execution of anti-compactions, which causes the following 
> (misleading) message to show up in the logs and causes the repair to fail 
> again:
>  "Prepare phase for incremental repair session %s has failed because it 
> encountered intersecting sstables belonging to another incremental repair 
> session (%s). This is by starting an incremental repair session before a 
> previous one has completed. Check nodetool repair_admin for hung sessions and 
> fix them."






[jira] [Comment Edited] (CASSANDRA-15027) Handle IR prepare phase failures less race prone by waiting for all results

2019-02-22 Thread Blake Eggleston (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16775361#comment-16775361
 ] 

Blake Eggleston edited comment on CASSANDRA-15027 at 2/22/19 5:34 PM:
--

Nice. Your follow-on changes look good to me. I have 2 nits, but those can just 
be fixed on commit.

* We should log the session id in compaction manager when an anti-compaction is 
cancelled (and probably when there's an error as well)
* Some error handling should be added to the commit fixing the race between 
proposeFuture and hasFailure so nodetool doesn't hang if there's an error in 
the callback

edit: proposed fixes 
[here|https://github.com/bdeggleston/cassandra/commit/02d7d9e09983db0d4661486b17adc375e17be24f]


was (Author: bdeggleston):
Nice. Your follow-on changes look good to me. I have 2 nits, but those can just 
be fixed on commit.
 
* We should log the session id in compaction manager when an anti-compaction is 
cancelled (and probably when there's an error as well)
* Some error handling should be added to the commit fixing the race between 
proposeFuture and hasFailure so nodetool doesn't hang if there's an error in 
the callback







[jira] [Commented] (CASSANDRA-15027) Handle IR prepare phase failures less race prone by waiting for all results

2019-02-22 Thread Blake Eggleston (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16775361#comment-16775361
 ] 

Blake Eggleston commented on CASSANDRA-15027:
-

Nice. Your follow-on changes look good to me. I have 2 nits, but those can just 
be fixed on commit.
 
* We should log the session id in compaction manager when an anti-compaction is 
cancelled (and probably when there's an error as well)
* Some error handling should be added to the commit fixing the race between 
proposeFuture and hasFailure so nodetool doesn't hang if there's an error in 
the callback







[jira] [Commented] (CASSANDRA-15027) Handle IR prepare phase failures less race prone by waiting for all results

2019-02-21 Thread Blake Eggleston (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16774515#comment-16774515
 ] 

Blake Eggleston commented on CASSANDRA-15027:
-

Thanks [~spo...@gmail.com]. I’ve extended your code so that in addition to 
waiting for other anti-compactions to complete, the coordinator also 
proactively cancels ongoing anti-compactions on the other participants. This 
avoids wasting time waiting for anti-compactions on other machines. The code 
does 3 things:
 * Adds a session state check to the {{isStopRequested}} method in the 
anti-compaction iterator.
 * The coordinator now sends failure messages to all participants when it 
receives a failure message from one of them in the prepare phase. It does not 
mark these participants as having failed internally though, since that would 
cause the nodetool session to immediately complete. Instead, it waits until 
it has received messages from all the other nodes.
 * The participants will now respond with a failed prepare message if the 
anti-compaction completes, but the session was failed in the meantime. This 
prevents a deadlock on the coordinator in the case where the participant 
received a failure message between the time the anti-compaction completes and 
the callback fires.
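
The three points above can be sketched roughly as follows. This is a 
simplified illustration, not the code on the branch: participants are plain 
strings, "sending" a failure message just records the recipient in a list, and 
every class, field, and method name except {{isStopRequested}} and 
{{handlePrepareResponse}} is hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.CompletableFuture;

// Hypothetical coordinator: waits for every participant before completing.
final class PrepareCoordinatorSketch
{
    private final Set<String> participants;
    private final Map<String, Boolean> responses = new HashMap<>();
    private boolean failureSent;

    // Stand-in for the messaging layer: records who was told to fail.
    final List<String> failureMessagesSentTo = new ArrayList<>();
    // Completes with true only if every participant prepared successfully.
    final CompletableFuture<Boolean> prepareFuture = new CompletableFuture<>();

    PrepareCoordinatorSketch(Set<String> participants)
    {
        this.participants = participants;
    }

    synchronized void handlePrepareResponse(String participant, boolean success)
    {
        // Ignore unknown senders and duplicate responses.
        if (!participants.contains(participant) || responses.containsKey(participant))
            return;
        responses.put(participant, success);

        // On the first failure, ask everyone to cancel, but do not mark them
        // as failed locally: that would complete the future too early.
        if (!success && !failureSent)
        {
            failureSent = true;
            failureMessagesSentTo.addAll(participants);
        }

        // Only complete once all participants have reported back.
        if (responses.size() == participants.size())
            prepareFuture.complete(!responses.containsValue(false));
    }
}

// Hypothetical participant-side session state.
final class PrepareParticipantSketch
{
    private volatile boolean sessionFailed;

    void onFailureMessage() { sessionFailed = true; }

    // Polled by the anti-compaction so it can stop early.
    boolean isStopRequested() { return sessionFailed; }

    // Called when the anti-compaction finishes: report failure if the
    // session was failed in the meantime, so the coordinator is not left waiting.
    boolean prepareResponseAfterAntiCompaction() { return !sessionFailed; }
}
```

In this sketch the future completes only after the last response arrives, so 
a follow-up repair cannot start while anti-compactions from the failed session 
are still running.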

Let me know what you think. If everything looks ok to you, I’m +1 on committing.

[trunk|https://github.com/bdeggleston/cassandra/tree/15027-trunk]
 
[circle|https://circleci.com/gh/bdeggleston/workflows/cassandra/tree/15027-trunk]







[jira] [Commented] (CASSANDRA-14482) ZSTD Compressor support in Cassandra

2019-02-20 Thread Blake Eggleston (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773524#comment-16773524
 ] 

Blake Eggleston commented on CASSANDRA-14482:
-

[~djoshi3], I pushed up a commit 
[here|https://github.com/bdeggleston/cassandra/commit/716a6b631d68532773bb804e8ce33ab3dd23946c]
 that makes the following changes:

 1. simplifies the {{ZstdCompressor#compress}} method to just call 
{{Zstd#compress}}. The compressor doesn't support on-heap buffers, so there's no 
need to do the stream stuff. This is basically how the snappy compressor works.
 2. updates test support method visibility/annotation in ZstdCompressor
 3. adds ratio to CompressorPerformance output
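
To illustrate the first point, here is a rough sketch of a compressor that 
rejects on-heap buffers and hands the buffers straight to the library in a 
single one-shot call. It is not the committed code: {{java.util.zip.Deflater}} 
stands in for zstd so the example runs on the JDK alone (Java 11+), and the 
class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.zip.Deflater;

// Hypothetical compressor: Deflater is a stand-in for the zstd binding.
final class DirectBufferCompressorSketch
{
    // Compresses input into output and returns the number of bytes written.
    static int compress(ByteBuffer input, ByteBuffer output)
    {
        // Only direct (off-heap) buffers are supported, so there is no
        // need for a streaming fallback through byte arrays.
        if (!input.isDirect() || !output.isDirect())
            throw new IllegalArgumentException("only direct buffers are supported");

        int start = output.position();
        Deflater deflater = new Deflater();
        try
        {
            deflater.setInput(input);
            deflater.finish();
            // Keep draining until the stream is finished or output is full.
            while (!deflater.finished() && output.hasRemaining())
                deflater.deflate(output);
            if (!deflater.finished())
                throw new IllegalStateException("output buffer too small");
        }
        finally
        {
            // Release the native resources held by the deflater.
            deflater.end();
        }
        return output.position() - start;
    }
}
```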

If you're ok with these changes and I haven't broken the tests, I think this is 
ready to commit.







[jira] [Updated] (CASSANDRA-15027) Handle IR prepare phase failures less race prone by waiting for all results

2019-02-20 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15027:

Reviewer: Blake Eggleston







[jira] [Updated] (CASSANDRA-14482) ZSTD Compressor support in Cassandra

2019-02-19 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-14482:

Reviewers:   (was: Dinesh Joshi)
 Reviewer: Blake Eggleston






