date:20171130

[jira] [Commented] (CASSANDRA-12971) Add CAS option to WRITE test to stress tool

2017-11-30 Thread Vladimir Yudovin (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-12971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273986#comment-16273986
 ] 

Vladimir Yudovin commented on CASSANDRA-12971:
--

I guess yes.

> Add CAS option to WRITE test to stress tool
> ---
>
> Key: CASSANDRA-12971
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12971
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Stress, Tools
>Reporter: Vladimir Yudovin
>Assignee: Vladimir Yudovin
> Attachments: stress-cass.patch
>
>
> If the -cas option is present, each UPDATE is performed with an IF condition 
> that is always true, so the data is inserted anyway.
> It's implemented; if it's needed, I'll proceed with the patch.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Commented] (CASSANDRA-14078) Fix dTest test_bulk_round_trip_blogposts_with_max_connections

2017-11-30 Thread Kurt Greaves (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273946#comment-16273946
 ] 

Kurt Greaves commented on CASSANDRA-14078:
--

Could we not do this test by just setting a low 
{{native_transport_max_concurrent_connections}} on one node and then have a 
much higher value on the other nodes, so we just trigger a fail-over on one 
node? That way we aren't relying on completely overloading the cluster just to 
test this.
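
A minimal sketch of the per-node settings this suggestion implies, written in Python. The helper name `per_node_options` and the limits 1 and 1000 are hypothetical and chosen only for illustration; the only name taken from the comment above is `native_transport_max_concurrent_connections`.

```python
from typing import Dict, List

OPTION = "native_transport_max_concurrent_connections"


def per_node_options(node_names: List[str], limited_node: str,
                     low_limit: int = 1,
                     high_limit: int = 1000) -> Dict[str, Dict[str, int]]:
    """Build one config override per node.

    The chosen node gets a low connection limit so that it alone refuses
    extra clients; every other node gets a much higher limit.
    Both limits are illustrative values, not recommendations.
    """
    if limited_node not in node_names:
        raise ValueError(f"unknown node: {limited_node}")
    return {
        name: {OPTION: low_limit if name == limited_node else high_limit}
        for name in node_names
    }


if __name__ == "__main__":
    for node, opts in per_node_options(["node1", "node2", "node3"], "node1").items():
        print(node, opts)
```

How these overrides would be applied to a test cluster depends on the test harness and is left out here.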


> Fix dTest test_bulk_round_trip_blogposts_with_max_connections
> -
>
> Key: CASSANDRA-14078
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14078
> Project: Cassandra
>  Issue Type: Test
>  Components: Testing
>Reporter: Jaydeepkumar Chovatia
>Assignee: Jaydeepkumar Chovatia
>Priority: Minor
>
> This ticket concerns the following dTest: 
> {{cqlsh_tests.cqlsh_copy_tests.CqlshCopyTest.test_bulk_round_trip_blogposts_with_max_connections}}
> This test tries to limit the number of client connections and assumes that, 
> once the connection limit has been reached, the client will fail over to another 
> node and send the request there. The problem is that the test case is not 
> deterministic, as it depends entirely on the hardware you run on, timing, etc.
> For example, if we look at 
> https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-trunk-dtest/353/testReport/cqlsh_tests.cqlsh_copy_tests/CqlshCopyTest/test_bulk_round_trip_blogposts_with_max_connections/
> {quote}
> ...
> Processed: 5000 rows; Rate:2551 rows/s; Avg. rate:2551 rows/s
> All replicas busy, sleeping for 4 second(s)...
> Processed: 10000 rows; Rate:2328 rows/s; Avg. rate:2307 rows/s
> All replicas busy, sleeping for 1 second(s)...
> Processed: 15000 rows; Rate:2137 rows/s; Avg. rate:2173 rows/s
> All replicas busy, sleeping for 11 second(s)...
> Processed: 20000 rows; Rate:2138 rows/s; Avg. rate:2164 rows/s
> Processed: 25000 rows; Rate:2403 rows/s; Avg. rate:2249 rows/s
> Processed: 30000 rows; Rate:2582 rows/s; Avg. rate:2321 rows/s
> Processed: 35000 rows; Rate:2835 rows/s; Avg. rate:2406 rows/s
> Processed: 40000 rows; Rate:2867 rows/s; Avg. rate:2458 rows/s
> Processed: 45000 rows; Rate:3163 rows/s; Avg. rate:2540 rows/s
> Processed: 50000 rows; Rate:3200 rows/s; Avg. rate:2596 rows/s
> Processed: 50234 rows; Rate:2032 rows/s; Avg. rate:2572 rows/s
> All replicas busy, sleeping for 23 second(s)...
> Replicas too busy, given up
> ...
> {quote}
> Here we can see the request timing out: sometimes it resumes after 1 second, 
> sometimes after 11 seconds, and sometimes it doesn't resume at all. 
> In my opinion this test is not a good fit for dTest, as dTests should be 
> deterministic.




[jira] [Commented] (CASSANDRA-13873) Ref bug in Scrub

2017-11-30 Thread Joel Knighton (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273882#comment-16273882
 ] 

Joel Knighton commented on CASSANDRA-13873:
---

Thanks for the patches and CI. Both your remarks look correct to me; frankly, I 
have no idea how I missed that in anticompaction.

Test results look good for the most part. There are a few flaky unit tests on 
3.0/3.11 that appear to have failed the same way before the patch, pass for me 
locally, and appear to be at the limits of CircleCI's timeouts/resources. The 
2.2 dtests timed out, so it seems worthwhile to trigger those again just in 
case. The only unusual failures on 3.0 dtests are a bunch of tests where 
Jolokia failed to attach for JMX. I'm not sure if this is a known environmental 
problem on ASF dtests, but I was unable to reproduce this elsewhere.

Overall, +1 to the patch for me, and this looks good to merge if none of the 
test issues I raised above worry you.

> Ref bug in Scrub
> 
>
> Key: CASSANDRA-13873
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13873
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: T Jake Luciani
>Assignee: Marcus Eriksson
> Fix For: 2.2.x, 3.0.x, 3.11.x, 4.x
>
>
> I'm hitting a Ref bug when many scrubs run against a node.  This doesn't 
> happen on 3.0.X.  I'm not sure if/if not this happens with compactions too 
> but I suspect it does.
> I'm not seeing any Ref leaks or double frees.
> To Reproduce:
> {quote}
> ./tools/bin/cassandra-stress write n=10m -rate threads=100
> ./bin/nodetool scrub
> #Ctrl-C
> ./bin/nodetool scrub
> #Ctrl-C
> ./bin/nodetool scrub
> #Ctrl-C
> ./bin/nodetool scrub
> {quote}
> Eventually in the logs you get:
> WARN  [RMI TCP Connection(4)-127.0.0.1] 2017-09-14 15:51:26,722 
> NoSpamLogger.java:97 - Spinning trying to capture readers 
> [BigTableReader(path='/home/jake/workspace/cassandra2/data/data/keyspace1/standard1-2eb5c780998311e79e09311efffdcd17/mc-5-big-Data.db'),
>  
> BigTableReader(path='/home/jake/workspace/cassandra2/data/data/keyspace1/standard1-2eb5c780998311e79e09311efffdcd17/mc-32-big-Data.db'),
>  
> BigTableReader(path='/home/jake/workspace/cassandra2/data/data/keyspace1/standard1-2eb5c780998311e79e09311efffdcd17/mc-31-big-Data.db'),
>  
> BigTableReader(path='/home/jake/workspace/cassandra2/data/data/keyspace1/standard1-2eb5c780998311e79e09311efffdcd17/mc-29-big-Data.db'),
>  
> BigTableReader(path='/home/jake/workspace/cassandra2/data/data/keyspace1/standard1-2eb5c780998311e79e09311efffdcd17/mc-27-big-Data.db'),
>  
> BigTableReader(path='/home/jake/workspace/cassandra2/data/data/keyspace1/standard1-2eb5c780998311e79e09311efffdcd17/mc-26-big-Data.db'),
>  
> BigTableReader(path='/home/jake/workspace/cassandra2/data/data/keyspace1/standard1-2eb5c780998311e79e09311efffdcd17/mc-20-big-Data.db')],
> *released: 
> [BigTableReader(path='/home/jake/workspace/cassandra2/data/data/keyspace1/standard1-2eb5c780998311e79e09311efffdcd17/mc-5-big-Data.db')],*
>  
> This released table has a selfRef of 0 but is in the Tracker




[jira] [Assigned] (CASSANDRA-13873) Ref bug in Scrub

2017-11-30 Thread Joel Knighton (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Knighton reassigned CASSANDRA-13873:
-

Assignee: Marcus Eriksson  (was: Joel Knighton)

> Ref bug in Scrub
> 
>
> Key: CASSANDRA-13873
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13873
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: T Jake Luciani
>Assignee: Marcus Eriksson
> Fix For: 2.2.x, 3.0.x, 3.11.x, 4.x
>
>




[jira] [Updated] (CASSANDRA-13873) Ref bug in Scrub

2017-11-30 Thread Joel Knighton (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Knighton updated CASSANDRA-13873:
--
Reviewer: Joel Knighton  (was: Marcus Eriksson)

> Ref bug in Scrub
> 
>
> Key: CASSANDRA-13873
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13873
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: T Jake Luciani
>Assignee: Marcus Eriksson
> Fix For: 2.2.x, 3.0.x, 3.11.x, 4.x
>
>




[jira] [Updated] (CASSANDRA-13010) nodetool compactionstats should say which disk a compaction is writing to

2017-11-30 Thread Alex Lourie (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Lourie updated CASSANDRA-13010:

Attachment: (was: Pasted image at 2017_12_01 11_44 AM.png)

> nodetool compactionstats should say which disk a compaction is writing to
> -
>
> Key: CASSANDRA-13010
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13010
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Compaction, Tools
>Reporter: Jon Haddad
>Assignee: Alex Lourie
>  Labels: lhf
> Attachments: 13010.patch, cleanup.png, multiple operations.png
>
>





[jira] [Issue Comment Deleted] (CASSANDRA-13010) nodetool compactionstats should say which disk a compaction is writing to

2017-11-30 Thread Alex Lourie (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Lourie updated CASSANDRA-13010:

Comment: was deleted

(was: MultipleDirectoriesScreenshot)

> nodetool compactionstats should say which disk a compaction is writing to
> -
>
> Key: CASSANDRA-13010
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13010
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Compaction, Tools
>Reporter: Jon Haddad
>Assignee: Alex Lourie
>  Labels: lhf
> Attachments: 13010.patch, Pasted image at 2017_12_01 11_44 AM.png, 
> cleanup.png, multiple operations.png
>
>





[jira] [Updated] (CASSANDRA-13010) nodetool compactionstats should say which disk a compaction is writing to

2017-11-30 Thread Alex Lourie (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Lourie updated CASSANDRA-13010:

Attachment: Pasted image at 2017_12_01 11_44 AM.png

MultipleDirectoriesScreenshot

> nodetool compactionstats should say which disk a compaction is writing to
> -
>
> Key: CASSANDRA-13010
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13010
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Compaction, Tools
>Reporter: Jon Haddad
>Assignee: Alex Lourie
>  Labels: lhf
> Attachments: 13010.patch, Pasted image at 2017_12_01 11_44 AM.png, 
> cleanup.png, multiple operations.png
>
>





[jira] [Commented] (CASSANDRA-13010) nodetool compactionstats should say which disk a compaction is writing to

2017-11-30 Thread Alex Lourie (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273867#comment-16273867
 ] 

Alex Lourie commented on CASSANDRA-13010:
-

[~rustyrazorblade] I've gotten back to working on this ticket. I think I've 
covered all possible operations and the patch is now in good shape.

I've tested it with compactions (including split and user-defined), repair, 
scrub and cleanup operations; I also tested with multiple data directories. It 
looks OK for all of them; here are a couple of screenshots:

[^cleanup.png]
[^multiple operations.png]

I think that the patch is ready for review on GitHub 
(https://github.com/apache/cassandra/compare/trunk...alourie:CASSANDRA-13010) 
or as a patch [^13010.patch]

Would appreciate any feedback.
Thanks.

> nodetool compactionstats should say which disk a compaction is writing to
> -
>
> Key: CASSANDRA-13010
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13010
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Compaction, Tools
>Reporter: Jon Haddad
>Assignee: Alex Lourie
>  Labels: lhf
> Attachments: 13010.patch, cleanup.png, multiple operations.png
>
>





[jira] [Commented] (CASSANDRA-13530) GroupCommitLogService

2017-11-30 Thread Jason Brown (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273855#comment-16273855
 ] 

Jason Brown commented on CASSANDRA-13530:
-

OK, I've reverted the patch that lengthened the long tests' timeout, and 
refactored {{CommitLogStressTest}} properly. For each commit log mode, I've 
created a subclass of {{CommitLogStress}} and they run (obviously) in ~1/3 the 
time of the test with all of the modes. The only thing I wasn't sure about was 
the {{main()}} function in {{CommitLogStressTest}}. I think it's there for 
convenience, but I'm not sure what it's convenient for. I'm all for removing it, 
as there's no infra that depends on it and its behavior was exactly the same 
as running the long-test. wdyt?

> GroupCommitLogService
> -
>
> Key: CASSANDRA-13530
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13530
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Yuji Ito
>Assignee: Yuji Ito
> Fix For: 2.2.x, 3.0.x, 3.11.x
>
> Attachments: GuavaRequestThread.java, MicroRequestThread.java, 
> groupAndBatch.png, groupCommit22.patch, groupCommit30.patch, 
> groupCommit3x.patch, groupCommitLog_noSerial_result.xlsx, 
> groupCommitLog_result.xlsx
>
>
> I propose a new CommitLogService, GroupCommitLogService, to improve 
> throughput when lots of requests are received.
> It improved throughput by up to 94%.
> I'd like to discuss this CommitLogService.
> Currently, we can select either of 2 CommitLog services: Periodic and Batch.
> In Periodic, we might lose some commit log entries that haven't been written 
> to the disk.
> In Batch, we write the commit log to the disk every time. The size of each 
> commit log write is too small (< 4KB). Under high concurrency, these writes 
> are gathered and persisted to the disk at once. But under insufficient 
> concurrency, many small writes are issued and performance decreases due 
> to the latency of the disk. Even if you use SSD, processing many IO 
> commands decreases performance.
> GroupCommitLogService writes several commit log entries to the disk at once.
> The patch adds GroupCommitLogService (it is enabled by setting 
> `commitlog_sync` and `commitlog_sync_group_window_in_ms` in cassandra.yaml).
> The only difference from Batch is that it waits for the semaphore.
> By waiting for the semaphore, several commit log writes are executed at the 
> same time.
> In GroupCommitLogService, latency becomes worse if there is no 
> concurrency.
> I measured the performance with my microbenchmark (MicroRequestThread.java) by 
> increasing the number of threads. The cluster has 3 nodes (replication factor: 
> 3). Each node is an AWS EC2 m4.large instance with a 200 IOPS io1 volume.
> The results are below. The GroupCommitLogService with a 10ms window improved 
> UPDATE with Paxos by up to 94% and SELECT with Paxos by up to 76%.
> h6. SELECT / sec
> ||\# of threads||Batch 2ms||Group 10ms||
> |1|192|103|
> |2|163|212|
> |4|264|416|
> |8|454|800|
> |16|744|1311|
> |32|1151|1481|
> |64|1767|1844|
> |128|2949|3011|
> |256|4723|5000|
> h6. UPDATE / sec
> ||\# of threads||Batch 2ms||Group 10ms||
> |1|45|26|
> |2|39|51|
> |4|58|102|
> |8|102|198|
> |16|167|213|
> |32|289|295|
> |64|544|548|
> |128|1046|1058|
> |256|2020|2061|
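
As a quick check of the quoted percentages, here is a short Python sketch that recomputes the relative improvement of Group 10ms over Batch 2ms from the two tables above. The figures are copied from the tables; the function names are made up for this example.

```python
# threads -> (Batch 2ms, Group 10ms), in operations per second,
# copied from the SELECT and UPDATE tables in the ticket description.
SELECT = {1: (192, 103), 2: (163, 212), 4: (264, 416), 8: (454, 800),
          16: (744, 1311), 32: (1151, 1481), 64: (1767, 1844),
          128: (2949, 3011), 256: (4723, 5000)}
UPDATE = {1: (45, 26), 2: (39, 51), 4: (58, 102), 8: (102, 198),
          16: (167, 213), 32: (289, 295), 64: (544, 548),
          128: (1046, 1058), 256: (2020, 2061)}


def improvement_pct(batch: float, group: float) -> float:
    """Relative throughput change of Group over Batch, in percent."""
    return (group / batch - 1.0) * 100.0


def best_improvement(table):
    """Return (thread count, improvement in percent) for the largest gain."""
    threads = max(table, key=lambda t: improvement_pct(*table[t]))
    return threads, improvement_pct(*table[threads])


if __name__ == "__main__":
    for name, table in (("SELECT", SELECT), ("UPDATE", UPDATE)):
        threads, pct = best_improvement(table)
        print(f"{name}: best gain {pct:.1f}% at {threads} threads")
```

With these figures the largest gains come out at roughly 94% for UPDATE and 76% for SELECT, in line with the description, while the single-thread rows show Group 10ms slower than Batch 2ms.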




[jira] [Updated] (CASSANDRA-13010) nodetool compactionstats should say which disk a compaction is writing to

2017-11-30 Thread Alex Lourie (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Lourie updated CASSANDRA-13010:

Attachment: multiple operations.png

> nodetool compactionstats should say which disk a compaction is writing to
> -
>
> Key: CASSANDRA-13010
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13010
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Compaction, Tools
>Reporter: Jon Haddad
>Assignee: Alex Lourie
>  Labels: lhf
> Attachments: 13010.patch, cleanup.png, multiple operations.png
>
>





[jira] [Updated] (CASSANDRA-13010) nodetool compactionstats should say which disk a compaction is writing to

2017-11-30 Thread Alex Lourie (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Lourie updated CASSANDRA-13010:

Attachment: cleanup.png

> nodetool compactionstats should say which disk a compaction is writing to
> -
>
> Key: CASSANDRA-13010
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13010
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Compaction, Tools
>Reporter: Jon Haddad
>Assignee: Alex Lourie
>  Labels: lhf
> Attachments: 13010.patch, cleanup.png
>
>





[jira] [Updated] (CASSANDRA-13010) nodetool compactionstats should say which disk a compaction is writing to

2017-11-30 Thread Alex Lourie (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Lourie updated CASSANDRA-13010:

Attachment: 13010.patch

> nodetool compactionstats should say which disk a compaction is writing to
> -
>
> Key: CASSANDRA-13010
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13010
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Compaction, Tools
>Reporter: Jon Haddad
>Assignee: Alex Lourie
>  Labels: lhf
> Attachments: 13010.patch
>
>





[jira] [Commented] (CASSANDRA-13976) introduce max_hint_window_in_min, deprecate max_hint_window_in_ms

2017-11-30 Thread Jeff Jirsa (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273767#comment-16273767
 ] 

Jeff Jirsa commented on CASSANDRA-13976:


Encourage you to ask dev@ - I was going to suggest that as well. Pretty -0 on 
this right now (it's pretty firmly in the "I wouldn't do this, but I'm not 
going to really go out of my way to hard veto it" category).

My primary concern is that as we change yaml params, years of blog posts become 
irrelevant, and eventually we'll deprecate out the old ones and remove them, 
and then someone's rolling upgrade will break. "Of course you have to change 
yaml with major versions", you say, but the less true that is, the better life 
is for users.


> introduce max_hint_window_in_min, deprecate max_hint_window_in_ms
> -
>
> Key: CASSANDRA-13976
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13976
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Jon Haddad
>Assignee: Kirk True
>Priority: Minor
>  Labels: lhf
> Fix For: 4.0
>
>
> Milliseconds is unnecessarily precise.  At most, minutes would be used.  
> Config in 4.0 should default to a minute granularity, but if the 
> max_hint_window_in_min isn't set should fall back on max_hint_window_in_ms 
> and emit a warning.
> max_hint_window_in_min: 180 # new default, still at 3 hours.
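
The fallback rule in the description could look roughly like the Python sketch below. It is not Cassandra code: the function name `resolve_hint_window_ms` and the dict-based config are assumptions for illustration, and only the two option names and the 180-minute default come from the ticket.

```python
import warnings
from typing import Mapping

DEFAULT_WINDOW_MIN = 180  # 3 hours, the default named in the ticket


def resolve_hint_window_ms(config: Mapping[str, int]) -> int:
    """Return the hint window in milliseconds.

    Prefers max_hint_window_in_min; falls back to the deprecated
    max_hint_window_in_ms and emits a warning; otherwise uses the default.
    """
    minutes = config.get("max_hint_window_in_min")
    if minutes is not None:
        return minutes * 60 * 1000
    legacy_ms = config.get("max_hint_window_in_ms")
    if legacy_ms is not None:
        warnings.warn(
            "max_hint_window_in_ms is deprecated; use max_hint_window_in_min",
            DeprecationWarning,
            stacklevel=2,
        )
        return legacy_ms
    return DEFAULT_WINDOW_MIN * 60 * 1000
```

Keeping milliseconds as the internal unit means the rest of the code would not need to change, whichever option the operator sets.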




[jira] [Commented] (CASSANDRA-13976) introduce max_hint_window_in_min, deprecate max_hint_window_in_ms

2017-11-30 Thread Jon Haddad (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273764#comment-16273764
 ] 

Jon Haddad commented on CASSANDRA-13976:


I'd actually like to get some feedback on -dev regarding this.  I'd like to 
change *every* setting to use duration types, because it makes them less 
error-prone to set.  Mistyping a millisecond-based config is pretty easy, and 
it's hard to catch when it's wrong.

> introduce max_hint_window_in_min, deprecate max_hint_window_in_ms
> -
>
> Key: CASSANDRA-13976
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13976
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Jon Haddad
>Assignee: Kirk True
>Priority: Minor
>  Labels: lhf
> Fix For: 4.0
>
>
> Milliseconds is unnecessarily precise.  At most, minutes would be used.  
> Config in 4.0 should default to a minute granularity, but if the 
> max_hint_window_in_min isn't set should fall back on max_hint_window_in_ms 
> and emit a warning.
> max_hint_window_in_min: 180 # new default, still at 3 hours.




[jira] [Commented] (CASSANDRA-13976) introduce max_hint_window_in_min, deprecate max_hint_window_in_ms

2017-11-30 Thread Jeff Jirsa (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273758#comment-16273758
 ] 

Jeff Jirsa commented on CASSANDRA-13976:


We have 25 other config options that take millis. Why are we changing one, when 
it's one that's rarely tuned anyway? There are plenty of others (auth permission 
validity, for example) that are also almost certainly never set in milliseconds 
and that you haven't suggested changing. How do you propose we keep consistency 
there?

Is this really something where the ease of setting it once is going to outweigh 
the config churn for the typical user?



> introduce max_hint_window_in_min, deprecate max_hint_window_in_ms
> -
>
> Key: CASSANDRA-13976
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13976
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Jon Haddad
>Assignee: Kirk True
>Priority: Minor
>  Labels: lhf
> Fix For: 4.0
>
>
> Milliseconds is unnecessarily precise.  At most, minutes would be used.  
> Config in 4.0 should default to a minute granularity, but if the 
> max_hint_window_in_min isn't set should fall back on max_hint_window_in_ms 
> and emit a warning.
> max_hint_window_in_min: 180 # new default, still at 3 hours.




[jira] [Comment Edited] (CASSANDRA-13976) introduce max_hint_window_in_min, deprecate max_hint_window_in_ms

2017-11-30 Thread Jon Haddad (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273727#comment-16273727
 ] 

Jon Haddad edited comment on CASSANDRA-13976 at 12/1/17 12:49 AM:
--

I've thought about this a bit more, and I think across the board we should be 
using duration types and get rid of the _ms label altogether.

It's WAY more readable and friendly to be able to do:

{code}
max_hint_window = 3h
{code}

Regarding nodetool, it would report back whatever duration-labeled setting was 
in there, using "ms" if the old _ms value was provided.  Internally, it would 
convert everything to ms, leaving the current code in place.


was (Author: rustyrazorblade):
I've thought about this a bit more, and I think across the board we should be 
using duration types and get rid of the _ms label altogether.

It's WAY more readable and friendly to be able to do:

{code}
max_hint_window = 3h
{code}

Regarding nodetool, it would report back whatever setting was in there.  
Internally, it would convert everything to ms, leaving the current code in 
place.

> introduce max_hint_window_in_min, deprecate max_hint_window_in_ms
> -
>
> Key: CASSANDRA-13976
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13976
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Jon Haddad
>Assignee: Kirk True
>Priority: Minor
>  Labels: lhf
> Fix For: 4.0
>
>
> Milliseconds is unnecessarily precise.  At most, minutes would be used.  
> Config in 4.0 should default to a minute granularity, but if the 
> max_hint_window_in_min isn't set should fall back on max_hint_window_in_ms 
> and emit a warning.
> max_hint_window_in_min: 180 # new default, still at 3 hours.




[jira] [Commented] (CASSANDRA-14008) RTs at index boundaries in 2.x sstables can create unexpected CQL row in 3.x

2017-11-30 Thread Jeff Jirsa (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273735#comment-16273735
 ] 

Jeff Jirsa commented on CASSANDRA-14008:


The raw patches that fix the bug in LegacyLayout are at 

|| Branch || CI ||
| [3.0|https://github.com/jeffjirsa/cassandra/tree/cassandra-3.0-14008] | [!https://circleci.com/gh/jeffjirsa/cassandra/tree/cassandra-3.0-14008.svg?style=svg!|https://circleci.com/gh/jeffjirsa/cassandra/tree/cassandra-3.0-14008/] |
| [3.11|https://github.com/jeffjirsa/cassandra/tree/cassandra-3.11-14008] | [!https://circleci.com/gh/jeffjirsa/cassandra/tree/cassandra-3.1-14008.svg?style=svg!|https://circleci.com/gh/jeffjirsa/cassandra/tree/cassandra-3.11-14008/] |


I was hoping to actually have a solution to un-breaking the broken 3.0 sstables 
in the same patch, but it's proving to be more difficult than I anticipated. I 
haven't yet tried to make some sample sstables for regression tests; I agree 
it'd be nice to have those. 

Please glance at the code, and I'll work on the regression sstables before 
committing.

> RTs at index boundaries in 2.x sstables can create unexpected CQL row in 3.x
> 
>
> Key: CASSANDRA-14008
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14008
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local Write-Read Paths
>Reporter: Jeff Jirsa
>Assignee: Jeff Jirsa
>  Labels: correctness
> Fix For: 3.0.x, 3.11.x
>
>
> In 2.1/2.2, it is possible for a range tombstone that isn't a row deletion 
> and isn't a complex deletion to appear between two cells with the same 
> clustering. The 8099 legacy code incorrectly treats the two (non-RT) cells as 
> two distinct CQL rows, despite having the same clustering prefix.
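
To make the description above concrete, here is a toy sketch in Python. It is not the LegacyLayout code: the atom tuples, the function name, and the idea that the split is triggered by the tombstone itself are all assumptions made for illustration. It only shows how a range tombstone sitting between two cells with the same clustering can turn one CQL row into two if the grouping logic treats the tombstone as a row boundary.

```python
def group_rows(atoms, split_on_tombstone):
    """Group a stream of legacy atoms into (clustering, {column: value}) rows.

    Atoms are hypothetical tuples: ("cell", clustering, column, value) or
    ("rt", start, end). With split_on_tombstone=True the grouping wrongly
    starts a new row after every range tombstone.
    """
    rows = []
    current_clustering = None
    fresh = True  # the next cell must open a new row
    for atom in atoms:
        if atom[0] == "rt":
            if split_on_tombstone:
                fresh = True
            continue
        _, clustering, column, value = atom
        if fresh or clustering != current_clustering:
            rows.append((clustering, {}))
            current_clustering = clustering
            fresh = False
        rows[-1][1][column] = value
    return rows


atoms = [
    ("cell", (1,), "a", 10),
    ("rt", "start", "end"),  # neither a row deletion nor a complex deletion
    ("cell", (1,), "b", 20),
]
buggy = group_rows(atoms, split_on_tombstone=True)
fixed = group_rows(atoms, split_on_tombstone=False)
```

In this toy, `buggy` contains two rows that share the clustering `(1,)`, while `fixed` contains the single row a CQL reader would expect.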




[jira] [Commented] (CASSANDRA-13976) introduce max_hint_window_in_min, deprecate max_hint_window_in_ms

2017-11-30 Thread Jon Haddad (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273727#comment-16273727
 ] 

Jon Haddad commented on CASSANDRA-13976:


I've thought about this a bit more, and I think across the board we should be 
using duration types and get rid of the _ms label altogether.

It's WAY more readable and friendly to be able to do:

{code}
max_hint_window = 3h
{code}

Regarding nodetool, it would report back whatever setting was in there.  
Internally, it would convert everything to ms, leaving the current code in 
place.
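
As a rough illustration of the conversion described above, a duration string could be turned into milliseconds along these lines. This is a sketch in Python rather than Cassandra's Java, and the set of unit suffixes and the function name are assumptions, not an agreed syntax.

```python
import re

# Hypothetical unit table; the suffixes are an assumption for this sketch.
_UNIT_MS = {"ms": 1, "s": 1000, "m": 60_000, "h": 3_600_000, "d": 86_400_000}
_DURATION = re.compile(r"^\s*(\d+)\s*(ms|s|m|h|d)\s*$")


def duration_to_ms(text):
    """Convert a string such as '3h' or '180m' to milliseconds."""
    match = _DURATION.match(text)
    if match is None:
        raise ValueError("unrecognised duration: %r" % (text,))
    value, unit = match.groups()
    return int(value) * _UNIT_MS[unit]
```

With this, `duration_to_ms("3h")` and `duration_to_ms("180m")` give the same number of milliseconds, so the existing ms-based code could stay as it is.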

> introduce max_hint_window_in_min, deprecate max_hint_window_in_ms
> -
>
> Key: CASSANDRA-13976
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13976
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Jon Haddad
>Assignee: Kirk True
>Priority: Minor
>  Labels: lhf
> Fix For: 4.0
>
>
> Milliseconds is unnecessarily precise.  At most, minutes would be used.  
> Config in 4.0 should default to a minute granularity, but if the 
> max_hint_window_in_min isn't set should fall back on max_hint_window_in_ms 
> and emit a warning.
> max_hint_window_in_min: 180 # new default, still at 3 hours.




[jira] [Resolved] (CASSANDRA-14074) Remove "OpenJDK is not recommended" Startup Warning

2017-11-30 Thread Kurt Greaves (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kurt Greaves resolved CASSANDRA-14074.
--
Resolution: Fixed

Closing as duplicate of CASSANDRA-13916

> Remove "OpenJDK is not recommended" Startup Warning
> ---
>
> Key: CASSANDRA-14074
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14074
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Michael Kjellman
>  Labels: lhf
>
> We should remove the following warning on C* startup that OpenJDK is not 
> recommended. As of JDK8, OpenJDK is the reference JVM implementation and is 
> much more stable, and all of our tests run on OpenJDK builds due to the 
> Oracle JDK license. The warning isn't helpful, is actually wrong, and should 
> be removed to prevent user confusion.
> WARN  [main] 2017-11-28 19:39:08,446 StartupChecks.java:202 - OpenJDK is not 
> recommended. Please upgrade to the newest Oracle Java release




[jira] [Updated] (CASSANDRA-13916) Remove OpenJDK log warning

2017-11-30 Thread Kurt Greaves (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kurt Greaves updated CASSANDRA-13916:
-
Labels: lhf  (was: )

> Remove OpenJDK log warning
> --
>
> Key: CASSANDRA-13916
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13916
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Anthony Grasso
>Priority: Minor
>  Labels: lhf
>
> The following warning message will appear in the logs when using OpenJDK
> {noformat}
> WARN  [main] ... OpenJDK is not recommended. Please upgrade to the newest 
> Oracle Java release
> {noformat}
> The above warning dates back to when OpenJDK 6 was released and there were 
> some issues in early releases of this version. The OpenJDK implementation is 
> used as a reference for the OracleJDK which means the implementations are 
> very close. In addition, most users have moved off Java 6 so we can probably 
> remove this warning message.




[jira] [Commented] (CASSANDRA-13976) introduce max_hint_window_in_min, deprecate max_hint_window_in_ms

2017-11-30 Thread Kurt Greaves (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273722#comment-16273722
 ] 

Kurt Greaves commented on CASSANDRA-13976:
--

It's not necessary to support both. Just convert the minute value to ms if the 
{{max_hint_window_in_min}} property exists; otherwise use the 
{{max_hint_window_in_ms}} value, or the default if neither exists.

TBH I don't think we should ever completely get rid of the ms config option as 
I wouldn't be surprised if there are tests relying on setting it to <1min, but 
we could remove it from the default yaml and add in the minute based config 
instead.
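
The lookup order suggested above might look roughly like this. It is a Python sketch, not the actual config loader: the function name and the dict-of-settings input are assumptions, and the warning follows the ticket description rather than anything decided here.

```python
import warnings

DEFAULT_MAX_HINT_WINDOW_MS = 3 * 60 * 60 * 1000  # 3 hours, as in the ticket


def resolve_max_hint_window_ms(config):
    """Return the hint window in ms from a dict of parsed yaml settings."""
    if "max_hint_window_in_min" in config:
        # Preferred setting: convert minutes to milliseconds.
        return int(config["max_hint_window_in_min"]) * 60 * 1000
    if "max_hint_window_in_ms" in config:
        # Old setting still honoured, e.g. for tests using sub-minute windows.
        warnings.warn("max_hint_window_in_ms is deprecated; "
                      "use max_hint_window_in_min", DeprecationWarning)
        return int(config["max_hint_window_in_ms"])
    return DEFAULT_MAX_HINT_WINDOW_MS
```

Keeping the ms branch means a test can still pass `{"max_hint_window_in_ms": 5000}` and get a window shorter than one minute.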

> introduce max_hint_window_in_min, deprecate max_hint_window_in_ms
> -
>
> Key: CASSANDRA-13976
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13976
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Jon Haddad
>Assignee: Kirk True
>Priority: Minor
>  Labels: lhf
> Fix For: 4.0
>
>
> Milliseconds is unnecessarily precise.  At most, minutes would be used.  
> Config in 4.0 should default to a minute granularity, but if the 
> max_hint_window_in_min isn't set should fall back on max_hint_window_in_ms 
> and emit a warning.
> max_hint_window_in_min: 180 # new default, still at 3 hours.




[jira] [Commented] (CASSANDRA-14013) Data loss in snapshots keyspace after service restart

2017-11-30 Thread Vincent White (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273709#comment-16273709
 ] 

Vincent White commented on CASSANDRA-14013:
---

I've created a patch for 3.0.x and trunk using the same method. I guess it 
should be safe to work with just absolute paths rather than canonical paths 
here, but I haven't made that change on the 3.x.x patches yet. I also had to 
fiddle with the unit tests since there is now a dependency on 
DatabaseDescriptor and passing in file paths that exist in the configured data 
directory.

[3.0.x|https://github.com/vincewhite/cassandra/commits/14013-30]
[3.11.x|https://github.com/vincewhite/cassandra/commits/14013-test]
[trunk|https://github.com/vincewhite/cassandra/commits/14013-trunk]
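
For readers unsure of the distinction between absolute and canonical paths mentioned above, here is a small Python analogy. The patches themselves are Java, and the helper names and directory names below are made up for the illustration. In Python terms, the absolute form leaves symlinks alone while the canonical form resolves them.

```python
import os
import tempfile


def absolute_and_canonical(path):
    """Return (absolute, canonical) forms of a path.

    os.path.abspath anchors the path at the working directory and normalises
    '.' and '..'; os.path.realpath additionally resolves symlinks.
    """
    return os.path.abspath(path), os.path.realpath(path)


def demo():
    # Build a directory and a symlink pointing at it (names are illustrative).
    root = os.path.realpath(tempfile.mkdtemp())
    target = os.path.join(root, "data")
    link = os.path.join(root, "data_link")
    os.mkdir(target)
    os.symlink(target, link)
    return target, link, absolute_and_canonical(link)
```

Calling `demo()` returns the symlink unchanged as the absolute path and the real directory as the canonical path, which is why the two approaches can disagree when a data directory is reached through a symlink.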



> Data loss in snapshots keyspace after service restart
> -
>
> Key: CASSANDRA-14013
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14013
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Gregor Uhlenheuer
>Assignee: Vincent White
>
> I am posting this bug in the hope of discovering the stupid mistake I am 
> making, because I can't imagine a reasonable explanation for the behavior I 
> see right now :-)
> In short, I observe data loss in a keyspace called *snapshots* after 
> restarting the Cassandra service. Say I have 1000 records in a table 
> called *snapshots.test_idx*; after a restart the table has fewer entries or 
> is even empty.
> The "mysterious" part is that it happens only in a keyspace 
> called *snapshots*...
> h3. Steps to reproduce
> These steps to reproduce show the described behavior in "most" attempts (not 
> every single time though).
> {code}
> # create keyspace
> CREATE KEYSPACE snapshots WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 1};
> # create table
> CREATE TABLE snapshots.test_idx (key text, seqno bigint, primary key(key));
> # insert some test data
> INSERT INTO snapshots.test_idx (key,seqno) values ('key1', 1);
> ...
> INSERT INTO snapshots.test_idx (key,seqno) values ('key1000', 1000);
> # count entries
> SELECT count(*) FROM snapshots.test_idx;
> 1000
> # restart service
> kill 
> cassandra -f
> # count entries
> SELECT count(*) FROM snapshots.test_idx;
> 0
> {code}
> I hope someone can point me to the obvious mistake I am making :-)
> This happened to me using both Cassandra 3.9 and 3.11.0




[jira] [Comment Edited] (CASSANDRA-14020) test_pep8_compliance - cqlsh_tests.cqlsh_tests.TestCqlsh: pep8 has been renamed to pycodestyle (GitHub issue #466)

2017-11-30 Thread Jay Zhuang (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16269227#comment-16269227
 ] 

Jay Zhuang edited comment on CASSANDRA-14020 at 12/1/17 12:09 AM:
--

Thanks [~mkjellman]
Fixing 
[linter_check.sh|https://github.com/apache/cassandra-dtest/blob/master/linter_check.sh#L10]
 here: CASSANDRA-14076


was (Author: jay.zhuang):
Thanks [~mkjellman]
Would be great if pep8 is also renamed in 
[linter_check.sh|https://github.com/apache/cassandra-dtest/blob/master/linter_check.sh#L10]

> test_pep8_compliance - cqlsh_tests.cqlsh_tests.TestCqlsh: pep8 has been 
> renamed to pycodestyle (GitHub issue #466)
> --
>
> Key: CASSANDRA-14020
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14020
> Project: Cassandra
>  Issue Type: Bug
>  Components: Testing
>Reporter: Michael Kjellman
>Assignee: Michael Kjellman
>
> test_pep8_compliance - cqlsh_tests.cqlsh_tests.TestCqlsh always fails due to 
> us catching an informative warning from the pep8 tool. It looks like we just 
> need to swap out the usage:
> /home/cassandra/env/local/lib/python2.7/site-packages/pep8.py:2124: 
> UserWarning: 
> pep8 has been renamed to pycodestyle (GitHub issue #466)
> Use of the pep8 tool will be removed in a future release.
> Please install and use `pycodestyle` instead.
> $ pip install pycodestyle
> $ pycodestyle ...
>   '\n\n'




[jira] [Comment Edited] (CASSANDRA-14076) dtest code style check failed

2017-11-30 Thread Jay Zhuang (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273643#comment-16273643
 ] 

Jay Zhuang edited comment on CASSANDRA-14076 at 11/30/17 11:32 PM:
---

Here is the patch, please review:
| Branch | TravisCI Build Status |
| [14076|https://github.com/cooldoger/cassandra-dtest/tree/14076] | 
[!https://travis-ci.org/cooldoger/cassandra-dtest.svg?branch=14076!|https://travis-ci.org/cooldoger/cassandra-dtest/builds/309766256]
 |


was (Author: jay.zhuang):
Here is the patch, please review:
| Branch | TravisCI Build Status |
| [14076|https://github.com/cooldoger/cassandra/tree/14076] | 
[!https://travis-ci.org/cooldoger/cassandra-dtest.svg?branch=14076!|https://travis-ci.org/cooldoger/cassandra-dtest/builds/309766256]
 |

> dtest code style check failed
> -
>
> Key: CASSANDRA-14076
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14076
> Project: Cassandra
>  Issue Type: Bug
>  Components: Testing
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>
> https://travis-ci.org/cooldoger/cassandra-dtest
> {noformat}
> $ flake8 --ignore=E501,F811,F812,F822,F823,F831,F841,N8,C9 
> --exclude=thrift_bindings,cassandra-thrift .
> ./consistency_test.py:547:17: E722 do not use bare except'
> ./consistency_test.py:976:49: E251 unexpected spaces around keyword / 
> parameter equals
> ./consistency_test.py:976:51: E251 unexpected spaces around keyword / 
> parameter equals
> ./consistency_test.py:981:63: E703 statement ends with a semicolon
> ./consistency_test.py:1037:49: E251 unexpected spaces around keyword / 
> parameter equals
> ./consistency_test.py:1037:51: E251 unexpected spaces around keyword / 
> parameter equals
> ./consistency_test.py:1054:46: E261 at least two spaces before inline comment
> ./consistency_test.py:1103:22: E251 unexpected spaces around keyword / 
> parameter equals
> ./consistency_test.py:1103:24: E251 unexpected spaces around keyword / 
> parameter equals
> ./consistency_test.py:1175:22: E251 unexpected spaces around keyword / 
> parameter equals
> ./consistency_test.py:1175:24: E251 unexpected spaces around keyword / 
> parameter equals
> ./counter_tests.py:59:24: E703 statement ends with a semicolon
> ./counter_tests.py:383:37: E261 at least two spaces before inline comment
> ./dtest.py:586:13: E722 do not use bare except'
> ./dtest.py:1130:1: E302 expected 2 blank lines, found 1
> ./nodetool_test.py:9:1: E302 expected 2 blank lines, found 1
> ./nodetool_test.py:78:1: W293 blank line contains whitespace
> ./nodetool_test.py:174:45: E261 at least two spaces before inline comment
> ./run_dtests.py:220:54: E221 multiple spaces before operator
> ./secondary_indexes_test.py:14:1: F401 'dtest.DtestTimeoutError' imported but 
> unused
> ./secondary_indexes_test.py:17:1: F401 'tools.data.index_is_built' imported 
> but unused
> ./secondary_indexes_test.py:21:1: E302 expected 2 blank lines, found 1
> ./sslnodetonode_test.py:15:1: E302 expected 2 blank lines, found 1
> ./sslnodetonode_test.py:191:1: W293 blank line contains whitespace
> ./sslnodetonode_test.py:191:1: W391 blank line at end of file
> ./system_keyspaces_test.py:6:1: E302 expected 2 blank lines, found 1
> ./system_keyspaces_test.py:28:59: E241 multiple spaces after ','
> ./system_keyspaces_test.py:50:62: E241 multiple spaces after ','
> ./write_failures_test.py:5:1: F401 'distutils.version.LooseVersion' imported 
> but unused
> ./plugins/dtestcollect.py:1:1: F401 'collections.namedtuple' imported but 
> unused
> ./plugins/dtestcollect.py:3:1: F401 'pprint.pprint' imported but unused
> ./plugins/dtestcollect.py:5:1: F401 'inspect' imported but unused
> ./plugins/dtestcollect.py:13:1: E302 expected 2 blank lines, found 1
> ./plugins/dtestcollect.py:44:9: E306 expected 1 blank line before a nested 
> definition, found 0
> ./plugins/dtestcollect.py:62:22: E703 statement ends with a semicolon
> ./plugins/dtestcollect.py:64:1: E302 expected 2 blank lines, found 1
> ./plugins/dtesttag.py:1:1: F401 'collections.namedtuple' imported but unused
> ./plugins/dtesttag.py:4:1: F401 'pprint.pprint' imported but unused
> ./plugins/dtesttag.py:8:1: E302 expected 2 blank lines, found 1
> ./plugins/dtesttag.py:20:1: W293 blank line contains whitespace
> ./plugins/dtesttag.py:25:1: W293 blank line contains whitespace
> ./plugins/dtestxunit.py:43:1: F401 'doctest' imported but unused
> ./plugins/dtestxunit.py:46:1: F401 'traceback' imported but unused
> ./plugins/dtestxunit.py:62:1: E302 expected 2 blank lines, found 1
> ./plugins/dtestxunit.py:66:1: E302 expected 2 blank lines, found 1
> ./plugins/dtestxunit.py:70:1: E302 expected 2 blank lines, found 1
> ./plugins/dtestxunit.py:76:29: E226 missing whitespace around arithmetic 
> operator
> ./plugins/dtestxunit.py:84:1: E302 expected 2 blank l

[jira] [Comment Edited] (CASSANDRA-14076) dtest code style check failed

2017-11-30 Thread Jeff Jirsa (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273645#comment-16273645
 ] 

Jeff Jirsa edited comment on CASSANDRA-14076 at 11/30/17 11:28 PM:
---

cc [~philipthompson]

(Also actual branch is https://github.com/cooldoger/cassandra-dtest/tree/14076 
) 



was (Author: jjirsa):
cc [~philipthompson]


> dtest code style check failed
> -
>
> Key: CASSANDRA-14076
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14076
> Project: Cassandra
>  Issue Type: Bug
>  Components: Testing
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>
> https://travis-ci.org/cooldoger/cassandra-dtest
> {noformat}
> $ flake8 --ignore=E501,F811,F812,F822,F823,F831,F841,N8,C9 
> --exclude=thrift_bindings,cassandra-thrift .
> ./consistency_test.py:547:17: E722 do not use bare except'
> ./consistency_test.py:976:49: E251 unexpected spaces around keyword / 
> parameter equals
> ./consistency_test.py:976:51: E251 unexpected spaces around keyword / 
> parameter equals
> ./consistency_test.py:981:63: E703 statement ends with a semicolon
> ./consistency_test.py:1037:49: E251 unexpected spaces around keyword / 
> parameter equals
> ./consistency_test.py:1037:51: E251 unexpected spaces around keyword / 
> parameter equals
> ./consistency_test.py:1054:46: E261 at least two spaces before inline comment
> ./consistency_test.py:1103:22: E251 unexpected spaces around keyword / 
> parameter equals
> ./consistency_test.py:1103:24: E251 unexpected spaces around keyword / 
> parameter equals
> ./consistency_test.py:1175:22: E251 unexpected spaces around keyword / 
> parameter equals
> ./consistency_test.py:1175:24: E251 unexpected spaces around keyword / 
> parameter equals
> ./counter_tests.py:59:24: E703 statement ends with a semicolon
> ./counter_tests.py:383:37: E261 at least two spaces before inline comment
> ./dtest.py:586:13: E722 do not use bare except'
> ./dtest.py:1130:1: E302 expected 2 blank lines, found 1
> ./nodetool_test.py:9:1: E302 expected 2 blank lines, found 1
> ./nodetool_test.py:78:1: W293 blank line contains whitespace
> ./nodetool_test.py:174:45: E261 at least two spaces before inline comment
> ./run_dtests.py:220:54: E221 multiple spaces before operator
> ./secondary_indexes_test.py:14:1: F401 'dtest.DtestTimeoutError' imported but 
> unused
> ./secondary_indexes_test.py:17:1: F401 'tools.data.index_is_built' imported 
> but unused
> ./secondary_indexes_test.py:21:1: E302 expected 2 blank lines, found 1
> ./sslnodetonode_test.py:15:1: E302 expected 2 blank lines, found 1
> ./sslnodetonode_test.py:191:1: W293 blank line contains whitespace
> ./sslnodetonode_test.py:191:1: W391 blank line at end of file
> ./system_keyspaces_test.py:6:1: E302 expected 2 blank lines, found 1
> ./system_keyspaces_test.py:28:59: E241 multiple spaces after ','
> ./system_keyspaces_test.py:50:62: E241 multiple spaces after ','
> ./write_failures_test.py:5:1: F401 'distutils.version.LooseVersion' imported 
> but unused
> ./plugins/dtestcollect.py:1:1: F401 'collections.namedtuple' imported but 
> unused
> ./plugins/dtestcollect.py:3:1: F401 'pprint.pprint' imported but unused
> ./plugins/dtestcollect.py:5:1: F401 'inspect' imported but unused
> ./plugins/dtestcollect.py:13:1: E302 expected 2 blank lines, found 1
> ./plugins/dtestcollect.py:44:9: E306 expected 1 blank line before a nested 
> definition, found 0
> ./plugins/dtestcollect.py:62:22: E703 statement ends with a semicolon
> ./plugins/dtestcollect.py:64:1: E302 expected 2 blank lines, found 1
> ./plugins/dtesttag.py:1:1: F401 'collections.namedtuple' imported but unused
> ./plugins/dtesttag.py:4:1: F401 'pprint.pprint' imported but unused
> ./plugins/dtesttag.py:8:1: E302 expected 2 blank lines, found 1
> ./plugins/dtesttag.py:20:1: W293 blank line contains whitespace
> ./plugins/dtesttag.py:25:1: W293 blank line contains whitespace
> ./plugins/dtestxunit.py:43:1: F401 'doctest' imported but unused
> ./plugins/dtestxunit.py:46:1: F401 'traceback' imported but unused
> ./plugins/dtestxunit.py:62:1: E302 expected 2 blank lines, found 1
> ./plugins/dtestxunit.py:66:1: E302 expected 2 blank lines, found 1
> ./plugins/dtestxunit.py:70:1: E302 expected 2 blank lines, found 1
> ./plugins/dtestxunit.py:76:29: E226 missing whitespace around arithmetic 
> operator
> ./plugins/dtestxunit.py:84:1: E302 expected 2 blank lines, found 1
> ./plugins/dtestxunit.py:107:1: E302 expected 2 blank lines, found 1
> ./plugins/dtestxunit.py:126:1: E302 expected 2 blank lines, found 1
> ./plugins/dtestxunit.py:219:32: W503 line break before binary operator
> ./plugins/dtestxunit.py:269:25: E126 continuation line over-indented for 
> hanging indent
> ./plugins/dtestxunit.py:277:25: E126 continuation line over-indented for 
> hanging indent
> ./rep

[jira] [Updated] (CASSANDRA-14075) Many sslnodetonode_test.TestNodeToNodeSSLEncryption tests failing with "Please remove properties [optional, enabled] from your cassandra.yaml"

2017-11-30 Thread Jeff Jirsa (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-14075:
---
Component/s: Testing

> Many sslnodetonode_test.TestNodeToNodeSSLEncryption tests failing with 
> "Please remove properties [optional, enabled] from your cassandra.yaml"
> --
>
> Key: CASSANDRA-14075
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14075
> Project: Cassandra
>  Issue Type: Bug
>  Components: Testing
>Reporter: Michael Kjellman
>Assignee: Jason Brown
>
> Many sslnodetonode_test.TestNodeToNodeSSLEncryption dtests are failing on 
> 3.11 with an exception on startup due to invalid yaml properties.
> Unexpected error in node1 log, error: 
> ERROR [main] 2017-11-18 21:01:54,781 CassandraDaemon.java:706 - Exception 
> encountered during startup: Invalid yaml. Please remove properties [optional, 
> enabled] from your cassandra.yaml 
> Although ccm was updated in 
> https://github.com/pcmanus/ccm/commit/eaaa425b70edb84786924516aee3920d685c0e53
>  to include a version check for >= 4.0, enabled and optional are emitted 
> unconditionally in the actual dtest itself -- they should also be conditional 
> on >= 4.0
> {code:java}
> node.set_configuration_options(values={
>     'server_encryption_options': {
>         'enabled': encryption_enabled,
>         'optional': encryption_optional,
>         'internode_encryption': internode_encryption,
>         'keystore': kspath,
>         'keystore_password': 'cassandra',
>         'truststore': tspath,
>         'truststore_password': 'cassandra',
>         'require_endpoint_verification': endpoint_verification,
>         'require_client_auth': client_auth,
>     }
> })
> {code}
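
One way the dtest could make those two properties conditional is sketched below. This is only an illustration: `major_minor` and `server_encryption_options` are hypothetical helpers, and how the dtest actually obtains the cluster version is not shown in the ticket, so the version is passed in as a plain string here.

```python
import re


def major_minor(version_text):
    """Extract (major, minor) from a string such as '3.11.1' or '4.0-SNAPSHOT'."""
    numbers = re.findall(r"\d+", version_text)
    return int(numbers[0]), int(numbers[1])


def server_encryption_options(version_text, base_options,
                              encryption_enabled, encryption_optional):
    """Return a copy of base_options, adding 'enabled' and 'optional'
    only when the cluster version is 4.0 or later."""
    options = dict(base_options)
    if major_minor(version_text) >= (4, 0):
        options["enabled"] = encryption_enabled
        options["optional"] = encryption_optional
    return options
```

The resulting dict would then be passed as the `'server_encryption_options'` value in the existing `node.set_configuration_options(values={...})` call, so a 3.11 node never sees the two properties it rejects.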




[jira] [Commented] (CASSANDRA-14076) dtest code style check failed

2017-11-30 Thread Jeff Jirsa (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273645#comment-16273645
 ] 

Jeff Jirsa commented on CASSANDRA-14076:


cc [~philipthompson]


> dtest code style check failed
> -
>
> Key: CASSANDRA-14076
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14076
> Project: Cassandra
>  Issue Type: Bug
>  Components: Testing
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>
> https://travis-ci.org/cooldoger/cassandra-dtest
> {noformat}
> $ flake8 --ignore=E501,F811,F812,F822,F823,F831,F841,N8,C9 
> --exclude=thrift_bindings,cassandra-thrift .
> ./consistency_test.py:547:17: E722 do not use bare except'
> ./consistency_test.py:976:49: E251 unexpected spaces around keyword / 
> parameter equals
> ./consistency_test.py:976:51: E251 unexpected spaces around keyword / 
> parameter equals
> ./consistency_test.py:981:63: E703 statement ends with a semicolon
> ./consistency_test.py:1037:49: E251 unexpected spaces around keyword / 
> parameter equals
> ./consistency_test.py:1037:51: E251 unexpected spaces around keyword / 
> parameter equals
> ./consistency_test.py:1054:46: E261 at least two spaces before inline comment
> ./consistency_test.py:1103:22: E251 unexpected spaces around keyword / 
> parameter equals
> ./consistency_test.py:1103:24: E251 unexpected spaces around keyword / 
> parameter equals
> ./consistency_test.py:1175:22: E251 unexpected spaces around keyword / 
> parameter equals
> ./consistency_test.py:1175:24: E251 unexpected spaces around keyword / 
> parameter equals
> ./counter_tests.py:59:24: E703 statement ends with a semicolon
> ./counter_tests.py:383:37: E261 at least two spaces before inline comment
> ./dtest.py:586:13: E722 do not use bare except'
> ./dtest.py:1130:1: E302 expected 2 blank lines, found 1
> ./nodetool_test.py:9:1: E302 expected 2 blank lines, found 1
> ./nodetool_test.py:78:1: W293 blank line contains whitespace
> ./nodetool_test.py:174:45: E261 at least two spaces before inline comment
> ./run_dtests.py:220:54: E221 multiple spaces before operator
> ./secondary_indexes_test.py:14:1: F401 'dtest.DtestTimeoutError' imported but 
> unused
> ./secondary_indexes_test.py:17:1: F401 'tools.data.index_is_built' imported 
> but unused
> ./secondary_indexes_test.py:21:1: E302 expected 2 blank lines, found 1
> ./sslnodetonode_test.py:15:1: E302 expected 2 blank lines, found 1
> ./sslnodetonode_test.py:191:1: W293 blank line contains whitespace
> ./sslnodetonode_test.py:191:1: W391 blank line at end of file
> ./system_keyspaces_test.py:6:1: E302 expected 2 blank lines, found 1
> ./system_keyspaces_test.py:28:59: E241 multiple spaces after ','
> ./system_keyspaces_test.py:50:62: E241 multiple spaces after ','
> ./write_failures_test.py:5:1: F401 'distutils.version.LooseVersion' imported 
> but unused
> ./plugins/dtestcollect.py:1:1: F401 'collections.namedtuple' imported but 
> unused
> ./plugins/dtestcollect.py:3:1: F401 'pprint.pprint' imported but unused
> ./plugins/dtestcollect.py:5:1: F401 'inspect' imported but unused
> ./plugins/dtestcollect.py:13:1: E302 expected 2 blank lines, found 1
> ./plugins/dtestcollect.py:44:9: E306 expected 1 blank line before a nested 
> definition, found 0
> ./plugins/dtestcollect.py:62:22: E703 statement ends with a semicolon
> ./plugins/dtestcollect.py:64:1: E302 expected 2 blank lines, found 1
> ./plugins/dtesttag.py:1:1: F401 'collections.namedtuple' imported but unused
> ./plugins/dtesttag.py:4:1: F401 'pprint.pprint' imported but unused
> ./plugins/dtesttag.py:8:1: E302 expected 2 blank lines, found 1
> ./plugins/dtesttag.py:20:1: W293 blank line contains whitespace
> ./plugins/dtesttag.py:25:1: W293 blank line contains whitespace
> ./plugins/dtestxunit.py:43:1: F401 'doctest' imported but unused
> ./plugins/dtestxunit.py:46:1: F401 'traceback' imported but unused
> ./plugins/dtestxunit.py:62:1: E302 expected 2 blank lines, found 1
> ./plugins/dtestxunit.py:66:1: E302 expected 2 blank lines, found 1
> ./plugins/dtestxunit.py:70:1: E302 expected 2 blank lines, found 1
> ./plugins/dtestxunit.py:76:29: E226 missing whitespace around arithmetic 
> operator
> ./plugins/dtestxunit.py:84:1: E302 expected 2 blank lines, found 1
> ./plugins/dtestxunit.py:107:1: E302 expected 2 blank lines, found 1
> ./plugins/dtestxunit.py:126:1: E302 expected 2 blank lines, found 1
> ./plugins/dtestxunit.py:219:32: W503 line break before binary operator
> ./plugins/dtestxunit.py:269:25: E126 continuation line over-indented for 
> hanging indent
> ./plugins/dtestxunit.py:277:25: E126 continuation line over-indented for 
> hanging indent
> ./repair_tests/deprecated_repair_test.py:159:9: E741 ambiguous variable name 
> 'l'
> ./repair_tests/incremental_repair_test.py:772:4: W291 trailing whitespace
> ./repair_tests/incremental

[jira] [Commented] (CASSANDRA-14079) Prevent compaction strategies from looping indefinitely

2017-11-30 Thread Paulo Motta (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273646#comment-16273646
 ] 

Paulo Motta commented on CASSANDRA-14079:
-

Ninja fixed bad commit/merge as {{d2e4ce48959bc56d9c366de20cd4c0f3c9bdf16b}} on 
cassandra-3.11 and fixed master as {{88b244a1380c44d36861b6d0be9c78c968d292c2}}. 
Thanks Joel!

> Prevent compaction strategies from looping indefinitely
> ---
>
> Key: CASSANDRA-14079
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14079
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Paulo Motta
>Assignee: Paulo Motta
>Priority: Minor
> Fix For: 3.11.2, 4.0
>
>
> As a result of CASSANDRA-13948, LCS was looping indefinitely trying to 
> generate the same candidates for SSTables which were not on the tracker.
> We should add a protection on compaction strategies against looping 
> indefinitely to avoid similar bugs in the future.
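
The kind of protection described above could take many forms. The Python sketch below shows one possibility and is not the committed fix: the function, its callables, and the `max_repeats` threshold are all hypothetical. It stops the loop when the strategy keeps proposing exactly the same candidates.

```python
def run_compactions(next_candidates, compact, max_repeats=3):
    """Drive a strategy until it has no candidates, refusing to spin forever.

    next_candidates() returns an iterable of sstable names (empty when done);
    compact(candidates) performs the work. Both are hypothetical callables.
    Returns the number of compaction rounds that were run.
    """
    previous = None
    repeats = 0
    rounds = 0
    while True:
        candidates = frozenset(next_candidates())
        if not candidates:
            return rounds
        if candidates == previous:
            repeats += 1
            if repeats >= max_repeats:
                raise RuntimeError(
                    "same candidates proposed %d times in a row: %s"
                    % (repeats + 1, sorted(candidates)))
        else:
            previous, repeats = candidates, 0
        compact(candidates)
        rounds += 1
```

A strategy that always returns the same sstables, as in the CASSANDRA-13948 case, would hit the `RuntimeError` after a few rounds instead of looping indefinitely.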




[jira] [Commented] (CASSANDRA-14076) dtest code style check failed

2017-11-30 Thread Jay Zhuang (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273643#comment-16273643
 ] 

Jay Zhuang commented on CASSANDRA-14076:


Here is the patch, please review:
| Branch | TravisCI Build Status |
| [14076|https://github.com/cooldoger/cassandra/tree/14076] | 
[!https://travis-ci.org/cooldoger/cassandra-dtest.svg?branch=14076!|https://travis-ci.org/cooldoger/cassandra-dtest/builds/309766256]
 |

> dtest code style check failed
> -
>
> Key: CASSANDRA-14076
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14076
> Project: Cassandra
>  Issue Type: Bug
>  Components: Testing
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>
> https://travis-ci.org/cooldoger/cassandra-dtest
> {noformat}
> $ flake8 --ignore=E501,F811,F812,F822,F823,F831,F841,N8,C9 
> --exclude=thrift_bindings,cassandra-thrift .
> ./consistency_test.py:547:17: E722 do not use bare except'
> ./consistency_test.py:976:49: E251 unexpected spaces around keyword / 
> parameter equals
> ./consistency_test.py:976:51: E251 unexpected spaces around keyword / 
> parameter equals
> ./consistency_test.py:981:63: E703 statement ends with a semicolon
> ./consistency_test.py:1037:49: E251 unexpected spaces around keyword / 
> parameter equals
> ./consistency_test.py:1037:51: E251 unexpected spaces around keyword / 
> parameter equals
> ./consistency_test.py:1054:46: E261 at least two spaces before inline comment
> ./consistency_test.py:1103:22: E251 unexpected spaces around keyword / 
> parameter equals
> ./consistency_test.py:1103:24: E251 unexpected spaces around keyword / 
> parameter equals
> ./consistency_test.py:1175:22: E251 unexpected spaces around keyword / 
> parameter equals
> ./consistency_test.py:1175:24: E251 unexpected spaces around keyword / 
> parameter equals
> ./counter_tests.py:59:24: E703 statement ends with a semicolon
> ./counter_tests.py:383:37: E261 at least two spaces before inline comment
> ./dtest.py:586:13: E722 do not use bare except'
> ./dtest.py:1130:1: E302 expected 2 blank lines, found 1
> ./nodetool_test.py:9:1: E302 expected 2 blank lines, found 1
> ./nodetool_test.py:78:1: W293 blank line contains whitespace
> ./nodetool_test.py:174:45: E261 at least two spaces before inline comment
> ./run_dtests.py:220:54: E221 multiple spaces before operator
> ./secondary_indexes_test.py:14:1: F401 'dtest.DtestTimeoutError' imported but 
> unused
> ./secondary_indexes_test.py:17:1: F401 'tools.data.index_is_built' imported 
> but unused
> ./secondary_indexes_test.py:21:1: E302 expected 2 blank lines, found 1
> ./sslnodetonode_test.py:15:1: E302 expected 2 blank lines, found 1
> ./sslnodetonode_test.py:191:1: W293 blank line contains whitespace
> ./sslnodetonode_test.py:191:1: W391 blank line at end of file
> ./system_keyspaces_test.py:6:1: E302 expected 2 blank lines, found 1
> ./system_keyspaces_test.py:28:59: E241 multiple spaces after ','
> ./system_keyspaces_test.py:50:62: E241 multiple spaces after ','
> ./write_failures_test.py:5:1: F401 'distutils.version.LooseVersion' imported 
> but unused
> ./plugins/dtestcollect.py:1:1: F401 'collections.namedtuple' imported but 
> unused
> ./plugins/dtestcollect.py:3:1: F401 'pprint.pprint' imported but unused
> ./plugins/dtestcollect.py:5:1: F401 'inspect' imported but unused
> ./plugins/dtestcollect.py:13:1: E302 expected 2 blank lines, found 1
> ./plugins/dtestcollect.py:44:9: E306 expected 1 blank line before a nested 
> definition, found 0
> ./plugins/dtestcollect.py:62:22: E703 statement ends with a semicolon
> ./plugins/dtestcollect.py:64:1: E302 expected 2 blank lines, found 1
> ./plugins/dtesttag.py:1:1: F401 'collections.namedtuple' imported but unused
> ./plugins/dtesttag.py:4:1: F401 'pprint.pprint' imported but unused
> ./plugins/dtesttag.py:8:1: E302 expected 2 blank lines, found 1
> ./plugins/dtesttag.py:20:1: W293 blank line contains whitespace
> ./plugins/dtesttag.py:25:1: W293 blank line contains whitespace
> ./plugins/dtestxunit.py:43:1: F401 'doctest' imported but unused
> ./plugins/dtestxunit.py:46:1: F401 'traceback' imported but unused
> ./plugins/dtestxunit.py:62:1: E302 expected 2 blank lines, found 1
> ./plugins/dtestxunit.py:66:1: E302 expected 2 blank lines, found 1
> ./plugins/dtestxunit.py:70:1: E302 expected 2 blank lines, found 1
> ./plugins/dtestxunit.py:76:29: E226 missing whitespace around arithmetic 
> operator
> ./plugins/dtestxunit.py:84:1: E302 expected 2 blank lines, found 1
> ./plugins/dtestxunit.py:107:1: E302 expected 2 blank lines, found 1
> ./plugins/dtestxunit.py:126:1: E302 expected 2 blank lines, found 1
> ./plugins/dtestxunit.py:219:32: W503 line break before binary operator
> ./plugins/dtestxunit.py:269:25: E126 continuation line over-indented for 
> hanging indent
> ./plugins/dtestxunit.py:277:25: E1

[jira] [Updated] (CASSANDRA-14076) dtest code style check failed

2017-11-30 Thread Jay Zhuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang updated CASSANDRA-14076:
---
Status: Patch Available  (was: Open)

> dtest code style check failed
> -
>
> Key: CASSANDRA-14076
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14076
> Project: Cassandra
>  Issue Type: Bug
>  Components: Testing
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>
> https://travis-ci.org/cooldoger/cassandra-dtest
> {noformat}
> $ flake8 --ignore=E501,F811,F812,F822,F823,F831,F841,N8,C9 
> --exclude=thrift_bindings,cassandra-thrift .
> ./consistency_test.py:547:17: E722 do not use bare except'
> ./consistency_test.py:976:49: E251 unexpected spaces around keyword / 
> parameter equals
> ./consistency_test.py:976:51: E251 unexpected spaces around keyword / 
> parameter equals
> ./consistency_test.py:981:63: E703 statement ends with a semicolon
> ./consistency_test.py:1037:49: E251 unexpected spaces around keyword / 
> parameter equals
> ./consistency_test.py:1037:51: E251 unexpected spaces around keyword / 
> parameter equals
> ./consistency_test.py:1054:46: E261 at least two spaces before inline comment
> ./consistency_test.py:1103:22: E251 unexpected spaces around keyword / 
> parameter equals
> ./consistency_test.py:1103:24: E251 unexpected spaces around keyword / 
> parameter equals
> ./consistency_test.py:1175:22: E251 unexpected spaces around keyword / 
> parameter equals
> ./consistency_test.py:1175:24: E251 unexpected spaces around keyword / 
> parameter equals
> ./counter_tests.py:59:24: E703 statement ends with a semicolon
> ./counter_tests.py:383:37: E261 at least two spaces before inline comment
> ./dtest.py:586:13: E722 do not use bare except'
> ./dtest.py:1130:1: E302 expected 2 blank lines, found 1
> ./nodetool_test.py:9:1: E302 expected 2 blank lines, found 1
> ./nodetool_test.py:78:1: W293 blank line contains whitespace
> ./nodetool_test.py:174:45: E261 at least two spaces before inline comment
> ./run_dtests.py:220:54: E221 multiple spaces before operator
> ./secondary_indexes_test.py:14:1: F401 'dtest.DtestTimeoutError' imported but 
> unused
> ./secondary_indexes_test.py:17:1: F401 'tools.data.index_is_built' imported 
> but unused
> ./secondary_indexes_test.py:21:1: E302 expected 2 blank lines, found 1
> ./sslnodetonode_test.py:15:1: E302 expected 2 blank lines, found 1
> ./sslnodetonode_test.py:191:1: W293 blank line contains whitespace
> ./sslnodetonode_test.py:191:1: W391 blank line at end of file
> ./system_keyspaces_test.py:6:1: E302 expected 2 blank lines, found 1
> ./system_keyspaces_test.py:28:59: E241 multiple spaces after ','
> ./system_keyspaces_test.py:50:62: E241 multiple spaces after ','
> ./write_failures_test.py:5:1: F401 'distutils.version.LooseVersion' imported 
> but unused
> ./plugins/dtestcollect.py:1:1: F401 'collections.namedtuple' imported but 
> unused
> ./plugins/dtestcollect.py:3:1: F401 'pprint.pprint' imported but unused
> ./plugins/dtestcollect.py:5:1: F401 'inspect' imported but unused
> ./plugins/dtestcollect.py:13:1: E302 expected 2 blank lines, found 1
> ./plugins/dtestcollect.py:44:9: E306 expected 1 blank line before a nested 
> definition, found 0
> ./plugins/dtestcollect.py:62:22: E703 statement ends with a semicolon
> ./plugins/dtestcollect.py:64:1: E302 expected 2 blank lines, found 1
> ./plugins/dtesttag.py:1:1: F401 'collections.namedtuple' imported but unused
> ./plugins/dtesttag.py:4:1: F401 'pprint.pprint' imported but unused
> ./plugins/dtesttag.py:8:1: E302 expected 2 blank lines, found 1
> ./plugins/dtesttag.py:20:1: W293 blank line contains whitespace
> ./plugins/dtesttag.py:25:1: W293 blank line contains whitespace
> ./plugins/dtestxunit.py:43:1: F401 'doctest' imported but unused
> ./plugins/dtestxunit.py:46:1: F401 'traceback' imported but unused
> ./plugins/dtestxunit.py:62:1: E302 expected 2 blank lines, found 1
> ./plugins/dtestxunit.py:66:1: E302 expected 2 blank lines, found 1
> ./plugins/dtestxunit.py:70:1: E302 expected 2 blank lines, found 1
> ./plugins/dtestxunit.py:76:29: E226 missing whitespace around arithmetic 
> operator
> ./plugins/dtestxunit.py:84:1: E302 expected 2 blank lines, found 1
> ./plugins/dtestxunit.py:107:1: E302 expected 2 blank lines, found 1
> ./plugins/dtestxunit.py:126:1: E302 expected 2 blank lines, found 1
> ./plugins/dtestxunit.py:219:32: W503 line break before binary operator
> ./plugins/dtestxunit.py:269:25: E126 continuation line over-indented for 
> hanging indent
> ./plugins/dtestxunit.py:277:25: E126 continuation line over-indented for 
> hanging indent
> ./repair_tests/deprecated_repair_test.py:159:9: E741 ambiguous variable name 
> 'l'
> ./repair_tests/incremental_repair_test.py:772:4: W291 trailing whitespace
> ./repair_tests/incremental_repair_test.py:773:76: W291 trailing w

[4/4] cassandra git commit: ninja: fix bad #14079 merge (Fix AbstractCompactionStrategyTest TableMetadataRef -> TableMetadata)

2017-11-30 Thread paulo

ninja: fix bad #14079 merge (Fix AbstractCompactionStrategyTest 
TableMetadataRef -> TableMetadata)


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/88b244a1
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/88b244a1
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/88b244a1

Branch: refs/heads/trunk
Commit: 88b244a1380c44d36861b6d0be9c78c968d292c2
Parents: f81e57e
Author: Paulo Motta 
Authored: Fri Dec 1 10:13:59 2017 +1100
Committer: Paulo Motta 
Committed: Fri Dec 1 10:19:24 2017 +1100

--
 .../cassandra/db/compaction/AbstractCompactionStrategyTest.java| 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/88b244a1/test/unit/org/apache/cassandra/db/compaction/AbstractCompactionStrategyTest.java
--
diff --git a/test/unit/org/apache/cassandra/db/compaction/AbstractCompactionStrategyTest.java b/test/unit/org/apache/cassandra/db/compaction/AbstractCompactionStrategyTest.java
index 481b394..b77589d 100644
--- a/test/unit/org/apache/cassandra/db/compaction/AbstractCompactionStrategyTest.java
+++ b/test/unit/org/apache/cassandra/db/compaction/AbstractCompactionStrategyTest.java
@@ -134,7 +134,7 @@ public class AbstractCompactionStrategyTest
         long timestamp = System.currentTimeMillis();
         DecoratedKey dk = Util.dk(String.format("%03d", key));
         ColumnFamilyStore cfs = Keyspace.open(KEYSPACE1).getColumnFamilyStore(table);
-        new RowUpdateBuilder(cfs.metadata, timestamp, dk.getKey())
+        new RowUpdateBuilder(cfs.metadata(), timestamp, dk.getKey())
         .clustering(String.valueOf(key))
         .add("val", "val")
         .build()



[2/4] cassandra git commit: ninja: fix bad #14079 commit (add removeUnsafe method used by AbstractCompactionStrategyTest)

2017-11-30 Thread paulo

ninja: fix bad #14079 commit (add removeUnsafe method used by 
AbstractCompactionStrategyTest)


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/d2e4ce48
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/d2e4ce48
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/d2e4ce48

Branch: refs/heads/trunk
Commit: d2e4ce48959bc56d9c366de20cd4c0f3c9bdf16b
Parents: c253ed4
Author: Paulo Motta 
Authored: Fri Dec 1 10:08:30 2017 +1100
Committer: Paulo Motta 
Committed: Fri Dec 1 10:18:29 2017 +1100

--
 src/java/org/apache/cassandra/db/lifecycle/Tracker.java | 8 
 1 file changed, 8 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/d2e4ce48/src/java/org/apache/cassandra/db/lifecycle/Tracker.java
--
diff --git a/src/java/org/apache/cassandra/db/lifecycle/Tracker.java b/src/java/org/apache/cassandra/db/lifecycle/Tracker.java
index 6136f79..47efbce 100644
--- a/src/java/org/apache/cassandra/db/lifecycle/Tracker.java
+++ b/src/java/org/apache/cassandra/db/lifecycle/Tracker.java
@@ -505,4 +505,12 @@ public class Tracker
     {
         return view.get();
     }
+
+    @VisibleForTesting
+    public void removeUnsafe(Set<SSTableReader> toRemove)
+    {
+        Pair<View, View> result = apply(view -> {
+            return updateLiveSet(toRemove, emptySet()).apply(view);
+        });
+    }
 }



[3/4] cassandra git commit: Merge branch 'cassandra-3.11' into trunk

2017-11-30 Thread paulo

Merge branch 'cassandra-3.11' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/f81e57e4
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/f81e57e4
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/f81e57e4

Branch: refs/heads/trunk
Commit: f81e57e4f6503260f9ba3a36d5d096ed8d97607f
Parents: a01019d d2e4ce4
Author: Paulo Motta 
Authored: Fri Dec 1 10:18:43 2017 +1100
Committer: Paulo Motta 
Committed: Fri Dec 1 10:18:43 2017 +1100

--
 src/java/org/apache/cassandra/db/lifecycle/Tracker.java | 8 
 1 file changed, 8 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/f81e57e4/src/java/org/apache/cassandra/db/lifecycle/Tracker.java
--



[1/4] cassandra git commit: ninja: fix bad #14079 commit (add removeUnsafe method used by AbstractCompactionStrategyTest)

2017-11-30 Thread paulo

Repository: cassandra
Updated Branches:
  refs/heads/cassandra-3.11 c253ed4fa -> d2e4ce489
  refs/heads/trunk a01019d2c -> 88b244a13


ninja: fix bad #14079 commit (add removeUnsafe method used by 
AbstractCompactionStrategyTest)


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/d2e4ce48
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/d2e4ce48
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/d2e4ce48

Branch: refs/heads/cassandra-3.11
Commit: d2e4ce48959bc56d9c366de20cd4c0f3c9bdf16b
Parents: c253ed4
Author: Paulo Motta 
Authored: Fri Dec 1 10:08:30 2017 +1100
Committer: Paulo Motta 
Committed: Fri Dec 1 10:18:29 2017 +1100

--
 src/java/org/apache/cassandra/db/lifecycle/Tracker.java | 8 
 1 file changed, 8 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/d2e4ce48/src/java/org/apache/cassandra/db/lifecycle/Tracker.java
--
diff --git a/src/java/org/apache/cassandra/db/lifecycle/Tracker.java b/src/java/org/apache/cassandra/db/lifecycle/Tracker.java
index 6136f79..47efbce 100644
--- a/src/java/org/apache/cassandra/db/lifecycle/Tracker.java
+++ b/src/java/org/apache/cassandra/db/lifecycle/Tracker.java
@@ -505,4 +505,12 @@ public class Tracker
     {
         return view.get();
     }
+
+    @VisibleForTesting
+    public void removeUnsafe(Set<SSTableReader> toRemove)
+    {
+        Pair<View, View> result = apply(view -> {
+            return updateLiveSet(toRemove, emptySet()).apply(view);
+        });
+    }
 }



[jira] [Assigned] (CASSANDRA-14076) dtest code style check failed

2017-11-30 Thread Jay Zhuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang reassigned CASSANDRA-14076:
--

Assignee: Jay Zhuang

> dtest code style check failed
> -
>
> Key: CASSANDRA-14076
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14076
> Project: Cassandra
>  Issue Type: Bug
>  Components: Testing
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>
> https://travis-ci.org/cooldoger/cassandra-dtest
> {noformat}
> $ flake8 --ignore=E501,F811,F812,F822,F823,F831,F841,N8,C9 
> --exclude=thrift_bindings,cassandra-thrift .
> ./consistency_test.py:547:17: E722 do not use bare except'
> ./consistency_test.py:976:49: E251 unexpected spaces around keyword / 
> parameter equals
> ./consistency_test.py:976:51: E251 unexpected spaces around keyword / 
> parameter equals
> ./consistency_test.py:981:63: E703 statement ends with a semicolon
> ./consistency_test.py:1037:49: E251 unexpected spaces around keyword / 
> parameter equals
> ./consistency_test.py:1037:51: E251 unexpected spaces around keyword / 
> parameter equals
> ./consistency_test.py:1054:46: E261 at least two spaces before inline comment
> ./consistency_test.py:1103:22: E251 unexpected spaces around keyword / 
> parameter equals
> ./consistency_test.py:1103:24: E251 unexpected spaces around keyword / 
> parameter equals
> ./consistency_test.py:1175:22: E251 unexpected spaces around keyword / 
> parameter equals
> ./consistency_test.py:1175:24: E251 unexpected spaces around keyword / 
> parameter equals
> ./counter_tests.py:59:24: E703 statement ends with a semicolon
> ./counter_tests.py:383:37: E261 at least two spaces before inline comment
> ./dtest.py:586:13: E722 do not use bare except'
> ./dtest.py:1130:1: E302 expected 2 blank lines, found 1
> ./nodetool_test.py:9:1: E302 expected 2 blank lines, found 1
> ./nodetool_test.py:78:1: W293 blank line contains whitespace
> ./nodetool_test.py:174:45: E261 at least two spaces before inline comment
> ./run_dtests.py:220:54: E221 multiple spaces before operator
> ./secondary_indexes_test.py:14:1: F401 'dtest.DtestTimeoutError' imported but 
> unused
> ./secondary_indexes_test.py:17:1: F401 'tools.data.index_is_built' imported 
> but unused
> ./secondary_indexes_test.py:21:1: E302 expected 2 blank lines, found 1
> ./sslnodetonode_test.py:15:1: E302 expected 2 blank lines, found 1
> ./sslnodetonode_test.py:191:1: W293 blank line contains whitespace
> ./sslnodetonode_test.py:191:1: W391 blank line at end of file
> ./system_keyspaces_test.py:6:1: E302 expected 2 blank lines, found 1
> ./system_keyspaces_test.py:28:59: E241 multiple spaces after ','
> ./system_keyspaces_test.py:50:62: E241 multiple spaces after ','
> ./write_failures_test.py:5:1: F401 'distutils.version.LooseVersion' imported 
> but unused
> ./plugins/dtestcollect.py:1:1: F401 'collections.namedtuple' imported but 
> unused
> ./plugins/dtestcollect.py:3:1: F401 'pprint.pprint' imported but unused
> ./plugins/dtestcollect.py:5:1: F401 'inspect' imported but unused
> ./plugins/dtestcollect.py:13:1: E302 expected 2 blank lines, found 1
> ./plugins/dtestcollect.py:44:9: E306 expected 1 blank line before a nested 
> definition, found 0
> ./plugins/dtestcollect.py:62:22: E703 statement ends with a semicolon
> ./plugins/dtestcollect.py:64:1: E302 expected 2 blank lines, found 1
> ./plugins/dtesttag.py:1:1: F401 'collections.namedtuple' imported but unused
> ./plugins/dtesttag.py:4:1: F401 'pprint.pprint' imported but unused
> ./plugins/dtesttag.py:8:1: E302 expected 2 blank lines, found 1
> ./plugins/dtesttag.py:20:1: W293 blank line contains whitespace
> ./plugins/dtesttag.py:25:1: W293 blank line contains whitespace
> ./plugins/dtestxunit.py:43:1: F401 'doctest' imported but unused
> ./plugins/dtestxunit.py:46:1: F401 'traceback' imported but unused
> ./plugins/dtestxunit.py:62:1: E302 expected 2 blank lines, found 1
> ./plugins/dtestxunit.py:66:1: E302 expected 2 blank lines, found 1
> ./plugins/dtestxunit.py:70:1: E302 expected 2 blank lines, found 1
> ./plugins/dtestxunit.py:76:29: E226 missing whitespace around arithmetic 
> operator
> ./plugins/dtestxunit.py:84:1: E302 expected 2 blank lines, found 1
> ./plugins/dtestxunit.py:107:1: E302 expected 2 blank lines, found 1
> ./plugins/dtestxunit.py:126:1: E302 expected 2 blank lines, found 1
> ./plugins/dtestxunit.py:219:32: W503 line break before binary operator
> ./plugins/dtestxunit.py:269:25: E126 continuation line over-indented for 
> hanging indent
> ./plugins/dtestxunit.py:277:25: E126 continuation line over-indented for 
> hanging indent
> ./repair_tests/deprecated_repair_test.py:159:9: E741 ambiguous variable name 
> 'l'
> ./repair_tests/incremental_repair_test.py:772:4: W291 trailing whitespace
> ./repair_tests/incremental_repair_test.py:773:76: W291 trailing whitespace
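
For readers unfamiliar with the codes in the listing above, the sketch below shows what clean code looks like for a few of them (E251, E261, E703, E722). It is illustrative only: the helpers `read_value` and `parse_port` are hypothetical and are not taken from cassandra-dtest.

```python
# Illustrative only: hypothetical helpers, not taken from cassandra-dtest.


def read_value(key, default=None):  # E251: no spaces around '=' in keyword defaults
    """Look up a key; note the two spaces before each inline comment (E261)."""
    values = {'a': 1}
    return values.get(key, default)  # E703: no trailing semicolon


def parse_port(text):
    """Parse a port number, returning None when the text is not an integer."""
    try:
        return int(text)
    except ValueError:  # E722: name the exception instead of a bare 'except:'
        return None
```

Two blank lines separate the top-level definitions, which is what E302 asks for.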

[jira] [Updated] (CASSANDRA-14075) Many sslnodetonode_test.TestNodeToNodeSSLEncryption tests failing with "Please remove properties [optional, enabled] from your cassandra.yaml"

2017-11-30 Thread Michael Kjellman (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Kjellman updated CASSANDRA-14075:
-
Status: Ready to Commit  (was: Patch Available)

> Many sslnodetonode_test.TestNodeToNodeSSLEncryption tests failing with 
> "Please remove properties [optional, enabled] from your cassandra.yaml"
> --
>
> Key: CASSANDRA-14075
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14075
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Michael Kjellman
>Assignee: Jason Brown
>
> Many sslnodetonode_test.TestNodeToNodeSSLEncryption dtests are failing on 
> 3.11 with an exception on startup due to invalid yaml properties.
> Unexpected error in node1 log, error: 
> ERROR [main] 2017-11-18 21:01:54,781 CassandraDaemon.java:706 - Exception 
> encountered during startup: Invalid yaml. Please remove properties [optional, 
> enabled] from your cassandra.yaml 
> Although ccm was updated in 
> https://github.com/pcmanus/ccm/commit/eaaa425b70edb84786924516aee3920d685c0e53
>  to include a version check for >= 4.0, enabled and optional are emitted 
> unconditionally in the actual dtest itself -- they should also be conditional 
> on >= 4.0
> {code:java}
> node.set_configuration_options(values={
>     'server_encryption_options': {
>         'enabled': encryption_enabled,
>         'optional': encryption_optional,
>         'internode_encryption': internode_encryption,
>         'keystore': kspath,
>         'keystore_password': 'cassandra',
>         'truststore': tspath,
>         'truststore_password': 'cassandra',
>         'require_endpoint_verification': endpoint_verification,
>         'require_client_auth': client_auth,
>     }
> })
> {code}
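
A minimal sketch of the conditional described above, assuming the cluster version is available as a string such as `3.11.1` or `4.0-SNAPSHOT`. The helpers `parse_version` and `build_server_encryption_options` are hypothetical names introduced for illustration; the actual dtest may obtain and compare the version differently.

```python
def parse_version(version):
    """Turn '3.11.1' or '4.0-SNAPSHOT' into a comparable tuple of ints."""
    numeric = version.split('-')[0]
    return tuple(int(part) for part in numeric.split('.'))


def build_server_encryption_options(cluster_version, encryption_enabled,
                                    encryption_optional, **common):
    """Build the options dict, adding 'enabled'/'optional' only for >= 4.0."""
    options = dict(common)
    if parse_version(cluster_version) >= (4, 0):
        # Older versions reject these keys as invalid yaml properties.
        options['enabled'] = encryption_enabled
        options['optional'] = encryption_optional
    return options
```

The resulting dict would then be passed as the value of `'server_encryption_options'` in the `node.set_configuration_options(values={...})` call shown in the issue.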




[jira] [Updated] (CASSANDRA-14075) Many sslnodetonode_test.TestNodeToNodeSSLEncryption tests failing with "Please remove properties [optional, enabled] from your cassandra.yaml"

2017-11-30 Thread Michael Kjellman (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Kjellman updated CASSANDRA-14075:
-
Status: Patch Available  (was: Open)

> Many sslnodetonode_test.TestNodeToNodeSSLEncryption tests failing with 
> "Please remove properties [optional, enabled] from your cassandra.yaml"
> --
>
> Key: CASSANDRA-14075
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14075
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Michael Kjellman
>Assignee: Jason Brown
>
> Many sslnodetonode_test.TestNodeToNodeSSLEncryption dtests are failing on 
> 3.11 with an exception on startup due to invalid yaml properties.
> Unexpected error in node1 log, error: 
> ERROR [main] 2017-11-18 21:01:54,781 CassandraDaemon.java:706 - Exception 
> encountered during startup: Invalid yaml. Please remove properties [optional, 
> enabled] from your cassandra.yaml 
> Although ccm was updated in 
> https://github.com/pcmanus/ccm/commit/eaaa425b70edb84786924516aee3920d685c0e53
>  to include a version check for >= 4.0, enabled and optional are emitted 
> unconditionally in the actual dtest itself -- they should also be conditional 
> on >= 4.0
> {code:java}
> node.set_configuration_options(values={
>     'server_encryption_options': {
>         'enabled': encryption_enabled,
>         'optional': encryption_optional,
>         'internode_encryption': internode_encryption,
>         'keystore': kspath,
>         'keystore_password': 'cassandra',
>         'truststore': tspath,
>         'truststore_password': 'cassandra',
>         'require_endpoint_verification': endpoint_verification,
>         'require_client_auth': client_auth,
>     }
> })
> {code}




[jira] [Updated] (CASSANDRA-14075) Many sslnodetonode_test.TestNodeToNodeSSLEncryption tests failing with "Please remove properties [optional, enabled] from your cassandra.yaml"

2017-11-30 Thread Michael Kjellman (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Kjellman updated CASSANDRA-14075:
-
Reviewer: Michael Kjellman

> Many sslnodetonode_test.TestNodeToNodeSSLEncryption tests failing with 
> "Please remove properties [optional, enabled] from your cassandra.yaml"
> --
>
> Key: CASSANDRA-14075
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14075
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Michael Kjellman
>Assignee: Jason Brown
>
> Many sslnodetonode_test.TestNodeToNodeSSLEncryption dtests are failing on 
> 3.11 with an exception on startup due to invalid yaml properties.
> Unexpected error in node1 log, error: 
> ERROR [main] 2017-11-18 21:01:54,781 CassandraDaemon.java:706 - Exception 
> encountered during startup: Invalid yaml. Please remove properties [optional, 
> enabled] from your cassandra.yaml 
> Although ccm was updated in 
> https://github.com/pcmanus/ccm/commit/eaaa425b70edb84786924516aee3920d685c0e53
>  to include a version check for >= 4.0, enabled and optional are emitted 
> unconditionally in the actual dtest itself -- they should also be conditional 
> on >= 4.0
> {code:java}
> node.set_configuration_options(values={
>     'server_encryption_options': {
>         'enabled': encryption_enabled,
>         'optional': encryption_optional,
>         'internode_encryption': internode_encryption,
>         'keystore': kspath,
>         'keystore_password': 'cassandra',
>         'truststore': tspath,
>         'truststore_password': 'cassandra',
>         'require_endpoint_verification': endpoint_verification,
>         'require_client_auth': client_auth,
>     }
> })
> {code}




[jira] [Commented] (CASSANDRA-14075) Many sslnodetonode_test.TestNodeToNodeSSLEncryption tests failing with "Please remove properties [optional, enabled] from your cassandra.yaml"

2017-11-30 Thread Michael Kjellman (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273615#comment-16273615
 ] 

Michael Kjellman commented on CASSANDRA-14075:
--

looks good! +1

> Many sslnodetonode_test.TestNodeToNodeSSLEncryption tests failing with 
> "Please remove properties [optional, enabled] from your cassandra.yaml"
> --
>
> Key: CASSANDRA-14075
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14075
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Michael Kjellman
>Assignee: Jason Brown
>
> Many sslnodetonode_test.TestNodeToNodeSSLEncryption dtests are failing on 
> 3.11 with an exception on startup due to invalid yaml properties.
> Unexpected error in node1 log, error: 
> ERROR [main] 2017-11-18 21:01:54,781 CassandraDaemon.java:706 - Exception 
> encountered during startup: Invalid yaml. Please remove properties [optional, 
> enabled] from your cassandra.yaml 
> Although ccm was updated in 
> https://github.com/pcmanus/ccm/commit/eaaa425b70edb84786924516aee3920d685c0e53
>  to include a version check for >= 4.0, enabled and optional are emitted 
> unconditionally in the actual dtest itself -- they should also be conditional 
> on >= 4.0
> {code:java}
> node.set_configuration_options(values={
>     'server_encryption_options': {
>         'enabled': encryption_enabled,
>         'optional': encryption_optional,
>         'internode_encryption': internode_encryption,
>         'keystore': kspath,
>         'keystore_password': 'cassandra',
>         'truststore': tspath,
>         'truststore_password': 'cassandra',
>         'require_endpoint_verification': endpoint_verification,
>         'require_client_auth': client_auth,
>     }
> })
> {code}




[jira] [Commented] (CASSANDRA-14079) Prevent compaction strategies from looping indefinitely

2017-11-30 Thread Paulo Motta (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273611#comment-16273611
 ] 

Paulo Motta commented on CASSANDRA-14079:
-

oops, lost during break up of CASSANDRA-13948, sorry about that, will ninja a 
fix soon! thanks for the heads up!

> Prevent compaction strategies from looping indefinitely
> ---
>
> Key: CASSANDRA-14079
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14079
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Paulo Motta
>Assignee: Paulo Motta
>Priority: Minor
> Fix For: 3.11.2, 4.0
>
>
> As a result of CASSANDRA-13948, LCS was looping indefinitely trying to 
> generate the same candidates for SSTables which were not on the tracker.
> We should add a protection on compaction strategies against looping 
> indefinitely to avoid similar bugs in the future.
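
One way to picture such a protection is a bounded retry loop that gives up with an error instead of regenerating the same candidates forever. The following is a generic Python sketch, not Cassandra's Java implementation; `next_compaction_task`, `get_candidates`, `is_tracked`, and `max_attempts` are hypothetical names chosen for illustration.

```python
def next_compaction_task(get_candidates, is_tracked, max_attempts=100):
    """Return the first candidate set whose SSTables are all tracked.

    Raises after max_attempts instead of spinning forever when the
    strategy keeps proposing SSTables that are not on the tracker.
    """
    for _ in range(max_attempts):
        candidates = get_candidates()
        if not candidates:
            return None  # nothing to compact
        if all(is_tracked(sstable) for sstable in candidates):
            return candidates
    raise RuntimeError(
        'gave up after %d attempts: candidates are not on the tracker'
        % max_attempts)
```

Failing loudly after a fixed number of attempts turns a silent busy loop into a visible error, which is the kind of safeguard the description asks for.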




[jira] [Commented] (CASSANDRA-14079) Prevent compaction strategies from looping indefinitely

2017-11-30 Thread Joel Knighton (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273603#comment-16273603
 ] 

Joel Knighton commented on CASSANDRA-14079:
---

It looks like this broke the build on 3.11/trunk. On trunk only, there's a 
place in {{AbstractCompactionStrategyTest}} where we pass a 
{{TableMetadataRef}} instead of a {{TableMetadata}}. On 3.11/trunk, it looks 
like there's a missing {{removeUnsafe}} test method on {{Tracker}} that 
{{AbstractCompactionStrategyTest}} uses. It looks like that's missing on all 
branches, so maybe it just got left out of the commit. 

[~pauloricardomg] ^

> Prevent compaction strategies from looping indefinitely
> ---
>
> Key: CASSANDRA-14079
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14079
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Paulo Motta
>Assignee: Paulo Motta
>Priority: Minor
> Fix For: 3.11.2, 4.0
>
>
> As a result of CASSANDRA-13948, LCS was looping indefinitely trying to 
> generate the same candidates for SSTables which were not on the tracker.
> We should add a protection on compaction strategies against looping 
> indefinitely to avoid similar bugs in the future.




[jira] [Updated] (CASSANDRA-14085) Excessive update of ReadLatency metric in digest calculation

2017-11-30 Thread Jeff Jirsa (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-14085:
---
Fix Version/s: (was: 3.0.16)
   (was: 4.0)
   4.x
   3.11.x
   3.0.x

> Excessive update of ReadLatency metric in digest calculation
> 
>
> Key: CASSANDRA-14085
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14085
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core, Metrics
>Reporter: Andrew Whang
>Assignee: Andrew Whang
>Priority: Minor
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> We noticed an increase in read latency after upgrading to 3.x, specifically 
> for requests with CL>ONE. It turns out the read latency metric is being 
> doubly updated for digest calculations. This code 
> (https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/partitions/UnfilteredPartitionIterators.java#L243)
>  makes an improper copy of an iterator that's wrapped by MetricRecording, 
> whose onClose() records the latency of the execution.
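
The double counting can be reproduced in miniature. The sketch below is Python rather than the Cassandra code; `RecordingIterator` and its `on_close` callback are hypothetical stand-ins for an iterator wrapped by a metric-recording transformation, under the assumption that the copy carries the same close hook as the original.

```python
class RecordingIterator:
    """Wraps an iterable and fires on_close once when closed."""

    def __init__(self, source, on_close):
        self._source = iter(source)
        self._on_close = on_close
        self._closed = False

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._source)

    def close(self):
        if not self._closed:
            self._closed = True
            self._on_close()


latency_updates = []


def record():
    """Stand-in for updating the read latency metric."""
    latency_updates.append('read')


original = RecordingIterator([1, 2, 3], record)
# Improper copy: a second wrapper carrying the same on_close hook.
copy = RecordingIterator(original, record)

rows = list(copy)  # the data is consumed only once
copy.close()
original.close()   # both wrappers fire, so one read is counted twice
```

After this runs, `latency_updates` holds two entries for a single read, which mirrors the inflated metric described in the report.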




[jira] [Updated] (CASSANDRA-14085) Excessive update of ReadLatency metric in digest calculation

2017-11-30 Thread Jeff Jirsa (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-14085:
---
Component/s: Core

> Excessive update of ReadLatency metric in digest calculation
> 
>
> Key: CASSANDRA-14085
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14085
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core, Metrics
>Reporter: Andrew Whang
>Assignee: Andrew Whang
>Priority: Minor
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> We noticed an increase in read latency after upgrading to 3.x, specifically 
> for requests with CL>ONE. It turns out the read latency metric is being 
> doubly updated for digest calculations. This code 
> (https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/partitions/UnfilteredPartitionIterators.java#L243)
>  makes an improper copy of an iterator that's wrapped by MetricRecording, 
> whose onClose() records the latency of the execution.




[jira] [Assigned] (CASSANDRA-14085) Excessive update of ReadLatency metric in digest calculation

2017-11-30 Thread Jeff Jirsa (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa reassigned CASSANDRA-14085:
--

Assignee: Andrew Whang

> Excessive update of ReadLatency metric in digest calculation
> 
>
> Key: CASSANDRA-14085
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14085
> Project: Cassandra
>  Issue Type: Bug
>  Components: Metrics
>Reporter: Andrew Whang
>Assignee: Andrew Whang
>Priority: Minor
> Fix For: 3.0.16, 4.0
>
>
> We noticed an increase in read latency after upgrading to 3.x, specifically 
> for requests with CL>ONE. It turns out the read latency metric is being 
> doubly updated for digest calculations. This code 
> (https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/partitions/UnfilteredPartitionIterators.java#L243)
>  makes an improper copy of an iterator that's wrapped by MetricRecording, 
> whose onClose() records the latency of the execution.




[jira] [Updated] (CASSANDRA-14085) Excessive update of ReadLatency metric in digest calculation

2017-11-30 Thread Andrew Whang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Whang updated CASSANDRA-14085:
-
Fix Version/s: 4.0
   3.0.16
   Status: Patch Available  (was: Open)

https://github.com/whangsf/cassandra/commit/2ae3589ce9eefd8699bbd4e29bf1c61a486d394e

> Excessive update of ReadLatency metric in digest calculation
> 
>
> Key: CASSANDRA-14085
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14085
> Project: Cassandra
>  Issue Type: Bug
>  Components: Metrics
>Reporter: Andrew Whang
>Priority: Minor
> Fix For: 3.0.16, 4.0
>
>
> We noticed an increase in read latency after upgrading to 3.x, specifically 
> for requests with CL>ONE. It turns out the read latency metric is being 
> doubly updated for digest calculations. This code 
> (https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/partitions/UnfilteredPartitionIterators.java#L243)
>  makes an improper copy of an iterator that's wrapped by MetricRecording, 
> whose onClose() records the latency of the execution.




[jira] [Created] (CASSANDRA-14085) Excessive update of ReadLatency metric in digest calculation

2017-11-30 Thread Andrew Whang (JIRA)

Andrew Whang created CASSANDRA-14085:


 Summary: Excessive update of ReadLatency metric in digest 
calculation
 Key: CASSANDRA-14085
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14085
 Project: Cassandra
  Issue Type: Bug
  Components: Metrics
Reporter: Andrew Whang
Priority: Minor


We noticed an increase in read latency after upgrading to 3.x, specifically for 
requests with CL>ONE. It turns out the read latency metric is being doubly 
updated for digest calculations. This code 
(https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/partitions/UnfilteredPartitionIterators.java#L243)
 makes an improper copy of an iterator that's wrapped by MetricRecording, whose 
onClose() records the latency of the execution.
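
One way this kind of double recording can arise is sketched below in Python. This is a simplified model, not the Cassandra code: `TransformedIterator`, `improper_copy` and `proper_wrap` are hypothetical names, and the "metric" is just a list that gets one entry per recorded latency. The assumption being illustrated is that a derived iterator which inherits the close hooks of the iterator it wraps will fire the latency hook a second time when both are closed.

```python
class TransformedIterator:
    """Hypothetical iterator wrapper that runs hooks when it is closed."""

    def __init__(self, rows, on_close_hooks=()):
        self._rows = iter(rows)
        self.on_close_hooks = list(on_close_hooks)

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._rows)

    def close(self):
        for hook in self.on_close_hooks:
            hook()


def improper_copy(source):
    # Carries the source's close hooks over to the new wrapper, so closing
    # both the wrapper and the source runs every hook twice.
    return TransformedIterator(source, source.on_close_hooks)


def proper_wrap(source):
    # Wraps the source without inheriting its hooks; the source's own
    # close() remains the only place the latency is recorded.
    return TransformedIterator(source)


def count_recordings(wrap):
    recordings = []  # stands in for the read latency metric
    original = TransformedIterator([1, 2, 3],
                                   [lambda: recordings.append("read latency")])
    derived = wrap(original)
    list(derived)      # consume the rows, e.g. to compute a digest
    derived.close()
    original.close()
    return len(recordings)
```

Under these assumptions, `count_recordings(improper_copy)` reports two recordings for a single read, while `count_recordings(proper_wrap)` reports one.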




[jira] [Updated] (CASSANDRA-3200) Repair: compare all trees together (for a given range/cf) instead of by pair in isolation

2017-11-30 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-3200:
---
Status: Ready to Commit  (was: Patch Available)

> Repair: compare all trees together (for a given range/cf) instead of by pair 
> in isolation
> -
>
> Key: CASSANDRA-3200
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3200
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Sylvain Lebresne
>Assignee: Marcus Eriksson
>Priority: Minor
>  Labels: repair
> Fix For: 4.x
>
>
> Currently, repair compares Merkle trees by pair, in isolation from any other
> tree. Concretely, if I have three nodes A, B and C
> (RF=3) with A and B in sync, but C having some range r inconsistent with both
> A and B (since those are consistent), we will do the following transfers of r:
> A -> C, C -> A, B -> C, C -> B.
> The fact that we do both A -> C and C -> A is fine, because we cannot know
> which of A or C is more up to date. However, the transfer B -> C is
> useless provided we do A -> C if A and B are in sync. Not doing that transfer
> will be a 25% improvement in that case. With RF=5 and only one node
> inconsistent with all the others, that is almost a 40% improvement, etc.
> Given that this situation of one node being out of sync while the others are
> in sync is probably fairly common (one node died so it is behind), this could
> be a fair improvement over what is transferred. In the case where we use
> repair to completely rebuild a node, this will be a dramatic improvement,
> because it will keep the rebuilt node from getting RF times the data it
> should get.
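
The 25% and "almost 40%" figures in the description can be checked with a short Python sketch. It assumes exactly one replica is out of sync for the range and the other RF-1 replicas agree with each other. The function names are illustrative only.

```python
from fractions import Fraction


def pairwise_transfers(rf):
    # The out-of-sync node differs from each of the rf - 1 in-sync nodes,
    # and every differing pair streams the range in both directions.
    return 2 * (rf - 1)


def combined_transfers(rf):
    # Comparing all trees together: the out-of-sync node receives the range
    # once from a single in-sync node and sends its copy to all rf - 1 others.
    return 1 + (rf - 1)


def saving(rf):
    before = pairwise_transfers(rf)
    return Fraction(before - combined_transfers(rf), before)
```

With RF=3 this gives 4 transfers before and 3 after, a saving of 1/4. With RF=5 it gives 8 before and 5 after, a saving of 3/8 (37.5%).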




[jira] [Commented] (CASSANDRA-3200) Repair: compare all trees together (for a given range/cf) instead of by pair in isolation

2017-11-30 Thread Blake Eggleston (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-3200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273511#comment-16273511
 ] 

Blake Eggleston commented on CASSANDRA-3200:


The last test run seems to have died. I restarted it 
[here|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/448/].

Assuming there aren't any related failures, I'm +1.

> Repair: compare all trees together (for a given range/cf) instead of by pair 
> in isolation
> -
>
> Key: CASSANDRA-3200
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3200
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Sylvain Lebresne
>Assignee: Marcus Eriksson
>Priority: Minor
>  Labels: repair
> Fix For: 4.x
>
>
> Currently, repair compares Merkle trees by pair, in isolation from any other
> tree. Concretely, if I have three nodes A, B and C
> (RF=3) with A and B in sync, but C having some range r inconsistent with both
> A and B (since those are consistent), we will do the following transfers of r:
> A -> C, C -> A, B -> C, C -> B.
> The fact that we do both A -> C and C -> A is fine, because we cannot know
> which of A or C is more up to date. However, the transfer B -> C is
> useless provided we do A -> C if A and B are in sync. Not doing that transfer
> will be a 25% improvement in that case. With RF=5 and only one node
> inconsistent with all the others, that is almost a 40% improvement, etc.
> Given that this situation of one node being out of sync while the others are
> in sync is probably fairly common (one node died so it is behind), this could
> be a fair improvement over what is transferred. In the case where we use
> repair to completely rebuild a node, this will be a dramatic improvement,
> because it will keep the rebuilt node from getting RF times the data it
> should get.




[jira] [Commented] (CASSANDRA-12971) Add CAS option to WRITE test to stress tool

2017-11-30 Thread Jeff Jirsa (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-12971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273502#comment-16273502
 ] 

Jeff Jirsa commented on CASSANDRA-12971:


[~vovodroid] / [~spo...@gmail.com] / [~jay.zhuang]  - should this be closed as 
a duplicate?

> Add CAS option to WRITE test to stress tool
> ---
>
> Key: CASSANDRA-12971
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12971
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Stress, Tools
>Reporter: Vladimir Yudovin
>Assignee: Vladimir Yudovin
> Attachments: stress-cass.patch
>
>
> If the -cas option is present, each UPDATE is performed with an always-true
> IF condition, so the data is inserted anyway.
> It's implemented; if it's needed, I'll proceed with the patch.




[jira] [Commented] (CASSANDRA-12922) Bloom filter miss counts are not measured correctly

2017-11-30 Thread Jeff Jirsa (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-12922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273498#comment-16273498
 ] 

Jeff Jirsa commented on CASSANDRA-12922:


[~krishnasun] are you still interested in writing the unit test?


> Bloom filter miss counts are not measured correctly
> ---
>
> Key: CASSANDRA-12922
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12922
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local Write-Read Paths
>Reporter: Branimir Lambov
>Assignee: Sundar Srinivasan
>  Labels: lhf
> Fix For: 4.x
>
> Attachments: 12922-trunk.txt
>
>
> Bloom filter hits and misses are evaluated incorrectly in 
> {{BigTableReader.getPosition}}: we properly record hits, but not misses. In 
> particular, if we don't find a match for a key in the index, which is where 
> almost all non-matches will be rejected, [we don't record a bloom filter 
> false 
> positive|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/io/sstable/format/big/BigTableReader.java#L228].
> This leads to very misleading output from e.g. {{nodetool tablestats}}.
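
The accounting the description asks for can be sketched as follows. This is a minimal Python model, not the `BigTableReader` code: `BloomFilterStats`, `get_position`, the `maybe_present` callable and the dict-based index are all hypothetical stand-ins. The point is only that a key the filter lets through but the index does not contain should be counted as a false positive.

```python
class BloomFilterStats:
    """Hypothetical counters for bloom filter outcomes."""

    def __init__(self):
        self.true_positives = 0
        self.false_positives = 0


def get_position(key, maybe_present, index, stats):
    """Look up a key, recording whether the bloom filter answer was right.

    maybe_present: callable standing in for the bloom filter check.
    index: dict standing in for the on-disk index (key -> position).
    """
    if not maybe_present(key):
        return None  # rejected by the filter, no index lookup needed

    position = index.get(key)
    if position is None:
        # The filter said "maybe", but the index has no match:
        # this is the case the ticket says goes unrecorded.
        stats.false_positives += 1
        return None

    stats.true_positives += 1
    return position
```

In this model a lookup that passes the filter always increments exactly one of the two counters, so the ratio reported to an operator reflects both outcomes.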




[jira] [Updated] (CASSANDRA-13901) Linux Script for stopping running cassandra and cqlsh

2017-11-30 Thread Jeff Jirsa (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-13901:
---
Status: Awaiting Feedback  (was: Open)

> Linux Script for stopping running cassandra and cqlsh
> -
>
> Key: CASSANDRA-13901
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13901
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Akash Sethi
>Assignee: Akash Sethi
>Priority: Minor
> Fix For: 3.11.0
>
> Attachments: 
> 0001-Added-Linux-script-for-stopping-cassandra-and-cqlsh.patch
>
>
> A script for stopping Cassandra and cqlsh if they are running on any Linux machine.




[jira] [Updated] (CASSANDRA-13901) Linux Script for stopping running cassandra and cqlsh

2017-11-30 Thread Jeff Jirsa (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-13901:
---
Status: Open  (was: Patch Available)

> Linux Script for stopping running cassandra and cqlsh
> -
>
> Key: CASSANDRA-13901
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13901
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Akash Sethi
>Assignee: Akash Sethi
>Priority: Minor
> Fix For: 3.11.0
>
> Attachments: 
> 0001-Added-Linux-script-for-stopping-cassandra-and-cqlsh.patch
>
>
> A script for stopping Cassandra and cqlsh if they are running on any Linux machine.




[jira] [Assigned] (CASSANDRA-13901) Linux Script for stopping running cassandra and cqlsh

2017-11-30 Thread Jeff Jirsa (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa reassigned CASSANDRA-13901:
--

Assignee: Akash Sethi

> Linux Script for stopping running cassandra and cqlsh
> -
>
> Key: CASSANDRA-13901
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13901
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Akash Sethi
>Assignee: Akash Sethi
>Priority: Minor
> Fix For: 3.11.0
>
> Attachments: 
> 0001-Added-Linux-script-for-stopping-cassandra-and-cqlsh.patch
>
>
> A script for stopping Cassandra and cqlsh if they are running on any Linux machine.




[jira] [Commented] (CASSANDRA-13901) Linux Script for stopping running cassandra and cqlsh

2017-11-30 Thread Jeff Jirsa (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273479#comment-16273479
 ] 

Jeff Jirsa commented on CASSANDRA-13901:


I have some concerns here.

1) Cassandra has mechanisms to stop itself (via nodetool), which do a nice 
clean shutdown, unlike {{kill -9}}, which can potentially lose data in some 
edge cases.

2) The command to fetch the PID {{ | grep apache-cassandra }} is unlikely to 
work reliably and safely. It'll probably not match in many environments, and 
it'll over-match in environments where multiple instances are running.

3) I'm not sure what problem this solves. Can you help explain why you need 
such utilities?
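
The concern in point 2 can be illustrated with a small Python sketch. The process lists below are made up for illustration (hypothetical PIDs, install paths and config file names), and `matching_pids` only mimics a plain substring match such as `grep apache-cassandra` over command lines.

```python
def matching_pids(processes, pattern="apache-cassandra"):
    """Return the PIDs whose command line contains the pattern."""
    return [pid for pid, cmdline in processes if pattern in cmdline]


# Hypothetical: a package install whose paths do not contain the pattern.
package_install = [
    (101, "java -cp /usr/share/cassandra/lib "
          "org.apache.cassandra.service.CassandraDaemon"),
]

# Hypothetical: two instances started from the same tarball directory.
two_instances = [
    (201, "java -Dcassandra.config=node1.yaml "
          "-cp /opt/apache-cassandra-3.11.0/lib "
          "org.apache.cassandra.service.CassandraDaemon"),
    (202, "java -Dcassandra.config=node2.yaml "
          "-cp /opt/apache-cassandra-3.11.0/lib "
          "org.apache.cassandra.service.CassandraDaemon"),
]
```

Here `matching_pids(package_install)` finds nothing to stop, while `matching_pids(two_instances)` returns both PIDs, so a script built on this match would kill every instance on the host.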

> Linux Script for stopping running cassandra and cqlsh
> -
>
> Key: CASSANDRA-13901
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13901
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Akash Sethi
>Priority: Minor
> Fix For: 3.11.0
>
> Attachments: 
> 0001-Added-Linux-script-for-stopping-cassandra-and-cqlsh.patch
>
>
> A script for stopping Cassandra and cqlsh if they are running on any Linux machine.




[jira] [Commented] (CASSANDRA-13968) Cannot replace a live node on large clusters

2017-11-30 Thread Jeff Jirsa (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273471#comment-16273471
 ] 

Jeff Jirsa commented on CASSANDRA-13968:


Marking Jason as reviewer since he was silly enough to suggest he may be 
willing to do it.


> Cannot replace a live node on large clusters
> 
>
> Key: CASSANDRA-13968
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13968
> Project: Cassandra
>  Issue Type: Bug
>  Components: Coordination
> Environment: Cassandra 2.1.17, Ubuntu Trusty/Xenial (Linux 3.13, 4.4)
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>  Labels: gossip
> Attachments: 
> 0001-During-node-replacement-check-for-updates-in-the-tim.patch, 
> 0002-Only-fail-replacement-if-we-_know_-the-node-is-up.patch
>
>
> During forced node replacements we very frequently (~every time for large 
> clusters) see:
> {noformat}
> ERROR [main] 2017-10-17 06:54:35,680  CassandraDaemon.java:583 - Exception 
> encountered during startup
> java.lang.UnsupportedOperationException: Cannot replace a live node...
> {noformat}
> The old node is dead, the new node that is replacing it thinks it is dead (DN 
> state), and all other nodes think it is dead (all have the DN state). 
> However, I believe there are two bugs in the "is live" check that can cause 
> this error, namely that:
> 1. We sleep for 
> [BROADCAST_INTERVAL|https://github.com/apache/cassandra/blob/943db2488c8b62e1fbe03b132102f0e579c9ae17/src/java/org/apache/cassandra/service/StorageService.java#L905]
>  (hardcoded 60s on 2.1, on later version configurable but still 60s by 
> default), but 
> [check|https://github.com/apache/cassandra/blob/943db2488c8b62e1fbe03b132102f0e579c9ae17/src/java/org/apache/cassandra/service/StorageService.java#L919]
>  for an update in the last RING_DELAY seconds (typically set to 30s). When a 
> fresh node is joining, in my experience, [the 
> schema|https://github.com/apache/cassandra/blob/943db2488c8b62e1fbe03b132102f0e579c9ae17/src/java/org/apache/cassandra/service/StorageService.java#L859]
>  check almost immediately returns true after gossiping with seeds, so in 
> reality we do not even sleep for RING_DELAY. If operators increase ring delay 
> past broadcast_interval (as you might do if you think you are victim to the 
> second bug below), then you guarantee that you will always get the exception 
> because the gossip update is basically guaranteed to happen in the last 
> RING_DELAY seconds since you didn't sleep for that duration (you slept for 
> broadcast). For example if an operator sets ring delay to 300s, then the 
> check says "oh yea, the last update was 59 seconds ago, which is sooner than 
> 300s, so fail".
> 2. We don't actually check that the node is alive, we just check that a 
> gossip update has happened in the last X seconds. Sometimes with large 
> clusters nodes are still converging on the proper generation/version of a 
> dead node, and the "is live" check prevents an operator from replacing the 
> node until gossip has settled on the cluster regarding the dead node, which 
> for large clusters can take a really long time. This can be really hurtful to 
> availability in cloud environments and every time I've seen this error it's 
> the case that the new node believes that the old node is down (since 
> [markAlive|https://github.com/apache/cassandra/blob/943db2488c8b62e1fbe03b132102f0e579c9ae17/src/java/org/apache/cassandra/gms/Gossiper.java#L954]
>  [marks 
> dead|https://github.com/apache/cassandra/blob/943db2488c8b62e1fbe03b132102f0e579c9ae17/src/java/org/apache/cassandra/gms/Gossiper.java#L962]
>  first and then triggers a callback to 
> [realMarkAlive|https://github.com/apache/cassandra/blob/943db2488c8b62e1fbe03b132102f0e579c9ae17/src/java/org/apache/cassandra/gms/Gossiper.java#L975]
>  which never triggers because the old node is actually down).
> I think that #1 is definitely a bug, #2 might be considered an "extra safety" 
> feature (that you don't allow replacement during gossip convergence), but 
> given that the operator took the effort to supply the replace_address flag, I 
> think it's prudent to only fail if we really know something is wrong.
> I've attached two patches against 2.1, one that fixes bug #1 and one that 
> fixes (imo) bug #2. I was thinking for #1 that we may want to prevent the 
> schema check from exiting the RING_DELAY sleep early but maybe it's just 
> better to backport configurable broadcast_interval and pick the maximum or 
> something. If we don't like the way I've worked around #2, maybe I could make 
> it an option that operators could turn on if they wanted? If folks are happy 
> with the approach I can attach patches for 2.2, 3.0, and 3.11.
> A relevant example of a log showing the first bug
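
The interaction described in bug #1 can be modelled in a few lines of Python. This is a simplification based only on the ticket text, not the `StorageService` code: `replacement_rejected` is a hypothetical name, and the check is reduced to "was there a gossip update within the last RING_DELAY seconds".

```python
BROADCAST_INTERVAL_S = 60  # sleep before the check, per the ticket


def replacement_rejected(seconds_since_last_update, ring_delay_s):
    """Model of the "is live" check: a recent gossip update counts as live."""
    return seconds_since_last_update < ring_delay_s


def always_rejected(ring_delay_s):
    # The node only sleeps for BROADCAST_INTERVAL_S, so an update seen right
    # before the sleep is at most about that old when the check runs.
    return all(replacement_rejected(age, ring_delay_s)
               for age in range(BROADCAST_INTERVAL_S + 1))
```

With a ring delay of 300s, an update seen 59 seconds ago is rejected, and `always_rejected(300)` is true. With the typical 30s ring delay, the same 59-second-old update passes the check.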

[jira] [Updated] (CASSANDRA-13968) Cannot replace a live node on large clusters

2017-11-30 Thread Jeff Jirsa (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-13968:
---
Reviewer: Jason Brown

> Cannot replace a live node on large clusters
> 
>
> Key: CASSANDRA-13968
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13968
> Project: Cassandra
>  Issue Type: Bug
>  Components: Coordination
> Environment: Cassandra 2.1.17, Ubuntu Trusty/Xenial (Linux 3.13, 4.4)
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>  Labels: gossip
> Attachments: 
> 0001-During-node-replacement-check-for-updates-in-the-tim.patch, 
> 0002-Only-fail-replacement-if-we-_know_-the-node-is-up.patch
>
>
> During forced node replacements we very frequently (~every time for large 
> clusters) see:
> {noformat}
> ERROR [main] 2017-10-17 06:54:35,680  CassandraDaemon.java:583 - Exception 
> encountered during startup
> java.lang.UnsupportedOperationException: Cannot replace a live node...
> {noformat}
> The old node is dead, the new node that is replacing it thinks it is dead (DN 
> state), and all other nodes think it is dead (all have the DN state). 
> However, I believe there are two bugs in the "is live" check that can cause 
> this error, namely that:
> 1. We sleep for 
> [BROADCAST_INTERVAL|https://github.com/apache/cassandra/blob/943db2488c8b62e1fbe03b132102f0e579c9ae17/src/java/org/apache/cassandra/service/StorageService.java#L905]
>  (hardcoded 60s on 2.1, on later version configurable but still 60s by 
> default), but 
> [check|https://github.com/apache/cassandra/blob/943db2488c8b62e1fbe03b132102f0e579c9ae17/src/java/org/apache/cassandra/service/StorageService.java#L919]
>  for an update in the last RING_DELAY seconds (typically set to 30s). When a 
> fresh node is joining, in my experience, [the 
> schema|https://github.com/apache/cassandra/blob/943db2488c8b62e1fbe03b132102f0e579c9ae17/src/java/org/apache/cassandra/service/StorageService.java#L859]
>  check almost immediately returns true after gossiping with seeds, so in 
> reality we do not even sleep for RING_DELAY. If operators increase ring delay 
> past broadcast_interval (as you might do if you think you are victim to the 
> second bug below), then you guarantee that you will always get the exception 
> because the gossip update is basically guaranteed to happen in the last 
> RING_DELAY seconds since you didn't sleep for that duration (you slept for 
> broadcast). For example if an operator sets ring delay to 300s, then the 
> check says "oh yea, the last update was 59 seconds ago, which is sooner than 
> 300s, so fail".
> 2. We don't actually check that the node is alive, we just check that a 
> gossip update has happened in the last X seconds. Sometimes with large 
> clusters nodes are still converging on the proper generation/version of a 
> dead node, and the "is live" check prevents an operator from replacing the 
> node until gossip has settled on the cluster regarding the dead node, which 
> for large clusters can take a really long time. This can be really hurtful to 
> availability in cloud environments and every time I've seen this error it's 
> the case that the new node believes that the old node is down (since 
> [markAlive|https://github.com/apache/cassandra/blob/943db2488c8b62e1fbe03b132102f0e579c9ae17/src/java/org/apache/cassandra/gms/Gossiper.java#L954]
>  [marks 
> dead|https://github.com/apache/cassandra/blob/943db2488c8b62e1fbe03b132102f0e579c9ae17/src/java/org/apache/cassandra/gms/Gossiper.java#L962]
>  first and then triggers a callback to 
> [realMarkAlive|https://github.com/apache/cassandra/blob/943db2488c8b62e1fbe03b132102f0e579c9ae17/src/java/org/apache/cassandra/gms/Gossiper.java#L975]
>  which never triggers because the old node is actually down).
> I think that #1 is definitely a bug, #2 might be considered an "extra safety" 
> feature (that you don't allow replacement during gossip convergence), but 
> given that the operator took the effort to supply the replace_address flag, I 
> think it's prudent to only fail if we really know something is wrong.
> I've attached two patches against 2.1, one that fixes bug #1 and one that 
> fixes (imo) bug #2. I was thinking for #1 that we may want to prevent the 
> schema check from exiting the RING_DELAY sleep early but maybe it's just 
> better to backport configurable broadcast_interval and pick the maximum or 
> something. If we don't like the way I've worked around #2, maybe I could make 
> it an option that operators could turn on if they wanted? If folks are happy 
> with the approach I can attach patches for 2.2, 3.0, and 3.11.
> A relevant example of a log showing the first bug (in this case the node that 
> was being replaced was drained moving it to shutdown before replacement, and 
> ring delay was

[jira] [Updated] (CASSANDRA-13968) Cannot replace a live node on large clusters

2017-11-30 Thread Jeff Jirsa (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-13968:
---
Component/s: Coordination

> Cannot replace a live node on large clusters
> 
>
> Key: CASSANDRA-13968
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13968
> Project: Cassandra
>  Issue Type: Bug
>  Components: Coordination
> Environment: Cassandra 2.1.17, Ubuntu Trusty/Xenial (Linux 3.13, 4.4)
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>  Labels: gossip
> Attachments: 
> 0001-During-node-replacement-check-for-updates-in-the-tim.patch, 
> 0002-Only-fail-replacement-if-we-_know_-the-node-is-up.patch
>
>
> During forced node replacements we very frequently (~every time for large 
> clusters) see:
> {noformat}
> ERROR [main] 2017-10-17 06:54:35,680  CassandraDaemon.java:583 - Exception 
> encountered during startup
> java.lang.UnsupportedOperationException: Cannot replace a live node...
> {noformat}
> The old node is dead, the new node that is replacing it thinks it is dead (DN 
> state), and all other nodes think it is dead (all have the DN state). 
> However, I believe there are two bugs in the "is live" check that can cause 
> this error, namely that:
> 1. We sleep for 
> [BROADCAST_INTERVAL|https://github.com/apache/cassandra/blob/943db2488c8b62e1fbe03b132102f0e579c9ae17/src/java/org/apache/cassandra/service/StorageService.java#L905]
>  (hardcoded 60s on 2.1, on later version configurable but still 60s by 
> default), but 
> [check|https://github.com/apache/cassandra/blob/943db2488c8b62e1fbe03b132102f0e579c9ae17/src/java/org/apache/cassandra/service/StorageService.java#L919]
>  for an update in the last RING_DELAY seconds (typically set to 30s). When a 
> fresh node is joining, in my experience, [the 
> schema|https://github.com/apache/cassandra/blob/943db2488c8b62e1fbe03b132102f0e579c9ae17/src/java/org/apache/cassandra/service/StorageService.java#L859]
>  check almost immediately returns true after gossiping with seeds, so in 
> reality we do not even sleep for RING_DELAY. If operators increase ring delay 
> past broadcast_interval (as you might do if you think you are victim to the 
> second bug below), then you guarantee that you will always get the exception 
> because the gossip update is basically guaranteed to happen in the last 
> RING_DELAY seconds since you didn't sleep for that duration (you slept for 
> broadcast). For example if an operator sets ring delay to 300s, then the 
> check says "oh yea, the last update was 59 seconds ago, which is sooner than 
> 300s, so fail".
> 2. We don't actually check that the node is alive, we just check that a 
> gossip update has happened in the last X seconds. Sometimes with large 
> clusters nodes are still converging on the proper generation/version of a 
> dead node, and the "is live" check prevents an operator from replacing the 
> node until gossip has settled on the cluster regarding the dead node, which 
> for large clusters can take a really long time. This can be really hurtful to 
> availability in cloud environments and every time I've seen this error it's 
> the case that the new node believes that the old node is down (since 
> [markAlive|https://github.com/apache/cassandra/blob/943db2488c8b62e1fbe03b132102f0e579c9ae17/src/java/org/apache/cassandra/gms/Gossiper.java#L954]
>  [marks 
> dead|https://github.com/apache/cassandra/blob/943db2488c8b62e1fbe03b132102f0e579c9ae17/src/java/org/apache/cassandra/gms/Gossiper.java#L962]
>  first and then triggers a callback to 
> [realMarkAlive|https://github.com/apache/cassandra/blob/943db2488c8b62e1fbe03b132102f0e579c9ae17/src/java/org/apache/cassandra/gms/Gossiper.java#L975]
>  which never triggers because the old node is actually down).
> I think that #1 is definitely a bug, #2 might be considered an "extra safety" 
> feature (that you don't allow replacement during gossip convergence), but 
> given that the operator took the effort to supply the replace_address flag, I 
> think it's prudent to only fail if we really know something is wrong.
> I've attached two patches against 2.1, one that fixes bug #1 and one that 
> fixes (imo) bug #2. I was thinking for #1 that we may want to prevent the 
> schema check from exiting the RING_DELAY sleep early but maybe it's just 
> better to backport configurable broadcast_interval and pick the maximum or 
> something. If we don't like the way I've worked around #2, maybe I could make 
> it an option that operators could turn on if they wanted? If folks are happy 
> with the approach I can attach patches for 2.2, 3.0, and 3.11.
> A relevant example of a log showing the first bug (in this case the node that 
> was being replaced was drained moving it to shutdown before replacement, and 
> ring delay

[jira] [Updated] (CASSANDRA-13974) Bad prefix matching when figuring out data directory for an sstable

2017-11-30 Thread Jeff Jirsa (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-13974:
---
Reviewer: Jeff Jirsa

> Bad prefix matching when figuring out data directory for an sstable
> ---
>
> Key: CASSANDRA-13974
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13974
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
> Fix For: 3.11.x, 4.x
>
>
> We do a "startsWith" check when getting the data directory for an sstable; we 
> should match including File.separator.
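
The prefix problem can be shown with a short Python sketch, using `str.startswith` and `os.sep` as stand-ins for Java's `startsWith` and `File.separator`. The directory names and helper functions are hypothetical, not the Cassandra code.

```python
import os


def in_directory_naive(sstable_path, data_dir):
    # Plain prefix match: "data1" is also a prefix of "data10".
    return sstable_path.startswith(data_dir)


def in_directory(sstable_path, data_dir):
    # Match the directory name including the trailing separator.
    prefix = data_dir if data_dir.endswith(os.sep) else data_dir + os.sep
    return sstable_path.startswith(prefix)


# Hypothetical layout with two data directories that share a prefix.
DATA1 = os.path.join("cassandra", "data1")
DATA10 = os.path.join("cassandra", "data10")
SSTABLE = os.path.join(DATA10, "ks", "table", "mc-1-big-Data.db")
```

The naive check claims `SSTABLE` lives under `DATA1`, whereas the separator-aware check attributes it only to `DATA10`.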




[jira] [Commented] (CASSANDRA-13974) Bad prefix matching when figuring out data directory for an sstable

2017-11-30 Thread Jeff Jirsa (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273465#comment-16273465
 ] 

Jeff Jirsa commented on CASSANDRA-13974:


I'll take review on this, but it'll be a bit. If someone beats me to it, I 
won't mind ([~stefania_alborghetti] or [~bdeggleston] or [~pauloricardomg])


> Bad prefix matching when figuring out data directory for an sstable
> ---
>
> Key: CASSANDRA-13974
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13974
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
> Fix For: 3.11.x, 4.x
>
>
> We do a "startsWith" check when getting the data directory for an sstable; we 
> should match including File.separator.




[jira] [Commented] (CASSANDRA-13851) Allow existing nodes to use all peers in shadow round

2017-11-30 Thread Jeff Jirsa (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273459#comment-16273459
 ] 

Jeff Jirsa commented on CASSANDRA-13851:


Who wants to review a gossip patch? [~jasobrown] or [~jkni], you two have 
touched it most recently?


> Allow existing nodes to use all peers in shadow round
> -
>
> Key: CASSANDRA-13851
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13851
> Project: Cassandra
>  Issue Type: Bug
>  Components: Lifecycle
>Reporter: Kurt Greaves
>Assignee: Kurt Greaves
> Fix For: 3.11.x, 4.x
>
>
> In CASSANDRA-10134 we made collision checks necessary on every startup. A 
> side-effect was introduced that then requires a node's seeds to be contacted 
> on every startup. Prior to this change an existing node could start up 
> regardless of whether it could contact a seed node or not (because 
> checkForEndpointCollision() was only called for bootstrapping nodes). 
> Now if a node's seeds are removed/deleted/fail it will no longer be able to 
> start up until live seeds are configured (or it is itself made a seed), even 
> though it already knows about the rest of the ring. This is inconvenient for 
> operators and has the potential to cause some nasty surprises and increase 
> downtime.
> One solution would be to use all of a node's existing peers as seeds in the 
> shadow round. Not a Gossip guru though so not sure of implications.




[jira] [Commented] (CASSANDRA-14065) Docs: Fix page width exceeding the viewport

2017-11-30 Thread Jeff Jirsa (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273454#comment-16273454
 ] 

Jeff Jirsa commented on CASSANDRA-14065:


I genuinely have no idea how to review this. It looks reasonable, but it's been 
a very long time since I tried to do cross-browser/cross-device CSS validation.

If it's not referenced, maybe there's no harm anyway?

> Docs: Fix page width exceeding the viewport
> ---
>
> Key: CASSANDRA-14065
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14065
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Documentation and Website
>Reporter: Stefan Podkowinski
> Fix For: 4.x
>
> Attachments: 14065-trunk.patch
>
>
> Ticket for [#175|https://github.com/apache/cassandra/pull/175] / 
> [#176|https://github.com/apache/cassandra/pull/176].
> The layout seems to adapt more naturally after applying the patch, with less 
> overlapping content. Seems to fix a real issue with our template.
> However, I'm not really sure about the extra.css changes, as the compiled 
> website (built via jekyll) doesn't seem to reference the css file anywhere.




[jira] [Commented] (CASSANDRA-14060) Separate CorruptSSTableException and FSError handling policies

2017-11-30 Thread Jeff Jirsa (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273442#comment-16273442
 ] 

Jeff Jirsa commented on CASSANDRA-14060:


I'll take review on this, but it'll be a few days. Feel free to replace me if 
another committer will get to it faster than me.


> Separate CorruptSSTableException and FSError handling policies
> --
>
> Key: CASSANDRA-14060
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14060
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>Priority: Minor
>
> Currently, if 
> [{{disk_failure_policy}}|https://github.com/apache/cassandra/blob/trunk/conf/cassandra.yaml#L230]
>  is set to {{stop}} (default), StorageService will shut down on {{FSError}}, 
> but not on {{CorruptSSTableException}} 
> [DefaultFSErrorHandler.java:40|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/DefaultFSErrorHandler.java#L40].
> But with policy {{die}} it behaves differently: the JVM will be killed 
> for both {{FSError}} and {{CorruptSSTableException}} 
> [JVMStabilityInspector.java:63|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/utils/JVMStabilityInspector.java#L63]:
> ||{{disk_failure_policy}}|| hit {{FSError}} Exception || hit {{CorruptSSTableException}} ||
> |{{stop}}| (/) stop | (x) not stop |
> |{{die}}| (/) die | (/) die |
> We saw {{CorruptSSTableException}} from time to time in production, but 
> mostly it's *not* because of a disk issue. So I would suggest having a 
> separate policy for CorruptSSTable.
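
To make the table above concrete, here is a small Python sketch (not Cassandra code) of the behaviour as the ticket describes it, plus one possible shape of the proposed split. The setting name `corrupted_sstable_policy`, the `Failure` enum and the action label "continue" are assumptions made for illustration; only the `stop` and `die` rows from the ticket are modelled.

```python
from enum import Enum

class Failure(Enum):
    FS_ERROR = "FSError"
    CORRUPT_SSTABLE = "CorruptSSTableException"

# Behaviour as described in the ticket's table; "continue" stands for "not stop".
CURRENT_BEHAVIOUR = {
    ("stop", Failure.FS_ERROR): "stop",
    ("stop", Failure.CORRUPT_SSTABLE): "continue",
    ("die", Failure.FS_ERROR): "die",
    ("die", Failure.CORRUPT_SSTABLE): "die",
}

def action_today(disk_failure_policy, failure):
    """Look up what happens now for a given policy and failure type."""
    return CURRENT_BEHAVIOUR[(disk_failure_policy, failure)]

def action_with_separate_policy(disk_failure_policy, corrupted_sstable_policy, failure):
    """Proposed split: corruption consults its own (hypothetical) setting."""
    if failure is Failure.CORRUPT_SSTABLE:
        return corrupted_sstable_policy
    return disk_failure_policy
```

With a separate setting, an operator could keep `stop` for real disk errors while choosing independently how to react to a corrupt sstable.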




[jira] [Updated] (CASSANDRA-14060) Separate CorruptSSTableException and FSError handling policies

2017-11-30 Thread Jeff Jirsa (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-14060:
---
Reviewer: Jeff Jirsa

> Separate CorruptSSTableException and FSError handling policies
> --
>
> Key: CASSANDRA-14060
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14060
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>Priority: Minor
>
> Currently, if 
> [{{disk_failure_policy}}|https://github.com/apache/cassandra/blob/trunk/conf/cassandra.yaml#L230]
>  is set to {{stop}} (default), StorageService will shut down on {{FSError}}, 
> but not on {{CorruptSSTableException}} 
> [DefaultFSErrorHandler.java:40|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/DefaultFSErrorHandler.java#L40].
> But with policy {{die}} it behaves differently: the JVM will be killed 
> for both {{FSError}} and {{CorruptSSTableException}} 
> [JVMStabilityInspector.java:63|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/utils/JVMStabilityInspector.java#L63]:
> ||{{disk_failure_policy}}|| hit {{FSError}} Exception || hit {{CorruptSSTableException}} ||
> |{{stop}}| (/) stop | (x) not stop |
> |{{die}}| (/) die | (/) die |
> We saw {{CorruptSSTableException}} from time to time in production, but 
> mostly it's *not* because of a disk issue. So I would suggest having a 
> separate policy for CorruptSSTable.




[jira] [Assigned] (CASSANDRA-14055) Index redistribution breaks SASI index

2017-11-30 Thread Jeff Jirsa (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa reassigned CASSANDRA-14055:
--

Assignee: Ludovic Boutros

> Index redistribution breaks SASI index
> --
>
> Key: CASSANDRA-14055
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14055
> Project: Cassandra
>  Issue Type: Bug
>  Components: sasi
>Reporter: Ludovic Boutros
>Assignee: Ludovic Boutros
>  Labels: patch
> Fix For: 3.11.x
>
> Attachments: CASSANDRA-14055.patch, CASSANDRA-14055.patch, 
> CASSANDRA-14055.patch
>
>
> During the index redistribution process, a new view is created.
> During this creation, old indexes should be released.
> But new indexes are "attached" to the same SSTable as the old indexes.
> This leads to the deletion of the last SASI index file and breaks the index.
> The issue is in this function: 
> [https://github.com/apache/cassandra/blob/9ee44db49b13d4b4c91c9d6332ce06a6e2abf944/src/java/org/apache/cassandra/index/sasi/conf/view/View.java#L62]




[jira] [Commented] (CASSANDRA-14055) Index redistribution breaks SASI index

2017-11-30 Thread Jeff Jirsa (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273438#comment-16273438
 ] 

Jeff Jirsa commented on CASSANDRA-14055:


[~ifesdjeen] are you still reviewing SASI patches or do we need to find someone 
else?

> Index redistribution breaks SASI index
> --
>
> Key: CASSANDRA-14055
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14055
> Project: Cassandra
>  Issue Type: Bug
>  Components: sasi
>Reporter: Ludovic Boutros
>Assignee: Ludovic Boutros
>  Labels: patch
> Fix For: 3.11.x
>
> Attachments: CASSANDRA-14055.patch, CASSANDRA-14055.patch, 
> CASSANDRA-14055.patch
>
>
> During the index redistribution process, a new view is created.
> During this creation, old indexes should be released.
> But new indexes are "attached" to the same SSTable as the old indexes.
> This leads to the deletion of the last SASI index file and breaks the index.
> The issue is in this function: 
> [https://github.com/apache/cassandra/blob/9ee44db49b13d4b4c91c9d6332ce06a6e2abf944/src/java/org/apache/cassandra/index/sasi/conf/view/View.java#L62]




[jira] [Commented] (CASSANDRA-14059) Root logging formatter broken in dtests

2017-11-30 Thread Jeff Jirsa (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273437#comment-16273437
 ] 

Jeff Jirsa commented on CASSANDRA-14059:


[~spo...@gmail.com] can I mark you as reviewer here as well? 

> Root logging formatter broken in dtests
> ---
>
> Key: CASSANDRA-14059
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14059
> Project: Cassandra
>  Issue Type: Bug
>  Components: Testing
>Reporter: Joel Knighton
>Assignee: Joel Knighton
>Priority: Minor
>
> Since the ccm dependency in dtest was bumped to {{3.1.0}} in 
> {{7cc06a086f89ed76499837558ff263d84337acba}}, when dtests are run with 
> --nologcapture, errors of the following form are printed:
> {code}
> Traceback (most recent call last):
>   File "/usr/lib64/python2.7/logging/__init__.py", line 861, in emit
>     msg = self.format(record)
>   File "/usr/lib64/python2.7/logging/__init__.py", line 734, in format
>     return fmt.format(record)
>   File "/usr/lib64/python2.7/logging/__init__.py", line 469, in format
>     s = self._fmt % record.__dict__
> KeyError: 'current_test'
> Logged from file dtest.py, line 485
> {code}
> This is because CCM no longer installs a basic root logger configuration, 
> which is probably a more correct behavior than what it did prior to this 
> change. Now, dtest installs its own basic root logger configuration which 
> writes to 'dtest.log' using the formatter {{'%(asctime)s,%(msecs)d %(name)s 
> %(current_test)s %(levelname)s %(message)s'}}. This means that anything 
> logging a message must provide the current_test key in its extras map. The 
> dtest {{debug}} and {{warning}} functions do this, but logging from 
> dependencies doesn't, producing these {{KeyError}}s.
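
The failure can be reproduced in isolation with the standard `logging` module. The sketch below is not the dtest code: the shortened format string, the `make_record` helper and the `DefaultCurrentTest` filter are assumptions used to show one possible remedy, namely supplying a default `current_test` for records that lack it. On Python 2.7 the missing key surfaces as `KeyError`; newer Python 3 releases raise `ValueError` instead, so both are caught.

```python
import logging

# Shortened version of the formatter quoted in the ticket.
FMT = "%(name)s %(current_test)s %(levelname)s %(message)s"
formatter = logging.Formatter(FMT)

class DefaultCurrentTest(logging.Filter):
    """Hypothetical filter: supply current_test when the caller gave none."""
    def filter(self, record):
        if not hasattr(record, "current_test"):
            record.current_test = "n/a"
        return True

def make_record(msg, extra=None):
    """Build a record the way a dependency would, optionally with extras."""
    record = logging.LogRecord("dep", logging.WARNING, "dep.py", 1, msg, None, None)
    for key, value in (extra or {}).items():
        setattr(record, key, value)
    return record

def format_fails(record):
    """Return True if formatting raises because current_test is missing."""
    try:
        formatter.format(record)
    except (KeyError, ValueError):
        return True
    return False
```

Attaching such a filter to the handler that writes 'dtest.log' (via `handler.addFilter(...)`) would let records from dependencies through without changing the format string; whether that is the right fix for dtest is a separate question.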




[jira] [Commented] (CASSANDRA-14061) trunk eclipse-warnings

2017-11-30 Thread Jeff Jirsa (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273436#comment-16273436
 ] 

Jeff Jirsa commented on CASSANDRA-14061:


[~spo...@gmail.com] interested in being the official reviewer on this?


> trunk eclipse-warnings
> --
>
> Key: CASSANDRA-14061
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14061
> Project: Cassandra
>  Issue Type: Bug
>  Components: Testing
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>Priority: Minor
>
> {noformat}
> eclipse-warnings:
> [mkdir] Created dir: /home/ubuntu/cassandra/build/ecj
>  [echo] Running Eclipse Code Analysis.  Output logged to 
> /home/ubuntu/cassandra/build/ecj/eclipse_compiler_checks.txt
>  [java] --
>  [java] 1. ERROR in 
> /home/ubuntu/cassandra/src/java/org/apache/cassandra/io/sstable/SSTableIdentityIterator.java
>  (at line 59)
>  [java]   return new SSTableIdentityIterator(sstable, key, 
> partitionLevelDeletion, file.getPath(), iterator);
>  [java]   
> ^^^
>  [java] Potential resource leak: 'iterator' may not be closed at this 
> location
>  [java] --
>  [java] 2. ERROR in 
> /home/ubuntu/cassandra/src/java/org/apache/cassandra/io/sstable/SSTableIdentityIterator.java
>  (at line 79)
>  [java]   return new SSTableIdentityIterator(sstable, key, 
> partitionLevelDeletion, dfile.getPath(), iterator);
>  [java]   
> 
>  [java] Potential resource leak: 'iterator' may not be closed at this 
> location
>  [java] --
>  [java] 2 problems (2 errors)
> {noformat}




[jira] [Commented] (CASSANDRA-13917) COMPACT STORAGE inserts on tables without clusterings accept hidden column1 and value columns

2017-11-30 Thread Jeff Jirsa (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273425#comment-16273425
 ] 

Jeff Jirsa commented on CASSANDRA-13917:


[~ifesdjeen] are you able to review this as the reporter?

> COMPACT STORAGE inserts on tables without clusterings accept hidden column1 
> and value columns
> -
>
> Key: CASSANDRA-13917
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13917
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Alex Petrov
>Assignee: Aleksandr Sorokoumov
>Priority: Minor
>  Labels: lhf
> Fix For: 3.0.x, 3.11.x
>
>
> Test for the issue:
> {code}
> @Test
> public void testCompactStorage() throws Throwable
> {
>     createTable("CREATE TABLE %s (a int PRIMARY KEY, b int, c int) WITH COMPACT STORAGE");
>     assertInvalid("INSERT INTO %s (a, b, c, column1) VALUES (?, ?, ?, ?)", 1, 1, 1, ByteBufferUtil.bytes('a'));
>     // This one fails with "Some clustering keys are missing: column1", which is still wrong
>     assertInvalid("INSERT INTO %s (a, b, c, value) VALUES (?, ?, ?, ?)", 1, 1, 1, ByteBufferUtil.bytes('a'));
>     assertInvalid("INSERT INTO %s (a, b, c, column1, value) VALUES (?, ?, ?, ?, ?)", 1, 1, 1, ByteBufferUtil.bytes('a'), ByteBufferUtil.bytes('b'));
>     assertEmpty(execute("SELECT * FROM %s"));
> }
> {code}
> Fortunately, these writes are no-ops, even though they succeed.
> {{value}} and {{column1}} should be completely hidden. Fixing this one should 
> be as easy as just adding validations.
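
The kind of validation suggested above might look roughly like the following Python sketch. It is not Cassandra's implementation: the function name, the `declared_columns` argument and the error text are all assumptions, and it only models the case from the test (a table without clusterings whose user-declared columns do not themselves use the hidden names).

```python
HIDDEN_COMPACT_COLUMNS = frozenset({"column1", "value"})

def validate_insert_columns(columns, declared_columns, hidden=HIDDEN_COMPACT_COLUMNS):
    """Reject any column the user did not declare, including the hidden ones."""
    for name in columns:
        if name not in declared_columns:
            kind = "hidden" if name in hidden else "unknown"
            raise ValueError("Undefined column name %s (%s)" % (name, kind))
```

Under this check, an INSERT naming `column1` or `value` is rejected the same way as any other undeclared column, instead of succeeding as a no-op.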




[jira] [Assigned] (CASSANDRA-10726) Read repair inserts should not be blocking

2017-11-30 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston reassigned CASSANDRA-10726:
---

Assignee: Blake Eggleston  (was: Xiaolong Jiang)

> Read repair inserts should not be blocking
> --
>
> Key: CASSANDRA-10726
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10726
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination
>Reporter: Richard Low
>Assignee: Blake Eggleston
> Fix For: 4.x
>
>
> Today, if there’s a digest mismatch in a foreground read repair, the insert 
> to update out of date replicas is blocking. This means, if it fails, the read 
> fails with a timeout. If a node is dropping writes (maybe it is overloaded or 
> the mutation stage is backed up for some other reason), all reads to a 
> replica set could fail. Further, replicas dropping writes get more out of 
> sync so will require more read repair.
> The comment on the code for why the writes are blocking is:
> {code}
> // wait for the repair writes to be acknowledged, to minimize impact on any replica that's
> // behind on writes in case the out-of-sync row is read multiple times in quick succession
> {code}
> but the bad side effect is that reads time out. Either the writes should not 
> be blocking or we should return success for the read even if the write times 
> out.
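
The second option in the description (return success for the read even if the repair write times out) can be sketched as follows. This is an illustrative Python model, not the coordinator code; `read_with_best_effort_repair`, `repair_write` and the timeout parameter are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError

def read_with_best_effort_repair(merged_row, repair_write, executor, write_timeout_s):
    """Send the repair write, wait briefly for it, but never fail the read on it."""
    future = executor.submit(repair_write, merged_row)
    try:
        future.result(timeout=write_timeout_s)
        repaired = True
    except TimeoutError:
        # The replica is slow or dropping writes: give up waiting, keep the read.
        repaired = False
    return merged_row, repaired
```

The caller still gets the merged row when a replica is slow; the trade-off named in the quoted code comment (a lagging replica may be read again before it is repaired) remains.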




[jira] [Commented] (CASSANDRA-14075) Many sslnodetonode_test.TestNodeToNodeSSLEncryption tests failing with "Please remove properties [optional, enabled] from your cassandra.yaml"

2017-11-30 Thread Jason Brown (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273188#comment-16273188
 ] 

Jason Brown commented on CASSANDRA-14075:
-

[~mkjellman]'s evaluation is correct: in CASSANDRA-10404, I didn't correctly 
support pre-4.0 in this dtest. Here is a [dtest 
patch|https://github.com/jasobrown/cassandra-dtest/tree/14075] that checks the 
cluster version and only adds the new props if it's greater than or equal 
to 4.0.

Here are runs of the dtest patch against both 3.11 and trunk:

||3.11||trunk||
|[utests & dtests|https://circleci.com/gh/jasobrown/workflows/cassandra/tree/14075-3.11]|[utests & dtests|https://circleci.com/gh/jasobrown/workflows/cassandra/tree/14075-trunk]|


Note: I also ran this locally with jdk1.8.0_151, and started getting this 
warning:

{noformat}
Warning:
The JKS keystore uses a proprietary format. It is recommended to migrate to 
PKCS12 which is an industry standard format using "keytool -importkeystore 
-srckeystore /tmp/tmpICn9py/ca.keystore -destkeystore 
/tmp/tmpICn9py/ca.keystore -deststoretype pkcs12".
{noformat}

I've also updated {{sslkeygen.py}} in this patch with a trivial fix to 
eliminate the warning.


> Many sslnodetonode_test.TestNodeToNodeSSLEncryption tests failing with 
> "Please remove properties [optional, enabled] from your cassandra.yaml"
> --
>
> Key: CASSANDRA-14075
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14075
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Michael Kjellman
>Assignee: Jason Brown
>
> Many sslnodetonode_test.TestNodeToNodeSSLEncryption dtests are failing on 
> 3.11 with an exception on startup due to invalid yaml properties.
> Unexpected error in node1 log, error: 
> ERROR [main] 2017-11-18 21:01:54,781 CassandraDaemon.java:706 - Exception 
> encountered during startup: Invalid yaml. Please remove properties [optional, 
> enabled] from your cassandra.yaml 
> Although ccm was updated in 
> https://github.com/pcmanus/ccm/commit/eaaa425b70edb84786924516aee3920d685c0e53
>  to include a version check for >= 4.0, enabled and optional are emitted 
> unconditionally in the actual dtest itself -- they should also be conditional 
> on >= 4.0
> {code:java}
> node.set_configuration_options(values={
>     'server_encryption_options': {
>         'enabled': encryption_enabled,
>         'optional': encryption_optional,
>         'internode_encryption': internode_encryption,
>         'keystore': kspath,
>         'keystore_password': 'cassandra',
>         'truststore': tspath,
>         'truststore_password': 'cassandra',
>         'require_endpoint_verification': endpoint_verification,
>         'require_client_auth': client_auth,
>     }
> })
> {code}
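
One way to make `enabled` and `optional` conditional on the cluster version is sketched below. This is not the actual dtest patch: `parse_version` and `server_encryption_options` are hypothetical helpers, and the version is assumed to be available as a plain string such as '3.11.1'.

```python
def parse_version(text):
    """Turn '3.11.1' into (3, 11, 1); a suffix such as '-SNAPSHOT' is dropped."""
    core = text.split("-")[0]
    return tuple(int(part) for part in core.split("."))

def server_encryption_options(cluster_version, base_options, enabled, optional):
    """Copy base_options, adding the 4.0-only keys only when they are understood."""
    options = dict(base_options)
    if parse_version(cluster_version) >= (4, 0):
        options["enabled"] = enabled
        options["optional"] = optional
    return options
```

The resulting dict would then be passed as the `server_encryption_options` value in the call quoted above, so a 3.11 node never sees the properties it rejects at startup.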




[jira] [Commented] (CASSANDRA-13983) Support a means of logging all queries as they were invoked

2017-11-30 Thread Blake Eggleston (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273156#comment-16273156
 ] 

Blake Eggleston commented on CASSANDRA-13983:
-

+1 with the recent changes. Thanks for dividing the fixes between a few commits

> Support a means of logging all queries as they were invoked
> ---
>
> Key: CASSANDRA-13983
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13983
> Project: Cassandra
>  Issue Type: New Feature
>  Components: CQL, Observability, Testing, Tools
>Reporter: Ariel Weisberg
>Assignee: Ariel Weisberg
> Fix For: 4.0
>
>
> For correctness testing it's useful to be able to capture production traffic 
> so that it can be replayed against both the old and new versions of Cassandra 
> while comparing the results.
> Implementing this functionality once inside the database is high performance 
> and presents less operational complexity.
> In [this patch|https://github.com/apache/cassandra/pull/169] there is an 
> implementation of a full query log that uses chronicle-queue (apache 
> licensed, the maven artifacts are labeled incorrectly in some cases, 
> dependencies are also apache licensed) to implement a rotating log of queries.
> * Single thread asynchronously writes log entries to disk to reduce impact on 
> query latency
> * Heap memory usage bounded by a weighted queue with configurable maximum 
> weight sitting in front of logging thread
> * If the weighted queue is full producers can be blocked or samples can be 
> dropped
> * Disk utilization is bounded by deleting old log segments once a 
> configurable size is reached
> * The on disk serialization uses a flexible schema binary format 
> (chronicle-wire) making it easy to skip unrecognized fields, add new ones, 
> and omit old ones.
> * Can be enabled and configured via JMX, disabled, and reset (delete on disk 
> data), logging path is configurable via both JMX and YAML
> * Introduce new {{fqltool}} in /bin that currently implements {{Dump}} which 
> can dump in a human readable format full query logs as well as follow active 
> full query logs
> Follow up work:
> * Introduce new {{fqltool}} command Replay which can replay N full query logs 
> to two different clusters and compare the result and check for 
> inconsistencies. <- Actively working on getting this done
> * Log not just queries but their results to facilitate a comparison between 
> the original query result and the replayed result. <- Really just don't have 
> specific use case at the moment
> * "Consistent" query logging allowing replay to fully replicate the original 
> order of execution and completion even in the face of races (including CAS). 
> <- This is more speculative
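
The bounded, weighted queue in front of the logging thread (block producers or drop samples when full) can be illustrated with a small Python model. This is only a sketch of the idea in the bullet list above, not the patch's Java implementation; the class and method names are made up.

```python
import threading
from collections import deque

class WeightedQueue:
    """Queue bounded by total item weight rather than by item count."""

    def __init__(self, max_weight):
        self._max_weight = max_weight
        self._weight = 0
        self._items = deque()
        self._cond = threading.Condition()
        self.dropped = 0

    def offer(self, item, weight, block=False):
        """Add an item; when full, either wait (block=True) or drop it."""
        if weight > self._max_weight:
            raise ValueError("item can never fit in the queue")
        with self._cond:
            while self._weight + weight > self._max_weight:
                if not block:
                    self.dropped += 1
                    return False
                self._cond.wait()
            self._items.append((item, weight))
            self._weight += weight
            self._cond.notify_all()
            return True

    def take(self):
        """Remove the oldest item, waiting until one is available."""
        with self._cond:
            while not self._items:
                self._cond.wait()
            item, weight = self._items.popleft()
            self._weight -= weight
            self._cond.notify_all()
            return item
```

A single consumer thread calling `take()` and writing to disk plays the role of the logging thread; query threads call `offer()` and pay only the cost of the enqueue unless blocking is requested.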




[jira] [Updated] (CASSANDRA-13983) Support a means of logging all queries as they were invoked

2017-11-30 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-13983:

Status: Ready to Commit  (was: Patch Available)

> Support a means of logging all queries as they were invoked
> ---
>
> Key: CASSANDRA-13983
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13983
> Project: Cassandra
>  Issue Type: New Feature
>  Components: CQL, Observability, Testing, Tools
>Reporter: Ariel Weisberg
>Assignee: Ariel Weisberg
> Fix For: 4.0
>
>
> For correctness testing it's useful to be able to capture production traffic 
> so that it can be replayed against both the old and new versions of Cassandra 
> while comparing the results.
> Implementing this functionality once inside the database is high performance 
> and presents less operational complexity.
> In [this patch|https://github.com/apache/cassandra/pull/169] there is an 
> implementation of a full query log that uses chronicle-queue (apache 
> licensed, the maven artifacts are labeled incorrectly in some cases, 
> dependencies are also apache licensed) to implement a rotating log of queries.
> * Single thread asynchronously writes log entries to disk to reduce impact on 
> query latency
> * Heap memory usage bounded by a weighted queue with configurable maximum 
> weight sitting in front of logging thread
> * If the weighted queue is full producers can be blocked or samples can be 
> dropped
> * Disk utilization is bounded by deleting old log segments once a 
> configurable size is reached
> * The on disk serialization uses a flexible schema binary format 
> (chronicle-wire) making it easy to skip unrecognized fields, add new ones, 
> and omit old ones.
> * Can be enabled and configured via JMX, disabled, and reset (delete on disk 
> data), logging path is configurable via both JMX and YAML
> * Introduce new {{fqltool}} in /bin that currently implements {{Dump}} which 
> can dump in a human readable format full query logs as well as follow active 
> full query logs
> Follow up work:
> * Introduce new {{fqltool}} command Replay which can replay N full query logs 
> to two different clusters and compare the result and check for 
> inconsistencies. <- Actively working on getting this done
> * Log not just queries but their results to facilitate a comparison between 
> the original query result and the replayed result. <- Really just don't have 
> specific use case at the moment
> * "Consistent" query logging allowing replay to fully replicate the original 
> order of execution and completion even in the face of races (including CAS). 
> <- This is more speculative




[jira] [Updated] (CASSANDRA-13308) Gossip breaks, Hint files not being deleted on nodetool decommission

2017-11-30 Thread Jeff Jirsa (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-13308:
---
Component/s: Hints

> Gossip breaks, Hint files not being deleted on nodetool decommission
> 
>
> Key: CASSANDRA-13308
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13308
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hints, Streaming and Messaging
> Environment: Using Cassandra version 3.0.9
>Reporter: Arijit
>Assignee: Jeff Jirsa
> Fix For: 3.0.14, 3.11.0, 4.0
>
> Attachments: 28207.stack, logs, logs_decommissioned_node
>
>
> How to reproduce the issue I'm seeing:
> Shut down Cassandra on one node of the cluster and wait until we accumulate a 
> ton of hints. Start Cassandra on the node and immediately run "nodetool 
> decommission" on it.
> The node streams its replicas and marks itself as DECOMMISSIONED, but other 
> nodes do not seem to see this message. "nodetool status" shows the 
> decommissioned node in state "UL" on all other nodes (it is also present in 
> system.peers), and Cassandra logs show that gossip tasks on nodes are not 
> proceeding (number of pending tasks keeps increasing). Jstack suggests that a 
> gossip task is blocked on hints dispatch (I can provide traces if this is not 
> obvious). Because the cluster is large and there are a lot of hints, this is 
> taking a while. 
> On inspecting "/var/lib/cassandra/hints" on the nodes, I see a bunch of hint 
> files for the decommissioned node. Documentation seems to suggest that these 
> hints should be deleted during "nodetool decommission", but it does not seem 
> to be the case here. This is the bug being reported.
> To recover from this scenario, if I manually delete hint files on the nodes, 
> the hints dispatcher threads throw a bunch of exceptions and the 
> decommissioned node is now in state "DL" (perhaps it missed some gossip 
> messages?). The node is still in my "system.peers" table
> Restarting Cassandra on all nodes after this step does not fix the issue (the 
> node remains in the peers table). In fact, after this point the 
> decommissioned node is in state "DN"




[jira] [Updated] (CASSANDRA-13740) Orphan hint file gets created while node is being removed from cluster

2017-11-30 Thread Jeff Jirsa (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-13740:
---
Component/s: Hints

> Orphan hint file gets created while node is being removed from cluster
> --
>
> Key: CASSANDRA-13740
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13740
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core, Hints
>Reporter: Jaydeepkumar Chovatia
>Assignee: Jaydeepkumar Chovatia
>Priority: Minor
> Fix For: 3.0.x, 3.11.x
>
> Attachments: 13740-3.0.15.txt, gossip_hang_test.py
>
>
> I found this new issue during my testing: whenever a node is being removed, 
> a hint file for that node gets written and stays inside the hint directory 
> forever. I debugged the code and found that it is due to a race condition 
> between [HintsWriteExecutor.java::flush | 
> https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsWriteExecutor.java#L195]
>  and [HintsWriteExecutor.java::closeWriter | 
> https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsWriteExecutor.java#L106]
> . 
>  
> *Time t1* Node is down, as a result Hints are being written by 
> [HintsWriteExecutor.java::flush | 
> https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsWriteExecutor.java#L195]
> *Time t2* Node is removed from cluster as a result it calls 
> [HintsService.java-exciseStore | 
> https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsService.java#L327]
>  which removes hint files for the node being removed
> *Time t3* Mutation stage keeps pumping Hints through [HintService.java::write 
> | 
> https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsService.java#L145]
>  which again calls [HintsWriteExecutor.java::flush | 
> https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsWriteExecutor.java#L215]
>  and a new orphan file gets created
> I was writing a new dtest for {CASSANDRA-13562, CASSANDRA-13308} and that 
> helped me reproduce this new bug. I will submit a patch for this new dtest 
> later.
> I also tried the following to check how this orphan hint file responds:
> 1. I tried {{nodetool truncatehints }} but it fails as the node is no 
> longer part of the ring
> 2. I then tried {{nodetool truncatehints}}, which still doesn’t remove the hint 
> file because it is not yet included in the [dispatchDequeue | 
> https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsStore.java#L53]
> Reproducible steps:
> Please find dTest python file {{gossip_hang_test.py}} attached which 
> reproduces this bug.
> Solution:
> This is due to the race condition mentioned above. Since 
> {{HintsWriteExecutor.java}} creates a thread pool with only one worker, the 
> solution is fairly simple. Whenever we [HintService.java::excise | 
> https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsService.java#L303]
>  a host, just store it in memory, and check for an already-evicted host inside 
> [HintsWriteExecutor.java::flush | 
> https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsWriteExecutor.java#L215].
>  If an already-evicted host is found, ignore its hints.
> Jaydeep
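
The proposed fix can be sketched roughly as follows. This is a minimal illustration, not the attached patch: the class ExcisedHostGuardSketch and its methods excise, flush and drain are hypothetical names, and the list of flushed hosts merely stands in for writing a hints file. It relies on the single-worker assumption stated in the description, so excise and flush tasks never run concurrently.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the hints write path; NOT the real HintsWriteExecutor.
final class ExcisedHostGuardSketch
{
    // One worker thread, as in the description: excise and flush tasks run
    // strictly one after another, so plain collections are safe here.
    private final ExecutorService executor = Executors.newSingleThreadExecutor();
    private final Set<UUID> excisedHosts = new HashSet<>();
    private final List<UUID> flushed = new ArrayList<>();

    // Remember the removed host so later flushes can recognise it.
    void excise(UUID hostId)
    {
        executor.execute(() -> excisedHosts.add(hostId));
    }

    // Ignore hints for an already excised host instead of creating a new file.
    void flush(UUID hostId)
    {
        executor.execute(() -> {
            if (!excisedHosts.contains(hostId))
                flushed.add(hostId); // stands in for writing a hints file
        });
    }

    // Waits for the queued tasks and returns the hosts hints were written for.
    List<UUID> drain()
    {
        executor.shutdown();
        try
        {
            if (!executor.awaitTermination(5, TimeUnit.SECONDS))
                throw new IllegalStateException("tasks did not finish in time");
        }
        catch (InterruptedException e)
        {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return flushed;
    }

    public static void main(String[] args)
    {
        ExcisedHostGuardSketch sketch = new ExcisedHostGuardSketch();
        UUID removed = UUID.randomUUID();
        UUID alive = UUID.randomUUID();
        sketch.flush(removed);   // t1: node is down, hints are written
        sketch.excise(removed);  // t2: node is removed from the cluster
        sketch.flush(removed);   // t3: late hints are ignored, no orphan file
        sketch.flush(alive);
        System.out.println(sketch.drain().size()); // prints 2
    }
}
```

Keeping the excised set on the same single-threaded executor avoids extra locking; a real implementation would also have to decide when entries can be dropped from the set again.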




[jira] [Updated] (CASSANDRA-14080) Handling 0 size hint files during start

2017-11-30 Thread Jeff Jirsa (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-14080:
---
Component/s: Hints

> Handling 0 size hint files during start
> ---
>
> Key: CASSANDRA-14080
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14080
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hints
>Reporter: Aleksandr Ivanov
>
> Continuation of CASSANDRA-12728 bug.
> Problem: Cassandra didn't start due to 0 size hints files
> Log from v3.0.14:
> {code:java}
> INFO  [main] 2017-11-28 19:10:13,554 StorageService.java:575 - Cassandra 
> version: 3.0.14
> INFO  [main] 2017-11-28 19:10:13,555 StorageService.java:576 - Thrift API 
> version: 20.1.0
> INFO  [main] 2017-11-28 19:10:13,555 StorageService.java:577 - CQL supported 
> versions: 3.4.0 (default: 3.4.0)
> ERROR [main] 2017-11-28 19:10:13,592 CassandraDaemon.java:710 - Exception 
> encountered during startup
> org.apache.cassandra.io.FSReadError: java.io.EOFException
> at 
> org.apache.cassandra.hints.HintsDescriptor.readFromFile(HintsDescriptor.java:142)
>  ~[apache-cassandra-3.0.14.jar:3.0.14]
> at 
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) 
> ~[na:1.8.0_141]
> at 
> java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175) 
> ~[na:1.8.0_141]
> at java.util.Iterator.forEachRemaining(Iterator.java:116) 
> ~[na:1.8.0_141]
> at 
> java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
>  ~[na:1.8.0_141]
> at 
> java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481) 
> ~[na:1.8.0_141]
> at 
> java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471) 
> ~[na:1.8.0_141]
> at 
> java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708) 
> ~[na:1.8.0_141]
> at 
> java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) 
> ~[na:1.8.0_141]
> at 
> java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499) 
> ~[na:1.8.0_141]
> at org.apache.cassandra.hints.HintsCatalog.load(HintsCatalog.java:65) 
> ~[apache-cassandra-3.0.14.jar:3.0.14]
> at 
> org.apache.cassandra.hints.HintsService.<init>(HintsService.java:88) 
> ~[apache-cassandra-3.0.14.jar:3.0.14]
> at 
> org.apache.cassandra.hints.HintsService.<clinit>(HintsService.java:63) 
> ~[apache-cassandra-3.0.14.jar:3.0.14]
> at 
> org.apache.cassandra.service.StorageProxy.<clinit>(StorageProxy.java:121) 
> ~[apache-cassandra-3.0.14.jar:3.0.14]
> at java.lang.Class.forName0(Native Method) ~[na:1.8.0_141]
> at java.lang.Class.forName(Class.java:264) ~[na:1.8.0_141]
> at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:585)
>  ~[apache-cassandra-3.0.14.jar:3.0.14]
> at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:570)
>  ~[apache-cassandra-3.0.14.jar:3.0.14]
> at 
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:346) 
> [apache-cassandra-3.0.14.jar:3.0.14]
> at 
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:569)
>  [apache-cassandra-3.0.14.jar:3.0.14]
> at 
> org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:697) 
> [apache-cassandra-3.0.14.jar:3.0.14]
> Caused by: java.io.EOFException: null
> at java.io.RandomAccessFile.readInt(RandomAccessFile.java:803) 
> ~[na:1.8.0_141]
> at 
> org.apache.cassandra.hints.HintsDescriptor.deserialize(HintsDescriptor.java:237)
>  ~[apache-cassandra-3.0.14.jar:3.0.14]
> at 
> org.apache.cassandra.hints.HintsDescriptor.readFromFile(HintsDescriptor.java:138)
>  ~[apache-cassandra-3.0.14.jar:3.0.14]
> ... 20 common frames omitted
> {code}
> After deleting several 0 size hints files, Cassandra started successfully.
> Jeff Jirsa added a comment - Yesterday
> Aleksandr Ivanov can you open a new JIRA and link it back to this one? It's 
> possible that the original patch didn't consider 0 byte files (I don't have 
> time to go back and look at the commit, and it was long enough ago that I've 
> forgotten) - were all of your files 0 bytes?
> Not all, 8..10 hints files had 0 size.
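
One possible way to tolerate such files at startup is to skip zero-length hints files while scanning the hints directory, so that no descriptor is read from them. The sketch below only illustrates that idea under stated assumptions: EmptyHintFileScanSketch, loadableHintFiles and demo are hypothetical names, and this is not the real HintsCatalog.load nor the eventual fix for this ticket.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper; NOT the real HintsCatalog.load.
final class EmptyHintFileScanSketch
{
    // Returns the names of *.hints files that contain at least one byte.
    static List<String> loadableHintFiles(Path hintsDir)
    {
        try (Stream<Path> files = Files.list(hintsDir))
        {
            return files.filter(p -> p.getFileName().toString().endsWith(".hints"))
                        .filter(EmptyHintFileScanSketch::hasContent)
                        .map(p -> p.getFileName().toString())
                        .sorted()
                        .collect(Collectors.toList());
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
    }

    private static boolean hasContent(Path file)
    {
        try
        {
            return Files.size(file) > 0;
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
    }

    // Builds a temporary directory with one empty and one non-empty file.
    static List<String> demo()
    {
        try
        {
            Path dir = Files.createTempDirectory("hints-sketch");
            Files.write(dir.resolve("a.hints"), new byte[0]);
            Files.write(dir.resolve("b.hints"), new byte[] { 1, 2, 3, 4 });
            return loadableHintFiles(dir);
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args)
    {
        System.out.println(demo()); // prints [b.hints]
    }
}
```

Whether empty files should be skipped, logged, or deleted is a design decision the ticket leaves open; the sketch only skips them.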




[jira] [Updated] (CASSANDRA-12728) Handling partially written hint files

2017-11-30 Thread Jeff Jirsa (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-12728:
---
Component/s: Hints

> Handling partially written hint files
> -
>
> Key: CASSANDRA-12728
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12728
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hints
>Reporter: Sharvanath Pathak
>Assignee: Garvit Juniwal
>  Labels: lhf
> Fix For: 3.0.14, 3.11.0, 4.0
>
> Attachments: CASSANDRA-12728.patch
>
>
> {noformat}
> ERROR [HintsDispatcher:1] 2016-09-28 17:44:43,397 
> HintsDispatchExecutor.java:225 - Failed to dispatch hints file 
> d5d7257c-9f81-49b2-8633-6f9bda6e3dea-1474892654160-1.hints: file is corrupted 
> ({})
> org.apache.cassandra.io.FSReadError: java.io.EOFException
> at 
> org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:282)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:252)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
> ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsDispatcher.sendHints(HintsDispatcher.java:156)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsDispatcher.sendHintsAndAwait(HintsDispatcher.java:137)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:119) 
> ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:91) 
> ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:259)
>  [apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:242)
>  [apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:220)
>  [apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:199)
>  [apache-cassandra-3.0.6.jar:3.0.6]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> [na:1.8.0_77]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> [na:1.8.0_77]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  [na:1.8.0_77]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_77]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_77]
> Caused by: java.io.EOFException: null
> at 
> org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:68)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:60)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.ChecksummedDataInput.readFully(ChecksummedDataInput.java:126)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:402) 
> ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsReader$BuffersIterator.readBuffer(HintsReader.java:310)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNextInternal(HintsReader.java:301)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:278)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> ... 15 common frames omitted
> {noformat}
> We've found out that the hint file was truncated because there was a hard 
> reboot around the time of the last write to the file. I think we basically need 
> to handle partially written hint files. Also, the CRC file does not exist in 
> this case (probably because the node crashed while writing the hints file). Maybe 
> ignoring and cleaning up such partially written hint files can be a way to 
> fix this?
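
The idea of tolerating a partially written file can be illustrated with a reader that stops at the first incomplete record instead of failing. This is a sketch under simplifying assumptions: it uses a made-up layout of [4-byte length][payload] records, not the actual hints file format (which also carries checksums), and TruncatedHintFileSketch and readCompleteRecords are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical reader for a simplified record layout; NOT the real HintsReader.
final class TruncatedHintFileSketch
{
    // Returns every complete record and silently stops at a truncated tail.
    static List<byte[]> readCompleteRecords(byte[] fileContents)
    {
        List<byte[]> records = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(fileContents)))
        {
            while (in.available() > 0)
            {
                int length = in.readInt(); // throws EOFException if the prefix is cut off
                if (length < 0 || length > in.available())
                    break;                 // payload was only partially written
                byte[] payload = new byte[length];
                in.readFully(payload);
                records.add(payload);
            }
        }
        catch (EOFException e)
        {
            // Truncated length prefix: keep the records read so far.
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
        return records;
    }

    public static void main(String[] args)
    {
        // One complete 1-byte record, then a record that claims 9 bytes but has 1.
        byte[] truncated = { 0, 0, 0, 1, 42, 0, 0, 0, 9, 1 };
        System.out.println(readCompleteRecords(truncated).size());   // prints 1
        System.out.println(readCompleteRecords(new byte[0]).size()); // prints 0
    }
}
```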




[jira] [Commented] (CASSANDRA-14080) Handling 0 size hint files during start

2017-11-30 Thread Jeff Jirsa (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273111#comment-16273111
 ] 

Jeff Jirsa commented on CASSANDRA-14080:


Probably: CASSANDRA-13740

> Handling 0 size hint files during start
> ---
>
> Key: CASSANDRA-14080
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14080
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Aleksandr Ivanov
>
> Continuation of CASSANDRA-12728 bug.
> Problem: Cassandra didn't start due to 0 size hints files
> Log from v3.0.14:
> {code:java}
> INFO  [main] 2017-11-28 19:10:13,554 StorageService.java:575 - Cassandra 
> version: 3.0.14
> INFO  [main] 2017-11-28 19:10:13,555 StorageService.java:576 - Thrift API 
> version: 20.1.0
> INFO  [main] 2017-11-28 19:10:13,555 StorageService.java:577 - CQL supported 
> versions: 3.4.0 (default: 3.4.0)
> ERROR [main] 2017-11-28 19:10:13,592 CassandraDaemon.java:710 - Exception 
> encountered during startup
> org.apache.cassandra.io.FSReadError: java.io.EOFException
> at 
> org.apache.cassandra.hints.HintsDescriptor.readFromFile(HintsDescriptor.java:142)
>  ~[apache-cassandra-3.0.14.jar:3.0.14]
> at 
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) 
> ~[na:1.8.0_141]
> at 
> java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175) 
> ~[na:1.8.0_141]
> at java.util.Iterator.forEachRemaining(Iterator.java:116) 
> ~[na:1.8.0_141]
> at 
> java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
>  ~[na:1.8.0_141]
> at 
> java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481) 
> ~[na:1.8.0_141]
> at 
> java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471) 
> ~[na:1.8.0_141]
> at 
> java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708) 
> ~[na:1.8.0_141]
> at 
> java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) 
> ~[na:1.8.0_141]
> at 
> java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499) 
> ~[na:1.8.0_141]
> at org.apache.cassandra.hints.HintsCatalog.load(HintsCatalog.java:65) 
> ~[apache-cassandra-3.0.14.jar:3.0.14]
> at 
> org.apache.cassandra.hints.HintsService.<init>(HintsService.java:88) 
> ~[apache-cassandra-3.0.14.jar:3.0.14]
> at 
> org.apache.cassandra.hints.HintsService.<clinit>(HintsService.java:63) 
> ~[apache-cassandra-3.0.14.jar:3.0.14]
> at 
> org.apache.cassandra.service.StorageProxy.<clinit>(StorageProxy.java:121) 
> ~[apache-cassandra-3.0.14.jar:3.0.14]
> at java.lang.Class.forName0(Native Method) ~[na:1.8.0_141]
> at java.lang.Class.forName(Class.java:264) ~[na:1.8.0_141]
> at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:585)
>  ~[apache-cassandra-3.0.14.jar:3.0.14]
> at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:570)
>  ~[apache-cassandra-3.0.14.jar:3.0.14]
> at 
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:346) 
> [apache-cassandra-3.0.14.jar:3.0.14]
> at 
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:569)
>  [apache-cassandra-3.0.14.jar:3.0.14]
> at 
> org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:697) 
> [apache-cassandra-3.0.14.jar:3.0.14]
> Caused by: java.io.EOFException: null
> at java.io.RandomAccessFile.readInt(RandomAccessFile.java:803) 
> ~[na:1.8.0_141]
> at 
> org.apache.cassandra.hints.HintsDescriptor.deserialize(HintsDescriptor.java:237)
>  ~[apache-cassandra-3.0.14.jar:3.0.14]
> at 
> org.apache.cassandra.hints.HintsDescriptor.readFromFile(HintsDescriptor.java:138)
>  ~[apache-cassandra-3.0.14.jar:3.0.14]
> ... 20 common frames omitted
> {code}
> After deleting several 0 size hints files, Cassandra started successfully.
> Jeff Jirsa added a comment - Yesterday
> Aleksandr Ivanov can you open a new JIRA and link it back to this one? It's 
> possible that the original patch didn't consider 0 byte files (I don't have 
> time to go back and look at the commit, and it was long enough ago that I've 
> forgotten) - were all of your files 0 bytes?
> Not all, 8..10 hints files had 0 size.




[jira] [Updated] (CASSANDRA-14079) Prevent compaction strategies from looping indefinitely

2017-11-30 Thread Paulo Motta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paulo Motta updated CASSANDRA-14079:

       Resolution: Fixed
    Fix Version/s: 4.0
                   3.11.2
           Status: Resolved  (was: Ready to Commit)

> Prevent compaction strategies from looping indefinitely
> ---
>
> Key: CASSANDRA-14079
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14079
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Paulo Motta
>Assignee: Paulo Motta
>Priority: Minor
> Fix For: 3.11.2, 4.0
>
>
> As a result of CASSANDRA-13948, LCS was looping indefinitely trying to 
> generate the same candidates for SSTables which were not on the tracker.
> We should add a protection on compaction strategies against looping 
> indefinitely to avoid similar bugs in the future.
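
The kind of protection described here can be sketched in isolation as a retry loop that gives up once the strategy proposes the same candidates twice in a row. The sketch is generic and hypothetical: RepeatedCandidateGuardSketch and nextTask are illustrative names, the Supplier stands in for candidate selection and the Predicate for acquiring references on the tracker; it is not the committed Cassandra code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical, generic version of the guard; NOT the committed Cassandra code.
final class RepeatedCandidateGuardSketch
{
    static <T> Optional<List<T>> nextTask(Supplier<List<T>> candidates, Predicate<List<T>> tryAcquire)
    {
        List<T> previous = null;
        while (true)
        {
            List<T> candidate = candidates.get();
            if (candidate.isEmpty())
                return Optional.empty();      // nothing to compact
            if (candidate.equals(previous))
                return Optional.empty();      // same candidates as last attempt: stop looping
            if (tryAcquire.test(candidate))
                return Optional.of(candidate); // references acquired, task can run
            previous = candidate;
        }
    }

    public static void main(String[] args)
    {
        int[] attempts = { 0 };
        Optional<List<String>> task = nextTask(
            () -> { attempts[0]++; return Arrays.asList("sstable-1", "sstable-2"); },
            candidate -> false); // acquisition always fails, as in the reported bug
        System.out.println(task.isPresent() + " after " + attempts[0] + " attempts"); // prints false after 2 attempts
    }
}
```

Without the equality check, the loop in main would never terminate, which is the failure mode the ticket describes.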




[jira] [Commented] (CASSANDRA-14079) Prevent compaction strategies from looping indefinitely

2017-11-30 Thread Paulo Motta (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273092#comment-16273092
 ] 

Paulo Motta commented on CASSANDRA-14079:
-

Committed as {{c253ed4fa7b7b5667879bb41be09fe9658224c4e}} to cassandra-3.11 and 
merged up to trunk.

Thanks for the review!

> Prevent compaction strategies from looping indefinitely
> ---
>
> Key: CASSANDRA-14079
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14079
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Paulo Motta
>Assignee: Paulo Motta
>Priority: Minor
> Fix For: 3.11.2, 4.0
>
>
> As a result of CASSANDRA-13948, LCS was looping indefinitely trying to 
> generate the same candidates for SSTables which were not on the tracker.
> We should add a protection on compaction strategies against looping 
> indefinitely to avoid similar bugs in the future.




[3/3] cassandra git commit: Merge branch 'cassandra-3.11' into trunk

2017-11-30 Thread paulo

Merge branch 'cassandra-3.11' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a01019d2
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a01019d2
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a01019d2

Branch: refs/heads/trunk
Commit: a01019d2c80d6cada5751fe23a7504ce549d2517
Parents: 4190468 c253ed4
Author: Paulo Motta 
Authored: Fri Dec 1 05:07:40 2017 +1100
Committer: Paulo Motta 
Committed: Fri Dec 1 05:07:40 2017 +1100

--
 CHANGES.txt |   1 +
 .../DateTieredCompactionStrategy.java   |  16 ++-
 .../compaction/LeveledCompactionStrategy.java   |  12 ++
 .../db/compaction/LeveledManifest.java  |  22 ++-
 .../SizeTieredCompactionStrategy.java   |  12 ++
 .../TimeWindowCompactionStrategy.java   |  12 ++
 .../AbstractCompactionStrategyTest.java | 144 +++
 7 files changed, 217 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/a01019d2/CHANGES.txt
--
diff --cc CHANGES.txt
index 4456af5,ce279f2..009dcb5
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,168 -1,5 +1,169 @@@
 +4.0
 + * Fix flaky SecondaryIndexManagerTest.assert[Not]MarkedAsBuilt 
(CASSANDRA-13965)
 + * Make LWTs send resultset metadata on every request (CASSANDRA-13992)
 + * Fix flaky indexWithFailedInitializationIsNotQueryableAfterPartialRebuild 
(CASSANDRA-13963)
 + * Introduce leaf-only iterator (CASSANDRA-9988)
 + * Upgrade Guava to 23.3 and Airline to 0.8 (CASSANDRA-13997)
 + * Allow only one concurrent call to StatusLogger (CASSANDRA-12182)
 + * Refactoring to specialised functional interfaces (CASSANDRA-13982)
 + * Speculative retry should allow more friendly params (CASSANDRA-13876)
 + * Throw exception if we send/receive repair messages to incompatible nodes 
(CASSANDRA-13944)
 + * Replace usages of MessageDigest with Guava's Hasher (CASSANDRA-13291)
 + * Add nodetool cmd to print hinted handoff window (CASSANDRA-13728)
 + * Fix some alerts raised by static analysis (CASSANDRA-13799)
 + * Checksum sstable metadata (CASSANDRA-13321, CASSANDRA-13593)
 + * Add result set metadata to prepared statement MD5 hash calculation 
(CASSANDRA-10786)
 + * Refactor GcCompactionTest to avoid boxing (CASSANDRA-13941)
 + * Expose recent histograms in JmxHistograms (CASSANDRA-13642)
 + * Fix buffer length comparison when decompressing in netty-based streaming 
(CASSANDRA-13899)
 + * Properly close StreamCompressionInputStream to release any ByteBuf 
(CASSANDRA-13906)
 + * Add SERIAL and LOCAL_SERIAL support for cassandra-stress (CASSANDRA-13925)
 + * LCS needlessly checks for L0 STCS candidates multiple times 
(CASSANDRA-12961)
 + * Correctly close netty channels when a stream session ends (CASSANDRA-13905)
 + * Update lz4 to 1.4.0 (CASSANDRA-13741)
 + * Optimize Paxos prepare and propose stage for local requests 
(CASSANDRA-13862)
 + * Throttle base partitions during MV repair streaming to prevent OOM 
(CASSANDRA-13299)
 + * Use compaction threshold for STCS in L0 (CASSANDRA-13861)
 + * Fix problem with min_compress_ratio: 1 and disallow ratio < 1 
(CASSANDRA-13703)
 + * Add extra information to SASI timeout exception (CASSANDRA-13677)
 + * Add incremental repair support for --hosts, --force, and subrange repair 
(CASSANDRA-13818)
 + * Rework CompactionStrategyManager.getScanners synchronization 
(CASSANDRA-13786)
 + * Add additional unit tests for batch behavior, TTLs, Timestamps 
(CASSANDRA-13846)
 + * Add keyspace and table name in schema validation exception 
(CASSANDRA-13845)
 + * Emit metrics whenever we hit tombstone failures and warn thresholds 
(CASSANDRA-13771)
 + * Make netty EventLoopGroups daemon threads (CASSANDRA-13837)
 + * Race condition when closing stream sessions (CASSANDRA-13852)
 + * NettyFactoryTest is failing in trunk on macOS (CASSANDRA-13831)
 + * Allow changing log levels via nodetool for related classes 
(CASSANDRA-12696)
 + * Add stress profile yaml with LWT (CASSANDRA-7960)
 + * Reduce memory copies and object creations when acting on ByteBufs 
(CASSANDRA-13789)
 + * Simplify mx4j configuration (Cassandra-13578)
 + * Fix trigger example on 4.0 (CASSANDRA-13796)
 + * Force minumum timeout value (CASSANDRA-9375)
 + * Use netty for streaming (CASSANDRA-12229)
 + * Use netty for internode messaging (CASSANDRA-8457)
 + * Add bytes repaired/unrepaired to nodetool tablestats (CASSANDRA-13774)
 + * Don't delete incremental repair sessions if they still have sstables 
(CASSANDRA-13758)
 + * Fix pending repair manager index out of bounds check (CASSANDRA-13769)
 + * Don't use RangeFetchMapCalculator when RF=1 (CASSANDRA-13576)
 + * Don't optimise trivial ranges in RangeFetchMapCalculator (CASSANDRA-1

[1/3] cassandra git commit: Prevent compaction strategies from looping indefinitely

2017-11-30 Thread paulo

Repository: cassandra
Updated Branches:
  refs/heads/cassandra-3.11 14e46e462 -> c253ed4fa
  refs/heads/trunk 41904684b -> a01019d2c


Prevent compaction strategies from looping indefinitely

Patch by Paulo Motta; Reviewed by Marcus Eriksson for CASSANDRA-14079


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c253ed4f
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c253ed4f
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c253ed4f

Branch: refs/heads/cassandra-3.11
Commit: c253ed4fa7b7b5667879bb41be09fe9658224c4e
Parents: 14e46e4
Author: Paulo Motta 
Authored: Sat Nov 25 01:55:35 2017 +1100
Committer: Paulo Motta 
Committed: Fri Dec 1 05:07:31 2017 +1100

--
 CHANGES.txt |   1 +
 .../DateTieredCompactionStrategy.java   |  16 ++-
 .../compaction/LeveledCompactionStrategy.java   |  22 ++-
 .../db/compaction/LeveledManifest.java  |  22 ++-
 .../SizeTieredCompactionStrategy.java   |  12 ++
 .../TimeWindowCompactionStrategy.java   |  12 ++
 .../AbstractCompactionStrategyTest.java | 144 +++
 7 files changed, 222 insertions(+), 7 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/c253ed4f/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index fc18dc3..ce279f2 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.11.2
+ * Prevent compaction strategies from looping indefinitely (CASSANDRA-14079)
  * Cache disk boundaries (CASSANDRA-13215)
  * Add asm jar to build.xml for maven builds (CASSANDRA-11193)
  * Round buffer size to powers of 2 for the chunk cache (CASSANDRA-13897)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/c253ed4f/src/java/org/apache/cassandra/db/compaction/DateTieredCompactionStrategy.java
--
diff --git a/src/java/org/apache/cassandra/db/compaction/DateTieredCompactionStrategy.java b/src/java/org/apache/cassandra/db/compaction/DateTieredCompactionStrategy.java
index 729ddc0..bb9f4b9 100644
--- a/src/java/org/apache/cassandra/db/compaction/DateTieredCompactionStrategy.java
+++ b/src/java/org/apache/cassandra/db/compaction/DateTieredCompactionStrategy.java
@@ -73,6 +73,7 @@ public class DateTieredCompactionStrategy extends AbstractCompactionStrategy
     @SuppressWarnings("resource")
     public AbstractCompactionTask getNextBackgroundTask(int gcBefore)
     {
+        List<SSTableReader> previousCandidate = null;
         while (true)
         {
             List<SSTableReader> latestBucket = getNextBackgroundSSTables(gcBefore);
@@ -80,9 +81,20 @@ public class DateTieredCompactionStrategy extends AbstractCompactionStrategy
             if (latestBucket.isEmpty())
                 return null;
 
+            // Already tried acquiring references without success. It means there is a race with
+            // the tracker but candidate SSTables were not yet replaced in the compaction strategy manager
+            if (latestBucket.equals(previousCandidate))
+            {
+                logger.warn("Could not acquire references for compacting SSTables {} which is not a problem per se," +
+                            "unless it happens frequently, in which case it must be reported. Will retry later.",
+                            latestBucket);
+                return null;
+            }
+
             LifecycleTransaction modifier = cfs.getTracker().tryModify(latestBucket, OperationType.COMPACTION);
             if (modifier != null)
                 return new CompactionTask(cfs, modifier, gcBefore);
+            previousCandidate = latestBucket;
         }
     }
 
@@ -170,6 +182,8 @@ public class DateTieredCompactionStrategy extends AbstractCompactionStrategy
         // no need to convert to collection if had an Iterables.max(), but not present in standard toolkit, and not worth adding
         List<SSTableReader> list = new ArrayList<>();
         Iterables.addAll(list, cfs.getSSTables(SSTableSet.LIVE));
+        if (list.isEmpty())
+            return 0;
         return Collections.max(list, (o1, o2) -> Long.compare(o1.getMaxTimestamp(), o2.getMaxTimestamp()))
                           .getMaxTimestamp();
     }
@@ -462,7 +476,7 @@ public class DateTieredCompactionStrategy extends AbstractCompactionStrategy
         return uncheckedOptions;
     }
 
-    public CompactionLogger.Strategy strategyLogger() 
+    public CompactionLogger.Strategy strategyLogger()
     {
         return new CompactionLogger.Strategy()
         {

http://git-wip-us.apache.org/repos/asf/cassandra/blob/c253ed4f/src/java/org/apache/cassandra/db/compaction/LeveledCompactionStrategy.java
-

[2/3] cassandra git commit: Prevent compaction strategies from looping indefinitely

2017-11-30 Thread paulo

Prevent compaction strategies from looping indefinitely

Patch by Paulo Motta; Reviewed by Marcus Eriksson for CASSANDRA-14079


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c253ed4f
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c253ed4f
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c253ed4f

Branch: refs/heads/trunk
Commit: c253ed4fa7b7b5667879bb41be09fe9658224c4e
Parents: 14e46e4
Author: Paulo Motta 
Authored: Sat Nov 25 01:55:35 2017 +1100
Committer: Paulo Motta 
Committed: Fri Dec 1 05:07:31 2017 +1100

--
 CHANGES.txt |   1 +
 .../DateTieredCompactionStrategy.java   |  16 ++-
 .../compaction/LeveledCompactionStrategy.java   |  22 ++-
 .../db/compaction/LeveledManifest.java  |  22 ++-
 .../SizeTieredCompactionStrategy.java   |  12 ++
 .../TimeWindowCompactionStrategy.java   |  12 ++
 .../AbstractCompactionStrategyTest.java | 144 +++
 7 files changed, 222 insertions(+), 7 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/c253ed4f/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index fc18dc3..ce279f2 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.11.2
+ * Prevent compaction strategies from looping indefinitely (CASSANDRA-14079)
  * Cache disk boundaries (CASSANDRA-13215)
  * Add asm jar to build.xml for maven builds (CASSANDRA-11193)
  * Round buffer size to powers of 2 for the chunk cache (CASSANDRA-13897)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/c253ed4f/src/java/org/apache/cassandra/db/compaction/DateTieredCompactionStrategy.java
--
diff --git a/src/java/org/apache/cassandra/db/compaction/DateTieredCompactionStrategy.java b/src/java/org/apache/cassandra/db/compaction/DateTieredCompactionStrategy.java
index 729ddc0..bb9f4b9 100644
--- a/src/java/org/apache/cassandra/db/compaction/DateTieredCompactionStrategy.java
+++ b/src/java/org/apache/cassandra/db/compaction/DateTieredCompactionStrategy.java
@@ -73,6 +73,7 @@ public class DateTieredCompactionStrategy extends AbstractCompactionStrategy
     @SuppressWarnings("resource")
     public AbstractCompactionTask getNextBackgroundTask(int gcBefore)
     {
+        List<SSTableReader> previousCandidate = null;
         while (true)
         {
             List<SSTableReader> latestBucket = getNextBackgroundSSTables(gcBefore);
@@ -80,9 +81,20 @@ public class DateTieredCompactionStrategy extends AbstractCompactionStrategy
             if (latestBucket.isEmpty())
                 return null;
 
+            // Already tried acquiring references without success. It means there is a race with
+            // the tracker but candidate SSTables were not yet replaced in the compaction strategy manager
+            if (latestBucket.equals(previousCandidate))
+            {
+                logger.warn("Could not acquire references for compacting SSTables {} which is not a problem per se," +
+                            "unless it happens frequently, in which case it must be reported. Will retry later.",
+                            latestBucket);
+                return null;
+            }
+
             LifecycleTransaction modifier = cfs.getTracker().tryModify(latestBucket, OperationType.COMPACTION);
             if (modifier != null)
                 return new CompactionTask(cfs, modifier, gcBefore);
+            previousCandidate = latestBucket;
         }
     }
 
@@ -170,6 +182,8 @@ public class DateTieredCompactionStrategy extends AbstractCompactionStrategy
         // no need to convert to collection if had an Iterables.max(), but not present in standard toolkit, and not worth adding
         List<SSTableReader> list = new ArrayList<>();
         Iterables.addAll(list, cfs.getSSTables(SSTableSet.LIVE));
+        if (list.isEmpty())
+            return 0;
         return Collections.max(list, (o1, o2) -> Long.compare(o1.getMaxTimestamp(), o2.getMaxTimestamp()))
                           .getMaxTimestamp();
     }
@@ -462,7 +476,7 @@ public class DateTieredCompactionStrategy extends AbstractCompactionStrategy
         return uncheckedOptions;
     }
 
-    public CompactionLogger.Strategy strategyLogger() 
+    public CompactionLogger.Strategy strategyLogger()
     {
         return new CompactionLogger.Strategy()
         {

http://git-wip-us.apache.org/repos/asf/cassandra/blob/c253ed4f/src/java/org/apache/cassandra/db/compaction/LeveledCompactionStrategy.java
--
diff --git 
a/src/java/org/apache/cassandra/db/compaction/LeveledCompactionStrategy.java 
b/src/java/org/apache/cassandra/db/compaction/Le

[jira] [Commented] (CASSANDRA-14084) Disks can be imbalanced during replace of same address when using JBOD

2017-11-30 Thread Paulo Motta (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16272977#comment-16272977
 ] 

Paulo Motta commented on CASSANDRA-14084:
-

This situation is reproduced by [this 
dtest|https://github.com/pauloricardomg/cassandra-dtest/commit/1b96dfd855d1b2fc10cbb4cf2e4c95d236ecd951#diff-1ef92939c7765f8c4041bada71208eebR51].

The simple fix is to use normal tokens for replacement nodes with the same 
address:
* [3.11|https://github.com/pauloricardomg/cassandra/tree/3.11-14084]

CI looked clean when this was in CASSANDRA-13948, but I will submit again just 
to make sure this will not cause problems when committed separately.

> Disks can be imbalanced during replace of same address when using JBOD
> --
>
> Key: CASSANDRA-14084
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14084
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Paulo Motta
>Assignee: Paulo Motta
>
> While investigating CASSANDRA-14083, I noticed that [we use the pending 
> ranges to calculate the disk 
> boundaries|https://github.com/apache/cassandra/blob/41904684bb5509595d11f008d0851c7ce625e020/src/java/org/apache/cassandra/db/DiskBoundaryManager.java#L91]
>  when the node is bootstrapping.
> The problem is that when the node is replacing a node with the same address, 
> it [sets itself as normal 
> locally|https://github.com/apache/cassandra/blob/41904684bb5509595d11f008d0851c7ce625e020/src/java/org/apache/cassandra/service/StorageService.java#L1449]
>  (for other unrelated reasons), so the local ranges will be null and 
> consequently the disk boundaries will be null. This will cause the sstables 
> to be randomly spread across disks potentially causing imbalance.
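
To make the consequence concrete, the following toy model shows why missing boundaries lead to arbitrary placement. It is a deliberately simplified sketch: DiskBoundarySketch and diskFor are hypothetical names, tokens are plain long values, and the random fallback only models the "randomly spread" behaviour described above; it is not the real DiskBoundaryManager logic.

```java
import java.util.Random;

// Hypothetical model of JBOD placement; NOT the real DiskBoundaryManager.
final class DiskBoundarySketch
{
    // upperBounds[i] is the last token stored on disk i; null means "no boundaries known".
    static int diskFor(long token, long[] upperBounds, int diskCount, Random random)
    {
        if (upperBounds == null)
            return random.nextInt(diskCount); // no boundaries: placement is arbitrary
        for (int disk = 0; disk < upperBounds.length; disk++)
            if (token <= upperBounds[disk])
                return disk;                  // token falls into this disk's range
        return diskCount - 1;
    }

    public static void main(String[] args)
    {
        long[] bounds = { -100L, 0L, Long.MAX_VALUE };
        Random random = new Random();
        System.out.println(diskFor(-500L, bounds, 3, random)); // prints 0
        System.out.println(diskFor(-50L, bounds, 3, random));  // prints 1
        System.out.println(diskFor(7L, bounds, 3, random));    // prints 2
        // With null boundaries the result is any disk in [0, 3), unrelated to the token.
        System.out.println(diskFor(7L, null, 3, random));
    }
}
```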




[jira] [Updated] (CASSANDRA-14084) Disks can be imbalanced during replace of same address when using JBOD

2017-11-30 Thread Paulo Motta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paulo Motta updated CASSANDRA-14084:

Status: Patch Available  (was: In Progress)





[jira] [Created] (CASSANDRA-14084) Disks can be imbalanced during replace of same address when using JBOD

2017-11-30 Thread Paulo Motta (JIRA)

Paulo Motta created CASSANDRA-14084:
---

 Summary: Disks can be imbalanced during replace of same address 
when using JBOD
 Key: CASSANDRA-14084
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14084
 Project: Cassandra
  Issue Type: Bug
Reporter: Paulo Motta
Assignee: Paulo Motta


While investigating CASSANDRA-14083, I noticed that [we use the pending ranges 
to calculate the disk 
boundaries|https://github.com/apache/cassandra/blob/41904684bb5509595d11f008d0851c7ce625e020/src/java/org/apache/cassandra/db/DiskBoundaryManager.java#L91]
 when the node is bootstrapping.

The problem is that when the node is replacing a node with the same address, it 
[sets itself as normal 
locally|https://github.com/apache/cassandra/blob/41904684bb5509595d11f008d0851c7ce625e020/src/java/org/apache/cassandra/service/StorageService.java#L1449]
 (for other unrelated reasons), so the local ranges will be null and 
consequently the disk boundaries will be null. This will cause the sstables to 
be randomly spread across disks potentially causing imbalance.
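
The failure mode described above can be sketched roughly as follows. This is only an illustration under simplifying assumptions: {{computeBoundaries}} and {{diskFor}} are hypothetical names, not Cassandra's {{DiskBoundaryManager}} API, and a single contiguous local range stands in for the node's real token ranges.

```java
import java.util.Random;

// Illustrative only: these names are hypothetical, not Cassandra's DiskBoundaryManager.
final class DiskBoundarySketch {

    // Returns the exclusive upper token bound of each disk, or null when the
    // node's local range is unknown (the situation the ticket describes).
    static long[] computeBoundaries(long[] localRange, int diskCount) {
        if (localRange == null)
            return null;
        long start = localRange[0];
        long end = localRange[1];
        long step = (end - start) / diskCount;
        long[] bounds = new long[diskCount];
        for (int i = 0; i < diskCount; i++)
            bounds[i] = (i == diskCount - 1) ? end : start + step * (i + 1);
        return bounds;
    }

    // Chooses a disk for an sstable whose first token is `token`.
    static int diskFor(long[] bounds, int diskCount, long token, Random random) {
        if (bounds == null)
            return random.nextInt(diskCount); // no boundaries: placement is arbitrary
        for (int i = 0; i < bounds.length; i++)
            if (token < bounds[i])
                return i;
        return bounds.length - 1;
    }

    public static void main(String[] args) {
        long[] bounds = computeBoundaries(new long[] {0, 300}, 3);
        System.out.println(diskFor(bounds, 3, 150, new Random()));
        System.out.println(diskFor(computeBoundaries(null, 3), 3, 150, new Random()));
    }
}
```

With known ranges the disk is a pure function of the token; with null ranges nothing ties a token to a disk, which is how the imbalance can arise.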




[jira] [Commented] (CASSANDRA-14083) Avoid invalidating disk boundaries unnecessarily

2017-11-30 Thread Paulo Motta (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16272918#comment-16272918
 ] 

Paulo Motta commented on CASSANDRA-14083:
-

After doing the trivial change of only invalidating disk boundaries when the 
replication settings change, 
{{disk_balance_test.py:TestDiskBalance.disk_balance_bootstrap_test}} started 
failing with imbalanced disks.

After investigation, it turned out that when the node starts bootstrapping, it 
doesn't have any information about itself in {{TokenMetadata}}, so the disk 
boundaries will be empty. When the node adds itself to gossip, the cached ring 
version does not change, so the disk boundaries are never invalidated, which 
affects the disk balance.

This test was not failing before this change because during keyspace 
initialization, the disk boundaries were being invalidated by 
[Keyspace.setMetadata|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/Keyspace.java#L187],
 and properly reloaded with the correct boundaries during streaming. However, if 
some consumer read the disk boundaries before they were set by the bootstrap 
process, it would cache an older version.

My simple fix [invalidates the cached 
ring|https://github.com/pauloricardomg/cassandra/commit/fb66c3c451caec936447929f45be3c5f90725a48]
 after the node is added as bootstrapping to gossip, but this also invalidates 
cached rings unnecessarily, only to invalidate the disk boundaries. 
Perhaps we could decouple the cached ring version from the actual ring version 
which takes into account pending node changes (bootstrapping, leaving)?

Patch:
* [3.11|https://github.com/pauloricardomg/cassandra/tree/3.11-14083]

Since this depends on CASSANDRA-13948, I will wait until that is committed 
before setting this as PA.
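
To illustrate the stale-cache behaviour described in this comment, here is a minimal sketch of a value cached against a version number. The class {{VersionedCacheSketch}} is hypothetical and much simpler than the real ring-version handling; it only shows why a change that does not bump the version is never observed.

```java
import java.util.function.Supplier;

// Illustrative only: a value cached against a version number, as a stand-in
// for disk boundaries cached against the ring version.
final class VersionedCacheSketch<T> {
    private long cachedVersion = -1;
    private T cached;

    // Recomputes only when the supplied version differs from the cached one.
    T get(long currentVersion, Supplier<T> compute) {
        if (cached == null || cachedVersion != currentVersion) {
            cached = compute.get();
            cachedVersion = currentVersion;
        }
        return cached;
    }

    public static void main(String[] args) {
        VersionedCacheSketch<String> cache = new VersionedCacheSketch<>();
        String[] ringState = {"empty"};                  // node not yet known to the ring
        System.out.println(cache.get(1, () -> ringState[0]));
        ringState[0] = "boundaries for pending ranges";  // node joined, version unchanged
        System.out.println(cache.get(1, () -> ringState[0]));
        System.out.println(cache.get(2, () -> ringState[0]));
    }
}
```

The second lookup returns the old value because the version is still 1; only the third lookup, with a new version, reloads it. Bumping the version fixes this but also invalidates everything else keyed on it, which is the trade-off raised above.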

> Avoid invalidating disk boundaries unnecessarily
> 
>
> Key: CASSANDRA-14083
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14083
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Paulo Motta
>Assignee: Paulo Motta
>
> We currently invalidate disk boundaries whenever [instantiating a new 
> replication 
> strategy|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/Keyspace.java#L359],
>  but this is done whenever [updating keyspace 
> settings|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/Keyspace.java#L187].
>  
> Computing new boundaries is expensive and unnecessarily invalidating them 
> will cause {{DiskBoundaries}} consumers to also invalidate their work 
> unnecessarily. For instance, after CASSANDRA-13948 the 
> {{CompactionStrategyManager}} will reload all compaction strategies when the 
> boundaries are invalidated.
> In this case, we should only invalidate the disk boundaries when the 
> replication settings change to avoid doing unnecessary work.




[jira] [Updated] (CASSANDRA-14083) Avoid invalidating disk boundaries unnecessarily

2017-11-30 Thread Paulo Motta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paulo Motta updated CASSANDRA-14083:

Issue Type: Improvement  (was: Bug)





[jira] [Updated] (CASSANDRA-14083) Avoid invalidating disk boundaries unnecessarily

2017-11-30 Thread Paulo Motta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paulo Motta updated CASSANDRA-14083:

Description: 
We currently invalidate disk boundaries whenever [instantiating a new 
replication 
strategy|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/Keyspace.java#L359],
 but this is done whenever [updating keyspace 
settings|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/Keyspace.java#L187].
 

Computing new boundaries is expensive and unnecessarily invalidating them will 
cause {{DiskBoundaries}} consumers to also invalidate their work unnecessarily. 
For instance, after CASSANDRA-13948 the {{CompactionStrategyManager}} will 
reload all compaction strategies when the boundaries are invalidated.

In this case, we should only invalidate the disk boundaries when the 
replication settings change to avoid doing unnecessary work.
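
The proposed behaviour could look roughly like the sketch below. It assumes a hypothetical {{KeyspaceSettingsSketch}} class with a replication map and one unrelated setting; a counter stands in for the actual invalidation of cached disk boundaries, and none of this mirrors the real {{Keyspace}} code.

```java
import java.util.Collections;
import java.util.Map;

// Illustrative only: hypothetical settings holder, not Cassandra's Keyspace class.
final class KeyspaceSettingsSketch {
    private Map<String, String> replication;
    private boolean durableWrites;
    private int boundaryInvalidations = 0;

    KeyspaceSettingsSketch(Map<String, String> replication, boolean durableWrites) {
        this.replication = replication;
        this.durableWrites = durableWrites;
    }

    // Applies new settings, invalidating boundaries only if replication changed.
    void update(Map<String, String> newReplication, boolean newDurableWrites) {
        boolean replicationChanged = !replication.equals(newReplication);
        replication = newReplication;
        durableWrites = newDurableWrites;
        if (replicationChanged)
            boundaryInvalidations++; // stands in for invalidating cached disk boundaries
    }

    int boundaryInvalidations() { return boundaryInvalidations; }

    boolean durableWrites() { return durableWrites; }

    public static void main(String[] args) {
        Map<String, String> rf1 = Collections.singletonMap("replication_factor", "1");
        Map<String, String> rf3 = Collections.singletonMap("replication_factor", "3");
        KeyspaceSettingsSketch ks = new KeyspaceSettingsSketch(rf1, true);
        ks.update(rf1, false); // unrelated setting changed
        ks.update(rf3, false); // replication changed
        System.out.println(ks.boundaryInvalidations());
    }
}
```

Comparing the old and new replication settings before invalidating means consumers of the boundaries are only asked to redo their work when it can actually differ.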

> Avoid invalidating disk boundaries unnecessarily
> 
>
> Key: CASSANDRA-14083
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14083
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Paulo Motta
>Assignee: Paulo Motta
>
> We currently invalidate disk boundaries whenever [instantiating a new 
> replication 
> strategy|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/Keyspace.java#L359],
>  but this is done whenever [updating keyspace 
> settings|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/Keyspace.java#L187].
>  
> Computing new boundaries is expensive and unnecessarily invalidating them 
> will cause {{DiskBoundaries}} consumers to also invalidate their work 
> unnecessarily. For instance, after CASSANDRA-13948 the 
> {{CompactionStrategyManager}} will reload all compaction strategies when the 
> boundaries are invalidated.
> In this case, we should only invalidate the disk boundaries when the 
> replication settings change to avoid doing unnecessary work.




[jira] [Created] (CASSANDRA-14083) Avoid invalidating disk boundaries unnecessarily

2017-11-30 Thread Paulo Motta (JIRA)

Paulo Motta created CASSANDRA-14083:
---

 Summary: Avoid invalidating disk boundaries unnecessarily
 Key: CASSANDRA-14083
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14083
 Project: Cassandra
  Issue Type: Bug
Reporter: Paulo Motta
Assignee: Paulo Motta







[jira] [Commented] (CASSANDRA-14082) Do not expose compaction strategy index publicly

2017-11-30 Thread Paulo Motta (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16272656#comment-16272656
 ] 

Paulo Motta commented on CASSANDRA-14082:
-

Currently the scrubber and relocate sstables operations rely on the compaction 
strategy index, so this patch changes these operations to use a 
{{DiskBoundaries}} object instead and makes {{CSM.getCompactionStrategyIndex}} 
private.

* [3.11 patch|https://github.com/pauloricardomg/cassandra/tree/3.11-14082]

Since this depends on CASSANDRA-13948, I will wait until that is committed 
before setting this as PA.

> Do not expose compaction strategy index publicly
> 
>
> Key: CASSANDRA-14082
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14082
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Paulo Motta
>Assignee: Paulo Motta
>
> Before CASSANDRA-13215 we used the compaction strategy index to decide on which 
> disk to place a given sstable, but now we can get this directly from the disk 
> boundary manager and keep the compaction strategy index internal only.
> This will ensure external consumers will use a consistent {{DiskBoundaries}} 
> object to perform operations on multiple disks, rather than risking getting 
> inconsistent indexes if the compaction strategy indexes change between 
> successive calls to {{CSM.getCompactionStrategyIndex}}.




[jira] [Assigned] (CASSANDRA-14082) Do not expose compaction strategy index publicly

2017-11-30 Thread Paulo Motta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paulo Motta reassigned CASSANDRA-14082:
---

Assignee: Paulo Motta





[jira] [Created] (CASSANDRA-14082) Do not expose compaction strategy index publicly

2017-11-30 Thread Paulo Motta (JIRA)

Paulo Motta created CASSANDRA-14082:
---

 Summary: Do not expose compaction strategy index publicly
 Key: CASSANDRA-14082
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14082
 Project: Cassandra
  Issue Type: Bug
Reporter: Paulo Motta


Before CASSANDRA-13215 we used the compaction strategy index to decide on which 
disk to place a given sstable, but now we can get this directly from the disk 
boundary manager and keep the compaction strategy index internal only.

This will ensure external consumers will use a consistent {{DiskBoundaries}} 
object to perform operations on multiple disks, rather than risking getting 
inconsistent indexes if the compaction strategy indexes change between 
successive calls to {{CSM.getCompactionStrategyIndex}}.
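
The consistency argument can be illustrated with an immutable snapshot. In this sketch, which is an assumption for illustration and not the real {{DiskBoundaries}} class, a consumer takes one {{Snapshot}} and uses it for the whole operation, so a concurrent boundary change cannot make two lookups disagree.

```java
import java.util.concurrent.atomic.AtomicReference;

// Illustrative only: hypothetical holder of an immutable boundaries snapshot.
final class BoundariesSnapshotSketch {

    // Immutable view of the boundaries at one point in time.
    static final class Snapshot {
        private final long[] upperBounds;

        Snapshot(long[] upperBounds) { this.upperBounds = upperBounds.clone(); }

        // Index of the first disk whose exclusive upper bound is above the token.
        int diskIndex(long token) {
            for (int i = 0; i < upperBounds.length; i++)
                if (token < upperBounds[i])
                    return i;
            return upperBounds.length - 1;
        }
    }

    private final AtomicReference<Snapshot> current;

    BoundariesSnapshotSketch(long[] upperBounds) {
        current = new AtomicReference<>(new Snapshot(upperBounds));
    }

    // Consumers call this once and keep the result for the whole operation.
    Snapshot snapshot() { return current.get(); }

    // Publishes new boundaries without mutating snapshots already handed out.
    void replace(long[] upperBounds) { current.set(new Snapshot(upperBounds)); }

    public static void main(String[] args) {
        BoundariesSnapshotSketch holder = new BoundariesSnapshotSketch(new long[] {100, 200});
        Snapshot held = holder.snapshot();
        holder.replace(new long[] {50, 200});
        System.out.println(held.diskIndex(75));
        System.out.println(holder.snapshot().diskIndex(75));
    }
}
```

By contrast, asking a mutable manager for an index on every call can return answers from two different boundary layouts within the same operation.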




[jira] [Commented] (CASSANDRA-13948) Reload compaction strategies when JBOD disk boundary changes

2017-11-30 Thread Paulo Motta (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16272641#comment-16272641
 ] 

Paulo Motta commented on CASSANDRA-13948:
-

bq. This ticket is getting quite big and very hard to review

I tried to make things easier by splitting the work into different commits, but I 
agree it became a bit complicated to review.

bq. Could we split out all the pre-existing bugs in other tickets and get them 
committed separately? Especially this as it involves tokenmetadata.

The problem is that some bugs (even though they were pre-existing) only started 
showing up after this, so they have a dependency on this. 

I reorganized [this 
branch|https://github.com/pauloricardomg/cassandra/tree/3.11-13948] to keep 
only things essential to this ticket, created CASSANDRA-14079 and 
CASSANDRA-14081 with unrelated minor fixes, and will create two follow-up 
tickets which depend on this.

This should be ready for review now. Please let me know if any of the changes 
are unclear and need a better explanation. CI looked clean before the 
reorganization, but I will resubmit with the essential ticket just to make sure 
we didn't miss anything:

* [3.11 patch|https://github.com/pauloricardomg/cassandra/tree/3.11-13948]
* [dtest|https://github.com/pauloricardomg/cassandra-dtest/tree/13948]

> Reload compaction strategies when JBOD disk boundary changes
> 
>
> Key: CASSANDRA-13948
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13948
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Paulo Motta
>Assignee: Paulo Motta
> Fix For: 3.11.x, 4.x
>
> Attachments: debug.log, dtest13948.png, dtest2.png, 
> threaddump-cleanup.txt, threaddump.txt, trace.log
>
>
> The thread dump below shows a race between an sstable replacement by the 
> {{IndexSummaryRedistribution}} and 
> {{AbstractCompactionTask.getNextBackgroundTask}}:
> {noformat}
> Thread 94580: (state = BLOCKED)
>  - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may be imprecise)
>  - java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, line=175 (Compiled frame)
>  - java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt() @bci=1, line=836 (Compiled frame)
>  - java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(java.util.concurrent.locks.AbstractQueuedSynchronizer$Node, int) @bci=67, line=870 (Compiled frame)
>  - java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(int) @bci=17, line=1199 (Compiled frame)
>  - java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock() @bci=5, line=943 (Compiled frame)
>  - org.apache.cassandra.db.compaction.CompactionStrategyManager.handleListChangedNotification(java.lang.Iterable, java.lang.Iterable) @bci=359, line=483 (Interpreted frame)
>  - org.apache.cassandra.db.compaction.CompactionStrategyManager.handleNotification(org.apache.cassandra.notifications.INotification, java.lang.Object) @bci=53, line=555 (Interpreted frame)
>  - org.apache.cassandra.db.lifecycle.Tracker.notifySSTablesChanged(java.util.Collection, java.util.Collection, org.apache.cassandra.db.compaction.OperationType, java.lang.Throwable) @bci=50, line=409 (Interpreted frame)
>  - org.apache.cassandra.db.lifecycle.LifecycleTransaction.doCommit(java.lang.Throwable) @bci=157, line=227 (Interpreted frame)
>  - org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.commit(java.lang.Throwable) @bci=61, line=116 (Compiled frame)
>  - org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.commit() @bci=2, line=200 (Interpreted frame)
>  - org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.finish() @bci=5, line=185 (Interpreted frame)
>  - org.apache.cassandra.io.sstable.IndexSummaryRedistribution.redistributeSummaries() @bci=559, line=130 (Interpreted frame)
>  - org.apache.cassandra.db.compaction.CompactionManager.runIndexSummaryRedistribution(org.apache.cassandra.io.sstable.IndexSummaryRedistribution) @bci=9, line=1420 (Interpreted frame)
>  - org.apache.cassandra.io.sstable.IndexSummaryManager.redistributeSummaries(org.apache.cassandra.io.sstable.IndexSummaryRedistribution) @bci=4, line=250 (Interpreted frame)
>  - org.apache.cassandra.io.sstable.IndexSummaryManager.redistributeSummaries() @bci=30, line=228 (Interpreted frame)
>  - org.apache.cassandra.io.sstable.IndexSummaryManager$1.runMayThrow() @bci=4, line=125 (Interpreted frame)
>  - org.apache.cassandra.utils.WrappedRunnable.run() @bci=1, line=28 (Interpreted frame)
>  - org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$Uncomplaini

[jira] [Commented] (CASSANDRA-14081) Remove AbstractCompactionStrategy.replaceFlushed

2017-11-30 Thread Marcus Eriksson (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16272625#comment-16272625
 ] 

Marcus Eriksson commented on CASSANDRA-14081:
-

+1 to remove this in trunk (this was added to give 3rd party compaction 
strategies more control, but I doubt it is needed anymore).

Should we remove {{ACS#getMemtableReservedSize()}} and 
{{ACS#isAffectedByMeteredFlusher()}} at the same time?


> Remove AbstractCompactionStrategy.replaceFlushed
> 
>
> Key: CASSANDRA-14081
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14081
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Paulo Motta
>Assignee: Paulo Motta
>Priority: Minor
>
> I didn't find a reason why we need to send flush notifications from CFS 
> -> CSM -> Tracker, if we can bypass the CSM and send directly to the tracker 
> from the CFS (and handle it on the CSM via {{SSTableAddedNotification}}).




[jira] [Commented] (CASSANDRA-14081) Remove AbstractCompactionStrategy.replaceFlushed

2017-11-30 Thread Paulo Motta (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16272603#comment-16272603
 ] 

Paulo Motta commented on CASSANDRA-14081:
-

Trivial patch 
[here|https://github.com/pauloricardomg/cassandra/tree/trunk-14081]. 

CI looked clean when this was in CASSANDRA-13948, but I will submit again just 
to make sure this will not cause problems when committed separately on trunk.





[jira] [Updated] (CASSANDRA-14081) Remove AbstractCompactionStrategy.replaceFlushed

2017-11-30 Thread Paulo Motta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paulo Motta updated CASSANDRA-14081:

Status: Patch Available  (was: Open)





[jira] [Created] (CASSANDRA-14081) Remove AbstractCompactionStrategy.replaceFlushed

2017-11-30 Thread Paulo Motta (JIRA)

Paulo Motta created CASSANDRA-14081:
---

 Summary: Remove AbstractCompactionStrategy.replaceFlushed
 Key: CASSANDRA-14081
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14081
 Project: Cassandra
  Issue Type: Improvement
Reporter: Paulo Motta
Assignee: Paulo Motta
Priority: Minor


I didn't find a reason why we need to send flush notifications from CFS -> 
CSM -> Tracker, if we can bypass the CSM and send directly to the tracker from 
the CFS (and handle it on the CSM via {{SSTableAddedNotification}}).
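
A rough sketch of the simplified flow, with hypothetical classes that only borrow the roles named above (store, tracker, strategy manager): the flushing side notifies the tracker directly, and the strategy manager reacts as an ordinary subscriber to the "sstable added" notification.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Illustrative only: hypothetical classes, not Cassandra's Tracker or CSM.
final class FlushNotificationSketch {

    // Notification published when a flushed sstable becomes live.
    static final class SSTableAdded {
        final String name;
        SSTableAdded(String name) { this.name = name; }
    }

    // Publishes notifications to every subscriber.
    static final class Tracker {
        private final List<Consumer<SSTableAdded>> subscribers = new ArrayList<>();

        void subscribe(Consumer<SSTableAdded> subscriber) { subscribers.add(subscriber); }

        void notifyAdded(String sstable) {
            SSTableAdded notification = new SSTableAdded(sstable);
            for (Consumer<SSTableAdded> subscriber : subscribers)
                subscriber.accept(notification);
        }
    }

    // Learns about new sstables only through the tracker's notifications.
    static final class StrategyManager {
        final List<String> tracked = new ArrayList<>();

        void handle(SSTableAdded notification) { tracked.add(notification.name); }
    }

    // The store talks to the tracker directly; no detour through the manager.
    static void flush(Tracker tracker, String sstable) { tracker.notifyAdded(sstable); }

    public static void main(String[] args) {
        Tracker tracker = new Tracker();
        StrategyManager manager = new StrategyManager();
        tracker.subscribe(manager::handle);
        flush(tracker, "sstable-1");
        System.out.println(manager.tracked);
    }
}
```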




[jira] [Commented] (CASSANDRA-13987) Multithreaded commitlog subtly changed durability

2017-11-30 Thread Jason Brown (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16272601#comment-16272601
 ] 

Jason Brown commented on CASSANDRA-13987:
-

Spent about a week tracking down a race condition, and thankfully it was just a 
stupid bug, which is now fixed. I've also backported to 3.0 and 3.11.

||3.0||3.11||trunk||
|[branch|https://github.com/jasobrown/cassandra/tree/13987-3.0]|[branch|https://github.com/jasobrown/cassandra/tree/13987-3.11]|[branch|https://github.com/jasobrown/cassandra/tree/commitlog_mmap-more-frequent-markers]|
|[utests & dtests|https://circleci.com/gh/jasobrown/cassandra/tree/13987-3.0]|[utests & dtests|https://circleci.com/gh/jasobrown/cassandra/tree/13987-3.11]|[utests & dtests|https://circleci.com/gh/jasobrown/cassandra/tree/commitlog_mmap-more-frequent-markers]|
||

The trunk branch is a continuation of the previous development branch, while 
the 3.0/3.11 branches are squashed backports. 3.11 is trivially close to the 
trunk code (minor compilation fixes were needed), but 3.0 required a bit more 
work.

utests look good across the branches, and I'm waiting for the dtests to finish. 
Note: I'm using an updated circleci config which won't be committed into the 
apache repo.

bq. Should we move the call to writeCDCIndexFile ...

Yup, certainly seems the correct thing to do ;) I addressed the other nits, as 
well.

While running the utests on circleci, there were some failures related to 
not forcing the flush when shutting down the {{AbstractCommitLogService}}. 
That's fixed in the latest commit. 

bq. do you think it's worth adding a unit test or two for this?

Yes, and I've added {{CommitLogChainedMarkersTest}}. Looking at it now, perhaps 
it could do with a better name and/or a comment at the top of the file 
explaining what, specifically, it's testing.

Also, I've fixed a few minor things for a few tests. 

- {{AbstractCommitLogService#requestExtraSync()}} was correctly unparking the 
sync thread, but commit log data was not being flushed to disk. Thus, I added a 
volatile boolean to the class for {{#requestExtraSync()}} to indicate that a 
sync should happen. This is mostly to support batch commit log mode.
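
The flag-plus-unpark pattern described in the bullet above can be sketched as follows. This is a simplified assumption, not the actual {{AbstractCommitLogService}} code: {{SyncServiceSketch}} and {{runOnce}} are hypothetical, time is passed in explicitly, and a counter stands in for the flush.

```java
import java.util.concurrent.locks.LockSupport;

// Illustrative only: not the AbstractCommitLogService implementation.
final class SyncServiceSketch {
    private final long syncPeriodNanos;
    private long lastSyncNanos = 0;
    private int syncCount = 0;
    private volatile boolean syncRequested = false; // set by requestExtraSync()
    private volatile Thread syncThread;             // stays null in this sketch

    SyncServiceSketch(long syncPeriodNanos) { this.syncPeriodNanos = syncPeriodNanos; }

    // Records the request first, then wakes the sync thread if there is one.
    void requestExtraSync() {
        syncRequested = true;
        LockSupport.unpark(syncThread); // a null thread makes this a no-op
    }

    // One iteration of the sync loop; returns true if a sync was performed.
    boolean runOnce(long nowNanos) {
        boolean periodElapsed = nowNanos - lastSyncNanos >= syncPeriodNanos;
        if (!periodElapsed && !syncRequested)
            return false;
        syncRequested = false;
        lastSyncNanos = nowNanos;
        syncCount++; // stands in for flushing the commit log to disk
        return true;
    }

    int syncCount() { return syncCount; }

    public static void main(String[] args) {
        SyncServiceSketch service = new SyncServiceSketch(10_000);
        System.out.println(service.runOnce(5_000));
        service.requestExtraSync();
        System.out.println(service.runOnce(6_000));
    }
}
```

Waking the thread alone is not enough if the loop only syncs when the period has elapsed; the volatile flag gives the woken thread a reason to sync immediately.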

> Multithreaded commitlog subtly changed durability
> -
>
> Key: CASSANDRA-13987
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13987
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Jason Brown
>Assignee: Jason Brown
> Fix For: 4.x
>
>
> When multithreaded commitlog was introduced in CASSANDRA-3578, we subtly 
> changed the way that commitlog durability worked. Everything still gets 
> written to an mmap file. However, not everything is replayable from the 
> mmaped file after a process crash, in periodic mode.
> In brief, the reason this changed is the chained markers that are 
> required for the multithreaded commit log. At each msync, we wait for 
> outstanding mutations to serialize into the commitlog, and update a marker 
> before and after the commits that have accumulated since the last sync. With 
> those markers, we can safely replay that section of the commitlog. Without 
> the markers, we have no guarantee that the commits in that section were 
> successfully written, thus we abandon those commits on replay.
> If you have correlated process failures of multiple nodes at "nearly" the 
> same time (see ["There Is No 
> Now"|http://queue.acm.org/detail.cfm?id=2745385]), it is possible to have 
> data loss if none of the nodes msync the commitlog. For example, with RF=3, 
> if quorum write succeeds on two nodes (and we acknowledge the write back to 
> the client), and then the process on both nodes OOMs (say, due to reading the 
> index for a 100GB partition), the write will be lost if neither process 
> msync'ed the commitlog. More exactly, the commitlog cannot be fully replayed. 
> The reason why this data is silently lost is due to the chained markers that 
> were introduced with CASSANDRA-3578.
> The problem we are addressing with this ticket is incrementally improving 
> 'durability' due to process crash, not host crash. (Note: operators should 
> use batch mode to ensure greater durability, but batch mode in its current 
> implementation is a) borked, and b) will burn through, *very* rapidly, SSDs 
> that don't have a non-volatile write cache sitting in front.) 
> The current default for {{commitlog_sync_period_in_ms}} is 10 seconds, which 
> means that a node could lose up to ten seconds of data due to process crash. 
> The unfortunate thing is that the data is still available, in the mmap file, 
> but we can't replay it due to incomplete chained markers.
> ftr, I don't believe we've ever had a stated policy about commitlog 
> durability wrt process crash. Pre-2.0 we naturally pigg

[jira] [Commented] (CASSANDRA-14080) Handling 0 size hint files during start

2017-11-30 Thread Aleksey Yeschenko (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16272592#comment-16272592
 ] 

Aleksey Yeschenko commented on CASSANDRA-14080:
---

Sorry. I don't mean that you shouldn't file/have filed this JIRA. Just saying 
that the similar one we closed recently-ish might have some useful context, so 
you might want to look it up and link to this one.

> Handling 0 size hint files during start
> ---
>
> Key: CASSANDRA-14080
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14080
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Aleksandr Ivanov
>
> Continuation of CASSANDRA-12728 bug.
> Problem: Cassandra didn't start due to 0 size hints files
> Log from v3.0.14:
> {code:java}
> INFO  [main] 2017-11-28 19:10:13,554 StorageService.java:575 - Cassandra version: 3.0.14
> INFO  [main] 2017-11-28 19:10:13,555 StorageService.java:576 - Thrift API version: 20.1.0
> INFO  [main] 2017-11-28 19:10:13,555 StorageService.java:577 - CQL supported versions: 3.4.0 (default: 3.4.0)
> ERROR [main] 2017-11-28 19:10:13,592 CassandraDaemon.java:710 - Exception encountered during startup
> org.apache.cassandra.io.FSReadError: java.io.EOFException
>     at org.apache.cassandra.hints.HintsDescriptor.readFromFile(HintsDescriptor.java:142) ~[apache-cassandra-3.0.14.jar:3.0.14]
>     at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) ~[na:1.8.0_141]
>     at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175) ~[na:1.8.0_141]
>     at java.util.Iterator.forEachRemaining(Iterator.java:116) ~[na:1.8.0_141]
>     at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801) ~[na:1.8.0_141]
>     at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481) ~[na:1.8.0_141]
>     at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471) ~[na:1.8.0_141]
>     at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708) ~[na:1.8.0_141]
>     at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[na:1.8.0_141]
>     at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499) ~[na:1.8.0_141]
>     at org.apache.cassandra.hints.HintsCatalog.load(HintsCatalog.java:65) ~[apache-cassandra-3.0.14.jar:3.0.14]
>     at org.apache.cassandra.hints.HintsService.(HintsService.java:88) ~[apache-cassandra-3.0.14.jar:3.0.14]
>     at org.apache.cassandra.hints.HintsService.(HintsService.java:63) ~[apache-cassandra-3.0.14.jar:3.0.14]
>     at org.apache.cassandra.service.StorageProxy.(StorageProxy.java:121) ~[apache-cassandra-3.0.14.jar:3.0.14]
>     at java.lang.Class.forName0(Native Method) ~[na:1.8.0_141]
>     at java.lang.Class.forName(Class.java:264) ~[na:1.8.0_141]
>     at org.apache.cassandra.service.StorageService.initServer(StorageService.java:585) ~[apache-cassandra-3.0.14.jar:3.0.14]
>     at org.apache.cassandra.service.StorageService.initServer(StorageService.java:570) ~[apache-cassandra-3.0.14.jar:3.0.14]
>     at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:346) [apache-cassandra-3.0.14.jar:3.0.14]
>     at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:569) [apache-cassandra-3.0.14.jar:3.0.14]
>     at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:697) [apache-cassandra-3.0.14.jar:3.0.14]
> Caused by: java.io.EOFException: null
>     at java.io.RandomAccessFile.readInt(RandomAccessFile.java:803) ~[na:1.8.0_141]
>     at org.apache.cassandra.hints.HintsDescriptor.deserialize(HintsDescriptor.java:237) ~[apache-cassandra-3.0.14.jar:3.0.14]
>     at org.apache.cassandra.hints.HintsDescriptor.readFromFile(HintsDescriptor.java:138) ~[apache-cassandra-3.0.14.jar:3.0.14]
>     ... 20 common frames omitted
> {code}
> After deleting several 0 size hints files, Cassandra started successfully.
> Jeff Jirsa added a comment - Yesterday
> Aleksandr Ivanov can you open a new JIRA and link it back to this one? It's 
> possible that the original patch didn't consider 0 byte files (I don't have 
> time to go back and look at the commit, and it was long enough ago that I've 
> forgotten) - were all of your files 0 bytes?
> Not all; 8..10 of the hints files had 0 size.
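
The {{EOFException}} in the trace above is what {{RandomAccessFile.readInt}} raises when fewer than four bytes are available, which is always the case for an empty file. The sketch below reproduces that and shows one possible guard, skipping zero-length files before reading them. The class and method names are hypothetical, and this is not necessarily the fix adopted for this ticket.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative only: hypothetical helper, not Cassandra's HintsCatalog or HintsDescriptor.
final class HintsLoadSketch {

    // Reads a 4-byte header; throws EOFException if the file is shorter than that.
    static int readHeader(File file) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            return raf.readInt();
        }
    }

    // Keeps only files that contain at least one byte.
    static List<File> loadable(List<File> candidates) {
        List<File> result = new ArrayList<>();
        for (File candidate : candidates)
            if (candidate.length() > 0)
                result.add(candidate);
        return result;
    }

    public static void main(String[] args) throws IOException {
        File empty = File.createTempFile("sketch", ".hints");
        empty.deleteOnExit();
        System.out.println(loadable(Collections.singletonList(empty)).size());
    }
}
```

Whether empty files should be skipped, deleted, or reported is a policy decision that this sketch leaves open.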



