[jira] [Commented] (CASSANDRA-12971) Add CAS option to WRITE test to stress tool
[ https://issues.apache.org/jira/browse/CASSANDRA-12971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273986#comment-16273986 ] Vladimir Yudovin commented on CASSANDRA-12971: -- I guess yes. > Add CAS option to WRITE test to stress tool > --- > > Key: CASSANDRA-12971 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12971 > Project: Cassandra > Issue Type: New Feature > Components: Stress, Tools >Reporter: Vladimir Yudovin >Assignee: Vladimir Yudovin > Attachments: stress-cass.patch > > > If the -cas option is present, each UPDATE is performed with an IF condition that is always true, > so the data is inserted anyway. > It's implemented; if it's needed, I'll proceed with the patch. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14078) Fix dTest test_bulk_round_trip_blogposts_with_max_connections
[ https://issues.apache.org/jira/browse/CASSANDRA-14078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273946#comment-16273946 ] Kurt Greaves commented on CASSANDRA-14078: -- Could we not do this test by just setting a low {{native_transport_max_concurrent_connections}} on one node and a much higher value on the other nodes, so that we only trigger a fail-over on that one node? That way we aren't relying on completely overloading the cluster just to test this. > Fix dTest test_bulk_round_trip_blogposts_with_max_connections > - > > Key: CASSANDRA-14078 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14078 > Project: Cassandra > Issue Type: Test > Components: Testing >Reporter: Jaydeepkumar Chovatia >Assignee: Jaydeepkumar Chovatia >Priority: Minor > > This ticket concerns the following dTest: > {{cqlsh_tests.cqlsh_copy_tests.CqlshCopyTest.test_bulk_round_trip_blogposts_with_max_connections}} > This test tries to limit the number of client connections and assumes that, once the connection limit has been reached, the client will fail over to another node and perform the request there. The problem is that the test case is not deterministic: it depends entirely on the hardware it runs on, timing, etc. > For example, if we look at > https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-trunk-dtest/353/testReport/cqlsh_tests.cqlsh_copy_tests/CqlshCopyTest/test_bulk_round_trip_blogposts_with_max_connections/
> {quote}
> ...
> Processed: 5000 rows; Rate:2551 rows/s; Avg. rate:2551 rows/s
> All replicas busy, sleeping for 4 second(s)...
> Processed: 1 rows; Rate:2328 rows/s; Avg. rate:2307 rows/s
> All replicas busy, sleeping for 1 second(s)...
> Processed: 15000 rows; Rate:2137 rows/s; Avg. rate:2173 rows/s
> All replicas busy, sleeping for 11 second(s)...
> Processed: 2 rows; Rate:2138 rows/s; Avg. rate:2164 rows/s
> Processed: 25000 rows; Rate:2403 rows/s; Avg. rate:2249 rows/s
> Processed: 3 rows; Rate:2582 rows/s; Avg. rate:2321 rows/s
> Processed: 35000 rows; Rate:2835 rows/s; Avg. rate:2406 rows/s
> Processed: 4 rows; Rate:2867 rows/s; Avg. rate:2458 rows/s
> Processed: 45000 rows; Rate:3163 rows/s; Avg. rate:2540 rows/s
> Processed: 5 rows; Rate:3200 rows/s; Avg. rate:2596 rows/s
> Processed: 50234 rows; Rate:2032 rows/s; Avg. rate:2572 rows/s
> All replicas busy, sleeping for 23 second(s)...
> Replicas too busy, given up
> ...
> {quote}
> Here we can see the request timing out: sometimes it resumes after 1 second, the next time after 11 seconds, and sometimes it doesn't recover at all. > In my opinion this test is not a good fit for dTest, as dTests should be deterministic.
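To make the non-determinism described in the ticket easier to see, the following is a minimal Python sketch that extracts the back-off durations from the quoted cqlsh COPY output. The helper names (`backoff_seconds`, `gave_up`) are hypothetical and not part of the dtest suite; the log lines are copied from the excerpt above and shortened to the relevant ones.

```python
import re

# Lines copied from the output quoted in the ticket (only the relevant ones).
LOG = """\
Processed: 5000 rows; Rate:2551 rows/s; Avg. rate:2551 rows/s
All replicas busy, sleeping for 4 second(s)...
All replicas busy, sleeping for 1 second(s)...
All replicas busy, sleeping for 11 second(s)...
Processed: 50234 rows; Rate:2032 rows/s; Avg. rate:2572 rows/s
All replicas busy, sleeping for 23 second(s)...
Replicas too busy, given up
"""

# Matches the back-off message and captures the number of seconds.
SLEEP_RE = re.compile(r"All replicas busy, sleeping for (\d+) second\(s\)")


def backoff_seconds(log_text):
    """Return every back-off duration (in seconds) found in the log, in order."""
    return [int(m.group(1)) for m in SLEEP_RE.finditer(log_text)]


def gave_up(log_text):
    """True if the copy eventually gave up because replicas stayed busy."""
    return "Replicas too busy, given up" in log_text


if __name__ == "__main__":
    print(backoff_seconds(LOG), gave_up(LOG))
```

The spread of back-off values within a single run, ending in a give-up, is the timing dependence the reporter objects to; a test whose outcome depends on such values cannot be expected to pass consistently across hardware.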
[jira] [Commented] (CASSANDRA-13873) Ref bug in Scrub
[ https://issues.apache.org/jira/browse/CASSANDRA-13873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273882#comment-16273882 ] Joel Knighton commented on CASSANDRA-13873: --- Thanks for the patches and CI. Both your remarks look correct to me; frankly, I have no idea how I missed that in anticompaction. Test results look good for the most part. There are a few flaky unit tests on 3.0/3.11 that appear to have failed the same way before the patch, pass for me locally, and appear to be at the limits of CircleCI's timeouts/resources. The 2.2 dtests timed out, so it seems worthwhile to trigger those again just in case. The only unusual failures on the 3.0 dtests are a bunch of tests where Jolokia failed to attach for JMX. I'm not sure if this is a known environmental problem on ASF dtests, but I was unable to reproduce this elsewhere. Overall, +1 to the patch from me, and this looks good to merge if none of the test issues I raised above worry you. > Ref bug in Scrub > > > Key: CASSANDRA-13873 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13873 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: T Jake Luciani >Assignee: Marcus Eriksson > Fix For: 2.2.x, 3.0.x, 3.11.x, 4.x > > > I'm hitting a Ref bug when many scrubs run against a node. This doesn't > happen on 3.0.X. I'm not sure whether this also happens with compactions, > but I suspect it does. > I'm not seeing any Ref leaks or double frees. 
> To Reproduce: > {quote} > ./tools/bin/cassandra-stress write n=10m -rate threads=100 > ./bin/nodetool scrub > #Ctrl-C > ./bin/nodetool scrub > #Ctrl-C > ./bin/nodetool scrub > #Ctrl-C > ./bin/nodetool scrub > {quote} > Eventually in the logs you get: > WARN [RMI TCP Connection(4)-127.0.0.1] 2017-09-14 15:51:26,722 > NoSpamLogger.java:97 - Spinning trying to capture readers > [BigTableReader(path='/home/jake/workspace/cassandra2/data/data/keyspace1/standard1-2eb5c780998311e79e09311efffdcd17/mc-5-big-Data.db'), > > BigTableReader(path='/home/jake/workspace/cassandra2/data/data/keyspace1/standard1-2eb5c780998311e79e09311efffdcd17/mc-32-big-Data.db'), > > BigTableReader(path='/home/jake/workspace/cassandra2/data/data/keyspace1/standard1-2eb5c780998311e79e09311efffdcd17/mc-31-big-Data.db'), > > BigTableReader(path='/home/jake/workspace/cassandra2/data/data/keyspace1/standard1-2eb5c780998311e79e09311efffdcd17/mc-29-big-Data.db'), > > BigTableReader(path='/home/jake/workspace/cassandra2/data/data/keyspace1/standard1-2eb5c780998311e79e09311efffdcd17/mc-27-big-Data.db'), > > BigTableReader(path='/home/jake/workspace/cassandra2/data/data/keyspace1/standard1-2eb5c780998311e79e09311efffdcd17/mc-26-big-Data.db'), > > BigTableReader(path='/home/jake/workspace/cassandra2/data/data/keyspace1/standard1-2eb5c780998311e79e09311efffdcd17/mc-20-big-Data.db')], > *released: > [BigTableReader(path='/home/jake/workspace/cassandra2/data/data/keyspace1/standard1-2eb5c780998311e79e09311efffdcd17/mc-5-big-Data.db')],* > > This released table has a selfRef of 0 but is in the Tracker -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Assigned] (CASSANDRA-13873) Ref bug in Scrub
[ https://issues.apache.org/jira/browse/CASSANDRA-13873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Knighton reassigned CASSANDRA-13873: - Assignee: Marcus Eriksson (was: Joel Knighton) > Ref bug in Scrub > > > Key: CASSANDRA-13873 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13873 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: T Jake Luciani >Assignee: Marcus Eriksson > Fix For: 2.2.x, 3.0.x, 3.11.x, 4.x
[jira] [Updated] (CASSANDRA-13873) Ref bug in Scrub
[ https://issues.apache.org/jira/browse/CASSANDRA-13873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Knighton updated CASSANDRA-13873: -- Reviewer: Joel Knighton (was: Marcus Eriksson) > Ref bug in Scrub > > > Key: CASSANDRA-13873 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13873 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: T Jake Luciani >Assignee: Marcus Eriksson > Fix For: 2.2.x, 3.0.x, 3.11.x, 4.x
[jira] [Updated] (CASSANDRA-13010) nodetool compactionstats should say which disk a compaction is writing to
[ https://issues.apache.org/jira/browse/CASSANDRA-13010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Lourie updated CASSANDRA-13010: Attachment: (was: Pasted image at 2017_12_01 11_44 AM.png) > nodetool compactionstats should say which disk a compaction is writing to > - > > Key: CASSANDRA-13010 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13010 > Project: Cassandra > Issue Type: New Feature > Components: Compaction, Tools >Reporter: Jon Haddad >Assignee: Alex Lourie > Labels: lhf > Attachments: 13010.patch, cleanup.png, multiple operations.png
[jira] [Issue Comment Deleted] (CASSANDRA-13010) nodetool compactionstats should say which disk a compaction is writing to
[ https://issues.apache.org/jira/browse/CASSANDRA-13010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Lourie updated CASSANDRA-13010: Comment: was deleted (was: MultipleDirectoriesScreenshot) > nodetool compactionstats should say which disk a compaction is writing to > - > > Key: CASSANDRA-13010 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13010 > Project: Cassandra > Issue Type: New Feature > Components: Compaction, Tools >Reporter: Jon Haddad >Assignee: Alex Lourie > Labels: lhf > Attachments: 13010.patch, Pasted image at 2017_12_01 11_44 AM.png, > cleanup.png, multiple operations.png
[jira] [Updated] (CASSANDRA-13010) nodetool compactionstats should say which disk a compaction is writing to
[ https://issues.apache.org/jira/browse/CASSANDRA-13010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Lourie updated CASSANDRA-13010: Attachment: Pasted image at 2017_12_01 11_44 AM.png MultipleDirectoriesScreenshot > nodetool compactionstats should say which disk a compaction is writing to > - > > Key: CASSANDRA-13010 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13010 > Project: Cassandra > Issue Type: New Feature > Components: Compaction, Tools >Reporter: Jon Haddad >Assignee: Alex Lourie > Labels: lhf > Attachments: 13010.patch, Pasted image at 2017_12_01 11_44 AM.png, > cleanup.png, multiple operations.png
[jira] [Commented] (CASSANDRA-13010) nodetool compactionstats should say which disk a compaction is writing to
[ https://issues.apache.org/jira/browse/CASSANDRA-13010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273867#comment-16273867 ] Alex Lourie commented on CASSANDRA-13010: - [~rustyrazorblade] I've got back to working on this ticket. I think I've covered all possible operations and the patch is now in good shape. I've tested it with compactions (including split and user-defined), repair, scrub and cleanup operations; I also tested with multiple data directories. It looks OK for all of them; here are a couple of screenshots: [^cleanup.png] [^multiple operations.png] I think that the patch is ready for review on GitHub (https://github.com/apache/cassandra/compare/trunk...alourie:CASSANDRA-13010) or as a patch [^13010.patch]. I would appreciate any feedback. Thanks. > nodetool compactionstats should say which disk a compaction is writing to > - > > Key: CASSANDRA-13010 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13010 > Project: Cassandra > Issue Type: New Feature > Components: Compaction, Tools >Reporter: Jon Haddad >Assignee: Alex Lourie > Labels: lhf > Attachments: 13010.patch, cleanup.png, multiple operations.png
[jira] [Commented] (CASSANDRA-13530) GroupCommitLogService
[ https://issues.apache.org/jira/browse/CASSANDRA-13530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273855#comment-16273855 ] Jason Brown commented on CASSANDRA-13530: - OK, I've reverted the patch that lengthened the long tests' timeout, and refactored {{CommitLogStressTest}} properly. For each commit log mode, I've created a subclass of {{CommitLogStress}} and they run (obviously) in ~1/3 the time of the test with all of the modes. The only thing I wasn't sure about was the {{main()}} function in {{CommitLogStressTest}}. I think it's there for convenience, but I'm not sure what it's convenient for. I'm all for removing it, as there's no infra that depends on it and its behavior was exactly the same as running the long-test. wdyt? > GroupCommitLogService > - > > Key: CASSANDRA-13530 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13530 > Project: Cassandra > Issue Type: Improvement >Reporter: Yuji Ito >Assignee: Yuji Ito > Fix For: 2.2.x, 3.0.x, 3.11.x > > Attachments: GuavaRequestThread.java, MicroRequestThread.java, > groupAndBatch.png, groupCommit22.patch, groupCommit30.patch, > groupCommit3x.patch, groupCommitLog_noSerial_result.xlsx, > groupCommitLog_result.xlsx > > > I propose a new CommitLogService, GroupCommitLogService, to improve throughput when many requests are received. > It improved throughput by up to 94%. > I'd like to discuss this CommitLogService. > Currently, we can choose between two CommitLog services: Periodic and Batch. > In Periodic, we might lose commit log entries that haven't been written to disk yet. > In Batch, the commit log is written to disk every time, but each write is small (< 4KB). Under high concurrency, these writes are gathered and persisted to disk at once. Under insufficient concurrency, however, many small writes are issued and performance drops because of disk latency. Even on SSD, processing many I/O commands reduces performance. > GroupCommitLogService writes several commit log entries to disk at once. > The patch adds GroupCommitLogService (enabled by setting `commitlog_sync` and `commitlog_sync_group_window_in_ms` in cassandra.yaml). > The only difference from Batch is the wait on the semaphore. > By waiting on the semaphore, several commit log writes are executed at the same time. > With GroupCommitLogService, latency becomes worse if there is no concurrency. > I measured the performance with my microbenchmark (MicroRequestThread.java) while increasing the number of threads. The cluster has 3 nodes (replication factor: 3). Each node is an AWS EC2 m4.large instance with a 200 IOPS io1 volume. > The results are below. The GroupCommitLogService with a 10ms window improved UPDATE with Paxos by 94% and SELECT with Paxos by 76%.
> h6. SELECT / sec
> ||\# of threads||Batch 2ms||Group 10ms||
> |1|192|103|
> |2|163|212|
> |4|264|416|
> |8|454|800|
> |16|744|1311|
> |32|1151|1481|
> |64|1767|1844|
> |128|2949|3011|
> |256|4723|5000|
> h6. UPDATE / sec
> ||\# of threads||Batch 2ms||Group 10ms||
> |1|45|26|
> |2|39|51|
> |4|58|102|
> |8|102|198|
> |16|167|213|
> |32|289|295|
> |64|544|548|
> |128|1046|1058|
> |256|2020|2061|
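As a quick check of the 94% and 76% figures quoted in the ticket, here is a small Python sketch that computes the relative throughput change of Group 10ms over Batch 2ms from the two tables. The numbers are copied from the tables; the function and variable names are only illustrative.

```python
# threads -> (Batch 2ms ops/sec, Group 10ms ops/sec), copied from the ticket's tables.
SELECT_PER_SEC = {
    1: (192, 103), 2: (163, 212), 4: (264, 416), 8: (454, 800), 16: (744, 1311),
    32: (1151, 1481), 64: (1767, 1844), 128: (2949, 3011), 256: (4723, 5000),
}
UPDATE_PER_SEC = {
    1: (45, 26), 2: (39, 51), 4: (58, 102), 8: (102, 198), 16: (167, 213),
    32: (289, 295), 64: (544, 548), 128: (1046, 1058), 256: (2020, 2061),
}


def improvement_pct(batch, group):
    """Relative throughput change of Group over Batch, in percent."""
    return (group - batch) / batch * 100.0


def best_improvement(table):
    """Return (thread_count, improvement in percent) for the best row of a table."""
    threads = max(table, key=lambda t: improvement_pct(*table[t]))
    return threads, improvement_pct(*table[threads])


if __name__ == "__main__":
    for name, table in (("SELECT", SELECT_PER_SEC), ("UPDATE", UPDATE_PER_SEC)):
        threads, pct = best_improvement(table)
        print(f"{name}: best gain {pct:.1f}% at {threads} threads")
```

With these inputs, both peaks occur at 8 threads, and the single-thread rows come out negative, which matches the ticket's remark that latency gets worse when there is no concurrency.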
[jira] [Updated] (CASSANDRA-13010) nodetool compactionstats should say which disk a compaction is writing to
[ https://issues.apache.org/jira/browse/CASSANDRA-13010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Lourie updated CASSANDRA-13010: Attachment: multiple operations.png > nodetool compactionstats should say which disk a compaction is writing to > - > > Key: CASSANDRA-13010 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13010 > Project: Cassandra > Issue Type: New Feature > Components: Compaction, Tools >Reporter: Jon Haddad >Assignee: Alex Lourie > Labels: lhf > Attachments: 13010.patch, cleanup.png, multiple operations.png
[jira] [Updated] (CASSANDRA-13010) nodetool compactionstats should say which disk a compaction is writing to
[ https://issues.apache.org/jira/browse/CASSANDRA-13010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Lourie updated CASSANDRA-13010: Attachment: cleanup.png > nodetool compactionstats should say which disk a compaction is writing to > - > > Key: CASSANDRA-13010 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13010 > Project: Cassandra > Issue Type: New Feature > Components: Compaction, Tools >Reporter: Jon Haddad >Assignee: Alex Lourie > Labels: lhf > Attachments: 13010.patch, cleanup.png
[jira] [Updated] (CASSANDRA-13010) nodetool compactionstats should say which disk a compaction is writing to
[ https://issues.apache.org/jira/browse/CASSANDRA-13010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Lourie updated CASSANDRA-13010: Attachment: 13010.patch > nodetool compactionstats should say which disk a compaction is writing to > - > > Key: CASSANDRA-13010 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13010 > Project: Cassandra > Issue Type: New Feature > Components: Compaction, Tools >Reporter: Jon Haddad >Assignee: Alex Lourie > Labels: lhf > Attachments: 13010.patch
[jira] [Commented] (CASSANDRA-13976) introduce max_hint_window_in_min, deprecate max_hint_window_in_ms
[ https://issues.apache.org/jira/browse/CASSANDRA-13976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273767#comment-16273767 ] Jeff Jirsa commented on CASSANDRA-13976: Encourage you to ask dev@ - I was going to suggest that as well. Pretty -0 on this right now (it's pretty firmly in the "I wouldn't do this, but I'm not going to really go out of my way to hard veto it" category). My primary concern is that as we change yaml params, years of blog posts become irrelevant, and eventually we'll deprecate out the old ones and remove them, and then someone's rolling upgrade will break. "Of course you have to change yaml with major versions", you say, but the less true that is, the better life is for users. > introduce max_hint_window_in_min, deprecate max_hint_window_in_ms > - > > Key: CASSANDRA-13976 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13976 > Project: Cassandra > Issue Type: Improvement >Reporter: Jon Haddad >Assignee: Kirk True >Priority: Minor > Labels: lhf > Fix For: 4.0 > > > Milliseconds is unnecessarily precise. At most, minutes would be used. > Config in 4.0 should default to a minute granularity, but if the > max_hint_window_in_min isn't set should fall back on max_hint_window_in_ms > and emit a warning. > max_hint_window_in_min: 180 # new default, still at 3 hours. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13976) introduce max_hint_window_in_min, deprecate max_hint_window_in_ms
[ https://issues.apache.org/jira/browse/CASSANDRA-13976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273764#comment-16273764 ] Jon Haddad commented on CASSANDRA-13976: I'd actually like to get some feedback on -dev regarding this. I'd like to change *every* setting to use duration types, because it makes it less error prone to set. Mistyping a millisecond based config is pretty easy and hard to catch when it's wrong. > introduce max_hint_window_in_min, deprecate max_hint_window_in_ms > - > > Key: CASSANDRA-13976 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13976 > Project: Cassandra > Issue Type: Improvement >Reporter: Jon Haddad >Assignee: Kirk True >Priority: Minor > Labels: lhf > Fix For: 4.0 > > > Milliseconds is unnecessarily precise. At most, minutes would be used. > Config in 4.0 should default to a minute granularity, but if the > max_hint_window_in_min isn't set should fall back on max_hint_window_in_ms > and emit a warning. > max_hint_window_in_min: 180 # new default, still at 3 hours. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13976) introduce max_hint_window_in_min, deprecate max_hint_window_in_ms
[ https://issues.apache.org/jira/browse/CASSANDRA-13976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273758#comment-16273758 ] Jeff Jirsa commented on CASSANDRA-13976: We have 25 other config options that take millis. Why are we changing one, when it's one that's rarely tuned anyway? There are plenty of others (auth permission validity, for example) that are also almost certainly never set in milliseconds and that you haven't suggested changing. How do you propose we keep consistency there? Is this really something where the ease of setting it once is going to outweigh the config churn for the typical user? > introduce max_hint_window_in_min, deprecate max_hint_window_in_ms > - > > Key: CASSANDRA-13976 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13976 > Project: Cassandra > Issue Type: Improvement >Reporter: Jon Haddad >Assignee: Kirk True >Priority: Minor > Labels: lhf > Fix For: 4.0 > > > Milliseconds is unnecessarily precise. At most, minutes would be used. > Config in 4.0 should default to a minute granularity, but if the > max_hint_window_in_min isn't set should fall back on max_hint_window_in_ms > and emit a warning. > max_hint_window_in_min: 180 # new default, still at 3 hours.
[jira] [Comment Edited] (CASSANDRA-13976) introduce max_hint_window_in_min, deprecate max_hint_window_in_ms
[ https://issues.apache.org/jira/browse/CASSANDRA-13976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273727#comment-16273727 ] Jon Haddad edited comment on CASSANDRA-13976 at 12/1/17 12:49 AM: -- I've thought about this a bit more, and I think across the board we should be using duration types and get rid of the _ms label altogether. It's WAY more readable and friendly to be able to do: {code} max_hint_window = 3h {code} Regarding nodetool, it would report back whatever duration labeled setting was in there using "ms" if the old _ms value was provided. Internally, it would convert everything to ms, leaving the current code in place. was (Author: rustyrazorblade): I've thought about this a bit more, and I think across the board we should be using duration types and get rid of the _ms label altogether. It's WAY more readable and friendly to be able to do: {code} max_hint_window = 3h {code} Regarding nodetool, it would report back whatever setting was in there. Internally, it would convert everything to ms, leaving the current code in place. > introduce max_hint_window_in_min, deprecate max_hint_window_in_ms > - > > Key: CASSANDRA-13976 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13976 > Project: Cassandra > Issue Type: Improvement >Reporter: Jon Haddad >Assignee: Kirk True >Priority: Minor > Labels: lhf > Fix For: 4.0 > > > Milliseconds is unnecessarily precise. At most, minutes would be used. > Config in 4.0 should default to a minute granularity, but if the > max_hint_window_in_min isn't set should fall back on max_hint_window_in_ms > and emit a warning. > max_hint_window_in_min: 180 # new default, still at 3 hours. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
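The duration-type idea from the comment above ("max_hint_window = 3h", converted to ms internally) could look roughly like the following Python sketch. This is only an illustration of the proposal: the accepted unit suffixes and the name `duration_to_ms` are assumptions, not anything Cassandra defines in this thread.

```python
import re

# Assumed unit suffixes; the comment only shows "3h" and mentions "ms".
_UNIT_MS = {"ms": 1, "s": 1_000, "m": 60_000, "h": 3_600_000, "d": 86_400_000}
_DURATION_RE = re.compile(r"^\s*(\d+)\s*(ms|s|m|h|d)\s*$")


def duration_to_ms(text):
    """Convert a duration such as '3h' or '180m' to milliseconds."""
    match = _DURATION_RE.match(text)
    if match is None:
        raise ValueError(f"not a duration: {text!r}")
    value, unit = match.groups()
    return int(value) * _UNIT_MS[unit]


if __name__ == "__main__":
    # All three spellings describe the same 3-hour hint window.
    print(duration_to_ms("3h"), duration_to_ms("180m"), duration_to_ms("10800000ms"))
```

Because everything is normalised to milliseconds, code that already consumes a millisecond value would not need to change, which is the point made in the comment about leaving the current code in place.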
[jira] [Commented] (CASSANDRA-14008) RTs at index boundaries in 2.x sstables can create unexpected CQL row in 3.x
[ https://issues.apache.org/jira/browse/CASSANDRA-14008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273735#comment-16273735 ] Jeff Jirsa commented on CASSANDRA-14008: The raw patches that fix the bug in LegacyLayout are at || Branch || CI || | [3.0|https://github.com/jeffjirsa/cassandra/tree/cassandra-3.0-14008] | [!https://circleci.com/gh/jeffjirsa/cassandra/tree/cassandra-3.0-14008.svg?style=svg!|https://circleci.com/gh/jeffjirsa/cassandra/tree/cassandra-3.0-14008/] | | [3.11|https://github.com/jeffjirsa/cassandra/tree/cassandra-3.11-14008] | [!https://circleci.com/gh/jeffjirsa/cassandra/tree/cassandra-3.1-14008.svg?style=svg!|https://circleci.com/gh/jeffjirsa/cassandra/tree/cassandra-3.11-14008/]| I was hoping to actually have a solution to un-breaking the broken 3.0 sstables in the same patch, but it's proving to be more difficult than I anticipated. I haven't yet tried to make some sample sstables for regression tests, I agree it'd be nice to have those. Please glance at the code, and I'll work on the regression sstables before committing. > RTs at index boundaries in 2.x sstables can create unexpected CQL row in 3.x > > > Key: CASSANDRA-14008 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14008 > Project: Cassandra > Issue Type: Bug > Components: Local Write-Read Paths >Reporter: Jeff Jirsa >Assignee: Jeff Jirsa > Labels: correctness > Fix For: 3.0.x, 3.11.x > > > In 2.1/2.2, it is possible for a range tombstone that isn't a row deletion > and isn't a complex deletion to appear between two cells with the same > clustering. The 8099 legacy code incorrectly treats the two (non-RT) cells as > two distinct CQL rows, despite having the same clustering prefix. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
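The symptom described in the ticket (two cells with the same clustering ending up as two CQL rows because a range tombstone sits between them) can be illustrated with a toy model. The Python sketch below is not the LegacyLayout code and not the patch; the atom tuples and `group_rows` are hypothetical and exist only to show why a row boundary must depend on the clustering and not on an interleaved range tombstone.

```python
# Hypothetical legacy atoms, in on-disk order:
#   ("cell", clustering, column, value)  or  ("rt", start_bound, end_bound)
ATOMS = [
    ("cell", ("k1",), "a", 1),
    ("rt", ("k1", "b"), ("k1", "c")),  # neither a row deletion nor a complex deletion
    ("cell", ("k1",), "d", 2),
]


def group_rows(atoms, rt_ends_row):
    """Group cells into rows.

    With rt_ends_row=True, any range tombstone closes the current row, which
    mimics the symptom in the ticket. With rt_ends_row=False, a row ends only
    when the clustering changes.
    """
    rows, current = [], None
    for atom in atoms:
        if atom[0] == "rt":
            if rt_ends_row and current is not None:
                rows.append(current)
                current = None
            continue
        _, clustering, column, value = atom
        if current is None or current[0] != clustering:
            if current is not None:
                rows.append(current)
            current = (clustering, {})
        current[1][column] = value
    if current is not None:
        rows.append(current)
    return rows


if __name__ == "__main__":
    print(len(group_rows(ATOMS, rt_ends_row=True)))   # two rows for one clustering
    print(len(group_rows(ATOMS, rt_ends_row=False)))  # one row, as expected
```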
[jira] [Commented] (CASSANDRA-13976) introduce max_hint_window_in_min, deprecate max_hint_window_in_ms
[ https://issues.apache.org/jira/browse/CASSANDRA-13976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273727#comment-16273727 ] Jon Haddad commented on CASSANDRA-13976: I've thought about this a bit more, and I think across the board we should be using duration types and get rid of the _ms label altogether. It's WAY more readable and friendly to be able to do: {code} max_hint_window = 3h {code} Regarding nodetool, it would report back whatever setting was in there. Internally, it would convert everything to ms, leaving the current code in place. > introduce max_hint_window_in_min, deprecate max_hint_window_in_ms > - > > Key: CASSANDRA-13976 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13976 > Project: Cassandra > Issue Type: Improvement >Reporter: Jon Haddad >Assignee: Kirk True >Priority: Minor > Labels: lhf > Fix For: 4.0 > > > Milliseconds is unnecessarily precise. At most, minutes would be used. > Config in 4.0 should default to a minute granularity, but if the > max_hint_window_in_min isn't set should fall back on max_hint_window_in_ms > and emit a warning. > max_hint_window_in_min: 180 # new default, still at 3 hours. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
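The fallback described in the ticket text (prefer max_hint_window_in_min, otherwise fall back on max_hint_window_in_ms and emit a warning, with a default of 180 minutes) can be sketched as follows. The two setting names and the default come from the ticket; the function name, the dict-based config, and the use of Python's `warnings` module are assumptions made for illustration.

```python
import warnings

DEFAULT_MAX_HINT_WINDOW_MIN = 180  # "new default, still at 3 hours" per the ticket


def resolve_max_hint_window_ms(config):
    """Return the effective hint window in milliseconds from a parsed config dict."""
    if "max_hint_window_in_min" in config:
        return config["max_hint_window_in_min"] * 60_000
    if "max_hint_window_in_ms" in config:
        # Old setting still honoured, but flagged so operators can migrate.
        warnings.warn(
            "max_hint_window_in_ms is deprecated; use max_hint_window_in_min",
            DeprecationWarning,
        )
        return config["max_hint_window_in_ms"]
    return DEFAULT_MAX_HINT_WINDOW_MIN * 60_000


if __name__ == "__main__":
    print(resolve_max_hint_window_ms({}))                              # default
    print(resolve_max_hint_window_ms({"max_hint_window_in_min": 60}))  # new setting
```

Keeping the old key readable is what avoids the rolling-upgrade breakage raised elsewhere in the thread, at least until the deprecated setting is removed.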
[jira] [Resolved] (CASSANDRA-14074) Remove "OpenJDK is not recommended" Startup Warning
[ https://issues.apache.org/jira/browse/CASSANDRA-14074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Greaves resolved CASSANDRA-14074. -- Resolution: Fixed Closing as duplicate of CASSANDRA-13916 > Remove "OpenJDK is not recommended" Startup Warning > --- > > Key: CASSANDRA-14074 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14074 > Project: Cassandra > Issue Type: Improvement >Reporter: Michael Kjellman > Labels: lhf > > We should remove the following warning on C* startup that OpenJDK is not > recommended. Now that with JDK8 OpenJDK is the reference JVM implementation > and things are much more stable -- and that all of our tests run on OpenJDK > builds due to the Oracle JDK license, this warning isn't helpful and is > actually wrong and we should remove it to prevent any user confusion. > WARN [main] 2017-11-28 19:39:08,446 StartupChecks.java:202 - OpenJDK is not > recommended. Please upgrade to the newest Oracle Java release -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-13916) Remove OpenJDK log warning
[ https://issues.apache.org/jira/browse/CASSANDRA-13916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Greaves updated CASSANDRA-13916: - Labels: lhf (was: ) > Remove OpenJDK log warning > -- > > Key: CASSANDRA-13916 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13916 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Anthony Grasso >Priority: Minor > Labels: lhf > > The following warning message will appear in the logs when using OpenJDK > {noformat} > WARN [main] ... OpenJDK is not recommended. Please upgrade to the newest > Oracle Java release > {noformat} > The above warning dates back to when OpenJDK 6 was released and there were > some issues in early releases of this version. The OpenJDK implementation is > used as a reference for the OracleJDK which means the implementations are > very close. In addition, most users have moved off Java 6 so we can probably > remove this warning message. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13976) introduce max_hint_window_in_min, deprecate max_hint_window_in_ms
[ https://issues.apache.org/jira/browse/CASSANDRA-13976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273722#comment-16273722 ] Kurt Greaves commented on CASSANDRA-13976: -- Not necessary to support both, just convert the minute value to ms if the {{max_hint_window_in_min}} property exists, otherwise use the {{max_hint_window_in_ms}} value or default if neither exist. TBH I don't think we should ever completely get rid of the ms config option as I wouldn't be surprised if there are tests relying on setting it to <1min, but we could remove it from the default yaml and add in the minute based config instead. > introduce max_hint_window_in_min, deprecate max_hint_window_in_ms > - > > Key: CASSANDRA-13976 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13976 > Project: Cassandra > Issue Type: Improvement >Reporter: Jon Haddad >Assignee: Kirk True >Priority: Minor > Labels: lhf > Fix For: 4.0 > > > Milliseconds is unnecessarily precise. At most, minutes would be used. > Config in 4.0 should default to a minute granularity, but if the > max_hint_window_in_min isn't set should fall back on max_hint_window_in_ms > and emit a warning. > max_hint_window_in_min: 180 # new default, still at 3 hours. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
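The fallback order described in the comment above (minutes if present, otherwise milliseconds, otherwise the default) might look like this. It is a sketch under assumptions: the configuration is modelled as a plain dict, `resolve_max_hint_window_ms` is a hypothetical helper, and the 3-hour default is taken from the ticket description.

```python
import logging

logger = logging.getLogger(__name__)

# 3 hours, the default named in the ticket description.
DEFAULT_MAX_HINT_WINDOW_MS = 3 * 60 * 60 * 1000


def resolve_max_hint_window_ms(config):
    """Return the hint window in ms, preferring the minute-based key."""
    if "max_hint_window_in_min" in config:
        return int(config["max_hint_window_in_min"]) * 60 * 1000
    if "max_hint_window_in_ms" in config:
        # The ticket asks for a warning when falling back to the old key.
        logger.warning("max_hint_window_in_ms is deprecated; "
                       "use max_hint_window_in_min instead")
        return int(config["max_hint_window_in_ms"])
    return DEFAULT_MAX_HINT_WINDOW_MS


print(resolve_max_hint_window_ms({"max_hint_window_in_min": 180}))
```

Leaving the millisecond key readable keeps sub-minute values available to tests, which is the concern raised in the comment.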
[jira] [Commented] (CASSANDRA-14013) Data loss in snapshots keyspace after service restart
[ https://issues.apache.org/jira/browse/CASSANDRA-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273709#comment-16273709 ] Vincent White commented on CASSANDRA-14013: --- I've created a patch for 3.0.x and trunk using the same method. I guess it should be safe to work with just absolute paths rather than canonical paths here; I haven't made that change on the 3.x.x patches yet. I also had to fiddle with the unit tests since there is now a dependency on DatabaseDescriptor and passing in file paths that exist in the configured data directory. [3.0.x|https://github.com/vincewhite/cassandra/commits/14013-30] [3.11.x|https://github.com/vincewhite/cassandra/commits/14013-test] [trunk|https://github.com/vincewhite/cassandra/commits/14013-trunk] > Data loss in snapshots keyspace after service restart > - > > Key: CASSANDRA-14013 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14013 > Project: Cassandra > Issue Type: Bug >Reporter: Gregor Uhlenheuer >Assignee: Vincent White > > I am posting this bug in the hope of discovering the stupid mistake I am making > because I can't imagine a reasonable answer for the behavior I see right now > :-) > In short, I observe data loss in a keyspace called *snapshots* after > restarting the Cassandra service. Say I have 1000 records in a table > called *snapshots.test_idx*; then after a restart the table has fewer entries or > is even empty. > My kind of "mysterious" observation is that it happens only in a keyspace > called *snapshots*... > h3. Steps to reproduce > These steps to reproduce show the described behavior in "most" attempts (not > every single time though). 
> {code} > # create keyspace > CREATE KEYSPACE snapshots WITH replication = {'class': 'SimpleStrategy', > 'replication_factor': 1}; > # create table > CREATE TABLE snapshots.test_idx (key text, seqno bigint, primary key(key)); > # insert some test data > INSERT INTO snapshots.test_idx (key,seqno) values ('key1', 1); > ... > INSERT INTO snapshots.test_idx (key,seqno) values ('key1000', 1000); > # count entries > SELECT count(*) FROM snapshots.test_idx; > 1000 > # restart service > kill > cassandra -f > # count entries > SELECT count(*) FROM snapshots.test_idx; > 0 > {code} > I hope someone can point me to the obvious mistake I am doing :-) > This happened to me using both Cassandra 3.9 and 3.11.0 -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-14020) test_pep8_compliance - cqlsh_tests.cqlsh_tests.TestCqlsh: pep8 has been renamed to pycodestyle (GitHub issue #466)
[ https://issues.apache.org/jira/browse/CASSANDRA-14020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16269227#comment-16269227 ] Jay Zhuang edited comment on CASSANDRA-14020 at 12/1/17 12:09 AM: -- Thanks [~mkjellman] Fixing [linter_check.sh|https://github.com/apache/cassandra-dtest/blob/master/linter_check.sh#L10] here: CASSANDRA-14076 was (Author: jay.zhuang): Thanks [~mkjellman] Would be great if pep8 is also renamed in [linter_check.sh|https://github.com/apache/cassandra-dtest/blob/master/linter_check.sh#L10] > test_pep8_compliance - cqlsh_tests.cqlsh_tests.TestCqlsh: pep8 has been > renamed to pycodestyle (GitHub issue #466) > -- > > Key: CASSANDRA-14020 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14020 > Project: Cassandra > Issue Type: Bug > Components: Testing >Reporter: Michael Kjellman >Assignee: Michael Kjellman > > test_pep8_compliance - cqlsh_tests.cqlsh_tests.TestCqlsh always fails due to > us catching an informative warning from the pep8 tool. Looks like we just > need to swap out the usage > /home/cassandra/env/local/lib/python2.7/site-packages/pep8.py:2124: > UserWarning: > pep8 has been renamed to pycodestyle (GitHub issue #466) > Use of the pep8 tool will be removed in a future release. > Please install and use `pycodestyle` instead. > $ pip install pycodestyle > $ pycodestyle ... > '\n\n' -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-14076) dtest code style check failed
[ https://issues.apache.org/jira/browse/CASSANDRA-14076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273643#comment-16273643 ] Jay Zhuang edited comment on CASSANDRA-14076 at 11/30/17 11:32 PM: --- Here is the patch, please review: | Branch | TravisCI Build Status | | [14076|https://github.com/cooldoger/cassandra-dtest/tree/14076] | [!https://travis-ci.org/cooldoger/cassandra-dtest.svg?branch=14076!|https://travis-ci.org/cooldoger/cassandra-dtest/builds/309766256] | was (Author: jay.zhuang): Here is the patch, please review: | Branch | TravisCI Build Status | | [14076|https://github.com/cooldoger/cassandra/tree/14076] | [!https://travis-ci.org/cooldoger/cassandra-dtest.svg?branch=14076!|https://travis-ci.org/cooldoger/cassandra-dtest/builds/309766256] | > dtest code style check failed > - > > Key: CASSANDRA-14076 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14076 > Project: Cassandra > Issue Type: Bug > Components: Testing >Reporter: Jay Zhuang >Assignee: Jay Zhuang > > https://travis-ci.org/cooldoger/cassandra-dtest > {noformat} > $ flake8 --ignore=E501,F811,F812,F822,F823,F831,F841,N8,C9 > --exclude=thrift_bindings,cassandra-thrift . 
> ./consistency_test.py:547:17: E722 do not use bare except' > ./consistency_test.py:976:49: E251 unexpected spaces around keyword / > parameter equals > ./consistency_test.py:976:51: E251 unexpected spaces around keyword / > parameter equals > ./consistency_test.py:981:63: E703 statement ends with a semicolon > ./consistency_test.py:1037:49: E251 unexpected spaces around keyword / > parameter equals > ./consistency_test.py:1037:51: E251 unexpected spaces around keyword / > parameter equals > ./consistency_test.py:1054:46: E261 at least two spaces before inline comment > ./consistency_test.py:1103:22: E251 unexpected spaces around keyword / > parameter equals > ./consistency_test.py:1103:24: E251 unexpected spaces around keyword / > parameter equals > ./consistency_test.py:1175:22: E251 unexpected spaces around keyword / > parameter equals > ./consistency_test.py:1175:24: E251 unexpected spaces around keyword / > parameter equals > ./counter_tests.py:59:24: E703 statement ends with a semicolon > ./counter_tests.py:383:37: E261 at least two spaces before inline comment > ./dtest.py:586:13: E722 do not use bare except' > ./dtest.py:1130:1: E302 expected 2 blank lines, found 1 > ./nodetool_test.py:9:1: E302 expected 2 blank lines, found 1 > ./nodetool_test.py:78:1: W293 blank line contains whitespace > ./nodetool_test.py:174:45: E261 at least two spaces before inline comment > ./run_dtests.py:220:54: E221 multiple spaces before operator > ./secondary_indexes_test.py:14:1: F401 'dtest.DtestTimeoutError' imported but > unused > ./secondary_indexes_test.py:17:1: F401 'tools.data.index_is_built' imported > but unused > ./secondary_indexes_test.py:21:1: E302 expected 2 blank lines, found 1 > ./sslnodetonode_test.py:15:1: E302 expected 2 blank lines, found 1 > ./sslnodetonode_test.py:191:1: W293 blank line contains whitespace > ./sslnodetonode_test.py:191:1: W391 blank line at end of file > ./system_keyspaces_test.py:6:1: E302 expected 2 blank lines, found 1 > 
./system_keyspaces_test.py:28:59: E241 multiple spaces after ',' > ./system_keyspaces_test.py:50:62: E241 multiple spaces after ',' > ./write_failures_test.py:5:1: F401 'distutils.version.LooseVersion' imported > but unused > ./plugins/dtestcollect.py:1:1: F401 'collections.namedtuple' imported but > unused > ./plugins/dtestcollect.py:3:1: F401 'pprint.pprint' imported but unused > ./plugins/dtestcollect.py:5:1: F401 'inspect' imported but unused > ./plugins/dtestcollect.py:13:1: E302 expected 2 blank lines, found 1 > ./plugins/dtestcollect.py:44:9: E306 expected 1 blank line before a nested > definition, found 0 > ./plugins/dtestcollect.py:62:22: E703 statement ends with a semicolon > ./plugins/dtestcollect.py:64:1: E302 expected 2 blank lines, found 1 > ./plugins/dtesttag.py:1:1: F401 'collections.namedtuple' imported but unused > ./plugins/dtesttag.py:4:1: F401 'pprint.pprint' imported but unused > ./plugins/dtesttag.py:8:1: E302 expected 2 blank lines, found 1 > ./plugins/dtesttag.py:20:1: W293 blank line contains whitespace > ./plugins/dtesttag.py:25:1: W293 blank line contains whitespace > ./plugins/dtestxunit.py:43:1: F401 'doctest' imported but unused > ./plugins/dtestxunit.py:46:1: F401 'traceback' imported but unused > ./plugins/dtestxunit.py:62:1: E302 expected 2 blank lines, found 1 > ./plugins/dtestxunit.py:66:1: E302 expected 2 blank lines, found 1 > ./plugins/dtestxunit.py:70:1: E302 expected 2 blank lines, found 1 > ./plugins/dtestxunit.py:76:29: E226 missing whitespace around arithmetic > operator > ./plugins/dtestxunit.py:84:1: E302 expected 2 blank l
[jira] [Comment Edited] (CASSANDRA-14076) dtest code style check failed
[ https://issues.apache.org/jira/browse/CASSANDRA-14076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273645#comment-16273645 ] Jeff Jirsa edited comment on CASSANDRA-14076 at 11/30/17 11:28 PM: --- cc [~philipthompson] (Also actual branch is https://github.com/cooldoger/cassandra-dtest/tree/14076 ) was (Author: jjirsa): cc [~philipthompson] > dtest code style check failed > - > > Key: CASSANDRA-14076 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14076 > Project: Cassandra > Issue Type: Bug > Components: Testing >Reporter: Jay Zhuang >Assignee: Jay Zhuang > > https://travis-ci.org/cooldoger/cassandra-dtest
[jira] [Updated] (CASSANDRA-14075) Many sslnodetonode_test.TestNodeToNodeSSLEncryption tests failing with "Please remove properties [optional, enabled] from your cassandra.yaml"
[ https://issues.apache.org/jira/browse/CASSANDRA-14075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Jirsa updated CASSANDRA-14075: --- Component/s: Testing > Many sslnodetonode_test.TestNodeToNodeSSLEncryption tests failing with > "Please remove properties [optional, enabled] from your cassandra.yaml" > -- > > Key: CASSANDRA-14075 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14075 > Project: Cassandra > Issue Type: Bug > Components: Testing >Reporter: Michael Kjellman >Assignee: Jason Brown > > Many sslnodetonode_test.TestNodeToNodeSSLEncryption dtests are failing on > 3.11 with an exception on startup due to invalid yaml properties. > Unexpected error in node1 log, error: > ERROR [main] 2017-11-18 21:01:54,781 CassandraDaemon.java:706 - Exception > encountered during startup: Invalid yaml. Please remove properties [optional, > enabled] from your cassandra.yaml > Although ccm was updated in > https://github.com/pcmanus/ccm/commit/eaaa425b70edb84786924516aee3920d685c0e53 > to include a version check for >= 4.0, enabled and optional are emitted > unconditionally in the actual dtest itself -- they should also be conditional > on >= 4.0 > {code:java} > node.set_configuration_options(values={ > 'server_encryption_options': { > 'enabled': encryption_enabled, > 'optional': encryption_optional, > 'internode_encryption': internode_encryption, > 'keystore': kspath, > 'keystore_password': 'cassandra', > 'truststore': tspath, > 'truststore_password': 'cassandra', > 'require_endpoint_verification': endpoint_verification, > 'require_client_auth': client_auth, > } > }) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
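The version guard the ticket asks for could be sketched as below. This is not the dtest patch: `build_server_encryption_options` is a hypothetical helper, and representing the Cassandra version as a tuple such as `(3, 11)` is an assumption made to keep the example self-contained. The keys and the `'cassandra'` passwords are copied from the snippet quoted in the ticket.

```python
def build_server_encryption_options(cassandra_version, internode_encryption,
                                    kspath, tspath,
                                    encryption_enabled=False,
                                    encryption_optional=False,
                                    endpoint_verification=False,
                                    client_auth=False):
    """Build server_encryption_options, adding 4.0-only keys conditionally."""
    options = {
        'internode_encryption': internode_encryption,
        'keystore': kspath,
        'keystore_password': 'cassandra',
        'truststore': tspath,
        'truststore_password': 'cassandra',
        'require_endpoint_verification': endpoint_verification,
        'require_client_auth': client_auth,
    }
    # 'enabled' and 'optional' are rejected as invalid yaml before 4.0,
    # so only emit them when the node is new enough.
    if cassandra_version >= (4, 0):
        options['enabled'] = encryption_enabled
        options['optional'] = encryption_optional
    return options


print(sorted(build_server_encryption_options((3, 11), 'all', 'ks.jks', 'ts.jks')))
```

The resulting dict would then be passed as the `'server_encryption_options'` value in the `node.set_configuration_options(values={...})` call shown in the ticket.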
[jira] [Commented] (CASSANDRA-14076) dtest code style check failed
[ https://issues.apache.org/jira/browse/CASSANDRA-14076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273645#comment-16273645 ] Jeff Jirsa commented on CASSANDRA-14076: cc [~philipthompson] > dtest code style check failed > - > > Key: CASSANDRA-14076 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14076 > Project: Cassandra > Issue Type: Bug > Components: Testing >Reporter: Jay Zhuang >Assignee: Jay Zhuang > > https://travis-ci.org/cooldoger/cassandra-dtest
[jira] [Commented] (CASSANDRA-14079) Prevent compaction strategies from looping indefinitely
[ https://issues.apache.org/jira/browse/CASSANDRA-14079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273646#comment-16273646 ] Paulo Motta commented on CASSANDRA-14079: - Ninja fixed bad commit/merge as {{d2e4ce48959bc56d9c366de20cd4c0f3c9bdf16b}} on cassandra-3.11 and fixed master as {{88b244a1380c44d36861b6d0be9c78c968d292c2}}. Thanks Joel! > Prevent compaction strategies from looping indefinitely > --- > > Key: CASSANDRA-14079 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14079 > Project: Cassandra > Issue Type: Improvement >Reporter: Paulo Motta >Assignee: Paulo Motta >Priority: Minor > Fix For: 3.11.2, 4.0 > > > As a result of CASSANDRA-13948, LCS was looping indefinitely trying to > generate the same candidates for SSTables which were not on the tracker. > We should add a protection on compaction strategies against looping > indefinitely to avoid similar bugs in the future. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14076) dtest code style check failed
[ https://issues.apache.org/jira/browse/CASSANDRA-14076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273643#comment-16273643 ] Jay Zhuang commented on CASSANDRA-14076: Here is the patch, please review: | Branch | TravisCI Build Status | | [14076|https://github.com/cooldoger/cassandra/tree/14076] | [!https://travis-ci.org/cooldoger/cassandra-dtest.svg?branch=14076!|https://travis-ci.org/cooldoger/cassandra-dtest/builds/309766256] | > dtest code style check failed > - > > Key: CASSANDRA-14076 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14076 > Project: Cassandra > Issue Type: Bug > Components: Testing >Reporter: Jay Zhuang >Assignee: Jay Zhuang > > https://travis-ci.org/cooldoger/cassandra-dtest
[jira] [Updated] (CASSANDRA-14076) dtest code style check failed
[ https://issues.apache.org/jira/browse/CASSANDRA-14076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jay Zhuang updated CASSANDRA-14076: --- Status: Patch Available (was: Open) > dtest code style check failed > - > > Key: CASSANDRA-14076 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14076 > Project: Cassandra > Issue Type: Bug > Components: Testing >Reporter: Jay Zhuang >Assignee: Jay Zhuang > > https://travis-ci.org/cooldoger/cassandra-dtest > {noformat} > $ flake8 --ignore=E501,F811,F812,F822,F823,F831,F841,N8,C9 > --exclude=thrift_bindings,cassandra-thrift . > ./consistency_test.py:547:17: E722 do not use bare except' > ./consistency_test.py:976:49: E251 unexpected spaces around keyword / > parameter equals > ./consistency_test.py:976:51: E251 unexpected spaces around keyword / > parameter equals > ./consistency_test.py:981:63: E703 statement ends with a semicolon > ./consistency_test.py:1037:49: E251 unexpected spaces around keyword / > parameter equals > ./consistency_test.py:1037:51: E251 unexpected spaces around keyword / > parameter equals > ./consistency_test.py:1054:46: E261 at least two spaces before inline comment > ./consistency_test.py:1103:22: E251 unexpected spaces around keyword / > parameter equals > ./consistency_test.py:1103:24: E251 unexpected spaces around keyword / > parameter equals > ./consistency_test.py:1175:22: E251 unexpected spaces around keyword / > parameter equals > ./consistency_test.py:1175:24: E251 unexpected spaces around keyword / > parameter equals > ./counter_tests.py:59:24: E703 statement ends with a semicolon > ./counter_tests.py:383:37: E261 at least two spaces before inline comment > ./dtest.py:586:13: E722 do not use bare except' > ./dtest.py:1130:1: E302 expected 2 blank lines, found 1 > ./nodetool_test.py:9:1: E302 expected 2 blank lines, found 1 > ./nodetool_test.py:78:1: W293 blank line contains whitespace > ./nodetool_test.py:174:45: E261 at least two spaces before inline comment > 
./run_dtests.py:220:54: E221 multiple spaces before operator > ./secondary_indexes_test.py:14:1: F401 'dtest.DtestTimeoutError' imported but > unused > ./secondary_indexes_test.py:17:1: F401 'tools.data.index_is_built' imported > but unused > ./secondary_indexes_test.py:21:1: E302 expected 2 blank lines, found 1 > ./sslnodetonode_test.py:15:1: E302 expected 2 blank lines, found 1 > ./sslnodetonode_test.py:191:1: W293 blank line contains whitespace > ./sslnodetonode_test.py:191:1: W391 blank line at end of file > ./system_keyspaces_test.py:6:1: E302 expected 2 blank lines, found 1 > ./system_keyspaces_test.py:28:59: E241 multiple spaces after ',' > ./system_keyspaces_test.py:50:62: E241 multiple spaces after ',' > ./write_failures_test.py:5:1: F401 'distutils.version.LooseVersion' imported > but unused > ./plugins/dtestcollect.py:1:1: F401 'collections.namedtuple' imported but > unused > ./plugins/dtestcollect.py:3:1: F401 'pprint.pprint' imported but unused > ./plugins/dtestcollect.py:5:1: F401 'inspect' imported but unused > ./plugins/dtestcollect.py:13:1: E302 expected 2 blank lines, found 1 > ./plugins/dtestcollect.py:44:9: E306 expected 1 blank line before a nested > definition, found 0 > ./plugins/dtestcollect.py:62:22: E703 statement ends with a semicolon > ./plugins/dtestcollect.py:64:1: E302 expected 2 blank lines, found 1 > ./plugins/dtesttag.py:1:1: F401 'collections.namedtuple' imported but unused > ./plugins/dtesttag.py:4:1: F401 'pprint.pprint' imported but unused > ./plugins/dtesttag.py:8:1: E302 expected 2 blank lines, found 1 > ./plugins/dtesttag.py:20:1: W293 blank line contains whitespace > ./plugins/dtesttag.py:25:1: W293 blank line contains whitespace > ./plugins/dtestxunit.py:43:1: F401 'doctest' imported but unused > ./plugins/dtestxunit.py:46:1: F401 'traceback' imported but unused > ./plugins/dtestxunit.py:62:1: E302 expected 2 blank lines, found 1 > ./plugins/dtestxunit.py:66:1: E302 expected 2 blank lines, found 1 > 
./plugins/dtestxunit.py:70:1: E302 expected 2 blank lines, found 1 > ./plugins/dtestxunit.py:76:29: E226 missing whitespace around arithmetic > operator > ./plugins/dtestxunit.py:84:1: E302 expected 2 blank lines, found 1 > ./plugins/dtestxunit.py:107:1: E302 expected 2 blank lines, found 1 > ./plugins/dtestxunit.py:126:1: E302 expected 2 blank lines, found 1 > ./plugins/dtestxunit.py:219:32: W503 line break before binary operator > ./plugins/dtestxunit.py:269:25: E126 continuation line over-indented for > hanging indent > ./plugins/dtestxunit.py:277:25: E126 continuation line over-indented for > hanging indent > ./repair_tests/deprecated_repair_test.py:159:9: E741 ambiguous variable name > 'l' > ./repair_tests/incremental_repair_test.py:772:4: W291 trailing whitespace > ./repair_tests/incremental_repair_test.py:773:76: W291 trailing w
[4/4] cassandra git commit: ninja: fix bad #14079 merge (Fix AbstractCompactionStrategyTest TableMetadataRef -> TableMetadata)
ninja: fix bad #14079 merge (Fix AbstractCompactionStrategyTest TableMetadataRef -> TableMetadata)

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/88b244a1
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/88b244a1
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/88b244a1

Branch: refs/heads/trunk
Commit: 88b244a1380c44d36861b6d0be9c78c968d292c2
Parents: f81e57e
Author: Paulo Motta
Authored: Fri Dec 1 10:13:59 2017 +1100
Committer: Paulo Motta
Committed: Fri Dec 1 10:19:24 2017 +1100

----------------------------------------------------------------------
 .../cassandra/db/compaction/AbstractCompactionStrategyTest.java | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/88b244a1/test/unit/org/apache/cassandra/db/compaction/AbstractCompactionStrategyTest.java
----------------------------------------------------------------------
diff --git a/test/unit/org/apache/cassandra/db/compaction/AbstractCompactionStrategyTest.java b/test/unit/org/apache/cassandra/db/compaction/AbstractCompactionStrategyTest.java
index 481b394..b77589d 100644
--- a/test/unit/org/apache/cassandra/db/compaction/AbstractCompactionStrategyTest.java
+++ b/test/unit/org/apache/cassandra/db/compaction/AbstractCompactionStrategyTest.java
@@ -134,7 +134,7 @@ public class AbstractCompactionStrategyTest
         long timestamp = System.currentTimeMillis();
         DecoratedKey dk = Util.dk(String.format("%03d", key));
         ColumnFamilyStore cfs = Keyspace.open(KEYSPACE1).getColumnFamilyStore(table);
-        new RowUpdateBuilder(cfs.metadata, timestamp, dk.getKey())
+        new RowUpdateBuilder(cfs.metadata(), timestamp, dk.getKey())
         .clustering(String.valueOf(key))
         .add("val", "val")
         .build()
[2/4] cassandra git commit: ninja: fix bad #14079 commit (add removeUnsafe method used by AbstractCompactionStrategyTest)
ninja: fix bad #14079 commit (add removeUnsafe method used by AbstractCompactionStrategyTest)

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/d2e4ce48
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/d2e4ce48
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/d2e4ce48

Branch: refs/heads/trunk
Commit: d2e4ce48959bc56d9c366de20cd4c0f3c9bdf16b
Parents: c253ed4
Author: Paulo Motta
Authored: Fri Dec 1 10:08:30 2017 +1100
Committer: Paulo Motta
Committed: Fri Dec 1 10:18:29 2017 +1100

----------------------------------------------------------------------
 src/java/org/apache/cassandra/db/lifecycle/Tracker.java | 8 ++++++++
 1 file changed, 8 insertions(+)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/d2e4ce48/src/java/org/apache/cassandra/db/lifecycle/Tracker.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/db/lifecycle/Tracker.java b/src/java/org/apache/cassandra/db/lifecycle/Tracker.java
index 6136f79..47efbce 100644
--- a/src/java/org/apache/cassandra/db/lifecycle/Tracker.java
+++ b/src/java/org/apache/cassandra/db/lifecycle/Tracker.java
@@ -505,4 +505,12 @@ public class Tracker
     {
         return view.get();
     }
+
+    @VisibleForTesting
+    public void removeUnsafe(Set toRemove)
+    {
+        Pair result = apply(view -> {
+            return updateLiveSet(toRemove, emptySet()).apply(view);
+        });
+    }
 }
[3/4] cassandra git commit: Merge branch 'cassandra-3.11' into trunk
Merge branch 'cassandra-3.11' into trunk

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/f81e57e4
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/f81e57e4
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/f81e57e4

Branch: refs/heads/trunk
Commit: f81e57e4f6503260f9ba3a36d5d096ed8d97607f
Parents: a01019d d2e4ce4
Author: Paulo Motta
Authored: Fri Dec 1 10:18:43 2017 +1100
Committer: Paulo Motta
Committed: Fri Dec 1 10:18:43 2017 +1100

----------------------------------------------------------------------
 src/java/org/apache/cassandra/db/lifecycle/Tracker.java | 8 ++++++++
 1 file changed, 8 insertions(+)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f81e57e4/src/java/org/apache/cassandra/db/lifecycle/Tracker.java
----------------------------------------------------------------------
[1/4] cassandra git commit: ninja: fix bad #14079 commit (add removeUnsafe method used by AbstractCompactionStrategyTest)
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-3.11 c253ed4fa -> d2e4ce489
  refs/heads/trunk a01019d2c -> 88b244a13

ninja: fix bad #14079 commit (add removeUnsafe method used by AbstractCompactionStrategyTest)

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/d2e4ce48
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/d2e4ce48
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/d2e4ce48

Branch: refs/heads/cassandra-3.11
Commit: d2e4ce48959bc56d9c366de20cd4c0f3c9bdf16b
Parents: c253ed4
Author: Paulo Motta
Authored: Fri Dec 1 10:08:30 2017 +1100
Committer: Paulo Motta
Committed: Fri Dec 1 10:18:29 2017 +1100

----------------------------------------------------------------------
 src/java/org/apache/cassandra/db/lifecycle/Tracker.java | 8 ++++++++
 1 file changed, 8 insertions(+)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/d2e4ce48/src/java/org/apache/cassandra/db/lifecycle/Tracker.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/db/lifecycle/Tracker.java b/src/java/org/apache/cassandra/db/lifecycle/Tracker.java
index 6136f79..47efbce 100644
--- a/src/java/org/apache/cassandra/db/lifecycle/Tracker.java
+++ b/src/java/org/apache/cassandra/db/lifecycle/Tracker.java
@@ -505,4 +505,12 @@ public class Tracker
     {
         return view.get();
     }
+
+    @VisibleForTesting
+    public void removeUnsafe(Set toRemove)
+    {
+        Pair result = apply(view -> {
+            return updateLiveSet(toRemove, emptySet()).apply(view);
+        });
+    }
 }
[jira] [Assigned] (CASSANDRA-14076) dtest code style check failed
[ https://issues.apache.org/jira/browse/CASSANDRA-14076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jay Zhuang reassigned CASSANDRA-14076: -- Assignee: Jay Zhuang > dtest code style check failed > - > > Key: CASSANDRA-14076 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14076 > Project: Cassandra > Issue Type: Bug > Components: Testing >Reporter: Jay Zhuang >Assignee: Jay Zhuang > > https://travis-ci.org/cooldoger/cassandra-dtest > {noformat} > $ flake8 --ignore=E501,F811,F812,F822,F823,F831,F841,N8,C9 > --exclude=thrift_bindings,cassandra-thrift . > ./consistency_test.py:547:17: E722 do not use bare except' > ./consistency_test.py:976:49: E251 unexpected spaces around keyword / > parameter equals > ./consistency_test.py:976:51: E251 unexpected spaces around keyword / > parameter equals > ./consistency_test.py:981:63: E703 statement ends with a semicolon > ./consistency_test.py:1037:49: E251 unexpected spaces around keyword / > parameter equals > ./consistency_test.py:1037:51: E251 unexpected spaces around keyword / > parameter equals > ./consistency_test.py:1054:46: E261 at least two spaces before inline comment > ./consistency_test.py:1103:22: E251 unexpected spaces around keyword / > parameter equals > ./consistency_test.py:1103:24: E251 unexpected spaces around keyword / > parameter equals > ./consistency_test.py:1175:22: E251 unexpected spaces around keyword / > parameter equals > ./consistency_test.py:1175:24: E251 unexpected spaces around keyword / > parameter equals > ./counter_tests.py:59:24: E703 statement ends with a semicolon > ./counter_tests.py:383:37: E261 at least two spaces before inline comment > ./dtest.py:586:13: E722 do not use bare except' > ./dtest.py:1130:1: E302 expected 2 blank lines, found 1 > ./nodetool_test.py:9:1: E302 expected 2 blank lines, found 1 > ./nodetool_test.py:78:1: W293 blank line contains whitespace > ./nodetool_test.py:174:45: E261 at least two spaces before inline comment > 
./run_dtests.py:220:54: E221 multiple spaces before operator > ./secondary_indexes_test.py:14:1: F401 'dtest.DtestTimeoutError' imported but > unused > ./secondary_indexes_test.py:17:1: F401 'tools.data.index_is_built' imported > but unused > ./secondary_indexes_test.py:21:1: E302 expected 2 blank lines, found 1 > ./sslnodetonode_test.py:15:1: E302 expected 2 blank lines, found 1 > ./sslnodetonode_test.py:191:1: W293 blank line contains whitespace > ./sslnodetonode_test.py:191:1: W391 blank line at end of file > ./system_keyspaces_test.py:6:1: E302 expected 2 blank lines, found 1 > ./system_keyspaces_test.py:28:59: E241 multiple spaces after ',' > ./system_keyspaces_test.py:50:62: E241 multiple spaces after ',' > ./write_failures_test.py:5:1: F401 'distutils.version.LooseVersion' imported > but unused > ./plugins/dtestcollect.py:1:1: F401 'collections.namedtuple' imported but > unused > ./plugins/dtestcollect.py:3:1: F401 'pprint.pprint' imported but unused > ./plugins/dtestcollect.py:5:1: F401 'inspect' imported but unused > ./plugins/dtestcollect.py:13:1: E302 expected 2 blank lines, found 1 > ./plugins/dtestcollect.py:44:9: E306 expected 1 blank line before a nested > definition, found 0 > ./plugins/dtestcollect.py:62:22: E703 statement ends with a semicolon > ./plugins/dtestcollect.py:64:1: E302 expected 2 blank lines, found 1 > ./plugins/dtesttag.py:1:1: F401 'collections.namedtuple' imported but unused > ./plugins/dtesttag.py:4:1: F401 'pprint.pprint' imported but unused > ./plugins/dtesttag.py:8:1: E302 expected 2 blank lines, found 1 > ./plugins/dtesttag.py:20:1: W293 blank line contains whitespace > ./plugins/dtesttag.py:25:1: W293 blank line contains whitespace > ./plugins/dtestxunit.py:43:1: F401 'doctest' imported but unused > ./plugins/dtestxunit.py:46:1: F401 'traceback' imported but unused > ./plugins/dtestxunit.py:62:1: E302 expected 2 blank lines, found 1 > ./plugins/dtestxunit.py:66:1: E302 expected 2 blank lines, found 1 > 
./plugins/dtestxunit.py:70:1: E302 expected 2 blank lines, found 1 > ./plugins/dtestxunit.py:76:29: E226 missing whitespace around arithmetic > operator > ./plugins/dtestxunit.py:84:1: E302 expected 2 blank lines, found 1 > ./plugins/dtestxunit.py:107:1: E302 expected 2 blank lines, found 1 > ./plugins/dtestxunit.py:126:1: E302 expected 2 blank lines, found 1 > ./plugins/dtestxunit.py:219:32: W503 line break before binary operator > ./plugins/dtestxunit.py:269:25: E126 continuation line over-indented for > hanging indent > ./plugins/dtestxunit.py:277:25: E126 continuation line over-indented for > hanging indent > ./repair_tests/deprecated_repair_test.py:159:9: E741 ambiguous variable name > 'l' > ./repair_tests/incremental_repair_test.py:772:4: W291 trailing whitespace > ./repair_tests/incremental_repair_test.py:773:76: W291 trailing whitespace
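For readers unfamiliar with the codes in the flake8 listing above, here is a minimal sketch of what a clean version of a few of them looks like (E722, E251, E703, E261). The functions `parse_port` and `connect_string` are hypothetical and are not taken from cassandra-dtest; they only illustrate the style rules.

```python
def parse_port(text, default=9042):
    """Return text as an int port, or default if it is not a valid integer."""
    try:
        return int(text)
    except ValueError:  # E722: name the exception instead of a bare "except:"
        return default


def connect_string(host, port=9042):  # E251: no spaces around "=" in keyword defaults
    # E703: the statement below does not end with a semicolon.
    return "{}:{}".format(host, port)  # E261: two spaces before this inline comment


print(connect_string("127.0.0.1", port=parse_port("not-a-port")))
```

Each top-level definition is also preceded by two blank lines, which is what E302 asks for.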
[jira] [Updated] (CASSANDRA-14075) Many sslnodetonode_test.TestNodeToNodeSSLEncryption tests failing with "Please remove properties [optional, enabled] from your cassandra.yaml"
[ https://issues.apache.org/jira/browse/CASSANDRA-14075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Kjellman updated CASSANDRA-14075:
-----------------------------------------
    Status: Ready to Commit (was: Patch Available)

> Many sslnodetonode_test.TestNodeToNodeSSLEncryption tests failing with "Please remove properties [optional, enabled] from your cassandra.yaml"
> ------------------------------------------------------------------
>
> Key: CASSANDRA-14075
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14075
> Project: Cassandra
> Issue Type: Bug
> Reporter: Michael Kjellman
> Assignee: Jason Brown
>
> Many sslnodetonode_test.TestNodeToNodeSSLEncryption dtests are failing on
> 3.11 with an exception on startup due to invalid yaml properties.
>
> Unexpected error in node1 log, error:
> ERROR [main] 2017-11-18 21:01:54,781 CassandraDaemon.java:706 - Exception
> encountered during startup: Invalid yaml. Please remove properties [optional,
> enabled] from your cassandra.yaml
>
> Although ccm was updated in
> https://github.com/pcmanus/ccm/commit/eaaa425b70edb84786924516aee3920d685c0e53
> to include a version check for >= 4.0, enabled and optional are emitted
> unconditionally in the actual dtest itself -- they should also be conditional
> on >= 4.0
> {code:java}
> node.set_configuration_options(values={
>     'server_encryption_options': {
>         'enabled': encryption_enabled,
>         'optional': encryption_optional,
>         'internode_encryption': internode_encryption,
>         'keystore': kspath,
>         'keystore_password': 'cassandra',
>         'truststore': tspath,
>         'truststore_password': 'cassandra',
>         'require_endpoint_verification': endpoint_verification,
>         'require_client_auth': client_auth,
>     }
> })
> {code}
[jira] [Updated] (CASSANDRA-14075) Many sslnodetonode_test.TestNodeToNodeSSLEncryption tests failing with "Please remove properties [optional, enabled] from your cassandra.yaml"
[ https://issues.apache.org/jira/browse/CASSANDRA-14075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Kjellman updated CASSANDRA-14075: - Status: Patch Available (was: Open)
[jira] [Updated] (CASSANDRA-14075) Many sslnodetonode_test.TestNodeToNodeSSLEncryption tests failing with "Please remove properties [optional, enabled] from your cassandra.yaml"
[ https://issues.apache.org/jira/browse/CASSANDRA-14075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Kjellman updated CASSANDRA-14075: - Reviewer: Michael Kjellman
[jira] [Commented] (CASSANDRA-14075) Many sslnodetonode_test.TestNodeToNodeSSLEncryption tests failing with "Please remove properties [optional, enabled] from your cassandra.yaml"
[ https://issues.apache.org/jira/browse/CASSANDRA-14075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273615#comment-16273615 ] Michael Kjellman commented on CASSANDRA-14075: -- looks good! +1
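The CASSANDRA-14075 description asks for `enabled` and `optional` to be emitted only for versions >= 4.0. One way this could look is sketched below. This is not the committed patch: the helper name `server_encryption_options` and the `(major, minor)` version tuple are assumptions made for illustration, and how the real dtest obtains the cluster version is not shown.

```python
def server_encryption_options(version, internode_encryption, kspath, tspath,
                              encryption_enabled=False, encryption_optional=False,
                              endpoint_verification=False, client_auth=False):
    """Build the options dict, adding the 4.0-only keys only when supported.

    version is a hypothetical (major, minor) tuple such as (3, 11) or (4, 0).
    """
    options = {
        'internode_encryption': internode_encryption,
        'keystore': kspath,
        'keystore_password': 'cassandra',
        'truststore': tspath,
        'truststore_password': 'cassandra',
        'require_endpoint_verification': endpoint_verification,
        'require_client_auth': client_auth,
    }
    if version >= (4, 0):
        # Older versions reject these two keys at startup, per the issue description.
        options['enabled'] = encryption_enabled
        options['optional'] = encryption_optional
    return options


opts_311 = server_encryption_options((3, 11), 'all', '/tmp/ks.jks', '/tmp/ts.jks')
opts_40 = server_encryption_options((4, 0), 'all', '/tmp/ks.jks', '/tmp/ts.jks',
                                    encryption_enabled=True)
print(sorted(set(opts_40) - set(opts_311)))
```

The resulting dict would then be passed as the `'server_encryption_options'` value in the `node.set_configuration_options(values={...})` call quoted in the issue.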
[jira] [Commented] (CASSANDRA-14079) Prevent compaction strategies from looping indefinitely
[ https://issues.apache.org/jira/browse/CASSANDRA-14079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273611#comment-16273611 ] Paulo Motta commented on CASSANDRA-14079: - oops, lost during break up of CASSANDRA-13948, sorry about that, will ninja a fix soon! thanks for the heads up! > Prevent compaction strategies from looping indefinitely > --- > > Key: CASSANDRA-14079 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14079 > Project: Cassandra > Issue Type: Improvement >Reporter: Paulo Motta >Assignee: Paulo Motta >Priority: Minor > Fix For: 3.11.2, 4.0 > > > As a result of CASSANDRA-13948, LCS was looping indefinitely trying to > generate the same candidates for SSTables which were not on the tracker. > We should add a protection on compaction strategies against looping > indefinitely to avoid similar bugs in the future. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14079) Prevent compaction strategies from looping indefinitely
[ https://issues.apache.org/jira/browse/CASSANDRA-14079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273603#comment-16273603 ] Joel Knighton commented on CASSANDRA-14079: --- It looks like this broke the build on 3.11/trunk. On trunk only, there's a place in {{AbstractCompactionStrategyTest}} where we pass a {{TableMetadataRef}} instead of a {{TableMetadata}}. On 3.11/trunk, it looks like there's a missing {{removeUnsafe}} test method on {{Tracker}} that {{AbstractCompactionStrategyTest}} uses. It looks like that's missing on all branches, so maybe it just got left out of the commit. [~pauloricardomg] ^ > Prevent compaction strategies from looping indefinitely > --- > > Key: CASSANDRA-14079 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14079 > Project: Cassandra > Issue Type: Improvement >Reporter: Paulo Motta >Assignee: Paulo Motta >Priority: Minor > Fix For: 3.11.2, 4.0 > > > As a result of CASSANDRA-13948, LCS was looping indefinitely trying to > generate the same candidates for SSTables which were not on the tracker. > We should add a protection on compaction strategies against looping > indefinitely to avoid similar bugs in the future. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
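The protection that CASSANDRA-14079 describes can be illustrated with a generic loop guard. The sketch below is an assumption about the general idea only, not the logic of the actual patch: `collect_tasks` and `next_candidates` are hypothetical names, and the real compaction strategies are Java classes with a different interface.

```python
def collect_tasks(next_candidates, max_attempts=100):
    """Call next_candidates() until it returns nothing, refusing to loop forever.

    next_candidates is a hypothetical callable that returns the SSTable names
    to compact next, or an empty collection when there is no more work.
    """
    seen = set()
    tasks = []
    for _ in range(max_attempts):
        candidates = frozenset(next_candidates())
        if not candidates:
            return tasks
        if candidates in seen:
            # The same candidates came back, so no progress is being made.
            raise RuntimeError("same candidates proposed twice: %s" % sorted(candidates))
        seen.add(candidates)
        tasks.append(candidates)
    raise RuntimeError("gave up after %d attempts" % max_attempts)


batches = iter([["a", "b"], ["c"], []])
print(len(collect_tasks(lambda: next(batches))))

try:
    collect_tasks(lambda: ["a", "b"])  # a strategy stuck on the same SSTables
except RuntimeError as error:
    print(error)
```

Failing loudly when the same candidate set is proposed twice turns a silent infinite loop into a visible error, which is the kind of safeguard the ticket argues for.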
[jira] [Updated] (CASSANDRA-14085) Excessive update of ReadLatency metric in digest calculation
[ https://issues.apache.org/jira/browse/CASSANDRA-14085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Jirsa updated CASSANDRA-14085: --- Fix Version/s: (was: 3.0.16) (was: 4.0) 4.x 3.11.x 3.0.x > Excessive update of ReadLatency metric in digest calculation > > > Key: CASSANDRA-14085 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14085 > Project: Cassandra > Issue Type: Bug > Components: Core, Metrics >Reporter: Andrew Whang >Assignee: Andrew Whang >Priority: Minor > Fix For: 3.0.x, 3.11.x, 4.x > > > We noticed an increase in read latency after upgrading to 3.x, specifically > for requests with CL>ONE. It turns out the read latency metric is being > doubly updated for digest calculations. This code > (https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/partitions/UnfilteredPartitionIterators.java#L243) > makes an improper copy of an iterator that's wrapped by MetricRecording, > whose onClose() records the latency of the execution. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-14085) Excessive update of ReadLatency metric in digest calculation
[ https://issues.apache.org/jira/browse/CASSANDRA-14085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Jirsa updated CASSANDRA-14085: --- Component/s: Core > Excessive update of ReadLatency metric in digest calculation > > > Key: CASSANDRA-14085 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14085 > Project: Cassandra > Issue Type: Bug > Components: Core, Metrics >Reporter: Andrew Whang >Assignee: Andrew Whang >Priority: Minor > Fix For: 3.0.x, 3.11.x, 4.x > > > We noticed an increase in read latency after upgrading to 3.x, specifically > for requests with CL>ONE. It turns out the read latency metric is being > doubly updated for digest calculations. This code > (https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/partitions/UnfilteredPartitionIterators.java#L243) > makes an improper copy of an iterator that's wrapped by MetricRecording, > whose onClose() records the latency of the execution. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Assigned] (CASSANDRA-14085) Excessive update of ReadLatency metric in digest calculation
[ https://issues.apache.org/jira/browse/CASSANDRA-14085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Jirsa reassigned CASSANDRA-14085: -- Assignee: Andrew Whang > Excessive update of ReadLatency metric in digest calculation > > > Key: CASSANDRA-14085 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14085 > Project: Cassandra > Issue Type: Bug > Components: Metrics >Reporter: Andrew Whang >Assignee: Andrew Whang >Priority: Minor > Fix For: 3.0.16, 4.0 > > > We noticed an increase in read latency after upgrading to 3.x, specifically > for requests with CL>ONE. It turns out the read latency metric is being > doubly updated for digest calculations. This code > (https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/partitions/UnfilteredPartitionIterators.java#L243) > makes an improper copy of an iterator that's wrapped by MetricRecording, > whose onClose() records the latency of the execution. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-14085) Excessive update of ReadLatency metric in digest calculation
[ https://issues.apache.org/jira/browse/CASSANDRA-14085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Whang updated CASSANDRA-14085: - Fix Version/s: 4.0 3.0.16 Status: Patch Available (was: Open) https://github.com/whangsf/cassandra/commit/2ae3589ce9eefd8699bbd4e29bf1c61a486d394e > Excessive update of ReadLatency metric in digest calculation > > > Key: CASSANDRA-14085 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14085 > Project: Cassandra > Issue Type: Bug > Components: Metrics >Reporter: Andrew Whang >Priority: Minor > Fix For: 3.0.16, 4.0 > > > We noticed an increase in read latency after upgrading to 3.x, specifically > for requests with CL>ONE. It turns out the read latency metric is being > doubly updated for digest calculations. This code > (https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/partitions/UnfilteredPartitionIterators.java#L243) > makes an improper copy of an iterator that's wrapped by MetricRecording, > whose onClose() records the latency of the execution. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-14085) Excessive update of ReadLatency metric in digest calculation
Andrew Whang created CASSANDRA-14085: Summary: Excessive update of ReadLatency metric in digest calculation Key: CASSANDRA-14085 URL: https://issues.apache.org/jira/browse/CASSANDRA-14085 Project: Cassandra Issue Type: Bug Components: Metrics Reporter: Andrew Whang Priority: Minor We noticed an increase in read latency after upgrading to 3.x, specifically for requests with CL>ONE. It turns out the read latency metric is being doubly updated for digest calculations. This code (https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/partitions/UnfilteredPartitionIterators.java#L243) makes an improper copy of an iterator that's wrapped by MetricRecording, whose onClose() records the latency of the execution. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
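The double counting described in CASSANDRA-14085 can be reproduced in miniature. The sketch below is a simplified model, not Cassandra code: `RecordingIterator`, `digest_with_copy` and `digest_without_copy` are hypothetical names, and the close hook stands in for the `onClose()` latency recording mentioned in the description.

```python
class RecordingIterator:
    """Iterator wrapper whose close() fires a metric hook once per call."""

    def __init__(self, rows, on_close):
        self.rows = list(rows)
        self.on_close = on_close

    def __iter__(self):
        return iter(self.rows)

    def close(self):
        self.on_close()


def digest_with_copy(iterator):
    """Problem pattern: the copy inherits the close hook and fires it too."""
    copy = RecordingIterator(iterator.rows, iterator.on_close)
    try:
        return hash(tuple(copy))
    finally:
        copy.close()


def digest_without_copy(iterator):
    """Leaves closing, and therefore metric recording, to the caller."""
    return hash(tuple(iterator))


for digest in (digest_with_copy, digest_without_copy):
    samples = []
    rows = RecordingIterator([1, 2, 3], lambda: samples.append("read"))
    digest(rows)
    rows.close()  # the caller always closes the original iterator
    print(digest.__name__, len(samples))
```

With the copy, one read produces two samples; without it, one read produces one sample.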
[jira] [Updated] (CASSANDRA-3200) Repair: compare all trees together (for a given range/cf) instead of by pair in isolation
[ https://issues.apache.org/jira/browse/CASSANDRA-3200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Blake Eggleston updated CASSANDRA-3200:
---------------------------------------
    Status: Ready to Commit (was: Patch Available)

> Repair: compare all trees together (for a given range/cf) instead of by pair in isolation
> ------------------------------------------------------------------
>
> Key: CASSANDRA-3200
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3200
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Sylvain Lebresne
> Assignee: Marcus Eriksson
> Priority: Minor
> Labels: repair
> Fix For: 4.x
>
> Currently, repair compares merkle trees by pair, in isolation from any other
> tree. Concretely, if I have three nodes A, B and C (RF=3) with A and B in
> sync, but C having some range r inconsistent with both A and B (since those
> are consistent), we will do the following transfers of r:
> A -> C, C -> A, B -> C, C -> B.
> The fact that we do both A -> C and C -> A is fine, because we cannot know
> which of A or C is more up to date. However, the transfer B -> C is
> useless provided we do A -> C, if A and B are in sync. Not doing that transfer
> will be a 25% improvement in that case. With RF=5 and only one node
> inconsistent with all the others, that's almost a 40% improvement, etc.
> Given that this situation of one node not in sync while the others are is
> probably fairly common (one node died so it is behind), this could be a fair
> improvement over what is transferred. In the case where we use repair to
> completely rebuild a node, this will be a dramatic improvement, because it
> will prevent the rebuilt node from getting RF times the data it should get.
[jira] [Commented] (CASSANDRA-3200) Repair: compare all trees together (for a given range/cf) instead of by pair in isolation
[ https://issues.apache.org/jira/browse/CASSANDRA-3200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273511#comment-16273511 ] Blake Eggleston commented on CASSANDRA-3200: The last test run seems to have died. I restarted it [here|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/448/]. Assuming there aren't any related failures, I'm +1. > Repair: compare all trees together (for a given range/cf) instead of by pair > in isolation > - > > Key: CASSANDRA-3200 > URL: https://issues.apache.org/jira/browse/CASSANDRA-3200 > Project: Cassandra > Issue Type: Improvement >Reporter: Sylvain Lebresne >Assignee: Marcus Eriksson >Priority: Minor > Labels: repair > Fix For: 4.x > > > Currently, repair compares Merkle trees by pair, in isolation from any other > tree. What that means concretely is that if I have three nodes A, B and C > (RF=3) with A and B in sync, but C having some range r inconsistent with both > A and B (since those are consistent), we will do the following transfers of r: > A -> C, C -> A, B -> C, C -> B. > The fact that we do both A -> C and C -> A is fine, because we cannot know > which of A or C is more up to date. However, the transfer B -> C is > useless provided we do A -> C if A and B are in sync. Not doing that transfer > will be a 25% improvement in that case. With RF=5 and only one node > inconsistent with all the others, that is almost a 40% improvement, etc... > Given that this situation of one node not in sync while the others are is > probably fairly common (one node died so it is behind), this could be a fair > improvement over what is transferred. In the case where we use repair to > completely rebuild a node, this will be a dramatic improvement, because it > will keep the rebuilt node from getting RF times the data it should get. 
[jira] [Commented] (CASSANDRA-12971) Add CAS option to WRITE test to stress tool
[ https://issues.apache.org/jira/browse/CASSANDRA-12971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273502#comment-16273502 ] Jeff Jirsa commented on CASSANDRA-12971: [~vovodroid] / [~spo...@gmail.com] / [~jay.zhuang] - should this be closed as a duplicate? > Add CAS option to WRITE test to stress tool > --- > > Key: CASSANDRA-12971 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12971 > Project: Cassandra > Issue Type: New Feature > Components: Stress, Tools >Reporter: Vladimir Yudovin >Assignee: Vladimir Yudovin > Attachments: stress-cass.patch > > > If -cas option is present each UPDATE is performed with true IF condition, > thus data is inserted anyway. > It's implemented, if it's needed I proceed with the patch. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-12922) Bloom filter miss counts are not measured correctly
[ https://issues.apache.org/jira/browse/CASSANDRA-12922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273498#comment-16273498 ] Jeff Jirsa commented on CASSANDRA-12922: [~krishnasun] are you still interested in writing the unit test? > Bloom filter miss counts are not measured correctly > --- > > Key: CASSANDRA-12922 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12922 > Project: Cassandra > Issue Type: Bug > Components: Local Write-Read Paths >Reporter: Branimir Lambov >Assignee: Sundar Srinivasan > Labels: lhf > Fix For: 4.x > > Attachments: 12922-trunk.txt > > > Bloom filter hits and misses are evaluated incorrectly in > {{BigTableReader.getPosition}}: we properly record hits, but not misses. In > particular, if we don't find a match for a key in the index, which is where > almost all non-matches will be rejected, [we don't record a bloom filter > false > positive|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/io/sstable/format/big/BigTableReader.java#L228]. > This leads to very misleading output from e.g. {{nodetool tablestats}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
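To illustrate the accounting gap described in CASSANDRA-12922, here is a simplified sketch. It is not BigTableReader code: the "filter" is modelled as a plain set that may contain keys missing from the index, and all names are hypothetical. The point is only where the false-positive counter needs to be bumped.

```java
import java.util.HashSet;
import java.util.Set;

final class BloomStatsSketch {
    // Stand-in for a bloom filter: may claim keys that the index does not hold.
    private final Set<String> filter = new HashSet<>();
    // Stand-in for the sstable index.
    private final Set<String> index = new HashSet<>();

    long truePositives;
    long falsePositives;

    void addKey(String key) {
        filter.add(key);
        index.add(key);
    }

    // Simulates a filter collision: the filter says "maybe", the index says "no".
    void addFilterOnly(String key) {
        filter.add(key);
    }

    boolean lookup(String key) {
        if (!filter.contains(key)) {
            return false;                 // rejected by the filter, no index access
        }
        if (index.contains(key)) {
            truePositives++;              // filter said "maybe" and the key exists
            return true;
        }
        falsePositives++;                 // the step the ticket reports as missing
        return false;
    }
}
```

If the last increment is omitted, every index miss after a filter hit goes uncounted, and the reported false-positive ratio looks better than it is.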
[jira] [Updated] (CASSANDRA-13901) Linux Script for stopping running cassandra and cqlsh
[ https://issues.apache.org/jira/browse/CASSANDRA-13901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Jirsa updated CASSANDRA-13901: --- Status: Awaiting Feedback (was: Open) > Linux Script for stopping running cassandra and cqlsh > - > > Key: CASSANDRA-13901 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13901 > Project: Cassandra > Issue Type: New Feature >Reporter: Akash Sethi >Assignee: Akash Sethi >Priority: Minor > Fix For: 3.11.0 > > Attachments: > 0001-Added-Linux-script-for-stopping-cassandra-and-cqlsh.patch > > > The script for stopping Cassandra and cqlsh if running on any Linux machine. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-13901) Linux Script for stopping running cassandra and cqlsh
[ https://issues.apache.org/jira/browse/CASSANDRA-13901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Jirsa updated CASSANDRA-13901: --- Status: Open (was: Patch Available) > Linux Script for stopping running cassandra and cqlsh > - > > Key: CASSANDRA-13901 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13901 > Project: Cassandra > Issue Type: New Feature >Reporter: Akash Sethi >Assignee: Akash Sethi >Priority: Minor > Fix For: 3.11.0 > > Attachments: > 0001-Added-Linux-script-for-stopping-cassandra-and-cqlsh.patch > > > The script for stopping Cassandra and cqlsh if running on any Linux machine. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Assigned] (CASSANDRA-13901) Linux Script for stopping running cassandra and cqlsh
[ https://issues.apache.org/jira/browse/CASSANDRA-13901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Jirsa reassigned CASSANDRA-13901: -- Assignee: Akash Sethi > Linux Script for stopping running cassandra and cqlsh > - > > Key: CASSANDRA-13901 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13901 > Project: Cassandra > Issue Type: New Feature >Reporter: Akash Sethi >Assignee: Akash Sethi >Priority: Minor > Fix For: 3.11.0 > > Attachments: > 0001-Added-Linux-script-for-stopping-cassandra-and-cqlsh.patch > > > The script for stopping Cassandra and cqlsh if running on any Linux machine. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13901) Linux Script for stopping running cassandra and cqlsh
[ https://issues.apache.org/jira/browse/CASSANDRA-13901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273479#comment-16273479 ] Jeff Jirsa commented on CASSANDRA-13901: I have some concerns here. 1) Cassandra has mechanisms to stop itself (via nodetool), which does a nice clean shutdown, not {{kill -9}} which can potentially lose data in some edge cases, 2) The command to fetch the PID {{ | grep apache-cassandra }} is unlikely to work reliably and safely. It'll probably not match in many environments, and it'll over-match in environments where multiple instances are running. 3) I'm not sure what problem this solves. Can you help explain why you need such utilities? > Linux Script for stopping running cassandra and cqlsh > - > > Key: CASSANDRA-13901 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13901 > Project: Cassandra > Issue Type: New Feature >Reporter: Akash Sethi >Priority: Minor > Fix For: 3.11.0 > > Attachments: > 0001-Added-Linux-script-for-stopping-cassandra-and-cqlsh.patch > > > The script for stopping Cassandra and cqlsh if running on any Linux machine. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13968) Cannot replace a live node on large clusters
[ https://issues.apache.org/jira/browse/CASSANDRA-13968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273471#comment-16273471 ] Jeff Jirsa commented on CASSANDRA-13968: Marking Jason as reviewer since he was silly enough to suggest he may be willing to do it. > Cannot replace a live node on large clusters > > > Key: CASSANDRA-13968 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13968 > Project: Cassandra > Issue Type: Bug > Components: Coordination > Environment: Cassandra 2.1.17, Ubuntu Trusty/Xenial (Linux 3.13, 4.4) >Reporter: Joseph Lynch >Assignee: Joseph Lynch > Labels: gossip > Attachments: > 0001-During-node-replacement-check-for-updates-in-the-tim.patch, > 0002-Only-fail-replacement-if-we-_know_-the-node-is-up.patch > > > During forced node replacements we very frequently (~every time for large > clusters) see: > {noformat} > ERROR [main] 2017-10-17 06:54:35,680 CassandraDaemon.java:583 - Exception > encountered during startup > java.lang.UnsupportedOperationException: Cannot replace a live node... > {noformat} > The old node is dead, the new node that is replacing it thinks it is dead (DN > state), and all other nodes think it is dead (all have the DN state). > However, I believe there are two bugs in the "is live" check that can cause > this error, namely that: > 1. We sleep for > [BROADCAST_INTERVAL|https://github.com/apache/cassandra/blob/943db2488c8b62e1fbe03b132102f0e579c9ae17/src/java/org/apache/cassandra/service/StorageService.java#L905] > (hardcoded 60s on 2.1, on later version configurable but still 60s by > default), but > [check|https://github.com/apache/cassandra/blob/943db2488c8b62e1fbe03b132102f0e579c9ae17/src/java/org/apache/cassandra/service/StorageService.java#L919] > for an update in the last RING_DELAY seconds (typically set to 30s). 
When a > fresh node is joining, in my experience, [the > schema|https://github.com/apache/cassandra/blob/943db2488c8b62e1fbe03b132102f0e579c9ae17/src/java/org/apache/cassandra/service/StorageService.java#L859] > check almost immediately returns true after gossiping with seeds, so in > reality we do not even sleep for RING_DELAY. If operators increase ring delay > past broadcast_interval (as you might do if you think you are victim to the > second bug below), then you guarantee that you will always get the exception > because the gossip update is basically guaranteed to happen in the last > RING_DELAY seconds since you didn't sleep for that duration (you slept for > broadcast). For example if an operator sets ring delay to 300s, then the > check says "oh yea, the last update was 59 seconds ago, which is sooner than > 300s, so fail". > 2. We don't actually check that the node is alive, we just check that a > gossip update has happened in the last X seconds. Sometimes with large > clusters nodes are still converging on the proper generation/version of a > dead node, and the "is live" check prevents an operator from replacing the > node until gossip has settled on the cluster regarding the dead node, which > for large clusters can take a really long time. 
This can be really hurtful to > availability in cloud environments and every time I've seen this error it's > the case that the new node believes that the old node is down (since > [markAlive|https://github.com/apache/cassandra/blob/943db2488c8b62e1fbe03b132102f0e579c9ae17/src/java/org/apache/cassandra/gms/Gossiper.java#L954] > [marks > dead|https://github.com/apache/cassandra/blob/943db2488c8b62e1fbe03b132102f0e579c9ae17/src/java/org/apache/cassandra/gms/Gossiper.java#L962] > first and then triggers a callback to > [realMarkAlive|https://github.com/apache/cassandra/blob/943db2488c8b62e1fbe03b132102f0e579c9ae17/src/java/org/apache/cassandra/gms/Gossiper.java#L975] > which never triggers because the old node is actually down). > I think that #1 is definitely a bug; #2 might be considered an "extra safety" > feature (that you don't allow replacement during gossip convergence), but > given that the operator took the effort to supply the replace_address flag, I > think it's prudent to only fail if we really know something is wrong. > I've attached two patches against 2.1, one that fixes bug #1 and one that > fixes (imo) bug #2. I was thinking for #1 that we may want to prevent the > schema check from exiting the RING_DELAY sleep early but maybe it's just > better to backport configurable broadcast_interval and pick the maximum or > something. If we don't like the way I've worked around #2, maybe I could make > it an option that operators could turn on if they wanted? If folks are happy > with the approach I can attach patches for 2.2, 3.0, and 3.11. > A relevant example of a log showing the first bug
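The timing mismatch in bug #1 of CASSANDRA-13968 can be sketched as a simple comparison. This models only the check as the report describes it (reject if a gossip update was seen within the last RING_DELAY seconds); the class and method names are hypothetical and this is not the StorageService code.

```java
final class ReplaceLiveCheckSketch {
    // The check as described in the report: the replacement is rejected when
    // the last gossip update for the old node is more recent than ringDelay.
    static boolean rejectsReplacement(long secondsSinceLastUpdate, long ringDelaySeconds) {
        return secondsSinceLastUpdate < ringDelaySeconds;
    }

    public static void main(String[] args) {
        long sinceUpdate = 59; // update arrived just before the ~60s broadcast sleep

        // Typical ring delay of 30s: 59s ago is outside the window, so it passes.
        System.out.println(rejectsReplacement(sinceUpdate, 30));

        // Operator raises ring delay to 300s: the same update is now "recent",
        // so the replacement fails even though the old node is down.
        System.out.println(rejectsReplacement(sinceUpdate, 300));
    }
}
```

Because the node sleeps for the broadcast interval rather than for RING_DELAY, any ring delay larger than the time actually slept makes the most recent update fall inside the window.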
[jira] [Updated] (CASSANDRA-13968) Cannot replace a live node on large clusters
[ https://issues.apache.org/jira/browse/CASSANDRA-13968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Jirsa updated CASSANDRA-13968: --- Reviewer: Jason Brown > Cannot replace a live node on large clusters > > > Key: CASSANDRA-13968 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13968 > Project: Cassandra > Issue Type: Bug > Components: Coordination > Environment: Cassandra 2.1.17, Ubuntu Trusty/Xenial (Linux 3.13, 4.4) >Reporter: Joseph Lynch >Assignee: Joseph Lynch > Labels: gossip > Attachments: > 0001-During-node-replacement-check-for-updates-in-the-tim.patch, > 0002-Only-fail-replacement-if-we-_know_-the-node-is-up.patch > > > During forced node replacements we very frequently (~every time for large > clusters) see: > {noformat} > ERROR [main] 2017-10-17 06:54:35,680 CassandraDaemon.java:583 - Exception > encountered during startup > java.lang.UnsupportedOperationException: Cannot replace a live node... > {noformat} > The old node is dead, the new node that is replacing it thinks it is dead (DN > state), and all other nodes think it is dead (all have the DN state). > However, I believe there are two bugs in the "is live" check that can cause > this error, namely that: > 1. We sleep for > [BROADCAST_INTERVAL|https://github.com/apache/cassandra/blob/943db2488c8b62e1fbe03b132102f0e579c9ae17/src/java/org/apache/cassandra/service/StorageService.java#L905] > (hardcoded 60s on 2.1, on later version configurable but still 60s by > default), but > [check|https://github.com/apache/cassandra/blob/943db2488c8b62e1fbe03b132102f0e579c9ae17/src/java/org/apache/cassandra/service/StorageService.java#L919] > for an update in the last RING_DELAY seconds (typically set to 30s). 
When a > fresh node is joining, in my experience, [the > schema|https://github.com/apache/cassandra/blob/943db2488c8b62e1fbe03b132102f0e579c9ae17/src/java/org/apache/cassandra/service/StorageService.java#L859] > check almost immediately returns true after gossiping with seeds, so in > reality we do not even sleep for RING_DELAY. If operators increase ring delay > past broadcast_interval (as you might do if you think you are victim to the > second bug below), then you guarantee that you will always get the exception > because the gossip update is basically guaranteed to happen in the last > RING_DELAY seconds since you didn't sleep for that duration (you slept for > broadcast). For example if an operator sets ring delay to 300s, then the > check says "oh yea, the last update was 59 seconds ago, which is sooner than > 300s, so fail". > 2. We don't actually check that the node is alive, we just check that a > gossip update has happened in the last X seconds. Sometimes with large > clusters nodes are still converging on the proper generation/version of a > dead node, and the "is live" check prevents an operator from replacing the > node until gossip has settled on the cluster regarding the dead node, which > for large clusters can take a really long time. 
This can be really hurtful to > availability in cloud environments and every time I've seen this error it's > the case that the new node believes that the old node is down (since > [markAlive|https://github.com/apache/cassandra/blob/943db2488c8b62e1fbe03b132102f0e579c9ae17/src/java/org/apache/cassandra/gms/Gossiper.java#L954] > [marks > dead|https://github.com/apache/cassandra/blob/943db2488c8b62e1fbe03b132102f0e579c9ae17/src/java/org/apache/cassandra/gms/Gossiper.java#L962] > first and then triggers a callback to > [realMarkAlive|https://github.com/apache/cassandra/blob/943db2488c8b62e1fbe03b132102f0e579c9ae17/src/java/org/apache/cassandra/gms/Gossiper.java#L975] > which never triggers because the old node is actually down). > I think that #1 is definitely a bug; #2 might be considered an "extra safety" > feature (that you don't allow replacement during gossip convergence), but > given that the operator took the effort to supply the replace_address flag, I > think it's prudent to only fail if we really know something is wrong. > I've attached two patches against 2.1, one that fixes bug #1 and one that > fixes (imo) bug #2. I was thinking for #1 that we may want to prevent the > schema check from exiting the RING_DELAY sleep early but maybe it's just > better to backport configurable broadcast_interval and pick the maximum or > something. If we don't like the way I've worked around #2, maybe I could make > it an option that operators could turn on if they wanted? If folks are happy > with the approach I can attach patches for 2.2, 3.0, and 3.11. > A relevant example of a log showing the first bug (in this case the node that > was being replaced was drained moving it to shutdown before replacement, and > ring delay was
[jira] [Updated] (CASSANDRA-13968) Cannot replace a live node on large clusters
[ https://issues.apache.org/jira/browse/CASSANDRA-13968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Jirsa updated CASSANDRA-13968: --- Component/s: Coordination > Cannot replace a live node on large clusters > > > Key: CASSANDRA-13968 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13968 > Project: Cassandra > Issue Type: Bug > Components: Coordination > Environment: Cassandra 2.1.17, Ubuntu Trusty/Xenial (Linux 3.13, 4.4) >Reporter: Joseph Lynch >Assignee: Joseph Lynch > Labels: gossip > Attachments: > 0001-During-node-replacement-check-for-updates-in-the-tim.patch, > 0002-Only-fail-replacement-if-we-_know_-the-node-is-up.patch > > > During forced node replacements we very frequently (~every time for large > clusters) see: > {noformat} > ERROR [main] 2017-10-17 06:54:35,680 CassandraDaemon.java:583 - Exception > encountered during startup > java.lang.UnsupportedOperationException: Cannot replace a live node... > {noformat} > The old node is dead, the new node that is replacing it thinks it is dead (DN > state), and all other nodes think it is dead (all have the DN state). > However, I believe there are two bugs in the "is live" check that can cause > this error, namely that: > 1. We sleep for > [BROADCAST_INTERVAL|https://github.com/apache/cassandra/blob/943db2488c8b62e1fbe03b132102f0e579c9ae17/src/java/org/apache/cassandra/service/StorageService.java#L905] > (hardcoded 60s on 2.1, on later version configurable but still 60s by > default), but > [check|https://github.com/apache/cassandra/blob/943db2488c8b62e1fbe03b132102f0e579c9ae17/src/java/org/apache/cassandra/service/StorageService.java#L919] > for an update in the last RING_DELAY seconds (typically set to 30s). 
When a > fresh node is joining, in my experience, [the > schema|https://github.com/apache/cassandra/blob/943db2488c8b62e1fbe03b132102f0e579c9ae17/src/java/org/apache/cassandra/service/StorageService.java#L859] > check almost immediately returns true after gossiping with seeds, so in > reality we do not even sleep for RING_DELAY. If operators increase ring delay > past broadcast_interval (as you might do if you think you are victim to the > second bug below), then you guarantee that you will always get the exception > because the gossip update is basically guaranteed to happen in the last > RING_DELAY seconds since you didn't sleep for that duration (you slept for > broadcast). For example if an operator sets ring delay to 300s, then the > check says "oh yea, the last update was 59 seconds ago, which is sooner than > 300s, so fail". > 2. We don't actually check that the node is alive, we just check that a > gossip update has happened in the last X seconds. Sometimes with large > clusters nodes are still converging on the proper generation/version of a > dead node, and the "is live" check prevents an operator from replacing the > node until gossip has settled on the cluster regarding the dead node, which > for large clusters can take a really long time. 
This can be really hurtful to > availability in cloud environments and every time I've seen this error it's > the case that the new node believes that the old node is down (since > [markAlive|https://github.com/apache/cassandra/blob/943db2488c8b62e1fbe03b132102f0e579c9ae17/src/java/org/apache/cassandra/gms/Gossiper.java#L954] > [marks > dead|https://github.com/apache/cassandra/blob/943db2488c8b62e1fbe03b132102f0e579c9ae17/src/java/org/apache/cassandra/gms/Gossiper.java#L962] > first and then triggers a callback to > [realMarkAlive|https://github.com/apache/cassandra/blob/943db2488c8b62e1fbe03b132102f0e579c9ae17/src/java/org/apache/cassandra/gms/Gossiper.java#L975] > which never triggers because the old node is actually down). > I think that #1 is definitely a bug; #2 might be considered an "extra safety" > feature (that you don't allow replacement during gossip convergence), but > given that the operator took the effort to supply the replace_address flag, I > think it's prudent to only fail if we really know something is wrong. > I've attached two patches against 2.1, one that fixes bug #1 and one that > fixes (imo) bug #2. I was thinking for #1 that we may want to prevent the > schema check from exiting the RING_DELAY sleep early but maybe it's just > better to backport configurable broadcast_interval and pick the maximum or > something. If we don't like the way I've worked around #2, maybe I could make > it an option that operators could turn on if they wanted? If folks are happy > with the approach I can attach patches for 2.2, 3.0, and 3.11. > A relevant example of a log showing the first bug (in this case the node that > was being replaced was drained moving it to shutdown before replacement, and > ring delay
[jira] [Updated] (CASSANDRA-13974) Bad prefix matching when figuring out data directory for an sstable
[ https://issues.apache.org/jira/browse/CASSANDRA-13974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Jirsa updated CASSANDRA-13974: --- Reviewer: Jeff Jirsa > Bad prefix matching when figuring out data directory for an sstable > --- > > Key: CASSANDRA-13974 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13974 > Project: Cassandra > Issue Type: Bug >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson > Fix For: 3.11.x, 4.x > > > We do a "startsWith" check when getting data directory for an sstable, we > should match including File.separator -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13974) Bad prefix matching when figuring out data directory for an sstable
[ https://issues.apache.org/jira/browse/CASSANDRA-13974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273465#comment-16273465 ] Jeff Jirsa commented on CASSANDRA-13974: I'll take review on this, but it'll be a bit. If someone beats me to it, I won't mind ([~stefania_alborghetti] or [~bdeggleston] or [~pauloricardomg]) > Bad prefix matching when figuring out data directory for an sstable > --- > > Key: CASSANDRA-13974 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13974 > Project: Cassandra > Issue Type: Bug >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson > Fix For: 3.11.x, 4.x > > > We do a "startsWith" check when getting data directory for an sstable, we > should match including File.separator -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
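The prefix problem in CASSANDRA-13974 is easy to reproduce in isolation. The sketch below uses made-up directory names and hypothetical helper methods, not the actual Cassandra lookup code; it only shows why a bare "startsWith" over-matches and how including File.separator avoids that.

```java
import java.io.File;

final class DataDirMatchSketch {
    // Naive check: "/data/d1" is also a string prefix of "/data/d10/...".
    static boolean naiveMatch(String dataDir, String sstablePath) {
        return sstablePath.startsWith(dataDir);
    }

    // Check including the separator, so only paths strictly below dataDir match.
    static boolean separatorMatch(String dataDir, String sstablePath) {
        String prefix = dataDir.endsWith(File.separator) ? dataDir : dataDir + File.separator;
        return sstablePath.startsWith(prefix);
    }

    public static void main(String[] args) {
        String sep = File.separator;
        String d1 = sep + "data" + sep + "d1";
        String inD10 = sep + "data" + sep + "d10" + sep + "ks" + sep + "tbl" + sep + "big-Data.db";

        System.out.println(naiveMatch(d1, inD10));      // wrongly attributes the file to d1
        System.out.println(separatorMatch(d1, inD10));  // correctly rejects it
    }
}
```

With data directories such as `d1` and `d10` configured side by side, the naive check can assign an sstable to the wrong directory; appending the separator before comparing removes the ambiguity.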
[jira] [Commented] (CASSANDRA-13851) Allow existing nodes to use all peers in shadow round
[ https://issues.apache.org/jira/browse/CASSANDRA-13851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273459#comment-16273459 ] Jeff Jirsa commented on CASSANDRA-13851: Who wants to review a gossip patch? [~jasobrown] or [~jkni], you two have touched it most recently? > Allow existing nodes to use all peers in shadow round > - > > Key: CASSANDRA-13851 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13851 > Project: Cassandra > Issue Type: Bug > Components: Lifecycle >Reporter: Kurt Greaves >Assignee: Kurt Greaves > Fix For: 3.11.x, 4.x > > > In CASSANDRA-10134 we made collision checks necessary on every startup. A > side-effect was introduced that then requires a nodes seeds to be contacted > on every startup. Prior to this change an existing node could start up > regardless whether it could contact a seed node or not (because > checkForEndpointCollision() was only called for bootstrapping nodes). > Now if a nodes seeds are removed/deleted/fail it will no longer be able to > start up until live seeds are configured (or itself is made a seed), even > though it already knows about the rest of the ring. This is inconvenient for > operators and has the potential to cause some nasty surprises and increase > downtime. > One solution would be to use all a nodes existing peers as seeds in the > shadow round. Not a Gossip guru though so not sure of implications. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14065) Docs: Fix page width exceeding the viewport
[ https://issues.apache.org/jira/browse/CASSANDRA-14065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273454#comment-16273454 ] Jeff Jirsa commented on CASSANDRA-14065: I genuinely have no idea how to review this. It looks reasonable, but it's been a very long time since I tried to do cross-browser/cross-device CSS validation. If it's not referenced, maybe there's no harm anyway? > Docs: Fix page width exceeding the viewport > --- > > Key: CASSANDRA-14065 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14065 > Project: Cassandra > Issue Type: Improvement > Components: Documentation and Website >Reporter: Stefan Podkowinski > Fix For: 4.x > > Attachments: 14065-trunk.patch > > > Ticket for [#175|https://github.com/apache/cassandra/pull/175] / > [#176|https://github.com/apache/cassandra/pull/176]. > The layout seems to adapt more naturally after applying the patch, with less > overlapping content. Seems to fix a real issue with our template. > However, I'm not really sure about the extra.css changes, as the compiled > website (built via jekyll) doesn't seem to reference the css file anywhere. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14060) Separate CorruptSSTableException and FSError handling policies
[ https://issues.apache.org/jira/browse/CASSANDRA-14060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273442#comment-16273442 ] Jeff Jirsa commented on CASSANDRA-14060: I'll take review on this, but it'll be a few days. Feel free to replace me if another committer will get to it faster than me. > Separate CorruptSSTableException and FSError handling policies > -- > > Key: CASSANDRA-14060 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14060 > Project: Cassandra > Issue Type: Improvement > Components: Configuration >Reporter: Jay Zhuang >Assignee: Jay Zhuang >Priority: Minor > > Currently, if > [{{disk_failure_policy}}|https://github.com/apache/cassandra/blob/trunk/conf/cassandra.yaml#L230] > is set to {{stop}} (default), StorageService will shut down for {{FSError}}, > but not {{CorruptSSTableException}} > [DefaultFSErrorHandler.java:40|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/DefaultFSErrorHandler.java#L40]. > But when we use policy {{die}}, it behaves differently: the JVM will be killed > for both {{FSError}} and {{CorruptSSTableException}} > [JVMStabilityInspector.java:63|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/utils/JVMStabilityInspector.java#L63]: > ||{{disk_failure_policy}}|| hit {{FSError}} Exception || hit > {{CorruptSSTableException}} || > |{{stop}}| (/) stop | (x) not stop | > |{{die}}| (/) die | (/) die | > We see {{CorruptSSTableException}} from time to time in our production environment, but > mostly it's *not* because of a disk issue. So I would suggest having a > separate policy for CorruptSSTable. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
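The policy table in the CASSANDRA-14060 description can be restated as a small decision function. This is only an encoding of the table as given in the ticket; the enum and method names are invented for the sketch and do not correspond to DefaultFSErrorHandler or JVMStabilityInspector.

```java
final class DiskFailureMatrixSketch {
    enum Policy { STOP, DIE }
    enum Failure { FS_ERROR, CORRUPT_SSTABLE }
    enum Outcome { STOP_SERVICES, KILL_JVM, KEEP_RUNNING }

    // Mirrors the table in the ticket:
    //   stop: FSError -> stop, CorruptSSTableException -> not stop
    //   die:  FSError -> die,  CorruptSSTableException -> die
    static Outcome react(Policy policy, Failure failure) {
        if (policy == Policy.DIE) {
            return Outcome.KILL_JVM;
        }
        return failure == Failure.FS_ERROR ? Outcome.STOP_SERVICES : Outcome.KEEP_RUNNING;
    }
}
```

Written this way, the asymmetry the reporter points out is visible: only under `stop` do the two failure types diverge, which is the motivation for a separate corrupt-sstable policy.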
[jira] [Updated] (CASSANDRA-14060) Separate CorruptSSTableException and FSError handling policies
[ https://issues.apache.org/jira/browse/CASSANDRA-14060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Jirsa updated CASSANDRA-14060: --- Reviewer: Jeff Jirsa > Separate CorruptSSTableException and FSError handling policies > -- > > Key: CASSANDRA-14060 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14060 > Project: Cassandra > Issue Type: Improvement > Components: Configuration >Reporter: Jay Zhuang >Assignee: Jay Zhuang >Priority: Minor > > Currently, if > [{{disk_failure_policy}}|https://github.com/apache/cassandra/blob/trunk/conf/cassandra.yaml#L230] > is set to {{stop}} (default), StorageService will shut down for {{FSError}}, > but not {{CorruptSSTableException}} > [DefaultFSErrorHandler.java:40|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/DefaultFSErrorHandler.java#L40]. > But when we use policy {{die}}, it behaves differently: the JVM will be killed > for both {{FSError}} and {{CorruptSSTableException}} > [JVMStabilityInspector.java:63|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/utils/JVMStabilityInspector.java#L63]: > ||{{disk_failure_policy}}|| hit {{FSError}} Exception || hit > {{CorruptSSTableException}} || > |{{stop}}| (/) stop | (x) not stop | > |{{die}}| (/) die | (/) die | > We see {{CorruptSSTableException}} from time to time in our production environment, but > mostly it's *not* because of a disk issue. So I would suggest having a > separate policy for CorruptSSTable. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Assigned] (CASSANDRA-14055) Index redistribution breaks SASI index
[ https://issues.apache.org/jira/browse/CASSANDRA-14055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Jirsa reassigned CASSANDRA-14055: -- Assignee: Ludovic Boutros > Index redistribution breaks SASI index > -- > > Key: CASSANDRA-14055 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14055 > Project: Cassandra > Issue Type: Bug > Components: sasi >Reporter: Ludovic Boutros >Assignee: Ludovic Boutros > Labels: patch > Fix For: 3.11.x > > Attachments: CASSANDRA-14055.patch, CASSANDRA-14055.patch, > CASSANDRA-14055.patch > > > During index redistribution process, a new view is created. > During this creation, old indexes should be released. > But, new indexes are "attached" to the same SSTable as the old indexes. > This leads to the deletion of the last SASI index file and breaks the index. > The issue is in this function : > [https://github.com/apache/cassandra/blob/9ee44db49b13d4b4c91c9d6332ce06a6e2abf944/src/java/org/apache/cassandra/index/sasi/conf/view/View.java#L62] -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14055) Index redistribution breaks SASI index
[ https://issues.apache.org/jira/browse/CASSANDRA-14055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273438#comment-16273438 ] Jeff Jirsa commented on CASSANDRA-14055: [~ifesdjeen] are you still reviewing SASI patches or do we need to find someone else?
[jira] [Commented] (CASSANDRA-14059) Root logging formatter broken in dtests
[ https://issues.apache.org/jira/browse/CASSANDRA-14059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273437#comment-16273437 ]

Jeff Jirsa commented on CASSANDRA-14059:
----------------------------------------

[~spo...@gmail.com] can I mark you as reviewer here as well?

> Root logging formatter broken in dtests
> ---------------------------------------
>
>                 Key: CASSANDRA-14059
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14059
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Testing
>            Reporter: Joel Knighton
>            Assignee: Joel Knighton
>            Priority: Minor
>
> Since the ccm dependency in dtest was bumped to {{3.1.0}} in {{7cc06a086f89ed76499837558ff263d84337acba}}, when dtests are run with --nologcapture, errors of the following form are printed:
> {code}
> Traceback (most recent call last):
>   File "/usr/lib64/python2.7/logging/__init__.py", line 861, in emit
>     msg = self.format(record)
>   File "/usr/lib64/python2.7/logging/__init__.py", line 734, in format
>     return fmt.format(record)
>   File "/usr/lib64/python2.7/logging/__init__.py", line 469, in format
>     s = self._fmt % record.__dict__
> KeyError: 'current_test'
> Logged from file dtest.py, line 485
> {code}
> This is because CCM no longer installs a basic root logger configuration, which is probably a more correct behavior than what it did prior to this change. Now, dtest installs its own basic root logger configuration which writes to 'dtest.log' using the formatter {{'%(asctime)s,%(msecs)d %(name)s %(current_test)s %(levelname)s %(message)s'}}. This means that anything logging a message must provide the current_test key in its extras map. The dtest {{debug}} and {{warning}} functions do this, but logging from dependencies doesn't, producing these {{KeyError}}s.
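One possible way to avoid this kind of {{KeyError}} is to attach a `logging.Filter` that fills in a default `current_test` value before the formatter runs. The sketch below only illustrates that general technique with the standard library; it is not the change made in dtest, and `DefaultCurrentTestFilter`, `build_handler` and `demo` are names invented for this example.

```python
import io
import logging


class DefaultCurrentTestFilter(logging.Filter):
    """Give every record a 'current_test' attribute so the format string never fails."""

    def filter(self, record: logging.LogRecord) -> bool:
        if not hasattr(record, "current_test"):
            record.current_test = "-"  # placeholder for records from dependencies
        return True  # never drop the record


def build_handler(stream) -> logging.Handler:
    handler = logging.StreamHandler(stream)
    # Same format string as quoted in the ticket.
    handler.setFormatter(logging.Formatter(
        "%(asctime)s,%(msecs)d %(name)s %(current_test)s %(levelname)s %(message)s"))
    # A filter on the handler runs for every record the handler receives,
    # including records that come from other libraries' loggers.
    handler.addFilter(DefaultCurrentTestFilter())
    return handler


def demo() -> str:
    """Log once without and once with the extra key; return the captured output."""
    stream = io.StringIO()
    logger = logging.getLogger("some.dependency")
    logger.propagate = False
    logger.setLevel(logging.INFO)
    logger.addHandler(build_handler(stream))
    logger.info("no extras supplied")
    logger.info("with extras", extra={"current_test": "test_x"})
    return stream.getvalue()
```

Putting the filter on the handler rather than on a logger matters here: logger-level filters are not applied to records propagated from child loggers, whereas handler-level filters see everything the handler emits.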
[jira] [Commented] (CASSANDRA-14061) trunk eclipse-warnings
[ https://issues.apache.org/jira/browse/CASSANDRA-14061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273436#comment-16273436 ] Jeff Jirsa commented on CASSANDRA-14061: [~spo...@gmail.com] interested in being the official reviewer on this? > trunk eclipse-warnings > -- > > Key: CASSANDRA-14061 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14061 > Project: Cassandra > Issue Type: Bug > Components: Testing >Reporter: Jay Zhuang >Assignee: Jay Zhuang >Priority: Minor > > {noformat} > eclipse-warnings: > [mkdir] Created dir: /home/ubuntu/cassandra/build/ecj > [echo] Running Eclipse Code Analysis. Output logged to > /home/ubuntu/cassandra/build/ecj/eclipse_compiler_checks.txt > [java] -- > [java] 1. ERROR in > /home/ubuntu/cassandra/src/java/org/apache/cassandra/io/sstable/SSTableIdentityIterator.java > (at line 59) > [java] return new SSTableIdentityIterator(sstable, key, > partitionLevelDeletion, file.getPath(), iterator); > [java] > ^^^ > [java] Potential resource leak: 'iterator' may not be closed at this > location > [java] -- > [java] 2. ERROR in > /home/ubuntu/cassandra/src/java/org/apache/cassandra/io/sstable/SSTableIdentityIterator.java > (at line 79) > [java] return new SSTableIdentityIterator(sstable, key, > partitionLevelDeletion, dfile.getPath(), iterator); > [java] > > [java] Potential resource leak: 'iterator' may not be closed at this > location > [java] -- > [java] 2 problems (2 errors) > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13917) COMPACT STORAGE inserts on tables without clusterings accept hidden column1 and value columns
[ https://issues.apache.org/jira/browse/CASSANDRA-13917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273425#comment-16273425 ]

Jeff Jirsa commented on CASSANDRA-13917:
----------------------------------------

[~ifesdjeen] are you able to review this as the reporter?

> COMPACT STORAGE inserts on tables without clusterings accept hidden column1 and value columns
> ---------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-13917
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13917
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Alex Petrov
>            Assignee: Aleksandr Sorokoumov
>            Priority: Minor
>              Labels: lhf
>             Fix For: 3.0.x, 3.11.x
>
> Test for the issue:
> {code}
> @Test
> public void testCompactStorage() throws Throwable
> {
>     createTable("CREATE TABLE %s (a int PRIMARY KEY, b int, c int) WITH COMPACT STORAGE");
>     assertInvalid("INSERT INTO %s (a, b, c, column1) VALUES (?, ?, ?, ?)", 1, 1, 1, ByteBufferUtil.bytes('a'));
>     // This one fails with "Some clustering keys are missing: column1", which is still wrong
>     assertInvalid("INSERT INTO %s (a, b, c, value) VALUES (?, ?, ?, ?)", 1, 1, 1, ByteBufferUtil.bytes('a'));
>     assertInvalid("INSERT INTO %s (a, b, c, column1, value) VALUES (?, ?, ?, ?, ?)", 1, 1, 1, ByteBufferUtil.bytes('a'), ByteBufferUtil.bytes('b'));
>     assertEmpty(execute("SELECT * FROM %s"));
> }
> {code}
> Thankfully, these writes are no-ops, even though they succeed.
> {{value}} and {{column1}} should be completely hidden. Fixing this one should be as easy as just adding validations.
[jira] [Assigned] (CASSANDRA-10726) Read repair inserts should not be blocking
[ https://issues.apache.org/jira/browse/CASSANDRA-10726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Blake Eggleston reassigned CASSANDRA-10726:
-------------------------------------------

    Assignee: Blake Eggleston  (was: Xiaolong Jiang)

> Read repair inserts should not be blocking
> ------------------------------------------
>
>                 Key: CASSANDRA-10726
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10726
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Coordination
>            Reporter: Richard Low
>            Assignee: Blake Eggleston
>             Fix For: 4.x
>
> Today, if there’s a digest mismatch in a foreground read repair, the insert to update out-of-date replicas is blocking. This means that, if it fails, the read fails with a timeout. If a node is dropping writes (maybe it is overloaded or the mutation stage is backed up for some other reason), all reads to a replica set could fail. Further, replicas dropping writes get more out of sync, so they will require more read repair.
> The comment on the code for why the writes are blocking is:
> {code}
> // wait for the repair writes to be acknowledged, to minimize impact on any replica that's
> // behind on writes in case the out-of-sync row is read multiple times in quick succession
> {code}
> but the bad side effect is that reads time out. Either the writes should not be blocking or we should return success for the read even if the write times out.
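The second option in the ticket (return success for the read even if the repair write times out) can be sketched generically as follows. This is a toy model using Python's standard library, not Cassandra's coordinator code; `read_with_best_effort_repair` and `slow_repair` are hypothetical names, and the repair write is represented by a plain callable.

```python
import concurrent.futures
import time
from typing import Callable, Tuple


def read_with_best_effort_repair(read: Callable[[], str],
                                 repair: Callable[[], None],
                                 repair_timeout_s: float) -> Tuple[str, bool]:
    """Return (data, repair_acked). A slow repair write never fails the read."""
    data = read()
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(repair)
    try:
        future.result(timeout=repair_timeout_s)
        acked = True
    except concurrent.futures.TimeoutError:
        acked = False  # repair write still in flight; the read succeeds anyway
    finally:
        pool.shutdown(wait=False)  # do not block the caller on the repair write
    return data, acked


def slow_repair() -> None:
    """Simulates a replica that is slow to acknowledge the repair write."""
    time.sleep(0.3)
```

The sketch still waits up to `repair_timeout_s`, which keeps the benefit described in the code comment when replicas are healthy, while a backed-up replica degrades only the repair and not the read.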
[jira] [Commented] (CASSANDRA-14075) Many sslnodetonode_test.TestNodeToNodeSSLEncryption tests failing with "Please remove properties [optional, enabled] from your cassandra.yaml"
[ https://issues.apache.org/jira/browse/CASSANDRA-14075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273188#comment-16273188 ]

Jason Brown commented on CASSANDRA-14075:
-----------------------------------------

[~mkjellman]'s evaluation is correct: in CASSANDRA-10404, I didn't correctly support pre-4.0 in this dtest. Here is a [dtest patch|https://github.com/jasobrown/cassandra-dtest/tree/14075] that checks the cluster version and only adds the new props if it's greater than or equal to 4.0.

Here are runs of the dtest patch against both 3.11 and trunk:

||3.11||trunk||
|[utests & dtests|https://circleci.com/gh/jasobrown/workflows/cassandra/tree/14075-3.11]|[utests & dtests|https://circleci.com/gh/jasobrown/workflows/cassandra/tree/14075-trunk]|

Note: I also ran this locally with jdk1.8.0_151, and started getting this warning:

{noformat}
Warning: The JKS keystore uses a proprietary format. It is recommended to migrate to PKCS12 which is an industry standard format using "keytool -importkeystore -srckeystore /tmp/tmpICn9py/ca.keystore -destkeystore /tmp/tmpICn9py/ca.keystore -deststoretype pkcs12".
{noformat}

I've also updated {{sslkeygen.py}} in this patch with a trivial fix to eliminate the warning.

> Many sslnodetonode_test.TestNodeToNodeSSLEncryption tests failing with "Please remove properties [optional, enabled] from your cassandra.yaml"
> ----------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-14075
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14075
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Michael Kjellman
>            Assignee: Jason Brown
>
> Many sslnodetonode_test.TestNodeToNodeSSLEncryption dtests are failing on 3.11 with an exception on startup due to invalid yaml properties.
> Unexpected error in node1 log, error:
> ERROR [main] 2017-11-18 21:01:54,781 CassandraDaemon.java:706 - Exception encountered during startup: Invalid yaml. Please remove properties [optional, enabled] from your cassandra.yaml
> Although ccm was updated in https://github.com/pcmanus/ccm/commit/eaaa425b70edb84786924516aee3920d685c0e53 to include a version check for >= 4.0, enabled and optional are emitted unconditionally in the actual dtest itself -- they should also be conditional on >= 4.0
> {code:java}
> node.set_configuration_options(values={
>     'server_encryption_options': {
>         'enabled': encryption_enabled,
>         'optional': encryption_optional,
>         'internode_encryption': internode_encryption,
>         'keystore': kspath,
>         'keystore_password': 'cassandra',
>         'truststore': tspath,
>         'truststore_password': 'cassandra',
>         'require_endpoint_verification': endpoint_verification,
>         'require_client_auth': client_auth,
>     }
> })
> {code}
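The version check described in this comment could look roughly like the sketch below. It is not the code from the linked dtest patch: `server_encryption_options` is a hypothetical helper, and representing the cluster version as a `(major, minor)` tuple is an assumption made to keep the example self-contained.

```python
from typing import Any, Dict, Tuple


def server_encryption_options(cluster_version: Tuple[int, int],
                              base_options: Dict[str, Any],
                              enabled: bool, optional: bool) -> Dict[str, Any]:
    """Build the options dict, adding 'enabled'/'optional' only for >= 4.0."""
    options = dict(base_options)  # copy so the caller's dict is not mutated
    if cluster_version >= (4, 0):
        # Per the ticket, pre-4.0 nodes reject these two properties at startup.
        options["enabled"] = enabled
        options["optional"] = optional
    return options
```

The returned dict would then be passed as the `'server_encryption_options'` value in the `set_configuration_options` call quoted in the ticket.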
[jira] [Commented] (CASSANDRA-13983) Support a means of logging all queries as they were invoked
[ https://issues.apache.org/jira/browse/CASSANDRA-13983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273156#comment-16273156 ]

Blake Eggleston commented on CASSANDRA-13983:
---------------------------------------------

+1 with the recent changes. Thanks for dividing the fixes between a few commits.

> Support a means of logging all queries as they were invoked
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-13983
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13983
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: CQL, Observability, Testing, Tools
>            Reporter: Ariel Weisberg
>            Assignee: Ariel Weisberg
>             Fix For: 4.0
>
> For correctness testing it's useful to be able to capture production traffic so that it can be replayed against both the old and new versions of Cassandra while comparing the results.
> Implementing this functionality once inside the database is high performance and presents less operational complexity.
> In [this patch|https://github.com/apache/cassandra/pull/169] there is an implementation of a full query log that uses chronicle-queue (Apache licensed; the Maven artifacts are labeled incorrectly in some cases; dependencies are also Apache licensed) to implement a rotating log of queries.
> * Single thread asynchronously writes log entries to disk to reduce impact on query latency
> * Heap memory usage bounded by a weighted queue with configurable maximum weight sitting in front of logging thread
> * If the weighted queue is full producers can be blocked or samples can be dropped
> * Disk utilization is bounded by deleting old log segments once a configurable size is reached
> * The on disk serialization uses a flexible schema binary format (chronicle-wire) making it easy to skip unrecognized fields, add new ones, and omit old ones.
> * Can be enabled and configured via JMX, disabled, and reset (delete on disk data); logging path is configurable via both JMX and YAML
> * Introduce new {{fqltool}} in /bin that currently implements {{Dump}} which can dump in a human readable format full query logs as well as follow active full query logs
> Follow up work:
> * Introduce new {{fqltool}} command Replay which can replay N full query logs to two different clusters and compare the result and check for inconsistencies. <- Actively working on getting this done
> * Log not just queries but their results to facilitate a comparison between the original query result and the replayed result. <- Really just don't have a specific use case at the moment
> * "Consistent" query logging allowing replay to fully replicate the original order of execution and completion even in the face of races (including CAS). <- This is more speculative
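The "weighted queue with configurable maximum weight" idea from the list above can be illustrated with a small stand-alone class. This is a simplified sketch, not the implementation in the patch: `WeightedDropQueue` is a hypothetical name, weights are supplied by the caller, and only the "drop samples when full" mode is modelled (the blocking mode is omitted).

```python
import collections
import threading
from typing import Deque, Optional, Tuple


class WeightedDropQueue:
    """Bounded by total weight rather than item count; drops when full."""

    def __init__(self, max_weight: int) -> None:
        self._max_weight = max_weight
        self._weight = 0
        self._items: Deque[Tuple[str, int]] = collections.deque()
        self._lock = threading.Lock()
        self.dropped = 0

    def offer(self, item: str, weight: int) -> bool:
        """Enqueue if the weight budget allows; otherwise count a dropped sample."""
        with self._lock:
            if self._weight + weight > self._max_weight:
                self.dropped += 1
                return False
            self._items.append((item, weight))
            self._weight += weight
            return True

    def poll(self) -> Optional[str]:
        """Called by the single writer thread; frees the item's weight."""
        with self._lock:
            if not self._items:
                return None
            item, weight = self._items.popleft()
            self._weight -= weight
            return item
```

Bounding by weight (for example, the approximate byte size of a query) rather than by entry count is what keeps heap usage predictable when query sizes vary widely.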
[jira] [Updated] (CASSANDRA-13983) Support a means of logging all queries as they were invoked
[ https://issues.apache.org/jira/browse/CASSANDRA-13983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Blake Eggleston updated CASSANDRA-13983: Status: Ready to Commit (was: Patch Available)
[jira] [Updated] (CASSANDRA-13308) Gossip breaks, Hint files not being deleted on nodetool decommission
[ https://issues.apache.org/jira/browse/CASSANDRA-13308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Jirsa updated CASSANDRA-13308: --- Component/s: Hints > Gossip breaks, Hint files not being deleted on nodetool decommission > > > Key: CASSANDRA-13308 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13308 > Project: Cassandra > Issue Type: Bug > Components: Hints, Streaming and Messaging > Environment: Using Cassandra version 3.0.9 >Reporter: Arijit >Assignee: Jeff Jirsa > Fix For: 3.0.14, 3.11.0, 4.0 > > Attachments: 28207.stack, logs, logs_decommissioned_node > > > How to reproduce the issue I'm seeing: > Shut down Cassandra on one node of the cluster and wait until we accumulate a > ton of hints. Start Cassandra on the node and immediately run "nodetool > decommission" on it. > The node streams its replicas and marks itself as DECOMMISSIONED, but other > nodes do not seem to see this message. "nodetool status" shows the > decommissioned node in state "UL" on all other nodes (it is also present in > system.peers), and Cassandra logs show that gossip tasks on nodes are not > proceeding (number of pending tasks keeps increasing). Jstack suggests that a > gossip task is blocked on hints dispatch (I can provide traces if this is not > obvious). Because the cluster is large and there are a lot of hints, this is > taking a while. > On inspecting "/var/lib/cassandra/hints" on the nodes, I see a bunch of hint > files for the decommissioned node. Documentation seems to suggest that these > hints should be deleted during "nodetool decommission", but it does not seem > to be the case here. This is the bug being reported. > To recover from this scenario, if I manually delete hint files on the nodes, > the hints dispatcher threads throw a bunch of exceptions and the > decommissioned node is now in state "DL" (perhaps it missed some gossip > messages?). 
The node is still in my "system.peers" table.
> Restarting Cassandra on all nodes after this step does not fix the issue (the node remains in the peers table). In fact, after this point the decommissioned node is in state "DN".
[jira] [Updated] (CASSANDRA-13740) Orphan hint file gets created while node is being removed from cluster
[ https://issues.apache.org/jira/browse/CASSANDRA-13740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Jirsa updated CASSANDRA-13740: --- Component/s: Hints > Orphan hint file gets created while node is being removed from cluster > -- > > Key: CASSANDRA-13740 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13740 > Project: Cassandra > Issue Type: Bug > Components: Core, Hints >Reporter: Jaydeepkumar Chovatia >Assignee: Jaydeepkumar Chovatia >Priority: Minor > Fix For: 3.0.x, 3.11.x > > Attachments: 13740-3.0.15.txt, gossip_hang_test.py > > > I have found this new issue during my test, whenever node is being removed > then hint file for that node gets written and stays inside the hint directory > forever. I debugged the code and found that it is due to the race condition > between [HintsWriteExecutor.java::flush | > https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsWriteExecutor.java#L195] > and [HintsWriteExecutor.java::closeWriter | > https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsWriteExecutor.java#L106] > . 
> > *Time t1* Node is down, as a result Hints are being written by > [HintsWriteExecutor.java::flush | > https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsWriteExecutor.java#L195] > *Time t2* Node is removed from cluster as a result it calls > [HintsService.java-exciseStore | > https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsService.java#L327] > which removes hint files for the node being removed > *Time t3* Mutation stage keeps pumping Hints through [HintService.java::write > | > https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsService.java#L145] > which again calls [HintsWriteExecutor.java::flush | > https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsWriteExecutor.java#L215] > and new orphan file gets created > I was writing a new dtest for {CASSANDRA-13562, CASSANDRA-13308} and that > helped me reproduce this new bug. I will submit patch for this new dtest > later. > I also tried following to check how this orphan hint file responds: > 1. I tried {{nodetool truncatehints }} but it fails as node is no > longer part of the ring > 2. I then tried {{nodetool truncatehints}}, that still doesn’t remove hint > file because it is not yet included in the [dispatchDequeue | > https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsStore.java#L53] > Reproducible steps: > Please find dTest python file {{gossip_hang_test.py}} attached which > reproduces this bug. > Solution: > This is due to race condition as mentioned above. Since > {{HintsWriteExecutor.java}} creates thread pool with only 1 worker, so > solution becomes little simple. 
Whenever we [HintsService.java::excise|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsService.java#L303] a host, just store it in memory, and check for an already-evicted host inside [HintsWriteExecutor.java::flush|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsWriteExecutor.java#L215]. If an already-evicted host is found, ignore its hints.
> Jaydeep
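The proposed fix (remember excised hosts and have the flush path skip them) can be modelled in a few lines. This is a toy illustration of the idea, not the attached patch: `HintBuffer` and its methods are hypothetical stand-ins for the hints classes, and a lock is used here only because the toy has no single-threaded executor to serialize calls.

```python
import threading
from typing import Dict, List, Set


class HintBuffer:
    """Toy stand-in for the hint write path; not Cassandra's classes."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._excised: Set[str] = set()
        self.files: Dict[str, List[bytes]] = {}  # host id -> hints "on disk"

    def excise(self, host_id: str) -> None:
        """Remove the host's hint files and remember that it was removed."""
        with self._lock:
            self._excised.add(host_id)
            self.files.pop(host_id, None)

    def flush(self, host_id: str, hint: bytes) -> bool:
        """Write a hint unless the host was excised; return whether it was written."""
        with self._lock:
            if host_id in self._excised:
                return False  # late hint for a removed node: no orphan file is created
            self.files.setdefault(host_id, []).append(hint)
            return True
```

In the t1/t2/t3 timeline from the ticket, the flush at t3 would return `False` instead of recreating a file for the removed node.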
[jira] [Updated] (CASSANDRA-14080) Handling 0 size hint files during start
[ https://issues.apache.org/jira/browse/CASSANDRA-14080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Jirsa updated CASSANDRA-14080: --- Component/s: Hints > Handling 0 size hint files during start > --- > > Key: CASSANDRA-14080 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14080 > Project: Cassandra > Issue Type: Bug > Components: Hints >Reporter: Aleksandr Ivanov > > Continuation of CASSANDRA-12728 bug. > Problem: Cassandra didn't start due to 0 size hints files > Log form v3.0.14: > {code:java} > INFO [main] 2017-11-28 19:10:13,554 StorageService.java:575 - Cassandra > version: 3.0.14 > INFO [main] 2017-11-28 19:10:13,555 StorageService.java:576 - Thrift API > version: 20.1.0 > INFO [main] 2017-11-28 19:10:13,555 StorageService.java:577 - CQL supported > versions: 3.4.0 (default: 3.4.0) > ERROR [main] 2017-11-28 19:10:13,592 CassandraDaemon.java:710 - Exception > encountered during startup > org.apache.cassandra.io.FSReadError: java.io.EOFException > at > org.apache.cassandra.hints.HintsDescriptor.readFromFile(HintsDescriptor.java:142) > ~[apache-cassandra-3.0.14.jar:3.0.14] > at > java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) > ~[na:1.8.0_141] > at > java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175) > ~[na:1.8.0_141] > at java.util.Iterator.forEachRemaining(Iterator.java:116) > ~[na:1.8.0_141] > at > java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801) > ~[na:1.8.0_141] > at > java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481) > ~[na:1.8.0_141] > at > java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471) > ~[na:1.8.0_141] > at > java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708) > ~[na:1.8.0_141] > at > java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) > ~[na:1.8.0_141] > at > java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499) > 
~[na:1.8.0_141] > at org.apache.cassandra.hints.HintsCatalog.load(HintsCatalog.java:65) > ~[apache-cassandra-3.0.14.jar:3.0.14] > at > org.apache.cassandra.hints.HintsService.(HintsService.java:88) > ~[apache-cassandra-3.0.14.jar:3.0.14] > at > org.apache.cassandra.hints.HintsService.(HintsService.java:63) > ~[apache-cassandra-3.0.14.jar:3.0.14] > at > org.apache.cassandra.service.StorageProxy.(StorageProxy.java:121) > ~[apache-cassandra-3.0.14.jar:3.0.14] > at java.lang.Class.forName0(Native Method) ~[na:1.8.0_141] > at java.lang.Class.forName(Class.java:264) ~[na:1.8.0_141] > at > org.apache.cassandra.service.StorageService.initServer(StorageService.java:585) > ~[apache-cassandra-3.0.14.jar:3.0.14] > at > org.apache.cassandra.service.StorageService.initServer(StorageService.java:570) > ~[apache-cassandra-3.0.14.jar:3.0.14] > at > org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:346) > [apache-cassandra-3.0.14.jar:3.0.14] > at > org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:569) > [apache-cassandra-3.0.14.jar:3.0.14] > at > org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:697) > [apache-cassandra-3.0.14.jar:3.0.14] > Caused by: java.io.EOFException: null > at java.io.RandomAccessFile.readInt(RandomAccessFile.java:803) > ~[na:1.8.0_141] > at > org.apache.cassandra.hints.HintsDescriptor.deserialize(HintsDescriptor.java:237) > ~[apache-cassandra-3.0.14.jar:3.0.14] > at > org.apache.cassandra.hints.HintsDescriptor.readFromFile(HintsDescriptor.java:138) > ~[apache-cassandra-3.0.14.jar:3.0.14] > ... 20 common frames omitted > {code} > After several 0 size hints files deletion Cassandra started successfully. > Jeff Jirsa added a comment - Yesterday > Aleksandr Ivanov can you open a new JIRA and link it back to this one? 
It's possible that the original patch didn't consider 0 byte files (I don't have time to go back and look at the commit, and it was long enough ago that I've forgotten) - were all of your files 0 bytes?
> Not all; 8 to 10 hint files had 0 size.
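The ticket does not say how zero-length hint files should be handled. One plausible approach, shown here purely as an assumption, is to skip them before any descriptor is read. The helper names below are hypothetical, and the `.hints` suffix is taken from the file name shown in CASSANDRA-12728.

```python
import os
import tempfile
from typing import List


def loadable_hint_files(hints_dir: str) -> List[str]:
    """Return names of .hints files that are non-empty, in sorted order."""
    names = []
    for name in sorted(os.listdir(hints_dir)):
        path = os.path.join(hints_dir, name)
        if not name.endswith(".hints") or not os.path.isfile(path):
            continue
        if os.path.getsize(path) == 0:
            continue  # empty file: there is no descriptor to deserialize
        names.append(name)
    return names


def demo() -> List[str]:
    """Create one empty and one non-empty file and list the loadable ones."""
    with tempfile.TemporaryDirectory() as d:
        open(os.path.join(d, "a.hints"), "wb").close()  # zero-length file
        with open(os.path.join(d, "b.hints"), "wb") as f:
            f.write(b"\x00\x00\x00\x01")
        return loadable_hint_files(d)
```

Whether such files should be skipped, logged, or deleted is a policy question for the ticket; the sketch only shows that the check can happen before the read that throws the `EOFException`.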
[jira] [Updated] (CASSANDRA-12728) Handling partially written hint files
[ https://issues.apache.org/jira/browse/CASSANDRA-12728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Jirsa updated CASSANDRA-12728: --- Component/s: Hints > Handling partially written hint files > - > > Key: CASSANDRA-12728 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12728 > Project: Cassandra > Issue Type: Bug > Components: Hints >Reporter: Sharvanath Pathak >Assignee: Garvit Juniwal > Labels: lhf > Fix For: 3.0.14, 3.11.0, 4.0 > > Attachments: CASSANDRA-12728.patch > > > {noformat} > ERROR [HintsDispatcher:1] 2016-09-28 17:44:43,397 > HintsDispatchExecutor.java:225 - Failed to dispatch hints file > d5d7257c-9f81-49b2-8633-6f9bda6e3dea-1474892654160-1.hints: file is corrupted > ({}) > org.apache.cassandra.io.FSReadError: java.io.EOFException > at > org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:282) > ~[apache-cassandra-3.0.6.jar:3.0.6] > at > org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:252) > ~[apache-cassandra-3.0.6.jar:3.0.6] > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > ~[apache-cassandra-3.0.6.jar:3.0.6] > at > org.apache.cassandra.hints.HintsDispatcher.sendHints(HintsDispatcher.java:156) > ~[apache-cassandra-3.0.6.jar:3.0.6] > at > org.apache.cassandra.hints.HintsDispatcher.sendHintsAndAwait(HintsDispatcher.java:137) > ~[apache-cassandra-3.0.6.jar:3.0.6] > at > org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:119) > ~[apache-cassandra-3.0.6.jar:3.0.6] > at > org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:91) > ~[apache-cassandra-3.0.6.jar:3.0.6] > at > org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:259) > [apache-cassandra-3.0.6.jar:3.0.6] > at > org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:242) > [apache-cassandra-3.0.6.jar:3.0.6] > at > 
org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:220) > [apache-cassandra-3.0.6.jar:3.0.6] > at > org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:199) > [apache-cassandra-3.0.6.jar:3.0.6] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > [na:1.8.0_77] > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > [na:1.8.0_77] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > [na:1.8.0_77] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > [na:1.8.0_77] > at java.lang.Thread.run(Thread.java:745) [na:1.8.0_77] > Caused by: java.io.EOFException: null > at > org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:68) > ~[apache-cassandra-3.0.6.jar:3.0.6] > at > org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:60) > ~[apache-cassandra-3.0.6.jar:3.0.6] > at > org.apache.cassandra.hints.ChecksummedDataInput.readFully(ChecksummedDataInput.java:126) > ~[apache-cassandra-3.0.6.jar:3.0.6] > at > org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:402) > ~[apache-cassandra-3.0.6.jar:3.0.6] > at > org.apache.cassandra.hints.HintsReader$BuffersIterator.readBuffer(HintsReader.java:310) > ~[apache-cassandra-3.0.6.jar:3.0.6] > at > org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNextInternal(HintsReader.java:301) > ~[apache-cassandra-3.0.6.jar:3.0.6] > at > org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:278) > ~[apache-cassandra-3.0.6.jar:3.0.6] > ... 15 common frames omitted > {noformat} > We've found out that the hint file was truncated because there was a hard > reboot around the time of last write to the file. I think we basically need > to handle partially written hint files. 
Also, the CRC file does not exist in > this case (probably because the node crashed while writing the hints file). Maybe > ignoring and cleaning up such partially written hint files can be a way to > fix this?
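The suggestion above, tolerating a hints file whose tail was cut off by a hard reboot, can be sketched roughly as follows. This is only an illustration under stated assumptions: `TruncationTolerantReader` is a hypothetical helper, not Cassandra's `HintsReader`, and the record layout (a 4-byte length followed by the payload, with no checksums) is invented for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not Cassandra's HintsReader. The record layout
// ([4-byte big-endian length][payload]) is an assumption for illustration.
final class TruncationTolerantReader
{
    // Returns every complete record and drops a truncated tail instead of failing.
    static List<byte[]> readRecords(byte[] fileContents)
    {
        List<byte[]> records = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(fileContents)))
        {
            while (true)
            {
                int length;
                try
                {
                    length = in.readInt(); // EOFException if fewer than 4 bytes remain
                }
                catch (EOFException e)
                {
                    break; // clean end of file, or a length header cut short
                }
                // A negative length, or one that overruns the file, means the
                // last write never completed: stop and keep what was read so far.
                if (length < 0 || length > in.available())
                    break;
                byte[] payload = new byte[length];
                in.readFully(payload);
                records.add(payload);
            }
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
        return records;
    }
}
```

The design choice here is to treat a short read at the tail as the end of the file rather than as corruption of the whole file, so hints written before the crash could still be dispatched. Whether dropping the tail is acceptable is a policy decision that the ticket leaves open.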
[jira] [Commented] (CASSANDRA-14080) Handling 0 size hint files during start
[ https://issues.apache.org/jira/browse/CASSANDRA-14080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273111#comment-16273111 ] Jeff Jirsa commented on CASSANDRA-14080: Probably: CASSANDRA-13740 > Handling 0 size hint files during start > --- > > Key: CASSANDRA-14080 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14080 > Project: Cassandra > Issue Type: Bug >Reporter: Aleksandr Ivanov > > Continuation of CASSANDRA-12728 bug. > Problem: Cassandra didn't start due to 0 size hints files > Log form v3.0.14: > {code:java} > INFO [main] 2017-11-28 19:10:13,554 StorageService.java:575 - Cassandra > version: 3.0.14 > INFO [main] 2017-11-28 19:10:13,555 StorageService.java:576 - Thrift API > version: 20.1.0 > INFO [main] 2017-11-28 19:10:13,555 StorageService.java:577 - CQL supported > versions: 3.4.0 (default: 3.4.0) > ERROR [main] 2017-11-28 19:10:13,592 CassandraDaemon.java:710 - Exception > encountered during startup > org.apache.cassandra.io.FSReadError: java.io.EOFException > at > org.apache.cassandra.hints.HintsDescriptor.readFromFile(HintsDescriptor.java:142) > ~[apache-cassandra-3.0.14.jar:3.0.14] > at > java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) > ~[na:1.8.0_141] > at > java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175) > ~[na:1.8.0_141] > at java.util.Iterator.forEachRemaining(Iterator.java:116) > ~[na:1.8.0_141] > at > java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801) > ~[na:1.8.0_141] > at > java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481) > ~[na:1.8.0_141] > at > java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471) > ~[na:1.8.0_141] > at > java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708) > ~[na:1.8.0_141] > at > java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) > ~[na:1.8.0_141] > at > 
java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499) > ~[na:1.8.0_141] > at org.apache.cassandra.hints.HintsCatalog.load(HintsCatalog.java:65) > ~[apache-cassandra-3.0.14.jar:3.0.14] > at > org.apache.cassandra.hints.HintsService.(HintsService.java:88) > ~[apache-cassandra-3.0.14.jar:3.0.14] > at > org.apache.cassandra.hints.HintsService.(HintsService.java:63) > ~[apache-cassandra-3.0.14.jar:3.0.14] > at > org.apache.cassandra.service.StorageProxy.(StorageProxy.java:121) > ~[apache-cassandra-3.0.14.jar:3.0.14] > at java.lang.Class.forName0(Native Method) ~[na:1.8.0_141] > at java.lang.Class.forName(Class.java:264) ~[na:1.8.0_141] > at > org.apache.cassandra.service.StorageService.initServer(StorageService.java:585) > ~[apache-cassandra-3.0.14.jar:3.0.14] > at > org.apache.cassandra.service.StorageService.initServer(StorageService.java:570) > ~[apache-cassandra-3.0.14.jar:3.0.14] > at > org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:346) > [apache-cassandra-3.0.14.jar:3.0.14] > at > org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:569) > [apache-cassandra-3.0.14.jar:3.0.14] > at > org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:697) > [apache-cassandra-3.0.14.jar:3.0.14] > Caused by: java.io.EOFException: null > at java.io.RandomAccessFile.readInt(RandomAccessFile.java:803) > ~[na:1.8.0_141] > at > org.apache.cassandra.hints.HintsDescriptor.deserialize(HintsDescriptor.java:237) > ~[apache-cassandra-3.0.14.jar:3.0.14] > at > org.apache.cassandra.hints.HintsDescriptor.readFromFile(HintsDescriptor.java:138) > ~[apache-cassandra-3.0.14.jar:3.0.14] > ... 20 common frames omitted > {code} > After several 0 size hints files deletion Cassandra started successfully. > Jeff Jirsa added a comment - Yesterday > Aleksandr Ivanov can you open a new JIRA and link it back to this one? 
It's > possible that the original patch didn't consider 0 byte files (I don't have > time to go back and look at the commit, and it was long enough ago that I've > forgotten) - were all of your files 0 bytes? > Not all; 8 to 10 hints files had 0 size.
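The manual workaround described in this report (deleting the 0 size hints files before startup) amounts to skipping empty files when the hints directory is scanned. A minimal sketch of that idea follows; `HintFileFilter` and both of its methods are hypothetical names, and this is not necessarily how the fix referenced above is implemented.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: decides which files in a hints directory are worth
// handing to a descriptor reader. Not part of Cassandra.
final class HintFileFilter
{
    // A file is loadable only if it looks like a hints file and is not empty.
    static boolean isLoadable(String fileName, long sizeInBytes)
    {
        return fileName.endsWith(".hints") && sizeInBytes > 0;
    }

    // Lists loadable hints files; returns an empty list if the directory is missing.
    static List<File> loadableHintFiles(File hintsDirectory)
    {
        List<File> result = new ArrayList<>();
        File[] files = hintsDirectory.listFiles();
        if (files == null)
            return result;
        for (File file : files)
        {
            if (file.isFile() && isLoadable(file.getName(), file.length()))
                result.add(file);
        }
        return result;
    }
}
```

A real fix would presumably also log or delete the skipped files, since an empty hints file carries no descriptor and cannot be replayed.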
[jira] [Updated] (CASSANDRA-14079) Prevent compaction strategies from looping indefinitely
[ https://issues.apache.org/jira/browse/CASSANDRA-14079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paulo Motta updated CASSANDRA-14079: Resolution: Fixed Fix Version/s: 4.0 3.11.2 Status: Resolved (was: Ready to Commit) > Prevent compaction strategies from looping indefinitely > --- > > Key: CASSANDRA-14079 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14079 > Project: Cassandra > Issue Type: Improvement >Reporter: Paulo Motta >Assignee: Paulo Motta >Priority: Minor > Fix For: 3.11.2, 4.0 > > > As a result of CASSANDRA-13948, LCS was looping indefinitely trying to > generate the same candidates for SSTables which were not on the tracker. > We should add a protection on compaction strategies against looping > indefinitely to avoid similar bugs in the future. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14079) Prevent compaction strategies from looping indefinitely
[ https://issues.apache.org/jira/browse/CASSANDRA-14079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273092#comment-16273092 ] Paulo Motta commented on CASSANDRA-14079: - Committed as {{c253ed4fa7b7b5667879bb41be09fe9658224c4e}} to cassandra-3.11 and merged up to trunk. Thanks for the review! > Prevent compaction strategies from looping indefinitely > --- > > Key: CASSANDRA-14079 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14079 > Project: Cassandra > Issue Type: Improvement >Reporter: Paulo Motta >Assignee: Paulo Motta >Priority: Minor > Fix For: 3.11.2, 4.0 > > > As a result of CASSANDRA-13948, LCS was looping indefinitely trying to > generate the same candidates for SSTables which were not on the tracker. > We should add a protection on compaction strategies against looping > indefinitely to avoid similar bugs in the future.
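The protection described in this ticket can be illustrated outside Cassandra with a small loop guard: remember the previous candidate set and give up once the strategy proposes the same set again without having acquired it. The sketch below is a simplified stand-in, assuming a generic supplier and acquire function; `CandidateLoopGuard` and its parameters are hypothetical and replace the strategy's candidate selection and the tracker's `tryModify` only for illustration.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical, generic version of the loop guard; not Cassandra code.
final class CandidateLoopGuard
{
    // Returns the first candidate list that could be acquired, or null when
    // there is nothing to do or the same candidates come back a second time.
    static <T> List<T> nextAcquirable(Supplier<List<T>> nextCandidates, Predicate<List<T>> tryAcquire)
    {
        List<T> previousCandidate = null;
        while (true)
        {
            List<T> candidate = nextCandidates.get();
            if (candidate.isEmpty())
                return null;
            // Same candidates as the failed attempt: retrying now would spin
            // forever, so bail out and let the caller retry later.
            if (candidate.equals(previousCandidate))
                return null;
            if (tryAcquire.test(candidate))
                return candidate;
            previousCandidate = candidate;
        }
    }
}
```

Without the equality check, a supplier that keeps returning the same unacquirable list would never leave the `while (true)` loop, which is the failure mode the ticket describes.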
[3/3] cassandra git commit: Merge branch 'cassandra-3.11' into trunk
Merge branch 'cassandra-3.11' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a01019d2 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a01019d2 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a01019d2 Branch: refs/heads/trunk Commit: a01019d2c80d6cada5751fe23a7504ce549d2517 Parents: 4190468 c253ed4 Author: Paulo Motta Authored: Fri Dec 1 05:07:40 2017 +1100 Committer: Paulo Motta Committed: Fri Dec 1 05:07:40 2017 +1100 -- CHANGES.txt | 1 + .../DateTieredCompactionStrategy.java | 16 ++- .../compaction/LeveledCompactionStrategy.java | 12 ++ .../db/compaction/LeveledManifest.java | 22 ++- .../SizeTieredCompactionStrategy.java | 12 ++ .../TimeWindowCompactionStrategy.java | 12 ++ .../AbstractCompactionStrategyTest.java | 144 +++ 7 files changed, 217 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/a01019d2/CHANGES.txt -- diff --cc CHANGES.txt index 4456af5,ce279f2..009dcb5 --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,168 -1,5 +1,169 @@@ +4.0 + * Fix flaky SecondaryIndexManagerTest.assert[Not]MarkedAsBuilt (CASSANDRA-13965) + * Make LWTs send resultset metadata on every request (CASSANDRA-13992) + * Fix flaky indexWithFailedInitializationIsNotQueryableAfterPartialRebuild (CASSANDRA-13963) + * Introduce leaf-only iterator (CASSANDRA-9988) + * Upgrade Guava to 23.3 and Airline to 0.8 (CASSANDRA-13997) + * Allow only one concurrent call to StatusLogger (CASSANDRA-12182) + * Refactoring to specialised functional interfaces (CASSANDRA-13982) + * Speculative retry should allow more friendly params (CASSANDRA-13876) + * Throw exception if we send/receive repair messages to incompatible nodes (CASSANDRA-13944) + * Replace usages of MessageDigest with Guava's Hasher (CASSANDRA-13291) + * Add nodetool cmd to print hinted handoff window (CASSANDRA-13728) + * Fix some alerts raised by static analysis (CASSANDRA-13799) + * Checksum 
sstable metadata (CASSANDRA-13321, CASSANDRA-13593) + * Add result set metadata to prepared statement MD5 hash calculation (CASSANDRA-10786) + * Refactor GcCompactionTest to avoid boxing (CASSANDRA-13941) + * Expose recent histograms in JmxHistograms (CASSANDRA-13642) + * Fix buffer length comparison when decompressing in netty-based streaming (CASSANDRA-13899) + * Properly close StreamCompressionInputStream to release any ByteBuf (CASSANDRA-13906) + * Add SERIAL and LOCAL_SERIAL support for cassandra-stress (CASSANDRA-13925) + * LCS needlessly checks for L0 STCS candidates multiple times (CASSANDRA-12961) + * Correctly close netty channels when a stream session ends (CASSANDRA-13905) + * Update lz4 to 1.4.0 (CASSANDRA-13741) + * Optimize Paxos prepare and propose stage for local requests (CASSANDRA-13862) + * Throttle base partitions during MV repair streaming to prevent OOM (CASSANDRA-13299) + * Use compaction threshold for STCS in L0 (CASSANDRA-13861) + * Fix problem with min_compress_ratio: 1 and disallow ratio < 1 (CASSANDRA-13703) + * Add extra information to SASI timeout exception (CASSANDRA-13677) + * Add incremental repair support for --hosts, --force, and subrange repair (CASSANDRA-13818) + * Rework CompactionStrategyManager.getScanners synchronization (CASSANDRA-13786) + * Add additional unit tests for batch behavior, TTLs, Timestamps (CASSANDRA-13846) + * Add keyspace and table name in schema validation exception (CASSANDRA-13845) + * Emit metrics whenever we hit tombstone failures and warn thresholds (CASSANDRA-13771) + * Make netty EventLoopGroups daemon threads (CASSANDRA-13837) + * Race condition when closing stream sessions (CASSANDRA-13852) + * NettyFactoryTest is failing in trunk on macOS (CASSANDRA-13831) + * Allow changing log levels via nodetool for related classes (CASSANDRA-12696) + * Add stress profile yaml with LWT (CASSANDRA-7960) + * Reduce memory copies and object creations when acting on ByteBufs (CASSANDRA-13789) + * Simplify mx4j 
configuration (Cassandra-13578) + * Fix trigger example on 4.0 (CASSANDRA-13796) + * Force minumum timeout value (CASSANDRA-9375) + * Use netty for streaming (CASSANDRA-12229) + * Use netty for internode messaging (CASSANDRA-8457) + * Add bytes repaired/unrepaired to nodetool tablestats (CASSANDRA-13774) + * Don't delete incremental repair sessions if they still have sstables (CASSANDRA-13758) + * Fix pending repair manager index out of bounds check (CASSANDRA-13769) + * Don't use RangeFetchMapCalculator when RF=1 (CASSANDRA-13576) + * Don't optimise trivial ranges in RangeFetchMapCalculator (CASSANDRA-1
[1/3] cassandra git commit: Prevent compaction strategies from looping indefinitely
Repository: cassandra Updated Branches: refs/heads/cassandra-3.11 14e46e462 -> c253ed4fa refs/heads/trunk 41904684b -> a01019d2c Prevent compaction strategies from looping indefinitely Patch by Paulo Motta; Reviewed by Marcus Eriksson for CASSANDRA-14079 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c253ed4f Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c253ed4f Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c253ed4f Branch: refs/heads/cassandra-3.11 Commit: c253ed4fa7b7b5667879bb41be09fe9658224c4e Parents: 14e46e4 Author: Paulo Motta Authored: Sat Nov 25 01:55:35 2017 +1100 Committer: Paulo Motta Committed: Fri Dec 1 05:07:31 2017 +1100 -- CHANGES.txt | 1 + .../DateTieredCompactionStrategy.java | 16 ++- .../compaction/LeveledCompactionStrategy.java | 22 ++- .../db/compaction/LeveledManifest.java | 22 ++- .../SizeTieredCompactionStrategy.java | 12 ++ .../TimeWindowCompactionStrategy.java | 12 ++ .../AbstractCompactionStrategyTest.java | 144 +++ 7 files changed, 222 insertions(+), 7 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/c253ed4f/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index fc18dc3..ce279f2 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 3.11.2 + * Prevent compaction strategies from looping indefinitely (CASSANDRA-14079) * Cache disk boundaries (CASSANDRA-13215) * Add asm jar to build.xml for maven builds (CASSANDRA-11193) * Round buffer size to powers of 2 for the chunk cache (CASSANDRA-13897) http://git-wip-us.apache.org/repos/asf/cassandra/blob/c253ed4f/src/java/org/apache/cassandra/db/compaction/DateTieredCompactionStrategy.java -- diff --git a/src/java/org/apache/cassandra/db/compaction/DateTieredCompactionStrategy.java b/src/java/org/apache/cassandra/db/compaction/DateTieredCompactionStrategy.java index 729ddc0..bb9f4b9 100644 --- 
a/src/java/org/apache/cassandra/db/compaction/DateTieredCompactionStrategy.java +++ b/src/java/org/apache/cassandra/db/compaction/DateTieredCompactionStrategy.java @@ -73,6 +73,7 @@ public class DateTieredCompactionStrategy extends AbstractCompactionStrategy @SuppressWarnings("resource") public AbstractCompactionTask getNextBackgroundTask(int gcBefore) { +List previousCandidate = null; while (true) { List latestBucket = getNextBackgroundSSTables(gcBefore); @@ -80,9 +81,20 @@ public class DateTieredCompactionStrategy extends AbstractCompactionStrategy if (latestBucket.isEmpty()) return null; +// Already tried acquiring references without success. It means there is a race with +// the tracker but candidate SSTables were not yet replaced in the compaction strategy manager +if (latestBucket.equals(previousCandidate)) +{ +logger.warn("Could not acquire references for compacting SSTables {} which is not a problem per se," + +"unless it happens frequently, in which case it must be reported. Will retry later.", +latestBucket); +return null; +} + LifecycleTransaction modifier = cfs.getTracker().tryModify(latestBucket, OperationType.COMPACTION); if (modifier != null) return new CompactionTask(cfs, modifier, gcBefore); +previousCandidate = latestBucket; } } @@ -170,6 +182,8 @@ public class DateTieredCompactionStrategy extends AbstractCompactionStrategy // no need to convert to collection if had an Iterables.max(), but not present in standard toolkit, and not worth adding List list = new ArrayList<>(); Iterables.addAll(list, cfs.getSSTables(SSTableSet.LIVE)); +if (list.isEmpty()) +return 0; return Collections.max(list, (o1, o2) -> Long.compare(o1.getMaxTimestamp(), o2.getMaxTimestamp())) .getMaxTimestamp(); } @@ -462,7 +476,7 @@ public class DateTieredCompactionStrategy extends AbstractCompactionStrategy return uncheckedOptions; } -public CompactionLogger.Strategy strategyLogger() +public CompactionLogger.Strategy strategyLogger() { return new CompactionLogger.Strategy() { 
http://git-wip-us.apache.org/repos/asf/cassandra/blob/c253ed4f/src/java/org/apache/cassandra/db/compaction/LeveledCompactionStrategy.java -
[2/3] cassandra git commit: Prevent compaction strategies from looping indefinitely
Prevent compaction strategies from looping indefinitely Patch by Paulo Motta; Reviewed by Marcus Eriksson for CASSANDRA-14079 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c253ed4f Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c253ed4f Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c253ed4f Branch: refs/heads/trunk Commit: c253ed4fa7b7b5667879bb41be09fe9658224c4e Parents: 14e46e4 Author: Paulo Motta Authored: Sat Nov 25 01:55:35 2017 +1100 Committer: Paulo Motta Committed: Fri Dec 1 05:07:31 2017 +1100 -- CHANGES.txt | 1 + .../DateTieredCompactionStrategy.java | 16 ++- .../compaction/LeveledCompactionStrategy.java | 22 ++- .../db/compaction/LeveledManifest.java | 22 ++- .../SizeTieredCompactionStrategy.java | 12 ++ .../TimeWindowCompactionStrategy.java | 12 ++ .../AbstractCompactionStrategyTest.java | 144 +++ 7 files changed, 222 insertions(+), 7 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/c253ed4f/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index fc18dc3..ce279f2 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 3.11.2 + * Prevent compaction strategies from looping indefinitely (CASSANDRA-14079) * Cache disk boundaries (CASSANDRA-13215) * Add asm jar to build.xml for maven builds (CASSANDRA-11193) * Round buffer size to powers of 2 for the chunk cache (CASSANDRA-13897) http://git-wip-us.apache.org/repos/asf/cassandra/blob/c253ed4f/src/java/org/apache/cassandra/db/compaction/DateTieredCompactionStrategy.java -- diff --git a/src/java/org/apache/cassandra/db/compaction/DateTieredCompactionStrategy.java b/src/java/org/apache/cassandra/db/compaction/DateTieredCompactionStrategy.java index 729ddc0..bb9f4b9 100644 --- a/src/java/org/apache/cassandra/db/compaction/DateTieredCompactionStrategy.java +++ b/src/java/org/apache/cassandra/db/compaction/DateTieredCompactionStrategy.java @@ -73,6 +73,7 @@ public class 
DateTieredCompactionStrategy extends AbstractCompactionStrategy @SuppressWarnings("resource") public AbstractCompactionTask getNextBackgroundTask(int gcBefore) { +List previousCandidate = null; while (true) { List latestBucket = getNextBackgroundSSTables(gcBefore); @@ -80,9 +81,20 @@ public class DateTieredCompactionStrategy extends AbstractCompactionStrategy if (latestBucket.isEmpty()) return null; +// Already tried acquiring references without success. It means there is a race with +// the tracker but candidate SSTables were not yet replaced in the compaction strategy manager +if (latestBucket.equals(previousCandidate)) +{ +logger.warn("Could not acquire references for compacting SSTables {} which is not a problem per se," + +"unless it happens frequently, in which case it must be reported. Will retry later.", +latestBucket); +return null; +} + LifecycleTransaction modifier = cfs.getTracker().tryModify(latestBucket, OperationType.COMPACTION); if (modifier != null) return new CompactionTask(cfs, modifier, gcBefore); +previousCandidate = latestBucket; } } @@ -170,6 +182,8 @@ public class DateTieredCompactionStrategy extends AbstractCompactionStrategy // no need to convert to collection if had an Iterables.max(), but not present in standard toolkit, and not worth adding List list = new ArrayList<>(); Iterables.addAll(list, cfs.getSSTables(SSTableSet.LIVE)); +if (list.isEmpty()) +return 0; return Collections.max(list, (o1, o2) -> Long.compare(o1.getMaxTimestamp(), o2.getMaxTimestamp())) .getMaxTimestamp(); } @@ -462,7 +476,7 @@ public class DateTieredCompactionStrategy extends AbstractCompactionStrategy return uncheckedOptions; } -public CompactionLogger.Strategy strategyLogger() +public CompactionLogger.Strategy strategyLogger() { return new CompactionLogger.Strategy() { http://git-wip-us.apache.org/repos/asf/cassandra/blob/c253ed4f/src/java/org/apache/cassandra/db/compaction/LeveledCompactionStrategy.java -- diff --git 
a/src/java/org/apache/cassandra/db/compaction/LeveledCompactionStrategy.java b/src/java/org/apache/cassandra/db/compaction/Le
[jira] [Commented] (CASSANDRA-14084) Disks can be imbalanced during replace of same address when using JBOD
[ https://issues.apache.org/jira/browse/CASSANDRA-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16272977#comment-16272977 ] Paulo Motta commented on CASSANDRA-14084: - This situation is reproduced by [this dtest|https://github.com/pauloricardomg/cassandra-dtest/commit/1b96dfd855d1b2fc10cbb4cf2e4c95d236ecd951#diff-1ef92939c7765f8c4041bada71208eebR51]. The simple fix is to use normal tokens for replacement nodes with the same address: * [3.11|https://github.com/pauloricardomg/cassandra/tree/3.11-14084] CI looked clean when this was in CASSANDRA-13948, but I will submit again just to make sure this will not cause problems when committed separately. > Disks can be imbalanced during replace of same address when using JBOD > -- > > Key: CASSANDRA-14084 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14084 > Project: Cassandra > Issue Type: Bug >Reporter: Paulo Motta >Assignee: Paulo Motta > > While investigating CASSANDRA-14083, I noticed that [we use the pending > ranges to calculate the disk > boundaries|https://github.com/apache/cassandra/blob/41904684bb5509595d11f008d0851c7ce625e020/src/java/org/apache/cassandra/db/DiskBoundaryManager.java#L91] > when the node is bootstrapping. > The problem is that when the node is replacing a node with the same address, > it [sets itself as normal > locally|https://github.com/apache/cassandra/blob/41904684bb5509595d11f008d0851c7ce625e020/src/java/org/apache/cassandra/service/StorageService.java#L1449] > (for other unrelated reasons), so the local ranges will be null and > consequently the disk boundaries will be null. This will cause the sstables > to be randomly spread across disks potentially causing imbalance.
[jira] [Updated] (CASSANDRA-14084) Disks can be imbalanced during replace of same address when using JBOD
[ https://issues.apache.org/jira/browse/CASSANDRA-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paulo Motta updated CASSANDRA-14084: Status: Patch Available (was: In Progress) > Disks can be imbalanced during replace of same address when using JBOD > -- > > Key: CASSANDRA-14084 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14084 > Project: Cassandra > Issue Type: Bug >Reporter: Paulo Motta >Assignee: Paulo Motta > > While investigating CASSANDRA-14083, I noticed that [we use the pending > ranges to calculate the disk > boundaries|https://github.com/apache/cassandra/blob/41904684bb5509595d11f008d0851c7ce625e020/src/java/org/apache/cassandra/db/DiskBoundaryManager.java#L91] > when the node is bootstrapping. > The problem is that when the node is replacing a node with the same address, > it [sets itself as normal > locally|https://github.com/apache/cassandra/blob/41904684bb5509595d11f008d0851c7ce625e020/src/java/org/apache/cassandra/service/StorageService.java#L1449] > (for other unrelated reasons), so the local ranges will be null and > consequently the disk boundaries will be null. This will cause the sstables > to be randomly spread across disks potentially causing imbalance. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-14084) Disks can be imbalanced during replace of same address when using JBOD
Paulo Motta created CASSANDRA-14084: --- Summary: Disks can be imbalanced during replace of same address when using JBOD Key: CASSANDRA-14084 URL: https://issues.apache.org/jira/browse/CASSANDRA-14084 Project: Cassandra Issue Type: Bug Reporter: Paulo Motta Assignee: Paulo Motta While investigating CASSANDRA-14083, I noticed that [we use the pending ranges to calculate the disk boundaries|https://github.com/apache/cassandra/blob/41904684bb5509595d11f008d0851c7ce625e020/src/java/org/apache/cassandra/db/DiskBoundaryManager.java#L91] when the node is bootstrapping. The problem is that when the node is replacing a node with the same address, it [sets itself as normal locally|https://github.com/apache/cassandra/blob/41904684bb5509595d11f008d0851c7ce625e020/src/java/org/apache/cassandra/service/StorageService.java#L1449] (for other unrelated reasons), so the local ranges will be null and consequently the disk boundaries will be null. This will cause the sstables to be randomly spread across disks potentially causing imbalance. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14083) Avoid invalidating disk boundaries unnecessarily
[ https://issues.apache.org/jira/browse/CASSANDRA-14083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16272918#comment-16272918 ] Paulo Motta commented on CASSANDRA-14083: - After doing the trivial change of only invalidating disk boundaries when the replication settings change, {{disk_balance_test.py:TestDiskBalance.disk_balance_bootstrap_test}} started failing with imbalanced disks. After investigation, it turned out that when the node starts bootstrapping, it doesn't have any information about itself on {{TokenMetadata}}, so the disk boundaries will be empty. When the node adds itself to gossip, the cached ring version does not change, so the disk boundaries are never invalidated, which affects the disk balance. This test was not failing before this change because during keyspace initialization, the disk boundaries were being invalidated by [Keyspace.setMetadata|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/Keyspace.java#L187], and properly reloaded with the correct boundaries during streaming - but if some consumer read the disk boundaries before they were set by the bootstrap process, it would cache an older version. My simple fix [invalidates the cached ring|https://github.com/pauloricardomg/cassandra/commit/fb66c3c451caec936447929f45be3c5f90725a48] after the node is added as bootstrapping to gossip, but this will also invalidate cached rings unnecessarily only to invalidate the disk boundaries. Perhaps we could decouple the cached ring version from the actual ring version which takes into account pending node changes (bootstrapping, leaving)? Patch: * [3.11|https://github.com/pauloricardomg/cassandra/tree/3.11-14083] Since this depends on CASSANDRA-13948, I will wait until that is committed before setting this as PA. 
> Avoid invalidating disk boundaries unnecessarily > > > Key: CASSANDRA-14083 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14083 > Project: Cassandra > Issue Type: Improvement >Reporter: Paulo Motta >Assignee: Paulo Motta > > We currently invalidate disk boundaries whenever [instantiating a new > replication > strategy|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/Keyspace.java#L359], > but this is done whenever [updating keyspace > settings|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/Keyspace.java#L187]. > > Computing new boundaries is expensive and unnecessarily invalidating them > will cause {{DiskBoundaries}} consumers to also invalidate their work > unnecessarily. For instance, after CASSANDRA-13948 the > {{CompactionStrategyManager}} will reload all compaction strategies when the > boundaries are invalidated. > In this case, we should only invalidate the disk boundaries when the > replication settings change to avoid doing unnecessary work. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
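The core idea in the description above, invalidating only when the replication settings actually differ, can be sketched as a simple comparison guard. Everything here is hypothetical: `BoundaryInvalidationGuard` is not a Cassandra class, the replication settings are modelled as a plain string map, and the counter stands in for the real invalidation. The sketch also covers only the settings comparison, not the bootstrap case raised in the comment.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical guard, not Cassandra code: counts how often boundaries would
// be invalidated if only real replication changes triggered invalidation.
final class BoundaryInvalidationGuard
{
    private Map<String, String> replicationParams = new HashMap<>();
    private int invalidations = 0;

    // Called on every keyspace settings update, including updates that leave
    // the replication settings untouched.
    void onKeyspaceSettingsUpdated(Map<String, String> newReplicationParams)
    {
        if (replicationParams.equals(newReplicationParams))
            return; // nothing relevant changed, keep the cached boundaries
        replicationParams = new HashMap<>(newReplicationParams); // defensive copy
        invalidations++; // stand-in for invalidating the cached boundaries
    }

    int invalidationCount()
    {
        return invalidations;
    }
}
```

With this guard, repeated updates carrying identical replication settings trigger a single invalidation, so consumers of the boundaries would not redo their work on unrelated keyspace changes.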
[jira] [Updated] (CASSANDRA-14083) Avoid invalidating disk boundaries unnecessarily
[ https://issues.apache.org/jira/browse/CASSANDRA-14083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paulo Motta updated CASSANDRA-14083: Issue Type: Improvement (was: Bug) > Avoid invalidating disk boundaries unnecessarily > > > Key: CASSANDRA-14083 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14083 > Project: Cassandra > Issue Type: Improvement >Reporter: Paulo Motta >Assignee: Paulo Motta > > We currently invalidate disk boundaries whenever [instantiating a new > replication > strategy|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/Keyspace.java#L359], > but this is done whenever [updating keyspace > settings|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/Keyspace.java#L187]. > > Computing new boundaries is expensive and unnecessarily invalidating them > will cause {{DiskBoundaries}} consumers to also invalidate their work > unnecessarily. For instance, after CASSANDRA-13948 the > {{CompactionStrategyManager}} will reload all compaction strategies when the > boundaries are invalidated. > In this case, we should only invalidate the disk boundaries when the > replication settings change to avoid doing unnecessary work. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-14083) Avoid invalidating disk boundaries unnecessarily
[ https://issues.apache.org/jira/browse/CASSANDRA-14083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paulo Motta updated CASSANDRA-14083: Description: We currently invalidate disk boundaries whenever [instantiating a new replication strategy|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/Keyspace.java#L359], but this is done whenever [updating keyspace settings|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/Keyspace.java#L187]. Computing new boundaries is expensive and unnecessarily invalidating them will cause {{DiskBoundaries}} consumers to also invalidate their work unnecessarily. For instance, after CASSANDRA-13948 the {{CompactionStrategyManager}} will reload all compaction strategies when the boundaries are invalidated. In this case, we should only invalidate the disk boundaries when the replication settings change to avoid doing unnecessary work. > Avoid invalidating disk boundaries unnecessarily > > > Key: CASSANDRA-14083 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14083 > Project: Cassandra > Issue Type: Bug >Reporter: Paulo Motta >Assignee: Paulo Motta > > We currently invalidate disk boundaries whenever [instantiating a new > replication > strategy|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/Keyspace.java#L359], > but this is done whenever [updating keyspace > settings|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/Keyspace.java#L187]. > > Computing new boundaries is expensive and unnecessarily invalidating them > will cause {{DiskBoundaries}} consumers to also invalidate their work > unnecessarily. For instance, after CASSANDRA-13948 the > {{CompactionStrategyManager}} will reload all compaction strategies when the > boundaries are invalidated. > In this case, we should only invalidate the disk boundaries when the > replication settings change to avoid doing unnecessary work. 
[jira] [Created] (CASSANDRA-14083) Avoid invalidating disk boundaries unnecessarily
Paulo Motta created CASSANDRA-14083: --- Summary: Avoid invalidating disk boundaries unnecessarily Key: CASSANDRA-14083 URL: https://issues.apache.org/jira/browse/CASSANDRA-14083 Project: Cassandra Issue Type: Bug Reporter: Paulo Motta Assignee: Paulo Motta -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14082) Do not expose compaction strategy index publicly
[ https://issues.apache.org/jira/browse/CASSANDRA-14082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16272656#comment-16272656 ] Paulo Motta commented on CASSANDRA-14082: - Currently the scrubber and relocate sstables rely on the compaction strategy index, so this patch changes these operations to use a {{DiskBoundaries}} object instead and makes {{CSM.getCompactionStrategyIndex}} private. * [3.11 patch|https://github.com/pauloricardomg/cassandra/tree/3.11-14082] Since this depends on CASSANDRA-13948, I will wait until that is committed before setting this to Patch Available. > Do not expose compaction strategy index publicly > > > Key: CASSANDRA-14082 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14082 > Project: Cassandra > Issue Type: Bug >Reporter: Paulo Motta >Assignee: Paulo Motta > > Before CASSANDRA-13215 we used the compaction strategy index to decide on which > disk to place a given sstable, but now we can get this directly from the disk > boundary manager and keep the compaction strategy index internal only. > This will ensure external consumers will use a consistent {{DiskBoundaries}} > object to perform operations on multiple disks, rather than risking getting > inconsistent indexes if the compaction strategy indexes change between > successive calls to {{CSM.getCompactionStrategyIndex}}.
[jira] [Assigned] (CASSANDRA-14082) Do not expose compaction strategy index publicly
[ https://issues.apache.org/jira/browse/CASSANDRA-14082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paulo Motta reassigned CASSANDRA-14082: --- Assignee: Paulo Motta > Do not expose compaction strategy index publicly > > > Key: CASSANDRA-14082 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14082 > Project: Cassandra > Issue Type: Bug >Reporter: Paulo Motta >Assignee: Paulo Motta > > Before CASSANDRA-13215 we used the compaction strategy index to decide on which > disk to place a given sstable, but now we can get this directly from the disk > boundary manager and keep the compaction strategy index internal only. > This will ensure external consumers will use a consistent {{DiskBoundaries}} > object to perform operations on multiple disks, rather than risking getting > inconsistent indexes if the compaction strategy indexes change between > successive calls to {{CSM.getCompactionStrategyIndex}}.
[jira] [Created] (CASSANDRA-14082) Do not expose compaction strategy index publicly
Paulo Motta created CASSANDRA-14082: --- Summary: Do not expose compaction strategy index publicly Key: CASSANDRA-14082 URL: https://issues.apache.org/jira/browse/CASSANDRA-14082 Project: Cassandra Issue Type: Bug Reporter: Paulo Motta Before CASSANDRA-13215 we used the compaction strategy index to decide on which disk to place a given sstable, but now we can get this directly from the disk boundary manager and keep the compaction strategy index internal only. This will ensure external consumers will use a consistent {{DiskBoundaries}} object to perform operations on multiple disks, rather than risking getting inconsistent indexes if the compaction strategy indexes change between successive calls to {{CSM.getCompactionStrategyIndex}}.
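To illustrate why a single {{DiskBoundaries}} object is safer than successive index lookups, here is a rough sketch of resolving several tokens against one immutable snapshot. {{DiskBoundariesSketch}}, {{RelocateSketch}}, {{diskIndexFor}} and {{placeAll}} are hypothetical names for this example only, and the {{long}} tokens and inclusive upper bounds are simplifying assumptions rather than the actual Cassandra representation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical immutable snapshot; the name mirrors the ticket, the API does not.
final class DiskBoundariesSketch {
    // Sorted, inclusive upper token bound of each disk (an assumption for this sketch).
    private final List<Long> upperBounds;

    DiskBoundariesSketch(List<Long> upperBounds) {
        this.upperBounds = Collections.unmodifiableList(new ArrayList<>(upperBounds));
    }

    // Index of the first disk whose upper bound is >= token.
    int diskIndexFor(long token) {
        for (int i = 0; i < upperBounds.size(); i++) {
            if (token <= upperBounds.get(i)) return i;
        }
        return upperBounds.size() - 1; // tokens past the last bound go to the last disk
    }
}

final class RelocateSketch {
    // Every token is resolved against the same snapshot, so the resulting
    // indexes are mutually consistent even if the live boundaries change
    // while this loop is running.
    static int[] placeAll(DiskBoundariesSketch snapshot, long[] tokens) {
        int[] disks = new int[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            disks[i] = snapshot.diskIndexFor(tokens[i]);
        }
        return disks;
    }
}
```

The design point is only that the caller holds one snapshot for the whole operation instead of asking a mutable manager for an index on each call.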
[jira] [Commented] (CASSANDRA-13948) Reload compaction strategies when JBOD disk boundary changes
[ https://issues.apache.org/jira/browse/CASSANDRA-13948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16272641#comment-16272641 ] Paulo Motta commented on CASSANDRA-13948: - bq. This ticket is getting quite big and very hard to review I tried to make things easier by splitting it into different commits, but I agree it became a bit complicated to review. bq. Could we split out all the pre-existing bugs in other tickets and get them committed separately? Especially this as it involves tokenmetadata. The problem is that some bugs (even though they were pre-existing) only started showing up after this, so they have a dependency on this. I reorganized [this branch|https://github.com/pauloricardomg/cassandra/tree/3.11-13948] to keep only things essential to this ticket, created CASSANDRA-14079 and CASSANDRA-14081 with unrelated minor fixes, and will create two follow-up tickets which depend on this. This should be ready for review now; please let me know if any of the changes are unclear and need a better explanation.
CI looked clean before the reorganization, but I will resubmit with the essential ticket just to make sure we didn't miss anything: * [3.11 patch|https://github.com/pauloricardomg/cassandra/tree/3.11-13948] * [dtest|https://github.com/pauloricardomg/cassandra-dtest/tree/13948] > Reload compaction strategies when JBOD disk boundary changes > > > Key: CASSANDRA-13948 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13948 > Project: Cassandra > Issue Type: Bug > Components: Compaction >Reporter: Paulo Motta >Assignee: Paulo Motta > Fix For: 3.11.x, 4.x > > Attachments: debug.log, dtest13948.png, dtest2.png, > threaddump-cleanup.txt, threaddump.txt, trace.log > > > The thread dump below shows a race between an sstable replacement by the > {{IndexSummaryRedistribution}} and > {{AbstractCompactionTask.getNextBackgroundTask}}: > {noformat} > Thread 94580: (state = BLOCKED) > - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information > may be imprecise) > - java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, > line=175 (Compiled frame) > - > java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt() > @bci=1, line=836 (Compiled frame) > - > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(java.util.concurrent.locks.AbstractQueuedSynchronizer$Node, > int) @bci=67, line=870 (Compiled frame) > - java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(int) > @bci=17, line=1199 (Compiled frame) > - java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock() @bci=5, > line=943 (Compiled frame) > - > org.apache.cassandra.db.compaction.CompactionStrategyManager.handleListChangedNotification(java.lang.Iterable, > java.lang.Iterable) @bci=359, line=483 (Interpreted frame) > - > org.apache.cassandra.db.compaction.CompactionStrategyManager.handleNotification(org.apache.cassandra.notifications.INotification, > java.lang.Object) @bci=53, line=555 (Interpreted frame) > - > 
org.apache.cassandra.db.lifecycle.Tracker.notifySSTablesChanged(java.util.Collection, > java.util.Collection, org.apache.cassandra.db.compaction.OperationType, > java.lang.Throwable) @bci=50, line=409 (Interpreted frame) > - > org.apache.cassandra.db.lifecycle.LifecycleTransaction.doCommit(java.lang.Throwable) > @bci=157, line=227 (Interpreted frame) > - > org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.commit(java.lang.Throwable) > @bci=61, line=116 (Compiled frame) > - > org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.commit() > @bci=2, line=200 (Interpreted frame) > - > org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.finish() > @bci=5, line=185 (Interpreted frame) > - > org.apache.cassandra.io.sstable.IndexSummaryRedistribution.redistributeSummaries() > @bci=559, line=130 (Interpreted frame) > - > org.apache.cassandra.db.compaction.CompactionManager.runIndexSummaryRedistribution(org.apache.cassandra.io.sstable.IndexSummaryRedistribution) > @bci=9, line=1420 (Interpreted frame) > - > org.apache.cassandra.io.sstable.IndexSummaryManager.redistributeSummaries(org.apache.cassandra.io.sstable.IndexSummaryRedistribution) > @bci=4, line=250 (Interpreted frame) > - > org.apache.cassandra.io.sstable.IndexSummaryManager.redistributeSummaries() > @bci=30, line=228 (Interpreted frame) > - org.apache.cassandra.io.sstable.IndexSummaryManager$1.runMayThrow() > @bci=4, line=125 (Interpreted frame) > - org.apache.cassandra.utils.WrappedRunnable.run() @bci=1, line=28 > (Interpreted frame) > - > org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$Uncomplaini
[jira] [Commented] (CASSANDRA-14081) Remove AbstractCompactionStrategy.replaceFlushed
[ https://issues.apache.org/jira/browse/CASSANDRA-14081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16272625#comment-16272625 ] Marcus Eriksson commented on CASSANDRA-14081: - +1 to removing this in trunk (this was added to give 3rd party compaction strategies more control, but I doubt it is needed anymore). Should we remove {{ACS#getMemtableReservedSize()}} and {{ACS#isAffectedByMeteredFlusher()}} at the same time? > Remove AbstractCompactionStrategy.replaceFlushed > > > Key: CASSANDRA-14081 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14081 > Project: Cassandra > Issue Type: Improvement >Reporter: Paulo Motta >Assignee: Paulo Motta >Priority: Minor > > I didn't find a reason why we need to send flush notifications from CFS > -> CSM -> Tracker if we can bypass the CSM and send them directly to the tracker > from the CFS (and handle them on the CSM via {{SSTableAddedNotification}}).
[jira] [Commented] (CASSANDRA-14081) Remove AbstractCompactionStrategy.replaceFlushed
[ https://issues.apache.org/jira/browse/CASSANDRA-14081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16272603#comment-16272603 ] Paulo Motta commented on CASSANDRA-14081: - Trivial patch [here|https://github.com/pauloricardomg/cassandra/tree/trunk-14081]. CI looked clean when this was in CASSANDRA-13948, but I will submit again just to make sure this will not cause problems when committed separately on trunk. > Remove AbstractCompactionStrategy.replaceFlushed > > > Key: CASSANDRA-14081 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14081 > Project: Cassandra > Issue Type: Improvement >Reporter: Paulo Motta >Assignee: Paulo Motta >Priority: Minor > > I didn't find a reason why we need to send flush notifications from CFS > -> CSM -> Tracker if we can bypass the CSM and send them directly to the tracker > from the CFS (and handle them on the CSM via {{SSTableAddedNotification}}).
[jira] [Updated] (CASSANDRA-14081) Remove AbstractCompactionStrategy.replaceFlushed
[ https://issues.apache.org/jira/browse/CASSANDRA-14081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paulo Motta updated CASSANDRA-14081: Status: Patch Available (was: Open) > Remove AbstractCompactionStrategy.replaceFlushed > > > Key: CASSANDRA-14081 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14081 > Project: Cassandra > Issue Type: Improvement >Reporter: Paulo Motta >Assignee: Paulo Motta >Priority: Minor > > I didn't find a reason why we need to send flush notifications from CFS > -> CSM -> Tracker if we can bypass the CSM and send them directly to the tracker > from the CFS (and handle them on the CSM via {{SSTableAddedNotification}}).
[jira] [Created] (CASSANDRA-14081) Remove AbstractCompactionStrategy.replaceFlushed
Paulo Motta created CASSANDRA-14081: --- Summary: Remove AbstractCompactionStrategy.replaceFlushed Key: CASSANDRA-14081 URL: https://issues.apache.org/jira/browse/CASSANDRA-14081 Project: Cassandra Issue Type: Improvement Reporter: Paulo Motta Assignee: Paulo Motta Priority: Minor I didn't find a reason why we need to send flush notifications from CFS -> CSM -> Tracker if we can bypass the CSM and send them directly to the tracker from the CFS (and handle them on the CSM via {{SSTableAddedNotification}}).
[jira] [Commented] (CASSANDRA-13987) Multithreaded commitlog subtly changed durability
[ https://issues.apache.org/jira/browse/CASSANDRA-13987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16272601#comment-16272601 ] Jason Brown commented on CASSANDRA-13987: - Spent about a week tracking down a race condition, and thankfully it was just a stupid bug which is fixed. I've also backported to 3.0 and 3.11.
||3.0||3.11||trunk||
|[branch|https://github.com/jasobrown/cassandra/tree/13987-3.0]|[branch|https://github.com/jasobrown/cassandra/tree/13987-3.11]|[branch|https://github.com/jasobrown/cassandra/tree/commitlog_mmap-more-frequent-markers]|
|[utests & dtests|https://circleci.com/gh/jasobrown/cassandra/tree/13987-3.0]|[utests & dtests|https://circleci.com/gh/jasobrown/cassandra/tree/13987-3.11]|[utests & dtests|https://circleci.com/gh/jasobrown/cassandra/tree/commitlog_mmap-more-frequent-markers]|
The trunk branch is a continuation of the previous development branch, while the 3.0/3.11 branches are squashed backports. 3.11 is trivially close to the trunk code (minor compilation fixes were needed), but 3.0 required a bit more work. utests look good across the branches, and I'm waiting for the dtests to finish. Note: I'm using an updated circleci config which won't be committed into the apache repo. bq. Should we move the call to writeCDCIndexFile ... Yup, certainly seems the correct thing to do ;) I addressed the other nits, as well. While running the utests on circleci, there were some failures related to not forcing the flush when shutting down the {{AbstractCommitLogService}}. That's fixed in the latest commit. bq. do you think it's worth adding a unit test or two for this? Yes, and I've added {{CommitLogChainedMarkersTest}}. Looking at it now, perhaps it could do with a better name and/or a comment at the top of the file explaining what, specifically, it's testing. Also, I've fixed a few minor things for a few tests.
- {{AbstractCommitLogService#requestExtraSync()}} was correctly unparking the sync thread, but commit log data was not being flushed to disk. Thus, I added a volatile boolean to the class for {{#requestExtraSync()}} to indicate that a sync should happen. This is mostly to support batch commit log mode. > Multithreaded commitlog subtly changed durability > - > > Key: CASSANDRA-13987 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13987 > Project: Cassandra > Issue Type: Improvement >Reporter: Jason Brown >Assignee: Jason Brown > Fix For: 4.x > > > When multithreaded commitlog was introduced in CASSANDRA-3578, we subtly > changed the way that commitlog durability worked. Everything still gets > written to an mmap file. However, not everything is replayable from the > mmaped file after a process crash, in periodic mode. > In brief, the reason this changed is the chained markers that are > required for the multithreaded commit log. At each msync, we wait for > outstanding mutations to serialize into the commitlog, and update a marker > before and after the commits that have accumulated since the last sync. With > those markers, we can safely replay that section of the commitlog. Without > the markers, we have no guarantee that the commits in that section were > successfully written, thus we abandon those commits on replay. > If you have correlated process failures of multiple nodes at "nearly" the > same time (see ["There Is No > Now"|http://queue.acm.org/detail.cfm?id=2745385]), it is possible to have > data loss if none of the nodes msync the commitlog. For example, with RF=3, > if quorum write succeeds on two nodes (and we acknowledge the write back to > the client), and then the process on both nodes OOMs (say, due to reading the > index for a 100GB partition), the write will be lost if neither process > msync'ed the commitlog. More exactly, the commitlog cannot be fully replayed.
> The reason why this data is silently lost is due to the chained markers that > were introduced with CASSANDRA-3578. > The problem we are addressing with this ticket is incrementally improving > 'durability' due to process crash, not host crash. (Note: operators should > use batch mode to ensure greater durability, but batch mode in its current > implementation is a) borked, and b) will burn through, *very* rapidly, SSDs > that don't have a non-volatile write cache sitting in front.) > The current default for {{commitlog_sync_period_in_ms}} is 10 seconds, which > means that a node could lose up to ten seconds of data due to process crash. > The unfortunate thing is that the data is still available, in the mmap file, > but we can't replay it due to incomplete chained markers. > ftr, I don't believe we've ever had a stated policy about commitlog > durability wrt process crash. Pre-2.0 we naturally pigg
[jira] [Commented] (CASSANDRA-14080) Handling 0 size hint files during start
[ https://issues.apache.org/jira/browse/CASSANDRA-14080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16272592#comment-16272592 ] Aleksey Yeschenko commented on CASSANDRA-14080: --- Sorry. I don't mean that you shouldn't file/have filed this JIRA. Just saying that the similar one we closed recently-ish might have some useful context, so you might want to look it up and link to this one. > Handling 0 size hint files during start > --- > > Key: CASSANDRA-14080 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14080 > Project: Cassandra > Issue Type: Bug >Reporter: Aleksandr Ivanov > > Continuation of CASSANDRA-12728 bug. > Problem: Cassandra didn't start due to 0 size hints files > Log from v3.0.14: > {code:java} > INFO [main] 2017-11-28 19:10:13,554 StorageService.java:575 - Cassandra > version: 3.0.14 > INFO [main] 2017-11-28 19:10:13,555 StorageService.java:576 - Thrift API > version: 20.1.0 > INFO [main] 2017-11-28 19:10:13,555 StorageService.java:577 - CQL supported > versions: 3.4.0 (default: 3.4.0) > ERROR [main] 2017-11-28 19:10:13,592 CassandraDaemon.java:710 - Exception > encountered during startup > org.apache.cassandra.io.FSReadError: java.io.EOFException > at > org.apache.cassandra.hints.HintsDescriptor.readFromFile(HintsDescriptor.java:142) > ~[apache-cassandra-3.0.14.jar:3.0.14] > at > java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) > ~[na:1.8.0_141] > at > java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175) > ~[na:1.8.0_141] > at java.util.Iterator.forEachRemaining(Iterator.java:116) > ~[na:1.8.0_141] > at > java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801) > ~[na:1.8.0_141] > at > java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481) > ~[na:1.8.0_141] > at > java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471) > ~[na:1.8.0_141] > at >
java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708) > ~[na:1.8.0_141] > at > java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) > ~[na:1.8.0_141] > at > java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499) > ~[na:1.8.0_141] > at org.apache.cassandra.hints.HintsCatalog.load(HintsCatalog.java:65) > ~[apache-cassandra-3.0.14.jar:3.0.14] > at > org.apache.cassandra.hints.HintsService.(HintsService.java:88) > ~[apache-cassandra-3.0.14.jar:3.0.14] > at > org.apache.cassandra.hints.HintsService.(HintsService.java:63) > ~[apache-cassandra-3.0.14.jar:3.0.14] > at > org.apache.cassandra.service.StorageProxy.(StorageProxy.java:121) > ~[apache-cassandra-3.0.14.jar:3.0.14] > at java.lang.Class.forName0(Native Method) ~[na:1.8.0_141] > at java.lang.Class.forName(Class.java:264) ~[na:1.8.0_141] > at > org.apache.cassandra.service.StorageService.initServer(StorageService.java:585) > ~[apache-cassandra-3.0.14.jar:3.0.14] > at > org.apache.cassandra.service.StorageService.initServer(StorageService.java:570) > ~[apache-cassandra-3.0.14.jar:3.0.14] > at > org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:346) > [apache-cassandra-3.0.14.jar:3.0.14] > at > org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:569) > [apache-cassandra-3.0.14.jar:3.0.14] > at > org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:697) > [apache-cassandra-3.0.14.jar:3.0.14] > Caused by: java.io.EOFException: null > at java.io.RandomAccessFile.readInt(RandomAccessFile.java:803) > ~[na:1.8.0_141] > at > org.apache.cassandra.hints.HintsDescriptor.deserialize(HintsDescriptor.java:237) > ~[apache-cassandra-3.0.14.jar:3.0.14] > at > org.apache.cassandra.hints.HintsDescriptor.readFromFile(HintsDescriptor.java:138) > ~[apache-cassandra-3.0.14.jar:3.0.14] > ... 20 common frames omitted > {code} > After several 0 size hints files deletion Cassandra started successfully. 
> Jeff Jirsa added a comment - Yesterday > Aleksandr Ivanov can you open a new JIRA and link it back to this one? It's > possible that the original patch didn't consider 0 byte files (I don't have > time to go back and look at the commit, and it was long enough ago that I've > forgotten) - were all of your files 0 bytes? > Not all, 8..10 hints files were with 0 size.
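One possible way to tolerate the 0 size files reported above is to filter them out before any descriptor is read. The sketch below only illustrates that idea and is not the fix for this ticket: {{HintFileScanSketch}} and {{nonEmptyHintFiles}} are hypothetical names, and the {{*.hints}} glob is an assumption about how hint files are named on disk.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Cassandra.
final class HintFileScanSketch {
    // Returns the files matching *.hints in dir that contain at least one byte.
    static List<Path> nonEmptyHintFiles(Path dir) throws IOException {
        List<Path> result = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "*.hints")) {
            for (Path file : stream) {
                // A zero-length file has no descriptor to deserialize, so reading
                // it would end in an EOFException like the one in the log above.
                if (Files.size(file) == 0) continue;
                result.add(file);
            }
        }
        return result;
    }
}
```

Whether empty files should be skipped, deleted, or logged at startup is a policy decision this sketch does not settle.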