[jira] [Commented] (CASSANDRA-15647) Mismatching dependencies between cassandra dist and cassandra-all pom
[ https://issues.apache.org/jira/browse/CASSANDRA-15647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17063338#comment-17063338 ] Ryan Svihla commented on CASSANDRA-15647: - [~benedict] I think the build.xml file got force-pushed over in this commit [https://github.com/apache/cassandra/commit/7dc5b700b760382c15045e3301c7061f412da993|https://github.com/apache/cassandra/blob/7dc5b700b760382c15045e3301c7061f412da993/build.xml], removing the exclusion. Can you confirm dropping the JNA exclusion wasn't intentional? > Mismatching dependencies between cassandra dist and cassandra-all pom > -- > > Key: CASSANDRA-15647 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15647 > Project: Cassandra > Issue Type: Bug > Components: Build, Dependencies >Reporter: Marvin Froeder >Assignee: Ryan Svihla >Priority: Normal > Fix For: 4.0-beta > > > I noticed that the cassandra distribution (tar.gz) dependencies don't match > the dependency list for the cassandra-all pom that is available at Maven Central. > The Cassandra distribution only includes jna 4.2.2. > But the maven dependency list also includes jna-platform 4.4.0. > Breakdown of relevant maven dependencies: > ``` > [INFO] +- org.apache.cassandra:cassandra-all:jar:4.0-alpha3:provided > [INFO] | +- net.java.dev.jna:jna:jar:4.2.2:provided > [INFO] | +- net.openhft:chronicle-threads:jar:1.16.0:provided > [INFO] | | \- net.openhft:affinity:jar:3.1.7:provided > [INFO] | | \- net.java.dev.jna:jna-platform:jar:4.4.0:provided > ``` > As you can see, jna is a direct dependency and jna-platform is a transitive > dependency of chronicle-threads. > I expected this issue to have been fixed by > https://github.com/apache/cassandra/pull/240/, but this change seems to have > been reverted, as it is no longer in trunk. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15626) Need microsecond precision for dropped columns so we can avoid timestamp issues
[ https://issues.apache.org/jira/browse/CASSANDRA-15626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Svihla updated CASSANDRA-15626: Fix Version/s: 4.0-beta > Need microsecond precision for dropped columns so we can avoid timestamp > issues > --- > > Key: CASSANDRA-15626 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15626 > Project: Cassandra > Issue Type: Improvement > Components: Local/SSTable >Reporter: Ryan Svihla >Priority: Normal > Fix For: 4.0-beta > > > In CASSANDRA-15557 the fix for the flaky test reimplements the logic > from CASSANDRA-12997, which was removed as part of CASSANDRA-13426. > However, since dropped columns are stored at millisecond precision instead > of microsecond precision, and ClientState.getTimestamp adds microseconds on > each call, we will lose precision on save and some writes that should be > dropped could reappear. > Note that views are affected as well > > [https://github.com/apache/cassandra/blob/cb83fbff479bb90e9abeaade9e0f8843634c974d/src/java/org/apache/cassandra/schema/SchemaKeyspace.java#L712-L716] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
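The precision loss the ticket describes can be sketched in isolation. This is an illustrative example only, not Cassandra code; the class and method names are hypothetical, and the timestamp value merely stands in for whatever ClientState.getTimestamp would return:

```java
// Sketch (not Cassandra code): why truncating a microsecond drop timestamp to
// milliseconds on save can resurrect writes that should stay dropped.
public class DroppedColumnPrecision
{
    // Microseconds -> milliseconds, as happens when the drop time is persisted.
    static long toMillis(long micros) { return micros / 1000; }

    // Milliseconds -> microseconds, as happens when the drop time is read back.
    static long toMicros(long millis) { return millis * 1000; }

    public static void main(String[] args)
    {
        long dropTimeMicros = 1_584_543_600_000_123L; // hypothetical microsecond timestamp
        long roundTripped = toMicros(toMillis(dropTimeMicros));

        // The restored drop time is earlier than the real one...
        assert roundTripped < dropTimeMicros;

        // ...so a write stamped inside the lost sub-millisecond window is
        // older than the drop, yet newer than the persisted drop time,
        // and is therefore no longer shadowed.
        long writeMicros = dropTimeMicros - 50;
        assert writeMicros < dropTimeMicros;
        assert writeMicros > roundTripped;

        System.out.println("sub-millisecond precision lost: "
                           + (dropTimeMicros - roundTripped) + "us");
    }
}
```

Because integer division discards the sub-millisecond digits, any write timestamped within that lost window sorts after the persisted drop time, which is exactly the reappearance hazard described above.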
[jira] [Updated] (CASSANDRA-15647) Mismatching dependencies between cassandra dist and cassandra-all pom
[ https://issues.apache.org/jira/browse/CASSANDRA-15647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Svihla updated CASSANDRA-15647: Test and Documentation Plan: {{ant write-poms}} {{ mvn dependency:tree -f build/apache-cassandra-*-SNAPSHOT.pom -Dverbose -Dincludes=net.java.dev.jna}} Status: Patch Available (was: In Progress) I've linked the equivalent [PR|https://github.com/apache/cassandra/pull/476] for the trunk version of the build file. Now the output for jna is all 4.2.2: ➜ cassandra git:(15647) ✗ mvn dependency:tree -f build/apache-cassandra-*-SNAPSHOT.pom -Dverbose -Dincludes=net.java.dev.jna [INFO] Scanning for projects... [INFO] [INFO] -< org.apache.cassandra:cassandra-all >- [INFO] Building Apache Cassandra 4.0-alpha4-SNAPSHOT [INFO] [ jar ]- [WARNING] The POM for org.perfkit.sjk.parsers:sjk-jfr5:jar:0.5 is invalid, transitive dependencies (if any) will not be available, enable debug logging for more details [WARNING] The POM for org.perfkit.sjk.parsers:sjk-jfr6:jar:0.7 is invalid, transitive dependencies (if any) will not be available, enable debug logging for more details [WARNING] The POM for org.perfkit.sjk.parsers:sjk-nps:jar:0.5 is invalid, transitive dependencies (if any) will not be available, enable debug logging for more details [INFO] [INFO] --- maven-dependency-plugin:3.1.1:tree (default-cli) @ cassandra-all --- [INFO] Verbose not supported since maven-dependency-plugin 3.0 [INFO] org.apache.cassandra:cassandra-all:jar:4.0-alpha4-SNAPSHOT [INFO] \- net.java.dev.jna:jna:jar:4.2.2:compile [INFO] [INFO] BUILD SUCCESS [INFO] [INFO] Total time: 1.902 s [INFO] Finished at: 2020-03-18T17:00:00+01:00 [INFO] > Mismatching dependencies between cassandra dist and cassandra-all pom > -- > > Key: CASSANDRA-15647 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15647 > Project: Cassandra > Issue Type: Bug > Components: Build, Dependencies >Reporter: Marvin Froeder >Assignee: Ryan Svihla >Priority: Normal > Fix For: 4.0-beta > 
> > I noticed that the cassandra distribution (tar.gz) dependencies don't match > the dependency list for the cassandra-all pom that is available at Maven Central. > The Cassandra distribution only includes jna 4.2.2. > But the maven dependency list also includes jna-platform 4.4.0. > Breakdown of relevant maven dependencies: > ``` > [INFO] +- org.apache.cassandra:cassandra-all:jar:4.0-alpha3:provided > [INFO] | +- net.java.dev.jna:jna:jar:4.2.2:provided > [INFO] | +- net.openhft:chronicle-threads:jar:1.16.0:provided > [INFO] | | \- net.openhft:affinity:jar:3.1.7:provided > [INFO] | | \- net.java.dev.jna:jna-platform:jar:4.4.0:provided > ``` > As you can see, jna is a direct dependency and jna-platform is a transitive > dependency of chronicle-threads. > I expected this issue to have been fixed by > https://github.com/apache/cassandra/pull/240/, but this change seems to have > been reverted, as it is no longer in trunk. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
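For reference, the shape of the fix being verified with the {{mvn dependency:tree}} run above is a transitive-dependency exclusion on chronicle-threads. The snippet below is illustrative only: the coordinates come from the dependency tree quoted in the report, but the actual change lives in Cassandra's build.xml pom generation rather than a hand-written pom, so the exact mechanism may differ:

```xml
<!-- Illustrative only: an exclusion of this shape keeps jna-platform 4.4.0
     out of the published cassandra-all pom, so the pom's dependency list
     matches the single jna 4.2.2 jar shipped in the tar.gz distribution. -->
<dependency>
  <groupId>net.openhft</groupId>
  <artifactId>chronicle-threads</artifactId>
  <version>1.16.0</version>
  <exclusions>
    <exclusion>
      <groupId>net.java.dev.jna</groupId>
      <artifactId>jna-platform</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```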
[jira] [Comment Edited] (CASSANDRA-15647) Mismatching dependencies between cassandra dist and cassandra-all pom
[ https://issues.apache.org/jira/browse/CASSANDRA-15647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17061015#comment-17061015 ] Ryan Svihla edited comment on CASSANDRA-15647 at 3/18/20, 2:48 PM: --- Digging into the history, I think this just happened as part of a force push from the 3.11 branch into trunk (but I could be misreading the GitHub UI): [https://github.com/apache/cassandra/commit/7dc5b700b760382c15045e3301c7061f412da993] It does not look intentional that the JNA exclusion was left off, nor does it appear related to the issue that wrote the build.xml. was (Author: rssvihla): Digging into the history, I think this just happened as part of a force push from the 3.11 branch into trunk (but I could be misreading the GitHub UI): [https://github.com/apache/cassandra/commit/7dc5b700b760382c15045e3301c7061f412da993] It does not look intentional that the JNA exclusion was left off, nor does it appear related to the issue that wrote the build.xml. I've linked the equivalent [PR|https://github.com/apache/cassandra/pull/476] for the trunk version of the build file in case it was accidental, as I suspect. > Mismatching dependencies between cassandra dist and cassandra-all pom > -- > > Key: CASSANDRA-15647 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15647 > Project: Cassandra > Issue Type: Bug > Components: Build, Dependencies >Reporter: Marvin Froeder >Assignee: Ryan Svihla >Priority: Normal > > > I noticed that the cassandra distribution (tar.gz) dependencies don't match > the dependency list for the cassandra-all pom that is available at Maven Central. > The Cassandra distribution only includes jna 4.2.2. 
> But the maven dependency list also includes jna-platform 4.4.0. > Breakdown of relevant maven dependencies: > ``` > [INFO] +- org.apache.cassandra:cassandra-all:jar:4.0-alpha3:provided > [INFO] | +- net.java.dev.jna:jna:jar:4.2.2:provided > [INFO] | +- net.openhft:chronicle-threads:jar:1.16.0:provided > [INFO] | | \- net.openhft:affinity:jar:3.1.7:provided > [INFO] | | \- net.java.dev.jna:jna-platform:jar:4.4.0:provided > ``` > As you can see, jna is a direct dependency and jna-platform is a transitive > dependency of chronicle-threads. > I expected this issue to have been fixed by > https://github.com/apache/cassandra/pull/240/, but this change seems to have > been reverted, as it is no longer in trunk. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15647) Mismatching dependencies between cassandra dist and cassandra-all pom
[ https://issues.apache.org/jira/browse/CASSANDRA-15647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Svihla updated CASSANDRA-15647: Bug Category: Parent values: Code(13163)Level 1 values: Bug - Unclear Impact(13164) Complexity: Low Hanging Fruit Discovered By: Code Inspection Fix Version/s: 4.0-beta Severity: Low Status: Open (was: Triage Needed) Opening this up since it seems to be a pretty simple matter of a force push eating a commit. > Mismatching dependencies between cassandra dist and cassandra-all pom > -- > > Key: CASSANDRA-15647 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15647 > Project: Cassandra > Issue Type: Bug > Components: Build, Dependencies >Reporter: Marvin Froeder >Assignee: Ryan Svihla >Priority: Normal > Fix For: 4.0-beta > > > I noticed that the cassandra distribution (tar.gz) dependencies don't match > the dependency list for the cassandra-all pom that is available at Maven Central. > The Cassandra distribution only includes jna 4.2.2. > But the maven dependency list also includes jna-platform 4.4.0. > Breakdown of relevant maven dependencies: > ``` > [INFO] +- org.apache.cassandra:cassandra-all:jar:4.0-alpha3:provided > [INFO] | +- net.java.dev.jna:jna:jar:4.2.2:provided > [INFO] | +- net.openhft:chronicle-threads:jar:1.16.0:provided > [INFO] | | \- net.openhft:affinity:jar:3.1.7:provided > [INFO] | | \- net.java.dev.jna:jna-platform:jar:4.4.0:provided > ``` > As you can see, jna is a direct dependency and jna-platform is a transitive > dependency of chronicle-threads. > I expected this issue to have been fixed by > https://github.com/apache/cassandra/pull/240/, but this change seems to have > been reverted, as it is no longer in trunk. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15647) Mismatching dependencies between cassandra dist and cassandra-all pom
[ https://issues.apache.org/jira/browse/CASSANDRA-15647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17061015#comment-17061015 ] Ryan Svihla commented on CASSANDRA-15647: - Digging into the history, I think this just happened as part of a force push from the 3.11 branch into trunk (but I could be misreading the GitHub UI): [https://github.com/apache/cassandra/commit/7dc5b700b760382c15045e3301c7061f412da993] It does not look intentional that the JNA exclusion was left off, nor does it appear related to the issue that wrote the build.xml. I've linked the equivalent [PR|https://github.com/apache/cassandra/pull/476] for the trunk version of the build file in case it was accidental, as I suspect. > Mismatching dependencies between cassandra dist and cassandra-all pom > -- > > Key: CASSANDRA-15647 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15647 > Project: Cassandra > Issue Type: Bug > Components: Build, Dependencies >Reporter: Marvin Froeder >Assignee: Ryan Svihla >Priority: Normal > > > I noticed that the cassandra distribution (tar.gz) dependencies don't match > the dependency list for the cassandra-all pom that is available at Maven Central. > The Cassandra distribution only includes jna 4.2.2. > But the maven dependency list also includes jna-platform 4.4.0. > Breakdown of relevant maven dependencies: > ``` > [INFO] +- org.apache.cassandra:cassandra-all:jar:4.0-alpha3:provided > [INFO] | +- net.java.dev.jna:jna:jar:4.2.2:provided > [INFO] | +- net.openhft:chronicle-threads:jar:1.16.0:provided > [INFO] | | \- net.openhft:affinity:jar:3.1.7:provided > [INFO] | | \- net.java.dev.jna:jna-platform:jar:4.4.0:provided > ``` > As you can see, jna is a direct dependency and jna-platform is a transitive > dependency of chronicle-threads. 
> I expected this issue to have been fixed by > https://github.com/apache/cassandra/pull/240/, but this change seems to have > been reverted, as it is no longer in trunk. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Assigned] (CASSANDRA-15647) Mismatching dependencies between cassandra dist and cassandra-all pom
[ https://issues.apache.org/jira/browse/CASSANDRA-15647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Svihla reassigned CASSANDRA-15647: --- Assignee: Ryan Svihla > Mismatching dependencies between cassandra dist and cassandra-all pom > -- > > Key: CASSANDRA-15647 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15647 > Project: Cassandra > Issue Type: Bug > Components: Build, Dependencies >Reporter: Marvin Froeder >Assignee: Ryan Svihla >Priority: Normal > > > I noticed that the cassandra distribution (tar.gz) dependencies don't match > the dependency list for the cassandra-all pom that is available at Maven Central. > The Cassandra distribution only includes jna 4.2.2. > But the maven dependency list also includes jna-platform 4.4.0. > Breakdown of relevant maven dependencies: > ``` > [INFO] +- org.apache.cassandra:cassandra-all:jar:4.0-alpha3:provided > [INFO] | +- net.java.dev.jna:jna:jar:4.2.2:provided > [INFO] | +- net.openhft:chronicle-threads:jar:1.16.0:provided > [INFO] | | \- net.openhft:affinity:jar:3.1.7:provided > [INFO] | | \- net.java.dev.jna:jna-platform:jar:4.4.0:provided > ``` > As you can see, jna is a direct dependency and jna-platform is a transitive > dependency of chronicle-threads. > I expected this issue to have been fixed by > https://github.com/apache/cassandra/pull/240/, but this change seems to have > been reverted, as it is no longer in trunk. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-14296) Fix eclipse-warnings introduced by 7544 parameter handling
[ https://issues.apache.org/jira/browse/CASSANDRA-14296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17060129#comment-17060129 ] Ryan Svihla edited comment on CASSANDRA-14296 at 3/16/20, 11:23 AM: I would say it's likely it got fixed somewhere else long ago. I just ran eclipse-warnings from the latest trunk and the output generates no warnings. {{eclipse-warnings:}} {{ [echo] Running Eclipse Code Analysis. Output logged to /Users/ryansvihla/code/cassandra/build/ecj/eclipse_compiler_checks.txt}}{{BUILD SUCCESSFUL}} {{Total time: 16 seconds}} {{➜ cassandra git:(trunk) ✗ cat /Users/ryansvihla/code/cassandra/build/ecj/eclipse_compiler_checks.txt}} {{# 16/03/20 12:17:03 CET}} {{# Eclipse Compiler for Java(TM) v20160829-0950, 3.12.1, Copyright IBM Corp 2000, 2015. All rights reserved.}} {{➜ cassandra git:(trunk) ✗}} was (Author: rssvihla): I would say it's likely it got fixed somewhere else long ago. I just ran eclipse-warnings from the latest trunk and the output generates no warnings. {{eclipse-warnings:}} {{ [echo] Running Eclipse Code Analysis. Output logged to /Users/ryansvihla/code/cassandra/build/ecj/eclipse_compiler_checks.txt}}{{BUILD SUCCESSFUL}} {{Total time: 16 seconds}} {{➜ cassandra git:(trunk) ✗ vim /Users/ryansvihla/code/cassandra/build/ecj/eclipse_compiler_checks.txt}} {{➜ cassandra git:(trunk) ✗ vim /Users/ryansvihla/code/cassandra/build/ecj/eclipse_compiler_checks.txt}} {{➜ cassandra git:(trunk) ✗ cat /Users/ryansvihla/code/cassandra/build/ecj/eclipse_compiler_checks.txt}} {{# 16/03/20 12:17:03 CET}} {{# Eclipse Compiler for Java(TM) v20160829-0950, 3.12.1, Copyright IBM Corp 2000, 2015. 
All rights reserved.}} {{➜ cassandra git:(trunk) ✗}} > Fix eclipse-warnings introduced by 7544 parameter handling > -- > > Key: CASSANDRA-14296 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14296 > Project: Cassandra > Issue Type: Improvement > Components: Legacy/Streaming and Messaging >Reporter: Ariel Weisberg >Priority: Normal > Fix For: 4.0 > > > There are some Closables that aren't being closed. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14296) Fix eclipse-warnings introduced by 7544 parameter handling
[ https://issues.apache.org/jira/browse/CASSANDRA-14296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17060129#comment-17060129 ] Ryan Svihla commented on CASSANDRA-14296: - I would say it's likely it got fixed somewhere else long ago. I just ran eclipse-warnings from the latest trunk and the output generates no warnings. eclipse-warnings: [echo] Running Eclipse Code Analysis. Output logged to /Users/ryansvihla/code/cassandra/build/ecj/eclipse_compiler_checks.txt BUILD SUCCESSFUL Total time: 16 seconds ➜ cassandra git:(trunk) ✗ cat /Users/ryansvihla/code/cassandra/build/ecj/eclipse_compiler_checks.txt # 16/03/20 12:17:03 CET # Eclipse Compiler for Java(TM) v20160829-0950, 3.12.1, Copyright IBM Corp 2000, 2015. All rights reserved. ➜ cassandra git:(trunk) ✗ > Fix eclipse-warnings introduced by 7544 parameter handling > -- > > Key: CASSANDRA-14296 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14296 > Project: Cassandra > Issue Type: Improvement > Components: Legacy/Streaming and Messaging >Reporter: Ariel Weisberg >Priority: Normal > Fix For: 4.0 > > > There are some Closables that aren't being closed. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-14296) Fix eclipse-warnings introduced by 7544 parameter handling
[ https://issues.apache.org/jira/browse/CASSANDRA-14296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17060129#comment-17060129 ] Ryan Svihla edited comment on CASSANDRA-14296 at 3/16/20, 11:23 AM: I would say it's likely it got fixed somewhere else long ago. I just ran eclipse-warnings from the latest trunk and the output generates no warnings. {{eclipse-warnings:}} {{ [echo] Running Eclipse Code Analysis. Output logged to /Users/ryansvihla/code/cassandra/build/ecj/eclipse_compiler_checks.txt}}{{BUILD SUCCESSFUL}} {{Total time: 16 seconds}} {{➜ cassandra git:(trunk) ✗ vim /Users/ryansvihla/code/cassandra/build/ecj/eclipse_compiler_checks.txt}} {{➜ cassandra git:(trunk) ✗ vim /Users/ryansvihla/code/cassandra/build/ecj/eclipse_compiler_checks.txt}} {{➜ cassandra git:(trunk) ✗ cat /Users/ryansvihla/code/cassandra/build/ecj/eclipse_compiler_checks.txt}} {{# 16/03/20 12:17:03 CET}} {{# Eclipse Compiler for Java(TM) v20160829-0950, 3.12.1, Copyright IBM Corp 2000, 2015. All rights reserved.}} {{➜ cassandra git:(trunk) ✗}} was (Author: rssvihla): I would say it's likely it got fixed somewhere else long ago. I just ran eclipse-warnings from the latest trunk and the output generates no warnings. eclipse-warnings: [echo] Running Eclipse Code Analysis. Output logged to /Users/ryansvihla/code/cassandra/build/ecj/eclipse_compiler_checks.txt BUILD SUCCESSFUL Total time: 16 seconds ➜ cassandra git:(trunk) ✗ cat /Users/ryansvihla/code/cassandra/build/ecj/eclipse_compiler_checks.txt # 16/03/20 12:17:03 CET # Eclipse Compiler for Java(TM) v20160829-0950, 3.12.1, Copyright IBM Corp 2000, 2015. All rights reserved. 
➜ cassandra git:(trunk) ✗ > Fix eclipse-warnings introduced by 7544 parameter handling > -- > > Key: CASSANDRA-14296 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14296 > Project: Cassandra > Issue Type: Improvement > Components: Legacy/Streaming and Messaging >Reporter: Ariel Weisberg >Priority: Normal > Fix For: 4.0 > > > There are some Closables that aren't being closed. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-15624) Avoid lazy initializing shut down instances when trying to send them messages
[ https://issues.apache.org/jira/browse/CASSANDRA-15624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17058532#comment-17058532 ] Ryan Svihla edited comment on CASSANDRA-15624 at 3/13/20, 9:13 AM: --- [~ifesdjeen] yes, I was looking into it and using it as an opportunity to learn how the JVM dtests work. If you'd rather have someone more experienced with that part of the code base involved, though, I'm happy to look for simpler issues first. EDIT: I handed it back. I can still review the codebase, and if I come up with a fix before someone else takes it I'll submit a PR then; that way I'm not blocking someone else. was (Author: rssvihla): [~ifesdjeen] yes, I was looking into it and using it as an opportunity to learn how the JVM dtests work. If you'd rather have someone more experienced with that part of the code base involved, though, I'm happy to look for simpler issues first. > Avoid lazy initializing shut down instances when trying to send them messages > - > > Key: CASSANDRA-15624 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15624 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest >Reporter: Marcus Eriksson >Priority: Normal > Fix For: 4.0-alpha > > > We currently use {{to.broadcastAddressAndPort()}} when figuring out if we > should send a message to an instance; if that instance has been shut down, it > will get re-initialized but not started up, which makes the tests fail. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
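A minimal sketch of the behaviour the ticket asks for: look up a peer without triggering lazy (re-)initialization, so a message addressed to a shut-down instance is dropped rather than resurrecting a half-started node. All names here are hypothetical; the real in-JVM dtest framework's API differs:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch with hypothetical names (not the dtest framework's API): a registry
// that only delivers to instances that are still running, and never creates
// an instance as a side effect of a lookup.
public class InstanceRegistry
{
    private final Map<String, String> running = new ConcurrentHashMap<>();

    public void start(String addr)    { running.put(addr, addr); }
    public void shutdown(String addr) { running.remove(addr); }

    /** Returns true only if the peer is still running; actual delivery is elided. */
    public boolean trySend(String addr, String message)
    {
        // get() instead of computeIfAbsent(): a shut-down instance stays down
        // instead of being lazily re-initialized but never started up.
        return running.get(addr) != null;
    }
}
```

The design point is simply that the lookup used on the send path must be side-effect free, unlike a lazy-initializing accessor.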
[jira] [Assigned] (CASSANDRA-15624) Avoid lazy initializing shut down instances when trying to send them messages
[ https://issues.apache.org/jira/browse/CASSANDRA-15624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Svihla reassigned CASSANDRA-15624: --- Assignee: (was: Ryan Svihla) > Avoid lazy initializing shut down instances when trying to send them messages > - > > Key: CASSANDRA-15624 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15624 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest >Reporter: Marcus Eriksson >Priority: Normal > Fix For: 4.0-alpha > > > We currently use {{to.broadcastAddressAndPort()}} when figuring out if we > should send a message to an instance; if that instance has been shut down, it > will get re-initialized but not started up, which makes the tests fail. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-15624) Avoid lazy initializing shut down instances when trying to send them messages
[ https://issues.apache.org/jira/browse/CASSANDRA-15624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17058532#comment-17058532 ] Ryan Svihla edited comment on CASSANDRA-15624 at 3/13/20, 9:04 AM: --- [~ifesdjeen] yes, I was looking into it and using it as an opportunity to learn how the JVM dtests work. If you'd rather have someone more experienced with that part of the code base involved, though, I'm happy to look for simpler issues first. was (Author: rssvihla): [~ifesdjeen] yes, I was looking into it and using it as an opportunity to learn how the JVM tests work. If you'd rather have someone more experienced with that part of the code base involved, though, I'm happy to look for simpler issues first. > Avoid lazy initializing shut down instances when trying to send them messages > - > > Key: CASSANDRA-15624 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15624 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest >Reporter: Marcus Eriksson >Assignee: Ryan Svihla >Priority: Normal > Fix For: 4.0-alpha > > > We currently use {{to.broadcastAddressAndPort()}} when figuring out if we > should send a message to an instance; if that instance has been shut down, it > will get re-initialized but not started up, which makes the tests fail. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15624) Avoid lazy initializing shut down instances when trying to send them messages
[ https://issues.apache.org/jira/browse/CASSANDRA-15624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17058532#comment-17058532 ] Ryan Svihla commented on CASSANDRA-15624: - [~ifesdjeen] yes, I was looking into it and using it as an opportunity to learn how the JVM tests work. If you'd rather have someone more experienced with that part of the code base involved, though, I'm happy to look for simpler issues first. > Avoid lazy initializing shut down instances when trying to send them messages > - > > Key: CASSANDRA-15624 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15624 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest >Reporter: Marcus Eriksson >Assignee: Ryan Svihla >Priority: Normal > Fix For: 4.0-alpha > > > We currently use {{to.broadcastAddressAndPort()}} when figuring out if we > should send a message to an instance; if that instance has been shut down, it > will get re-initialized but not started up, which makes the tests fail. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Assigned] (CASSANDRA-15624) Avoid lazy initializing shut down instances when trying to send them messages
[ https://issues.apache.org/jira/browse/CASSANDRA-15624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Svihla reassigned CASSANDRA-15624: --- Assignee: Ryan Svihla > Avoid lazy initializing shut down instances when trying to send them messages > - > > Key: CASSANDRA-15624 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15624 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest >Reporter: Marcus Eriksson >Assignee: Ryan Svihla >Priority: Normal > Fix For: 4.0-alpha > > > We currently use {{to.broadcastAddressAndPort()}} when figuring out if we > should send a message to an instance; if that instance has been shut down, it > will get re-initialized but not started up, which makes the tests fail. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15234) Standardise config and JVM parameters
[ https://issues.apache.org/jira/browse/CASSANDRA-15234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17057944#comment-17057944 ] Ryan Svihla commented on CASSANDRA-15234: - I think this is overall a good idea, but it definitely needs some limits on the number of combinations supported. Let's just look at a hypothetical new configuration `compaction_throughput` instead of `compaction_throughput_mb_per_sec` (which is a poster child for a difficult-to-remember name in need of review): * Any downstream tooling that reads configuration will have to take on everything we add, which is fine, but the more math we require, the worse it is for them to get updated. An older tool reading compaction_throughput_mb_per_sec had a self-documenting value. A new tool will have to take into account every variant we support for MiB/MB/mb/bytes/etc. * What about `500mb` vs `500mb/s` vs `500 mbs` (note the space) vs `500MiB/s` vs `500 mb/s` (note the two spaces)? Which of those is obviously valid or wrong at a glance to a new user? So if we're going to do it, we should definitely accept only one valid form that's the same for everything. I still think it's adding some learning curve for new users. > Standardise config and JVM parameters > - > > Key: CASSANDRA-15234 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15234 > Project: Cassandra > Issue Type: Bug > Components: Local/Config >Reporter: Benedict Elliott Smith >Assignee: Ekaterina Dimitrova >Priority: Normal > Fix For: 4.0, 4.0-beta > > > We have a bunch of inconsistent names and config patterns in the codebase, > both from the yamls and JVM properties. It would be nice to standardise the > naming (such as otc_ vs internode_) as well as the provision of values with > units - while maintaining perpetual backwards compatibility with the old > parameter names, of course. > For temporal units, I would propose parsing strings with suffixes of: > {{code}} > u|micros(econds?)? > ms|millis(econds?)? 
> s(econds?)? > m(inutes?)? > h(ours?)? > d(ays?)? > mo(nths?)? > {{code}} > For rate units, I would propose parsing any of the standard {{B/s, KiB/s, > MiB/s, GiB/s, TiB/s}}. > Perhaps to avoid ambiguity we could not accept bit rates {{bs, Mbps}} or > powers of 1000 such as {{KB/s}}, given these are regularly used with either > their old or new definition, e.g. {{KiB/s}}; or we could support them and > simply log the value in bytes/s. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
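The strictness argued for in the comment above can be sketched as a parser that accepts exactly one canonical spelling per rate unit and rejects the lookalikes. This is illustrative only, not the implementation proposed on the ticket; the class name and canonical grammar are assumptions:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch (hypothetical, not Cassandra's parser): a strict rate parser that
// accepts only "<digits><B|KiB|MiB|GiB|TiB>/s" -- e.g. "500MiB/s" -- and
// rejects ambiguous spellings like "500mb", "500 mb/s", or "500MB/s".
public class RateParser
{
    private static final Pattern RATE = Pattern.compile("(\\d+)(B|KiB|MiB|GiB|TiB)/s");

    /** Returns the rate in bytes per second, or -1 if the spelling is not canonical. */
    public static long bytesPerSecond(String value)
    {
        Matcher m = RATE.matcher(value);
        if (!m.matches())
            return -1;
        long n = Long.parseLong(m.group(1));
        switch (m.group(2))
        {
            case "B":   return n;
            case "KiB": return n * 1024L;
            case "MiB": return n * 1024L * 1024L;
            case "GiB": return n * 1024L * 1024L * 1024L;
            default:    return n * 1024L * 1024L * 1024L * 1024L; // TiB
        }
    }

    public static void main(String[] args)
    {
        System.out.println(bytesPerSecond("500MiB/s")); // canonical: accepted
        System.out.println(bytesPerSecond("500mb"));    // ambiguous: rejected (-1)
    }
}
```

Rejecting rather than guessing keeps the downstream-tooling burden fixed: a tool only ever has to recognise one spelling per unit.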
[jira] [Assigned] (CASSANDRA-15629) Fix flakey testSendSmall - org.apache.cassandra.net.ConnectionTest
[ https://issues.apache.org/jira/browse/CASSANDRA-15629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Svihla reassigned CASSANDRA-15629: --- Assignee: (was: Ryan Svihla) > Fix flakey testSendSmall - org.apache.cassandra.net.ConnectionTest > -- > > Key: CASSANDRA-15629 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15629 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: Yifan Cai >Priority: Normal > Fix For: 4.0-beta > > > The test fails sometimes with the following error message and trace. > {code:java} > processed count values don't match expected:<10> but was:<9> > junit.framework.AssertionFailedError: processed count values don't match > expected:<10> but was:<9> > at > org.apache.cassandra.net.ConnectionUtils$InboundCountChecker.doCheck(ConnectionUtils.java:217) > at > org.apache.cassandra.net.ConnectionUtils$InboundCountChecker.check(ConnectionUtils.java:200) > at > org.apache.cassandra.net.ConnectionTest.lambda$testSendSmall$11(ConnectionTest.java:305) > at > org.apache.cassandra.net.ConnectionTest.lambda$doTest$8(ConnectionTest.java:242) > at > org.apache.cassandra.net.ConnectionTest.doTestManual(ConnectionTest.java:262) > at org.apache.cassandra.net.ConnectionTest.doTest(ConnectionTest.java:240) > at org.apache.cassandra.net.ConnectionTest.test(ConnectionTest.java:229) > at > org.apache.cassandra.net.ConnectionTest.testSendSmall(ConnectionTest.java:277) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Assigned] (CASSANDRA-15629) Fix flakey testSendSmall - org.apache.cassandra.net.ConnectionTest
[ https://issues.apache.org/jira/browse/CASSANDRA-15629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Svihla reassigned CASSANDRA-15629: --- Assignee: Ryan Svihla
[jira] [Commented] (CASSANDRA-15605) Broken dtest replication_test.py::TestSnitchConfigurationUpdate
[ https://issues.apache.org/jira/browse/CASSANDRA-15605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17057724#comment-17057724 ] Ryan Svihla commented on CASSANDRA-15605: - After running for a while I did get a failure on a different test (not one of the ones listed), so that may be worth another Jira, but there were no flaky failures on the listed tests after an overnight run. > Broken dtest replication_test.py::TestSnitchConfigurationUpdate > --- > > Key: CASSANDRA-15605 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15605 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest >Reporter: Sam Tunnicliffe >Assignee: Ryan Svihla >Priority: Normal > Fix For: 4.0-alpha > > > Noticed this failing on a couple of CI runs and repros when running trunk > locally and on CircleCI > 2 or 3 tests are consistently failing: > * {{test_rf_expand_gossiping_property_file_snitch}} > * {{test_rf_expand_property_file_snitch}} > * {{test_move_forwards_between_and_cleanup}} > [https://circleci.com/workflow-run/f23f13a9-bbdc-4764-8336-109517e137f1]
[jira] [Comment Edited] (CASSANDRA-15605) Broken dtest replication_test.py::TestSnitchConfigurationUpdate
[ https://issues.apache.org/jira/browse/CASSANDRA-15605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17057238#comment-17057238 ] Ryan Svihla edited comment on CASSANDRA-15605 at 3/11/20, 5:25 PM: --- Looks like the racks were assumed to be in the same order; adding a sort gets the tests to pass. I'm not sure why the order is changing sometimes, but in all cases the topology had changed and the comparison was just missing that fact.

fails:
{code}
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address    Load        Tokens  Owns (effective)  Host ID                               Rack
UN 127.0.0.3  105.28 KiB  1       ?                 41df89ae-0631-447c-9def-ff1fbff3b8a7  rack2
UN 127.0.0.2  164.25 KiB  1       ?                 47a49e4e-6964-4b00-b353-d3ab345f4aba  rack1
UN 127.0.0.1  128.46 KiB  1       ?                 b63ad12d-1d12-4e51-95ba-88ff61b252ea  rack0
{code}

works:
{code}
16:05:35,630 replication_test DEBUG Datacenter: dc1
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address    Load        Tokens  Owns (effective)  Host ID                               Rack
UN 127.0.0.1  132.51 KiB  1       100.0%            8b244cb0-135d-4745-b498-86f46187348f  rack0
UN 127.0.0.2  164.36 KiB  1       100.0%            36f42f34-fe8c-4543-b6c9-6db12b93b660  rack1
UN 127.0.0.3  192.66 KiB  1       100.0%            1224e768-db1f-4d56-a1c8-db9b764a46d5  rack2
{code}

[pr|https://github.com/apache/cassandra-dtest/pull/58] was (Author: rssvihla): https://github.com/apache/cassandra-dtest/pull/58
[jira] [Issue Comment Deleted] (CASSANDRA-15605) Broken dtest replication_test.py::TestSnitchConfigurationUpdate
[ https://issues.apache.org/jira/browse/CASSANDRA-15605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Svihla updated CASSANDRA-15605: Comment: was deleted
[jira] [Updated] (CASSANDRA-15605) Broken dtest replication_test.py::TestSnitchConfigurationUpdate
[ https://issues.apache.org/jira/browse/CASSANDRA-15605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Svihla updated CASSANDRA-15605: Test and Documentation Plan: not needed Status: Patch Available (was: In Progress) https://github.com/apache/cassandra-dtest/pull/58
[jira] [Comment Edited] (CASSANDRA-15605) Broken dtest replication_test.py::TestSnitchConfigurationUpdate
[ https://issues.apache.org/jira/browse/CASSANDRA-15605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17057232#comment-17057232 ] Ryan Svihla edited comment on CASSANDRA-15605 at 3/11/20, 5:17 PM: --- Looks like the racks were assumed to be in the same order; adding a sort gets the tests to pass. I'm not sure why the order is changing sometimes, but in all cases the topology had changed and the comparison was just missing that fact. [pr|https://github.com/apache/cassandra-dtest/pull/58] was (Author: rssvihla): Looks like the racks were assumed to be in the same order, adding a sort gets the tests to pass. I'm not sure why the order is changing sometimes, but in all cases the topology had changed and the comparison was just missing that fact: [pr|https://github.com/apache/cassandra-dtest/pull/58]
[jira] [Commented] (CASSANDRA-15605) Broken dtest replication_test.py::TestSnitchConfigurationUpdate
[ https://issues.apache.org/jira/browse/CASSANDRA-15605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17057232#comment-17057232 ] Ryan Svihla commented on CASSANDRA-15605: - Looks like the racks were assumed to be in the same order, adding a sort gets the tests to pass. I'm not sure why the order is changing sometimes, but in all cases the topology had changed and the comparison was just missing that fact: [pr|https://github.com/apache/cassandra-dtest/pull/58]
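The fix described above — comparing rack topology independently of listing order — can be sketched in dtest-style Python. This is an illustrative sketch, not the actual code in the PR; the helper names `racks_from_status` and `topology_matches` are assumptions.

```python
# Hypothetical sketch of the order-independence fix: extract (address, rack)
# pairs from the UN/DN node lines of `nodetool status` output and sort them,
# so the comparison no longer depends on the order nodes are listed in.
def racks_from_status(status_lines):
    """Return sorted (address, rack) pairs from nodetool status node lines."""
    pairs = []
    for line in status_lines:
        fields = line.split()
        if fields and fields[0] in ("UN", "DN"):
            # address is the second column, rack is the last
            pairs.append((fields[1], fields[-1]))
    return sorted(pairs)


def topology_matches(actual_lines, expected_lines):
    """Compare rack assignments regardless of listing order."""
    return racks_from_status(actual_lines) == racks_from_status(expected_lines)
```

A naive line-by-line comparison of the two orderings fails even though both carry the same rack assignments; the sorted comparison above passes.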
[jira] [Commented] (CASSANDRA-15605) Broken dtest replication_test.py::TestSnitchConfigurationUpdate
[ https://issues.apache.org/jira/browse/CASSANDRA-15605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056886#comment-17056886 ] Ryan Svihla commented on CASSANDRA-15605: - So I finally got them to fail; I had to get down to a 2-core VM before they would:
{code}
test_rf_collapse_gossiping_property_file_snitch passed 1 out of the required 1 times. Success!
test_rf_expand_gossiping_property_file_snitch failed and was not selected for rerun.
  Ran out of time waiting for topology to change on node 0
  [, , ]
test_rf_collapse_property_file_snitch passed 1 out of the required 1 times. Success!
test_rf_expand_property_file_snitch failed and was not selected for rerun.
  Ran out of time waiting for topology to change on node 0
  [, , ]
test_cannot_restart_with_different_rack passed 1 out of the required 1 times. Success!
test_failed_snitch_update_gossiping_property_file_snitch passed 1 out of the required 1 times. Success!
test_failed_snitch_update_property_file_snitch passed 1 out of the required 1 times. Success!
test_switch_data_center_startup_fails passed 1 out of the required 1 times. Success!
{code}
[jira] [Assigned] (CASSANDRA-15605) Broken dtest replication_test.py::TestSnitchConfigurationUpdate
[ https://issues.apache.org/jira/browse/CASSANDRA-15605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Svihla reassigned CASSANDRA-15605: --- Assignee: Ryan Svihla
[jira] [Comment Edited] (CASSANDRA-15557) Fix flaky test org.apache.cassandra.cql3.validation.operations.AlterTest testDropListAndAddListWithSameName
[ https://issues.apache.org/jira/browse/CASSANDRA-15557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17055682#comment-17055682 ] Ryan Svihla edited comment on CASSANDRA-15557 at 3/10/20, 8:15 AM: --- I borrowed the code from CASSANDRA-15303 and updated the PR. Some notes: * There is an additional fix in there related to comparison of relations. It doesn't appear to be related to either Jira, but it makes sense at a glance; either way [~jasonstack] added it and he can comment * Made ClientState.getTimestamp() static and call it directly instead of FBUtilities.timestampMicros() But I'm also ok with someone first merging in CASSANDRA-15303; then I can do another PR based on that code base with just the changes that use ClientState.getTimestamp() instead of FBUtilities.timestampMicros(). > Fix flaky test org.apache.cassandra.cql3.validation.operations.AlterTest > testDropListAndAddListWithSameName > --- > > Key: CASSANDRA-15557 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15557 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: David Capwell >Assignee: Ryan Svihla >Priority: Normal > Labels: pull-request-available > Fix For: 4.0-alpha > > Time Spent: 10m > Remaining Estimate: 0h > > https://app.circleci.com/jobs/github/dcapwell/cassandra/482/tests > {code} > junit.framework.AssertionFailedError: Invalid value for row 0 column 2 > (mycollection of type list), expected but got <[first element]> > at org.apache.cassandra.cql3.CQLTester.assertRows(CQLTester.java:1070) > at > org.apache.cassandra.cql3.validation.operations.AlterTest.testDropListAndAddListWithSameName(AlterTest.java:91) > {code}
[jira] [Commented] (CASSANDRA-15557) Fix flaky test org.apache.cassandra.cql3.validation.operations.AlterTest testDropListAndAddListWithSameName
[ https://issues.apache.org/jira/browse/CASSANDRA-15557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17055682#comment-17055682 ] Ryan Svihla commented on CASSANDRA-15557: - I borrowed the code from CASSANDRA-15303 and updated the PR. Some notes: * There is an additional fix in there related to comparison of relations. It doesn't appear to be related to either Jira, but it makes sense at a glance; either way [~jasonstack] added it and he can comment * Made ClientState.getTimestamp() static and call it directly instead of FBUtilities.timestampMicros() But I'm also ok with someone first merging in CASSANDRA-15303; then I can do another PR based on that code base with just the changes that use ClientState.getTimestamp() instead of FBUtilities.timestampMicros().
[jira] [Updated] (CASSANDRA-15626) Need microsecond precision for dropped columns so we can avoid timestamp issues
[ https://issues.apache.org/jira/browse/CASSANDRA-15626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Svihla updated CASSANDRA-15626: Description: In CASSANDRA-15557 the fix for the flaky test is reimplementing the logic from CASSANDRA-12997, which was removed as part of CASSANDRA-13426. However, since dropped columns are stored at millisecond precision instead of microsecond precision, and ClientState.getTimestamp adds microseconds on each call, we will lose precision on save and some writes that should be dropped could reappear. Note views are affected as well: [https://github.com/apache/cassandra/blob/cb83fbff479bb90e9abeaade9e0f8843634c974d/src/java/org/apache/cassandra/schema/SchemaKeyspace.java#L712-L716] was: In CASSANDRA-15557 the fix for the flaky test was reimplementing the logic from CASSANDRA-12997 which was removed as part of CASSANDRA-13426 However, since dropped columns are stored at a millisecond precision instead of a microsecond precision and ClientState.getTimestamp adds microseconds on each call we will lose the precision on save and some writes that should be dropped could reappear. Note views affected as well [https://github.com/apache/cassandra/blob/cb83fbff479bb90e9abeaade9e0f8843634c974d/src/java/org/apache/cassandra/schema/SchemaKeyspace.java#L712-L716]
[jira] [Commented] (CASSANDRA-15557) Fix flaky test org.apache.cassandra.cql3.validation.operations.AlterTest testDropListAndAddListWithSameName
[ https://issues.apache.org/jira/browse/CASSANDRA-15557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17054989#comment-17054989 ] Ryan Svihla commented on CASSANDRA-15557: - New [PR|https://github.com/apache/cassandra/pull/465] Note: I think this also happens to fix this behavior in CASSANDRA-15303, and I made a new Jira for the issue this causes with dropped columns and precision: https://issues.apache.org/jira/browse/CASSANDRA-15626
[jira] [Created] (CASSANDRA-15626) Need microsecond precision for dropped columns so we can avoid timestamp issues
Ryan Svihla created CASSANDRA-15626: --- Summary: Need microsecond precision for dropped columns so we can avoid timestamp issues Key: CASSANDRA-15626 URL: https://issues.apache.org/jira/browse/CASSANDRA-15626 Project: Cassandra Issue Type: Improvement Components: Local/SSTable Reporter: Ryan Svihla In CASSANDRA-15557 the fix for the flaky test was reimplementing the logic from CASSANDRA-12997 which was removed as part of CASSANDRA-13426 However, since dropped columns are stored at a millisecond precision instead of a microsecond precision and ClientState.getTimestamp adds microseconds on each call we will lose the precision on save and some writes that should be dropped could reappear. Note views affected as well [https://github.com/apache/cassandra/blob/cb83fbff479bb90e9abeaade9e0f8843634c974d/src/java/org/apache/cassandra/schema/SchemaKeyspace.java#L712-L716]
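The precision problem described here can be illustrated with a small Python sketch. The timestamps echo the values seen in CASSANDRA-15557, and `to_millis_precision` is a hypothetical stand-in for how the dropped-column record is persisted, not Cassandra's actual storage code.

```python
# Illustrative sketch (hypothetical helper, see lead-in): a drop time taken
# in microseconds but stored at millisecond precision loses its last three
# digits. A write made just before the drop, within the same millisecond,
# then compares as newer than the stored drop time and is not purged.
def to_millis_precision(ts_micros):
    """Truncate a microsecond timestamp to millisecond precision."""
    return (ts_micros // 1000) * 1000


actual_drop = 1583753957613742                   # drop time, in microseconds
stored_drop = to_millis_precision(actual_drop)   # low digits truncated
earlier_write = 1583753957613500                 # written before the drop

should_be_dropped = earlier_write < actual_drop      # true at full precision
survives_stored_check = earlier_write > stored_drop  # true after truncation
```

At full microsecond precision the write predates the drop and should be purged, but against the millisecond-truncated record it looks newer and reappears, which is the resurfacing-writes hazard the ticket describes.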
[jira] [Commented] (CASSANDRA-15557) Fix flaky test org.apache.cassandra.cql3.validation.operations.AlterTest testDropListAndAddListWithSameName
[ https://issues.apache.org/jira/browse/CASSANDRA-15557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17054908#comment-17054908 ] Ryan Svihla commented on CASSANDRA-15557: - So looking at the alter schema logic more: [https://github.com/apache/cassandra/blob/08b2192da0eb6deddcd8f79cd180d069442223ae/src/java/org/apache/cassandra/cql3/statements/schema/AlterTableStatement.java#L398] and [https://github.com/apache/cassandra/blob/08b2192da0eb6deddcd8f79cd180d069442223ae/src/java/org/apache/cassandra/cql3/statements/schema/AlterTableStatement.java#L411-L426] it does seem (naively) reasonable to have it use the ClientState's getTimestamp() method in AlterTableStatement, since the ClientState is already there, but I'm sure I'm missing lots of background. Will wait for more experienced people to weigh in.
[jira] [Commented] (CASSANDRA-15557) Fix flaky test org.apache.cassandra.cql3.validation.operations.AlterTest testDropListAndAddListWithSameName
[ https://issues.apache.org/jira/browse/CASSANDRA-15557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17054887#comment-17054887 ] Ryan Svihla commented on CASSANDRA-15557: - So digging into the actual failure: this took a few tries, as wrapping logging or flushing sstables seemed to make it hard to reproduce, but I've confirmed it's a time-based error, at least in this case:

row ts: 1583753957613001
dropped time: 1583753957613000

{code}
[junit-timeout] Testcase: testDropListAndAddListWithSameName(org.apache.cassandra.cql3.validation.operations.AlterTest): FAILED
[junit-timeout] Dropped column: {java.nio.HeapByteBuffer[pos=0 lim=12 cap=12]=DroppedColumn{column=mycollection, droppedTime=1583753957613000}} Row timestamp: 1583753957613001
[junit-timeout] junit.framework.AssertionFailedError: Dropped column: {java.nio.HeapByteBuffer[pos=0 lim=12 cap=12]=DroppedColumn{column=mycollection, droppedTime=1583753957613000}} Row timestamp: 1583753957613001
[junit-timeout] at org.apache.cassandra.cql3.validation.operations.AlterTest.testDropListAndAddListWithSameName(AlterTest.java:102)
[junit-timeout] at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[junit-timeout] at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[junit-timeout] at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[junit-timeout] Caused by: java.lang.AssertionError: Invalid value for row 0 column 2 (mycollection of type list), expected but got <[first element]>
[junit-timeout] at org.apache.cassandra.cql3.CQLTester.assertRows(CQLTester.java:1070)
[junit-timeout] at org.apache.cassandra.cql3.validation.operations.AlterTest.testDropListAndAddListWithSameName(AlterTest.java:98)
{code}
[jira] [Comment Edited] (CASSANDRA-15557) Fix flaky test org.apache.cassandra.cql3.validation.operations.AlterTest testDropListAndAddListWithSameName
[ https://issues.apache.org/jira/browse/CASSANDRA-15557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17054887#comment-17054887 ] Ryan Svihla edited comment on CASSANDRA-15557 at 3/9/20, 11:56 AM: --- So digging into the actual failure (this took a few tries, as wrapping logging or flushing sstables seemed to make it hard to reproduce), I've confirmed it's a time-based error, at least in this case: row ts: {{1583753957613001}}, dropped time: {{1583753957613000}} {{[junit-timeout] Testcase: testDropListAndAddListWithSameName(org.apache.cassandra.cql3.validation.operations.AlterTest): FAILED }} {{[junit-timeout] Dropped column: {java.nio.HeapByteBuffer[pos=0 lim=12 cap=12]=DroppedColumn{column=mycollection, droppedTime=1583753957613000}} Row timestamp: 1583753957613001 }} {{[junit-timeout] junit.framework.AssertionFailedError: Dropped column: {java.nio.HeapByteBuffer[pos=0 lim=12 cap=12]=DroppedColumn{column=mycollection, droppedTime=1583753957613000}} Row timestamp: 1583753957613001}} {{[junit-timeout] at org.apache.cassandra.cql3.validation.operations.AlterTest.testDropListAndAddListWithSameName(AlterTest.java:102) }} {{[junit-timeout] at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) }} {{[junit-timeout] at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) }} {{[junit-timeout] at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) }} {{[junit-timeout] Caused by: java.lang.AssertionError: Invalid value for row 0 column 2 (mycollection of type list), expected but got <[first element]> }} {{[junit-timeout] at org.apache.cassandra.cql3.CQLTester.assertRows(CQLTester.java:1070)}} {{[junit-timeout] at org.apache.cassandra.cql3.validation.operations.AlterTest.testDropListAndAddListWithSameName(AlterTest.java:98) }}
> Fix flaky test org.apache.cassandra.cql3.validation.operations.AlterTest > testDropListAndAddListWithSameName > --- > > Key: CASSANDRA-15557 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15557 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: David Capwell >Assignee: Ryan Svihla >Priority: Normal > Fix For: 4.0-alpha > > > https://app.circleci.com/jobs/github/dcapwell/cassandra/482/tests > {code} > junit.framework.AssertionFailedError: Invalid value for row 0 column 2 > (mycollection of type list), expected but got <[first element]> > at org.apache.cassandra.cql3.CQLTester.assertRows(CQLTester.java:1070) > at > org.apache.cassandra.cql3.validation.operations.AlterTest.testDropListAndAddListWithSameName(AlterTest.java:91) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
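The two values in the log above differ only in the final microsecond digit, i.e. the drop and the re-add landed at essentially the same instant. A hypothetical simplification of the visibility rule this failure points at (illustrative only, not Cassandra's actual implementation): a cell on a re-added column survives only if it was written strictly after the recorded drop time, so timestamps derived from the same `System.currentTimeMillis()` reading tie and the fresh insert is hidden.

```java
// Hypothetical simplification (not Cassandra's actual code) of the
// dropped-column visibility rule discussed in this thread: a cell on a
// re-added column is visible only if it was written strictly after the
// column's recorded drop time. Timestamps derived from the same
// System.currentTimeMillis() reading tie, hiding the fresh insert.
public final class DroppedColumnTie {
    static boolean cellVisible(long cellTimestampMicros, long droppedTimeMicros) {
        return cellTimestampMicros > droppedTimeMicros;
    }

    public static void main(String[] args) {
        long sameMillis = 1583753957613L;      // one millisecond reading
        long droppedTime = sameMillis * 1000;  // drop recorded at t
        long insertTs = sameMillis * 1000;     // insert also lands at t
        System.out.println(cellVisible(insertTs, droppedTime));        // tie: hidden
        System.out.println(cellVisible(droppedTime + 1, droppedTime)); // one micro later: visible
    }
}
```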
[jira] [Commented] (CASSANDRA-15557) Fix flaky test org.apache.cassandra.cql3.validation.operations.AlterTest testDropListAndAddListWithSameName
[ https://issues.apache.org/jira/browse/CASSANDRA-15557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17052417#comment-17052417 ] Ryan Svihla commented on CASSANDRA-15557: - Yep, makes sense. I'd feel better if I had some proof of why it's happening now (logging more or copying out the dropped columns table makes it not happen on the boxes I've replicated it with), but either way it can definitely happen with System.currentTimeMillis() and clock skew, and we'd need to protect against that. I reopened my PR and restored my branch with the client-side timestamp, simple fix: https://github.com/rssvihla/cassandra/pull/1
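The client-side timestamp protection mentioned above can be sketched as a strictly monotonic generator: even when two callers hit the same millisecond, or the wall clock steps backwards under skew, every caller observes a strictly increasing value. This is an illustrative sketch only, not Cassandra's actual ClientState code.

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch (not Cassandra's actual ClientState code) of the
// kind of guard discussed above: timestamps are forced to be strictly
// increasing, so a drop and a subsequent re-add can never tie, even when
// System.currentTimeMillis() returns the same millisecond twice or the
// clock steps backwards.
public final class MonotonicMicros {
    private static final AtomicLong last = new AtomicLong(0);

    public static long next() {
        while (true) {
            long nowMicros = System.currentTimeMillis() * 1000;
            long prev = last.get();
            // Bump past the previous value on a tie or backwards clock step.
            long candidate = nowMicros > prev ? nowMicros : prev + 1;
            if (last.compareAndSet(prev, candidate))
                return candidate;
        }
    }

    public static void main(String[] args) {
        long a = next();
        long b = next();
        System.out.println(b > a); // strictly increasing even within one millisecond
    }
}
```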
[jira] [Comment Edited] (CASSANDRA-15557) Fix flaky test org.apache.cassandra.cql3.validation.operations.AlterTest testDropListAndAddListWithSameName
[ https://issues.apache.org/jira/browse/CASSANDRA-15557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1705#comment-1705 ] Ryan Svihla edited comment on CASSANDRA-15557 at 3/5/20, 10:27 AM: --- Yes, silly me. Easy enough to prove: when setting the timestamps to the same value the tests still pass, and, just to totally eliminate weird side effects, setting the timestamps in reverse breaks the test as expected, so I don't think there is anything specific about setting the timestamp to any value that makes it pass. So I'm not sure I have a good theory as to why it breaks in the first place. It does then act as if somehow the drop is happening "before" the add. I'll keep digging, thanks for the review.
[jira] [Commented] (CASSANDRA-15557) Fix flaky test org.apache.cassandra.cql3.validation.operations.AlterTest testDropListAndAddListWithSameName
[ https://issues.apache.org/jira/browse/CASSANDRA-15557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1705#comment-1705 ] Ryan Svihla commented on CASSANDRA-15557: - Yes, silly me. Easy enough to prove: when setting the timestamps to the same value the tests still pass, and, just to totally eliminate weird side effects, setting the timestamps in reverse breaks the test as expected, so I don't think there is anything specific about setting the timestamp to any value that makes it pass. So I'm not sure I have a good theory as to why it breaks in the first place. It does then act as if somehow the drop is happening "before" the insert. I'll keep digging, thanks for the review.
[jira] [Comment Edited] (CASSANDRA-15557) Fix flaky test org.apache.cassandra.cql3.validation.operations.AlterTest testDropListAndAddListWithSameName
[ https://issues.apache.org/jira/browse/CASSANDRA-15557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17051025#comment-17051025 ] Ryan Svihla edited comment on CASSANDRA-15557 at 3/4/20, 9:22 AM: --- Not sure what's going on there, maybe some Jira macro run amok. Take 2: Code as it was when the test was originally fixed: [src/java/org/apache/cassandra/cql3/statements/AlterTableStatement.java|https://github.com/pcmanus/cassandra/commit/020960d727b882349aff00c17a1c46bf3c7f7a24#diff-43b2d530031e2f7ad4286bd05fed4ca0] line 253 (dunno why the highlight isn't being added to the links). Change below that I think stripped out the client state timestamp: [src/java/org/apache/cassandra/cql3/statements/schema/AlterTableStatement.java|https://github.com/apache/cassandra/commit/207c80c1fd63dfbd8ca7e615ec8002ee8983c5d6#diff-50a7bf78bc4f943f42ce60c8768484a6] line 400.
[jira] [Commented] (CASSANDRA-15557) Fix flaky test org.apache.cassandra.cql3.validation.operations.AlterTest testDropListAndAddListWithSameName
[ https://issues.apache.org/jira/browse/CASSANDRA-15557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17051025#comment-17051025 ] Ryan Svihla commented on CASSANDRA-15557: - Not sure what's going on there, maybe some Jira macro run amok. Take 2: Code as it was when the test was originally fixed [https://github.com/pcmanus/cassandra/commit/020960d727b882349aff00c17a1c46bf3c7f7a24#diff-43b2d530031e2f7ad4286bd05fed4ca0R253] Change below that I think stripped out the client state timestamp [https://github.com/apache/cassandra/commit/207c80c1fd63dfbd8ca7e615ec8002ee8983c5d6#diff-50a7bf78bc4f943f42ce60c8768484a6R400]
[jira] [Commented] (CASSANDRA-15557) Fix flaky test org.apache.cassandra.cql3.validation.operations.AlterTest testDropListAndAddListWithSameName
[ https://issues.apache.org/jira/browse/CASSANDRA-15557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17051007#comment-17051007 ] Ryan Svihla commented on CASSANDRA-15557: - [PR|https://github.com/rssvihla/cassandra/pull/1] [Code|https://github.com/rssvihla/cassandra/tree/failing-test-15557] Ran the tests all night on two different servers and had no issues (whereas before, on the same hardware, I had 30 or 40 failures, usually within 5 to 10 minutes). The theory I have is that this test got broken when the schema management changed in https://issues.apache.org/jira/browse/CASSANDRA-13426: the test was no longer relying on client query state to set a timestamp as it was [here|https://github.com/pcmanus/cassandra/commit/020960d727b882349aff00c17a1c46bf3c7f7a24#diff-43b2d530031e2f7ad4286bd05fed4ca0R253], and this logic appears to not have made it back in [here|https://github.com/apache/cassandra/commit/207c80c1fd63dfbd8ca7e615ec8002ee8983c5d6#diff-50a7bf78bc4f943f42ce60c8768484a6R400], which meant an occasional tie. But maybe I'm misreading it all, lots of code to go through in that PR. The fix I used was one of the two proposed in CASSANDRA-12997 (i.e. just using a manual timestamp).
[jira] [Assigned] (CASSANDRA-15557) Fix flaky test org.apache.cassandra.cql3.validation.operations.AlterTest testDropListAndAddListWithSameName
[ https://issues.apache.org/jira/browse/CASSANDRA-15557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Svihla reassigned CASSANDRA-15557: --- Assignee: Ryan Svihla
[jira] [Commented] (CASSANDRA-15557) Fix flaky test org.apache.cassandra.cql3.validation.operations.AlterTest testDropListAndAddListWithSameName
[ https://issues.apache.org/jira/browse/CASSANDRA-15557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17049232#comment-17049232 ] Ryan Svihla commented on CASSANDRA-15557: - It would fit if this were a timestamp issue like the previous failure of this test (see https://issues.apache.org/jira/browse/CASSANDRA-12997), but I'd expect to see it more on the faster machines than on the slower ones that I've been able to repro this quickly with.
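One way to gauge how likely a tie is, independent of Cassandra: two back-to-back `System.currentTimeMillis()` readings almost always fall in the same millisecond, so any timestamp derived by scaling them (e.g. to microseconds) collides. A small standalone demo:

```java
// Small standalone demo of the timestamp-tie hypothesis: consecutive
// System.currentTimeMillis() readings usually land in the same
// millisecond, so microsecond timestamps derived by scaling collide.
public final class MillisTieDemo {
    public static void main(String[] args) {
        int ties = 0, runs = 1000;
        for (int i = 0; i < runs; i++) {
            long t1 = System.currentTimeMillis() * 1000;
            long t2 = System.currentTimeMillis() * 1000;
            if (t1 == t2) ties++;
        }
        // On typical hardware the vast majority of the pairs tie.
        System.out.println("ties: " + ties + "/" + runs);
    }
}
```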
[jira] [Comment Edited] (CASSANDRA-15557) Fix flaky test org.apache.cassandra.cql3.validation.operations.AlterTest testDropListAndAddListWithSameName
[ https://issues.apache.org/jira/browse/CASSANDRA-15557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17048991#comment-17048991 ] Ryan Svihla edited comment on CASSANDRA-15557 at 3/2/20 10:14 AM: -- Seems real, as I've been able to reproduce a failure in a few environments just by running the test suite in a loop; how frequently it happens varies. So far it seems the slower the env (fewer cores, less RAM, etc.) the more likely it happens. Using _while ant test -Dtest.name=AlterTest; do :; done_ {{[junit-timeout] DEBUG [PerDiskMemtableFlushWriter_0:1] 2020-03-02 09:39:35,759 Memtable.java:483 - Completed flushing /home/debian/code/cassandra/build/test/cassandra/data:0/system_schema/keyspaces-abac5682dea631c5b535b3d6cffd0fb6/na-194-big-Data.db (0.035KiB) for commitlog position CommitLogPosition(segmentId=1583141964092, position=7807)}} {{[junit-timeout] DEBUG [MemtableFlushWriter:2] 2020-03-02 09:39:35,764 ColumnFamilyStore.java:1144 - Flushed to [BigTableReader(path='/home/debian/code/cassandra/build/test/cassandra/data:0/system_schema/keyspaces-abac5682dea631c5b535b3d6cffd0fb6/na-194-big-Data.db')] (1 sstables, 4.830KiB), biggest 4.830KiB, smallest 4.830KiB}} {{[junit-timeout] - ---}} {{[junit-timeout] Testcase: testDropListAndAddListWithSameName(org.apache.cassandra.cql3.validation.operations.AlterTest): FAILED}} {{[junit-timeout] Invalid value for row 0 column 2 (mycollection of type list), expected but got <[first element]>}} {{[junit-timeout] junit.framework.AssertionFailedError: Invalid value for row 0 column 2 (mycollection of type list), expected but got <[first element]>}} {{[junit-timeout] at org.apache.cassandra.cql3.CQLTester.assertRows(CQLTester.java:1070)}} {{[junit-timeout] at org.apache.cassandra.cql3.validation.operations.AlterTest.testDropListAndAddListWithSameName(AlterTest.java:91)}} {{[junit-timeout] Test org.apache.cassandra.cql3.validation.operations.AlterTest FAILED}} {{ [delete] Deleting directory /home/debian/code/cassandra/build/test/cassandra/commitlog:0}} {{ [delete] Deleting directory /home/debian/code/cassandra/build/test/cassandra/data:0}} {{ [delete] Deleting directory /home/debian/code/cassandra/build/test/cassandra/saved_caches:0}} {{ [delete] Deleting directory /home/debian/code/cassandra/build/test/cassandra/hints:0}} {{[junitreport] Processing /home/debian/code/cassandra/build/test/TESTS-TestSuites.xml to /tmp/null291777528}} {{[junitreport] Loading stylesheet jar:file:/usr/share/ant/lib/ant-junit.jar!/org/apache/tools/ant/taskdefs/optional/junit/xsl/junit-frames.xsl}} {{[junitreport] Transform time: 483ms}} {{[junitreport] Deleting: /tmp/null291777528}} {{BUILD FAILED}} {{/home/debian/code/cassandra/build.xml:1930: The following error occurred while executing this line:}} {{/home/debian/code/cassandra/build.xml:1831: Some test(s) failed.}}
[jira] [Commented] (CASSANDRA-15557) Fix flaky test org.apache.cassandra.cql3.validation.operations.AlterTest testDropListAndAddListWithSameName
[ https://issues.apache.org/jira/browse/CASSANDRA-15557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17048991#comment-17048991 ] Ryan Svihla commented on CASSANDRA-15557: - Seems real, as I've been able to reproduce a failure in a few environments just by running the test suite in a loop; how frequently it happens varies. So far it seems the slower the env the more likely it happens. Using _while ant test -Dtest.name=AlterTest; do :; done_ {{[junit-timeout] DEBUG [PerDiskMemtableFlushWriter_0:1] 2020-03-02 09:39:35,759 Memtable.java:483 - Completed flushing /home/debian/code/cassandra/build/test/cassandra/data:0/system_schema/keyspaces-abac5682dea631c5b535b3d6cffd0fb6/na-194-big-Data.db (0.035KiB) for commitlog position CommitLogPosition(segmentId=1583141964092, position=7807)}} {{[junit-timeout] DEBUG [MemtableFlushWriter:2] 2020-03-02 09:39:35,764 ColumnFamilyStore.java:1144 - Flushed to [BigTableReader(path='/home/debian/code/cassandra/build/test/cassandra/data:0/system_schema/keyspaces-abac5682dea631c5b535b3d6cffd0fb6/na-194-big-Data.db')] (1 sstables, 4.830KiB), biggest 4.830KiB, smallest 4.830KiB}} {{[junit-timeout] - ---}} {{[junit-timeout] Testcase: testDropListAndAddListWithSameName(org.apache.cassandra.cql3.validation.operations.AlterTest): FAILED}} {{[junit-timeout] Invalid value for row 0 column 2 (mycollection of type list), expected but got <[first element]>}} {{[junit-timeout] junit.framework.AssertionFailedError: Invalid value for row 0 column 2 (mycollection of type list), expected but got <[first element]>}} {{[junit-timeout] at org.apache.cassandra.cql3.CQLTester.assertRows(CQLTester.java:1070)}} {{[junit-timeout] at org.apache.cassandra.cql3.validation.operations.AlterTest.testDropListAndAddListWithSameName(AlterTest.java:91)}} {{[junit-timeout] Test org.apache.cassandra.cql3.validation.operations.AlterTest FAILED}} {{ [delete] Deleting directory /home/debian/code/cassandra/build/test/cassandra/commitlog:0}} {{ [delete] Deleting directory /home/debian/code/cassandra/build/test/cassandra/data:0}} {{ [delete] Deleting directory /home/debian/code/cassandra/build/test/cassandra/saved_caches:0}} {{ [delete] Deleting directory /home/debian/code/cassandra/build/test/cassandra/hints:0}} {{[junitreport] Processing /home/debian/code/cassandra/build/test/TESTS-TestSuites.xml to /tmp/null291777528}} {{[junitreport] Loading stylesheet jar:file:/usr/share/ant/lib/ant-junit.jar!/org/apache/tools/ant/taskdefs/optional/junit/xsl/junit-frames.xsl}} {{[junitreport] Transform time: 483ms}} {{[junitreport] Deleting: /tmp/null291777528}} {{BUILD FAILED}} {{/home/debian/code/cassandra/build.xml:1930: The following error occurred while executing this line:}} {{/home/debian/code/cassandra/build.xml:1831: Some test(s) failed.}}
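The repro loop quoted above can be generalized into a small shell helper that reruns any command until its first failure and reports how many runs passed first; the `ant test -Dtest.name=AlterTest` invocation assumes a Cassandra source checkout.

```shell
# Generalized form of the repro loop used above: rerun a command until
# it first fails, reporting how many runs passed before the failure.
# Usage: run_until_failure ant test -Dtest.name=AlterTest
run_until_failure() {
  n=0
  while "$@"; do
    n=$((n + 1))
    echo "passed $n runs"
  done
  echo "failed after $n passing runs"
}
```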
[jira] [Commented] (CASSANDRA-14737) Limit the dependencies used by UDFs/UDAs
[ https://issues.apache.org/jira/browse/CASSANDRA-14737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17030509#comment-17030509 ] Ryan Svihla commented on CASSANDRA-14737: - +1 I applied the patch to a recent trunk and the builds are passing locally, with the exception of some of the dtests, but those have all been flaky. I can't see any downside to applying the patch, and it'd be nice to separate this out. > Limit the dependencies used by UDFs/UDAs > > > Key: CASSANDRA-14737 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14737 > Project: Cassandra > Issue Type: Improvement > Components: Legacy/CQL >Reporter: Robert Stupp >Assignee: Robert Stupp >Priority: Low > Labels: UDF > Fix For: 4.0 > > > In an effort to clean up our hygiene and limit the dependencies used by > UDFs/UDAs, I think we should refactor the UDF code parts and remove the > dependency to the Java Driver in that area without breaking existing > UDFs/UDAs. > > The patch is in [this > branch|https://github.com/snazy/cassandra/tree/feature/remove-udf-driver-dep-trunk]. > The changes are rather trivial and provide 100% backwards compatibility for > existing UDFs. > > The prototype copies the necessary parts from the Java Driver into the C* > source tree to {{org.apache.cassandra.cql3.functions.types}} and adopts its > usages - i.e. UDF/UDA code plus {{CQLSSTableWriter}} + > {{StressCQLSSTableWriter}}. The latter two classes have a reference to UDF's > {{UDHelper}} and had to be changed as well. > > Some functionality, like type parsing & handling, is duplicated in the code > base with this prototype - once in the "current" source tree and once for > UDFs. However, unifying the code paths is not trivial, since the UDF sandbox > prohibits the use of internal classes (direct and likely indirect > dependencies). 
> > /cc [~jbellis]
[jira] [Issue Comment Deleted] (CASSANDRA-14737) Limit the dependencies used by UDFs/UDAs
[ https://issues.apache.org/jira/browse/CASSANDRA-14737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Svihla updated CASSANDRA-14737: Comment: was deleted (was: Changes look valid and the tests are passing +1)
[jira] [Commented] (CASSANDRA-14737) Limit the dependencies used by UDFs/UDAs
[ https://issues.apache.org/jira/browse/CASSANDRA-14737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17027832#comment-17027832 ]

Ryan Svihla commented on CASSANDRA-14737:
Changes look valid and the tests are passing +1
[jira] [Updated] (CASSANDRA-13315) Semantically meaningful Consistency Levels
[ https://issues.apache.org/jira/browse/CASSANDRA-13315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan Svihla updated CASSANDRA-13315:
Summary: Semantically meaningful Consistency Levels (was: Consistency is confusing for new users)

> Semantically meaningful Consistency Levels
>
> Key: CASSANDRA-13315
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13315
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Ryan Svihla
>
> New users really struggle with consistency level and fall into a large number of tarpits trying to decide on the right one.
> 1. There are a LOT of consistency levels, and it's up to the end user to reason about which combinations are valid and which really express their intent. Is there any reason why writing at ALL and reading at CL TWO is better than reading at CL ONE?
> 2. They require a good understanding of failure modes to use well. It's not uncommon for people to use CL ONE and wonder why their data is missing.
> 3. The serial consistency level "bucket" is confusing even to write about and easy to get wrong even for experienced users.
> So I propose the following steps (EDIT based on Jonathan's comment):
> 1. Remove the "serial consistency" level of consistency levels and just have all consistency levels in one bucket to set; conditions still need to be required for SERIAL/LOCAL_SERIAL.
> 2. Add 3 new consistency levels pointing to existing ones but that convey intent much more clearly (EDIT: better names based on comments):
> * EVENTUALLY = LOCAL_ONE reads and writes
> * STRONG = LOCAL_QUORUM reads and writes
> * SERIAL = LOCAL_SERIAL reads and writes (though a ton of folks don't know what SERIAL means, which is why I suggested TRANSACTIONAL even if it's not as correct as I'd like)
> For global versions of these I propose keeping the old ones around; they're rarely used in the field except by accident or by particularly opinionated and advanced users.
> Drivers should put the new consistency levels in a new package, and docs should be updated to suggest their use. Likewise, setting a default CL should only offer those three settings, applying to reads and writes at the same time.
> CQLSH, I'm going to suggest, should default to HIGHLY_CONSISTENT. New sysadmins get surprised by the current default frequently, and I can think of a couple of very major escalations because people were confused about what the default behavior was.
> The benefit of all this change is that we greatly shrink the surface area one has to understand when learning Cassandra, and we have far fewer bad initial experiences and surprises. New users will be able to wrap their brains around those 3 ideas more readily than around "what happens when I have RF 2, QUORUM writes and ONE reads". Advanced users still get access to everything, while new users don't have to learn all the ins and outs of distributed theory just to write data and be able to read it back.
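The proposed names are just aliases for existing read/write level pairs; a minimal sketch of the mapping (hypothetical names and function, not an actual driver API) might look like:

```python
# Hypothetical alias table for the proposal: each semantic name resolves
# to an existing consistency level, used for both reads and writes.
PROPOSED_ALIASES = {
    "EVENTUALLY": "LOCAL_ONE",    # eventual consistency, lowest latency
    "STRONG": "LOCAL_QUORUM",     # quorum reads and writes within the local DC
    "SERIAL": "LOCAL_SERIAL",     # lightweight transactions (conditional writes)
}

def resolve(alias: str) -> str:
    """Map a proposed semantic name to the existing consistency level."""
    try:
        return PROPOSED_ALIASES[alias.upper()]
    except KeyError:
        raise ValueError(f"unknown consistency alias: {alias}") from None

print(resolve("strong"))  # -> LOCAL_QUORUM
```

A driver exposing only these three names would cover the common cases while the full set of levels stays available in a separate "advanced" package, as the ticket suggests.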
[jira] [Updated] (CASSANDRA-13318) Include number of messages attempted when logging message drops
[ https://issues.apache.org/jira/browse/CASSANDRA-13318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan Svihla updated CASSANDRA-13318:
Summary: Include number of messages attempted when logging message drops (was: Include # Of messages attempted when logging message drops)

> Include number of messages attempted when logging message drops
>
> Key: CASSANDRA-13318
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13318
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Ryan Svihla
>
> I use the log messages for mutation drops a lot for diagnostics; it'd be helpful if we included the number of messages attempted, so we can get the load during that time at a glance from the logs.
> 1131 MUTATION messages dropped in last 5000ms
> to
> 1131 MUTATION messages dropped out of 5000 MUTATION messages in last 5000ms
[jira] [Created] (CASSANDRA-13318) Include # Of messages attempted when logging message drops
Ryan Svihla created CASSANDRA-13318:
Summary: Include # Of messages attempted when logging message drops
Key: CASSANDRA-13318
URL: https://issues.apache.org/jira/browse/CASSANDRA-13318
Project: Cassandra
Issue Type: Improvement
Reporter: Ryan Svihla
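The requested change amounts to one extra counter in the periodic drop-log line. A sketch of the proposed format (hypothetical function name; not Cassandra's actual MessagingService code):

```python
def format_drop_log(dropped: int, attempted: int,
                    verb: str = "MUTATION", interval_ms: int = 5000) -> str:
    """Render the proposed drop message, including the attempted count."""
    return (f"{dropped} {verb} messages dropped out of "
            f"{attempted} {verb} messages in last {interval_ms}ms")

print(format_drop_log(1131, 5000))
# -> 1131 MUTATION messages dropped out of 5000 MUTATION messages in last 5000ms
```

With the attempted count in the same line, a grep over the logs shows both the drop rate and the load that produced it.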
[jira] [Comment Edited] (CASSANDRA-13315) Consistency is confusing for new users
[ https://issues.apache.org/jira/browse/CASSANDRA-13315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15903890#comment-15903890 ]

Ryan Svihla edited comment on CASSANDRA-13315 at 3/9/17 9:43 PM:
I don't think any should be removed, but nearly every time I dig into an EACH_QUORUM use case with someone, it ends up being not what they want. EACH_QUORUM doesn't roll back the writes, so even if it fails because of a lack of replicas, you can still be returning the 'failed write' successfully on the nodes it did succeed on. In effect, during DC connection outages, unless you just turn writes off, you get divergence between the 2 DCs, and reads show up in one DC and not in another. Also, several customers with EACH_QUORUM have had a downgrading retry policy on, defeating it entirely.
[jira] [Commented] (CASSANDRA-13315) Consistency is confusing for new users
[ https://issues.apache.org/jira/browse/CASSANDRA-13315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15903890#comment-15903890 ]

Ryan Svihla commented on CASSANDRA-13315:
I don't think any should be removed, but nearly every time I dig into an EACH_QUORUM use case with someone..it ends up being not what they want. EACH_QUORUM doesn't roll back the writes, so even if it fails because of a lack of replicas you can still be returning the 'failed write' successfully on the nodes it did succeed on, so in effect during DC connection outages unless you just turn writes off you get divergence between the TWO DCs and reads in one DC show up and not in another. Also several customers with EACH_QUORUM have had downgrading retry policy on..defeating it entirely.
[jira] [Commented] (CASSANDRA-13315) Consistency is confusing for new users
[ https://issues.apache.org/jira/browse/CASSANDRA-13315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15903481#comment-15903481 ]

Ryan Svihla commented on CASSANDRA-13315:
For SERIAL/TRANSACTIONAL I think there is a better middle-ground name in there: LIGHTWEIGHT_TRANSACTION, maybe. More Cassandra-specific and a hint for new users, without being as technically incorrect as TRANSACTIONAL.
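For context, cqlsh already exposes the two separate "buckets" the proposal wants to merge: regular consistency and serial consistency are set with distinct commands, and the serial level only applies to the Paxos phase of conditional (IF ...) writes. A sketch of a cqlsh session (the `users` table and its columns are hypothetical):

```sql
cqlsh> CONSISTENCY LOCAL_QUORUM;          -- applies to regular reads and writes
cqlsh> SERIAL CONSISTENCY LOCAL_SERIAL;   -- applies only to conditional (IF ...) statements
cqlsh> UPDATE users SET email = 'new@example.com'
   ... WHERE id = 1 IF email = 'old@example.com';
```

Having to know which of the two settings governs which statement is exactly the confusion the proposed single SERIAL (or LIGHTWEIGHT_TRANSACTION) level aims to remove.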
[jira] [Updated] (CASSANDRA-13315) Consistency is confusing for new users
[ https://issues.apache.org/jira/browse/CASSANDRA-13315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan Svihla updated CASSANDRA-13315:
Description:
New users really struggle with consistency level and fall into a large number of tarpits trying to decide on the right one.
1. There are a LOT of consistency levels and it's up to the end user to reason about what combinations are valid and what is really what they intend it to be. Is there any reason why write at ALL and read at CL TWO is better than read at CL ONE?
2. They require a good understanding of failure modes to do well. It's not uncommon for people to use CL ONE and wonder why their data is missing.
3. The serial consistency level "bucket" is confusing to even write about and easy to get wrong even for experienced users.
So I propose the following steps (EDIT based on Jonathan's comment):
1. Remove the "serial consistency" level of consistency levels and just have all consistency levels in one bucket to set; conditions still need to be required for SERIAL/LOCAL_SERIAL
2. Add 3 new consistency levels pointing to existing ones but that convey intent much more clearly (EDIT: better names based on comments):
* EVENTUALLY = LOCAL_ONE reads and writes
* STRONG = LOCAL_QUORUM reads and writes
* SERIAL = LOCAL_SERIAL reads and writes (though a ton of folks don't know what SERIAL means, which is why I suggested TRANSACTIONAL even if it's not as correct as I'd like)
For global levels of this I propose keeping the old ones around; they're rarely used in the field except by accident or by particularly opinionated and advanced users.
Drivers should put the new consistency levels in a new package and docs should be updated to suggest their use. Likewise setting a default CL should only provide those three settings, applying to reads and writes at the same time.
CQLSH I'm going to suggest should default to HIGHLY_CONSISTENT. New sysadmins get surprised by this frequently and I can think of a couple very major escalations because people were confused what the default behavior was.
The benefit to all this change is we greatly shrink the surface area that one has to understand when learning Cassandra, and we have far fewer bad initial experiences and surprises. New users will more likely be able to wrap their brains around those 3 ideas more readily than they can "what happens when I have RF2, QUORUM writes and ONE reads". Advanced users still get access to everything, while new users don't have to learn all the ins and outs of distributed theory just to write data and be able to read it back.

was:
New users really struggle with consistency level and fall into a large number of tarpits trying to decide on the right one.
1. There are a LOT of consistency levels and it's up to the end user to reason about what combinations are valid and what is really what they intend it to be. Is there any reason why write at ALL and read at CL TWO is better than read at CL ONE?
2. They require a good understanding of failure modes to do well. It's not uncommon for people to use CL ONE and wonder why their data is missing.
3. The serial consistency level "bucket" is confusing to even write about and easy to get wrong even for experienced users.
So I propose the following steps (EDIT based on Jonathan's comment):
1. Remove the "serial consistency" level of consistency levels and just have all consistency levels in one bucket to set; conditions still need to be required for SERIAL/LOCAL_SERIAL
2. Add 3 new consistency levels pointing to existing ones but that convey intent much more clearly:
* EVENTUALLY_CONSISTENT = LOCAL_ONE reads and writes
* HIGHLY_CONSISTENT = LOCAL_QUORUM reads and writes
* TRANSACTIONALLY_CONSISTENT = LOCAL_SERIAL reads and writes
For global levels of this I propose keeping the old ones around; they're rarely used in the field except by accident or by particularly opinionated and advanced users.
Drivers should put the new consistency levels in a new package and docs should be updated to suggest their use. Likewise setting a default CL should only provide those three settings, applying to reads and writes at the same time.
CQLSH I'm going to suggest should default to HIGHLY_CONSISTENT. New sysadmins get surprised by this frequently and I can think of a couple very major escalations because people were confused what the default behavior was.
The benefit to all this change is we greatly shrink the surface area that one has to understand when learning Cassandra, and we have far fewer bad initial experiences and surprises. New users will more likely be able to wrap their brains around those 3 ideas more readily than they can "what happens when I have RF2, QUORUM writes and ONE reads". Advanced users still get access to everything, while new users don't have to learn all the ins and outs of distributed theory just to w
[jira] [Commented] (CASSANDRA-13315) Consistency is confusing for new users
[ https://issues.apache.org/jira/browse/CASSANDRA-13315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15903468#comment-15903468 ]

Ryan Svihla commented on CASSANDRA-13315:
Jeff, how about just tagging it with _DC? Slight push back: everyone is happily calling multi-DC RDBMSs ACID-compliant even when that's only inside a DC.
[jira] [Comment Edited] (CASSANDRA-13315) Consistency is confusing for new users
[ https://issues.apache.org/jira/browse/CASSANDRA-13315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15903419#comment-15903419 ]

Ryan Svihla edited comment on CASSANDRA-13315 at 3/9/17 5:14 PM:
On the power users it'll just be up to the driver implementers how they handle that (different packages and methods for the power user for example).
[jira] [Commented] (CASSANDRA-13315) Consistency is confusing for new users
[ https://issues.apache.org/jira/browse/CASSANDRA-13315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15903419#comment-15903419 ]

Ryan Svihla commented on CASSANDRA-13315:
On the power users it'll just be up to the driver implementers right how they handle that (different packages and methods for the power user for example).
[jira] [Comment Edited] (CASSANDRA-13315) Consistency is confusing for new users
[ https://issues.apache.org/jira/browse/CASSANDRA-13315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15903404#comment-15903404 ] Ryan Svihla edited comment on CASSANDRA-13315 at 3/9/17 5:04 PM: - those are better names, +1 on that. Dual CL yeah I misstated that and we're on the same page with intent; as I've stated it's hard to even talk about it in text without getting bewildered. Just as long as we have only a single bucket to set and we require a condition for SERIAL mutations I'm fine. was (Author: rssvihla): those are better names, +1 on that. Dual CL yeah I misstated that and we're on the same page with intent; as I've stated it's hard to even talk about it in text without getting bewildered. Just as long as we have only a single bucket to set and we require a condition for SERIAL I'm fine. > Consistency is confusing for new users > -- > > Key: CASSANDRA-13315 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13315 > Project: Cassandra > Issue Type: Improvement >Reporter: Ryan Svihla > > New users really struggle with consistency level and fall into a large number > of tarpits trying to decide on the right one. > 1. There are a LOT of consistency levels and it's up to the end user to > reason about what combinations are valid and what is really what they intend > it to be. Is there any reason why write at ALL and read at CL TWO is better > than read at CL ONE? > 2. They require a good understanding of failure modes to do well. It's not > uncommon for people to use CL one and wonder why their data is missing. > 3. The serial consistency level "bucket" is confusing to even write about and > easy to get wrong even for experienced users. > So I propose the following steps: > 1. Remove the "serial consistency" level of consistency levels and just have > all consistency levels in one bucket at the protocol level. > 2. 
To enable #1 just reject writes or updates done without a condition when > SERIAL/LOCAL_SERIAL is specified. > 3. add 3 new consistency levels pointing to existing ones but that infer > intent much more cleanly: >* EVENTUALLY_CONSISTENT = LOCAL_ONE reads and writes >* HIGHLY_CONSISTENT = LOCAL_QUORUM reads and writes >* TRANSACTIONALLY_CONSISTENT = LOCAL_SERIAL reads and writes > for global levels of this I propose keeping the old ones around, they're > rarely used in the field except by accident or particularly opinionated and > advanced users. > Drivers should put the new consistency levels in a new package and docs > should be updated to suggest their use. Likewise setting default CL should > only provide those three settings and applying it for reads and writes at the > same time. > CQLSH I'm gonna suggest should default to HIGHLY_CONSISTENT. New sysadmins > get surprised by this frequently and I can think of a couple very major > escalations because people were confused what the default behavior was. > The benefit to all this change is we shrink the surface area that one has to > understand when learning Cassandra greatly, and we have far less bad initial > experiences and surprises. New users will more likely be able to wrap their > brains around those 3 ideas more readily than they can "what happens when I > have RF2, QUORUM writes and ONE reads". Advanced users still get access to > everything, while new users don't have to learn all the ins and outs of > distributed theory just to write data and be able to read it back. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (CASSANDRA-13315) Consistency is confusing for new users
Ryan Svihla created CASSANDRA-13315: --- Summary: Consistency is confusing for new users Key: CASSANDRA-13315 URL: https://issues.apache.org/jira/browse/CASSANDRA-13315 Project: Cassandra Issue Type: Improvement Reporter: Ryan Svihla New users really struggle with consistency level and fall into a large number of tarpits trying to decide on the right one. 1. There are a LOT of consistency levels and it's up to the end user to reason about what combinations are valid and what is really what they intend it to be. Is there any reason why write at ALL and read at CL TWO is better than read at CL ONE? 2. They require a good understanding of failure modes to do well. It's not uncommon for people to use CL one and wonder why their data is missing. 3. The serial consistency level "bucket" is confusing to even write about and easy to get wrong even for experienced users. So I propose the following steps: 1. Remove the "serial consistency" level of consistency levels and just have all consistency levels in one bucket at the protocol level. 2. To enable #1 just reject writes or updates done without a condition when SERIAL/LOCAL_SERIAL is specified. 3. add 3 new consistency levels pointing to existing ones but that infer intent much more cleanly: * EVENTUALLY_CONSISTENT = LOCAL_ONE reads and writes * HIGHLY_CONSISTENT = LOCAL_QUORUM reads and writes * TRANSACTIONALLY_CONSISTENT = LOCAL_SERIAL reads and writes for global levels of this I propose keeping the old ones around, they're rarely used in the field except by accident or particularly opinionated and advanced users. Drivers should put the new consistency levels in a new package and docs should be updated to suggest their use. Likewise setting default CL should only provide those three settings and applying it for reads and writes at the same time. CQLSH I'm gonna suggest should default to HIGHLY_CONSISTENT. 
New sysadmins get surprised by this frequently and I can think of a couple very major escalations because people were confused what the default behavior was. The benefit to all this change is we shrink the surface area that one has to understand when learning Cassandra greatly, and we have far less bad initial experiences and surprises. New users will more likely be able to wrap their brains around those 3 ideas more readily than they can "what happens when I have RF2, QUORUM writes and ONE reads". Advanced users still get access to everything, while new users don't have to learn all the ins and outs of distributed theory just to write data and be able to read it back. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (CASSANDRA-13241) Lower default chunk_length_in_kb from 64kb to 4kb
[ https://issues.apache.org/jira/browse/CASSANDRA-13241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15886677#comment-15886677 ] Ryan Svihla commented on CASSANDRA-13241: - Density-wise it depends on the use case and the needs; I've worked on a ton of clusters over 2TB per node though. > Lower default chunk_length_in_kb from 64kb to 4kb > - > > Key: CASSANDRA-13241 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13241 > Project: Cassandra > Issue Type: Wish > Components: Core >Reporter: Benjamin Roth > > Having a too low chunk size may result in some wasted disk space. A too high > chunk size may lead to massive overreads and may have a critical impact on > overall system performance. > In my case, the default chunk size led to peak read IOs of up to 1GB/s and > avg reads of 200MB/s. After lowering chunksize (of course aligned with read > ahead), the avg read IO went below 20 MB/s, rather 10-15MB/s. > The risk of (physical) overreads is increasing with lower (page cache size) / > (total data size) ratio. > High chunk sizes are mostly appropriate for bigger payloads per request but > if the model consists rather of small rows or small resultsets, the read > overhead with 64kb chunk size is insanely high. This applies for example for > (small) skinny rows. > Please also see here: > https://groups.google.com/forum/#!topic/scylladb-dev/j_qXSP-6-gY > To give you some insights what a difference it can make (460GB data, 128GB > RAM): > - Latency of a quite large CF: https://cl.ly/1r3e0W0S393L > - Disk throughput: https://cl.ly/2a0Z250S1M3c > - This shows that the request distribution remained the same, so no "dynamic > snitch magic": https://cl.ly/3E0t1T1z2c0J -- This message was sent by Atlassian JIRA (v6.3.15#6346)
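The overread argument above can be made concrete with a little arithmetic: a read must decompress at least one whole chunk, so for rows much smaller than the chunk, the amplification is roughly chunk size divided by bytes actually requested. A simple sketch (the 200-byte row size is an illustrative assumption, not a figure from the ticket):

```python
def read_amplification(chunk_kb: int, row_bytes: int) -> float:
    """Bytes decompressed per byte actually requested, assuming the row
    fits in one chunk and the page cache is cold."""
    return (chunk_kb * 1024) / row_bytes

# A 200-byte skinny row: the 64 KB default decompresses ~328x the data
# requested, while a 4 KB chunk brings that down to ~20x.
default_64kb = read_amplification(64, 200)   # ≈ 327.7
proposed_4kb = read_amplification(4, 200)    # ≈ 20.5
```

This is consistent with the order-of-magnitude drop in average read I/O the reporter observed after lowering the chunk size (and aligning read-ahead).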
[jira] [Commented] (CASSANDRA-12394) Node crashes during compaction with assertion error checking length of entries for IndexSummaryBuilder
[ https://issues.apache.org/jira/browse/CASSANDRA-12394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15409648#comment-15409648 ] Ryan Svihla commented on CASSANDRA-12394: - Appears to be a dupe to https://issues.apache.org/jira/browse/CASSANDRA-12014 though it also affects 2.1.x same issue > Node crashes during compaction with assertion error checking length of > entries for IndexSummaryBuilder > -- > > Key: CASSANDRA-12394 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12394 > Project: Cassandra > Issue Type: Bug > Components: Compaction >Reporter: Ryan Svihla >Priority: Blocker > > This limit appears to be arbitrarily Integer.MAX the rest of the values are > all referencing a type of long. Worse when this is encountered the node > crashes. Seems to be introduced in > https://issues.apache.org/jira/browse/CASSANDRA-8757 > Sstables are 350-440gb > Relevant code is > https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/io/sstable/IndexSummaryBuilder.java#L171 > stack trace: > [CompactionExecutor:47340] 2016-07-17 04:55:08,966 CassandraDaemon.java:229 > - Exception in thread Thread[CompactionExecutor:47340,1,main] > java.lang.AssertionError: null > at > org.apache.cassandra.io.sstable.IndexSummaryBuilder.maybeAddEntry(IndexSummaryBuilder.java:171) > ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] > at > org.apache.cassandra.io.sstable.SSTableWriter$IndexWriter.append(SSTableWriter.java:634) > ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] > at > org.apache.cassandra.io.sstable.SSTableWriter.afterAppend(SSTableWriter.java:179) > ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] > at > org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:205) > ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] > at > org.apache.cassandra.io.sstable.SSTableRewriter.append(SSTableRewriter.java:126) > ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] > at > 
org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:197) > ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] > at > org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) > ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] > at > org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:73) > ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] > at > org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59) > ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] > at > org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:263) > ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > ~[na:1.7.0_51] > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > ~[na:1.7.0_51] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > ~[na:1.7.0_51] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > [na:1.7.0_51] > at java.lang.Thread.run(Thread.java:744) [na:1.7.0_51] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
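The assertion above is easy to reproduce in miniature. The sketch below uses hypothetical names (IndexOffsetCheck, fitsInInt), not the actual Cassandra code, but it shows the shape of the failure: entry offsets are computed as long values, while an int-indexed summary can only address up to Integer.MAX_VALUE (roughly 2 GB), so a 350-440 GB sstable can trip the check and surface as a bare AssertionError.

```java
// Hypothetical sketch of the failing check, not the actual Cassandra code:
// offsets accumulate as longs, but the summary asserts they fit in an int.
public class IndexOffsetCheck {
    // True when an offset can be safely stored in an int-indexed summary.
    static boolean fitsInInt(long offset) {
        return offset >= 0 && offset <= Integer.MAX_VALUE;
    }

    static void maybeAddEntry(long indexStart) {
        // Mirrors the shape of the crash: an AssertionError rather than a
        // clean, actionable error message.
        assert fitsInInt(indexStart) : "index offset overflows int: " + indexStart;
    }

    public static void main(String[] args) {
        maybeAddEntry(1_000_000L);              // fine: well under 2 GB
        long huge = 400L * 1024 * 1024 * 1024;  // ~400 GB into the file
        System.out.println(fitsInInt(huge));    // false: would trip the assert
    }
}
```

Filed against the int limit rather than the crash itself, the fix would presumably be either widening the summary to long offsets or failing the write with a descriptive error instead of asserting.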
[jira] [Updated] (CASSANDRA-12394) Node crashes during compaction with assertion error checking length of entries for IndexSummaryBuilder
[ https://issues.apache.org/jira/browse/CASSANDRA-12394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Svihla updated CASSANDRA-12394: Summary: Node crashes during compaction with assertion error checking length of entries for IndexSummaryBuilder (was: Node crashes during compaction with assertion error checking length of entries for PartitionSummaryBuilder) > Node crashes during compaction with assertion error checking length of > entries for IndexSummaryBuilder > -- > > Key: CASSANDRA-12394 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12394 > Project: Cassandra > Issue Type: Bug > Components: Compaction >Reporter: Ryan Svihla >Priority: Blocker > > This limit appears to be arbitrarily Integer.MAX the rest of the values are > all referencing a type of long. Worse when this is encountered the node > crashes. Seems to be introduced in > https://issues.apache.org/jira/browse/CASSANDRA-8757 > Sstables are 350-440gb > Relevant code is > https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/io/sstable/IndexSummaryBuilder.java#L171 > stack trace: > [CompactionExecutor:47340] 2016-07-17 04:55:08,966 CassandraDaemon.java:229 > - Exception in thread Thread[CompactionExecutor:47340,1,main] > java.lang.AssertionError: null > at > org.apache.cassandra.io.sstable.IndexSummaryBuilder.maybeAddEntry(IndexSummaryBuilder.java:171) > ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] > at > org.apache.cassandra.io.sstable.SSTableWriter$IndexWriter.append(SSTableWriter.java:634) > ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] > at > org.apache.cassandra.io.sstable.SSTableWriter.afterAppend(SSTableWriter.java:179) > ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] > at > org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:205) > ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] > at > org.apache.cassandra.io.sstable.SSTableRewriter.append(SSTableRewriter.java:126) > ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] > at 
> org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:197) > ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] > at > org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) > ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] > at > org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:73) > ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] > at > org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59) > ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] > at > org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:263) > ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > ~[na:1.7.0_51] > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > ~[na:1.7.0_51] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > ~[na:1.7.0_51] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > [na:1.7.0_51] > at java.lang.Thread.run(Thread.java:744) [na:1.7.0_51] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12394) Node crashes during compaction with assertion error checking length of entries for PartitionSummaryBuilder
[ https://issues.apache.org/jira/browse/CASSANDRA-12394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Svihla updated CASSANDRA-12394: Description: This limit appears to be arbitrarily Integer.MAX the rest of the values are all referencing a type of long. Worse when this is encountered the node crashes. Seems to be introduced in https://issues.apache.org/jira/browse/CASSANDRA-8757 Sstables are 350-440gb Relevant code is https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/io/sstable/IndexSummaryBuilder.java#L171 stack trace: [CompactionExecutor:47340] 2016-07-17 04:55:08,966 CassandraDaemon.java:229 - Exception in thread Thread[CompactionExecutor:47340,1,main] java.lang.AssertionError: null at org.apache.cassandra.io.sstable.IndexSummaryBuilder.maybeAddEntry(IndexSummaryBuilder.java:171) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] at org.apache.cassandra.io.sstable.SSTableWriter$IndexWriter.append(SSTableWriter.java:634) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] at org.apache.cassandra.io.sstable.SSTableWriter.afterAppend(SSTableWriter.java:179) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] at org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:205) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] at org.apache.cassandra.io.sstable.SSTableRewriter.append(SSTableRewriter.java:126) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:197) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:73) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] at 
org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:263) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_51] at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_51] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_51] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_51] at java.lang.Thread.run(Thread.java:744) [na:1.7.0_51] was: This limit appears to be arbitrarily Integer.MAX the rest of the values are all referencing a type of long. Worse when this is encountered the node crashes. Seems to be introduced in https://issues.apache.org/jira/browse/CASSANDRA-8757 Sstables are 350-440gb stack trace: [CompactionExecutor:47340] 2016-07-17 04:55:08,966 CassandraDaemon.java:229 - Exception in thread Thread[CompactionExecutor:47340,1,main] java.lang.AssertionError: null at org.apache.cassandra.io.sstable.IndexSummaryBuilder.maybeAddEntry(IndexSummaryBuilder.java:171) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] at org.apache.cassandra.io.sstable.SSTableWriter$IndexWriter.append(SSTableWriter.java:634) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] at org.apache.cassandra.io.sstable.SSTableWriter.afterAppend(SSTableWriter.java:179) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] at org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:205) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] at org.apache.cassandra.io.sstable.SSTableRewriter.append(SSTableRewriter.java:126) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:197) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] at 
org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:73) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:263) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_51] at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_51] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_51] at java.util.concurrent.Th
[jira] [Updated] (CASSANDRA-12394) Node crashes during compaction with assertion error checking length of entries for PartitionSummaryBuilder
[ https://issues.apache.org/jira/browse/CASSANDRA-12394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Svihla updated CASSANDRA-12394: Description: This limit appears to be arbitrarily Integer.MAX the rest of the values are all referencing a type of long. Worse when this is encountered the node crashes. Seems to be introduced in https://issues.apache.org/jira/browse/CASSANDRA-8757 Sstables are 350-440gb stack trace: [CompactionExecutor:47340] 2016-07-17 04:55:08,966 CassandraDaemon.java:229 - Exception in thread Thread[CompactionExecutor:47340,1,main] java.lang.AssertionError: null at org.apache.cassandra.io.sstable.IndexSummaryBuilder.maybeAddEntry(IndexSummaryBuilder.java:171) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] at org.apache.cassandra.io.sstable.SSTableWriter$IndexWriter.append(SSTableWriter.java:634) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] at org.apache.cassandra.io.sstable.SSTableWriter.afterAppend(SSTableWriter.java:179) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] at org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:205) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] at org.apache.cassandra.io.sstable.SSTableRewriter.append(SSTableRewriter.java:126) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:197) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:73) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:263) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_51] at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_51] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_51] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_51] at java.lang.Thread.run(Thread.java:744) [na:1.7.0_51] was: This limit appears to be arbitrarily Integer.MAX the rest of the values are all referencing a type of long. Worse when this is encountered the node crashes. Seems to be introduced in https://issues.apache.org/jira/browse/CASSANDRA-8757 > Node crashes during compaction with assertion error checking length of > entries for PartitionSummaryBuilder > -- > > Key: CASSANDRA-12394 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12394 > Project: Cassandra > Issue Type: Bug > Components: Compaction >Reporter: Ryan Svihla >Priority: Blocker > > This limit appears to be arbitrarily Integer.MAX the rest of the values are > all referencing a type of long. Worse when this is encountered the node > crashes. 
Seems to be introduced in > https://issues.apache.org/jira/browse/CASSANDRA-8757 > Sstables are 350-440gb > stack trace: > [CompactionExecutor:47340] 2016-07-17 04:55:08,966 CassandraDaemon.java:229 > - Exception in thread Thread[CompactionExecutor:47340,1,main] > java.lang.AssertionError: null > at > org.apache.cassandra.io.sstable.IndexSummaryBuilder.maybeAddEntry(IndexSummaryBuilder.java:171) > ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] > at > org.apache.cassandra.io.sstable.SSTableWriter$IndexWriter.append(SSTableWriter.java:634) > ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] > at > org.apache.cassandra.io.sstable.SSTableWriter.afterAppend(SSTableWriter.java:179) > ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] > at > org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:205) > ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] > at > org.apache.cassandra.io.sstable.SSTableRewriter.append(SSTableRewriter.java:126) > ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] > at > org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:197) > ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] > at > org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) > ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] > at > org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:73) > ~[cassandra-all-2.1.12.1046.jar:2.1.12.10
[jira] [Created] (CASSANDRA-12394) Node crashes during compaction with assertion error checking length of entries for PartitionSummaryBuilder
Ryan Svihla created CASSANDRA-12394: --- Summary: Node crashes during compaction with assertion error checking length of entries for PartitionSummaryBuilder Key: CASSANDRA-12394 URL: https://issues.apache.org/jira/browse/CASSANDRA-12394 Project: Cassandra Issue Type: Bug Components: Compaction Reporter: Ryan Svihla Priority: Blocker This limit appears to be arbitrarily Integer.MAX_VALUE, while the rest of the values are all of type long. Worse, when this is encountered the node crashes. It seems to have been introduced in https://issues.apache.org/jira/browse/CASSANDRA-8757 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-12231) Make large mutations easier to find
Ryan Svihla created CASSANDRA-12231: --- Summary: Make large mutations easier to find Key: CASSANDRA-12231 URL: https://issues.apache.org/jira/browse/CASSANDRA-12231 Project: Cassandra Issue Type: Improvement Reporter: Ryan Svihla Priority: Minor Apologies if this has already been submitted; my lacking Jira search-fu will be to blame. There are two problems related to large mutations: 1. Mutations that are really large and fail the write with "Mutation of %s is too large for the maximum size of %s" (in CommitLog.java). While this should be something the clients can handle, they often don't, and the DBA is the first person to notice. If we could log the attempted primary key, it would help track down the culprit. 2. Mutations that are large but still under the threshold, which silently crush the server. We could add a warning for these, the same way we do for batches, ideally with a configurable threshold. It would also be handy to include the primary key used. Today these are very nasty to track down and involve a scan of the dataset using something like Spark. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
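A minimal sketch of what the ticket asks for. The class name, thresholds, and log format below are all invented for illustration (not Cassandra's actual code or defaults): the point is simply that both the hard-reject and soft-warn paths log the partition key so the DBA can find the offending writer.

```java
public class MutationSizeGuard {
    // Illustrative thresholds, not Cassandra's real defaults.
    static final long MAX_MUTATION_SIZE  = 16L * 1024 * 1024; // hard reject
    static final long WARN_MUTATION_SIZE = 4L * 1024 * 1024;  // soft warn

    /** Returns "REJECT", "WARN", or "OK", logging the partition key in the
     *  two problem cases so the offending writer can be tracked down. */
    static String check(String partitionKey, long mutationSize) {
        if (mutationSize > MAX_MUTATION_SIZE) {
            System.err.printf("Mutation of %d bytes for key %s is too large for the maximum size of %d%n",
                              mutationSize, partitionKey, MAX_MUTATION_SIZE);
            return "REJECT";
        }
        if (mutationSize > WARN_MUTATION_SIZE) {
            System.err.printf("Large mutation of %d bytes for key %s (warn threshold %d)%n",
                              mutationSize, partitionKey, WARN_MUTATION_SIZE);
            return "WARN";
        }
        return "OK";
    }
}
```

The warn path mirrors the existing large-batch warning pattern: no behavior change, just an attributable log line instead of a silent slow death.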
[jira] [Created] (CASSANDRA-11992) Consistency Level Histograms
Ryan Svihla created CASSANDRA-11992: --- Summary: Consistency Level Histograms Key: CASSANDRA-11992 URL: https://issues.apache.org/jira/browse/CASSANDRA-11992 Project: Cassandra Issue Type: New Feature Reporter: Ryan Svihla Priority: Minor It would be really handy for diagnosing data inconsistency issues if we had a counter for how often each consistency level was attempted on coordinators for a given table (a cluster-wide view could be handy too). For example:
nodetool clhistogram foo_keyspace foo_table
CL           READ/WRITE
ANY          0/1
ONE          0/0
TWO          0/100
THREE        0/0
LOCAL_ONE    0/1000
LOCAL_QUORUM 1000/2000
QUORUM       0/1000
EACH_QUORUM  0/0
ALL          0/0
Open to a better layout or separator; this is just off the top of my head. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
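The bookkeeping such a nodetool command would need is small: one contention-friendly counter per (consistency level, read/write) pair per table. The sketch below assumes nothing about Cassandra's internals; ClHistogram and its method names are invented for illustration.

```java
import java.util.EnumMap;
import java.util.concurrent.atomic.LongAdder;

// Illustrative per-table counter store for the proposed clhistogram output.
public class ClHistogram {
    enum ConsistencyLevel { ANY, ONE, TWO, THREE, LOCAL_ONE, LOCAL_QUORUM, QUORUM, EACH_QUORUM, ALL }

    // LongAdder keeps increments cheap under coordinator-thread contention.
    private final EnumMap<ConsistencyLevel, LongAdder> reads  = new EnumMap<>(ConsistencyLevel.class);
    private final EnumMap<ConsistencyLevel, LongAdder> writes = new EnumMap<>(ConsistencyLevel.class);

    ClHistogram() {
        for (ConsistencyLevel cl : ConsistencyLevel.values()) {
            reads.put(cl, new LongAdder());
            writes.put(cl, new LongAdder());
        }
    }

    void recordRead(ConsistencyLevel cl)  { reads.get(cl).increment(); }
    void recordWrite(ConsistencyLevel cl) { writes.get(cl).increment(); }

    long readCount(ConsistencyLevel cl)  { return reads.get(cl).sum(); }
    long writeCount(ConsistencyLevel cl) { return writes.get(cl).sum(); }

    // One row of the proposed nodetool output, e.g. "LOCAL_QUORUM 1000/2000".
    String row(ConsistencyLevel cl) {
        return cl + " " + readCount(cl) + "/" + writeCount(cl);
    }
}
```

Exposing the counters over JMX would then let nodetool render the table the ticket mocks up.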
[jira] [Updated] (CASSANDRA-11831) Ability to disable purgeable tombstone check via startup flag
[ https://issues.apache.org/jira/browse/CASSANDRA-11831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Svihla updated CASSANDRA-11831: Description: On Cassandra 2.1.14 hen a node gets way behind and has 10s of thousand sstables it appears a lot of the CPU time is spent doing checks like this on a call to getMaxPurgeableTimestamp org.apache.cassandra.utils.Murmur3BloomFilter.hash(java.nio.ByteBuffer, int, int, long, long[]) @bci=13, line=57 (Compiled frame; information may be imprecise) - org.apache.cassandra.utils.BloomFilter.indexes(java.nio.ByteBuffer) @bci=22, line=82 (Compiled frame) - org.apache.cassandra.utils.BloomFilter.isPresent(java.nio.ByteBuffer) @bci=2, line=107 (Compiled frame) - org.apache.cassandra.db.compaction.CompactionController.maxPurgeableTimestamp(org.apache.cassandra.db.DecoratedKey) @bci=89, line=186 (Compiled frame) - org.apache.cassandra.db.compaction.LazilyCompactedRow.getMaxPurgeableTimestamp() @bci=21, line=99 (Compiled frame) - org.apache.cassandra.db.compaction.LazilyCompactedRow.access$300(org.apache.cassandra.db.compaction.LazilyCompactedRow) @bci=1, line=49 (Compiled frame) - org.apache.cassandra.db.compaction.LazilyCompactedRow$Reducer.getReduced() @bci=241, line=296 (Compiled frame) - org.apache.cassandra.db.compaction.LazilyCompactedRow$Reducer.getReduced() @bci=1, line=206 (Compiled frame) - org.apache.cassandra.utils.MergeIterator$OneToOne.computeNext() @bci=44, line=206 (Compiled frame) - com.google.common.collect.AbstractIterator.tryToComputeNext() @bci=9, line=143 (Compiled frame) - com.google.common.collect.AbstractIterator.hasNext() @bci=61, line=138 (Compiled frame) - com.google.common.collect.Iterators$7.computeNext() @bci=4, line=645 (Compiled frame) - com.google.common.collect.AbstractIterator.tryToComputeNext() @bci=9, line=143 (Compiled frame) - com.google.common.collect.AbstractIterator.hasNext() @bci=61, line=138 (Compiled frame) - 
org.apache.cassandra.db.ColumnIndex$Builder.buildForCompaction(java.util.Iterator) @bci=1, line=166 (Compiled frame) - org.apache.cassandra.db.compaction.LazilyCompactedRow.write(long, org.apache.cassandra.io.util.DataOutputPlus) @bci=52, line=121 (Compiled frame) - org.apache.cassandra.io.sstable.SSTableWriter.append(org.apache.cassandra.db.compaction.AbstractCompactedRow) @bci=18, line=193 (Compiled frame) - org.apache.cassandra.io.sstable.SSTableRewriter.append(org.apache.cassandra.db.compaction.AbstractCompactedRow) @bci=13, line=127 (Compiled frame) - org.apache.cassandra.db.compaction.CompactionTask.runMayThrow() @bci=666, line=197 (Compiled frame) - org.apache.cassandra.utils.WrappedRunnable.run() @bci=1, line=28 (Compiled frame) - org.apache.cassandra.db.compaction.CompactionTask.executeInternal(org.apache.cassandra.db.compaction.CompactionManager$CompactionExecutorStatsCollector) @bci=6, line=73 (Compiled frame) - org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(org.apache.cassandra.db.compaction.CompactionManager$CompactionExecutorStatsCollector) @bci=2, line=59 (Compiled frame) - org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run() @bci=125, line=264 (Compiled frame) - java.util.concurrent.Executors$RunnableAdapter.call() @bci=4, line=511 (Compiled frame) - java.util.concurrent.FutureTask.run() @bci=42, line=266 (Compiled frame) - java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) @bci=95, line=1142 (Compiled frame) - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=617 (Compiled frame) - java.lang.Thread.run() @bci=11, line=745 (Compiled frame) If we could at least on startup pass a flag like -DskipTombstonePurgeCheck so we could in these particularly bad cases just avoid the calculation and merge tables until we have less to worry about then restart the node with that flag missing once we're down to a more manageable amount of sstables. 
was: On Cassandra 2.1.14, when a node gets way behind and has tens of thousands of sstables, it appears a lot of the CPU time is spent doing checks like this on a call to getMaxPurgeableTimestamp org.apache.cassandra.utils.Murmur3BloomFilter.hash(java.nio.ByteBuffer, int, int, long, long[]) @bci=13, line=57 (Compiled frame; information may be imprecise) - org.apache.cassandra.utils.BloomFilter.indexes(java.nio.ByteBuffer) @bci=22, line=82 (Compiled frame) - org.apache.cassandra.utils.BloomFilter.isPresent(java.nio.ByteBuffer) @bci=2, line=107 (Compiled frame) - org.apache.cassandra.db.compaction.CompactionController.maxPurgeableTimestamp(org.apache.cassandra.db.DecoratedKey) @bci=89, line=186 (Compiled frame) - org.apache.cassandra.db.compaction.LazilyCompactedRow.getMaxPurgeableTimestamp() @bci=21, line=9
[jira] [Created] (CASSANDRA-11831) Ability to disable purgeable tombstone check via startup flag
Ryan Svihla created CASSANDRA-11831: --- Summary: Ability to disable purgeable tombstone check via startup flag Key: CASSANDRA-11831 URL: https://issues.apache.org/jira/browse/CASSANDRA-11831 Project: Cassandra Issue Type: New Feature Reporter: Ryan Svihla On Cassandra 2.1.14, when a node gets way behind and has tens of thousands of sstables, it appears a lot of the CPU time is spent doing checks like this on a call to getMaxPurgeableTimestamp org.apache.cassandra.utils.Murmur3BloomFilter.hash(java.nio.ByteBuffer, int, int, long, long[]) @bci=13, line=57 (Compiled frame; information may be imprecise) - org.apache.cassandra.utils.BloomFilter.indexes(java.nio.ByteBuffer) @bci=22, line=82 (Compiled frame) - org.apache.cassandra.utils.BloomFilter.isPresent(java.nio.ByteBuffer) @bci=2, line=107 (Compiled frame) - org.apache.cassandra.db.compaction.CompactionController.maxPurgeableTimestamp(org.apache.cassandra.db.DecoratedKey) @bci=89, line=186 (Compiled frame) - org.apache.cassandra.db.compaction.LazilyCompactedRow.getMaxPurgeableTimestamp() @bci=21, line=99 (Compiled frame) - org.apache.cassandra.db.compaction.LazilyCompactedRow.access$300(org.apache.cassandra.db.compaction.LazilyCompactedRow) @bci=1, line=49 (Compiled frame) - org.apache.cassandra.db.compaction.LazilyCompactedRow$Reducer.getReduced() @bci=241, line=296 (Compiled frame) - org.apache.cassandra.db.compaction.LazilyCompactedRow$Reducer.getReduced() @bci=1, line=206 (Compiled frame) - org.apache.cassandra.utils.MergeIterator$OneToOne.computeNext() @bci=44, line=206 (Compiled frame) - com.google.common.collect.AbstractIterator.tryToComputeNext() @bci=9, line=143 (Compiled frame) - com.google.common.collect.AbstractIterator.hasNext() @bci=61, line=138 (Compiled frame) - com.google.common.collect.Iterators$7.computeNext() @bci=4, line=645 (Compiled frame) - com.google.common.collect.AbstractIterator.tryToComputeNext() @bci=9, line=143 (Compiled frame) - com.google.common.collect.AbstractIterator.hasNext() @bci=61, 
line=138 (Compiled frame) - org.apache.cassandra.db.ColumnIndex$Builder.buildForCompaction(java.util.Iterator) @bci=1, line=166 (Compiled frame) - org.apache.cassandra.db.compaction.LazilyCompactedRow.write(long, org.apache.cassandra.io.util.DataOutputPlus) @bci=52, line=121 (Compiled frame) - org.apache.cassandra.io.sstable.SSTableWriter.append(org.apache.cassandra.db.compaction.AbstractCompactedRow) @bci=18, line=193 (Compiled frame) - org.apache.cassandra.io.sstable.SSTableRewriter.append(org.apache.cassandra.db.compaction.AbstractCompactedRow) @bci=13, line=127 (Compiled frame) - org.apache.cassandra.db.compaction.CompactionTask.runMayThrow() @bci=666, line=197 (Compiled frame) - org.apache.cassandra.utils.WrappedRunnable.run() @bci=1, line=28 (Compiled frame) - org.apache.cassandra.db.compaction.CompactionTask.executeInternal(org.apache.cassandra.db.compaction.CompactionManager$CompactionExecutorStatsCollector) @bci=6, line=73 (Compiled frame) - org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(org.apache.cassandra.db.compaction.CompactionManager$CompactionExecutorStatsCollector) @bci=2, line=59 (Compiled frame) - org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run() @bci=125, line=264 (Compiled frame) - java.util.concurrent.Executors$RunnableAdapter.call() @bci=4, line=511 (Compiled frame) - java.util.concurrent.FutureTask.run() @bci=42, line=266 (Compiled frame) - java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) @bci=95, line=1142 (Compiled frame) - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=617 (Compiled frame) - java.lang.Thread.run() @bci=11, line=745 (Compiled frame) If we could at least on startup pass a flag like -DskipTombstonePurgeCheck so we could in these particularly bad cases just avoid the calculation and merge tables until we have less to worry about then restart the node with that flag missing once we're down to a more 
manageable number of sstables. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
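Wiring in the proposed flag could plausibly be a one-line guard. The sketch below is an illustration only (PurgeCheckFlag and the stand-in signature for the expensive path are invented); it shows the usual Java convention for reading a -D startup switch via Boolean.getBoolean.

```java
import java.util.function.LongSupplier;

// Illustrative sketch of gating the purgeable-tombstone check behind the
// proposed -DskipTombstonePurgeCheck startup flag; not a real Cassandra patch.
public class PurgeCheckFlag {
    // Boolean.getBoolean reads the named system property, the standard way
    // a -DskipTombstonePurgeCheck JVM switch surfaces in code.
    static boolean skipPurgeCheck() {
        return Boolean.getBoolean("skipTombstonePurgeCheck");
    }

    // Stand-in for the expensive getMaxPurgeableTimestamp path: when the flag
    // is set, report nothing as purgeable without probing any bloom filters,
    // so compaction simply merges sstables.
    static long maxPurgeableTimestamp(LongSupplier expensiveBloomFilterScan) {
        return skipPurgeCheck() ? Long.MIN_VALUE : expensiveBloomFilterScan.getAsLong();
    }
}
```

Once the node is back down to a manageable sstable count, restarting without the flag restores normal tombstone purging.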
[jira] [Commented] (CASSANDRA-11721) Have a per operation truncate ddl "no snapshot" option
[ https://issues.apache.org/jira/browse/CASSANDRA-11721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274469#comment-15274469 ] Ryan Svihla commented on CASSANDRA-11721: - I've thought about it a bit: 1. NO SNAPSHOT is probably the purest and cleanest option and satisfies even the most pedantic user who wants their temporary data backed up in C* when a drop or typical truncate is called, but it comes at the cost of changing truncate and having driver dependencies. 2. Table-based is easy to implement and satisfies most people, even if a couple of people will be sad. They can probably just log their data in another table before they truncate if they're that determined to have it backed up. > Have a per operation truncate ddl "no snapshot" option > -- > > Key: CASSANDRA-11721 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11721 > Project: Cassandra > Issue Type: Wish > Components: CQL >Reporter: Jeremy Hanna >Priority: Minor > > Right now with truncate, it will always create a snapshot. That is the right > thing to do most of the time. 'auto_snapshot' exists as an option to disable > that but it is server wide and requires a restart to change. There are data > models, however, that require rotating through a handful of tables and > periodically truncating them. Currently you either have to operate with no > safety net (some actually do this) or manually clear those snapshots out > periodically. Both are less than optimal. > In HDFS, you generally delete something where it goes to the trash. If you > don't want that safety net, you can do something like 'rm -rf -skiptrash > /jeremy/stuff' in one command. > It would be nice to have something in the truncate ddl to skip the snapshot > on a per operation basis. Perhaps 'TRUNCATE solarsystem.earth NO SNAPSHOT'. > This might also be useful in those situations where you're just playing with > data and you don't want something to take a snapshot in a development system. 
> If that's the case, this would also be useful for the DROP operation, but > that convenience is not the main reason for this option. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11170) Uneven load can be created by cross DC mutation propagations, as remote coordinator is not randomly picked
[ https://issues.apache.org/jira/browse/CASSANDRA-11170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15148798#comment-15148798 ] Ryan Svihla commented on CASSANDRA-11170: - Fat partitions and bad data models exist; no need to make them worse by pinning all write load to one unlucky node until it dies. Pinning to a single node just lowers the bar for the data model falling apart. I get that RF writes will happen anyway, but I'm assuming coordinator work is non-trivial (especially at higher consistency levels), and I know from observation that hint handling and replay is non-trivial, especially at certain points (I'm certain it improved with file-based hints, but I'm also certain it isn't free). Final point: with the stereotypical time-series bucket data model, the "stick to the primary token owner" approach will generate more hints than some strategy of balancing the load. > Uneven load can be created by cross DC mutation propagations, as remote > coordinator is not randomly picked > -- > > Key: CASSANDRA-11170 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11170 > Project: Cassandra > Issue Type: Bug > Components: Coordination >Reporter: Wei Deng > > I was looking at the o.a.c.service.StorageProxy code and realized that it > seems to be always picking the first IP in the remote DC target list as the > destination, whenever it needs to send the mutation to a remote DC. See these > lines in the code: > https://github.com/apache/cassandra/blob/1944bf507d66b5c103c136319caeb4a9e3767a69/src/java/org/apache/cassandra/service/StorageProxy.java#L1280-L1301 > This could cause one node in the remote DC receiving more mutation messages > than the other nodes, and hence uneven workload distribution. 
> A trivial test (with TRACE logging level enabled) on a 3+3 node cluster > proved the problem, see the system.log entries below: > {code} > INFO [RMI TCP Connection(18)-54.173.227.52] 2016-02-13 09:54:55,948 > StorageService.java:3353 - set log level to TRACE for classes under > 'org.apache.cassandra.service.StorageProxy' (if the level doesn't look like > 'TRACE' then the logger couldn't parse 'TRACE') > TRACE [SharedPool-Worker-1] 2016-02-13 09:55:15,148 StorageProxy.java:1284 - > Adding FWD message to 8996@/52.53.215.74 > TRACE [SharedPool-Worker-1] 2016-02-13 09:55:15,149 StorageProxy.java:1284 - > Adding FWD message to 8997@/54.183.23.201 > TRACE [SharedPool-Worker-1] 2016-02-13 09:55:15,149 StorageProxy.java:1289 - > Sending message to 8998@/54.183.209.219 > TRACE [SharedPool-Worker-1] 2016-02-13 09:55:22,939 StorageProxy.java:1284 - > Adding FWD message to 9032@/52.53.215.74 > TRACE [SharedPool-Worker-1] 2016-02-13 09:55:22,940 StorageProxy.java:1284 - > Adding FWD message to 9033@/54.183.23.201 > TRACE [SharedPool-Worker-1] 2016-02-13 09:55:22,941 StorageProxy.java:1289 - > Sending message to 9034@/54.183.209.219 > TRACE [SharedPool-Worker-1] 2016-02-13 09:55:28,975 StorageProxy.java:1284 - > Adding FWD message to 9064@/52.53.215.74 > TRACE [SharedPool-Worker-1] 2016-02-13 09:55:28,976 StorageProxy.java:1284 - > Adding FWD message to 9065@/54.183.23.201 > TRACE [SharedPool-Worker-1] 2016-02-13 09:55:28,977 StorageProxy.java:1289 - > Sending message to 9066@/54.183.209.219 > TRACE [SharedPool-Worker-1] 2016-02-13 09:55:33,464 StorageProxy.java:1284 - > Adding FWD message to 9094@/52.53.215.74 > TRACE [SharedPool-Worker-1] 2016-02-13 09:55:33,465 StorageProxy.java:1284 - > Adding FWD message to 9095@/54.183.23.201 > TRACE [SharedPool-Worker-1] 2016-02-13 09:55:33,478 StorageProxy.java:1289 - > Sending message to 9096@/54.183.209.219 > TRACE [SharedPool-Worker-1] 2016-02-13 09:55:39,243 StorageProxy.java:1284 - > Adding FWD message to 9121@/52.53.215.74 > 
TRACE [SharedPool-Worker-1] 2016-02-13 09:55:39,244 StorageProxy.java:1284 - > Adding FWD message to 9122@/54.183.23.201 > TRACE [SharedPool-Worker-1] 2016-02-13 09:55:39,244 StorageProxy.java:1289 - > Sending message to 9123@/54.183.209.219 > TRACE [SharedPool-Worker-1] 2016-02-13 09:55:44,248 StorageProxy.java:1284 - > Adding FWD message to 9145@/52.53.215.74 > TRACE [SharedPool-Worker-1] 2016-02-13 09:55:44,249 StorageProxy.java:1284 - > Adding FWD message to 9146@/54.183.23.201 > TRACE [SharedPool-Worker-1] 2016-02-13 09:55:44,249 StorageProxy.java:1289 - > Sending message to 9147@/54.183.209.219 > TRACE [SharedPool-Worker-1] 2016-02-13 09:55:49,731 StorageProxy.java:1284 - > Adding FWD message to 9170@/52.53.215.74 > TRACE [SharedPool-Worker-1] 2016-02-13 09:55:49,734 StorageProxy.java:1284 - > Adding FWD message to 9171@/54.183.23.201 > TRACE [SharedPool-Worker-1] 2016-02-13 09:55:49,735 StorageProxy.java:1289 - > Sending messag
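The change the ticket implies can be sketched as follows. This is a hypothetical illustration, not the actual StorageProxy code: the class and method names are invented, and replicas are plain strings rather than Cassandra's internal endpoint types. The idea is simply to pick the remote-DC forwarding coordinator at random instead of always taking the first entry in the target list.

```java
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

public class RemoteDcForwarding {
    /**
     * Picks which replica in the remote DC should act as the forwarding
     * coordinator. A random choice spreads the forwarding work evenly,
     * instead of pinning it to remoteDcReplicas.get(0) as the trace logs
     * above show happening today.
     */
    public static String pickForwardingTarget(List<String> remoteDcReplicas) {
        if (remoteDcReplicas.isEmpty()) {
            throw new IllegalArgumentException("no replicas in remote DC");
        }
        int idx = ThreadLocalRandom.current().nextInt(remoteDcReplicas.size());
        return remoteDcReplicas.get(idx);
    }
}
```

With this, repeated runs over the same 3-node remote DC would rotate the "Sending message to" target rather than always landing on the same node.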
[jira] [Comment Edited] (CASSANDRA-11153) Row offset within a partition
[ https://issues.apache.org/jira/browse/CASSANDRA-11153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15142003#comment-15142003 ] Ryan Svihla edited comment on CASSANDRA-11153 at 2/11/16 12:16 AM: --- Use case is something like this: 1. Stateless servers so all useful data has to be passed over the URL (and can bump around randomly to different servers so you start talking shared cache for the state) 2. permalinks based on page number to never changing data. the only valid alternative to this is they would have to retain the "start id" which still requires them figuring out what that is in the first place, and so they'll have to do something like this, client side, still. Again this is not how I would ever design an application, but this is hyper common in legacy use cases and it would be nice to give them some approach that is "fast enough" for their needs, even if it doesn't scale, even if it's not optimal. Most of these corners are not high perf sub ms latency use cases. was (Author: rssvihla): Use case is something like this: 1. Stateless servers so all useful data has to be passed over the URL (and can bump around randomly to different servers so you start talking shared cache for the state) 2. permalinks based on page number to never changing data. the only valid alternative to this is they would have to retain the "start id" which still requires them figuring out what that is in the first place, and so they'll have to do something like this, client side, still. Again this is not how I would ever design an application, but this is hyper common in legacy use cases and it would be nice to give them some approach that is "fast enough" for their needs. 
> Row offset within a partition > - > > Key: CASSANDRA-11153 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11153 > Project: Cassandra > Issue Type: New Feature >Reporter: Ryan Svihla >Priority: Minor > > While doing this across partitions would be awful, inside of a partition this > seems like a reasonable request. Something like: > SELECT * FROM my_table WHERE bucket='2015-10-10 12:00:00' LIMIT 100 OFFSET 100 > with a schema such as: > CREATE TABLE my_table (bucket timestamp, id timeuuid, value text, PRIMARY > KEY(bucket, id)); > This could ease pain in migration of legacy use cases and I'm not convinced > the read cost has to be horrible when it's inside of a single partition. > EDIT: I'm aware there is already an issue > https://issues.apache.org/jira/browse/CASSANDRA-6511. I think the partition > key requirement is where we get enough performance to provide the flexibility > in dealing with legacy apps that are stuck on a 'go to page 8' concept for > their application flow without incurring a huge hit scanning a cluster and > tossing the first 5 nodes results. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11153) Row offset within a partition
[ https://issues.apache.org/jira/browse/CASSANDRA-11153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15142003#comment-15142003 ] Ryan Svihla commented on CASSANDRA-11153: - Use case is something like this: 1. Stateless servers so all useful data has to be passed over the URL (and can bump around randomly to different servers, so you start talking shared cache for the state) 2. permalinks based on page number to never-changing data. The only valid alternative to this is that they would have to retain the "start id", which still requires them figuring out what that is in the first place, and so they'll have to do something like this, client side, still. Again, this is not how I would ever design an application, but this is hyper common in legacy use cases and it would be nice to give them some approach that is "fast enough" for their needs. > Row offset within a partition > - > > Key: CASSANDRA-11153 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11153 > Project: Cassandra > Issue Type: New Feature >Reporter: Ryan Svihla >Priority: Minor > > While doing this across partitions would be awful, inside of a partition this > seems like a reasonable request. Something like: > SELECT * FROM my_table WHERE bucket='2015-10-10 12:00:00' LIMIT 100 OFFSET 100 > with a schema such as: > CREATE TABLE my_table (bucket timestamp, id timeuuid, value text, PRIMARY > KEY(bucket, id)); > This could ease pain in migration of legacy use cases and I'm not convinced > the read cost has to be horrible when it's inside of a single partition. > EDIT: I'm aware there is already an issue > https://issues.apache.org/jira/browse/CASSANDRA-6511. I think the partition > key requirement is where we get enough performance to provide the flexibility > in dealing with legacy apps that are stuck on a 'go to page 8' concept for > their application flow without incurring a huge hit scanning a cluster and > tossing the first 5 nodes results. 
[jira] [Comment Edited] (CASSANDRA-11153) Row offset within a partition
[ https://issues.apache.org/jira/browse/CASSANDRA-11153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15142003#comment-15142003 ] Ryan Svihla edited comment on CASSANDRA-11153 at 2/11/16 12:16 AM: --- Use case is something like this: 1. Stateless servers so all useful data has to be passed over the URL (and can bump around randomly to different servers so you start talking shared cache for the state) 2. permalinks based on page number to never changing data. the only valid alternative to this is they would have to retain the "start id" which still requires them figuring out what that is in the first place, and so they'll have to do something like this, client side, still. Again this is not how I would ever design an application, but this is hyper common in legacy use cases and it would be nice to give them some approach that is "fast enough" for their needs. was (Author: rssvihla): Use case is something like this: 1. Stateless servers so all useful data has to be passed over the URL (and can bump around randomly to different servers so you start talking shared cache for the state) 2. permalinks based on page number to never changing data. the only valid alternative to this is they would have to retain the "start id" which still requires them figuring out what that is in the first place, and so they'll have to do something like this, client side, still. Again this is not how I would ever design an application, but this is hyper common in legacy use case and it would be nice to give them some approach that is "fast enough" for their needs. > Row offset within a partition > - > > Key: CASSANDRA-11153 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11153 > Project: Cassandra > Issue Type: New Feature >Reporter: Ryan Svihla >Priority: Minor > > While doing this across partitions would be awful, inside of a partition this > seems like a reasonable request. 
Something like: > SELECT * FROM my_table WHERE bucket='2015-10-10 12:00:00' LIMIT 100 OFFSET 100 > with a schema such as: > CREATE TABLE my_table (bucket timestamp, id timeuuid, value text, PRIMARY > KEY(bucket, id)); > This could ease pain in migration of legacy use cases and I'm not convinced > the read cost has to be horrible when it's inside of a single partition. > EDIT: I'm aware there is already an issue > https://issues.apache.org/jira/browse/CASSANDRA-6511. I think the partition > key requirement is where we get enough performance to provide the flexibility > in dealing with legacy apps that are stuck on a 'go to page 8' concept for > their application flow without incurring a huge hit scanning a cluster and > tossing the first 5 nodes results. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-11153) Row offset within a partition
[ https://issues.apache.org/jira/browse/CASSANDRA-11153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15141673#comment-15141673 ] Ryan Svihla edited comment on CASSANDRA-11153 at 2/10/16 9:13 PM: -- Ok thinking about this is more, this is basically a "what is worse" option. Folks that need to do horrible paging queries because of some legacy interface that they pass around (http://foo.com/?page=9) will just read the whole partition and throw away the extra AKA lousy client side mode. If we required this be an "ALLOW FILTERING" option, I think that would tag it as a bad idea, but still enable people to do something less horrible than what they're already doing. For smaller partitions this should be plenty fast enough still. was (Author: rssvihla): Ok thinking about this is more, this is basically a "what is worse" option. Folks that need to do horrible paging queries because of some legacy interface that they pass around (http://foo.com/?page=9) will just read the whole partition and throw away the extra AKA lousy client side mode. If we required this be an "ALLOW FILTERING" option, I think that would tag it as a bad idea, but still enable people to do something less horrible than what they're already doing. > Row offset within a partition > - > > Key: CASSANDRA-11153 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11153 > Project: Cassandra > Issue Type: New Feature >Reporter: Ryan Svihla >Priority: Minor > > While doing this across partitions would be awful, inside of a partition this > seems like a reasonable request. Something like: > SELECT * FROM my_table WHERE bucket='2015-10-10 12:00:00' LIMIT 100 OFFSET 100 > with a schema such as: > CREATE TABLE my_table (bucket timestamp, id timeuuid, value text, PRIMARY > KEY(bucket, id)); > This could ease pain in migration of legacy use cases and I'm not convinced > the read cost has to be horrible when it's inside of a single partition. 
> EDIT: I'm aware there is already an issue > https://issues.apache.org/jira/browse/CASSANDRA-6511. I think the partition > key requirement is where we get enough performance to provide the flexibility > in dealing with legacy apps that are stuck on a 'go to page 8' concept for > their application flow without incurring a huge hit scanning a cluster and > tossing the first 5 nodes results. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11153) Row offset within a partition
[ https://issues.apache.org/jira/browse/CASSANDRA-11153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15141673#comment-15141673 ] Ryan Svihla commented on CASSANDRA-11153: - Ok, thinking about this more, this is basically a "what is worse" option. Folks that need to do horrible paging queries because of some legacy interface that they pass around (http://foo.com/?page=9) will just read the whole partition and throw away the extra rows, AKA a lousy client-side mode. If we required this to be an "ALLOW FILTERING" option, I think that would tag it as a bad idea, but still enable people to do something less horrible than what they're already doing. > Row offset within a partition > - > > Key: CASSANDRA-11153 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11153 > Project: Cassandra > Issue Type: New Feature >Reporter: Ryan Svihla >Priority: Minor > > While doing this across partitions would be awful, inside of a partition this > seems like a reasonable request. Something like: > SELECT * FROM my_table WHERE bucket='2015-10-10 12:00:00' LIMIT 100 OFFSET 100 > with a schema such as: > CREATE TABLE my_table (bucket timestamp, id timeuuid, value text, PRIMARY > KEY(bucket, id)); > This could ease pain in migration of legacy use cases and I'm not convinced > the read cost has to be horrible when it's inside of a single partition. > EDIT: I'm aware there is already an issue > https://issues.apache.org/jira/browse/CASSANDRA-6511. I think the partition > key requirement is where we get enough performance to provide the flexibility > in dealing with legacy apps that are stuck on a 'go to page 8' concept for > their application flow without incurring a huge hit scanning a cluster and > tossing the first 5 nodes results. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
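The client-side fallback described in this thread (read the whole page window, discard the leading rows) can be sketched as below. This is an illustrative helper, not a driver API: `rows` stands in for the result of a single-partition query such as `SELECT * FROM my_table WHERE bucket = ? LIMIT <offset + pageSize>`.

```java
import java.util.ArrayList;
import java.util.List;

public class ClientSideOffset {
    /**
     * Emulates OFFSET on the client: given the rows returned by a
     * single-partition query with LIMIT (offset + pageSize), skip the
     * first `offset` rows and keep at most `pageSize` of what remains.
     */
    public static <T> List<T> page(List<T> rows, int offset, int pageSize) {
        List<T> page = new ArrayList<>();
        for (int i = offset; i < rows.size() && page.size() < pageSize; i++) {
            page.add(rows.get(i));
        }
        return page;
    }
}
```

The cost is exactly what the comment warns about: the server still reads and ships offset + pageSize rows, which is tolerable inside one modest partition but degrades linearly with the page number.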
[jira] [Commented] (CASSANDRA-11153) Row offset within a partition
[ https://issues.apache.org/jira/browse/CASSANDRA-11153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15141646#comment-15141646 ] Ryan Svihla commented on CASSANDRA-11153: - That's a true argument of pagers in general and people still implement them anyway. But if it's data that isn't updated I'm not sure why it wouldn't be consistent. > Row offset within a partition > - > > Key: CASSANDRA-11153 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11153 > Project: Cassandra > Issue Type: New Feature >Reporter: Ryan Svihla >Priority: Minor > > While doing this across partitions would be awful, inside of a partition this > seems like a reasonable request. Something like: > SELECT * FROM my_table WHERE bucket='2015-10-10 12:00:00' LIMIT 100 OFFSET 100 > with a schema such as: > CREATE TABLE my_table (bucket timestamp, id timeuuid, value text, PRIMARY > KEY(bucket, id)); > This could ease pain in migration of legacy use cases and I'm not convinced > the read cost has to be horrible when it's inside of a single partition. > EDIT: I'm aware there is already an issue > https://issues.apache.org/jira/browse/CASSANDRA-6511. I think the partition > key requirement is where we get enough performance to provide the flexibility > in dealing with legacy apps that are stuck on a 'go to page 8' concept for > their application flow without incurring a huge hit scanning a cluster and > tossing the first 5 nodes results. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11153) Row offset within a partition
[ https://issues.apache.org/jira/browse/CASSANDRA-11153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Svihla updated CASSANDRA-11153: Description: While doing this across partitions would be awful, inside of a partition this seems like a reasonable request. Something like: SELECT * FROM my_table WHERE bucket='2015-10-10 12:00:00' LIMIT 100 OFFSET 100 with a schema such as: CREATE TABLE my_table (bucket timestamp, id timeuuid, value text, PRIMARY KEY(bucket, id)); This could ease pain in migration of legacy use cases and I'm not convinced the read cost has to be horrible when it's inside of a single partition. EDIT: I'm aware there is already an issue https://issues.apache.org/jira/browse/CASSANDRA-6511. I think the partition key requirement is where we get enough performance to provide the flexibility in dealing with legacy apps that are stuck on a 'go to page 8' concept for their application flow without incurring a huge hit scanning a cluster and tossing the first 5 nodes results. was: While doing this across partitions would be awful, inside of a partition this seems like a reasonable request. Something like: SELECT * FROM my_table WHERE bucket='2015-10-10 12:00:00' LIMIT 100 OFFSET 100 with a schema such as: CREATE TABLE my_table (bucket timestamp, id timeuuid, value text, PRIMARY KEY(bucket, id)); This could ease pain in migration of legacy use cases and I'm not convinced the read cost has to be horrible when it's inside of a single partition. > Row offset within a partition > - > > Key: CASSANDRA-11153 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11153 > Project: Cassandra > Issue Type: New Feature >Reporter: Ryan Svihla >Priority: Minor > > While doing this across partitions would be awful, inside of a partition this > seems like a reasonable request. 
Something like: > SELECT * FROM my_table WHERE bucket='2015-10-10 12:00:00' LIMIT 100 OFFSET 100 > with a schema such as: > CREATE TABLE my_table (bucket timestamp, id timeuuid, value text, PRIMARY > KEY(bucket, id)); > This could ease pain in migration of legacy use cases and I'm not convinced > the read cost has to be horrible when it's inside of a single partition. > EDIT: I'm aware there is already an issue > https://issues.apache.org/jira/browse/CASSANDRA-6511. I think the partition > key requirement is where we get enough performance to provide the flexibility > in dealing with legacy apps that are stuck on a 'go to page 8' concept for > their application flow without incurring a huge hit scanning a cluster and > tossing the first 5 nodes results. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-11153) Row offset within a partition
Ryan Svihla created CASSANDRA-11153: --- Summary: Row offset within a partition Key: CASSANDRA-11153 URL: https://issues.apache.org/jira/browse/CASSANDRA-11153 Project: Cassandra Issue Type: New Feature Reporter: Ryan Svihla Priority: Minor While doing this across partitions would be awful, inside of a partition this seems like a reasonable request. Something like: SELECT * FROM my_table WHERE bucket='2015-10-10 12:00:00' LIMIT 100 OFFSET 100 with a schema such as: CREATE TABLE my_table (bucket timestamp, id timeuuid, value text, PRIMARY KEY(bucket, id)); This could ease pain in migration of legacy use cases and I'm not convinced the read cost has to be horrible when it's inside of a single partition. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10013) Default commitlog_total_space_in_mb to 4G
[ https://issues.apache.org/jira/browse/CASSANDRA-10013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14714231#comment-14714231 ] Ryan Svihla commented on CASSANDRA-10013: - I thought we'd had but I figured you knew better than I did :) > Default commitlog_total_space_in_mb to 4G > - > > Key: CASSANDRA-10013 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10013 > Project: Cassandra > Issue Type: Improvement > Components: Config >Reporter: Brandon Williams > Fix For: 2.1.x > > > First, it bothers me that we default to 1G but have 4G commented out in the > config. > More importantly though is more than once I've seen this lead to dropped > mutations, because you have ~100 tables (which isn't that hard to do with > OpsCenter and CFS and an application that uses a moderately high but still > reasonable amount of tables itself) and when the limit is reached CLA flushes > the oldest tables to try to free up CL space, but this in turn causes a flush > stampede that in some cases never ends and backs up the flush queue which > then causes the drops. This leaves you thinking you have a load shedding > situation (which I guess you kind of do) but it would go away if you had just > uncommented that config line. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10013) Default commitlog_total_space_in_mb to 4G
[ https://issues.apache.org/jira/browse/CASSANDRA-10013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14713624#comment-14713624 ] Ryan Svihla commented on CASSANDRA-10013: - +1 on warning AND 1/8th free space. Yes people are doing that, seeing more "Cassandra for HA" use cases. Sadly I think these are pilot projects for bigger Cassandra deployments later on and they're using what they have available. > Default commitlog_total_space_in_mb to 4G > - > > Key: CASSANDRA-10013 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10013 > Project: Cassandra > Issue Type: Improvement > Components: Config >Reporter: Brandon Williams > Fix For: 2.1.x > > > First, it bothers me that we default to 1G but have 4G commented out in the > config. > More importantly though is more than once I've seen this lead to dropped > mutations, because you have ~100 tables (which isn't that hard to do with > OpsCenter and CFS and an application that uses a moderately high but still > reasonable amount of tables itself) and when the limit is reached CLA flushes > the oldest tables to try to free up CL space, but this in turn causes a flush > stampede that in some cases never ends and backs up the flush queue which > then causes the drops. This leaves you thinking you have a load shedding > situation (which I guess you kind of do) but it would go away if you had just > uncommented that config line. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
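The "4G default capped at 1/8th of free space" idea being +1'd above could look roughly like this. The class, method name, and the exact rule are assumptions for illustration, not the logic Cassandra ships.

```java
public class CommitLogSizing {
    static final long DEFAULT_TOTAL_SPACE_MB = 4096; // proposed 4G default

    /**
     * Caps commitlog space at the smaller of the 4G default and 1/8th of
     * the free space on the commitlog volume, so small instances (e.g. the
     * EC2 m3.medium with its 4G disk mentioned in this thread) are not
     * filled up by the default.
     */
    public static long effectiveTotalSpaceMb(long freeSpaceMb) {
        return Math.min(DEFAULT_TOTAL_SPACE_MB, freeSpaceMb / 8);
    }
}
```

A warning at startup when the 1/8th cap kicks in would cover the other half of the +1.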
[jira] [Comment Edited] (CASSANDRA-10013) Default commitlog_total_space_in_mb to 4G
[ https://issues.apache.org/jira/browse/CASSANDRA-10013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14713012#comment-14713012 ] Ryan Svihla edited comment on CASSANDRA-10013 at 8/26/15 12:16 PM: --- Only negative, and I've hit this, is the commit log filling up a small partition that previously was large enough. While I can safely say this should NEVER be a problem, it sometimes is and affects 'small deployments' like some of the smaller cloud instances (EC2 m3.medium for example is only 4g). On the positive side: - better default performance (better out of the box experience for most customers) On the negative side: - will block some new users from even using Cassandra ( new to cloud and new to Cassandra I'd think ) - May break some existing 2.1.x deployments who are just relying on default (most) and have a tiny partition for commit log (I'd hope very very few) If we go this route we need to make sure the minimum requirements are updated in the docs and wiki correspondingly. It is also surprising in a point release to change a default. EDIT: I also agree the configuration having a comment different than the default is HYPER surprising and would be the first time I've even seen that in the wild. was (Author: rssvihla): Only negative and I've hit this, is the commit log filling up a small partition that previously was large enough. While I can safely say this should NEVER be a problem, it sometimes is and affects 'small deployments' like some of the smaller cloud instances (EC2 m3.medium for example is only 4g). 
On the positive side: - better default performance (better out of the box experience for most customers) On the negative side: - will block some new users from even using Cassandra ( new to cloud and new to Cassandra I'd think ) - May break some existing 2.1.x deployments who are just relying on default (most) and have a tiny partition for commit log (I'd hope very very few) If we go this route we need to make sure the minimum requirements are updated in the docs and wiki correspondingly. It is also surprising in a point release to change a default. EDIT: I also agree the configuration have a comment different than the default is HYPER surprising and would be the first time I've even seen that in the wild. > Default commitlog_total_space_in_mb to 4G > - > > Key: CASSANDRA-10013 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10013 > Project: Cassandra > Issue Type: Improvement > Components: Config >Reporter: Brandon Williams > Fix For: 2.1.x > > > First, it bothers me that we default to 1G but have 4G commented out in the > config. > More importantly though is more than once I've seen this lead to dropped > mutations, because you have ~100 tables (which isn't that hard to do with > OpsCenter and CFS and an application that uses a moderately high but still > reasonable amount of tables itself) and when the limit is reached CLA flushes > the oldest tables to try to free up CL space, but this in turn causes a flush > stampede that in some cases never ends and backs up the flush queue which > then causes the drops. This leaves you thinking you have a load shedding > situation (which I guess you kind of do) but it would go away if you had just > uncommented that config line. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-10013) Default commitlog_total_space_in_mb to 4G
[ https://issues.apache.org/jira/browse/CASSANDRA-10013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14713012#comment-14713012 ] Ryan Svihla edited comment on CASSANDRA-10013 at 8/26/15 12:16 PM: --- Only negative and I've hit this, is the commit log filling up a small partition that previously was large enough. While I can safely say this should NEVER be a problem, it sometimes is and affects 'small deployments' like some of the smaller cloud instances (EC2 m3.medium for example is only 4g). On the positive side: - better default performance (better out of the box experience for most customers) On the negative side: - will block some new users from even using Cassandra ( new to cloud and new to Cassandra I'd think ) - May break some existing 2.1.x deployments who are just relying on default (most) and have a tiny partition for commit log (I'd hope very very few) If we go this route we need to make sure the minimum requirements are updated in the docs and wiki correspondingly. It is also surprising in a point release to change a default. EDIT: I also agree the configuration have a comment different than the default is HYPER surprising and would be the first time I've even seen that in the wild. was (Author: rssvihla): Only negative and I've hit this, is the commit log filling up a small partition that previously was large enough. While I can safely say this should NEVER be a problem, it sometimes is and affects 'small deployments' like some of the smaller cloud instances (EC2 m3.medium for example is only 4g). 
On the positive side: - better default performance (better out of the box experience for most customers) On the negative side: - will block some new users from even using Cassandra ( new to cloud and new to Cassandra I'd think ) - May break some existing 2.1.x deployments who are just relying on default (most) and have a tiny partition for commit log (I'd hope very very few) If we go this route we need to make sure the minimum requirements are updated in the docs and wiki correspondingly. It is also surprising in a point release to change a default. > Default commitlog_total_space_in_mb to 4G > - > > Key: CASSANDRA-10013 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10013 > Project: Cassandra > Issue Type: Improvement > Components: Config >Reporter: Brandon Williams > Fix For: 2.1.x > > > First, it bothers me that we default to 1G but have 4G commented out in the > config. > More importantly though is more than once I've seen this lead to dropped > mutations, because you have ~100 tables (which isn't that hard to do with > OpsCenter and CFS and an application that uses a moderately high but still > reasonable amount of tables itself) and when the limit is reached CLA flushes > the oldest tables to try to free up CL space, but this in turn causes a flush > stampede that in some cases never ends and backs up the flush queue which > then causes the drops. This leaves you thinking you have a load shedding > situation (which I guess you kind of do) but it would go away if you had just > uncommented that config line. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10013) Default commitlog_total_space_in_mb to 4G
[ https://issues.apache.org/jira/browse/CASSANDRA-10013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14713012#comment-14713012 ] Ryan Svihla commented on CASSANDRA-10013: - Only negative, and I've hit this, is the commit log filling up a small partition that previously was large enough. While I can safely say this should NEVER be a problem, it sometimes is and affects 'small deployments' like some of the smaller cloud instances (EC2 m3.medium for example is only 4g). On the positive side: - better default performance (better out of the box experience for most customers) On the negative side: - will block some new users from even using Cassandra ( new to cloud and new to Cassandra I'd think ) - May break some existing 2.1.x deployments who are just relying on default (most) and have a tiny partition for commit log (I'd hope very very few) If we go this route we need to make sure the minimum requirements are updated in the docs and wiki correspondingly. It is also surprising in a point release to change a default. > Default commitlog_total_space_in_mb to 4G > - > > Key: CASSANDRA-10013 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10013 > Project: Cassandra > Issue Type: Improvement > Components: Config >Reporter: Brandon Williams > Fix For: 2.1.x > > > First, it bothers me that we default to 1G but have 4G commented out in the > config. > More importantly though is more than once I've seen this lead to dropped > mutations, because you have ~100 tables (which isn't that hard to do with > OpsCenter and CFS and an application that uses a moderately high but still > reasonable amount of tables itself) and when the limit is reached CLA flushes > the oldest tables to try to free up CL space, but this in turn causes a flush > stampede that in some cases never ends and backs up the flush queue which > then causes the drops. 
This leaves you thinking you have a load shedding > situation (which I guess you kind of do) but it would go away if you had just > uncommented that config line. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9415) Implicit use of Materialized Views on SELECT
[ https://issues.apache.org/jira/browse/CASSANDRA-9415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14548435#comment-14548435 ] Ryan Svihla commented on CASSANDRA-9415: This would be a big win for a lot of analytics tools and would bring us ever closer to RDBMS for ease of use. I can see this greatly smoothing the learning curve for new users as well. > Implicit use of Materialized Views on SELECT > > > Key: CASSANDRA-9415 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9415 > Project: Cassandra > Issue Type: Improvement >Reporter: Brian Hess > > CASSANDRA-6477 introduces Materialized Views. This greatly simplifies the > write path for the best-practice of "query tables". But it does not simplify > the read path as much as our users want/need. > We suggest to folks to create multiple copies of their base table optimized > for certain queries - hence "query table". For example, we may have a USER > table with two type of queries: lookup by userid and lookup by email address. > We would recommend creating 2 tables USER_BY_USERID and USER_BY_EMAIL. Both > would have the exact same schema, with the same PRIMARY KEY columns, but > different PARTITION KEY - the first would be USERID and the second would be > EMAIL. > One complicating thing with this approach is that the application now needs > to know that when it INSERT/UPDATE/DELETEs from the base table it needs to > INSERT/UPDATE/DELETE from all of the query tables as well. CASSANDRA-6477 > covers this nicely. > However, the other side of the coin is that the application needs to know > which query table to leverage based on the selection criteria. Using the > example above, if the query has a predicate such as "WHERE userid = 'bhess'", > then USERS_BY_USERID is the better table to use. Similarly, when the > predicate is "WHERE email = 'bhess@company.whatever'", USERS_BY_EMAIL is > appropriate. 
> On INSERT/UPDATE/DELETE, Materialized Views essentially give a single "name" > to the collection of tables. You do operations just on the base table. > That single "name" is very attractive for the SELECT side as well. It would be very good to > allow an application to simply do "SELECT * FROM users WHERE userid = > 'bhess'" and have that query implicitly leverage the USERS_BY_USERID > materialized view. > For additional use cases, especially analytics use cases like in Spark, this > allows the Spark code to simply push down the query without having to know > about all of the MVs that have been set up. The system will route the query > appropriately. And if additional MVs are necessary to make a query run > better/faster, then those MVs can be set up and Spark will implicitly > leverage them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
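The implicit routing proposed in the ticket could be sketched as a simple lookup from the restricted column to the view partitioned by that column. This is purely illustrative, not Cassandra internals; the table/view names are taken from the example above and the function name is hypothetical.

```python
# Hypothetical sketch of implicit SELECT routing: given the columns
# restricted in the WHERE clause, pick the base table or the materialized
# view whose partition key matches. Not real Cassandra code.

VIEWS = {
    "users": {                      # base table -> candidate targets
        "userid": "users",          # base table is partitioned by userid
        "email": "users_by_email",  # MV partitioned by email
    }
}

def route_select(base_table, restricted_columns):
    """Return the table or view that can serve the query, or None."""
    candidates = VIEWS.get(base_table, {})
    for column in restricted_columns:
        if column in candidates:
            return candidates[column]
    return None  # no suitable view; the query would need ALLOW FILTERING
```

With this kind of mapping, "SELECT * FROM users WHERE email = …" would transparently hit the email-partitioned view, which is exactly the Spark push-down benefit described above.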
[jira] [Comment Edited] (CASSANDRA-9284) support joins if partition key is the same in both tables
[ https://issues.apache.org/jira/browse/CASSANDRA-9284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14523251#comment-14523251 ] Ryan Svihla edited comment on CASSANDRA-9284 at 5/1/15 2:32 PM: This could add some flexibility to data modeling practices and would allow one giant table to be split up into smaller tables. One would still have to be mindful of the total retrieval size in a way we do not have to consider now, but this would, I believe, help people new to Cassandra still keep their table models conceptually broken up into smaller, easier to understand pieces. It also has the side effect of reducing partition size into smaller logical units wrt compaction. Some data models that previously would have gone over the in_memory_compaction_limit would no longer do so, as that data would be split up across different tables. This effectively allows another type of sharding for large logical rows. was (Author: rssvihla): This could add some flexibility to data modeling practices and would allow one giant table to be split up into smaller tables. One would still have to be mindful of the total retrieval size in a way we do not have to consider now, but this would I believe help people new to Cassandra still keep their table models conceptually broken up into smaller easier to understand pieces. > support joins if partition key is the same in both tables > - > > Key: CASSANDRA-9284 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9284 > Project: Cassandra > Issue Type: Improvement >Reporter: Jon Haddad >Priority: Minor > > Based off this conversation: > https://mail-archives.apache.org/mod_mbox/cassandra-dev/201505.mbox/%3CCACUnPaAfJqU%2B86fFd4S6MS7Wv0KhpT_vavdkvDS%2Bm4Madi8_cg%40mail.gmail.com%3E > It would be nice to have the flexibility of joins if we knew the query could > be satisfied on a single node. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9284) support joins if partition key is the same in both tables
[ https://issues.apache.org/jira/browse/CASSANDRA-9284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14523251#comment-14523251 ] Ryan Svihla commented on CASSANDRA-9284: This could add some flexibility to data modeling practices and would allow one giant table to be split up into smaller tables. One would still have to be mindful of the total retrieval size in a way we do not have to consider now, but this would, I believe, help people new to Cassandra still keep their table models conceptually broken up into smaller, easier to understand pieces. > support joins if partition key is the same in both tables > - > > Key: CASSANDRA-9284 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9284 > Project: Cassandra > Issue Type: Improvement >Reporter: Jon Haddad >Priority: Minor > > Based off this conversation: > https://mail-archives.apache.org/mod_mbox/cassandra-dev/201505.mbox/%3CCACUnPaAfJqU%2B86fFd4S6MS7Wv0KhpT_vavdkvDS%2Bm4Madi8_cg%40mail.gmail.com%3E > It would be nice to have the flexibility of joins if we knew the query could > be satisfied on a single node. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
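The intuition behind the proposal above can be sketched in a few lines: if two tables share a partition key (and, one would assume, the same keyspace/replication so rows co-locate), all data needed for the join lives on one replica set and no cross-node shuffle is required. The model below is a toy, not Cassandra code; tables are plain dicts keyed by partition key.

```python
# Illustrative sketch of a single-node join on a shared partition key.
# Because both "tables" hold their rows for a given partition key on the
# same replicas, the join can be answered locally. Toy model only.

def local_join(left, right, key):
    """Inner-join two tables on the shared partition key value `key`."""
    if key in left and key in right:
        merged = dict(left[key])   # start from the left row
        merged.update(right[key])  # overlay the right row's columns
        return merged
    return None  # no matching row on one side: empty join result

# Hypothetical example data, partitioned by userid on both sides.
users = {"bhess": {"userid": "bhess", "name": "Brian"}}
logins = {"bhess": {"userid": "bhess", "last_login": "2015-05-01"}}
```

A query like "join users and logins where userid = 'bhess'" would then resolve entirely on the node(s) owning that one partition, which is the single-node guarantee the ticket asks for.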
[jira] [Comment Edited] (CASSANDRA-9191) Log and count failure to obtain requested consistency
[ https://issues.apache.org/jira/browse/CASSANDRA-9191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14495136#comment-14495136 ] Ryan Svihla edited comment on CASSANDRA-9191 at 4/14/15 11:25 PM: -- A JMX counter at least, maybe (or is that what you mean by 'expose'?) was (Author: rssvihla): JMX counter maybe? > Log and count failure to obtain requested consistency > - > > Key: CASSANDRA-9191 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9191 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Matt Stump > > Cassandra should have a way to log failed requests due to failure to obtain > requested consistency. This should be logged as error or warning by default. > Also exposed should be a counter for the benefit of opscenter. > Currently the only way to log this is at the client. Often the application > and DB teams are separate and it's very difficult to obtain client logs. Also > because it's only visible to the client no visibility is given to opscenter > making it difficult for the field to track down or isolate systematic or node > level errors. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9191) Log and count failure to obtain requested consistency
[ https://issues.apache.org/jira/browse/CASSANDRA-9191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14495136#comment-14495136 ] Ryan Svihla commented on CASSANDRA-9191: JMX counter maybe? > Log and count failure to obtain requested consistency > - > > Key: CASSANDRA-9191 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9191 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Matt Stump > > Cassandra should have a way to log failed requests due to failure to obtain > requested consistency. This should be logged as error or warning by default. > Also exposed should be a counter for the benefit of opscenter. > Currently the only way to log this is at the client. Often the application > and DB teams are separate and it's very difficult to obtain client logs. Also > because it's only visible to the client no visibility is given to opscenter > making it difficult for the field to track down or isolate systematic or node > level errors. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
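The counter idea from the exchange above could look something like the following sketch: tally every request that cannot reach its requested consistency level, the way a JMX-exposed metric would, so an operator tool can poll it. The class and method names are hypothetical, not Cassandra's actual metrics API.

```python
# Hedged sketch of a server-side "failed to obtain consistency" counter.
# A real implementation would register this with JMX/metrics; here it is
# just a plain Python object. All names are illustrative.

class ConsistencyFailureMetrics:
    def __init__(self):
        self.unavailable_count = 0  # requests that missed their CL

    def record_attempt(self, consistency_level, alive, required):
        """Return True if enough replicas were alive, else count a failure."""
        if alive < required:
            self.unavailable_count += 1  # what OpsCenter would poll
            return False                 # request fails as unavailable
        return True

metrics = ConsistencyFailureMetrics()
metrics.record_attempt("LOCAL_QUORUM", alive=1, required=2)  # fails
metrics.record_attempt("ONE", alive=1, required=1)           # succeeds
```

Exposing such a counter server-side addresses the ticket's core complaint: today the failure is only visible in client logs that the DB team often cannot access.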
[jira] [Comment Edited] (CASSANDRA-7902) Introduce the ability to ignore RR based on consistencfy
[ https://issues.apache.org/jira/browse/CASSANDRA-7902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351588#comment-14351588 ] Ryan Svihla edited comment on CASSANDRA-7902 at 3/7/15 2:05 PM: Agreed, local RR for local_* is a demonstration of the principle of least surprise if there ever was one. This will have some interesting ramifications for one scenario where someone uses local_* because of slow dc links, but doesn't run nodetool repair because of slow dc links; in theory, global read repair firing may have given them their only chance at cross dc consistency (assuming the RR even succeeds). However, I'm not sure it's worth complicating everything for a scenario that involves lots of dubious decisions and will never work properly with or without global RR (assuming they even know how to turn it on). was (Author: rssvihla): Agreed, local RR for local_* is a demonstration of the principle of least surprise if there ever was one. This will have some interesting ramifications for one scenario where someone uses local_* because of slow dc links, but doesn't run nodetool repair because of slow dc links, in theory global read repair firing may have given them their only chance at cross dc consistency (assuming the RR even succeeds). However, I'm not sure it's worth complicating everything for a scenario that involves lots of dubious decisions and will never work properly even with or without global RR. > Introduce the ability to ignore RR based on consistencfy > > > Key: CASSANDRA-7902 > URL: https://issues.apache.org/jira/browse/CASSANDRA-7902 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Brandon Williams > > There exists a case for LOCAL_* consistency levels where you really want them > *local only*. This implies that you don't ever want to do cross-dc RR, but > do for other levels. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7902) Introduce the ability to ignore RR based on consistencfy
[ https://issues.apache.org/jira/browse/CASSANDRA-7902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351588#comment-14351588 ] Ryan Svihla commented on CASSANDRA-7902: Agreed, local RR for local_* is a demonstration of the principle of least surprise if there ever was one. This will have some interesting ramifications for one scenario where someone uses local_* because of slow dc links, but doesn't run nodetool repair because of slow dc links; in theory, global read repair firing may have given them their only chance at cross dc consistency (assuming the RR even succeeds). However, I'm not sure it's worth complicating everything for a scenario that involves lots of dubious decisions and will never work properly with or without global RR. > Introduce the ability to ignore RR based on consistencfy > > > Key: CASSANDRA-7902 > URL: https://issues.apache.org/jira/browse/CASSANDRA-7902 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Brandon Williams > > There exists a case for LOCAL_* consistency levels where you really want them > *local only*. This implies that you don't ever want to do cross-dc RR, but > do for other levels. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
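The behavior being agreed on above can be sketched as a small decision function: read repair for LOCAL_* consistency levels stays inside the local datacenter, while other levels may repair across datacenters. This is a toy illustration of the policy, not Cassandra's read repair code; all names are hypothetical.

```python
# Illustrative sketch: which replicas may read repair touch for a given
# consistency level? LOCAL_* levels are kept "local only", as the ticket
# proposes. Not real Cassandra internals.

def read_repair_targets(consistency_level, replicas_by_dc, local_dc):
    """Return the replicas read repair is allowed to contact."""
    if consistency_level.startswith("LOCAL_"):
        # local-only: never cross the slow DC link
        return list(replicas_by_dc.get(local_dc, []))
    # non-local CLs may repair replicas in every datacenter
    return [r for dc_replicas in replicas_by_dc.values() for r in dc_replicas]

# Hypothetical two-DC layout.
replicas = {"dc1": ["n1", "n2"], "dc2": ["n3", "n4"]}
```

Note the trade-off the comment flags: with this policy, a cluster run exclusively at LOCAL_* that also skips nodetool repair loses its last accidental path to cross-DC consistency.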
[jira] [Commented] (CASSANDRA-8754) Required consistency level
[ https://issues.apache.org/jira/browse/CASSANDRA-8754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309751#comment-14309751 ] Ryan Svihla commented on CASSANDRA-8754: Auditing would help with diagnosis, but it'd have to be cheap enough to turn on. I still think there is huge value for a number of users in having some server-side CL restriction; I've been asked for it personally more than a few times. > Required consistency level > -- > > Key: CASSANDRA-8754 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8754 > Project: Cassandra > Issue Type: New Feature >Reporter: Ryan Svihla > Labels: ponies > > Idea is to prevent a query based on a consistency level not being met. For > example we can specify that all queries should be at least CL LOCAL_QUORUM. > Lots of users struggle with getting all their dev teams on board with > consistency levels and all the ramifications. The normal solution for this > has traditionally been to build a service in front of Cassandra that the entire > dev team accesses. However, this has proven challenging for some > organizations to do correctly, and I think an easier approach would be to > require a given consistency level as a matter of enforced policy in the > database. > I'm open for where this belongs. The most flexible approach is at a table > level, however I'm concerned this is potentially error prone and labor > intensive. It could be a table attribute similar to compaction strategy. > The simplest administratively is at the cluster level, in, say, the cassandra.yaml > The middle ground is at the keyspace level, the only downside I could > foresee is keyspace explosion to fit involved minimum schemes. It could be a > keyspace attribute such as replication strategy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Issue Comment Deleted] (CASSANDRA-8754) Required consistency level
[ https://issues.apache.org/jira/browse/CASSANDRA-8754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Svihla updated CASSANDRA-8754: --- Comment: was deleted (was: Then their table will be changed... and I'll have a schema description to indicate such) > Required consistency level > -- > > Key: CASSANDRA-8754 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8754 > Project: Cassandra > Issue Type: New Feature >Reporter: Ryan Svihla > Labels: ponies > > Idea is to prevent a query based on a consistency level not being met. For > example we can specify that all queries should be at least CL LOCAL_QUORUM. > Lots of users struggle with getting all their dev teams on board with > consistency levels and all the ramifications. The normal solution for this > has traditionally been to build a service in front of Cassandra that the entire > dev team accesses. However, this has proven challenging for some > organizations to do correctly, and I think an easier approach would be to > require a given consistency level as a matter of enforced policy in the > database. > I'm open for where this belongs. The most flexible approach is at a table > level, however I'm concerned this is potentially error prone and labor > intensive. It could be a table attribute similar to compaction strategy. > The simplest administratively is at the cluster level, in, say, the cassandra.yaml > The middle ground is at the keyspace level, the only downside I could > foresee is keyspace explosion to fit involved minimum schemes. It could be a > keyspace attribute such as replication strategy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8754) Required consistency level
[ https://issues.apache.org/jira/browse/CASSANDRA-8754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309651#comment-14309651 ] Ryan Svihla commented on CASSANDRA-8754: Then their table will be changed... and I'll have a schema description to indicate such > Required consistency level > -- > > Key: CASSANDRA-8754 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8754 > Project: Cassandra > Issue Type: New Feature >Reporter: Ryan Svihla > Labels: ponies > > Idea is to prevent a query based on a consistency level not being met. For > example we can specify that all queries should be at least CL LOCAL_QUORUM. > Lots of users struggle with getting all their dev teams on board with > consistency levels and all the ramifications. The normal solution for this > has traditionally been to build a service in front of Cassandra that the entire > dev team accesses. However, this has proven challenging for some > organizations to do correctly, and I think an easier approach would be to > require a given consistency level as a matter of enforced policy in the > database. > I'm open for where this belongs. The most flexible approach is at a table > level, however I'm concerned this is potentially error prone and labor > intensive. It could be a table attribute similar to compaction strategy. > The simplest administratively is at the cluster level, in, say, the cassandra.yaml > The middle ground is at the keyspace level, the only downside I could > foresee is keyspace explosion to fit involved minimum schemes. It could be a > keyspace attribute such as replication strategy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
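The enforcement the ticket asks for could be sketched as a pre-execution check: compare the request's consistency level against a configured minimum and reject anything weaker. The strength ordering below is a deliberate simplification for illustration (LOCAL_QUORUM vs QUORUM are not strictly ordered in general), and none of the names are real cassandra.yaml options.

```python
# Hypothetical sketch of server-side "required consistency level"
# enforcement. The ordering and option names are illustrative only.

CL_STRENGTH = {"ANY": 0, "ONE": 1, "LOCAL_QUORUM": 2, "QUORUM": 3, "ALL": 4}

def check_consistency(requested, required_minimum):
    """Reject the query if its CL is weaker than the enforced minimum."""
    if CL_STRENGTH[requested] < CL_STRENGTH[required_minimum]:
        raise PermissionError(
            f"CL {requested} is below the required minimum {required_minimum}")
    return True

# e.g. a cluster-wide policy of LOCAL_QUORUM, as in the ticket's example:
check_consistency("QUORUM", "LOCAL_QUORUM")  # allowed
```

Whether the `required_minimum` lives at cluster, keyspace, or table scope is exactly the open question in the ticket; the check itself is the same either way.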