[jira] [Assigned] (CASSANDRA-15830) Invalid version value: 4.0~alpha4 during startup
[ https://issues.apache.org/jira/browse/CASSANDRA-15830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Semb Wever reassigned CASSANDRA-15830: -- Assignee: Michael Semb Wever > Invalid version value: 4.0~alpha4 during startup > > > Key: CASSANDRA-15830 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15830 > Project: Cassandra > Issue Type: Bug >Reporter: Eric Wong >Assignee: Michael Semb Wever >Priority: Normal > > Hi: > We are testing the latest cassandra-4.0 on Centos 7 using a clean database. > When we started cassandra the first time, everything was fine. However, when > we stop and restart cassandra, we get the following error and the db refuses > to start up: > {code} > ERROR [main] 2020-05-22 05:58:18,698 CassandraDaemon.java:789 - Exception > encountered during startup > java.lang.IllegalArgumentException: Invalid version value: 4.0~alpha4 > at > org.apache.cassandra.utils.CassandraVersion.<init>(CassandraVersion.java:64) > at > org.apache.cassandra.io.sstable.SSTableHeaderFix.fixNonFrozenUDTIfUpgradeFrom30(SSTableHeaderFix.java:84) > at > org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:250) > at > org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:650) > at > org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:767) > {code} > The only way to get the node up and running again is by deleting all data > under /var/lib/cassandra. > > Is that a known issue? > Thanks, Eric > > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
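The tilde in the failing string points at a Debian-style package version (dpkg uses {{~}} to sort pre-releases before the final release), whereas a dotted-decimal-plus-dash pattern of the kind CassandraVersion expects has no tilde branch. Below is a minimal sketch of the mismatch; {{VERSION_PATTERN}} here is a stand-in regex for illustration, not the actual pattern from CassandraVersion.java.
{code:java}
import java.util.regex.Pattern;

public class VersionParseDemo
{
    // Stand-in for the kind of pattern CassandraVersion expects:
    // major.minor[.patch][-qualifier]; note there is no branch for '~'.
    private static final Pattern VERSION_PATTERN =
            Pattern.compile("(\\d+)\\.(\\d+)(?:\\.(\\d+))?(?:-([-.\\w]+))?");

    public static void main(String[] args)
    {
        for (String v : new String[] { "4.0-alpha4", "4.0~alpha4" })
        {
            boolean ok = VERSION_PATTERN.matcher(v).matches();
            System.out.println(v + " -> " + (ok ? "parses" : "Invalid version value"));
        }
    }
}
{code}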
[jira] [Assigned] (CASSANDRA-15833) Unresolvable false digest mismatch during upgrade due to CASSANDRA-10657
[ https://issues.apache.org/jira/browse/CASSANDRA-15833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Khandelwal reassigned CASSANDRA-15833: - Assignee: (was: Manish Khandelwal) > Unresolvable false digest mismatch during upgrade due to CASSANDRA-10657 > > > Key: CASSANDRA-15833 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15833 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Repair >Reporter: Jacek Lewandowski >Priority: Normal > Fix For: 3.11.x, 4.x > > Attachments: CASSANDRA-15833-3.11.patch, CASSANDRA-15833-4.0.patch > > > CASSANDRA-10657 introduced changes in how the ColumnFilter is interpreted. > This results in digest mismatch when querying incomplete set of columns from > a table with consistency that requires reaching instances running pre > CASSANDRA-10657 from nodes that include CASSANDRA-10657 (it was introduced in > Cassandra 3.4). > The fix is to bring back the previous behaviour until there are no instances > running pre CASSANDRA-10657 version. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Assigned] (CASSANDRA-15833) Unresolvable false digest mismatch during upgrade due to CASSANDRA-10657
[ https://issues.apache.org/jira/browse/CASSANDRA-15833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Khandelwal reassigned CASSANDRA-15833: - Assignee: Manish Khandelwal (was: Jacek Lewandowski) > Unresolvable false digest mismatch during upgrade due to CASSANDRA-10657 > > > Key: CASSANDRA-15833 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15833 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Repair >Reporter: Jacek Lewandowski >Assignee: Manish Khandelwal >Priority: Normal > Fix For: 3.11.x, 4.x > > Attachments: CASSANDRA-15833-3.11.patch, CASSANDRA-15833-4.0.patch > > > CASSANDRA-10657 introduced changes in how the ColumnFilter is interpreted. > This results in digest mismatch when querying incomplete set of columns from > a table with consistency that requires reaching instances running pre > CASSANDRA-10657 from nodes that include CASSANDRA-10657 (it was introduced in > Cassandra 3.4). > The fix is to bring back the previous behaviour until there are no instances > running pre CASSANDRA-10657 version. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-12126) CAS Reads Inconsistencies
[ https://issues.apache.org/jira/browse/CASSANDRA-12126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17117172#comment-17117172 ] Benedict Elliott Smith commented on CASSANDRA-12126: So, a thought has occurred to me: what do we actually claim our consistency properties are for SERIAL? My understanding was that we claimed only serializability, in which case I don't think that strictly speaking this is a bug. I think it's only a bug if we claim strict serializability. However, the only docs I can find claiming either are DataStax's, which mix up linearizable with serializable. FWIW, I consider this to be a bug, as we should at least support the more intuitively correct semantics. But perhaps we should instead introduce a new STRICT_SERIAL consistency level to solve it, and clarify what SERIAL means in our docs? I would also be OK with simply claiming strict serializability for SERIAL. But perhaps this technicality/ambiguity buys us some time and cover to solve the problem without introducing major performance penalties? I also have some relevant test cases I will share tomorrow, along with test cases for other correctness failures of LWTs. > CAS Reads Inconsistencies > -- > > Key: CASSANDRA-12126 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12126 > Project: Cassandra > Issue Type: Bug > Components: Feature/Lightweight Transactions, Legacy/Coordination >Reporter: Sankalp Kohli >Assignee: Sylvain Lebresne >Priority: Normal > Labels: LWT, pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > While looking at the CAS code in Cassandra, I found a potential issue with > CAS Reads. Here is how it can happen with RF=3: > 1) You issue a CAS Write and it fails in the propose phase. Machine A replies > true to the propose and saves the commit in its accepted field. The other two > machines B and C do not get to the accept phase. > The current state is that machine A has this commit in the paxos table as accepted > but not committed, and B and C do not. > 2) Issue a CAS Read and it goes to only B and C. You won't be able to read the > value written in step 1. This step is as if nothing is inflight. > 3) Issue another CAS Read and it goes to A and B. Now we will discover that > there is something inflight from A and will propose and commit it with the > current ballot. Now we can read the value written in step 1 as part of this > CAS read. > If we skip step 3 and instead run step 4, we will never learn about the value > written in step 1. > 4) Issue a CAS Write and it involves only B and C. This will succeed and > commit a different value than step 1. The step 1 value will never be seen again > and was never seen before. > If you read the Lamport “Paxos made simple” paper, section 2.3 talks about > this issue, which is how learners can find out if a majority of the > acceptors have accepted the proposal. > In step 3, it is correct that we propose the value again since we don't know > if it was accepted by a majority of acceptors. When we ask a majority of > acceptors, and more than one acceptor but not a majority has something in > flight, we have no way of knowing if it was accepted by a majority of acceptors. > So this behavior is correct. > However we need to fix step 2, since it causes reads to not be linearizable > with respect to writes and other reads. In this case, we know that a majority > of acceptors have no inflight commit, which means we have a majority confirming that > nothing was accepted by a majority. 
I think we should run a propose step here > with an empty commit, and that will cause the write from step 1 to never be > visible afterwards. > With this fix, we will either see the data written in step 1 on the next serial read > or will never see it, which is what we want. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
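To make the four-step sequence above concrete, here is a deliberately tiny, self-contained model of the scenario and of the empty-commit idea from the last paragraph. It is not Cassandra code; the class and field names are invented for illustration, and the Paxos prepare/promise rounds are collapsed into a single step.
{code:java}
import java.util.*;

// Toy model of CASSANDRA-12126: one acceptor holds an accepted-but-uncommitted
// value after a failed CAS write, and which read quorum is contacted decides
// whether that value ever becomes visible. Illustrative only, not Cassandra code.
public class CasReadToyModel
{
    static class Acceptor
    {
        int acceptedBallot = -1;
        String acceptedValue = null;   // in-flight, accepted but not committed
    }

    // A serial read against a 2-of-3 quorum, optionally applying the proposed fix:
    // if no in-flight value is seen, accept an empty proposal at a higher ballot
    // so the orphaned value can never win a later round.
    static String serialRead(List<Acceptor> quorum, int newBallot, boolean withFix)
    {
        Optional<Acceptor> inFlight = quorum.stream()
                .filter(x -> x.acceptedValue != null)
                .max(Comparator.comparingInt((Acceptor x) -> x.acceptedBallot));

        if (inFlight.isPresent())
            return inFlight.get().acceptedValue;   // must finish the in-flight proposal

        if (withFix)
        {
            for (Acceptor x : quorum)
            {
                x.acceptedBallot = newBallot;      // a majority now holds an empty
                x.acceptedValue = "EMPTY";         // proposal with a higher ballot
            }
        }
        return "EMPTY";
    }

    public static void main(String[] args)
    {
        Acceptor a = new Acceptor(), b = new Acceptor(), c = new Acceptor();
        a.acceptedBallot = 1;
        a.acceptedValue = "v1";                    // step 1: failed CAS write reached only A

        boolean withFix = false;                   // flip to true for the proposed behaviour
        System.out.println("read {B,C}: " + serialRead(List.of(b, c), 2, withFix));
        System.out.println("read {A,B}: " + serialRead(List.of(a, b), 3, withFix));
        // Without the fix the second read surfaces v1 after the first read returned
        // EMPTY; with the fix both reads return EMPTY, so the reads stay consistent.
    }
}
{code}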
[jira] [Commented] (CASSANDRA-15837) Enhance fqltool to be able to export the fql log into a format which doesn't depend on Cassandra
[ https://issues.apache.org/jira/browse/CASSANDRA-15837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17117151#comment-17117151 ] David Capwell commented on CASSANDRA-15837: --- I have already created a working patch which also works with tlp-stress. It works by producing a file of thrift data, which tlp-stress then uses as input to generate load against a cluster. I am linking a working branch to promote discussion; I am aware that the branch requires cleanup and have mostly focused on making sure others are ok with the core concept. > Enhance fqltool to be able to export the fql log into a format which doesn't > depend on Cassandra > > > Key: CASSANDRA-15837 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15837 > Project: Cassandra > Issue Type: Improvement > Components: Tool/fql >Reporter: David Capwell >Assignee: David Capwell >Priority: Normal > > Currently the fql log format uses Cassandra serialization within the message, > which means that reading the file also requires Cassandra classes. To make it > easier for outside tools to read the fql logs, we should enhance the fqltool > to be able to dump the logs to a file format using thrift or protobuf. > Additionally, we should support exporting the original query with a > deterministic version, allowing tools to have reproducible and comparable > results. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-12126) CAS Reads Inconsistencies
[ https://issues.apache.org/jira/browse/CASSANDRA-12126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17117143#comment-17117143 ] Blake Eggleston commented on CASSANDRA-12126: - Agreed, that's the most straightforward way to address both issues (although I've only skimmed your patch). In 3.x though, and at least for the serial read fix, I think we should include a flag to disable the fix, in case a) there's a problem with the fix or b) operators would rather accept the loss of linearizability than the performance impact, for whatever reason. There's also a variant of the non-applying update issue where it's exposed by a read, not another insert. It would be good to have a test for that as well. > CAS Reads Inconsistencies > -- > > Key: CASSANDRA-12126 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12126 > Project: Cassandra > Issue Type: Bug > Components: Feature/Lightweight Transactions, Legacy/Coordination >Reporter: Sankalp Kohli >Assignee: Sylvain Lebresne >Priority: Normal > Labels: LWT, pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > While looking at the CAS code in Cassandra, I found a potential issue with > CAS Reads. Here is how it can happen with RF=3: > 1) You issue a CAS Write and it fails in the propose phase. Machine A replies > true to the propose and saves the commit in its accepted field. The other two > machines B and C do not get to the accept phase. > The current state is that machine A has this commit in the paxos table as accepted > but not committed, and B and C do not. > 2) Issue a CAS Read and it goes to only B and C. You won't be able to read the > value written in step 1. This step is as if nothing is inflight. > 3) Issue another CAS Read and it goes to A and B. Now we will discover that > there is something inflight from A and will propose and commit it with the > current ballot. Now we can read the value written in step 1 as part of this > CAS read. > If we skip step 3 and instead run step 4, we will never learn about the value > written in step 1. > 4) Issue a CAS Write and it involves only B and C. This will succeed and > commit a different value than step 1. The step 1 value will never be seen again > and was never seen before. > If you read the Lamport “Paxos made simple” paper, section 2.3 talks about > this issue, which is how learners can find out if a majority of the > acceptors have accepted the proposal. > In step 3, it is correct that we propose the value again since we don't know > if it was accepted by a majority of acceptors. When we ask a majority of > acceptors, and more than one acceptor but not a majority has something in > flight, we have no way of knowing if it was accepted by a majority of acceptors. > So this behavior is correct. > However we need to fix step 2, since it causes reads to not be linearizable > with respect to writes and other reads. In this case, we know that a majority > of acceptors have no inflight commit, which means we have a majority confirming that > nothing was accepted by a majority. I think we should run a propose step here > with an empty commit, and that will cause the write from step 1 to never be > visible afterwards. > With this fix, we will either see the data written in step 1 on the next serial read > or will never see it, which is what we want. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15833) Unresolvable false digest mismatch during upgrade due to CASSANDRA-10657
[ https://issues.apache.org/jira/browse/CASSANDRA-15833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacek Lewandowski updated CASSANDRA-15833: -- Test and Documentation Plan: The issue can be tested by executing the following steps: 1. Start 2 node cluster of C* 3.0 2. Create a keyspace with RF=2, and a sample table with some regular columns (say of text type), insert some data into that table with CL=QUORUM or ALL 3. Flush created table on both nodes 4. Upgrade the first node either to 3.11 or 4.0 5. Start CQL, set consistency to QUORUM or ALL and enable tracing 6. Query table for a single row with a single regular column 7. Without patch, the tracing output shows the information about digest mismatch, which is unrecoverable (you can even do full repair), with the patch, there is no such issue Status: Patch Available (was: In Progress) > Unresolvable false digest mismatch during upgrade due to CASSANDRA-10657 > > > Key: CASSANDRA-15833 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15833 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Repair >Reporter: Jacek Lewandowski >Assignee: Jacek Lewandowski >Priority: Normal > Fix For: 3.11.x, 4.x > > Attachments: CASSANDRA-15833-3.11.patch, CASSANDRA-15833-4.0.patch > > > CASSANDRA-10657 introduced changes in how the ColumnFilter is interpreted. > This results in digest mismatch when querying incomplete set of columns from > a table with consistency that requires reaching instances running pre > CASSANDRA-10657 from nodes that include CASSANDRA-10657 (it was introduced in > Cassandra 3.4). > The fix is to bring back the previous behaviour until there are no instances > running pre CASSANDRA-10657 version. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15833) Unresolvable false digest mismatch during upgrade due to CASSANDRA-10657
[ https://issues.apache.org/jira/browse/CASSANDRA-15833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacek Lewandowski updated CASSANDRA-15833: -- Attachment: CASSANDRA-15833-4.0.patch > Unresolvable false digest mismatch during upgrade due to CASSANDRA-10657 > > > Key: CASSANDRA-15833 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15833 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Repair >Reporter: Jacek Lewandowski >Assignee: Jacek Lewandowski >Priority: Normal > Fix For: 3.11.x, 4.x > > Attachments: CASSANDRA-15833-3.11.patch, CASSANDRA-15833-4.0.patch > > > CASSANDRA-10657 introduced changes in how the ColumnFilter is interpreted. > This results in digest mismatch when querying incomplete set of columns from > a table with consistency that requires reaching instances running pre > CASSANDRA-10657 from nodes that include CASSANDRA-10657 (it was introduced in > Cassandra 3.4). > The fix is to bring back the previous behaviour until there are no instances > running pre CASSANDRA-10657 version. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Assigned] (CASSANDRA-15821) Metrics Documentation Enhancements
[ https://issues.apache.org/jira/browse/CASSANDRA-15821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephen Mallette reassigned CASSANDRA-15821: Assignee: Stephen Mallette I've started with some initial changes here: https://github.com/apache/cassandra/compare/trunk...spmallette:CASSANDRA-15821 I mostly focused on Table/Keyspace metrics and ClientRequest metrics, adding those items noted as missing in the referenced spreadsheet. I've updated the spreadsheet accordingly to keep track of where things are. At the risk of shifting a lot of things around, I'd very much like to alphabetize the metric listings in the various tables. If anyone feels strongly against that for some reason, please let me know. I will save that particular change for my final steps with this issue. Note that I think there are some naming discrepancies among the Table/Keyspace metrics where the table and keyspace naming don't match for what I believe is the same metric: * Table.SyncTime == Keyspace.RepairSyncTime * Table.RepairedDataTrackingOverreadRows == Keyspace.RepairedOverreadRows * Table.RepairedDataTrackingOverreadTime == Keyspace.RepairedOverreadTime * Table.AllMemtablesHeapSize == Keyspace.AllMemtablesOnHeapDataSize * Table.AllMemtablesOffHeapSize == Keyspace.AllMemtablesOffHeapDataSize * Table.MemtableOnHeapSize == Keyspace.MemtableOnHeapDataSize * Table.MemtableOffHeapSize == Keyspace.MemtableOffHeapDataSize I've taken the liberty of documenting these items differently for now, though I think it would be preferable to make the naming consistent (for which I could create another ticket). Unless there are objections to doing so, I will proceed in that fashion. I'm happy to hear any feedback - thanks! > Metrics Documentation Enhancements > -- > > Key: CASSANDRA-15821 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15821 > Project: Cassandra > Issue Type: Improvement > Components: Documentation/Website >Reporter: Stephen Mallette >Assignee: Stephen Mallette >Priority: Normal > > CASSANDRA-15582 involves quality around metrics and it was mentioned that > reviewing and [improving > documentation|https://github.com/apache/cassandra/blob/trunk/doc/source/operating/metrics.rst] > around metrics would fall into that scope. Please consider some of this > analysis in determining what improvements to make here: > Please see [this > spreadsheet|https://docs.google.com/spreadsheets/d/1iPWfCMIG75CI6LbYuDtCTjEOvZw-5dyH-e08bc63QnI/edit?usp=sharing] > that itemizes almost all of cassandra's metrics and whether they are > documented or not (and other notes). That spreadsheet is "almost all" > because there are some metrics that don't seem to initialize as part of > Cassandra startup (I was able to trigger some to initialize, but not all were > immediately obvious). 
The missing metrics seem to be related to the following: > * ThreadPool metrics - only some initialize at startup the list of which > follow below > * Streaming Metrics > * HintedHandoff Metrics > * HintsService Metrics > Here are the ThreadPool scopes that get listed: > {code} > AntiEntropyStage > CacheCleanupExecutor > CompactionExecutor > GossipStage > HintsDispatcher > MemtableFlushWriter > MemtablePostFlush > MemtableReclaimMemory > MigrationStage > MutationStage > Native-Transport-Requests > PendingRangeCalculator > PerDiskMemtableFlushWriter_0 > ReadStage > Repair-Task > RequestResponseStage > Sampler > SecondaryIndexManagement > ValidationExecutor > ViewBuildExecutor > {code} > I noticed that Keyspace Metrics have this note: "Most of these metrics are > the same as the Table Metrics above, only they are aggregated at the Keyspace > level." I think I've isolated those metrics on table that are not on keyspace > to specifically be: > {code} > BloomFilterFalsePositives > BloomFilterFalseRatio > BytesAnticompacted > BytesFlushed > BytesMutatedAnticompaction > BytesPendingRepair > BytesRepaired > BytesUnrepaired > CompactionBytesWritten > CompressionRatio > CoordinatorReadLatency > CoordinatorScanLatency > CoordinatorWriteLatency > EstimatedColumnCountHistogram > EstimatedPartitionCount > EstimatedPartitionSizeHistogram > KeyCacheHitRate > LiveSSTableCount > MaxPartitionSize > MeanPartitionSize > MinPartitionSize > MutatedAnticompactionGauge > PercentRepaired > RowCacheHitOutOfRange > RowCacheHit > RowCacheMiss > SpeculativeSampleLatencyNanos > SyncTime > WaitingOnFreeMemtableSpace > DroppedMutations > {code} > Someone with greater knowledge of this area might consider it worth the > effort to see if any of these metrics should be aggregated to the keyspace > level in case they were inadvertently missed. In any case, perhaps the > documentation could easily now reflect which metric names could be expected > on Keyspace. > The
[jira] [Comment Edited] (CASSANDRA-15826) Jenkins dtests don't need to be category throttled any longer
[ https://issues.apache.org/jira/browse/CASSANDRA-15826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17116949#comment-17116949 ] Michael Semb Wever edited comment on CASSANDRA-15826 at 5/26/20, 6:15 PM: -- Committed [e690e2985f3fcba6ab919c1e81b4e4111d85c2d1 |https://github.com/apache/cassandra/commit/e690e2985f3fcba6ab919c1e81b4e4111d85c2d1] and [e883522c513978405b917837b4953a6516f02599 |https://github.com/apache/cassandra-builds/commit/e883522c513978405b917837b4953a6516f02599]. was (Author: michaelsembwever): Committed [e883522c513978405b917837b4953a6516f02599 |https://github.com/apache/cassandra/commit/e690e2985f3fcba6ab919c1e81b4e4111d85c2d1 and [e883522c513978405b917837b4953a6516f02599 |https://github.com/apache/cassandra-builds/commit/e883522c513978405b917837b4953a6516f02599]. > Jenkins dtests don't need to be category throttled any longer > - > > Key: CASSANDRA-15826 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15826 > Project: Cassandra > Issue Type: Task > Components: CI >Reporter: Michael Semb Wever >Assignee: Michael Semb Wever >Priority: Normal > Fix For: 4.0, 4.0-alpha5 > > > Remove the "Cassandra" category throttle on dtests. > Add docs section on configuring the Jenkins master to create the "Cassandra" > category throttle. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15826) Jenkins dtests don't need to be category throttled any longer
[ https://issues.apache.org/jira/browse/CASSANDRA-15826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Semb Wever updated CASSANDRA-15826: --- Fix Version/s: (was: 4.0-beta) 4.0-alpha5 4.0 Source Control Link: https://github.com/apache/cassandra/commit/e690e2985f3fcba6ab919c1e81b4e4111d85c2d1 and https://github.com/apache/cassandra-builds/commit/e883522c513978405b917837b4953a6516f02599 Resolution: Fixed Status: Resolved (was: Ready to Commit) Committed [e883522c513978405b917837b4953a6516f02599 |https://github.com/apache/cassandra/commit/e690e2985f3fcba6ab919c1e81b4e4111d85c2d1 and [e883522c513978405b917837b4953a6516f02599 |https://github.com/apache/cassandra-builds/commit/e883522c513978405b917837b4953a6516f02599]. > Jenkins dtests don't need to be category throttled any longer > - > > Key: CASSANDRA-15826 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15826 > Project: Cassandra > Issue Type: Task > Components: CI >Reporter: Michael Semb Wever >Assignee: Michael Semb Wever >Priority: Normal > Fix For: 4.0, 4.0-alpha5 > > > Remove the "Cassandra" category throttle on dtests. > Add docs section on configuring the Jenkins master to create the "Cassandra" > category throttle. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[cassandra] branch trunk updated: Add docs section on configuring the Jenkins master to create the "Cassandra" category throttle.
This is an automated email from the ASF dual-hosted git repository. mck pushed a commit to branch trunk in repository https://gitbox.apache.org/repos/asf/cassandra.git The following commit(s) were added to refs/heads/trunk by this push: new e690e29 Add docs section on configuring the Jenkins master to create the "Cassandra" category throttle. e690e29 is described below commit e690e2985f3fcba6ab919c1e81b4e4111d85c2d1 Author: Mick Semb Wever AuthorDate: Thu May 21 12:58:08 2020 +0200 Add docs section on configuring the Jenkins master to create the "Cassandra" category throttle. patch by Mick Semb Wever; reviewed by David Capwell for CASSANDRA-15826 --- doc/source/development/ci.rst | 7 +++ 1 file changed, 7 insertions(+) diff --git a/doc/source/development/ci.rst b/doc/source/development/ci.rst index 77360ae..a6620b3 100644 --- a/doc/source/development/ci.rst +++ b/doc/source/development/ci.rst @@ -48,6 +48,13 @@ Go to ``Manage Jenkins -> Manage Plugins -> Available`` and install the followin * Hudson Post build task +Configure Throttle Category +--- + +Builds that are not containerized (e.g. cqlshlib tests and in-jvm dtests) use local resources for Cassandra (ccm). To prevent these builds running concurrently the ``Cassandra`` throttle category needs to be created. + +This is done under ``Manage Jenkins -> System Configuration -> Throttle Concurrent Builds``. Enter "Cassandra" for the ``Category Name`` and "1" for ``Maximum Concurrent Builds Per Node``. + Setup seed job -- - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[cassandra-builds] branch master updated: In Jenkins the dtests don't need to be throttled to one concurrent run per node, as they are dockerised now. This means, in PostBuildTask, only older docker
This is an automated email from the ASF dual-hosted git repository. mck pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/cassandra-builds.git The following commit(s) were added to refs/heads/master by this push: new e883522 In Jenkins the dtests don't need to be throttled to one concurrent run per node, as they are dockerised now. This means, in PostBuildTask, only older docker resources can be pruned (as `docker cp …` is used), and use `git clean -xdff` to clean the project. Also enable concurrent builds on all non-pipeline builds, as builds can run concurrently on different nodes. e883522 is described below commit e883522c513978405b917837b4953a6516f02599 Author: mck AuthorDate: Wed May 20 18:34:04 2020 +0200 In Jenkins the dtests don't need to be throttled to one concurrent run per node, as they are dockerised now. This means, in PostBuildTask, only older docker resources can be pruned (as `docker cp …` is used), and use `git clean -xdff` to clean the project. Also enable concurrent builds on all non-pipeline builds, as builds can run concurrently on different nodes. patch by Mick Semb Wever; reviewed by David Capwell for CASSANDRA-15826 --- jenkins-dsl/cassandra_job_dsl_seed.groovy | 47 +-- 1 file changed, 20 insertions(+), 27 deletions(-) diff --git a/jenkins-dsl/cassandra_job_dsl_seed.groovy b/jenkins-dsl/cassandra_job_dsl_seed.groovy index f862136..65f2b63 100644 --- a/jenkins-dsl/cassandra_job_dsl_seed.groovy +++ b/jenkins-dsl/cassandra_job_dsl_seed.groovy @@ -74,6 +74,7 @@ if(binding.hasVariable("CASSANDRA_DOCKER_IMAGE")) { job('Cassandra-template-artifacts') { disabled(true) description(jobDescription) +concurrentBuild() jdk(jdkLabel) label(slaveLabel) compressBuildLog() @@ -131,8 +132,8 @@ job('Cassandra-template-artifacts') { } postBuildTask { task('.', ''' -echo "Cleaning project…"; ant realclean; -echo "Pruning docker…" ; docker system prune -f --volumes ; +echo "Cleaning project…"; git clean -xdff ; +echo "Pruning docker…" ; docker system prune -f --filter "until=48h" ; echo "Reporting disk usage…"; df -h ; find . -maxdepth 2 -type d -exec du -hs {} ';' ; du -hs ../* ; echo "Cleaning tmp…"; find . -type d -name tmp -delete 2>/dev/null ; @@ -148,6 +149,7 @@ job('Cassandra-template-artifacts') { job('Cassandra-template-test') { disabled(true) description(jobDescription) +concurrentBuild() jdk(jdkLabel) label(slaveLabel) compressBuildLog() @@ -193,8 +195,8 @@ job('Cassandra-template-test') { postBuildTask { task('.', ''' echo "Finding job process orphans…"; if pgrep -af ${JOB_BASE_NAME}; then pkill -9 -f ${JOB_BASE_NAME}; fi; -echo "Cleaning project…"; ant realclean; -echo "Pruning docker…" ; docker system prune -f --volumes ; +echo "Cleaning project…"; git clean -xdff ; +echo "Pruning docker…" ; docker system prune -f --filter "until=48h" ; echo "Reporting disk usage…"; df -h ; find . -maxdepth 2 -type d -exec du -hs {} ';' ; du -hs ../* ; echo "Cleaning tmp…"; find . 
-type d -name tmp -delete 2>/dev/null ; @@ -210,6 +212,7 @@ job('Cassandra-template-test') { job('Cassandra-template-dtest') { disabled(true) description(jobDescription) +concurrentBuild() jdk(jdkLabel) label(slaveLabel) compressBuildLog() @@ -223,9 +226,6 @@ job('Cassandra-template-dtest') { } timestamps() } -throttleConcurrentBuilds { -categories(['Cassandra']) -} scm { git { remote { @@ -254,9 +254,8 @@ job('Cassandra-template-dtest') { } postBuildTask { task('.', ''' -echo "Finding job process orphans…"; if pgrep -af ${JOB_BASE_NAME}; then pkill -9 -f ${JOB_BASE_NAME}; fi; -echo "Cleaning project…"; ant realclean; -echo "Pruning docker…" ; docker system prune -f --volumes ; +echo "Cleaning project…"; git clean -xdff ; +echo "Pruning docker…" ; if pgrep -af jenkinscommand.sh; then system prune -f --filter 'until=48h'; else docker system prune -f --volumes ; fi; echo "Reporting disk usage…"; df -h ; find . -maxdepth 2 -type d -exec du -hs {} ';' ; du -hs ../* ; echo "Cleaning tmp…"; find . -type d -name tmp -delete 2>/dev/null ; @@ -272,6 +271,7 @@ job('Cassandra-template-dtest') { matrixJob('Cassandra-template-cqlsh-tests') { disabled(true) description(jobDescription) +concurrentBuild() compressBuildLog() logRotator { numToKeep(25) @@ -322,8 +322,8 @@ matrixJob('Cass
[jira] [Updated] (CASSANDRA-15826) Jenkins dtests don't need to be category throttled any longer
[ https://issues.apache.org/jira/browse/CASSANDRA-15826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Capwell updated CASSANDRA-15826: -- Status: Ready to Commit (was: Review In Progress) Doc changes LGTM. Build changes make it so artifacts + dtests are not throttled but everything else is. Speaking to Mick in Slack, some ant tasks don't need this, but it's outside of the scope of this JIRA. +1 > Jenkins dtests don't need to be category throttled any longer > - > > Key: CASSANDRA-15826 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15826 > Project: Cassandra > Issue Type: Task > Components: CI >Reporter: Michael Semb Wever >Assignee: Michael Semb Wever >Priority: Normal > Fix For: 4.0-beta > > > Remove the "Cassandra" category throttle on dtests. > Add docs section on configuring the Jenkins master to create the "Cassandra" > category throttle. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15826) Jenkins dtests don't need to be category throttled any longer
[ https://issues.apache.org/jira/browse/CASSANDRA-15826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Capwell updated CASSANDRA-15826: -- Reviewers: David Capwell, David Capwell (was: David Capwell) David Capwell, David Capwell Status: Review In Progress (was: Patch Available) > Jenkins dtests don't need to be category throttled any longer > - > > Key: CASSANDRA-15826 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15826 > Project: Cassandra > Issue Type: Task > Components: CI >Reporter: Michael Semb Wever >Assignee: Michael Semb Wever >Priority: Normal > Fix For: 4.0-beta > > > Remove the "Cassandra" category throttle on dtests. > Add docs section on configuring the Jenkins master to create the "Cassandra" > category throttle. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15826) Jenkins dtests don't need to be category throttled any longer
[ https://issues.apache.org/jira/browse/CASSANDRA-15826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17116931#comment-17116931 ] Michael Semb Wever commented on CASSANDRA-15826: Some context… The new ASF CI jenkins did not have the category set up (it's something you had to do on the jenkins master). It used to work (e.g. I didn't need to manually hack the cqlshlib tests to not be concurrent on a node). So now that I understood what {{throttleConcurrentBuilds}} was for, and had set up the category in the master, I could remove the cqlshlib hack. And since {{throttleConcurrentBuilds}} had been added, the dtests have been dockerised, so they no longer need to be throttled. Also, with so many agents, we need to enable concurrent builds {{concurrentBuild()}} on all the (non-pipeline) builds. (Concurrent builds means concurrent over the whole jenkins cluster.) > Jenkins dtests don't need to be category throttled any longer > - > > Key: CASSANDRA-15826 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15826 > Project: Cassandra > Issue Type: Task > Components: CI >Reporter: Michael Semb Wever >Assignee: Michael Semb Wever >Priority: Normal > Fix For: 4.0-beta > > > Remove the "Cassandra" category throttle on dtests. > Add docs section on configuring the Jenkins master to create the "Cassandra" > category throttle. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15584) 4.0 quality testing: Tooling - External Ecosystem
[ https://issues.apache.org/jira/browse/CASSANDRA-15584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] scott hendrickson updated CASSANDRA-15584: -- Authors: Sam Tunnicliffe, scott hendrickson (was: Sam Tunnicliffe) > 4.0 quality testing: Tooling - External Ecosystem > - > > Key: CASSANDRA-15584 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15584 > Project: Cassandra > Issue Type: Task >Reporter: Josh McKenzie >Assignee: Sam Tunnicliffe >Priority: Normal > Fix For: 4.0-beta > > > Many users of Apache Cassandra employ open source tooling to automate > Cassandra configuration, runtime management, and repair scheduling. Prior to > release, we need to confirm that popular third-party tools such as Reaper, > Priam, etc. function properly. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15837) Enhance fqltool to be able to export the fql log into a format which doesn't depend on Cassandra
[ https://issues.apache.org/jira/browse/CASSANDRA-15837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Capwell updated CASSANDRA-15837: -- Change Category: Operability Complexity: Low Hanging Fruit Status: Open (was: Triage Needed) > Enhance fqltool to be able to export the fql log into a format which doesn't > depend on Cassandra > > > Key: CASSANDRA-15837 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15837 > Project: Cassandra > Issue Type: Improvement > Components: Tool/fql >Reporter: David Capwell >Assignee: David Capwell >Priority: Normal > > Currently the fql log format uses Cassandra serialization within the message, > which means that reading the file also requires Cassandra classes. To make it > easier for outside tools to read the fql logs we should enhance the fqltool > to be able to dump the logs to a file format using thrift or protobuf. > Additionally we should support exporting the original query with a > deterministic version allowing tools to have reproducible and comparable > results. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-15837) Enhance fqltool to be able to export the fql log into a format which doesn't depend on Cassandra
David Capwell created CASSANDRA-15837: - Summary: Enhance fqltool to be able to export the fql log into a format which doesn't depend on Cassandra Key: CASSANDRA-15837 URL: https://issues.apache.org/jira/browse/CASSANDRA-15837 Project: Cassandra Issue Type: Improvement Components: Tool/fql Reporter: David Capwell Assignee: David Capwell Currently the fql log format uses Cassandra serialization within the message, which means that reading the file also requires Cassandra classes. To make it easier for outside tools to read the fql logs we should enhance the fqltool to be able to dump the logs to a file format using thrift or protobuf. Additionally we should support exporting the original query with a deterministic version allowing tools to have reproducible and comparable results. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15583) 4.0 quality testing: Tooling, Bundled and First Party
[ https://issues.apache.org/jira/browse/CASSANDRA-15583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] scott hendrickson updated CASSANDRA-15583: -- Reviewers: scott hendrickson, Vinay Chella (was: Vinay Chella) > 4.0 quality testing: Tooling, Bundled and First Party > - > > Key: CASSANDRA-15583 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15583 > Project: Cassandra > Issue Type: Task > Components: Test/dtest >Reporter: Josh McKenzie >Assignee: Sam Tunnicliffe >Priority: Normal > Fix For: 4.0-beta > > > Test plans should cover bundled first-party tooling and CLIs such as > nodetool, cqlsh, and new tools supporting full query and audit logging > (CASSANDRA-13983, CASSANDRA-12151). -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15810) Default StringTableSize parameter causes GC slowdown
[ https://issues.apache.org/jira/browse/CASSANDRA-15810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tom van der Woerdt updated CASSANDRA-15810: --- Change Category: Performance Complexity: Low Hanging Fruit Component/s: Local/Config > Default StringTableSize parameter causes GC slowdown > > > Key: CASSANDRA-15810 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15810 > Project: Cassandra > Issue Type: Improvement > Components: Local/Config >Reporter: Tom van der Woerdt >Priority: Normal > > While looking at tail latency on a Cassandra cluster, it came up that the > default StringTableSize in Cassandra is set to a million: > {code:java} > # Larger interned string table, for gossip's benefit (CASSANDRA-6410) > -XX:StringTableSize=1000003{code} > This was done for CASSANDRA-6410 by [~jbellis] in '13, to optimize heap usage > on a test case, running with 500 nodes and num_tokens=512. > Until Java 13, this string table is implemented as native code, and has to be > traversed entirely during the GC initial marking phase, which is a STW event. > Some testing on my end shows that the pause time of a GC cycle can be reduced > by approximately 10 milliseconds if we lower the string table size back to > the Java 8 default of 60013 entries. > Thus, I would recommend this patch (3.11 branch, similar patch for 4.0): > {code:java} > diff --git a/conf/jvm.options b/conf/jvm.options > index 01bb1685b3..c184d18c5d 100644 > --- a/conf/jvm.options > +++ b/conf/jvm.options > @@ -107,9 +107,6 @@ > # Per-thread stack size. > -Xss256k > -# Larger interned string table, for gossip's benefit (CASSANDRA-6410) > --XX:StringTableSize=1000003 > - > # Make sure all memory is faulted and zeroed on startup. > # This helps prevent soft faults in containers and makes > # transparent hugepage allocation more effective. > {code} > It does need some testing on more extreme clusters than I have access to, but > I ran some Cassandra nodes with {{-XX:+PrintStringTableStatistics}} which > suggested that the Java default will suffice here. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15836) nodetool garbagecollect could leave LCS level intact
[ https://issues.apache.org/jira/browse/CASSANDRA-15836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tom van der Woerdt updated CASSANDRA-15836: --- Component/s: Local/Compaction > nodetool garbagecollect could leave LCS level intact > > > Key: CASSANDRA-15836 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15836 > Project: Cassandra > Issue Type: Improvement > Components: Local/Compaction >Reporter: Tom van der Woerdt >Priority: Normal > > The nodetool command `garbagecollect` will run a single-sstable compaction > for every sstable in a cf, while using other sstables on the side to allow > for dropping tombstoned data. However, in doing so, it resets all LCS levels > back to 0, causing a significant write amplification. > Fundamentally there's no reason why LCS levels have to be changed here, since > these are single-sstable compactions. When the expected reduction in data set > size is small (say, 10%) it may be preferable for sstables to remain in place > instead of having to go through the entire compaction process again. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-15836) nodetool garbagecollect could leave LCS level intact
Tom van der Woerdt created CASSANDRA-15836: -- Summary: nodetool garbagecollect could leave LCS level intact Key: CASSANDRA-15836 URL: https://issues.apache.org/jira/browse/CASSANDRA-15836 Project: Cassandra Issue Type: Improvement Reporter: Tom van der Woerdt The nodetool command `garbagecollect` will run a single-sstable compaction for every sstable in a cf, while using other sstables on the side to allow for dropping tombstoned data. However, in doing so, it resets all LCS levels back to 0, causing a significant write amplification. Fundamentally there's no reason why LCS levels have to be changed here, since these are single-sstable compactions. When the expected reduction in data set size is small (say, 10%) it may be preferable for sstables to remain in place instead of having to go through the entire compaction process again. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15812) Submitting Validation requests can block ANTI_ENTROPY stage
[ https://issues.apache.org/jira/browse/CASSANDRA-15812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Lerer updated CASSANDRA-15812: --- Reviewers: Benjamin Lerer, Benjamin Lerer (was: Benjamin Lerer) Benjamin Lerer, Benjamin Lerer (was: Benjamin Lerer) Status: Review In Progress (was: Patch Available) > Submitting Validation requests can block ANTI_ENTROPY stage > > > Key: CASSANDRA-15812 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15812 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Repair >Reporter: Sam Tunnicliffe >Assignee: Sam Tunnicliffe >Priority: Normal > Fix For: 4.0-alpha > > > RepairMessages are handled on Stage.ANTI_ENTROPY, which has a thread pool > with core/max capacity of one, ie. we can only process one message at a time. > > Scheduling validation compactions may however block the stage completely, by > blocking on CompactionManager's ValidationExecutor while submitting a new > validation compaction, in cases where there are already more validations > running than can be executed in parallel. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Assigned] (CASSANDRA-15825) Fix flaky test incrementalSSTableSelection - org.apache.cassandra.db.streaming.CassandraStreamManagerTest
[ https://issues.apache.org/jira/browse/CASSANDRA-15825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Berenguer Blasi reassigned CASSANDRA-15825: --- Assignee: Berenguer Blasi (was: Benjamin Lerer) > Fix flaky test incrementalSSTableSelection - > org.apache.cassandra.db.streaming.CassandraStreamManagerTest > - > > Key: CASSANDRA-15825 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15825 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: David Capwell >Assignee: Berenguer Blasi >Priority: Normal > Fix For: 4.0-alpha > > > Build link: > https://app.circleci.com/pipelines/github/dcapwell/cassandra/287/workflows/06baf3db-7094-431f-920d-e8fcd1da9cce/jobs/1398 > > {code} > java.lang.RuntimeException: java.nio.file.NoSuchFileException: > /tmp/cassandra/build/test/cassandra/data:2/ks_1589913975959/tbl-051c0a709a0111eab5fb6f52366536f8/na-4-big-Statistics.db > at > org.apache.cassandra.io.util.ChannelProxy.openChannel(ChannelProxy.java:55) > at > org.apache.cassandra.io.util.ChannelProxy.(ChannelProxy.java:66) > at > org.apache.cassandra.io.util.RandomAccessReader.open(RandomAccessReader.java:315) > at > org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:126) > at > org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:136) > at > org.apache.cassandra.io.sstable.format.SSTableReader.reloadSSTableMetadata(SSTableReader.java:2047) > at > org.apache.cassandra.db.streaming.CassandraStreamManagerTest.mutateRepaired(CassandraStreamManagerTest.java:128) > at > org.apache.cassandra.db.streaming.CassandraStreamManagerTest.incrementalSSTableSelection(CassandraStreamManagerTest.java:175) > Caused by: java.nio.file.NoSuchFileException: > /tmp/cassandra/build/test/cassandra/data:2/ks_1589913975959/tbl-051c0a709a0111eab5fb6f52366536f8/na-4-big-Statistics.db > at > sun.nio.fs.UnixException.translateToIOException(UnixException.java:86) > at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) > at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) > at > sun.nio.fs.UnixFileSystemProvider.newFileChannel(UnixFileSystemProvider.java:177) > at java.nio.channels.FileChannel.open(FileChannel.java:287) > at java.nio.channels.FileChannel.open(FileChannel.java:335) > at > org.apache.cassandra.io.util.ChannelProxy.openChannel(ChannelProxy.java:51) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15835) Upgrade-dtests on trunk not working in CircleCI
[ https://issues.apache.org/jira/browse/CASSANDRA-15835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Stupp updated CASSANDRA-15835: - Bug Category: Parent values: Correctness(12982)Level 1 values: Test Failure(12990) Complexity: Low Hanging Fruit Discovered By: User Report Severity: Normal Status: Open (was: Triage Needed) > Upgrade-dtests on trunk not working in CircleCI > --- > > Key: CASSANDRA-15835 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15835 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest >Reporter: Robert Stupp >Assignee: Robert Stupp >Priority: Normal > Fix For: 4.0-alpha > > > ~3600 Upgrade-dtests are failing in CircleCI for trunk due to the missing > {{JAVA8_HOME}} environment variable. > Patching the Docker image is rather simple by creating a new image: > {code} > FROM nastra/cassandra-testing-ubuntu1910-java11-w-dependencies:20200406 > ENV JAVA8_HOME=/usr/lib/jvm/java-8-openjdk-amd64 > {code} > Pushed the above to Docker hub as > [snazy/cassandra-testing-ubuntu1910-java11-w-dependencies:202005261540|https://hub.docker.com/layers/snazy/cassandra-testing-ubuntu1910-java11-w-dependencies/202005261540/images/sha256-ac8a713be58694f095c491921e006c2d1a7823a3c23299e477198e2c93a6bbd7?context=explore] > The size of the whole Docker image is a little concerning though (1.85G > compressed), but that's out of the scope of this ticket. > I'll prepare a patch soon-ish. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15835) Upgrade-dtests on trunk not working in CircleCI
[ https://issues.apache.org/jira/browse/CASSANDRA-15835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Stupp updated CASSANDRA-15835: - Fix Version/s: 4.0-alpha > Upgrade-dtests on trunk not working in CircleCI > --- > > Key: CASSANDRA-15835 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15835 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest >Reporter: Robert Stupp >Assignee: Robert Stupp >Priority: Normal > Fix For: 4.0-alpha > > > ~3600 Upgrade-dtests are failing in CircleCI for trunk due to the missing > {{JAVA8_HOME}} environment variable. > Patching the Docker image is rather simple by creating a new image: > {code} > FROM nastra/cassandra-testing-ubuntu1910-java11-w-dependencies:20200406 > ENV JAVA8_HOME=/usr/lib/jvm/java-8-openjdk-amd64 > {code} > Pushed the above to Docker hub as > [snazy/cassandra-testing-ubuntu1910-java11-w-dependencies:202005261540|https://hub.docker.com/layers/snazy/cassandra-testing-ubuntu1910-java11-w-dependencies/202005261540/images/sha256-ac8a713be58694f095c491921e006c2d1a7823a3c23299e477198e2c93a6bbd7?context=explore] > The size of the whole Docker image is a little concerning though (1.85G > compressed), but that's out of the scope of this ticket. > I'll prepare a patch soon-ish. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-15835) Upgrade-dtests on trunk not working in CircleCI
Robert Stupp created CASSANDRA-15835: Summary: Upgrade-dtests on trunk not working in CircleCI Key: CASSANDRA-15835 URL: https://issues.apache.org/jira/browse/CASSANDRA-15835 Project: Cassandra Issue Type: Bug Components: Test/dtest Reporter: Robert Stupp Assignee: Robert Stupp ~3600 Upgrade-dtests are failing in CircleCI for trunk due to the missing {{JAVA8_HOME}} environment variable. Patching the Docker image is rather simple by creating a new image: {code} FROM nastra/cassandra-testing-ubuntu1910-java11-w-dependencies:20200406 ENV JAVA8_HOME=/usr/lib/jvm/java-8-openjdk-amd64 {code} Pushed the above to Docker hub as [snazy/cassandra-testing-ubuntu1910-java11-w-dependencies:202005261540|https://hub.docker.com/layers/snazy/cassandra-testing-ubuntu1910-java11-w-dependencies/202005261540/images/sha256-ac8a713be58694f095c491921e006c2d1a7823a3c23299e477198e2c93a6bbd7?context=explore] The size of the whole Docker image is a little concerning though (1.85G compressed), but that's out of the scope of this ticket. I'll prepare a patch soon-ish. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15834) Bloom filter false positive rate calculation does not take into account true negatives
[ https://issues.apache.org/jira/browse/CASSANDRA-15834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jaroslaw Grabowski updated CASSANDRA-15834: --- Test and Documentation Plan: https://github.com/apache/cassandra-dtest/pull/70 https://github.com/apache/cassandra/pull/600 Status: Patch Available (was: Open) > Bloom filter false positive rate calculation does not take into account true > negatives > -- > > Key: CASSANDRA-15834 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15834 > Project: Cassandra > Issue Type: Bug > Components: Observability/JMX >Reporter: Jaroslaw Grabowski >Assignee: Jaroslaw Grabowski >Priority: Normal > > The bloom filter false positive ratio is [currently > computed|https://github.com/apache/cassandra/blob/ded62076e7fdfd1cfdcf96447489ea607ca796a0/src/java/org/apache/cassandra/metrics/TableMetrics.java#L738] > as: > {{bf_fp_ratio = false_positive_count / (false_positive_count + > true_positive_count)}} > However, this calculation doesn't take into account true negatives (false > negatives never happen on bloom filters). > In a situation where there are 1000 reads for non existing rows, and there > are 10 false positives, the bloom filter false positive ratio will be wrongly > calculated as 10/10 = 1.0, while it should be 10/1000 = 0.01. > We should update the calculation to: > {{bf_fp_ratio = false_positive_count / #bf_queries}} > Original jira by [~pauloricardomg] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
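For context, a quick worked example of the two formulas, using the numbers from the description (1000 reads of non-existent rows, of which the filter wrongly lets 10 through). This is an illustrative sketch only; the counter names below are made up and are not the actual fields in TableMetrics.
{code:java}
public class BloomFilterRatioDemo
{
    public static void main(String[] args)
    {
        long truePositives = 0;     // none of the 1000 reads hit an existing row
        long falsePositives = 10;   // filter answered "maybe present" 10 times
        long totalQueries = 1000;   // every read consulted the bloom filter

        // Current calculation: true negatives are ignored entirely.
        double current = (double) falsePositives / (falsePositives + truePositives);

        // Proposed calculation: false positives over all bloom filter queries.
        double proposed = (double) falsePositives / totalQueries;

        System.out.printf("current: %.2f, proposed: %.2f%n", current, proposed); // 1.00 vs 0.01
    }
}
{code}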
[jira] [Updated] (CASSANDRA-15834) Bloom filter false positive rate calculation does not take into account true negatives
[ https://issues.apache.org/jira/browse/CASSANDRA-15834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jaroslaw Grabowski updated CASSANDRA-15834: --- Bug Category: Parent values: Correctness(12982) Complexity: Normal Component/s: Observability/JMX Discovered By: Code Inspection Severity: Normal Status: Open (was: Triage Needed) > Bloom filter false positive rate calculation does not take into account true > negatives > -- > > Key: CASSANDRA-15834 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15834 > Project: Cassandra > Issue Type: Bug > Components: Observability/JMX >Reporter: Jaroslaw Grabowski >Assignee: Jaroslaw Grabowski >Priority: Normal > > The bloom filter false positive ratio is [currently > computed|https://github.com/apache/cassandra/blob/ded62076e7fdfd1cfdcf96447489ea607ca796a0/src/java/org/apache/cassandra/metrics/TableMetrics.java#L738] > as: > {{bf_fp_ratio = false_positive_count / (false_positive_count + > true_positive_count)}} > However, this calculation doesn't take into account true negatives (false > negatives never happen on bloom filters). > In a situation where there are 1000 reads for non existing rows, and there > are 10 false positives, the bloom filter false positive ratio will be wrongly > calculated as 10/10 = 1.0, while it should be 10/1000 = 0.01. > We should update the calculation to: > {{bf_fp_ratio = false_positive_count / #bf_queries}} > Original jira by [~pauloricardomg] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Assigned] (CASSANDRA-15834) Bloom filter false positive rate calculation does not take into account true negatives
[ https://issues.apache.org/jira/browse/CASSANDRA-15834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jaroslaw Grabowski reassigned CASSANDRA-15834: -- Assignee: Jaroslaw Grabowski > Bloom filter false positive rate calculation does not take into account true > negatives > -- > > Key: CASSANDRA-15834 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15834 > Project: Cassandra > Issue Type: Bug >Reporter: Jaroslaw Grabowski >Assignee: Jaroslaw Grabowski >Priority: Normal > > The bloom filter false positive ratio is [currently > computed|https://github.com/apache/cassandra/blob/ded62076e7fdfd1cfdcf96447489ea607ca796a0/src/java/org/apache/cassandra/metrics/TableMetrics.java#L738] > as: > {{bf_fp_ratio = false_positive_count / (false_positive_count + > true_positive_count)}} > However, this calculation doesn't take into account true negatives (false > negatives never happen on bloom filters). > In a situation where there are 1000 reads for non existing rows, and there > are 10 false positives, the bloom filter false positive ratio will be wrongly > calculated as 10/10 = 1.0, while it should be 10/1000 = 0.01. > We should update the calculation to: > {{bf_fp_ratio = false_positive_count / #bf_queries}} > Original jira by [~pauloricardomg] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15834) Bloom filter false positive rate calculation does not take into account true negatives
[ https://issues.apache.org/jira/browse/CASSANDRA-15834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jaroslaw Grabowski updated CASSANDRA-15834: --- Description: The bloom filter false positive ratio is [currently computed|https://github.com/apache/cassandra/blob/ded62076e7fdfd1cfdcf96447489ea607ca796a0/src/java/org/apache/cassandra/metrics/TableMetrics.java#L738] as: {{bf_fp_ratio = false_positive_count / (false_positive_count + true_positive_count)}} However, this calculation doesn't take into account true negatives (false negatives never happen on bloom filters). In a situation where there are 1000 reads for non existing rows, and there are 10 false positives, the bloom filter false positive ratio will be wrongly calculated as 10/10 = 1.0, while it should be 10/1000 = 0.01. We should update the calculation to: {{bf_fp_ratio = false_positive_count / #bf_queries}} Original jira by [~pauloricardomg] was: The bloom filter false positive ratio is [currently computed|https://github.com/apache/cassandra/blob/ded62076e7fdfd1cfdcf96447489ea607ca796a0/src/java/org/apache/cassandra/metrics/TableMetrics.java#L738] as: {{bf_fp_ratio = false_positive_count / (false_positive_count + true_positive_count)}} However, this calculation doesn't take into account true negatives (false negatives never happen on bloom filters). In a situation where there are 1000 reads for non existing rows, and there are 10 false positives, the bloom filter false positive ratio will be wrongly calculated as 10/10 = 1.0, while it should be 10/1000 = 0.01. We should update the calculation to: {{bf_fp_ratio = false_positive_count / #bf_queries}} > Bloom filter false positive rate calculation does not take into account true > negatives > -- > > Key: CASSANDRA-15834 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15834 > Project: Cassandra > Issue Type: Bug >Reporter: Jaroslaw Grabowski >Priority: Normal > > The bloom filter false positive ratio is [currently > computed|https://github.com/apache/cassandra/blob/ded62076e7fdfd1cfdcf96447489ea607ca796a0/src/java/org/apache/cassandra/metrics/TableMetrics.java#L738] > as: > {{bf_fp_ratio = false_positive_count / (false_positive_count + > true_positive_count)}} > However, this calculation doesn't take into account true negatives (false > negatives never happen on bloom filters). > In a situation where there are 1000 reads for non existing rows, and there > are 10 false positives, the bloom filter false positive ratio will be wrongly > calculated as 10/10 = 1.0, while it should be 10/1000 = 0.01. > We should update the calculation to: > {{bf_fp_ratio = false_positive_count / #bf_queries}} > Original jira by [~pauloricardomg] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14361) Allow SimpleSeedProvider to resolve multiple IPs per DNS name
[ https://issues.apache.org/jira/browse/CASSANDRA-14361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17116696#comment-17116696 ] Andres de la Peña commented on CASSANDRA-14361: --- The rebase looks perfect. Indeed it seems that the option to fall back to the old behaviour is neither needed nor desirable. Anyway, I still think we should add a brief entry in {{NEWS.txt}} about the new behaviour. I have created [a PR|https://github.com/apache/cassandra/pull/599] from your branch and left a couple of very minor suggestions, besides the one about moving the property to {{cassandra.yaml}}. > Allow SimpleSeedProvider to resolve multiple IPs per DNS name > - > > Key: CASSANDRA-14361 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14361 > Project: Cassandra > Issue Type: Improvement > Components: Local/Config >Reporter: Ben Bromhead >Assignee: Ben Bromhead >Priority: Low > Fix For: 4.0 > > > Currently SimpleSeedProvider can accept a comma-separated string of IPs or > hostnames as the set of Cassandra seeds. Hostnames are resolved via > InetAddress.getByName, which will only return the first IP associated with an > A or CNAME record. > By changing to InetAddress.getAllByName, existing behavior is preserved, but > now Cassandra can discover multiple IP addresses per record, allowing seed > discovery by DNS to be a little easier. > Some examples of improved workflows with this change include: > * specifying the DNS name of a headless service in Kubernetes, which will > resolve to all IP addresses of pods within that service. > * seed discovery for multi-region clusters via AWS Route 53, Azure DNS, etc. > * other common DNS service discovery mechanisms. > The only behavior this is likely to impact would be where users are relying > on the fact that getByName only returns a single IP address. > I can't imagine any scenario where that is a sane choice. Even when that > choice has been made, it only impacts the first startup of Cassandra and > would not be on any critical path. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
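For context, the behaviour change hinges on a single JDK call. The sketch below contrasts InetAddress.getByName with InetAddress.getAllByName; the seed hostname is hypothetical and this is not SimpleSeedProvider's actual code.
{code}
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Arrays;

// Minimal sketch of the behaviour difference (hypothetical hostname, not
// SimpleSeedProvider's actual code): getByName returns only the first address
// for a name, getAllByName returns all of them, e.g. every pod IP behind a
// Kubernetes headless service.
public class SeedResolutionExample
{
    public static void main(String[] args) throws UnknownHostException
    {
        String seedName = "cassandra-seeds.example.internal"; // hypothetical DNS name

        InetAddress first = InetAddress.getByName(seedName);
        InetAddress[] all = InetAddress.getAllByName(seedName);

        System.out.println("old behaviour, single seed: " + first.getHostAddress());
        System.out.println("new behaviour, all seeds:   " + Arrays.toString(all));
    }
}
{code}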
[jira] [Commented] (CASSANDRA-15783) test_optimized_primary_range_repair - transient_replication_test.TestTransientReplication
[ https://issues.apache.org/jira/browse/CASSANDRA-15783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17116681#comment-17116681 ] Ekaterina Dimitrova commented on CASSANDRA-15783: - Thanks [~mck]! > test_optimized_primary_range_repair - > transient_replication_test.TestTransientReplication > - > > Key: CASSANDRA-15783 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15783 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest >Reporter: Ekaterina Dimitrova >Assignee: Ekaterina Dimitrova >Priority: Normal > Fix For: 4.0, 4.0-alpha5 > > Time Spent: 20m > Remaining Estimate: 0h > > Dtest failure. > Example: > https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/118/workflows/9e57522d-52fa-4d44-88d8-5cec0e87f517/jobs/585/tests -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15725) Add support for adding custom Verbs
[ https://issues.apache.org/jira/browse/CASSANDRA-15725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcus Eriksson updated CASSANDRA-15725: Source Control Link: https://github.com/apache/cassandra/commit/19a8f4ea13eb844bc0387637f82da1da62991107 Resolution: Fixed Status: Resolved (was: Ready to Commit) And committed, thanks! > Add support for adding custom Verbs > --- > > Key: CASSANDRA-15725 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15725 > Project: Cassandra > Issue Type: Improvement > Components: Messaging/Internode >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson >Priority: Normal > Fix For: 4.0-alpha > > Attachments: feedback_15725.patch > > > It should be possible to safely add custom/internal Verbs - without risking > conflicts when new ones are added. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[cassandra] branch trunk updated: Add support for adding custom Verbs
This is an automated email from the ASF dual-hosted git repository. marcuse pushed a commit to branch trunk in repository https://gitbox.apache.org/repos/asf/cassandra.git The following commit(s) were added to refs/heads/trunk by this push: new 19a8f4e Add support for adding custom Verbs 19a8f4e is described below commit 19a8f4ea13eb844bc0387637f82da1da62991107 Author: Marcus Eriksson AuthorDate: Thu Apr 2 08:23:33 2020 +0200 Add support for adding custom Verbs Patch by marcuse; reviewed by Benedict Elliott Smith and David Capwell for CASSANDRA-15725 --- CHANGES.txt | 1 + src/java/org/apache/cassandra/net/Verb.java | 106 +-- test/unit/org/apache/cassandra/net/VerbTest.java | 33 +++ 3 files changed, 133 insertions(+), 7 deletions(-) diff --git a/CHANGES.txt b/CHANGES.txt index 1be10dd..2a35318 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 4.0-alpha5 + * Add support for adding custom Verbs (CASSANDRA-15725) * Speed up entire-file-streaming file containment check and allow entire-file-streaming for all compaction strategies (CASSANDRA-15657,CASSANDRA-15783) * Provide ability to configure IAuditLogger (CASSANDRA-15748) * Fix nodetool enablefullquerylog blocking param parsing (CASSANDRA-15819) diff --git a/src/java/org/apache/cassandra/net/Verb.java b/src/java/org/apache/cassandra/net/Verb.java index 67d847e..6ba9ab8 100644 --- a/src/java/org/apache/cassandra/net/Verb.java +++ b/src/java/org/apache/cassandra/net/Verb.java @@ -91,6 +91,7 @@ import static org.apache.cassandra.concurrent.Stage.*; import static org.apache.cassandra.concurrent.Stage.INTERNAL_RESPONSE; import static org.apache.cassandra.concurrent.Stage.MISC; import static org.apache.cassandra.net.VerbTimeouts.*; +import static org.apache.cassandra.net.Verb.Kind.*; import static org.apache.cassandra.net.Verb.Priority.*; import static org.apache.cassandra.schema.MigrationManager.MigrationsSerializer; @@ -185,6 +186,11 @@ public enum Verb INTERNAL_RSP (23, P1, rpcTimeout, INTERNAL_RESPONSE, () -> null, () -> ResponseVerbHandler.instance ), // largest used ID: 116 + +// CUSTOM VERBS +UNUSED_CUSTOM_VERB (CUSTOM, +0, P1, rpcTimeout, INTERNAL_RESPONSE, () -> null, () -> null ), + ; public static final List VERBS = ImmutableList.copyOf(Verb.values()); @@ -198,9 +204,16 @@ public enum Verb P4 } +public enum Kind +{ +NORMAL, +CUSTOM +} + public final int id; public final Priority priority; public final Stage stage; +public final Kind kind; /** * Messages we receive from peers have a Verb that tells us what kind of message it is. 
@@ -233,16 +246,38 @@ public enum Verb Verb(int id, Priority priority, ToLongFunction expiration, Stage stage, Supplier> serializer, Supplier> handler, Verb responseVerb) { +this(NORMAL, id, priority, expiration, stage, serializer, handler, responseVerb); +} + +Verb(Kind kind, int id, Priority priority, ToLongFunction expiration, Stage stage, Supplier> serializer, Supplier> handler) +{ +this(kind, id, priority, expiration, stage, serializer, handler, null); +} + +Verb(Kind kind, int id, Priority priority, ToLongFunction expiration, Stage stage, Supplier> serializer, Supplier> handler, Verb responseVerb) +{ this.stage = stage; if (id < 0) throw new IllegalArgumentException("Verb id must be non-negative, got " + id + " for verb " + name()); -this.id = id; +if (kind == CUSTOM) +{ +if (id > MAX_CUSTOM_VERB_ID) +throw new AssertionError("Invalid custom verb id " + id + " - we only allow custom ids between 0 and " + MAX_CUSTOM_VERB_ID); +this.id = idForCustomVerb(id); +} +else +{ +if (id > CUSTOM_VERB_START - MAX_CUSTOM_VERB_ID) +throw new AssertionError("Invalid verb id " + id + " - we only allow ids between 0 and " + (CUSTOM_VERB_START - MAX_CUSTOM_VERB_ID)); +this.id = id; +} this.priority = priority; this.serializer = serializer; this.handler = handler; this.responseVerb = responseVerb; this.expiration = expiration; +this.kind = kind; } public IVersionedAsymmetricSerializer serializer() @@ -319,33 +354,90 @@ public enum Verb return original; } +// This is the largest number we can store in 2 bytes using VIntCoding (1 bit per byte is used to indicate if there is more data coming). +// When generating ids we count *down* from this number +private static final int CUSTOM_VERB_START = (1
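For readers skimming the diff above, the custom-verb id scheme can be sanity-checked with a few lines of standalone Java. This only illustrates the arithmetic in the code comment, taking its description of VIntCoding at face value; the constant and the helper below are assumptions for illustration, not the committed Verb.java code.
{code}
// Standalone sanity check of the comment's arithmetic (assumptions, not the committed
// Verb.java constants): if VIntCoding spends 1 bit per byte as a continuation flag,
// 2 bytes leave 14 payload bits, and custom verb ids count down from that ceiling.
public class CustomVerbIdSketch
{
    static final int TWO_BYTE_VINT_MAX = (1 << 14) - 1; // 16383, assuming 7 payload bits per byte

    // Hypothetical stand-in for the idForCustomVerb mapping: place the n-th custom id at
    // the top of the id space so it cannot collide with normal ids counting up from 0.
    static int idForCustomVerb(int customId, int maxCustomVerbId)
    {
        if (customId < 0 || customId > maxCustomVerbId)
            throw new IllegalArgumentException("Invalid custom verb id " + customId);
        return TWO_BYTE_VINT_MAX - customId;
    }

    public static void main(String[] args)
    {
        System.out.println(TWO_BYTE_VINT_MAX);       // 16383
        System.out.println(idForCustomVerb(0, 64));  // 16383
        System.out.println(idForCustomVerb(64, 64)); // 16319
    }
}
{code}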
[jira] [Updated] (CASSANDRA-15834) Bloom filter false positive rate calculation does not take into account true negatives
[ https://issues.apache.org/jira/browse/CASSANDRA-15834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jaroslaw Grabowski updated CASSANDRA-15834: --- Since Version: 3.0.0 > Bloom filter false positive rate calculation does not take into account true > negatives > -- > > Key: CASSANDRA-15834 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15834 > Project: Cassandra > Issue Type: Bug >Reporter: Jaroslaw Grabowski >Priority: Normal > > The bloom filter false positive ratio is [currently > computed|https://github.com/apache/cassandra/blob/ded62076e7fdfd1cfdcf96447489ea607ca796a0/src/java/org/apache/cassandra/metrics/TableMetrics.java#L738] > as: > {{bf_fp_ratio = false_positive_count / (false_positive_count + > true_positive_count)}} > However, this calculation doesn't take into account true negatives (false > negatives never happen on bloom filters). > In a situation where there are 1000 reads for non existing rows, and there > are 10 false positives, the bloom filter false positive ratio will be wrongly > calculated as 10/10 = 1.0, while it should be 10/1000 = 0.01. > We should update the calculation to: > {{bf_fp_ratio = false_positive_count / #bf_queries}} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15834) Bloom filter false positive rate calculation does not take into account true negatives
[ https://issues.apache.org/jira/browse/CASSANDRA-15834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jaroslaw Grabowski updated CASSANDRA-15834: --- Description: The bloom filter false positive ratio is [currently computed|https://github.com/apache/cassandra/blob/ded62076e7fdfd1cfdcf96447489ea607ca796a0/src/java/org/apache/cassandra/metrics/TableMetrics.java#L738] as: {{bf_fp_ratio = false_positive_count / (false_positive_count + true_positive_count)}} However, this calculation doesn't take into account true negatives (false negatives never happen on bloom filters). In a situation where there are 1000 reads for non existing rows, and there are 10 false positives, the bloom filter false positive ratio will be wrongly calculated as 10/10 = 1.0, while it should be 10/1000 = 0.01. We should update the calculation to: {{bf_fp_ratio = false_positive_count / #bf_queries}} was: The bloom filter false positive ratio is [currently computed|https://github.com/apache/cassandra/blob/ded62076e7fdfd1cfdcf96447489ea607ca796a0/src/java/org/apache/cassandra/metrics/TableMetrics.java#L738] as: {\{bf_fp_ratio = false_positive_count / (false_positive_count + true_positive_count)}} However, this calculation doesn't take into account true negatives (false negatives never happen on bloom filters). In a situation where there are 1000 reads for non existing rows, and there are 10 false positives, the bloom filter false positive ratio will be wrongly calculated as 10/10 = 1.0, while it should be 10/1000 = 0.01. We should update the calculation to: {\{bf_fp_ratio = false_positive_count / #bf_queries}} > Bloom filter false positive rate calculation does not take into account true > negatives > -- > > Key: CASSANDRA-15834 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15834 > Project: Cassandra > Issue Type: Bug >Reporter: Jaroslaw Grabowski >Priority: Normal > > The bloom filter false positive ratio is [currently > computed|https://github.com/apache/cassandra/blob/ded62076e7fdfd1cfdcf96447489ea607ca796a0/src/java/org/apache/cassandra/metrics/TableMetrics.java#L738] > as: > {{bf_fp_ratio = false_positive_count / (false_positive_count + > true_positive_count)}} > However, this calculation doesn't take into account true negatives (false > negatives never happen on bloom filters). > In a situation where there are 1000 reads for non existing rows, and there > are 10 false positives, the bloom filter false positive ratio will be wrongly > calculated as 10/10 = 1.0, while it should be 10/1000 = 0.01. > We should update the calculation to: > {{bf_fp_ratio = false_positive_count / #bf_queries}} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-15834) Bloom filter false positive rate calculation does not take into account true negatives
Jaroslaw Grabowski created CASSANDRA-15834: -- Summary: Bloom filter false positive rate calculation does not take into account true negatives Key: CASSANDRA-15834 URL: https://issues.apache.org/jira/browse/CASSANDRA-15834 Project: Cassandra Issue Type: Bug Reporter: Jaroslaw Grabowski The bloom filter false positive ratio is [currently computed|https://github.com/apache/cassandra/blob/ded62076e7fdfd1cfdcf96447489ea607ca796a0/src/java/org/apache/cassandra/metrics/TableMetrics.java#L738] as: {\{bf_fp_ratio = false_positive_count / (false_positive_count + true_positive_count)}} However, this calculation doesn't take into account true negatives (false negatives never happen on bloom filters). In a situation where there are 1000 reads for non existing rows, and there are 10 false positives, the bloom filter false positive ratio will be wrongly calculated as 10/10 = 1.0, while it should be 10/1000 = 0.01. We should update the calculation to: {\{bf_fp_ratio = false_positive_count / #bf_queries}} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15833) Unresolvable false digest mismatch during upgrade due to CASSANDRA-10657
[ https://issues.apache.org/jira/browse/CASSANDRA-15833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacek Lewandowski updated CASSANDRA-15833: -- Attachment: CASSANDRA-15833-3.11.patch > Unresolvable false digest mismatch during upgrade due to CASSANDRA-10657 > > > Key: CASSANDRA-15833 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15833 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Repair >Reporter: Jacek Lewandowski >Assignee: Jacek Lewandowski >Priority: Normal > Fix For: 3.11.x, 4.x > > Attachments: CASSANDRA-15833-3.11.patch > > > CASSANDRA-10657 introduced changes in how the ColumnFilter is interpreted. > This results in digest mismatch when querying incomplete set of columns from > a table with consistency that requires reaching instances running pre > CASSANDRA-10657 from nodes that include CASSANDRA-10657 (it was introduced in > Cassandra 3.4). > The fix is to bring back the previous behaviour until there are no instances > running pre CASSANDRA-10657 version. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-12126) CAS Reads Inconsistencies
[ https://issues.apache.org/jira/browse/CASSANDRA-12126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17116578#comment-17116578 ] Sylvain Lebresne commented on CASSANDRA-12126: -- It definitely doesn't look good that this message comes so late, but I feel this is a serious issue of the {{SERIAL}}/{{LOCAL_SERIAL}} consistency levels since this breaks the basic guarantee they exist to provide, and as such should be fixed all the way down to 3.0, and the sooner, the better. In an attempt to sum this up quickly, the problem we have here affects both serial reads _and_ LWT updates that do not apply (whose condition evaluates to {{false}}). In both cases, while the current code replays "effectively committed" proposals (those whose proposal has been accepted by a majority of replicas) with {{beginAndRepairPaxos}}, neither makes a proposal of its own, so nothing will prevent a proposal accepted by a minority of replicas (say just one) from later being replayed (and thus committed). I've pushed [2 in-jvm dtests|https://github.com/pcmanus/cassandra/commit/3442277905362b38e0d6a2b8170916fcfd18d469] that demonstrate the issue for both cases (again, serial reads and non-applying updates). They use "filters" to selectively drop messages to make the failure consistent but aren't otherwise very involved. As [~kohlisankalp] mentioned initially, the "simplest"\[1\] way to fix this that I see is to commit an empty update in both cases. Actually committing, which sets the {{mostRecentCommit}} value in the Paxos state, ensures that no prior proposal can ever be replayed. I've pushed a patch to do so on 3.0/3.11 below (will merge up to 4.0, but wanted to make sure we're ok with the approach first): ||version|| | [3.0|https://github.com/pcmanus/cassandra/commits/C-12126-3.0] | | [3.11|https://github.com/pcmanus/cassandra/commits/C-12126-3.11] | The big downside of this patch however is the performance impact. Currently, a {{SERIAL}} read (that finds nothing in progress it needs to replay) is 2 round-trips (a prepare phase, followed by the actual read). With this patch, it is 3 round-trips as we have to propose our empty commit and get acceptance (we don't really have to wait for responses on the commit though), which will be noticeable for performance-sensitive use cases. Similarly, the performance of LWTs that don't apply will be impacted. That said, I don't see another approach to fixing this that would be as acceptable for 3.0/3.11 in terms of risks, and imo 'slower but correct' beats 'faster but broken' any day, so I'm in favor of moving forward with this fix. Opinions? \[1\]: By that I mean both the simplicity of the change and the ease of validating that it fixes the problem at hand without creating new correctness problems. > CAS Reads Inconsistencies > -- > > Key: CASSANDRA-12126 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12126 > Project: Cassandra > Issue Type: Bug > Components: Feature/Lightweight Transactions, Legacy/Coordination >Reporter: Sankalp Kohli >Assignee: Sylvain Lebresne >Priority: Normal > Labels: LWT, pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > While looking at the CAS code in Cassandra, I found a potential issue with > CAS Reads. Here is how it can happen with RF=3: > 1) You issue a CAS Write and it fails in the propose phase. Machine A replies > true to a propose and saves the commit in the accepted field. The other two > machines, B and C, do not get to the accept phase. 
> The current state is that machine A has this commit in the paxos table as accepted > but not committed, and B and C do not. > 2) Issue a CAS Read and it goes to only B and C. You won't be able to read the > value written in step 1. This step is as if nothing is in flight. > 3) Issue another CAS Read and it goes to A and B. Now we will discover that > there is something in flight from A and will propose and commit it with the > current ballot. Now we can read the value written in step 1 as part of this > CAS read. > If we skip step 3 and instead run step 4, we will never learn about the value > written in step 1. > 4) Issue a CAS Write and it involves only B and C. This will succeed and > commit a different value than step 1. The step 1 value will never be seen again > and was never seen before. > If you read Lamport's "Paxos Made Simple" paper, section 2.3 talks about > this issue: how learners can find out whether a majority of the > acceptors have accepted the proposal. > In step 3, it is correct that we propose the value again since we don't know > if it was accepted by a majority of acceptors. When we ask a majority of > acceptors, and more than one acceptor but not a majority has something in > flight, we have no way of knowing
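The interleaving in the description is easier to follow with a toy model. The snippet below only illustrates steps 1-3 with plain Java collections; it is not Cassandra's Paxos implementation.
{code}
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Optional;

// Toy model of the interleaving above (not Cassandra's Paxos code): acceptor A holds
// an accepted-but-uncommitted proposal, so what a QUORUM read observes depends entirely
// on which two of the three replicas it happens to contact.
public class CasReadInconsistencyToy
{
    public static void main(String[] args)
    {
        Map<String, String> accepted = new HashMap<>();
        accepted.put("A", "v1");  // step 1: only A accepted the proposal
        accepted.put("B", null);
        accepted.put("C", null);

        System.out.println("read via {B,C}: " + serialRead(accepted, "B", "C")); // step 2: v1 invisible
        System.out.println("read via {A,B}: " + serialRead(accepted, "A", "B")); // step 3: v1 replayed
    }

    // A serial read replays any in-progress proposal it sees among the contacted quorum.
    static Optional<String> serialRead(Map<String, String> accepted, String... quorum)
    {
        return Arrays.stream(quorum)
                     .map(accepted::get)
                     .filter(Objects::nonNull)
                     .findFirst();
    }
}
{code}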
[jira] [Commented] (CASSANDRA-15805) Potential duplicate rows on 2.X->3.X upgrade when multi-rows range tombstones interacts with collection tombstones
[ https://issues.apache.org/jira/browse/CASSANDRA-15805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17116570#comment-17116570 ] Michael Semb Wever commented on CASSANDRA-15805: [~slebresne], I've re-run #131 for you as https://ci-cassandra.apache.org/blue/organizations/jenkins/Cassandra-devbranch/detail/Cassandra-devbranch/134/pipeline (taking away the stress and cdc stages, as they don't exist in 3.0) and #132 as https://ci-cassandra.apache.org/blue/organizations/jenkins/Cassandra-devbranch/detail/Cassandra-devbranch/135/pipeline (just a retry). (The CI failure in #132 is addressed in CASSANDRA-15826.) > Potential duplicate rows on 2.X->3.X upgrade when multi-rows range tombstones > interacts with collection tombstones > -- > > Key: CASSANDRA-15805 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15805 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Coordination, Local/SSTable >Reporter: Sylvain Lebresne >Assignee: Sylvain Lebresne >Priority: Normal > Fix For: 3.0.x, 3.11.x > > > The legacy reading code ({{LegacyLayout}} and > {{UnfilteredDeserializer.OldFormatDeserializer}}) does not correctly handle > the case where a range tombstone covering multiple rows interacts with a > collection tombstone. > A simple example of this problem is if one runs on 2.X: > {noformat} > CREATE TABLE t ( > k int, > c1 text, > c2 text, > a text, > b set, > c text, > PRIMARY KEY((k), c1, c2) > ); > // Delete all rows where c1 is 'A' > DELETE FROM t USING TIMESTAMP 1 WHERE k = 0 AND c1 = 'A'; > // Inserts a row covered by that previous range tombstone > INSERT INTO t(k, c1, c2, a, b, c) VALUES (0, 'A', 'X', 'foo', {'whatever'}, > 'bar') USING TIMESTAMP 2; > // Delete the collection of that previously inserted row > DELETE b FROM t USING TIMESTAMP 3 WHERE k = 0 AND c1 = 'A' and c2 = 'X'; > {noformat} > If the above is run on 2.X (with everything either flushed in the same > table or compacted together), then this will result in the inserted row being > duplicated (one part containing the {{a}} column, the other the {{c}} one). > I will note that this is _not_ a duplicate of CASSANDRA-15789 and this > reproduces even with the fix to {{LegacyLayout}} of this ticket. That said, > the additional code added to CASSANDRA-15789 to force merging duplicated rows > if they are produced _will_ end up fixing this as a consequence (assuming > there is no variation of this problem that leads to visible issues other than > duplicated rows). Even so, I "think" we'd still rather fix the source of > the issue. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-15805) Potential duplicate rows on 2.X->3.X upgrade when multi-rows range tombstones interacts with collection tombstones
[ https://issues.apache.org/jira/browse/CASSANDRA-15805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17111019#comment-17111019 ] Sylvain Lebresne edited comment on CASSANDRA-15805 at 5/26/20, 9:42 AM: Thanks for the review. I addressed the comments, squash-cleaned, 'merged' into 3.11 and started CI (first try at https://ci-cassandra.apache.org, not sure how that will go). ||branch||CI|| | [3.0|https://github.com/pcmanus/cassandra/commits/C-15805-3.0] | [ci-cassandra #134|https://ci-cassandra.apache.org/job/Cassandra-devbranch/134/] | | [3.11|https://github.com/pcmanus/cassandra/commits/C-15805-3.11] | [ci-cassandra #135|https://ci-cassandra.apache.org/job/Cassandra-devbranch/135/] | was (Author: slebresne): Thanks for the review. I addressed the comments, squash-cleaned, 'merged' into 3.11 and started CI (first try at https://ci-cassandra.apache.org, not sure how that will go). ||branch||CI|| | [3.0|https://github.com/pcmanus/cassandra/commits/C-15805-3.0] | [ci-cassandra #131|https://ci-cassandra.apache.org/job/Cassandra-devbranch/131/] | | [3.11|https://github.com/pcmanus/cassandra/commits/C-15805-3.11] | [ci-cassandra #132|https://ci-cassandra.apache.org/job/Cassandra-devbranch/132/] | > Potential duplicate rows on 2.X->3.X upgrade when multi-rows range tombstones > interacts with collection tombstones > -- > > Key: CASSANDRA-15805 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15805 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Coordination, Local/SSTable >Reporter: Sylvain Lebresne >Assignee: Sylvain Lebresne >Priority: Normal > Fix For: 3.0.x, 3.11.x > > > The legacy reading code ({{LegacyLayout}} and > {{UnfilteredDeserializer.OldFormatDeserializer}}) does not correctly handle > the case where a range tombstone covering multiple rows interacts with a > collection tombstone. > A simple example of this problem is if one runs on 2.X: > {noformat} > CREATE TABLE t ( > k int, > c1 text, > c2 text, > a text, > b set, > c text, > PRIMARY KEY((k), c1, c2) > ); > // Delete all rows where c1 is 'A' > DELETE FROM t USING TIMESTAMP 1 WHERE k = 0 AND c1 = 'A'; > // Inserts a row covered by that previous range tombstone > INSERT INTO t(k, c1, c2, a, b, c) VALUES (0, 'A', 'X', 'foo', {'whatever'}, > 'bar') USING TIMESTAMP 2; > // Delete the collection of that previously inserted row > DELETE b FROM t USING TIMESTAMP 3 WHERE k = 0 AND c1 = 'A' and c2 = 'X'; > {noformat} > If the above is run on 2.X (with everything either flushed in the same > table or compacted together), then this will result in the inserted row being > duplicated (one part containing the {{a}} column, the other the {{c}} one). > I will note that this is _not_ a duplicate of CASSANDRA-15789 and this > reproduces even with the fix to {{LegacyLayout}} of this ticket. That said, > the additional code added to CASSANDRA-15789 to force merging duplicated rows > if they are produced _will_ end up fixing this as a consequence (assuming > there is no variation of this problem that leads to visible issues other than > duplicated rows). Even so, I "think" we'd still rather fix the source of > the issue. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15725) Add support for adding custom Verbs
[ https://issues.apache.org/jira/browse/CASSANDRA-15725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict Elliott Smith updated CASSANDRA-15725: --- Status: Ready to Commit (was: Review In Progress) > Add support for adding custom Verbs > --- > > Key: CASSANDRA-15725 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15725 > Project: Cassandra > Issue Type: Improvement > Components: Messaging/Internode >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson >Priority: Normal > Fix For: 4.0-alpha > > Attachments: feedback_15725.patch > > > It should be possible to safely add custom/internal Verbs - without risking > conflicts when new ones are added. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15725) Add support for adding custom Verbs
[ https://issues.apache.org/jira/browse/CASSANDRA-15725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17116548#comment-17116548 ] Benedict Elliott Smith commented on CASSANDRA-15725: +1 > Add support for adding custom Verbs > --- > > Key: CASSANDRA-15725 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15725 > Project: Cassandra > Issue Type: Improvement > Components: Messaging/Internode >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson >Priority: Normal > Fix For: 4.0-alpha > > Attachments: feedback_15725.patch > > > It should be possible to safely add custom/internal Verbs - without risking > conflicts when new ones are added. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15833) Unresolvable false digest mismatch during upgrade due to CASSANDRA-10657
[ https://issues.apache.org/jira/browse/CASSANDRA-15833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacek Lewandowski updated CASSANDRA-15833: -- Bug Category: Parent values: Correctness(12982)Level 1 values: Transient Incorrect Response(12987) Complexity: Low Hanging Fruit Discovered By: User Report Fix Version/s: 4.x 3.11.x Severity: Low Status: Open (was: Triage Needed) > Unresolvable false digest mismatch during upgrade due to CASSANDRA-10657 > > > Key: CASSANDRA-15833 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15833 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Repair >Reporter: Jacek Lewandowski >Assignee: Jacek Lewandowski >Priority: Normal > Fix For: 3.11.x, 4.x > > > CASSANDRA-10657 introduced changes in how the ColumnFilter is interpreted. > This results in digest mismatch when querying incomplete set of columns from > a table with consistency that requires reaching instances running pre > CASSANDRA-10657 from nodes that include CASSANDRA-10657 (it was introduced in > Cassandra 3.4). > The fix is to bring back the previous behaviour until there are no instances > running pre CASSANDRA-10657 version. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-15833) Unresolvable false digest mismatch during upgrade due to CASSANDRA-10657
Jacek Lewandowski created CASSANDRA-15833: - Summary: Unresolvable false digest mismatch during upgrade due to CASSANDRA-10657 Key: CASSANDRA-15833 URL: https://issues.apache.org/jira/browse/CASSANDRA-15833 Project: Cassandra Issue Type: Bug Components: Consistency/Repair Reporter: Jacek Lewandowski Assignee: Jacek Lewandowski CASSANDRA-10657 introduced changes in how the ColumnFilter is interpreted. This results in digest mismatch when querying incomplete set of columns from a table with consistency that requires reaching instances running pre CASSANDRA-10657 from nodes that include CASSANDRA-10657 (it was introduced in Cassandra 3.4). The fix is to bring back the previous behaviour until there are no instances running pre CASSANDRA-10657 version. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
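The proposed fix amounts to a version gate. The sketch below only shows the shape of such a gate with plain string comparison; the class, method names, and version list are assumptions for illustration, not the attached patch.
{code}
import java.util.Arrays;
import java.util.List;

// Minimal sketch of the kind of gate the fix describes (hypothetical names and versions,
// not the attached patch): fall back to the pre-CASSANDRA-10657 ColumnFilter behaviour
// while any node in the cluster still runs a release older than 3.4.
public class ColumnFilterGateSketch
{
    public static void main(String[] args)
    {
        List<String> nodeReleases = Arrays.asList("3.11.6", "3.0.20"); // assumed cluster versions

        boolean useLegacyFilter = nodeReleases.stream().anyMatch(v -> isOlderThan(v, 3, 4));
        System.out.println("use pre-10657 behaviour: " + useLegacyFilter); // true
    }

    static boolean isOlderThan(String release, int major, int minor)
    {
        String[] parts = release.split("\\.");
        int maj = Integer.parseInt(parts[0]);
        int min = Integer.parseInt(parts[1]);
        return maj < major || (maj == major && min < minor);
    }
}
{code}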
[jira] [Updated] (CASSANDRA-15812) Submitting Validation requests can block ANTI_ENTROPY stage
[ https://issues.apache.org/jira/browse/CASSANDRA-15812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Lerer updated CASSANDRA-15812: --- Reviewers: Benjamin Lerer > Submitting Validation requests can block ANTI_ENTROPY stage > > > Key: CASSANDRA-15812 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15812 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Repair >Reporter: Sam Tunnicliffe >Assignee: Sam Tunnicliffe >Priority: Normal > Fix For: 4.0-alpha > > > RepairMessages are handled on Stage.ANTI_ENTROPY, which has a thread pool > with core/max capacity of one, ie. we can only process one message at a time. > > Scheduling validation compactions may however block the stage completely, by > blocking on CompactionManager's ValidationExecutor while submitting a new > validation compaction, in cases where there are already more validations > running than can be executed in parallel. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
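The blocking pattern described above is reproducible with nothing but java.util.concurrent. The demo below is schematic (it does not use Cassandra's Stage or CompactionManager classes): a single-threaded stage submits work to a saturated executor whose rejection handler blocks, and from that point the stage can no longer process anything else.
{code}
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Schematic demo only (plain java.util.concurrent, not Cassandra's classes): the
// single-threaded "anti-entropy" stage blocks while handing a third validation to a
// saturated "validation" executor, so later repair messages sit unprocessed.
public class BlockedStageDemo
{
    public static void main(String[] args) throws Exception
    {
        ExecutorService antiEntropyStage = Executors.newSingleThreadExecutor();

        // One worker, a queue of one, and a rejection handler that blocks the submitter
        // until space frees up (analogous to waiting for room on the validation executor).
        ThreadPoolExecutor validationExecutor =
            new ThreadPoolExecutor(1, 1, 0, TimeUnit.SECONDS, new ArrayBlockingQueue<>(1),
                                   (task, executor) -> {
                                       try { executor.getQueue().put(task); }
                                       catch (InterruptedException e) { Thread.currentThread().interrupt(); }
                                   });

        Runnable slowValidation = () -> sleepQuietly(5_000);

        antiEntropyStage.submit(() -> {
            for (int i = 0; i < 3; i++)               // the 3rd execute() blocks right here...
                validationExecutor.execute(slowValidation);
        });

        Thread.sleep(500);
        antiEntropyStage.submit(() -> System.out.println("another repair message handled"));
        System.out.println("stage is blocked; the message above only appears once a validation slot frees up");

        antiEntropyStage.shutdown();
        validationExecutor.shutdown();
    }

    private static void sleepQuietly(long millis)
    {
        try { Thread.sleep(millis); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
    }
}
{code}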
[jira] [Comment Edited] (CASSANDRA-15783) test_optimized_primary_range_repair - transient_replication_test.TestTransientReplication
[ https://issues.apache.org/jira/browse/CASSANDRA-15783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17116533#comment-17116533 ] Michael Semb Wever edited comment on CASSANDRA-15783 at 5/26/20, 8:30 AM: -- Committed as [598a92180e2ad95b48419605d270c53497739f35 |https://github.com/apache/cassandra/commit/598a92180e2ad95b48419605d270c53497739f35] and [d7aacd3fa9b7d4c4fef80f5550a2576303e29890 |https://github.com/apache/cassandra-dtest/commit/d7aacd3fa9b7d4c4fef80f5550a2576303e29890]. (added CHANGES.txt line for this ticket and CASSANDRA-15657) was (Author: michaelsembwever): Committed as [598a92180e2ad95b48419605d270c53497739f35 |https://github.com/apache/cassandra/commit/598a92180e2ad95b48419605d270c53497739f35] and [d7aacd3fa9b7d4c4fef80f5550a2576303e29890 |https://github.com/apache/cassandra-dtest/commit/d7aacd3fa9b7d4c4fef80f5550a2576303e29890]. > test_optimized_primary_range_repair - > transient_replication_test.TestTransientReplication > - > > Key: CASSANDRA-15783 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15783 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest >Reporter: Ekaterina Dimitrova >Assignee: Ekaterina Dimitrova >Priority: Normal > Fix For: 4.0, 4.0-alpha5 > > Time Spent: 20m > Remaining Estimate: 0h > > Dtest failure. > Example: > https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/118/workflows/9e57522d-52fa-4d44-88d8-5cec0e87f517/jobs/585/tests -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15783) test_optimized_primary_range_repair - transient_replication_test.TestTransientReplication
[ https://issues.apache.org/jira/browse/CASSANDRA-15783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Semb Wever updated CASSANDRA-15783: --- Fix Version/s: (was: 4.0-alpha) 4.0-alpha5 4.0 Since Version: 4.0-alpha5 Source Control Link: https://github.com/apache/cassandra/commit/598a92180e2ad95b48419605d270c53497739f35 https://github.com/apache/cassandra-dtest/commit/d7aacd3fa9b7d4c4fef80f5550a2576303e29890 Resolution: Fixed Status: Resolved (was: Ready to Commit) Committed as [598a92180e2ad95b48419605d270c53497739f35 |https://github.com/apache/cassandra/commit/598a92180e2ad95b48419605d270c53497739f35] and [d7aacd3fa9b7d4c4fef80f5550a2576303e29890 |https://github.com/apache/cassandra-dtest/commit/d7aacd3fa9b7d4c4fef80f5550a2576303e29890]. > test_optimized_primary_range_repair - > transient_replication_test.TestTransientReplication > - > > Key: CASSANDRA-15783 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15783 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest >Reporter: Ekaterina Dimitrova >Assignee: Ekaterina Dimitrova >Priority: Normal > Fix For: 4.0, 4.0-alpha5 > > Time Spent: 20m > Remaining Estimate: 0h > > Dtest failure. > Example: > https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/118/workflows/9e57522d-52fa-4d44-88d8-5cec0e87f517/jobs/585/tests -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[cassandra] branch trunk updated: Use isTransient=false for ZCS sstables
This is an automated email from the ASF dual-hosted git repository. mck pushed a commit to branch trunk in repository https://gitbox.apache.org/repos/asf/cassandra.git The following commit(s) were added to refs/heads/trunk by this push: new 598a921 Use isTransient=false for ZCS sstables 598a921 is described below commit 598a92180e2ad95b48419605d270c53497739f35 Author: Zhao Yang AuthorDate: Mon May 11 04:29:47 2020 +0800 Use isTransient=false for ZCS sstables patch by Zhao Yang; reviewed by Blake Eggleston, Dinesh Joshi, Ekaterina Dimitrova for CASSANDRA-15783 --- CHANGES.txt| 1 + doc/source/new/transientreplication.rst| 4 ++-- .../CassandraEntireSSTableStreamReader.java| 9 ++-- .../io/sstable/metadata/IMetadataSerializer.java | 10 + .../io/sstable/metadata/MetadataSerializer.java| 25 +- 5 files changed, 35 insertions(+), 14 deletions(-) diff --git a/CHANGES.txt b/CHANGES.txt index 01f74f0..1be10dd 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 4.0-alpha5 + * Speed up entire-file-streaming file containment check and allow entire-file-streaming for all compaction strategies (CASSANDRA-15657,CASSANDRA-15783) * Provide ability to configure IAuditLogger (CASSANDRA-15748) * Fix nodetool enablefullquerylog blocking param parsing (CASSANDRA-15819) * Add isTransient to SSTableMetadataView (CASSANDRA-15806) diff --git a/doc/source/new/transientreplication.rst b/doc/source/new/transientreplication.rst index 438f437..aa39a11 100644 --- a/doc/source/new/transientreplication.rst +++ b/doc/source/new/transientreplication.rst @@ -33,7 +33,7 @@ Certain nodes act as full replicas (storing all the data for a given token range The optimization that is made possible with transient replication is called "Cheap quorums", which implies that data redundancy is increased without corresponding increase in storage usage. -Transient replication is useful when sufficient full replicas are unavailable to receive and store all the data. +Transient replication is useful when sufficient full replicas are available to receive and store all the data. Transient replication allows you to configure a subset of replicas to only replicate data that hasn't been incrementally repaired. As an optimization, we can avoid writing data to a transient replica if we have successfully written data to the full replicas. @@ -55,7 +55,7 @@ As an example, create a keyspace with replication factor (RF) 3. 
:: CREATE KEYSPACE CassandraKeyspaceSimple WITH replication = {'class': 'SimpleStrategy', - 'replication_factor' : 4/1}; + 'replication_factor' : 3/1}; As another example, ``some_keysopace keyspace`` will have 3 replicas in DC1, 1 of which is transient, and 5 replicas in DC2, 2 of which are transient: diff --git a/src/java/org/apache/cassandra/db/streaming/CassandraEntireSSTableStreamReader.java b/src/java/org/apache/cassandra/db/streaming/CassandraEntireSSTableStreamReader.java index 479ee71..eac37d1 100644 --- a/src/java/org/apache/cassandra/db/streaming/CassandraEntireSSTableStreamReader.java +++ b/src/java/org/apache/cassandra/db/streaming/CassandraEntireSSTableStreamReader.java @@ -22,6 +22,7 @@ import java.io.File; import java.io.IOException; import java.util.Collection; +import com.google.common.base.Function; import com.google.common.base.Throwables; import org.apache.cassandra.db.lifecycle.LifecycleNewTracker; import org.slf4j.Logger; @@ -29,12 +30,12 @@ import org.slf4j.LoggerFactory; import org.apache.cassandra.db.ColumnFamilyStore; import org.apache.cassandra.db.Directories; -import org.apache.cassandra.db.lifecycle.LifecycleTransaction; import org.apache.cassandra.io.sstable.Component; import org.apache.cassandra.io.sstable.Descriptor; import org.apache.cassandra.io.sstable.SSTableMultiWriter; import org.apache.cassandra.io.sstable.format.SSTableFormat; import org.apache.cassandra.io.sstable.format.big.BigTableZeroCopyWriter; +import org.apache.cassandra.io.sstable.metadata.StatsMetadata; import org.apache.cassandra.io.util.DataInputPlus; import org.apache.cassandra.schema.TableId; import org.apache.cassandra.streaming.ProgressInfo; @@ -54,6 +55,7 @@ public class CassandraEntireSSTableStreamReader implements IStreamReader private final TableId tableId; private final StreamSession session; +private final StreamMessageHeader messageHeader; private final CassandraStreamHeader header; private final int fileSequenceNumber; @@ -71,6 +73,7 @@ public class CassandraEntireSSTableStreamReader implements IStreamReader this.header = streamHeader; this.session = session; +this.messageHeader = messageHeader; this.tableId = messageHeader.tableId; this.fileSequenceNumber = messageHeader.sequenceNumber; } @@ -132,7 +135,9 @@ public class CassandraEntireSSTableStreamReader implem
[cassandra-dtest] branch master updated: Add legacy streaming test for transient replica repair tests, and test for lcs
This is an automated email from the ASF dual-hosted git repository. mck pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/cassandra-dtest.git The following commit(s) were added to refs/heads/master by this push: new d7aacd3 Add legacy streaming test for transient replica repair tests, and test for lcs d7aacd3 is described below commit d7aacd3fa9b7d4c4fef80f5550a2576303e29890 Author: Zhao Yang AuthorDate: Mon May 11 15:51:22 2020 +0800 Add legacy streaming test for transient replica repair tests, and test for lcs patch by Zhao Yang; reviewed by Blake Eggleston, Dinesh Joshi, Ekaterina Dimitrova for CASSANDRA-15783 --- transient_replication_test.py | 126 +- 1 file changed, 75 insertions(+), 51 deletions(-) diff --git a/transient_replication_test.py b/transient_replication_test.py index 990a984..e04162f 100644 --- a/transient_replication_test.py +++ b/transient_replication_test.py @@ -179,6 +179,8 @@ class TransientReplicationBase(Tester): # Make sure digest is not attempted against the transient node self.node3.byteman_submit(['./byteman/throw_on_digest.btm']) +def stream_entire_sstables(self): +return True def replication_factor(self): return '3/1' @@ -186,6 +188,10 @@ class TransientReplicationBase(Tester): def tokens(self): return [0, 1, 2] +def use_lcs(self): +session = self.exclusive_cql_connection(self.node1) +session.execute("ALTER TABLE %s.%s with compaction={'class': 'LeveledCompactionStrategy'};" % (self.keyspace, self.table)) + def setup_schema(self): session = self.exclusive_cql_connection(self.node1) replication_params = OrderedDict() @@ -202,6 +208,7 @@ class TransientReplicationBase(Tester): patch_start(self.cluster) self.cluster.set_configuration_options(values={'hinted_handoff_enabled': False, 'num_tokens': 1, + 'stream_entire_sstables': self.stream_entire_sstables(), 'commitlog_sync_period_in_ms': 500, 'enable_transient_replication': True, 'dynamic_snitch': False}) @@ -403,13 +410,63 @@ class TestTransientReplication(TransientReplicationBase): [[1, 1, 1]], cl=ConsistencyLevel.QUORUM) -def _test_speculative_write_repair_cycle(self, primary_range, optimized_repair, repair_coordinator, expect_node3_data): +@pytest.mark.no_vnodes +def test_cheap_quorums(self): +""" writes shouldn't make it to transient nodes """ +session = self.exclusive_cql_connection(self.node1) +for node in self.nodes: +self.assert_has_no_sstables(node) + +tm = lambda n: self.table_metrics(n) + +with tm(self.node1) as tm1, tm(self.node2) as tm2, tm(self.node3) as tm3: +assert tm1.write_count == 0 +assert tm2.write_count == 0 +assert tm3.write_count == 0 +self.insert_row(1, 1, 1, session=session) +assert tm1.write_count == 1 +assert tm2.write_count == 1 +assert tm3.write_count == 0 + +@pytest.mark.no_vnodes +def test_speculative_write(self): +""" if a full replica isn't responding, we should send the write to the transient replica """ +session = self.exclusive_cql_connection(self.node1) +self.node2.byteman_submit(['./byteman/slow_writes.btm']) + +self.insert_row(1, 1, 1, session=session) +self.assert_local_rows(self.node1, [[1,1,1]]) +self.assert_local_rows(self.node2, []) +self.assert_local_rows(self.node3, [[1,1,1]]) + +@pytest.mark.skip(reason="Doesn't test quite the right combination of forbidden RF changes right now") +def test_keyspace_rf_changes(self): +""" they should throw an exception """ +session = self.exclusive_cql_connection(self.node1) +replication_params = OrderedDict() +replication_params['class'] = 'NetworkTopologyStrategy' +assert self.replication_factor() 
== '3/1' +replication_params['datacenter1'] = '5/2' +replication_params = ', '.join("'%s': '%s'" % (k, v) for k, v in replication_params.items()) +with pytest.raises(ConfigurationException): +session.execute("ALTER KEYSPACE %s WITH REPLICATION={%s}" % (self.keyspace, replication_params)) + +@since('4.0') +class TestTransientReplicationRepairStreamEntireSSTable(TransientReplicationBase): + +def stream_entire_sstables(self): +return True + +def _test_speculative_write_repair_cycle(self, primary_range, optimized_repair, repair_coordinator, expect_node3_data, use_lcs=False): """ if