[jira] [Updated] (CASSANDRA-18673) Reduce size of per-SSTable index components
[ https://issues.apache.org/jira/browse/CASSANDRA-18673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Fleming updated CASSANDRA-18673: - Source Control Link: https://github.com/apache/cassandra/pull/2498 (was: {color:red}colored text{color}https://github.com/apache/cassandra/pull/2498) > Reduce size of per-SSTable index components > --- > > Key: CASSANDRA-18673 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18673 > Project: Cassandra > Issue Type: Improvement > Components: Feature/SAI >Reporter: Mike Adamson >Assignee: Mike Adamson >Priority: Urgent > Time Spent: 6h 20m > Remaining Estimate: 0h > > The current per-SSTable index components are large because the primary keys > that are stored in them include the token as part of the byte comparable. The > byte comparable puts the token first meaning that we get very little prefix > compression from either the trie or the sorted terms store. > We can fix this by removing the token from the primary key serialization. > This would allow us to get the prefix compression from the trie and the > sorted terms store. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
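The prefix-compression argument in CASSANDRA-18673 can be illustrated with a toy sketch. It is not SAI's byte-comparable encoding: the `token` function (a truncated MD5 standing in for a partitioner hash), the key layout, and the sample partition keys are all assumptions made for illustration. It compares how many leading bytes adjacent sorted keys share when a hash-derived token is placed first and when it is left out.

```python
import hashlib


def token(partition_key: str) -> bytes:
    # Hypothetical stand-in for a partitioner token: 8 bytes of an MD5 digest.
    return hashlib.md5(partition_key.encode()).digest()[:8]


def shared_prefix_len(a: bytes, b: bytes) -> int:
    # Number of leading bytes that a and b have in common.
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def avg_shared_prefix(keys: list) -> float:
    # Average prefix length each key shares with its predecessor in sorted order.
    ordered = sorted(keys)
    shared = [shared_prefix_len(a, b) for a, b in zip(ordered, ordered[1:])]
    return sum(shared) / len(shared)


partition_keys = [f"sensor-{i:06d}" for i in range(1000)]
with_token = [token(k) + k.encode() for k in partition_keys]   # token first
without_token = [k.encode() for k in partition_keys]           # token removed

print("token first:  ", avg_shared_prefix(with_token))
print("token removed:", avg_shared_prefix(without_token))
```

Because the hash scatters otherwise similar keys, neighbouring entries in the token-first ordering share very few leading bytes, which is the effect the ticket describes for both the trie and the sorted terms store.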
[jira] [Updated] (CASSANDRA-18673) Reduce size of per-SSTable index components
[ https://issues.apache.org/jira/browse/CASSANDRA-18673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Fleming updated CASSANDRA-18673: - Source Control Link: {color:red}colored text{color}https://github.com/apache/cassandra/pull/2498 (was: https://github.com/apache/cassandra/pull/2498) > Reduce size of per-SSTable index components > --- > > Key: CASSANDRA-18673 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18673 > Project: Cassandra > Issue Type: Improvement > Components: Feature/SAI >Reporter: Mike Adamson >Assignee: Mike Adamson >Priority: Urgent > Time Spent: 6h 20m > Remaining Estimate: 0h > > The current per-SSTable index components are large because the primary keys > that are stored in them include the token as part of the byte comparable. The > byte comparable puts the token first meaning that we get very little prefix > compression from either the trie or the sorted terms store. > We can fix this by removing the token from the primary key serialization. > This would allow us to get the prefix compression from the trie and the > sorted terms store.
[jira] [Updated] (CASSANDRA-18466) Paxos only repair is treated as an incremental repair
[ https://issues.apache.org/jira/browse/CASSANDRA-18466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Fleming updated CASSANDRA-18466: - Test and Documentation Plan: (was: Added test in LongLeveledCompactionStrategyTest: testValidationDuringConstruction) > Paxos only repair is treated as an incremental repair > - > > Key: CASSANDRA-18466 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18466 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Repair >Reporter: Andrew >Assignee: Ningzi Zhan >Priority: Normal > Labels: lhf > Fix For: 4.1.x, 5.x > > > Paxos only repair tries to continue or is treated as an incremental repair. > This happened on 4.1.0 and 4.1.1 when trying to run repair in preparation for > enabling paxos_state_purging. The repair was in preparation mode and triggered > multiple anti-compactions on the nodes. Running the command with --full > behaves in the expected way, i.e. only the paxos data is repaired and it's > finished within a few seconds. > {code:java} > nodetool repair --paxos-only // This does not behave as expected: it does not > complete quickly and seems to be waiting on anticompactions > {code} > {code:java} > nodetool repair --full --paxos-only // Completes within a few seconds as > expected > {code}
[jira] [Commented] (CASSANDRA-18329) Upgrade jamm
[ https://issues.apache.org/jira/browse/CASSANDRA-18329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17722189#comment-17722189 ] Matt Fleming commented on CASSANDRA-18329: -- This ticket is blocked on [https://github.com/jbellis/jamm/pull/50] right? > Upgrade jamm > > > Key: CASSANDRA-18329 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18329 > Project: Cassandra > Issue Type: Task > Components: Jamm >Reporter: Ekaterina Dimitrova >Priority: Normal > Fix For: 4.0.x, 4.1.x, 5.x > > > Jamm is currently under maintenance that will solve JDK11 issues and enable > it to work with post JDK11+ versions up to JDK17. > This ticket will serve as a placeholder for upgrading Jamm in Cassandra when > the new Jamm release is out.
[jira] [Commented] (CASSANDRA-16718) Changing listen_address with prefer_local may lead to issues
[ https://issues.apache.org/jira/browse/CASSANDRA-16718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17719534#comment-17719534 ] Matt Fleming commented on CASSANDRA-16718: -- Patch looks good to me. nb +1 > Changing listen_address with prefer_local may lead to issues > > > Key: CASSANDRA-16718 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16718 > Project: Cassandra > Issue Type: Bug > Components: Local/Config >Reporter: Jan Karlsson >Assignee: Brandon Williams >Priority: Normal > Fix For: 4.0.x, 4.1.x, 5.x > > > Many container-based solutions function by assigning new listen_addresses when > nodes are stopped. Changing the listen_address is usually as simple as > turning off the node and changing the yaml file. > However, if prefer_local is enabled, I observed that nodes were unable to > join the cluster and failed with 'Unable to gossip with any seeds'. > Trace shows that the changing node will try to communicate with the existing > node but the response is never received. I assume it is because the existing > node attempts to communicate with the local address during the shadow round.
[jira] [Updated] (CASSANDRA-18067) On-disk numeric index
[ https://issues.apache.org/jira/browse/CASSANDRA-18067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Fleming updated CASSANDRA-18067: - Fix Version/s: NA (was: 4.x) > On-disk numeric index > - > > Key: CASSANDRA-18067 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18067 > Project: Cassandra > Issue Type: New Feature > Components: Feature/SAI >Reporter: Mike Adamson >Assignee: Mike Adamson >Priority: Normal > Fix For: NA > > > An on-disk numeric index for all datatypes not supported by the on-disk > literal index (CASSANDRA-18062)
[jira] [Updated] (CASSANDRA-18062) On-disk string index with index building and on-disk query path
[ https://issues.apache.org/jira/browse/CASSANDRA-18062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Fleming updated CASSANDRA-18062: - Fix Version/s: (was: 4.x) > On-disk string index with index building and on-disk query path > --- > > Key: CASSANDRA-18062 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18062 > Project: Cassandra > Issue Type: New Feature > Components: Feature/SAI >Reporter: Mike Adamson >Assignee: Mike Adamson >Priority: Normal > > An on-disk index for string (literal) datatypes. This index is used for the > following datatypes: > * UTF8Type > * AsciiType > * CompositeType > * Frozen types > This includes the ability to write the index to disk at index creation, by > specific index rebuild and during SSTable compaction. > Also the ability to query the on-disk index and combine the results with > those from the in-memory indexes created by CASSANDRA-18058.
[jira] [Updated] (CASSANDRA-18062) On-disk string index with index building and on-disk query path
[ https://issues.apache.org/jira/browse/CASSANDRA-18062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Fleming updated CASSANDRA-18062: - Fix Version/s: NA > On-disk string index with index building and on-disk query path > --- > > Key: CASSANDRA-18062 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18062 > Project: Cassandra > Issue Type: New Feature > Components: Feature/SAI >Reporter: Mike Adamson >Assignee: Mike Adamson >Priority: Normal > Fix For: NA > > > An on-disk index for string (literal) datatypes. This index is used for the > following datatypes: > * UTF8Type > * AsciiType > * CompositeType > * Frozen types > This includes the ability to write the index to disk at index creation, by > specific index rebuild and during SSTable compaction. > Also the ability to query the on-disk index and combine the results with > those from the in-memory indexes created by CASSANDRA-18058.
[jira] [Commented] (CASSANDRA-18062) On-disk string index with index building and on-disk query path
[ https://issues.apache.org/jira/browse/CASSANDRA-18062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17691712#comment-17691712 ] Matt Fleming commented on CASSANDRA-18062: -- [~maedhroz] Sure can do. I was just monkeying about with the versions to create a simpler kanban board, but I'm happy to use NA here. > On-disk string index with index building and on-disk query path > --- > > Key: CASSANDRA-18062 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18062 > Project: Cassandra > Issue Type: New Feature > Components: Feature/SAI >Reporter: Mike Adamson >Assignee: Mike Adamson >Priority: Normal > Fix For: 4.x > > > An on-disk index for string (literal) datatypes. This index is used for the > following datatypes: > * UTF8Type > * AsciiType > * CompositeType > * Frozen types > This includes the ability to write the index to disk at index creation, by > specific index rebuild and during SSTable compaction. > Also the ability to query the on-disk index and combine the results with > those from the in-memory indexes created by CASSANDRA-18058.
[jira] [Updated] (CASSANDRA-18067) On-disk numeric index
[ https://issues.apache.org/jira/browse/CASSANDRA-18067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Fleming updated CASSANDRA-18067: - Fix Version/s: 4.x > On-disk numeric index > - > > Key: CASSANDRA-18067 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18067 > Project: Cassandra > Issue Type: New Feature > Components: Feature/SAI >Reporter: Mike Adamson >Assignee: Mike Adamson >Priority: Normal > Fix For: 4.x > > > An on-disk numeric index for all datatypes not supported by the on-disk > literal index (CASSANDRA-18062)
[jira] [Updated] (CASSANDRA-18062) On-disk string index with index building and on-disk query path
[ https://issues.apache.org/jira/browse/CASSANDRA-18062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Fleming updated CASSANDRA-18062: - Fix Version/s: 4.x > On-disk string index with index building and on-disk query path > --- > > Key: CASSANDRA-18062 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18062 > Project: Cassandra > Issue Type: New Feature > Components: Feature/SAI >Reporter: Mike Adamson >Assignee: Mike Adamson >Priority: Normal > Fix For: 4.x > > > An on-disk index for string (literal) datatypes. This index is used for the > following datatypes: > * UTF8Type > * AsciiType > * CompositeType > * Frozen types > This includes the ability to write the index to disk at index creation, by > specific index rebuild and during SSTable compaction. > Also the ability to query the on-disk index and combine the results with > those from the in-memory indexes created by CASSANDRA-18058.
[jira] [Created] (CASSANDRA-17361) STCS documentation on website mentions LCS in title
Matt Fleming created CASSANDRA-17361: Summary: STCS documentation on website mentions LCS in title Key: CASSANDRA-17361 URL: https://issues.apache.org/jira/browse/CASSANDRA-17361 Project: Cassandra Issue Type: Bug Reporter: Matt Fleming The STCS page here, [https://cassandra.apache.org/doc/latest/cassandra/operating/compaction/stcs.html], says "Leveled Compaction Strategy" in the title where it should say "Size-tiered Compaction Strategy".
[jira] [Commented] (CASSANDRA-16840) Close native transport port before hint transfer during decommission
[ https://issues.apache.org/jira/browse/CASSANDRA-16840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17450581#comment-17450581 ] Matt Fleming commented on CASSANDRA-16840: -- Hi Aleks :) Yeah, I agree this should have a test. I'll add one. > Close native transport port before hint transfer during decommission > > > Key: CASSANDRA-16840 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16840 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Hints >Reporter: Matt Fleming >Assignee: Matt Fleming >Priority: Normal > Fix For: 4.x > > > New hints can be generated on a node when it's decommissioning which is a > problem if the node has already started hint transfer because any hints that > come in after the transfer has begun will remain on-disk and not be > transferred to a peer. > You can work around this problem by manually closing the native transport > port before starting the decommission with {{nodetool disablebinary}} but it > feels like something we might want to do automatically.
[jira] [Commented] (CASSANDRA-16840) Close native transport port before hint transfer during decommission
[ https://issues.apache.org/jira/browse/CASSANDRA-16840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17449654#comment-17449654 ] Matt Fleming commented on CASSANDRA-16840: -- Is anyone available and interested in reviewing this patch? > Close native transport port before hint transfer during decommission > > > Key: CASSANDRA-16840 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16840 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Hints >Reporter: Matt Fleming >Assignee: Matt Fleming >Priority: Normal > Fix For: 4.x > > > New hints can be generated on a node when it's decommissioning which is a > problem if the node has already started hint transfer because any hints that > come in after the transfer has begun will remain on-disk and not be > transferred to a peer. > You can work around this problem by manually closing the native transport > port before starting the decommission with {{nodetool disablebinary}} but it > feels like something we might want to do automatically.
[jira] [Commented] (CASSANDRA-17057) Refactor and support pluggable failure detection
[ https://issues.apache.org/jira/browse/CASSANDRA-17057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17437240#comment-17437240 ] Matt Fleming commented on CASSANDRA-17057: -- Ah, you're right. I completely missed that pluggable failure detection support was added as part of CEP-10. Sorry for the noise. > Refactor and support pluggable failure detection > > > Key: CASSANDRA-17057 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17057 > Project: Cassandra > Issue Type: Improvement >Reporter: Matt Fleming >Priority: Normal > > Making it possible to supply custom failure detectors enables supporting new > failure detection algorithms and makes testing using mocks much easier. > The general idea is to introduce a new config parameter, such as > org.apache.cassandra.custom_failure_detector_class, that specifies the > failure detection class to use.
[jira] [Created] (CASSANDRA-17059) Support adding custom verbs at runtime
Matt Fleming created CASSANDRA-17059: Summary: Support adding custom verbs at runtime Key: CASSANDRA-17059 URL: https://issues.apache.org/jira/browse/CASSANDRA-17059 Project: Cassandra Issue Type: Improvement Reporter: Matt Fleming Cassandra already has support for registering custom verbs at build time, but there's value in allowing verbs to be added at runtime since that enables new use cases where it's inconvenient or impossible to modify the Cassandra source. Additionally, apps that want to register new verbs benefit from running custom code after the default verb handlers execute. This can be achieved with straightforward modifications to the Sink interface (e.g. adding a PostSink class).
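A rough sketch of the two ideas in CASSANDRA-17059, runtime verb registration and code that runs after the default handler, might look like the following. This is a toy model, not Cassandra's messaging API: `VerbRegistry`, `register`, `add_post_sink`, `dispatch`, and the `CUSTOM_PING` verb are all hypothetical names chosen for illustration.

```python
from typing import Callable, Dict, List

Handler = Callable[[dict], None]
PostSink = Callable[[str, dict], None]


class VerbRegistry:
    """Toy registry: verbs are added at runtime; post-sinks run after the handler."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}
        self._post_sinks: List[PostSink] = []

    def register(self, verb: str, handler: Handler) -> None:
        # Reject duplicates so a runtime verb cannot silently replace another.
        if verb in self._handlers:
            raise ValueError(f"verb already registered: {verb}")
        self._handlers[verb] = handler

    def add_post_sink(self, sink: PostSink) -> None:
        self._post_sinks.append(sink)

    def dispatch(self, verb: str, message: dict) -> None:
        self._handlers[verb](message)   # the verb's own handler runs first
        for sink in self._post_sinks:   # then any custom post-handler code
            sink(verb, message)


events = []
registry = VerbRegistry()
registry.register("CUSTOM_PING", lambda msg: events.append(("handled", msg["id"])))
registry.add_post_sink(lambda verb, msg: events.append(("post", verb)))
registry.dispatch("CUSTOM_PING", {"id": 7})
print(events)
```

Keeping the post-sinks separate from the handlers means an application can observe or extend message handling without replacing the default handler.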
[jira] [Created] (CASSANDRA-17058) Refactor and support pluggable cluster membership
Matt Fleming created CASSANDRA-17058: Summary: Refactor and support pluggable cluster membership Key: CASSANDRA-17058 URL: https://issues.apache.org/jira/browse/CASSANDRA-17058 Project: Cassandra Issue Type: Improvement Reporter: Matt Fleming Allowing users to specify a class that implements a new CustomTokenMetadataProvider class makes cluster membership pluggable (supporting custom code) and makes testing much easier. Users could specify the cluster membership algorithm using a new config parameter such as org.apache.cassandra.token_metadata_provider_class.
[jira] [Created] (CASSANDRA-17057) Refactor and support pluggable failure detection
Matt Fleming created CASSANDRA-17057: Summary: Refactor and support pluggable failure detection Key: CASSANDRA-17057 URL: https://issues.apache.org/jira/browse/CASSANDRA-17057 Project: Cassandra Issue Type: Improvement Reporter: Matt Fleming Making it possible to supply custom failure detectors enables supporting new failure detection algorithms and makes testing using mocks much easier. The general idea is to introduce a new config parameter, such as org.apache.cassandra.custom_failure_detector_class, that specifies the failure detection class to use.
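The config-driven approach proposed in CASSANDRA-17057 could be sketched roughly as below. This is a language-neutral illustration in Python, not how Cassandra loads classes: `load_failure_detector`, `DefaultFailureDetector`, `AlwaysDownDetector`, and the `mock_detectors` module are hypothetical, and only the parameter name is taken from the ticket.

```python
import importlib
import sys
import types

CONFIG_KEY = "org.apache.cassandra.custom_failure_detector_class"


class DefaultFailureDetector:
    def is_alive(self, endpoint: str) -> bool:
        return True


class AlwaysDownDetector:
    """Mock detector a test might plug in to simulate an unreachable peer."""

    def is_alive(self, endpoint: str) -> bool:
        return False


def load_failure_detector(config: dict):
    # Use the class named in the config if present, otherwise the default.
    class_path = config.get(CONFIG_KEY)
    if not class_path:
        return DefaultFailureDetector()
    module_name, _, class_name = class_path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)()


# Expose the mock under an importable (hypothetical) module name for the demo.
mock_module = types.ModuleType("mock_detectors")
mock_module.AlwaysDownDetector = AlwaysDownDetector
sys.modules["mock_detectors"] = mock_module

default_fd = load_failure_detector({})
mock_fd = load_failure_detector({CONFIG_KEY: "mock_detectors.AlwaysDownDetector"})
print(default_fd.is_alive("10.0.0.1"), mock_fd.is_alive("10.0.0.1"))
```

The point of the indirection is the one the ticket makes: tests can substitute a mock detector by changing configuration alone.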
[jira] [Updated] (CASSANDRA-16840) Close native transport port before hint transfer during decommission
[ https://issues.apache.org/jira/browse/CASSANDRA-16840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Fleming updated CASSANDRA-16840: - Test and Documentation Plan: Patch available here, https://github.com/mfleming/cassandra/commit/ff07793d04823d39735190d930260bfeea6df59f Status: Patch Available (was: In Progress) > Close native transport port before hint transfer during decommission > > > Key: CASSANDRA-16840 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16840 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Hints >Reporter: Matt Fleming >Assignee: Matt Fleming >Priority: Normal > Fix For: 4.x > > > New hints can be generated on a node when it's decommissioning which is a > problem if the node has already started hint transfer because any hints that > come in after the transfer has begun will remain on-disk and not be > transferred to a peer. > You can work around this problem by manually closing the native transport > port before starting the decommission with {{nodetool disablebinary}} but it > feels like something we might want to do automatically.
[jira] [Assigned] (CASSANDRA-16840) Close native transport port before hint transfer during decommission
[ https://issues.apache.org/jira/browse/CASSANDRA-16840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Fleming reassigned CASSANDRA-16840: Assignee: Matt Fleming > Close native transport port before hint transfer during decommission > > > Key: CASSANDRA-16840 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16840 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Hints >Reporter: Matt Fleming >Assignee: Matt Fleming >Priority: Normal > Fix For: 4.x > > > New hints can be generated on a node when it's decommissioning which is a > problem if the node has already started hint transfer because any hints that > come in after the transfer has begun will remain on-disk and not be > transferred to a peer. > You can work around this problem by manually closing the native transport > port before starting the decommission with {{nodetool disablebinary}} but it > feels like something we might want to do automatically.
[jira] [Updated] (CASSANDRA-16840) Close native transport port before hint transfer during decommission
[ https://issues.apache.org/jira/browse/CASSANDRA-16840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Fleming updated CASSANDRA-16840: - Source Control Link: https://github.com/mfleming/cassandra/commit/ff07793d04823d39735190d930260bfeea6df59f > Close native transport port before hint transfer during decommission > > > Key: CASSANDRA-16840 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16840 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Hints >Reporter: Matt Fleming >Priority: Normal > Fix For: 4.x > > > New hints can be generated on a node when it's decommissioning which is a > problem if the node has already started hint transfer because any hints that > come in after the transfer has begun will remain on-disk and not be > transferred to a peer. > You can work around this problem by manually closing the native transport > port before starting the decommission with {{nodetool disablebinary}} but it > feels like something we might want to do automatically.
[jira] [Updated] (CASSANDRA-16840) Close native transport port before hint transfer during decommission
[ https://issues.apache.org/jira/browse/CASSANDRA-16840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Fleming updated CASSANDRA-16840: - Source Control Link: (was: https://github.com/mfleming/cassandra/commit/ff07793d04823d39735190d930260bfeea6df59f) > Close native transport port before hint transfer during decommission > > > Key: CASSANDRA-16840 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16840 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Hints >Reporter: Matt Fleming >Priority: Normal > Fix For: 4.x > > > New hints can be generated on a node when it's decommissioning which is a > problem if the node has already started hint transfer because any hints that > come in after the transfer has begun will remain on-disk and not be > transferred to a peer. > You can work around this problem by manually closing the native transport > port before starting the decommission with {{nodetool disablebinary}} but it > feels like something we might want to do automatically.
[jira] [Created] (CASSANDRA-16840) Close native transport port before hint transfer during decommission
Matt Fleming created CASSANDRA-16840: Summary: Close native transport port before hint transfer during decommission Key: CASSANDRA-16840 URL: https://issues.apache.org/jira/browse/CASSANDRA-16840 Project: Cassandra Issue Type: Improvement Components: Consistency/Hints Reporter: Matt Fleming New hints can be generated on a node when it's decommissioning which is a problem if the node has already started hint transfer because any hints that come in after the transfer has begun will remain on-disk and not be transferred to a peer. You can work around this problem by manually closing the native transport port before starting the decommission with {{nodetool disablebinary}} but it feels like something we might want to do automatically.
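The ordering problem described in CASSANDRA-16840 can be modelled with a small simulation. It is a deliberately simplified sketch, not Cassandra's decommission code: the `Node` class, its fields, and the assumption that every accepted client write produces exactly one hint are hypothetical simplifications.

```python
class Node:
    """Toy node: client writes arrive over the native transport and create hints."""

    def __init__(self) -> None:
        self.native_transport_open = True
        self.hints_on_disk = []
        self.transferred = []

    def client_write(self, hint: str) -> bool:
        # With the native transport closed, the node accepts no client writes,
        # so it generates no new hints.
        if not self.native_transport_open:
            return False
        self.hints_on_disk.append(hint)
        return True

    def transfer_hints(self) -> None:
        # Ship whatever is on disk at the moment the transfer starts.
        self.transferred.extend(self.hints_on_disk)
        self.hints_on_disk.clear()


def decommission(node: Node, close_transport_first: bool, late_writes: list) -> list:
    if close_transport_first:
        node.native_transport_open = False   # analogous to `nodetool disablebinary`
    node.transfer_hints()
    for hint in late_writes:                 # writes arriving after transfer began
        node.client_write(hint)
    return list(node.hints_on_disk)          # hints left behind on the leaving node


racy = Node()
racy.client_write("h1")
stranded = decommission(racy, close_transport_first=False, late_writes=["h2"])

safe = Node()
safe.client_write("h1")
none_stranded = decommission(safe, close_transport_first=True, late_writes=["h2"])

print(stranded, none_stranded)
```

In the first run the late hint stays on disk and is never transferred; closing the transport before the transfer removes that window, which is what the ticket proposes to automate.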
[jira] [Commented] (CASSANDRA-16668) Intermittent failure of SEPExecutorTest.changingMaxWorkersMeetsConcurrencyGoalsTest caused by race condition when shrinking maximum pool size to zero
[ https://issues.apache.org/jira/browse/CASSANDRA-16668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17349349#comment-17349349 ] Matt Fleming commented on CASSANDRA-16668: -- [~adelapena] your changes look good! Thanks for doing that. +1 > Intermittent failure of > SEPExecutorTest.changingMaxWorkersMeetsConcurrencyGoalsTest caused by race > condition when shrinking maximum pool size to zero > - > > Key: CASSANDRA-16668 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16668 > Project: Cassandra > Issue Type: Bug > Components: Local/Other >Reporter: Matt Fleming >Assignee: Matt Fleming >Priority: Normal > Fix For: 4.0-rc > > > A difficult-to-hit race condition exists in > changingMaxWorkersMeetsConcurrencyGoalsTest when changing the maximum pool > size from 0 -> 4 which results in the test failing like so: > {{junit.framework.AssertionFailedError: Test tasks did not hit max > concurrency goal expected: but > was:junit.framework.AssertionFailedError: Test tasks did not hit max > concurrency goal expected: but was: at > org.apache.cassandra.concurrent.SEPExecutorTest.assertMaxTaskConcurrency(SEPExecutorTest.java:198) > at > org.apache.cassandra.concurrent.SEPExecutorTest.changingMaxWorkersMeetsConcurrencyGoalsTest(SEPExecutorTest.java:132)}} > I can hit this issue maybe 2/3 times for every 100 invocations of the unit > test. > The issue that causes the failure is that if tasks are still enqueued when > the maximum pool size is set to zero and if all of the SEPWorker threads > enter the STOP state before the pool size is bumped to 4, then no SEPWorker > threads will be spun up to service the task queue. This causes the above > error. > Why don't we spin up SEPWorker threads when enqueuing tasks? 
Because of the > guard logic in addTask: > [https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/concurrent/SEPExecutor.java#L113,L121] > In this scenario taskPermits will not be zero (because we have tasks on the > queue) so we never call {{maybeStartSpinningWorker()}}. > A trick to make this issue much easier to hit is to insert a > {{Thread.sleep(500)}} immediately after setting the pool size to zero. This > has the effect of guaranteeing that all SEPWorker threads will be STOP'd > before enqueueing more work. > Here's a fix that attempts to spin up an SEPWorker whenever we grow the > number of work permits: > https://github.com/mfleming/cassandra/commit/071516d29e41da9924af24e8002822d3c6af0e01
[jira] [Commented] (CASSANDRA-16668) Intermittent failure of SEPExecutorTest.changingMaxWorkersMeetsConcurrencyGoalsTest caused by race condition when shrinking maximum pool size to zero
[ https://issues.apache.org/jira/browse/CASSANDRA-16668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17346892#comment-17346892 ] Matt Fleming commented on CASSANDRA-16668: -- I think that failure is probably unrelated because we saw similar failures for shutdownTest before this patch here [https://app.circleci.com/pipelines/github/adelapena/cassandra/441/workflows/bcf154ff-0b56-48ed-9f82-6b3e395f53ed/jobs/3880/tests#failed-test-0] Btw, I've also written a new unit test to catch this bug in the future: [https://github.com/mfleming/cassandra/commit/b4f43608c9a8db23a622608804d95629616a66da] > Intermittent failure of > SEPExecutorTest.changingMaxWorkersMeetsConcurrencyGoalsTest caused by race > condition when shrinking maximum pool size to zero > - > > Key: CASSANDRA-16668 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16668 > Project: Cassandra > Issue Type: Bug > Components: Local/Other >Reporter: Matt Fleming >Assignee: Matt Fleming >Priority: Normal > Fix For: 4.0-rc > > > A difficult-to-hit race condition exists in > changingMaxWorkersMeetsConcurrencyGoalsTest when changing the maximum pool > size from 0 -> 4 which results in the test failing like so: > {{junit.framework.AssertionFailedError: Test tasks did not hit max > concurrency goal expected: but > was:junit.framework.AssertionFailedError: Test tasks did not hit max > concurrency goal expected: but was: at > org.apache.cassandra.concurrent.SEPExecutorTest.assertMaxTaskConcurrency(SEPExecutorTest.java:198) > at > org.apache.cassandra.concurrent.SEPExecutorTest.changingMaxWorkersMeetsConcurrencyGoalsTest(SEPExecutorTest.java:132)}} > I can hit this issue maybe 2/3 times for every 100 invocations of the unit > test. 
> The issue that causes the failure is that if tasks are still enqueued when
> the maximum pool size is set to zero and if all of the SEPWorker threads
> enter the STOP state before the pool size is bumped to 4, then no SEPWorker
> threads will be spun up to service the task queue. This causes the above
> error.
> Why don't we spin up SEPWorker threads when enqueuing tasks? Because of the
> guard logic in addTask:
> [https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/concurrent/SEPExecutor.java#L113,L121]
> In this scenario taskPermits will not be zero (because we have tasks on the
> queue) so we never call {{maybeStartSpinningWorker()}}.
> A trick to make this issue much easier to hit is to insert a
> {{Thread.sleep(500)}} immediately after setting the pool size to zero. This
> has the effect of guaranteeing that all SEPWorker threads will be STOP'd
> before enqueueing more work.
> Here's a fix that attempts to spin up an SEPWorker whenever we grow the
> number of work permits:
> https://github.com/mfleming/cassandra/commit/071516d29e41da9924af24e8002822d3c6af0e01
[jira] [Created] (CASSANDRA-16668) Intermittent failure of SEPExecutorTest.changingMaxWorkersMeetsConcurrencyGoalsTest caused by race condition when shrinking maximum pool size to zero
Matt Fleming created CASSANDRA-16668:

Summary: Intermittent failure of SEPExecutorTest.changingMaxWorkersMeetsConcurrencyGoalsTest caused by race condition when shrinking maximum pool size to zero
Key: CASSANDRA-16668
URL: https://issues.apache.org/jira/browse/CASSANDRA-16668
Project: Cassandra
Issue Type: Bug
Reporter: Matt Fleming

A difficult-to-hit race condition exists in changingMaxWorkersMeetsConcurrencyGoalsTest when changing the maximum pool size from 0 -> 4, which results in the test failing like so:

{{junit.framework.AssertionFailedError: Test tasks did not hit max concurrency goal expected: but was:
junit.framework.AssertionFailedError: Test tasks did not hit max concurrency goal expected: but was:
 at org.apache.cassandra.concurrent.SEPExecutorTest.assertMaxTaskConcurrency(SEPExecutorTest.java:198)
 at org.apache.cassandra.concurrent.SEPExecutorTest.changingMaxWorkersMeetsConcurrencyGoalsTest(SEPExecutorTest.java:132)}}

I can hit this issue maybe 2 or 3 times in every 100 invocations of the unit test.

The failure occurs when tasks are still enqueued at the moment the maximum pool size is set to zero and all of the SEPWorker threads enter the STOP state before the pool size is bumped to 4. In that case no SEPWorker threads are spun up to service the task queue, which causes the above error.

Why don't we spin up SEPWorker threads when enqueuing tasks? Because of the guard logic in addTask:
[https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/concurrent/SEPExecutor.java#L113,L121]
In this scenario taskPermits will not be zero (because we have tasks on the queue) so we never call {{maybeStartSpinningWorker()}}.

A trick to make this issue much easier to hit is to insert a {{Thread.sleep(500)}} immediately after setting the pool size to zero. This has the effect of guaranteeing that all SEPWorker threads will be STOP'd before enqueueing more work.
Here's a fix that attempts to spin up an SEPWorker whenever we grow the number of work permits: https://github.com/mfleming/cassandra/commit/071516d29e41da9924af24e8002822d3c6af0e01
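The scenario described in this ticket can be illustrated with a toy, single-threaded model. This is a sketch under stated assumptions, not the SEPExecutor code and not the linked fix: the class `ToyExecutor`, its fields, and the `start_worker_on_grow` flag are hypothetical names; workers in the model never drain the queue; and setting the maximum to zero is modelled as every worker stopping immediately. It only shows why a guard that starts a worker solely when the queue was previously empty can leave queued tasks unserviced once the pool grows from 0 to 4, and how starting a worker on growth avoids that.

```python
class ToyExecutor:
    """Toy model of the guard described in the ticket (hypothetical, not Cassandra code)."""

    def __init__(self, max_workers, start_worker_on_grow=False):
        self.max_workers = max_workers
        self.queued = 0                  # stands in for taskPermits
        self.running_workers = 0         # workers that have not reached STOP
        self.start_worker_on_grow = start_worker_on_grow

    def add_task(self):
        previously_queued = self.queued
        self.queued += 1
        # Guard: only try to start a worker if nothing was queued before this task.
        if previously_queued == 0:
            self._maybe_start_worker()

    def set_max_workers(self, new_max):
        grew = new_max > self.max_workers
        self.max_workers = new_max
        if new_max == 0:
            # Assumption: models every worker entering the STOP state.
            self.running_workers = 0
        elif grew and self.start_worker_on_grow:
            # Idea of the fix: try to start a worker whenever the pool grows.
            self._maybe_start_worker()

    def _maybe_start_worker(self):
        if self.queued > 0 and self.running_workers < self.max_workers:
            self.running_workers += 1


def simulate(start_worker_on_grow):
    """Replay the ticket's sequence and return how many workers end up running."""
    executor = ToyExecutor(max_workers=4, start_worker_on_grow=start_worker_on_grow)
    executor.add_task()          # queue was empty, so a worker is started
    executor.add_task()          # queue non-empty, guard skips the start
    executor.set_max_workers(0)  # all workers stop while tasks remain queued
    executor.add_task()          # queue non-empty, guard skips the start again
    executor.set_max_workers(4)  # pool grows back to 4
    return executor.running_workers


if __name__ == "__main__":
    print("without start-on-grow:", simulate(False))
    print("with start-on-grow:", simulate(True))
```

Under these assumptions, `simulate(False)` ends with no running workers even though three tasks are queued, while `simulate(True)` ends with one worker running.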
[jira] [Updated] (CASSANDRA-16632) Add gossip tests from CASSANDRA-16588
[ https://issues.apache.org/jira/browse/CASSANDRA-16632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Fleming updated CASSANDRA-16632: - Authors: Matt Fleming Description: While working on CASSANDRA-16588 I had some tests that were very useful for getting the cluster gossip state into particular configurations and they even caught some oversights in the original suggestion for CASSANDRA-16588's fix. Patch here: https://github.com/mfleming/cassandra-dtest/commit/f3eb50f33444da3ea599f2d51129b54f2024ead4 was:While working on CASSANDRA-16588 I had some tests that were very useful for getting the cluster gossip state into particular configurations and they even caught some oversights in the original suggestion for CASSANDRA-16588's fix. > Add gossip tests from CASSANDRA-16588 > - > > Key: CASSANDRA-16632 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16632 > Project: Cassandra > Issue Type: Improvement > Components: Test/dtest/python >Reporter: Matt Fleming >Priority: Normal > > While working on CASSANDRA-16588 I had some tests that were very useful for > getting the cluster gossip state into particular configurations and they even > caught some oversights in the original suggestion for CASSANDRA-16588's fix. > Patch here: > https://github.com/mfleming/cassandra-dtest/commit/f3eb50f33444da3ea599f2d51129b54f2024ead4
[jira] [Updated] (CASSANDRA-16632) Add gossip tests from CASSANDRA-16588
[ https://issues.apache.org/jira/browse/CASSANDRA-16632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Fleming updated CASSANDRA-16632: - Component/s: Test/dtest/python > Add gossip tests from CASSANDRA-16588 > - > > Key: CASSANDRA-16632 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16632 > Project: Cassandra > Issue Type: Improvement > Components: Test/dtest/python >Reporter: Matt Fleming >Priority: Normal > > While working on CASSANDRA-16588 I had some tests that were very useful for > getting the cluster gossip state into particular configurations and they even > caught some oversights in the original suggestion for CASSANDRA-16588's fix.
[jira] [Created] (CASSANDRA-16632) Add gossip tests from CASSANDRA-16588
Matt Fleming created CASSANDRA-16632: Summary: Add gossip tests from CASSANDRA-16588 Key: CASSANDRA-16632 URL: https://issues.apache.org/jira/browse/CASSANDRA-16632 Project: Cassandra Issue Type: Improvement Reporter: Matt Fleming While working on CASSANDRA-16588 I had some tests that were very useful for getting the cluster gossip state into particular configurations and they even caught some oversights in the original suggestion for CASSANDRA-16588's fix.
[jira] [Commented] (CASSANDRA-16623) Remove references to run_dtests from README
[ https://issues.apache.org/jira/browse/CASSANDRA-16623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17329194#comment-17329194 ]

Matt Fleming commented on CASSANDRA-16623:

I think the big problem is that run_dtests.py doesn't actually provide any useful output (see the suspected issue with pipe buffering mentioned in the GH PR), which makes it a bad introduction for people with less experience. Regardless of whether the executed dtest passes or fails, nothing is displayed to the user after the "test session starts" line.

> Remove references to run_dtests from README
> ---
>
> Key: CASSANDRA-16623
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16623
> Project: Cassandra
> Issue Type: Improvement
> Components: Test/dtest/python
> Reporter: Matt Fleming
> Assignee: Matt Fleming
> Priority: Low
> Fix For: 4.0.x
>
> Newcomers to cassandra-dtest that look through README.md will see that the
> run_dtests.py script is the quickest way to get started running tests.
> Unfortunately, the script has a number of problems and I'm not sure it ever
> worked properly after the move to the pytest framework.
> h2. Process stdout/stderr buffering
> Firstly, when I execute run_dtests.py I don't see any output after
> {{$ ./run_dtests.py --dtest-tests paging_test.py }}
> {{= test session starts
> ==}}
> This looks likely to be because of the buffering that pytest does internally
> for stdout and stderr and because of the way that it's executed by
> run_dtests.py, i.e. I suspect that run_dtests.py is blocked on the following
> line for most of the execution because there's no data available in the pipe
> for stderr:
> {{stderr_output = sp.stderr.readline()}}
> See also [https://github.com/pytest-dev/pytest/issues/1886]
> h2. --pytest-options doesn't work
> Secondly, the options specified in --pytest-options aren't actually passed
> through to pytest.
> h2. Most devs run pytest directly
> When I spoke to [~edimitrova] it seemed like most developers just run the
> tests directly with pytest which would explain why run_dtests.py has
> bitrotted.
[jira] [Updated] (CASSANDRA-16623) Remove references to run_dtests from README
[ https://issues.apache.org/jira/browse/CASSANDRA-16623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Fleming updated CASSANDRA-16623: - Description: Newcomers to cassandra-dtest that look through README.md will see that the run_dtests.py script is the quickest way to get started running tests. Unfortunately, the script has a number of problems and I'm not sure it ever work properly after the move to the pytest framework. h2. Process stdout/stderr buffering Firstly, when I execute run_dtests.py I don't see any output after {{$ ./run_dtests.py --dtest-tests paging_test.py }} {{= test session starts ==}} This looks likely to be because of the buffering that pytest does internally for stdout and stderr and because of the way that it's executed by run_dtests.py, i.e. I suspect that run_dtests.py is blocked on the following line for most of the execution because there's no data available in the pipe for stderr: {{stderr_output = sp.stderr.readline()}} See also [https://github.com/pytest-dev/pytest/issues/1886] h2. --pytest-options doesn't work Secondly, the options specified in --pytest-options aren't actually passed through to pytest. h2. Most devs run pytest directly When I spoke to [~edimitrova] it seemed like most developers just run the tests directly with pytest which would explain why run_dtests.py has bitrotted. was: Newcomers to cassandra-dtest that look through README.md will see that the run_dtests.py script is the quickest way to get started running tests. Unfortunately, the script has a number of problems and I'm not sure it ever work properly after the move to the pytest framework. h2. Process stdout/stderr buffering Firstly, when I execute run_dtests.py I don't see any output after {{$ ./run_dtests.py --dtest-tests paging_test.py }} {{ = test session starts ==}} This looks likely to be because of the buffering that pytest does internally for stdout and stderr and because of the way that it's executed by run_dtests.py, i.e. 
I suspect that run_dtests.py is blocked on the following line for most of the execution because there's no data available in the pipe for stderr: {{stderr_output = sp.stderr.readline()}} See also https://github.com/pytest-dev/pytest/issues/1886 h2. --pytest-options doesn't work Secondly, the options specified in --pytest-options aren't actually passed through to pytest. h2. Most devs run pytest directly When I spoke to [~edimitrova] it seemed like most developers just run the tests directly with pytest which would explain why run_dtests.py has bitrotted. > Remove references to run_dtests from README > --- > > Key: CASSANDRA-16623 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16623 > Project: Cassandra > Issue Type: Improvement > Components: Test/dtest/python >Reporter: Matt Fleming >Priority: Normal > > Newcomers to cassandra-dtest that look through README.md will see that the > run_dtests.py script is the quickest way to get started running tests. > Unfortunately, the script has a number of problems and I'm not sure it ever > work properly after the move to the pytest framework. > h2. Process stdout/stderr buffering > Firstly, when I execute run_dtests.py I don't see any output after > {{$ ./run_dtests.py --dtest-tests paging_test.py }} > {{= test session starts > ==}} > This looks likely to be because of the buffering that pytest does internally > for stdout and stderr and because of the way that it's executed by > run_dtests.py, i.e. I suspect that run_dtests.py is blocked on the following > line for most of the execution because there's no data available in the pipe > for stderr: > {{stderr_output = sp.stderr.readline()}} > See also [https://github.com/pytest-dev/pytest/issues/1886] > h2. --pytest-options doesn't work > Secondly, the options specified in --pytest-options aren't actually passed > through to pytest. > h2. 
Most devs run pytest directly > When I spoke to [~edimitrova] it seemed like most developers just run the > tests directly with pytest which would explain why run_dtests.py has > bitrotted.
[jira] [Updated] (CASSANDRA-16623) Remove references to run_dtests from README
[ https://issues.apache.org/jira/browse/CASSANDRA-16623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Fleming updated CASSANDRA-16623: - Description: Newcomers to cassandra-dtest that look through README.md will see that the run_dtests.py script is the quickest way to get started running tests. Unfortunately, the script has a number of problems and I'm not sure it ever work properly after the move to the pytest framework. h2. Process stdout/stderr buffering Firstly, when I execute run_dtests.py I don't see any output after {{$ ./run_dtests.py --dtest-tests paging_test.py }} {{ = test session starts ==}} This looks likely to be because of the buffering that pytest does internally for stdout and stderr and because of the way that it's executed by run_dtests.py, i.e. I suspect that run_dtests.py is blocked on the following line for most of the execution because there's no data available in the pipe for stderr: {{stderr_output = sp.stderr.readline()}} See also https://github.com/pytest-dev/pytest/issues/1886 h2. --pytest-options doesn't work Secondly, the options specified in --pytest-options aren't actually passed through to pytest. h2. Most devs run pytest directly When I spoke to [~edimitrova] it seemed like most developers just run the tests directly with pytest which would explain why run_dtests.py has bitrotted. was: Newcomers to cassandra-dtest that look through README.md will see that the run_dtests.py script is the quickest way to get started running tests. Unfortunately, the script has a number of problems and I'm not sure it ever work properly after the move to the pytest framework. h2. Process stdout/stderr buffering Firstly, when I execute run_dtests.py I don't see any output after $ ./run_dtests.py --dtest-tests paging_test.py = test session starts == This looks likely to be because of the buffering that pytest does internally for stdout and stderr and because of the way that it's executed by run_dtests.py, i.e. 
I suspect that run_dtests.py is blocked on the following line for most of the execution because there's no data available in the pipe for stderr: stderr_output = sp.stderr.readline() See also pytest-dev/pytest#1886 h2. --pytest-options doesn't work Secondly, the options specified in --pytest-options aren't actually passed through to pytest. h2. Most devs run pytest directly When I spoke to [~edimitrova] it seemed like most developers just run the tests directly with pytest which would explain why run_dtests.py has bitrotted. > Remove references to run_dtests from README > --- > > Key: CASSANDRA-16623 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16623 > Project: Cassandra > Issue Type: Improvement > Components: Test/dtest/python >Reporter: Matt Fleming >Priority: Normal > > Newcomers to cassandra-dtest that look through README.md will see that the > run_dtests.py script is the quickest way to get started running tests. > Unfortunately, the script has a number of problems and I'm not sure it ever > work properly after the move to the pytest framework. > h2. Process stdout/stderr buffering > Firstly, when I execute run_dtests.py I don't see any output after > {{$ ./run_dtests.py --dtest-tests paging_test.py }} > {{ = test session starts > ==}} > This looks likely to be because of the buffering that pytest does internally > for stdout and stderr and because of the way that it's executed by > run_dtests.py, i.e. I suspect that run_dtests.py is blocked on the following > line for most of the execution because there's no data available in the pipe > for stderr: > {{stderr_output = sp.stderr.readline()}} > See also https://github.com/pytest-dev/pytest/issues/1886 > h2. --pytest-options doesn't work > Secondly, the options specified in --pytest-options aren't actually passed > through to pytest. > h2. 
Most devs run pytest directly > When I spoke to [~edimitrova] it seemed like most developers just run the > tests directly with pytest which would explain why run_dtests.py has > bitrotted.
[jira] [Updated] (CASSANDRA-16623) Remove references to run_dtests from README
[ https://issues.apache.org/jira/browse/CASSANDRA-16623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Fleming updated CASSANDRA-16623: - Description: Newcomers to cassandra-dtest that look through README.md will see that the run_dtests.py script is the quickest way to get started running tests. Unfortunately, the script has a number of problems and I'm not sure it ever work properly after the move to the pytest framework. h2. Process stdout/stderr buffering Firstly, when I execute run_dtests.py I don't see any output after $ ./run_dtests.py --dtest-tests paging_test.py = test session starts == This looks likely to be because of the buffering that pytest does internally for stdout and stderr and because of the way that it's executed by run_dtests.py, i.e. I suspect that run_dtests.py is blocked on the following line for most of the execution because there's no data available in the pipe for stderr: stderr_output = sp.stderr.readline() See also pytest-dev/pytest#1886 h2. --pytest-options doesn't work Secondly, the options specified in --pytest-options aren't actually passed through to pytest. h2. Most devs run pytest directly When I spoke to [~edimitrova] it seemed like most developers just run the tests directly with pytest which would explain why run_dtests.py has bitrotted. was: Newcomers to cassandra-dtest that look through README.md will see that the run_dtests.py script is the quickest way to get started running tests. Unfortunately, the script has a number of problems and I'm not sure it ever work properly after the move to the pytest framework. h2. Process stdout/stderr buffering Firstly, when I execute run_dtests.py I don't see any output after $ ./run_dtests.py --dtest-tests paging_test.py = test session starts == This looks likely to be because of the buffering that pytest does internally for stdout and stderr and because of the way that it's executed by run_dtests.py, i.e. 
I suspect that run_dtests.py is blocked on the following line for most of the execution because there's no data available in the pipe for stderr: stderr_output = sp.stderr.readline() See also pytest-dev/pytest#1886 h2. --pytest-options doesn't work Secondly, the options specified in --pytest-options aren't actually passed through to pytest. h2. Most devs run pytest directly When I spoke to @ekaterinadimitrova2 it seemed like most developers just run the tests directly with pytest which would explain why run_dtests.py has bitrotted. > Remove references to run_dtests from README > --- > > Key: CASSANDRA-16623 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16623 > Project: Cassandra > Issue Type: Improvement > Components: Test/dtest/python >Reporter: Matt Fleming >Priority: Normal > > Newcomers to cassandra-dtest that look through README.md will see that the > run_dtests.py script is the quickest way to get started running tests. > Unfortunately, the script has a number of problems and I'm not sure it ever > work properly after the move to the pytest framework. > h2. Process stdout/stderr buffering > Firstly, when I execute run_dtests.py I don't see any output after > $ ./run_dtests.py --dtest-tests paging_test.py > = test session starts > == > This looks likely to be because of the buffering that pytest does internally > for stdout and stderr and because of the way that it's executed by > run_dtests.py, i.e. I suspect that run_dtests.py is blocked on the following > line for most of the execution because there's no data available in the pipe > for stderr: > stderr_output = sp.stderr.readline() > See also pytest-dev/pytest#1886 > h2. --pytest-options doesn't work > Secondly, the options specified in --pytest-options aren't actually passed > through to pytest. > h2. Most devs run pytest directly > When I spoke to [~edimitrova] it seemed like most developers just run the > tests directly with pytest which would explain why run_dtests.py has > bitrotted. 
[jira] [Commented] (CASSANDRA-16623) Remove references to run_dtests from README
[ https://issues.apache.org/jira/browse/CASSANDRA-16623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17326599#comment-17326599 ]

Matt Fleming commented on CASSANDRA-16623:

A patch to update README.md can be found here: https://github.com/mfleming/cassandra-dtest/tree/run_dtests

> Remove references to run_dtests from README
> ---
>
> Key: CASSANDRA-16623
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16623
> Project: Cassandra
> Issue Type: Improvement
> Components: Test/dtest/python
> Reporter: Matt Fleming
> Priority: Normal
>
> Newcomers to cassandra-dtest that look through README.md will see that the
> run_dtests.py script is the quickest way to get started running tests.
> Unfortunately, the script has a number of problems and I'm not sure it ever
> worked properly after the move to the pytest framework.
> h2. Process stdout/stderr buffering
> Firstly, when I execute run_dtests.py I don't see any output after
> $ ./run_dtests.py --dtest-tests paging_test.py
> = test session starts
> ==
> This looks likely to be because of the buffering that pytest does internally
> for stdout and stderr and because of the way that it's executed by
> run_dtests.py, i.e. I suspect that run_dtests.py is blocked on the following
> line for most of the execution because there's no data available in the pipe
> for stderr:
> stderr_output = sp.stderr.readline()
> See also pytest-dev/pytest#1886
> h2. --pytest-options doesn't work
> Secondly, the options specified in --pytest-options aren't actually passed
> through to pytest.
> h2. Most devs run pytest directly
> When I spoke to @ekaterinadimitrova2 it seemed like most developers just run
> the tests directly with pytest which would explain why run_dtests.py has
> bitrotted.
[jira] [Created] (CASSANDRA-16623) Remove references to run_dtests from README
Matt Fleming created CASSANDRA-16623:

Summary: Remove references to run_dtests from README
Key: CASSANDRA-16623
URL: https://issues.apache.org/jira/browse/CASSANDRA-16623
Project: Cassandra
Issue Type: Improvement
Components: Test/dtest/python
Reporter: Matt Fleming

Newcomers to cassandra-dtest that look through README.md will see that the run_dtests.py script is the quickest way to get started running tests. Unfortunately, the script has a number of problems and I'm not sure it ever worked properly after the move to the pytest framework.

h2. Process stdout/stderr buffering

Firstly, when I execute run_dtests.py I don't see any output after

$ ./run_dtests.py --dtest-tests paging_test.py
= test session starts ==

This looks likely to be because of the buffering that pytest does internally for stdout and stderr and because of the way that it's executed by run_dtests.py, i.e. I suspect that run_dtests.py is blocked on the following line for most of the execution because there's no data available in the pipe for stderr:

stderr_output = sp.stderr.readline()

See also pytest-dev/pytest#1886

h2. --pytest-options doesn't work

Secondly, the options specified in --pytest-options aren't actually passed through to pytest.

h2. Most devs run pytest directly

When I spoke to @ekaterinadimitrova2 it seemed like most developers just run the tests directly with pytest which would explain why run_dtests.py has bitrotted.
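The suspected stall on `sp.stderr.readline()` can be sidestepped by reading from a single merged pipe. The following is a minimal sketch of that idea, not the actual run_dtests.py code and not the patch linked from this ticket: the helper name `stream_output` and the demo child command are assumptions made for illustration. It uses only the standard `subprocess` module, folds the child's stderr into stdout so that a quiet stderr cannot block the read loop, and echoes each line as it arrives.

```python
import subprocess
import sys


def stream_output(cmd):
    """Run cmd, echo each output line as it arrives, return (exit_code, lines)."""
    lines = []
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # one pipe: an idle stderr cannot stall the loop
        text=True,
        bufsize=1,                 # line-buffered on the parent's side
    )
    # Iterating the pipe yields a line whenever the child flushes one.
    for line in proc.stdout:
        sys.stdout.write(line)
        lines.append(line.rstrip("\n"))
    return proc.wait(), lines


if __name__ == "__main__":
    # Hypothetical stand-in for a test run; -u keeps the child's own output unbuffered.
    demo_child = [
        sys.executable, "-u", "-c",
        "import sys; print('test session starts'); print('warning', file=sys.stderr)",
    ]
    exit_code, captured = stream_output(demo_child)
    print("exit code:", exit_code, "lines captured:", len(captured))
```

This only addresses the parent's side of the pipe. Any buffering or output capture done inside the child process is a separate matter and would still need to be handled there.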
[jira] [Created] (CASSANDRA-16622) Remove references to run_dtests from README
Matt Fleming created CASSANDRA-16622:

Summary: Remove references to run_dtests from README
Key: CASSANDRA-16622
URL: https://issues.apache.org/jira/browse/CASSANDRA-16622
Project: Cassandra
Issue Type: Improvement
Components: Test/dtest/python
Reporter: Matt Fleming

Newcomers to cassandra-dtest that look through README.md will see that the run_dtests.py script is the quickest way to get started running tests. Unfortunately, the script has a number of problems and I'm not sure it ever worked properly after the move to the pytest framework.

h2. Process stdout/stderr buffering

Firstly, when I execute run_dtests.py I don't see any output after

$ ./run_dtests.py --dtest-tests paging_test.py
= test session starts ==

This looks likely to be because of the buffering that pytest does internally for stdout and stderr and because of the way that it's executed by run_dtests.py, i.e. I suspect that run_dtests.py is blocked on the following line for most of the execution because there's no data available in the pipe for stderr:

stderr_output = sp.stderr.readline()

See also pytest-dev/pytest#1886

h2. --pytest-options doesn't work

Secondly, the options specified in --pytest-options aren't actually passed through to pytest. I ran into this when trying to get more output during the execution of run_dtests.py (see above).

h2. Most devs run pytest directly

When I spoke to @ekaterinadimitrova2 it seemed like most developers just run the tests directly with pytest which would explain why run_dtests.py has bitrotted.
[jira] [Commented] (CASSANDRA-16588) NPE getting host_id in Gossiper.isSafeForStartup
[ https://issues.apache.org/jira/browse/CASSANDRA-16588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17323917#comment-17323917 ]

Matt Fleming commented on CASSANDRA-16588:

I think the patch from Sam is a bit too aggressive: it will incorrectly treat gossip data for the local node that contains dead states ("left", "removing", "hibernate", etc.) as the bad ACK that we're trying to detect to avoid the NPE in isSafeForStartup. You should be able to trigger this by assassinating a non-seed node in a cluster.

We should probably filter out deadStates because they won't trigger the NPE. Something like this: https://github.com/mfleming/cassandra/commit/e68602ae300e6a2567e1b59efa4229ff3456e521

> NPE getting host_id in Gossiper.isSafeForStartup
>
> Key: CASSANDRA-16588
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16588
> Project: Cassandra
> Issue Type: Bug
> Components: Cluster/Gossip
> Reporter: Brandon Williams
> Assignee: Brandon Williams
> Priority: Normal
> Fix For: 3.11.x, 4.0-rc
>
> As seen here:
> https://ci-cassandra.apache.org/job/Cassandra-devbranch/604/testReport/junit/org.apache.cassandra.distributed.upgrade/MixedModeGossipTest/testStatusFieldShouldExistInOldVersionNodesEdgeCase/
> {noformat}
> java.lang.NullPointerException
> at org.apache.cassandra.gms.Gossiper.isSafeForStartup(Gossiper.java:952)
> at org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:657)
> at org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:933)
> at org.apache.cassandra.service.StorageService.initServer(StorageService.java:784)
> at org.apache.cassandra.service.StorageService.initServer(StorageService.java:729)
> at org.apache.cassandra.distributed.impl.Instance.lambda$startup$10(Instance.java:541)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> at java.lang.Thread.run(Thread.java:748)
> {noformat}
> I believe what is happening is a GossipDigestAck has been queued to ack the
> shutdown state from the node on the seed, but isn't actually sent until the
> node has restarted and gone into shadow. Since the ack contains the node's
> IP, it assumes a host_id will be there but since this is not an actual shadow
> response, it is not.