date:20200824

[jira] [Assigned] (CASSANDRA-15988) Add nodetool getfullquerylog

2020-08-24 Thread Stefan Miklosovic (Jira)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic reassigned CASSANDRA-15988:
-

Assignee: Stefan Miklosovic

> Add nodetool getfullquerylog 
> -
>
> Key: CASSANDRA-15988
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15988
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tool/fql
>Reporter: Ekaterina Dimitrova
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.0-rc
>
>
> This ticket is raised based on CASSANDRA-15791 and valuable feedback provided 
> by [~jshook].
> There are two outstanding questions:
>  * forming the exact shape of such a command and how it can benefit the 
> users; to be discussed in detail in this ticket
>  * whether we as a project can add this to 4.0 beta, or whether it should be 
> considered for 4.1, for example
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Assigned] (CASSANDRA-15804) system_schema keyspace complain of schema mismatch during upgrade

2020-08-24 Thread Stefan Miklosovic (Jira)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic reassigned CASSANDRA-15804:
-

Assignee: Stefan Miklosovic

> system_schema keyspace complain of schema mismatch during upgrade
> -
>
> Key: CASSANDRA-15804
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15804
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Pedro Gordo
>Assignee: Stefan Miklosovic
>Priority: Low
> Fix For: 3.11.x, 4.0-beta
>
>
> When upgrading from 3.11.4 to 3.11.6, we got the following error:
> {code:Plain Text}
> ERROR [MessagingService-Incoming-/10.20.11.59] 2020-05-07 13:53:52,627 
> CassandraDaemon.java:228 - Exception in thread 
> Thread[MessagingService-Incoming-/10.20.11.59,5,main]
> java.lang.RuntimeException: Unknown column kind during deserialization
> at 
> org.apache.cassandra.db.Columns$Serializer.deserialize(Columns.java:464) 
> ~[apache-cassandra-3.11.4.jar:3.11.4]
> at 
> org.apache.cassandra.db.SerializationHeader$Serializer.deserializeForMessaging(SerializationHeader.java:419)
>  ~[apache-cassandra-3.11.4.jar:3.11.4]
> at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.deserializeHeader(UnfilteredRowIteratorSerializer.java:195)
>  ~[apache-cassandra-3.11.4.jar:3.11.4]
> at 
> org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize30(PartitionUpdate.java:851)
>  ~[apache-cassandra-3.11.4.jar:3.11.4]
> at 
> org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize(PartitionUpdate.java:839)
>  ~[apache-cassandra-3.11.4.jar:3.11.4]
> at 
> org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:425)
>  ~[apache-cassandra-3.11.4.jar:3.11.4]
> at 
> org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:434)
>  ~[apache-cassandra-3.11.4.jar:3.11.4]
> at 
> org.apache.cassandra.service.MigrationManager$MigrationsSerializer.deserialize(MigrationManager.java:675)
>  ~[apache-cassandra-3.11.4.jar:3.11.4]
> at 
> org.apache.cassandra.service.MigrationManager$MigrationsSerializer.deserialize(MigrationManager.java:658)
>  ~[apache-cassandra-3.11.4.jar:3.11.4]
> at org.apache.cassandra.net.MessageIn.read(MessageIn.java:123) 
> ~[apache-cassandra-3.11.4.jar:3.11.4]
> at 
> org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:192)
>  ~[apache-cassandra-3.11.4.jar:3.11.4]
> at 
> org.apache.cassandra.net.IncomingTcpConnection.receiveMessages(IncomingTcpConnection.java:180)
>  ~[apache-cassandra-3.11.4.jar:3.11.4]
> at 
> org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:94)
>  ~[apache-cassandra-3.11.4.jar:3.11.4]
> {code}
> I've noticed that system_schema.dropped_columns has a new column called 
> "kind".
> No issues arise from this error message, and the error disappeared after 
> upgrading all nodes. But it still caused concerns due to the ERROR logging 
> level, although "nodetool describecluster" reported only one schema version.
> It makes sense for the system keyspaces to not be included for the 
> "describecluster" schema version check, but it seems to me that these 
> internal schema mismatches should be ignored if the versions are different 
> between the nodes.




[jira] [Commented] (CASSANDRA-16070) Add dtest option to keep ccm test directories for just failed tests

2020-08-24 Thread Michael Semb Wever (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-16070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183771#comment-17183771
 ] 

Michael Semb Wever commented on CASSANDRA-16070:


A few dtests 
[broke|https://the-asf.slack.com/archives/CK23JSY2K/p1598314509030400] from 
this commit. The test run above also shows them, mea culpa.

[~brandon.williams] fixed it with 
https://github.com/apache/cassandra-dtest/commit/bc270afa85f11b06c52f7b8abe1d3ef1f6716751

> Add dtest option to keep ccm test directories for just failed tests
> ---
>
> Key: CASSANDRA-16070
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16070
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest
>Reporter: Michael Semb Wever
>Assignee: Michael Semb Wever
>Priority: Normal
> Fix For: 4.0-beta2
>
>
> DTests already have an option {{`--keep-test-dir`}} that keeps the ccm test 
> directories. This is useful for debugging failures, especially those that 
> can't be reproduced locally.
> [Introducing|https://github.com/apache/cassandra-builds/commit/51eb85b57b62a542ca456e52a20bee06955f6ec1#diff-a885314255cf7d5c7c04889bf01aa2ab]
>  this option to ci-cassandra.a.o 
> [failed|https://github.com/apache/cassandra-builds/commit/d1600acde19cdbd906b0ff89318d3e8a3f400a70#diff-a885314255cf7d5c7c04889bf01aa2ab]
>  due to lack of disk space.
> This 
> [patch|https://github.com/apache/cassandra-dtest/compare/master...thelastpickle:mck/keep_failed_test_dir]
>  introduces a new option {{`--keep-failed-test-dir`}} that keeps the ccm test 
> directory only for dtests that fail.
> This should suffice. If disk space is still a problem, a further option of 
> {{`--keep-failed-test-log-dir`}} can be added that only keeps the logs inside 
> the ccm test directory, as the majority of the space taken up by these 
> directories is the cassandra data directories.
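
The keep-on-failure behaviour described above can be sketched in a few lines. This is only an illustration of the decision, not the dtest implementation: `cleanup_test_dir` and its parameters are hypothetical names, and the temporary directories stand in for ccm test directories.

```python
import shutil
import tempfile
from pathlib import Path


def cleanup_test_dir(test_dir: Path, keep_test_dir: bool,
                     keep_failed_test_dir: bool, test_failed: bool) -> bool:
    """Delete test_dir unless an option says to keep it; return True if kept."""
    keep = keep_test_dir or (keep_failed_test_dir and test_failed)
    if not keep:
        shutil.rmtree(test_dir, ignore_errors=True)
    return keep


# Two stand-in test directories: one for a passing test, one for a failing test.
passed_dir = Path(tempfile.mkdtemp(prefix="dtest-"))
failed_dir = Path(tempfile.mkdtemp(prefix="dtest-"))
kept_passed = cleanup_test_dir(passed_dir, False, True, test_failed=False)
kept_failed = cleanup_test_dir(failed_dir, False, True, test_failed=True)
```

With only the keep-failed option set, the directory of the passing test is removed and the directory of the failing test survives, which is what bounds disk usage on a CI agent.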




[jira] [Commented] (CASSANDRA-15937) JMX output inconsistencies from CASSANDRA-7544 storage-port-configurable-per-node

2020-08-24 Thread Michael Semb Wever (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-15937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183765#comment-17183765
 ] 

Michael Semb Wever commented on CASSANDRA-15937:


bq. So, looks like we need the dtest branch to update 
~/cassandra-dtest/requirements.txt to point to the ccm branch? 
https://github.com/apache/cassandra-dtest/blob/master/requirements.txt#L3


That's correct [~dcapwell]. Usually by adding a "throw away" commit that 
contains the requirements.txt hack, and a note on the jira about it and the 
necessary merge order.

> JMX output inconsistencies from CASSANDRA-7544 
> storage-port-configurable-per-node
> -
>
> Key: CASSANDRA-15937
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15937
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/JMX
>Reporter: Jon Meredith
>Assignee: Jon Meredith
>Priority: Normal
> Fix For: 4.0-beta
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> CASSANDRA-7544 introduced changes to allow the storage port number to be 
> configured per-node. As part of that work it introduces new MBeans for 
> MessagingService, FailureDetector providing new 'WithPort' versions that 
> include the new port information, however there are some mistakes and 
> inconsistencies.
> {code:java}
>                           3.11.6                 trunk                      trunk w/Port               Notes
>
>  AllEndpointStates        /127.0.0.1\n...        /127.0.0.3\n...            127.0.0.3:7000\n           (trunk /w port different)
>  SimpleStates             /127.0.0.2=UP          /127.0.0.2=UP              127.0.0.3:7000=UP          (trunk /w port different)
>  LargeMessagePendingTasks /127.0.0.1=0           /127.0.0.1=0               127.0.0.3:7000=0           (trunk /w port different)
>  TimeoutsPerHost          127.0.0.1=0            /127.0.0.1=0               127.0.0.3:7000=0           3.0/3.11.6 & trunk differ.
>  BackPressurePerHost      127.0.0.1=Infinity     /127.0.0.2=Infinity        /127.0.0.2=Infinity        3.11 & trunk differ, missing port number for BackPressurePerHostWithPort
>  SchemaVersions           {...=[127.0.0.1,...]}  {...=[127.0.0.1,...]}      {...=[127.0.0.1:7000,...]
>
>  TokenToEndpointMap       {-92...8=127.0.0.1,    -92...8=127.0.0.1          -92..8=127.0.0.1:7000
>  HostIdMap                127.0.0.1=1ee..6f0af   127.0.0.1=e06...7e         MISSING                    Deprecated for EndpointToHostId
>  EndpointToHostId         127.0.0.1=1ee..6f0a    127.0.0.1=e06...7e         127.0.0.1:7000=e0..7e
>  HostIdToEndpoint         1ee..6f0a=127.0.0.1    e06..7e=127.0.0.1          e06..7e=127.0.0.1:7000
>  LoadMap                  127.0.0.1=185.01 KiB   127.0.0.1:7000=106.08 KiB  127.0.0.1=106.08 Ki        LoadMap and LoadMapWithPort are flipped.
>  LiveNodes                127.0.0.1              127.0.0.1                  127.0.0.1:7000
>  Ownership                /127.0.0.1=0.33        /127.0.0.1=0.33            127.0.0.1:7000=0.33
>  Scores                   /127.0.0.1=0.0         /127.0.0.1=0.0             127.0.0.1:7000=0.0
> {code}
>  
>  Proposed changes
>   
>  1) AllEndpointStates, SimpleStates, Connection message tracking, 
> TimeoutsPerHost - include the host/ip:port in the WithPort version
> 2) Add port number to BackPressurePerHostWithPort
> 3) Correct LoadMap to omit port / LoadMapWithPort to include port
> 4) Ownership - update with port to host/ip:port version
> 5) Scores - update with port to host/ip:port version
>   
>   
>  Additionally while dumping out all of the JMX info with `sjk mxdump`
>   
> 6) DynamicEndpointSnitch.getScoresWithPort now returns an InetAddressAndPort 
> which should just be a String
> 7) ClientMetrics.clientsByProtocolVersion returns a Guava Immutable map
> 8) StorageService.getIdealConsistencyLevel fails if none set (as we try and 
> call ConsistencyLevel.toString on a null pointer).
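
The convention behind proposed changes 1) to 5) is that the plain attribute reports only the host and the 'WithPort' variant reports host:port. A minimal sketch of that rule follows, written in Python for brevity rather than Cassandra's Java; `format_endpoint` and `load_map` are hypothetical helpers, and the leading-slash differences shown in the table are deliberately left out.

```python
from typing import Dict


def format_endpoint(host: str, port: int, with_port: bool) -> str:
    """Render an endpoint as 'host' or 'host:port' (hypothetical helper)."""
    return f"{host}:{port}" if with_port else host


def load_map(loads: Dict[str, str], port: int, with_port: bool) -> Dict[str, str]:
    """Build a LoadMap-style view; only the WithPort variant carries the port."""
    return {format_endpoint(host, port, with_port): load
            for host, load in loads.items()}


loads = {"127.0.0.1": "106.08 KiB"}
plain = load_map(loads, 7000, with_port=False)
with_port = load_map(loads, 7000, with_port=True)
```

Routing every attribute through one formatting helper is one way to avoid the flipped LoadMap/LoadMapWithPort case listed in the table.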




[jira] [Updated] (CASSANDRA-16065) Distinguish partition errors published by Cassandra-Diff between source and target

2020-08-24 Thread Marcus Eriksson (Jira)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-16065:

Reviewers: Marcus Eriksson

> Distinguish partition errors published by Cassandra-Diff between source and 
> target
> --
>
> Key: CASSANDRA-16065
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16065
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tool/diff
>Reporter: Yifan Cai
>Assignee: Yifan Cai
>Priority: Normal
>
> The partition errors found during diff are persisted without the error origin 
> information. 
> Therefore, I am proposing to add an error origin indicator, (e.g. 0 for 
> source and 1 for target) when persisting the partition error details. 
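
A small sketch of what such an origin indicator could look like. The 0/1 values follow the example in the description; the `ErrorOrigin` and `PartitionError` names and fields are assumptions made for illustration and are not taken from cassandra-diff.

```python
from dataclasses import dataclass
from enum import IntEnum


class ErrorOrigin(IntEnum):
    """Which cluster produced the error; values follow the ticket's example."""
    SOURCE = 0
    TARGET = 1


@dataclass(frozen=True)
class PartitionError:
    """Hypothetical record persisted for a partition that failed to diff."""
    token: int
    message: str
    origin: ErrorOrigin


err = PartitionError(token=42, message="read timeout", origin=ErrorOrigin.TARGET)
stored_origin = int(err.origin)  # the integer that would be persisted
```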




[jira] [Updated] (CASSANDRA-16073) cql_tracing_test.py failing repeatedly on jenkins

2020-08-24 Thread Berenguer Blasi (Jira)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Berenguer Blasi updated CASSANDRA-16073:

Test and Documentation Plan: See PR
 Status: Patch Available  (was: In Progress)

> cql_tracing_test.py failing repeatedly on jenkins
> -
>
> Key: CASSANDRA-16073
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16073
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-beta
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Jenkins has been failing tracing tests repeatedly




[jira] [Updated] (CASSANDRA-16074) Add metric for client concurrent byte throttle

2020-08-24 Thread Chris Lohfink (Jira)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Lohfink updated CASSANDRA-16074:
--
Impacts: Docs  (was: None)
Test and Documentation Plan: update metrics.rst and unit tests
 Status: Patch Available  (was: Open)

https://github.com/apache/cassandra/pull/719

> Add metric for client concurrent byte throttle
> --
>
> Key: CASSANDRA-16074
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16074
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Messaging/Client, Observability/Metrics
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Normal
>
> Add a metric to expose the current bytes, and the bytes per IP, used by the 
> existing throttle so it's possible to determine what to set it to.
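
To make the two requested values concrete, here is a rough sketch of tracking in-flight request bytes both overall and per client IP. It is not the Cassandra implementation: `InFlightBytes` and its methods are hypothetical, and the sketch is single-threaded, whereas a real server would need atomic counters.

```python
from collections import defaultdict
from typing import Dict


class InFlightBytes:
    """Tracks request bytes currently in flight, overall and per client IP."""

    def __init__(self) -> None:
        self.total = 0
        self.per_ip: Dict[str, int] = defaultdict(int)

    def acquire(self, ip: str, size: int) -> None:
        """Record that a request of `size` bytes from `ip` started processing."""
        self.total += size
        self.per_ip[ip] += size

    def release(self, ip: str, size: int) -> None:
        """Record that the request finished; drop the IP once it reaches zero."""
        self.total -= size
        self.per_ip[ip] -= size
        if self.per_ip[ip] == 0:
            del self.per_ip[ip]


gauge = InFlightBytes()
gauge.acquire("10.0.0.1", 512)
gauge.acquire("10.0.0.2", 2048)
gauge.release("10.0.0.1", 512)
```

Watching how close `total` and the largest per-IP value get to the configured limits is what would let an operator choose sensible throttle settings.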




[cassandra-dtest] branch master updated: make request a kwarg to cleanup_cluster

2020-08-24 Thread brandonwilliams

This is an automated email from the ASF dual-hosted git repository.

brandonwilliams pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/cassandra-dtest.git


The following commit(s) were added to refs/heads/master by this push:
 new bc270af  make request a kwarg to cleanup_cluster
bc270af is described below

commit bc270afa85f11b06c52f7b8abe1d3ef1f6716751
Author: Brandon Williams 
AuthorDate: Mon Aug 24 21:24:25 2020 -0500

make request a kwarg to cleanup_cluster
---
 dtest_setup.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/dtest_setup.py b/dtest_setup.py
index 646bc23..5688604 100644
--- a/dtest_setup.py
+++ b/dtest_setup.py
@@ -347,9 +347,9 @@ class DTestSetup(object):
         """
         self.log_watch_thread.join(timeout=60)
 
-    def cleanup_cluster(self, request):
+    def cleanup_cluster(self, request=None):
         with log_filter('cassandra'):  # quiet noise from driver when nodes start going down
-            if self.dtest_config.keep_test_dir or (self.dtest_config.keep_failed_test_dir and request.node.rep_call.failed):
+            if self.dtest_config.keep_test_dir or (self.dtest_config.keep_failed_test_dir and request and request.node.rep_call.failed):
                 self.cluster.stop(gently=self.dtest_config.enable_jacoco_code_coverage)
             else:
                 # when recording coverage the jvm has to exit normally
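
The effect of the change above is that callers may omit `request`, and the failure check is skipped when it is missing. A standalone sketch of that guard follows; `call_failed` is a hypothetical helper, and the `SimpleNamespace` objects merely stand in for a pytest request whose node carries a `rep_call` report.

```python
from types import SimpleNamespace
from typing import Optional


def call_failed(request: Optional[object] = None) -> bool:
    """Return True only when a request is present and its call phase failed."""
    # `request and ...` short-circuits, so a missing request never reaches .node
    return bool(request and request.node.rep_call.failed)


# Stand-in for a pytest request whose node recorded a failed call phase.
failed_request = SimpleNamespace(
    node=SimpleNamespace(rep_call=SimpleNamespace(failed=True)))

without_request = call_failed()
with_failed_request = call_failed(failed_request)
```

Without the `request and` guard, the call with no request would raise `AttributeError` on `None.node`.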



[jira] [Updated] (CASSANDRA-16074) Add metric for client concurrent byte throttle

2020-08-24 Thread Chris Lohfink (Jira)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Lohfink updated CASSANDRA-16074:
--
Change Category: Operability
 Complexity: Low Hanging Fruit
 Status: Open  (was: Triage Needed)

> Add metric for client concurrent byte throttle
> --
>
> Key: CASSANDRA-16074
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16074
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Messaging/Client, Observability/Metrics
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Normal
>
> Add a metric to expose the current bytes, and the bytes per IP, used by the 
> existing throttle so it's possible to determine what to set it to.




[jira] [Created] (CASSANDRA-16074) Add metric for client concurrent byte throttle

2020-08-24 Thread Chris Lohfink (Jira)

Chris Lohfink created CASSANDRA-16074:
-

 Summary: Add metric for client concurrent byte throttle
 Key: CASSANDRA-16074
 URL: https://issues.apache.org/jira/browse/CASSANDRA-16074
 Project: Cassandra
  Issue Type: New Feature
  Components: Messaging/Client, Observability/Metrics
Reporter: Chris Lohfink
Assignee: Chris Lohfink


Add a metric to expose the current bytes, and the bytes per IP, used by the 
existing throttle so it's possible to determine what to set it to.




[jira] [Commented] (CASSANDRA-15937) JMX output inconsistencies from CASSANDRA-7544 storage-port-configurable-per-node

2020-08-24 Thread David Capwell (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-15937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183664#comment-17183664
 ] 

David Capwell commented on CASSANDRA-15937:
---

This is what I see in circle ci

{code}
  clone_dtest:
    steps:
    - run:
        name: Clone Cassandra dtest Repository (via git)
        command: |
          git clone --single-branch --branch $DTEST_BRANCH --depth 1 $DTEST_REPO ~/cassandra-dtest
  create_venv:
    parameters:
      python_version:
        type: enum
        default: "3.6"
        enum: ["3.6", "3.7", "3.8"]
    steps:
    - run:
        name: Configure virtualenv and python Dependencies
        command: |
          # note, this should be super quick as all dependencies should be pre-installed in the docker image
          # if additional dependencies were added to requirements.txt and the docker image hasn't been updated
          # we'd have to install it here at runtime -- which will make things slow, so do yourself a favor and
          # rebuild the docker image! (it automatically pulls the latest requirements.txt on build)
          source ~/env<>/bin/activate
          export PATH=$JAVA_HOME/bin:$PATH
          pip3 install --exists-action w --upgrade -r ~/cassandra-dtest/requirements.txt
          pip3 uninstall -y cqlsh
          pip3 freeze
{code}

So, looks like we need the dtest branch to update 
~/cassandra-dtest/requirements.txt to point to the ccm branch?  
https://github.com/apache/cassandra-dtest/blob/master/requirements.txt#L3

[~mck] does this sound right to you? Do you know of a different process?


[~jmeredithco] I'll test this tomorrow, and will update my scripts to link those 
two in with circle ci.


[jira] [Commented] (CASSANDRA-15393) Add byte array backed cells

2020-08-24 Thread David Capwell (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-15393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183659#comment-17183659
 ] 

David Capwell commented on CASSANDRA-15393:
---

bq. It might have been less of an issue, if the whole code base would have 
proper unit testing for all production code paths with all potential inputs or 
even fuzzy testing, but that's unfortunately not the case.

If it helps, CASSANDRA-16064 (Caleb linked this JIRA with that one as they both 
created some of the same classes) provides the ability to generate random types 
(primitives, all collections, UDTs, reversed, etc.) and data for arbitrary 
schemas. I also added logic to make it a bit easier to mock out parts of our 
code, such as the Schema class; the intent of CASSANDRA-16064 is to start adding 
the kind of tests you describe and to lower the bar for adding more.

If there are sections you feel need closer testing, I can check whether 
CASSANDRA-16064 offers enough for those sections, or whether anything else is 
needed. Let me know.


> Add byte array backed cells
> ---
>
> Key: CASSANDRA-15393
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15393
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Local/Compaction
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 4.0-beta
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> We currently materialize all values as on heap byte buffers. Byte buffers 
> have a fairly high overhead given how frequently they’re used, and on the 
> compaction and local read path we don’t do anything that needs them. Use of 
> byte buffer methods only happens on the coordinator. Using cells that are 
> backed by byte arrays instead in these situations reduces compaction and read 
> garbage up to 22% in many cases.
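
The idea of reading cell values through an accessor, so the same code works whether the value is backed by a byte array or a buffer, can be sketched as below. This is a Python analogy only: the class and method names are hypothetical, with `bytes` standing in for a Java byte array and `memoryview` for a ByteBuffer.

```python
import struct
from typing import Generic, TypeVar

V = TypeVar("V")


class ValueAccessor(Generic[V]):
    """Reads typed values out of a backing store of type V."""

    def get_long(self, value: V, offset: int) -> int:
        raise NotImplementedError

    def size(self, value: V) -> int:
        raise NotImplementedError


class ByteArrayAccessor(ValueAccessor[bytes]):
    """Accessor for values held as plain byte arrays."""

    def get_long(self, value: bytes, offset: int) -> int:
        return struct.unpack_from(">q", value, offset)[0]

    def size(self, value: bytes) -> int:
        return len(value)


class BufferAccessor(ValueAccessor[memoryview]):
    """Accessor for values held as buffer views."""

    def get_long(self, value: memoryview, offset: int) -> int:
        return struct.unpack_from(">q", value, offset)[0]

    def size(self, value: memoryview) -> int:
        return value.nbytes


raw = struct.pack(">q", 1598314509)
from_array = ByteArrayAccessor().get_long(raw, 0)
from_buffer = BufferAccessor().get_long(memoryview(raw), 0)
```

Code written against the accessor does not need to know which backing type it was given, which is what allows the compaction and local read paths to use the cheaper representation.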




[jira] [Commented] (CASSANDRA-15937) JMX output inconsistencies from CASSANDRA-7544 storage-port-configurable-per-node

2020-08-24 Thread Jon Meredith (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-15937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183651#comment-17183651
 ] 

Jon Meredith commented on CASSANDRA-15937:
--

Thanks for prepping. The failures look genuine and related to reverting some of 
the formatting changes in CASSANDRA-7544. Fixing requires updates to both Dtest 
and the cassandra-dtest branch of CCM. I'm not sure how they should get merged 
or run through CI, as you need the rebased Cassandra branch and the updated 
dtest branch with the ccm branch checked out inside it.

[Dtest 
changes|https://github.com/jonmeredith/cassandra-dtest/tree/CASSANDRA-15937] 
[CCM changes |https://github.com/jonmeredith/ccm/tree/CASSANDRA-15937]





[jira] [Commented] (CASSANDRA-16064) Add test which validates that Message serializedSize(version) == serialize(out, version).length

2020-08-24 Thread David Capwell (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-16064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183650#comment-17183650
 ] 

David Capwell commented on CASSANDRA-16064:
---

Python dtests are passing other than 7 tests, which look to be caused by 
https://github.com/apache/cassandra-dtest/commit/cefddf845d63919c6e7b5efa35b28fe7a5ad1142
 changing the API they call

> Add test which validates that Message serializedSize(version) == 
> serialize(out, version).length
> ---
>
> Key: CASSANDRA-16064
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16064
> Project: Cassandra
>  Issue Type: Bug
>  Components: Messaging/Internode
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
> Fix For: 4.0-beta
>
>
> In 4.0 we require serializedSize(version) == serialize(out, version).length 
> for correctness in post40 message format as we write it into the message 
> header.  Given that this is a strong requirement for correct deserialization 
> of the message, we should have tests which help enforce this property.
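
The property in question, that the declared size equals the number of bytes actually written, can be checked generically. The sketch below uses a toy length-prefixed string serializer in Python; it is not Cassandra's Message serializer, and `LengthPrefixedString` and `check_size_property` are hypothetical names.

```python
import struct


class LengthPrefixedString:
    """Toy serializer: 4-byte big-endian length followed by UTF-8 bytes."""

    @staticmethod
    def serialize(value: str) -> bytes:
        payload = value.encode("utf-8")
        return struct.pack(">I", len(payload)) + payload

    @staticmethod
    def serialized_size(value: str) -> int:
        # Must count encoded bytes, not characters, or the property breaks.
        return 4 + len(value.encode("utf-8"))


def check_size_property(serializer, samples) -> None:
    """Assert serialized_size(v) == len(serialize(v)) for every sample."""
    for value in samples:
        expected = serializer.serialized_size(value)
        actual = len(serializer.serialize(value))
        assert expected == actual, value


check_size_property(LengthPrefixedString, ["", "abc", "héllo", "日本語"])
```

The non-ASCII samples matter: a size method that counted characters instead of encoded bytes would pass for "abc" and fail for "héllo", which is the kind of mismatch such a test is meant to catch.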




[jira] [Commented] (CASSANDRA-15393) Add byte array backed cells

2020-08-24 Thread Blake Eggleston (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-15393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183621#comment-17183621
 ] 

Blake Eggleston commented on CASSANDRA-15393:
-

Thanks for the excellent review [~maedhroz]. I've pushed up some commits 
cleaning up a bunch of stuff (excluding docs for now) and addressing most of 
the points, with the exception of the comments below.

Let me know your thoughts on them:
{quote}Looks like AbstractType#decompose() is never used without 
ByteBufferAccessor? We could probably remove the type parameter and just make 
the ByteBuffer binding explicit.
{quote}
and
{quote}TypeSerializer#toCQLLiteral() is only used w/ ByteBuffer, so it doesn't 
look like it needs to be parameterized.
{quote}
I'd prefer that we keep the type/serializers consistent wrt type. In other 
words, I'd rather some methods "unnecessarily" switch to using an accessor than 
have some parts of a class use accessors and other parts use byte buffers 
directly.
{quote}ModificationStatement looks like it's dealing exclusively with 
ByteBuffer. Should the type parameters reflect that?
{quote}
and
{quote}Trying to propagate more typing information from 
ClusteringBoundOrBoundary.Serializer upward to its users for Slices and 
UnfilteredSerializer might help clarify some things
{quote}
I'm inclined to go the other way, and make underlying type information opaque 
everywhere except the serializers themselves. There are very few places where 
we need to see it, and making it explicit anywhere it doesn't need to be just 
makes it more difficult to make changes later on.
{quote}We might benefit in terms of usability/developer ergonomics if we push 
some capabilities of the accessor into Cell. (ex. methods like getLong()). 
Similar thing going on with ClusteringPrefix, where we could perhaps stop 
making calls like builder.add(prefix.get(i), prefix.accessor()) and use 
bufferAt(i) in CP itself. I suppose ArrayBackedBuilder might need to support something 
like add(ClusteringPrefix, int) if we still want to do that lazily, after the 
isDone() check.
{quote}
and
{quote}AbstractType#writeValue() could be implemented in Cell, given the latter 
knows both its value and accessor already, and ColumnSpecification already 
knows the column type?
{quote}
I'd done something sort of along these lines (albeit in the opposite direction) 
by adding the `ValueAware` class, so that AbstractType et al can operate 
directly on Cell objects without having to know what a cell is. Although I seem 
to default to keeping type information on the serializer / type side of things, 
there are benefits to each which we should discuss. And possibly a better name 
for ValueAware.
{quote}UNSET_BYTE_ARRAY, getChar(), putBoolean(), putChar(), writeWithLength() 
are unused in ByteArrayUtil. What if we just fold these methods into 
ByteArrayAccessor.
{quote}
These are all copied from java.io.Bits, so I assume the logic is correct and 
well tested. I'd like to leave them unused if you don't mind. ValueAccessor use 
is restricted to reading for the most part, but as it gets expanded to cover 
writing things, I think having known good implementations ready to go would be 
more beneficial than starting with a minimal implementation.

The idea of combining the byte array util and the accessor has also been 
floating around, and I'd prefer we didn't for both the reason above, and 
because people are pretty used to having ByteBufferUtil handy. Having 
ByteArrayUtil handy would make the appearance of byte[] everywhere a little 
less jarring (hopefully).
{quote}We might be able to factor out some common elements of ArrayCell and 
BufferCell.
{quote}
NativeCell prevents moving most of that into AbstractCell. Given the simplicity 
of the parts that can be factored, I'd lean towards keeping the class hierarchy 
simple over minimizing the size of the array and buffer cell implementations.
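
To illustrate the accessor approach discussed above, here is a minimal sketch. The `Accessor` interface and `Accessors` class are hypothetical and far smaller than the ValueAccessor in the patch; the point is only that calling code can be written once against the accessor while the backing type (byte[] or ByteBuffer) stays opaque.

```java
import java.nio.ByteBuffer;

// Hypothetical, stripped-down accessor; the real ValueAccessor has many more methods.
interface Accessor<V>
{
    int size(V value);
    long getLong(V value, int offset);
}

final class Accessors
{
    static final Accessor<ByteBuffer> BUFFER = new Accessor<ByteBuffer>()
    {
        public int size(ByteBuffer value) { return value.remaining(); }

        // Absolute read relative to the current position; does not move the position.
        public long getLong(ByteBuffer value, int offset)
        {
            return value.getLong(value.position() + offset);
        }
    };

    static final Accessor<byte[]> ARRAY = new Accessor<byte[]>()
    {
        public int size(byte[] value) { return value.length; }

        // Big-endian read, matching ByteBuffer's default byte order.
        public long getLong(byte[] value, int offset)
        {
            long result = 0;
            for (int i = 0; i < 8; i++)
                result = (result << 8) | (value[offset + i] & 0xFF);
            return result;
        }
    };

    // Caller code is written once and never sees the backing type.
    static <V> long firstLong(V value, Accessor<V> accessor)
    {
        return accessor.getLong(value, 0);
    }
}
```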

> Add byte array backed cells
> ---
>
> Key: CASSANDRA-15393
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15393
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Local/Compaction
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 4.0-beta
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> We currently materialize all values as on heap byte buffers. Byte buffers 
> have a fairly high overhead given how frequently they’re used, and on the 
> compaction and local read path we don’t do anything that needs them. Use of 
> byte buffer methods only happens on the coordinator. Using cells that are 
> backed by byte arrays instead in these

[jira] [Updated] (CASSANDRA-15958) org.apache.cassandra.net.ConnectionTest testMessagePurging

2020-08-24 Thread Yifan Cai (Jira)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yifan Cai updated CASSANDRA-15958:
--
Reviewers: Yifan Cai

> org.apache.cassandra.net.ConnectionTest testMessagePurging
> --
>
> Key: CASSANDRA-15958
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15958
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: David Capwell
>Assignee: Adam Holmberg
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Build: 
> https://ci-cassandra.apache.org/job/Cassandra-trunk-test/196/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/
> Build: 
> https://ci-cassandra.apache.org/job/Cassandra-trunk-test/194/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/
> java.util.concurrent.TimeoutException
>   at org.apache.cassandra.net.AsyncPromise.get(AsyncPromise.java:258)
>   at org.apache.cassandra.net.FutureDelegate.get(FutureDelegate.java:143)
>   at 
> org.apache.cassandra.net.ConnectionTest.doTestManual(ConnectionTest.java:268)
>   at 
> org.apache.cassandra.net.ConnectionTest.testManual(ConnectionTest.java:236)
>   at 
> org.apache.cassandra.net.ConnectionTest.testMessagePurging(ConnectionTest.java:679)




[jira] [Commented] (CASSANDRA-15958) org.apache.cassandra.net.ConnectionTest testMessagePurging

2020-08-24 Thread Yifan Cai (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-15958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183614#comment-17183614
 ] 

Yifan Cai commented on CASSANDRA-15958:
---

Hi Adam, thanks for the analysis! 
I agree with the cause, and I can reproduce the timeout exception (at closing) 
by inserting a pause between those 2 close call sites. 

The patch looks good to me. The {{closeFuture}} is updated/read within the 
synchronized block.

The one thing to bring up is that the patch introduces a slight side effect. 
With the patch, only the {{Consumer shutdownExecutors}} from the first call 
site will be registered. Any other call site that supplies a different consumer 
is ignored. Maybe update the method docs to describe this behavior. 
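
The side effect can be shown with a small sketch. `ClosableSockets` is a hypothetical stand-in, not the patched class, and it takes a `Runnable` where the real method takes a consumer; an already-completed future stands in for the channel close futures.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical stand-in for the socket holder; names are illustrative only.
final class ClosableSockets
{
    private CompletableFuture<Void> closeFuture; // guarded by "this"

    // Every caller gets the same future; only the first caller's callback is registered.
    synchronized CompletableFuture<Void> close(Runnable shutdownExecutors)
    {
        if (closeFuture != null)
            return closeFuture; // callbacks from later call sites are ignored

        // Stands in for the combined future of all channels finishing their close.
        CompletableFuture<Void> channelsClosed = CompletableFuture.completedFuture(null);
        closeFuture = channelsClosed.thenRun(shutdownExecutors);
        return closeFuture;
    }
}
```

Calling `close` twice with different callbacks returns the same future both times and runs only the first callback.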

> org.apache.cassandra.net.ConnectionTest testMessagePurging
> --
>
> Key: CASSANDRA-15958
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15958
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: David Capwell
>Assignee: Adam Holmberg
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Build: 
> https://ci-cassandra.apache.org/job/Cassandra-trunk-test/196/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/
> Build: 
> https://ci-cassandra.apache.org/job/Cassandra-trunk-test/194/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/
> java.util.concurrent.TimeoutException
>   at org.apache.cassandra.net.AsyncPromise.get(AsyncPromise.java:258)
>   at org.apache.cassandra.net.FutureDelegate.get(FutureDelegate.java:143)
>   at 
> org.apache.cassandra.net.ConnectionTest.doTestManual(ConnectionTest.java:268)
>   at 
> org.apache.cassandra.net.ConnectionTest.testManual(ConnectionTest.java:236)
>   at 
> org.apache.cassandra.net.ConnectionTest.testMessagePurging(ConnectionTest.java:679)




[jira] [Comment Edited] (CASSANDRA-16063) Fix user experience when upgrading to 4.0 with compact tables

2020-08-24 Thread Ekaterina Dimitrova (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-16063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17182149#comment-17182149
 ] 

Ekaterina Dimitrova edited comment on CASSANDRA-16063 at 8/24/20, 9:46 PM:
---

-I suggest we open a separate ticket and keep the work incremental?-
*EDIT:* I'll make it part of this patch, it will be just a simple check, 
preventing users from being able to do the drop of compact storage on 
non-upgraded nodes. 


was (Author: e.dimitrova):
-I suggest we open a separate ticket and keep the work incremental?-
EDIT: I'll make it part of this patch, it will be just a simple check, 
preventing users from being able to do the drop of compact storage on 
non-upgraded nodes. 

> Fix user experience when upgrading to 4.0 with compact tables
> -
>
> Key: CASSANDRA-16063
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16063
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/CQL
>Reporter: Sylvain Lebresne
>Assignee: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 4.0-beta
>
>
> The code to handle compact tables has been removed from 4.0, and the intended 
> upgrade path to 4.0 for users having compact tables on 3.x is that they must 
> execute {{ALTER ... DROP COMPACT STORAGE}} on all of their compact tables 
> *before* attempting the upgrade.
> Obviously, some users won't read the upgrade instructions (or miss a table) 
> and may try upgrading despite still having compact tables. If they do so, the 
> intent is that the node will _not_ start, with a message clearly indicating 
> the pre-upgrade step the user has missed. The user will then downgrade back 
> the node(s) to 3.x, run the proper {{ALTER ... DROP COMPACT STORAGE}}, and 
> then upgrade again.
> But while 4.0 does currently fail startup when finding any compact tables 
> with a decent message, I believe the check is done too late during startup.
> Namely, that check is done as we read the tables schema, so within 
> [{{Schema.instance.loadFromDisk()}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/CassandraDaemon.java#L241].
>   But by then, we've _at least_ called 
> {{SystemKeyspace.persistLocalMetadata()}} and 
> {{SystemKeyspaceMigrator40.migrate()}}, which will get into the commit log, 
> and even possibly flush new {{na}} format sstables. As a result, a user 
> might not be able to seamlessly restart the node on 3.x (to drop compact 
> storage on the appropriate tables).
> Basically, we should make sure the check for compact tables done at 4.0 
> startup is done as a {{StartupCheck}}, before the node does anything.
> We should also add a test for this (checking that if you try upgrading to 4.0 
> with compact storage, you can downgrade back with no intervention whatsoever).
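
The ordering being asked for can be sketched as follows. This is a hypothetical, simplified model of startup, not the actual CassandraDaemon or StartupCheck code; the step names are just labels. All checks run before any step that writes to disk, so a failed check leaves the node untouched and safe to restart on 3.x.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of node startup; not the actual daemon code.
final class StartupSketch
{
    interface Check
    {
        void execute();
    }

    // Records the disk-touching steps that have run, for illustration.
    final List<String> sideEffects = new ArrayList<>();

    // Runs every check first; a failing check aborts before anything is written.
    void start(List<Check> checks)
    {
        for (Check check : checks)
            check.execute();

        sideEffects.add("persistLocalMetadata");
        sideEffects.add("migrateSystemKeyspace");
        sideEffects.add("loadSchemaFromDisk");
    }

    // Check that refuses to start while any compact table remains.
    static Check rejectCompactTables(List<String> compactTables)
    {
        return () -> {
            if (!compactTables.isEmpty())
                throw new IllegalStateException("Run ALTER ... DROP COMPACT STORAGE before upgrading: " + compactTables);
        };
    }
}
```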




[jira] [Comment Edited] (CASSANDRA-16063) Fix user experience when upgrading to 4.0 with compact tables

2020-08-24 Thread Ekaterina Dimitrova (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-16063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17182149#comment-17182149
 ] 

Ekaterina Dimitrova edited comment on CASSANDRA-16063 at 8/24/20, 9:29 PM:
---

-I suggest we open a separate ticket and keep the work incremental?-
EDIT: I'll make it part of this patch, it will be just a simple check, 
preventing users from being able to do the drop of compact storage on 
non-upgraded nodes. 


was (Author: e.dimitrova):
I suggest we open a separate ticket and keep the work incremental?

> Fix user experience when upgrading to 4.0 with compact tables
> -
>
> Key: CASSANDRA-16063
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16063
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/CQL
>Reporter: Sylvain Lebresne
>Assignee: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 4.0-beta
>
>
> The code to handle compact tables has been removed from 4.0, and the intended 
> upgrade path to 4.0 for users having compact tables on 3.x is that they must 
> execute {{ALTER ... DROP COMPACT STORAGE}} on all of their compact tables 
> *before* attempting the upgrade.
> Obviously, some users won't read the upgrade instructions (or miss a table) 
> and may try upgrading despite still having compact tables. If they do so, the 
> intent is that the node will _not_ start, with a message clearly indicating 
> the pre-upgrade step the user has missed. The user will then downgrade back 
> the node(s) to 3.x, run the proper {{ALTER ... DROP COMPACT STORAGE}}, and 
> then upgrade again.
> But while 4.0 does currently fail startup when finding any compact tables 
> with a decent message, I believe the check is done too late during startup.
> Namely, that check is done as we read the tables schema, so within 
> [{{Schema.instance.loadFromDisk()}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/CassandraDaemon.java#L241].
>   But by then, we've _at least_ called 
> {{SystemKeyspace.persistLocalMetadata()}} and 
> {{SystemKeyspaceMigrator40.migrate()}}, which will get into the commit log, 
> and even possibly flush new {{na}} format sstables. As a result, a user 
> might not be able to seamlessly restart the node on 3.x (to drop compact 
> storage on the appropriate tables).
> Basically, we should make sure the check for compact tables done at 4.0 
> startup is done as a {{StartupCheck}}, before the node does anything.
> We should also add a test for this (checking that if you try upgrading to 4.0 
> with compact storage, you can downgrade back with no intervention whatsoever).




[jira] [Comment Edited] (CASSANDRA-16033) test_resume_stopped_build - materialized_views_test.TestMaterializedViews

2020-08-24 Thread Ekaterina Dimitrova (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-16033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183593#comment-17183593
 ] 

Ekaterina Dimitrova edited comment on CASSANDRA-16033 at 8/24/20, 8:57 PM:
---

I am not able to reproduce this one and I haven't seen it fail again. I am 
un-assigning it as I am working on other things now and someone else might want 
to try to work on it with more success in reproducing the problem. 


was (Author: e.dimitrova):
I am not able to reproduce this one and I haven't seen it fail again. I am 
un-assigning it as I am working on other things now and someone else might want 
to try to work on it. 

> test_resume_stopped_build - materialized_views_test.TestMaterializedViews
> -
>
> Key: CASSANDRA-16033
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16033
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest
>Reporter: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Failing in CircleCI [here | 
> https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/295/workflows/168d88ab-f55f-4560-a23e-8243aff7b1bd/jobs/1774]




[jira] [Assigned] (CASSANDRA-16033) test_resume_stopped_build - materialized_views_test.TestMaterializedViews

2020-08-24 Thread Ekaterina Dimitrova (Jira)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ekaterina Dimitrova reassigned CASSANDRA-16033:
---

Assignee: (was: Ekaterina Dimitrova)

> test_resume_stopped_build - materialized_views_test.TestMaterializedViews
> -
>
> Key: CASSANDRA-16033
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16033
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest
>Reporter: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Failing in CircleCI [here | 
> https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/295/workflows/168d88ab-f55f-4560-a23e-8243aff7b1bd/jobs/1774]




[jira] [Commented] (CASSANDRA-16033) test_resume_stopped_build - materialized_views_test.TestMaterializedViews

2020-08-24 Thread Ekaterina Dimitrova (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-16033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183593#comment-17183593
 ] 

Ekaterina Dimitrova commented on CASSANDRA-16033:
-

I am not able to reproduce this one and I haven't seen it fail again. I am 
un-assigning it as I am working on other things now and someone else might want 
to try to work on it. 

> test_resume_stopped_build - materialized_views_test.TestMaterializedViews
> -
>
> Key: CASSANDRA-16033
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16033
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest
>Reporter: Ekaterina Dimitrova
>Assignee: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Failing in CircleCI [here | 
> https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/295/workflows/168d88ab-f55f-4560-a23e-8243aff7b1bd/jobs/1774]




[jira] [Commented] (CASSANDRA-16065) Distinguish partition errors published by Cassandra-Diff between source and target

2020-08-24 Thread Yifan Cai (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-16065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183572#comment-17183572
 ] 

Yifan Cai commented on CASSANDRA-16065:
---

Hi [~marcuse], would you like to review?

> Distinguish partition errors published by Cassandra-Diff between source and 
> target
> --
>
> Key: CASSANDRA-16065
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16065
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tool/diff
>Reporter: Yifan Cai
>Assignee: Yifan Cai
>Priority: Normal
>
> The partition errors found during diff are persisted without the error origin 
> information. 
> Therefore, I am proposing to add an error origin indicator, (e.g. 0 for 
> source and 1 for target) when persisting the partition error details. 




[jira] [Commented] (CASSANDRA-16065) Distinguish partition errors published by Cassandra-Diff between source and target

2020-08-24 Thread Yifan Cai (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-16065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183571#comment-17183571
 ] 

Yifan Cai commented on CASSANDRA-16065:
---

PR: [https://github.com/apache/cassandra-diff/pull/11]
Code: 
[https://github.com/yifan-c/cassandra-diff/tree/distinguish-partition-error]

There are multiple call sites where the client sends requests to the compared 
clusters and may get exceptions. An exception wrapper, 
{{ClusterSourcedException}}, is introduced in order to identify which cluster 
an error came from. The call sites that may throw exceptions are handled, and 
the exceptions are wrapped with {{ClusterSourcedException}}. 
 A new column, "{{error_source varchar}}", is added to the "partition_errors" 
table to store the error source.
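
As a rough illustration of the wrapper idea (the class in the linked PR may well differ; the `Source` enum and the `tagErrors` helper here are assumptions made for this sketch):

```java
import java.util.function.Supplier;

// Hypothetical shape of the wrapper; the class in the linked PR may differ.
final class ClusterSourcedException extends RuntimeException
{
    enum Source { SOURCE, TARGET }

    final Source source;

    ClusterSourcedException(Source source, Throwable cause)
    {
        super("from " + source + " cluster: " + cause.getMessage(), cause);
        this.source = source;
    }

    // Runs a request against one cluster and tags any failure with that cluster.
    static <T> T tagErrors(Source source, Supplier<T> request)
    {
        try
        {
            return request.get();
        }
        catch (RuntimeException e)
        {
            throw new ClusterSourcedException(source, e);
        }
    }
}
```

The code that persists a partition error can then read the `source` field of the caught exception to fill the new column.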

> Distinguish partition errors published by Cassandra-Diff between source and 
> target
> --
>
> Key: CASSANDRA-16065
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16065
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tool/diff
>Reporter: Yifan Cai
>Assignee: Yifan Cai
>Priority: Normal
>
> The partition errors found during diff are persisted without the error origin 
> information. 
> Therefore, I am proposing to add an error origin indicator, (e.g. 0 for 
> source and 1 for target) when persisting the partition error details. 




[jira] [Updated] (CASSANDRA-16065) Distinguish partition errors published by Cassandra-Diff between source and target

2020-08-24 Thread Yifan Cai (Jira)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yifan Cai updated CASSANDRA-16065:
--
Test and Documentation Plan: unit test
 Status: Patch Available  (was: Open)

> Distinguish partition errors published by Cassandra-Diff between source and 
> target
> --
>
> Key: CASSANDRA-16065
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16065
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tool/diff
>Reporter: Yifan Cai
>Assignee: Yifan Cai
>Priority: Normal
>
> The partition errors found during diff are persisted without the error origin 
> information. 
> Therefore, I am proposing to add an error origin indicator, (e.g. 0 for 
> source and 1 for target) when persisting the partition error details. 




[jira] [Updated] (CASSANDRA-16065) Distinguish partition errors published by Cassandra-Diff between source and target

2020-08-24 Thread Yifan Cai (Jira)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yifan Cai updated CASSANDRA-16065:
--
Change Category: Operability
 Complexity: Low Hanging Fruit
   Assignee: Yifan Cai
 Status: Open  (was: Triage Needed)

> Distinguish partition errors published by Cassandra-Diff between source and 
> target
> --
>
> Key: CASSANDRA-16065
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16065
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tool/diff
>Reporter: Yifan Cai
>Assignee: Yifan Cai
>Priority: Normal
>
> The partition errors found during diff are persisted without the error origin 
> information. 
> Therefore, I am proposing to add an error origin indicator, (e.g. 0 for 
> source and 1 for target) when persisting the partition error details. 




[jira] [Commented] (CASSANDRA-15958) org.apache.cassandra.net.ConnectionTest testMessagePurging

2020-08-24 Thread Adam Holmberg (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-15958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183533#comment-17183533
 ] 

Adam Holmberg commented on CASSANDRA-15958:
---

Looks like there are some additional sources of flakiness in this test.
https://app.circleci.com/pipelines/github/aholmberg/cassandra/25/workflows/536c8ec1-51d3-4ef0-95c6-7dd9677225dc/jobs/207

> org.apache.cassandra.net.ConnectionTest testMessagePurging
> --
>
> Key: CASSANDRA-15958
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15958
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: David Capwell
>Assignee: Adam Holmberg
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Build: 
> https://ci-cassandra.apache.org/job/Cassandra-trunk-test/196/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/
> Build: 
> https://ci-cassandra.apache.org/job/Cassandra-trunk-test/194/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/
> java.util.concurrent.TimeoutException
>   at org.apache.cassandra.net.AsyncPromise.get(AsyncPromise.java:258)
>   at org.apache.cassandra.net.FutureDelegate.get(FutureDelegate.java:143)
>   at 
> org.apache.cassandra.net.ConnectionTest.doTestManual(ConnectionTest.java:268)
>   at 
> org.apache.cassandra.net.ConnectionTest.testManual(ConnectionTest.java:236)
>   at 
> org.apache.cassandra.net.ConnectionTest.testMessagePurging(ConnectionTest.java:679)




[jira] [Commented] (CASSANDRA-15958) org.apache.cassandra.net.ConnectionTest testMessagePurging

2020-08-24 Thread Adam Holmberg (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-15958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183526#comment-17183526
 ] 

Adam Holmberg commented on CASSANDRA-15958:
---

[patch|https://github.com/apache/cassandra/compare/trunk...aholmberg:CASSANDRA-15958?expand=1]

The timeout occurs [waiting for this close 
future|https://github.com/aholmberg/cassandra/blob/CASSANDRA-15958/test/unit/org/apache/cassandra/net/ConnectionTest.java#L268],
 failing intermittently due to a race condition. The test 
[closes|https://github.com/aholmberg/cassandra/blob/CASSANDRA-15958/test/unit/org/apache/cassandra/net/ConnectionTest.java#L723]
 the inbound connection 
[twice|https://github.com/aholmberg/cassandra/blob/CASSANDRA-15958/test/unit/org/apache/cassandra/net/ConnectionTest.java#L268].
 If the first execution finishes and [shuts down the 
executor|https://github.com/aholmberg/cassandra/blob/CASSANDRA-15958/src/java/org/apache/cassandra/net/InboundSockets.java#L128]
 before the second, the 
[FutureCombiner|https://github.com/aholmberg/cassandra/blob/CASSANDRA-15958/src/java/org/apache/cassandra/net/InboundSockets.java#L125-L126]
 fails to 
[addListener|https://github.com/aholmberg/cassandra/blob/CASSANDRA-15958/src/java/org/apache/cassandra/net/FutureCombiner.java#L83]
 to the channel, and the [done 
future|https://github.com/aholmberg/cassandra/blob/CASSANDRA-15958/src/java/org/apache/cassandra/net/InboundSockets.java#L131]
 will never complete.

> org.apache.cassandra.net.ConnectionTest testMessagePurging
> --
>
> Key: CASSANDRA-15958
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15958
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: David Capwell
>Assignee: Adam Holmberg
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Build: 
> https://ci-cassandra.apache.org/job/Cassandra-trunk-test/196/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/
> Build: 
> https://ci-cassandra.apache.org/job/Cassandra-trunk-test/194/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/
> java.util.concurrent.TimeoutException
>   at org.apache.cassandra.net.AsyncPromise.get(AsyncPromise.java:258)
>   at org.apache.cassandra.net.FutureDelegate.get(FutureDelegate.java:143)
>   at 
> org.apache.cassandra.net.ConnectionTest.doTestManual(ConnectionTest.java:268)
>   at 
> org.apache.cassandra.net.ConnectionTest.testManual(ConnectionTest.java:236)
>   at 
> org.apache.cassandra.net.ConnectionTest.testMessagePurging(ConnectionTest.java:679)




[jira] [Commented] (CASSANDRA-15909) Make Table/Keyspace Metric Names Consistent With Each Other

2020-08-24 Thread Caleb Rackliffe (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-15909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183492#comment-17183492
 ] 

Caleb Rackliffe commented on CASSANDRA-15909:
-

[~djoshi] Any interest in picking this one up as a committer/reviewer once I 
finalize the patch?

> Make Table/Keyspace Metric Names Consistent With Each Other
> ---
>
> Key: CASSANDRA-15909
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15909
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability/Metrics
>Reporter: Stephen Mallette
>Assignee: Caleb Rackliffe
>Priority: Normal
> Fix For: 4.0-beta
>
>
> As part of CASSANDRA-15821 it became apparent that certain metric names found 
> in keyspace and tables had different names but were in fact the same metric - 
> they are as follows:
> * Table.SyncTime == Keyspace.RepairSyncTime
> * Table.RepairedDataTrackingOverreadRows == Keyspace.RepairedOverreadRows
> * Table.RepairedDataTrackingOverreadTime == Keyspace.RepairedOverreadTime
> * Table.AllMemtablesHeapSize == Keyspace.AllMemtablesOnHeapDataSize
> * Table.AllMemtablesOffHeapSize == Keyspace.AllMemtablesOffHeapDataSize
> * Table.MemtableOnHeapSize == Keyspace.MemtableOnHeapDataSize
> * Table.MemtableOffHeapSize == Keyspace.MemtableOffHeapDataSize
> Also, client metrics are the only metrics to start with a lower case letter. 
> Change those to upper case to match all the other metrics.
> Unifying this naming would help make metrics more consistent as part of 
> CASSANDRA-15582




[jira] [Updated] (CASSANDRA-15977) 4.0 quality testing: Read Repair

2020-08-24 Thread Jira



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andres de la Peña updated CASSANDRA-15977:
--
Test and Documentation Plan: 
https://docs.google.com/document/d/1-gldHcdLSMRbDhhI8ahs_tPeAZsjurjXr38xABVjWHE/edit?usp=sharing
 Status: Patch Available  (was: In Progress)

> 4.0 quality testing: Read Repair
> 
>
> Key: CASSANDRA-15977
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15977
> Project: Cassandra
>  Issue Type: Task
>  Components: Test/dtest, Test/unit
>Reporter: Andres de la Peña
>Assignee: Andres de la Peña
>Priority: Normal
> Fix For: 4.0-beta
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> This is a subtask of CASSANDRA-15579 focusing on read repair.
> [This 
> document|https://docs.google.com/document/d/1-gldHcdLSMRbDhhI8ahs_tPeAZsjurjXr38xABVjWHE/edit?usp=sharing]
>  lists and describes the existing functional tests for read repair, so we can 
> have a broad view of what is currently covered. We can comment on this 
> document and add ideas for new cases/tests, so it can gradually evolve to a 
> more or less detailed test plan.




[jira] [Updated] (CASSANDRA-15158) Wait for schema agreement rather than in flight schema requests when bootstrapping

2020-08-24 Thread Aleksey Yeschenko (Jira)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-15158:
--
Authors: Blake Eggleston, Stefan Miklosovic  (was: Blake Eggleston)

> Wait for schema agreement rather than in flight schema requests when 
> bootstrapping
> --
>
> Key: CASSANDRA-15158
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15158
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Gossip, Cluster/Schema
>Reporter: Vincent White
>Assignee: Blake Eggleston
>Priority: Normal
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently when a node is bootstrapping we use a set of latches 
> (org.apache.cassandra.service.MigrationTask#inflightTasks) to keep track of 
> in-flight schema pull requests, and we don't proceed with 
> bootstrapping/stream until all the latches are released (or we time out 
> waiting for each one). One issue with this is that if we have a large schema, 
> or the retrieval of the schema from the other nodes was unexpectedly slow 
> then we have no explicit check in place to ensure we have actually received a 
> schema before we proceed.
> While it's possible to increase "migration_task_wait_in_seconds" to force the 
> node to wait on each latch longer, there are cases where this doesn't help 
> because the callbacks for the schema pull requests have expired off the 
> messaging service's callback map 
> (org.apache.cassandra.net.MessagingService#callbacks) after 
> request_timeout_in_ms (default 10 seconds) before the other nodes were able 
> to respond to the new node.
> This patch checks for schema agreement between the bootstrapping node and the 
> rest of the live nodes before proceeding with bootstrapping. It also adds a 
> check to prevent the new node from flooding existing nodes with simultaneous 
> schema pull requests as can happen in large clusters.
> Removing the latch system should also prevent new nodes in large clusters 
> getting stuck for extended amounts of time as they wait 
> `migration_task_wait_in_seconds` on each of the latches left orphaned by the 
> timed out callbacks.
>  
> ||3.11||
> |[PoC|https://github.com/apache/cassandra/compare/cassandra-3.11...vincewhite:check_for_schema]|
> |[dtest|https://github.com/apache/cassandra-dtest/compare/master...vincewhite:wait_for_schema_agreement]|
>  
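
As a rough illustration of the approach described in this ticket (not the code from the linked PoC, and not Cassandra's Java implementation), the "wait for schema agreement" check could be sketched in Python as below. The function names, the parameters, and the idea of passing the version lookups in as callables are all assumptions made for this sketch.

```python
import time
from typing import Callable, Dict, Optional


def schema_in_agreement(local_version: Optional[str],
                        peer_versions: Dict[str, Optional[str]]) -> bool:
    """Return True when we hold a schema and every live peer reports the same version.

    A node with no schema yet (None) is never in agreement, which is the
    explicit check the ticket says is missing from the latch-based approach.
    With no live peers, a node that has a schema is trivially in agreement.
    """
    if local_version is None:
        return False
    return all(version == local_version for version in peer_versions.values())


def wait_for_schema_agreement(get_local: Callable[[], Optional[str]],
                              get_peers: Callable[[], Dict[str, Optional[str]]],
                              timeout_s: float = 30.0,
                              poll_interval_s: float = 0.5,
                              sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll until the schema versions agree or the timeout elapses.

    get_local and get_peers are hypothetical callables standing in for
    whatever the node uses to read its own and its peers' schema versions.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if schema_in_agreement(get_local(), get_peers()):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(poll_interval_s)
```

The point of the sketch is that bootstrapping is gated on an observed condition (matching versions) rather than on whether individual pull requests have completed or timed out.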




[jira] [Commented] (CASSANDRA-11928) dtest failure in cql_tracing_test.TestCqlTracing.tracing_simple_test

2020-08-24 Thread Berenguer Blasi (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-11928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183392#comment-17183392
 ] 

Berenguer Blasi commented on CASSANDRA-11928:
-

#justfyi #collaborating CASSANDRA-16073 is my attempt at fixing the tracing test 
failures. Apologies, I missed this ticket. I didn't go down the route of 
investigating mismatched CLs, but I found a few timeouts I could repro locally and 
have a PR that fixes them. If we agree the fix is good, we could close this 
one.

> dtest failure in cql_tracing_test.TestCqlTracing.tracing_simple_test
> 
>
> Key: CASSANDRA-11928
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11928
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest
>Reporter: Craig Kodman
>Priority: Normal
>  Labels: dtest, flaky
>
> example failure:
> http://cassci.datastax.com/job/cassandra-3.0_dtest/727/testReport/cql_tracing_test/TestCqlTracing/tracing_simple_test
> Failed on CassCI build cassandra-3.0_dtest #727
> Is it a problem that the tracing message with the query is missing?




[jira] [Commented] (CASSANDRA-16073) cql_tracing_test.py failing repeatedly on jenkins

2020-08-24 Thread Berenguer Blasi (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-16073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183389#comment-17183389
 ] 

Berenguer Blasi commented on CASSANDRA-16073:
-

After investigating this failure, I suspect the fix should resolve all the tracing 
test failures and hopefully strengthen dtests overall. It was mainly a matter of 
hitting a couple of timeouts. You can repro with a VM with little memory (3GB) by 
lowering CPU until it repros. The fix is basically to increase timeouts, as 
there are no errors, just things being slow, which is consistent with the heavily 
stressed CircleCI and ASF Jenkins test environments.

> cql_tracing_test.py failing repeatedly on jenkins
> -
>
> Key: CASSANDRA-16073
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16073
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-beta
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Jenkins has been failing tracing tests repeatedly




[jira] [Commented] (CASSANDRA-16073) cql_tracing_test.py failing repeatedly on jenkins

2020-08-24 Thread Berenguer Blasi (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-16073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183390#comment-17183390
 ] 

Berenguer Blasi commented on CASSANDRA-16073:
-

[~e.dimitrova] thanks for the pointers. Unfortunately I had the PR ready 
already lol. So I will drop a message on those tickets and we can close as we 
think best.

> cql_tracing_test.py failing repeatedly on jenkins
> -
>
> Key: CASSANDRA-16073
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16073
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-beta
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Jenkins has been failing tracing tests repeatedly




[jira] [Commented] (CASSANDRA-15986) Repair tests flakiness

2020-08-24 Thread Berenguer Blasi (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-15986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183385#comment-17183385
 ] 

Berenguer Blasi commented on CASSANDRA-15986:
-

I see repair failures still. But with CASSANDRA-16070 keeping logs now + a 
'generic timeout' fix I am working on for another flaky in CASSANDRA-16073 yes, 
let's let it roll a few days +1

> Repair tests flakiness
> --
>
> Key: CASSANDRA-15986
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15986
> Project: Cassandra
>  Issue Type: Task
>  Components: Test/dtest
>Reporter: Berenguer Blasi
>Assignee: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 4.0-beta2
>
>
> Repair tests come up in test failure reports every now and then. I have tried 
> to repro the 
> [latest|https://ci-cassandra.apache.org/job/Cassandra-trunk/241/testReport/junit/dtest-novnode.repair_tests.repair_test/TestRepair/test_simple_sequential_repair/]
>  locally 100 times with no luck.
> _dtest-novnode.repair_tests.repair_test/TestRepair/test_simple_sequential_repair_
> Still, from experience fixing other flaky tests, I have some intuition about 
> where the problems may lie. The proposed fix should do no harm if merged. We 
> can reopen the ticket if repair tests keep failing.




[jira] [Commented] (CASSANDRA-16073) cql_tracing_test.py failing repeatedly on jenkins

2020-08-24 Thread Ekaterina Dimitrova (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-16073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183379#comment-17183379
 ] 

Ekaterina Dimitrova commented on CASSANDRA-16073:
-

Hey [~Bereng],
I just noticed this ticket in #Cassandra-noise, I think me and you created 
duplicates of an old ticket :-) 
I opened CASSANDRA-16045 which [~mck2] recognized and linked to CASSANDRA-11928.
I think we might want to close CASSANDRA-16045 and this one and continue the 
work started at CASSANDRA-11928, what do you think?

> cql_tracing_test.py failing repeatedly on jenkins
> -
>
> Key: CASSANDRA-16073
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16073
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Jenkins has been failing tracing tests repeatedly




[jira] [Updated] (CASSANDRA-16073) cql_tracing_test.py failing repeatedly on jenkins

2020-08-24 Thread Berenguer Blasi (Jira)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Berenguer Blasi updated CASSANDRA-16073:

 Bug Category: Parent values: Correctness(12982)
   Complexity: Normal
Discovered By: Unit Test
 Severity: Normal
   Status: Open  (was: Triage Needed)

> cql_tracing_test.py failing repeatedly on jenkins
> -
>
> Key: CASSANDRA-16073
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16073
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
>
> Jenkins has been failing tracing tests repeatedly




[jira] [Updated] (CASSANDRA-16073) cql_tracing_test.py failing repeatedly on jenkins

2020-08-24 Thread Berenguer Blasi (Jira)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Berenguer Blasi updated CASSANDRA-16073:

Fix Version/s: 4.0-beta

> cql_tracing_test.py failing repeatedly on jenkins
> -
>
> Key: CASSANDRA-16073
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16073
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Jenkins has been failing tracing tests repeatedly




[jira] [Created] (CASSANDRA-16073) cql_tracing_test.py failing repeatedly on jenkins

2020-08-24 Thread Berenguer Blasi (Jira)

Berenguer Blasi created CASSANDRA-16073:
---

 Summary: cql_tracing_test.py failing repeatedly on jenkins
 Key: CASSANDRA-16073
 URL: https://issues.apache.org/jira/browse/CASSANDRA-16073
 Project: Cassandra
  Issue Type: Bug
  Components: Test/dtest
Reporter: Berenguer Blasi
Assignee: Berenguer Blasi


Jenkins has been failing tracing tests repeatedly




[jira] [Commented] (CASSANDRA-16060) Cassandra crashes with OutOfMemory Exception

2020-08-24 Thread Brandon Williams (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-16060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183368#comment-17183368
 ] 

Brandon Williams commented on CASSANDRA-16060:
--

bq. They only have 58 elements in total, but each QueuedMessage has a size of 
~113MB.

That is curiously large.  Can you post your cassandra.yaml?

> Cassandra crashes with OutOfMemory Exception
> 
>
> Key: CASSANDRA-16060
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16060
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Thomas De Keulenaer
>Priority: Normal
>
> Our Cassandra instance has run perfectly for almost 5 years, but about a month 
> ago, the performance dropped significantly. Reads are incredibly slow or 
> time out, making the cluster almost useless. The load on the nodes has 
> skyrocketed to almost 100% CPU usage. On top of that, the nodes crash with 
> OutOfMemoryError. 
> I also have [heap 
> dumps|https://arcelormittal-my.sharepoint.com/:f:/g/personal/sidtdke_sidmar_be/Es7PaQq15jdKgIvDfMFfBRoBR6sr60t156AxbA0dFZvzJg?e=rohgHn].
>  
> Cassandra version: 3.11.1
> Java version: OpenJDK 1.8.0_201
> We have upgraded our nodes to version 3.11.7 and applied the recommended 
> OS/system setting, but this has not improved performance.
> {code:java}
> ERROR [ReadStage-9] 2020-08-10 14:56:01,399 JVMStabilityInspector.java:142 - 
> JVM state determined to be unstable.  Exiting forcefully due to:
> java.lang.OutOfMemoryError: Java heap space
>   at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57) 
> ~[na:1.8.0_201]
>   at java.nio.ByteBuffer.allocate(ByteBuffer.java:335) ~[na:1.8.0_201]
>   at 
> org.apache.cassandra.io.util.DataOutputBuffer.reallocate(DataOutputBuffer.java:159)
>  ~[apache-cassandra-3.11.1.jar:3.11.1]
>   at 
> org.apache.cassandra.io.util.DataOutputBuffer.doFlush(DataOutputBuffer.java:119)
>  ~[apache-cassandra-3.11.1.jar:3.11.1]
>   at 
> org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.write(BufferedDataOutputStreamPlus.java:132)
>  ~[apache-cassandra-3.11.1.jar:3.11.1]
>   at 
> org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.write(BufferedDataOutputStreamPlus.java:151)
>  ~[apache-cassandra-3.11.1.jar:3.11.1]
>   at 
> org.apache.cassandra.utils.ByteBufferUtil.writeWithVIntLength(ByteBufferUtil.java:296)
>  ~[apache-cassandra-3.11.1.jar:3.11.1]
>   at 
> org.apache.cassandra.db.marshal.AbstractType.writeValue(AbstractType.java:413)
>  ~[apache-cassandra-3.11.1.jar:3.11.1]
>   at 
> org.apache.cassandra.db.rows.Cell$Serializer.serialize(Cell.java:210) 
> ~[apache-cassandra-3.11.1.jar:3.11.1]
>   at 
> org.apache.cassandra.db.rows.UnfilteredSerializer.lambda$serializeRowBody$0(UnfilteredSerializer.java:248)
>  ~[apache-cassandra-3.11.1.jar:3.11.1]
>   at 
> org.apache.cassandra.db.rows.UnfilteredSerializer$$Lambda$120/205506350.accept(Unknown
>  Source) ~[na:na]
>   at 
> org.apache.cassandra.utils.btree.BTree.applyForwards(BTree.java:1242) 
> ~[apache-cassandra-3.11.1.jar:3.11.1]
>   at org.apache.cassandra.utils.btree.BTree.apply(BTree.java:1197) 
> ~[apache-cassandra-3.11.1.jar:3.11.1]
>   at org.apache.cassandra.db.rows.BTreeRow.apply(BTreeRow.java:172) 
> ~[apache-cassandra-3.11.1.jar:3.11.1]
>   at 
> org.apache.cassandra.db.rows.UnfilteredSerializer.serializeRowBody(UnfilteredSerializer.java:236)
>  ~[apache-cassandra-3.11.1.jar:3.11.1]
>   at 
> org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:205)
>  ~[apache-cassandra-3.11.1.jar:3.11.1]
>   at 
> org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:137)
>  ~[apache-cassandra-3.11.1.jar:3.11.1]
>   at 
> org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:125)
>  ~[apache-cassandra-3.11.1.jar:3.11.1]
>   at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:137)
>  ~[apache-cassandra-3.11.1.jar:3.11.1]
>   at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:92)
>  ~[apache-cassandra-3.11.1.jar:3.11.1]
>   at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:79)
>  ~[apache-cassandra-3.11.1.jar:3.11.1]
>   at 
> org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:308)
>  ~[apache-cassandra-3.11.1.jar:3.11.1]
>   at 
> org.apache.cassandra.db.ReadResponse$LocalDataResponse.build(ReadResponse.java:167)
>  ~[apache-cassandra-3.11.1.jar:3.11.1]
>   at 
> org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:160)
>  ~[apache-cassandra-3.11.1.jar:3.

[jira] [Commented] (CASSANDRA-16060) Cassandra crashes with OutOfMemory Exception

2020-08-24 Thread Thomas De Keulenaer (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-16060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183365#comment-17183365
 ] 

Thomas De Keulenaer commented on CASSANDRA-16060:
-

The OutboundTcpConnection ("MessagingService-Outgoing-/139.53.231.101-Large") 
has more than 6GB of heap data. That heap data is spread over 2 containers 
(LinkedBlockingQueue and ArrayList) of OutboundTcpConnection$QueuedMessage. 
They only have 58 elements in total, but each QueuedMessage has a size of 
~113MB.
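
A quick sanity check of the figures in this comment: 58 queued messages at roughly 113MB each comes to about 6.4GB, which is consistent with the "more than 6GB" of heap attributed to the connection. The snippet below is only that arithmetic; it assumes 1GB = 1024MB and treats the approximate per-message size as exact.

```python
QUEUED_MESSAGES = 58         # total elements across both containers, per the comment
APPROX_MB_PER_MESSAGE = 113  # approximate size of each QueuedMessage, per the comment


def estimated_heap_gb(count: int, mb_each: float) -> float:
    """Estimate total retained heap in GB, assuming 1 GB = 1024 MB."""
    return count * mb_each / 1024


total_gb = estimated_heap_gb(QUEUED_MESSAGES, APPROX_MB_PER_MESSAGE)
print(f"{QUEUED_MESSAGES} x {APPROX_MB_PER_MESSAGE} MB is about {total_gb:.1f} GB")
```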



> Cassandra crashes with OutOfMemory Exception
> 
>
> Key: CASSANDRA-16060
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16060
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Thomas De Keulenaer
>Priority: Normal
>
> Our Cassandra instance has run perfectly for almost 5 years, but about a month 
> ago, the performance dropped significantly. Reads are incredibly slow or 
> time out, making the cluster almost useless. The load on the nodes has 
> skyrocketed to almost 100% CPU usage. On top of that, the nodes crash with 
> OutOfMemoryError. 
> I also have [heap 
> dumps|https://arcelormittal-my.sharepoint.com/:f:/g/personal/sidtdke_sidmar_be/Es7PaQq15jdKgIvDfMFfBRoBR6sr60t156AxbA0dFZvzJg?e=rohgHn].
>  
> Cassandra version: 3.11.1
> Java version: OpenJDK 1.8.0_201
> We have upgraded our nodes to version 3.11.7 and applied the recommended 
> OS/system setting, but this has not improved performance.
> {code:java}
> ERROR [ReadStage-9] 2020-08-10 14:56:01,399 JVMStabilityInspector.java:142 - 
> JVM state determined to be unstable.  Exiting forcefully due to:
> java.lang.OutOfMemoryError: Java heap space
>   at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57) 
> ~[na:1.8.0_201]
>   at java.nio.ByteBuffer.allocate(ByteBuffer.java:335) ~[na:1.8.0_201]
>   at 
> org.apache.cassandra.io.util.DataOutputBuffer.reallocate(DataOutputBuffer.java:159)
>  ~[apache-cassandra-3.11.1.jar:3.11.1]
>   at 
> org.apache.cassandra.io.util.DataOutputBuffer.doFlush(DataOutputBuffer.java:119)
>  ~[apache-cassandra-3.11.1.jar:3.11.1]
>   at 
> org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.write(BufferedDataOutputStreamPlus.java:132)
>  ~[apache-cassandra-3.11.1.jar:3.11.1]
>   at 
> org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.write(BufferedDataOutputStreamPlus.java:151)
>  ~[apache-cassandra-3.11.1.jar:3.11.1]
>   at 
> org.apache.cassandra.utils.ByteBufferUtil.writeWithVIntLength(ByteBufferUtil.java:296)
>  ~[apache-cassandra-3.11.1.jar:3.11.1]
>   at 
> org.apache.cassandra.db.marshal.AbstractType.writeValue(AbstractType.java:413)
>  ~[apache-cassandra-3.11.1.jar:3.11.1]
>   at 
> org.apache.cassandra.db.rows.Cell$Serializer.serialize(Cell.java:210) 
> ~[apache-cassandra-3.11.1.jar:3.11.1]
>   at 
> org.apache.cassandra.db.rows.UnfilteredSerializer.lambda$serializeRowBody$0(UnfilteredSerializer.java:248)
>  ~[apache-cassandra-3.11.1.jar:3.11.1]
>   at 
> org.apache.cassandra.db.rows.UnfilteredSerializer$$Lambda$120/205506350.accept(Unknown
>  Source) ~[na:na]
>   at 
> org.apache.cassandra.utils.btree.BTree.applyForwards(BTree.java:1242) 
> ~[apache-cassandra-3.11.1.jar:3.11.1]
>   at org.apache.cassandra.utils.btree.BTree.apply(BTree.java:1197) 
> ~[apache-cassandra-3.11.1.jar:3.11.1]
>   at org.apache.cassandra.db.rows.BTreeRow.apply(BTreeRow.java:172) 
> ~[apache-cassandra-3.11.1.jar:3.11.1]
>   at 
> org.apache.cassandra.db.rows.UnfilteredSerializer.serializeRowBody(UnfilteredSerializer.java:236)
>  ~[apache-cassandra-3.11.1.jar:3.11.1]
>   at 
> org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:205)
>  ~[apache-cassandra-3.11.1.jar:3.11.1]
>   at 
> org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:137)
>  ~[apache-cassandra-3.11.1.jar:3.11.1]
>   at 
> org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:125)
>  ~[apache-cassandra-3.11.1.jar:3.11.1]
>   at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:137)
>  ~[apache-cassandra-3.11.1.jar:3.11.1]
>   at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:92)
>  ~[apache-cassandra-3.11.1.jar:3.11.1]
>   at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:79)
>  ~[apache-cassandra-3.11.1.jar:3.11.1]
>   at 
> org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:308)
>  ~[apache-cassandra-3.11.1.jar:3.11.1]
>   at 
> org.apache.cassandra.db.ReadResponse$LocalDataResponse.build(ReadResponse.java

[jira] [Updated] (CASSANDRA-16070) Add dtest option to keep ccm test directories for just failed tests

2020-08-24 Thread Michael Semb Wever (Jira)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Semb Wever updated CASSANDRA-16070:
---
  Fix Version/s: (was: 4.0-beta)
 4.0-beta2
Source Control Link: 
https://github.com/apache/cassandra-dtest/commit/cefddf845d63919c6e7b5efa35b28fe7a5ad1142
 and 
https://github.com/apache/cassandra-builds/commit/3c91749e9f13b2b728b193197291a368dce6dc8a
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

Committed [dtest 
cefddf845d63919c6e7b5efa35b28fe7a5ad1142|https://github.com/apache/cassandra-dtest/commit/cefddf845d63919c6e7b5efa35b28fe7a5ad1142]
 and [builds 
3c91749e9f13b2b728b193197291a368dce6dc8a|https://github.com/apache/cassandra-builds/commit/3c91749e9f13b2b728b193197291a368dce6dc8a].

> Add dtest option to keep ccm test directories for just failed tests
> ---
>
> Key: CASSANDRA-16070
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16070
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest
>Reporter: Michael Semb Wever
>Assignee: Michael Semb Wever
>Priority: Normal
> Fix For: 4.0-beta2
>
>
> DTests already have an option {{`--keep-test-dir`}} that keeps the ccm test 
> directories. This is useful for debugging failures, especially those that 
> can't be reproduced locally.
> [Introducing|https://github.com/apache/cassandra-builds/commit/51eb85b57b62a542ca456e52a20bee06955f6ec1#diff-a885314255cf7d5c7c04889bf01aa2ab]
>  this option to ci-cassandra.a.o 
> [failed|https://github.com/apache/cassandra-builds/commit/d1600acde19cdbd906b0ff89318d3e8a3f400a70#diff-a885314255cf7d5c7c04889bf01aa2ab]
>  due to lack of disk space.
> This 
> [patch|https://github.com/apache/cassandra-dtest/compare/master...thelastpickle:mck/keep_failed_test_dir]
>  introduces a new option {{`--keep-failed-test-dir`}} that keeps the ccm test 
> directory only for dtests that fail.
> This should suffice. If disk space is still a problem, a further option, 
> {{`--keep-failed-test-log-dir`}}, can be added that keeps only the logs inside 
> the ccm test directory, as most of the space in these directories is taken up 
> by the cassandra data directories.
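
The behaviour described above can be sketched as a small decision function. This is a minimal illustration written for this note, not the code from the linked patch: the function names and the way the flags are passed in are assumptions, and the real change hooks into the dtest pytest setup rather than standing alone like this.

```python
import shutil
from pathlib import Path


def should_keep_test_dir(keep_test_dir: bool,
                         keep_failed_test_dir: bool,
                         test_failed: bool) -> bool:
    """Decide whether a ccm test directory survives teardown.

    keep_test_dir mirrors --keep-test-dir (keep everything);
    keep_failed_test_dir mirrors --keep-failed-test-dir (keep only on failure).
    """
    if keep_test_dir:
        return True
    return keep_failed_test_dir and test_failed


def cleanup_test_dir(test_dir: Path, keep: bool) -> None:
    """Remove the test directory unless it should be kept for debugging."""
    if not keep:
        shutil.rmtree(test_dir, ignore_errors=True)
```

With only the failed-test flag set, passing tests are cleaned up as usual, which is what keeps disk usage bounded on CI.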




[cassandra-builds] branch master updated: Use new dtest option to keep ccm test directories for just failed tests

2020-08-24 Thread mck

This is an automated email from the ASF dual-hosted git repository.

mck pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/cassandra-builds.git


The following commit(s) were added to refs/heads/master by this push:
 new 3c91749  Use new dtest option to keep ccm test directories for just 
failed tests
3c91749 is described below

commit 3c91749e9f13b2b728b193197291a368dce6dc8a
Author: mck 
AuthorDate: Mon Aug 24 10:57:06 2020 +0200

Use new dtest option to keep ccm test directories for just failed tests

 patch by Mick Semb Wever; reviewed by Brandon Williams for CASSANDRA-16070
---
 build-scripts/cassandra-dtest-pytest.sh | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/build-scripts/cassandra-dtest-pytest.sh b/build-scripts/cassandra-dtest-pytest.sh
index ce8283e..28b9d38 100755
--- a/build-scripts/cassandra-dtest-pytest.sh
+++ b/build-scripts/cassandra-dtest-pytest.sh
@@ -68,13 +68,13 @@ cd cassandra-dtest/
 mkdir -p ${TMPDIR}
 set +e # disable immediate exit from this point
 if [ "${DTEST_TARGET}" = "dtest" ]; then
-DTEST_ARGS="--use-vnodes --num-tokens=${NUM_TOKENS} --skip-resource-intensive-tests"
+DTEST_ARGS="--use-vnodes --num-tokens=${NUM_TOKENS} --skip-resource-intensive-tests --keep-failed-test-dir"
 elif [ "${DTEST_TARGET}" = "dtest-novnode" ]; then
-DTEST_ARGS="--skip-resource-intensive-tests"
+DTEST_ARGS="--skip-resource-intensive-tests --keep-failed-test-dir"
 elif [ "${DTEST_TARGET}" = "dtest-offheap" ]; then
-DTEST_ARGS="--use-vnodes --num-tokens=${NUM_TOKENS} --use-off-heap-memtables --skip-resource-intensive-tests"
+DTEST_ARGS="--use-vnodes --num-tokens=${NUM_TOKENS} --use-off-heap-memtables --skip-resource-intensive-tests --keep-failed-test-dir"
 elif [ "${DTEST_TARGET}" = "dtest-large" ]; then
-DTEST_ARGS="--use-vnodes --num-tokens=${NUM_TOKENS} --only-resource-intensive-tests"
+DTEST_ARGS="--use-vnodes --num-tokens=${NUM_TOKENS} --only-resource-intensive-tests --keep-failed-test-dir"
 elif [ "${DTEST_TARGET}" = "dtest-upgrade" ]; then
 DTEST_ARGS="--execute-upgrade-tests-only "
 export RUN_STATIC_UPGRADE_MATRIX=true



[jira] [Updated] (CASSANDRA-16070) Add dtest option to keep ccm test directories for just failed tests

2020-08-24 Thread Michael Semb Wever (Jira)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Semb Wever updated CASSANDRA-16070:
---
Status: Ready to Commit  (was: Review In Progress)

> Add dtest option to keep ccm test directories for just failed tests
> ---
>
> Key: CASSANDRA-16070
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16070
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest
>Reporter: Michael Semb Wever
>Assignee: Michael Semb Wever
>Priority: Normal
> Fix For: 4.0-beta
>
>
> DTests already have an option {{`--keep-test-dir`}} that keeps the ccm test 
> directories. This is useful for debugging failures, especially those that 
> can't be reproduced locally.
> [Introducing|https://github.com/apache/cassandra-builds/commit/51eb85b57b62a542ca456e52a20bee06955f6ec1#diff-a885314255cf7d5c7c04889bf01aa2ab]
>  this option to ci-cassandra.a.o 
> [failed|https://github.com/apache/cassandra-builds/commit/d1600acde19cdbd906b0ff89318d3e8a3f400a70#diff-a885314255cf7d5c7c04889bf01aa2ab]
>  due to lack of disk space.
> This 
> [patch|https://github.com/apache/cassandra-dtest/compare/master...thelastpickle:mck/keep_failed_test_dir]
>  introduces a new option {{`--keep-failed-test-dir`}} that keeps the ccm test 
> directory only for dtests that fail.
> This should suffice. If disk space is still a problem, a further option, 
> {{`--keep-failed-test-log-dir`}}, can be added that keeps only the logs inside 
> the ccm test directory, as most of the space in these directories is taken up 
> by the cassandra data directories.




[jira] [Updated] (CASSANDRA-16070) Add dtest option to keep ccm test directories for just failed tests

2020-08-24 Thread Michael Semb Wever (Jira)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Semb Wever updated CASSANDRA-16070:
---
Reviewers: Brandon Williams, Michael Semb Wever  (was: Berenguer Blasi)
   Status: Review In Progress  (was: Patch Available)

> Add dtest option to keep ccm test directories for just failed tests
> ---
>
> Key: CASSANDRA-16070
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16070
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest
>Reporter: Michael Semb Wever
>Assignee: Michael Semb Wever
>Priority: Normal
> Fix For: 4.0-beta
>
>
> DTests already have an option {{`--keep-test-dir`}} that keeps the ccm test 
> directories. This is useful for debugging failures, especially those that 
> can't be reproduced locally.
> [Introducing|https://github.com/apache/cassandra-builds/commit/51eb85b57b62a542ca456e52a20bee06955f6ec1#diff-a885314255cf7d5c7c04889bf01aa2ab]
>  this option to ci-cassandra.a.o 
> [failed|https://github.com/apache/cassandra-builds/commit/d1600acde19cdbd906b0ff89318d3e8a3f400a70#diff-a885314255cf7d5c7c04889bf01aa2ab]
>  due to lack of disk space.
> This 
> [patch|https://github.com/apache/cassandra-dtest/compare/master...thelastpickle:mck/keep_failed_test_dir]
>  introduces a new option {{`--keep-failed-test-dir`}} that keeps the ccm test 
> directory only for dtests that fail.
> This should suffice. If disk space is still a problem, a further option, 
> {{`--keep-failed-test-log-dir`}}, can be added that keeps only the logs inside 
> the ccm test directory, as most of the space in these directories is taken up 
> by the cassandra data directories.




[jira] [Commented] (CASSANDRA-15909) Make Table/Keyspace Metric Names Consistent With Each Other

2020-08-24 Thread Caleb Rackliffe (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-15909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183354#comment-17183354
 ] 

Caleb Rackliffe commented on CASSANDRA-15909:
-

Thanks [~spmallette]. I'll throw this one (and possibly CASSANDRA-15821) in my 
knapsack and carry it forward.

> Make Table/Keyspace Metric Names Consistent With Each Other
> ---
>
> Key: CASSANDRA-15909
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15909
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability/Metrics
>Reporter: Stephen Mallette
>Assignee: Caleb Rackliffe
>Priority: Normal
> Fix For: 4.0-beta
>
>
> As part of CASSANDRA-15821 it became apparent that certain metric names found 
> in keyspace and tables had different names but were in fact the same metric - 
> they are as follows:
> * Table.SyncTime == Keyspace.RepairSyncTime
> * Table.RepairedDataTrackingOverreadRows == Keyspace.RepairedOverreadRows
> * Table.RepairedDataTrackingOverreadTime == Keyspace.RepairedOverreadTime
> * Table.AllMemtablesHeapSize == Keyspace.AllMemtablesOnHeapDataSize
> * Table.AllMemtablesOffHeapSize == Keyspace.AllMemtablesOffHeapDataSize
> * Table.MemtableOnHeapSize == Keyspace.MemtableOnHeapDataSize
> * Table.MemtableOffHeapSize == Keyspace.MemtableOffHeapDataSize
> Also, client metrics are the only metrics to start with a lower case letter. 
> Change those to upper case to match all the other metrics.
> Unifying this naming would help make metrics more consistent as part of 
> CASSANDRA-15582




[jira] [Updated] (CASSANDRA-15909) Make Table/Keyspace Metric Names Consistent With Each Other

2020-08-24 Thread Caleb Rackliffe (Jira)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caleb Rackliffe updated CASSANDRA-15909:

Authors: Caleb Rackliffe, Stephen Mallette  (was: Caleb Rackliffe)

> Make Table/Keyspace Metric Names Consistent With Each Other
> ---
>
> Key: CASSANDRA-15909
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15909
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability/Metrics
>Reporter: Stephen Mallette
>Assignee: Caleb Rackliffe
>Priority: Normal
> Fix For: 4.0-beta
>
>
> As part of CASSANDRA-15821 it became apparent that certain metric names found 
> in keyspace and tables had different names but were in fact the same metric - 
> they are as follows:
> * Table.SyncTime == Keyspace.RepairSyncTime
> * Table.RepairedDataTrackingOverreadRows == Keyspace.RepairedOverreadRows
> * Table.RepairedDataTrackingOverreadTime == Keyspace.RepairedOverreadTime
> * Table.AllMemtablesHeapSize == Keyspace.AllMemtablesOnHeapDataSize
> * Table.AllMemtablesOffHeapSize == Keyspace.AllMemtablesOffHeapDataSize
> * Table.MemtableOnHeapSize == Keyspace.MemtableOnHeapDataSize
> * Table.MemtableOffHeapSize == Keyspace.MemtableOffHeapDataSize
> Also, client metrics are the only metrics to start with a lower case letter. 
> Change those to upper case to match all the other metrics.
> Unifying this naming would help make metrics more consistent as part of 
> CASSANDRA-15582




[cassandra-dtest] branch master updated: Add "--keep-failed-test-dir" option that only keeps the ccm test directory for failed tests

2020-08-24 Thread mck

This is an automated email from the ASF dual-hosted git repository.

mck pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/cassandra-dtest.git


The following commit(s) were added to refs/heads/master by this push:
 new cefddf8  Add "--keep-failed-test-dir" option that only keeps the ccm 
test directory for failed tests
cefddf8 is described below

commit cefddf845d63919c6e7b5efa35b28fe7a5ad1142
Author: Mick Semb Wever 
AuthorDate: Sun Aug 23 23:26:31 2020 +0200

Add "--keep-failed-test-dir" option that only keeps the ccm test directory 
for failed tests

 patch by Mick Semb Wever; reviewed by Brandon Williams
---
 conftest.py | 12 +++-
 dtest_config.py |  2 ++
 dtest_setup.py  |  4 ++--
 3 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/conftest.py b/conftest.py
index 1fa5e22..25cc791 100644
--- a/conftest.py
+++ b/conftest.py
@@ -81,6 +81,9 @@ def pytest_addoption(parser):
 parser.addoption("--keep-test-dir", action="store_true", default=False,
  help="Do not remove/cleanup the test ccm cluster 
directory and it's artifacts "
   "after the test completes")
+parser.addoption("--keep-failed-test-dir", action="store_true", 
default=False,
+ help="Do not remove/cleanup the test ccm cluster 
directory and it's artifacts "
+  "after the test fails")
 parser.addoption("--enable-jacoco-code-coverage", action="store_true", 
default=False,
  help="Enable JaCoCo Code Coverage Support")
 parser.addoption("--upgrade-version-selection", action="store", 
default="indev",
@@ -285,6 +288,13 @@ def fixture_dtest_create_cluster_func():
 """
 return DTestSetup.create_ccm_cluster
 
+@pytest.hookimpl(hookwrapper=True, tryfirst=True)
+def pytest_runtest_makereport(item, call):
+outcome = yield
+rep = outcome.get_result()
+setattr(item, "rep_" + rep.when, rep)
+return rep
+
 @pytest.fixture(scope='function', autouse=False)
 def fixture_dtest_setup(request,
 dtest_config,
@@ -336,7 +346,7 @@ def fixture_dtest_setup(request,
 except Exception as e:
 logger.error("Error saving log:", str(e))
 finally:
-dtest_setup.cleanup_cluster()
+dtest_setup.cleanup_cluster(request)
 
 
 #Based on https://bugs.python.org/file25808/14894.patch
diff --git a/dtest_config.py b/dtest_config.py
index 25e9550..bb5ce8c 100644
--- a/dtest_config.py
+++ b/dtest_config.py
@@ -20,6 +20,7 @@ class DTestConfig:
 self.execute_upgrade_tests_only = False
 self.disable_active_log_watching = False
 self.keep_test_dir = False
+self.keep_failed_test_dir = False
 self.enable_jacoco_code_coverage = False
 self.jemalloc_path = find_libjemalloc()
 
@@ -42,6 +43,7 @@ class DTestConfig:
 self.execute_upgrade_tests_only = 
request.config.getoption("--execute-upgrade-tests-only")
 self.disable_active_log_watching = 
request.config.getoption("--disable-active-log-watching")
 self.keep_test_dir = request.config.getoption("--keep-test-dir")
+self.keep_failed_test_dir = 
request.config.getoption("--keep-failed-test-dir")
 self.enable_jacoco_code_coverage = 
request.config.getoption("--enable-jacoco-code-coverage")
 
 def get_version_from_build(self):
diff --git a/dtest_setup.py b/dtest_setup.py
index abc50b5..646bc23 100644
--- a/dtest_setup.py
+++ b/dtest_setup.py
@@ -347,9 +347,9 @@ class DTestSetup(object):
 """
 self.log_watch_thread.join(timeout=60)
 
-def cleanup_cluster(self):
+def cleanup_cluster(self, request):
 with log_filter('cassandra'):  # quiet noise from driver when nodes 
start going down
-if self.dtest_config.keep_test_dir:
+if self.dtest_config.keep_test_dir or 
(self.dtest_config.keep_failed_test_dir and request.node.rep_call.failed):
 
self.cluster.stop(gently=self.dtest_config.enable_jacoco_code_coverage)
 else:
 # when recording coverage the jvm has to exit normally
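
The cleanup decision in the patch above can be read as a small predicate. The following is a minimal sketch, not the dtest code itself: `should_keep_test_dir` and its parameter names are hypothetical, standing in for `dtest_config.keep_test_dir`, `dtest_config.keep_failed_test_dir` and `request.node.rep_call.failed` from the diff.

```python
def should_keep_test_dir(keep_test_dir: bool,
                         keep_failed_test_dir: bool,
                         test_failed: bool) -> bool:
    """Return True when the ccm test directory should be left on disk.

    Mirrors the condition added to cleanup_cluster() in the diff:
    --keep-test-dir always keeps the directory, while
    --keep-failed-test-dir keeps it only when the test call failed.
    """
    return keep_test_dir or (keep_failed_test_dir and test_failed)
```

With only `--keep-failed-test-dir` set, a passing test would still have its directory removed, which is what keeps disk usage down on CI.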



[jira] [Assigned] (CASSANDRA-15909) Make Table/Keyspace Metric Names Consistent With Each Other

2020-08-24 Thread Caleb Rackliffe (Jira)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caleb Rackliffe reassigned CASSANDRA-15909:
---

Assignee: Caleb Rackliffe

> Make Table/Keyspace Metric Names Consistent With Each Other
> ---
>
> Key: CASSANDRA-15909
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15909
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability/Metrics
>Reporter: Stephen Mallette
>Assignee: Caleb Rackliffe
>Priority: Normal
> Fix For: 4.0-beta
>
>
> As part of CASSANDRA-15821 it became apparent that certain metric names found 
> in keyspace and tables had different names but were in fact the same metric - 
> they are as follows:
> * Table.SyncTime == Keyspace.RepairSyncTime
> * Table.RepairedDataTrackingOverreadRows == Keyspace.RepairedOverreadRows
> * Table.RepairedDataTrackingOverreadTime == Keyspace.RepairedOverreadTime
> * Table.AllMemtablesHeapSize == Keyspace.AllMemtablesOnHeapDataSize
> * Table.AllMemtablesOffHeapSize == Keyspace.AllMemtablesOffHeapDataSize
> * Table.MemtableOnHeapSize == Keyspace.MemtableOnHeapDataSize
> * Table.MemtableOffHeapSize == Keyspace.MemtableOffHeapDataSize
> Also, client metrics are the only metrics to start with a lower case letter. 
> Change those to upper case to match all the other metrics.
> Unifying this naming would help make metrics more consistent as part of 
> CASSANDRA-15582




[jira] [Commented] (CASSANDRA-15937) JMX output inconsistencies from CASSANDRA-7544 storage-port-configurable-per-node

2020-08-24 Thread Jon Meredith (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-15937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1718#comment-1718
 ] 

Jon Meredith commented on CASSANDRA-15937:
--

Will do - looks like regexes in tests need to be updated for the toString vs 
getHostByAddressAndPort formats.  I'm pleasantly surprised they're being tested.

> JMX output inconsistencies from CASSANDRA-7544 
> storage-port-configurable-per-node
> -
>
> Key: CASSANDRA-15937
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15937
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/JMX
>Reporter: Jon Meredith
>Assignee: Jon Meredith
>Priority: Normal
> Fix For: 4.0-beta
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> CASSANDRA-7544 introduced changes to allow the storage port number to be 
> configured per-node. As part of that work it introduces new MBeans for 
> MessagingService, FailureDetector providing new 'WithPort' versions that 
> include the new port information, however there are some mistakes and 
> inconsistencies.
> {code:java}
>                           3.11.6                 trunk                       trunk w/Port               Notes
>
>  AllEndpointStates        /127.0.0.1\n...        /127.0.0.3\n...             127.0.0.3:7000\n           (trunk /w port different)
>  SimpleStates             /127.0.0.2=UP          /127.0.0.2=UP               127.0.0.3:7000=UP          (trunk /w port different)
>  LargeMessagePendingTasks /127.0.0.1=0           /127.0.0.1=0                127.0.0.3:7000=0           (trunk /w port different)
>  TimeoutsPerHost          127.0.0.1=0            /127.0.0.1=0                127.0.0.3:7000=0           3.0/3.11.6 & trunk differ.
>  BackPressurePerHost      127.0.0.1=Infinity     /127.0.0.2=Infinity         /127.0.0.2=Infinity        3.11 & trunk differ, missing port number for BackPressurePerHostWithPort
>  SchemaVersions           {...=[127.0.0.1,...]}  {...=[127.0.0.1,...]}       {...=[127.0.0.1:7000,...]
>
>  TokenToEndpointMap       {-92...8=127.0.0.1,    -92...8=127.0.0.1           -92..8=127.0.0.1:7000
>  HostIdMap                127.0.0.1=1ee..6f0af   127.0.0.1=e06...7e          MISSING                    Deprecated for EndpointToHostId
>  EndpointToHostId         127.0.0.1=1ee..6f0a    127.0.0.1=e06...7e          127.0.0.1:7000=e0..7e
>  HostIdToEndpoint         1ee..6f0a=127.0.0.1    e06..7e=127.0.0.1           e06..7e=127.0.0.1:7000
>  LoadMap                  127.0.0.1=185.01 KiB   127.0.0.1:7000=106.08 KiB   127.0.0.1=106.08 Ki        LoadMap and LoadMapWithPort are flipped.
>  LiveNodes                127.0.0.1              127.0.0.1                   127.0.0.1:7000
>  Ownership                /127.0.0.1=0.33        /127.0.0.1=0.33             127.0.0.1:7000=0.33
>  Scores                   /127.0.0.1=0.0         /127.0.0.1=0.0              127.0.0.1:7000=0.0
>   {code}
>  
>  Proposed changes
>   
>  1) AllEndpointStates, SimpleStates, Connection message tracking, 
> TimeoutsPerHost - include the host/ip:port in the WithPort version
> 2) Add port number to BackPressurePerHostWithPort
> 3) Correct LoadMap to omit port / LoadMapWithPort to include port
> 4) Ownership - update with port to host/ip:port version
> 5) Scores - update with port to host/ip:port version
>   
>   
>  Additionally while dumping out all of the JMX info with `sjk mxdump`
>   
> 6) DynamicEndpointSnitch.getScoresWithPort now returns an InetAddressAndPort 
> which should just be a String
> 7) ClientMetrics.clientsByProtocolVersion returns a Guava Immutable map
> 8) StorageService.getIdealConsistencyLevel fails if none set (as we try and 
> call ConsistencyLevel.toString on a null pointer).
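
The two endpoint formats in the table above (`/127.0.0.1` versus `127.0.0.1:7000`) are what test regexes would have to accept. The following is a minimal sketch only, assuming IPv4 addresses; `ENDPOINT_RE` and `parse_endpoint` are hypothetical names and are not taken from the Cassandra test suite.

```python
import re

# Accepts the legacy "/127.0.0.1" form (leading slash, no port) as well as
# the "127.0.0.1:7000" form (no slash, explicit port). IPv4 only.
ENDPOINT_RE = re.compile(
    r"^/?(?P<host>\d{1,3}(?:\.\d{1,3}){3})(?::(?P<port>\d+))?$"
)


def parse_endpoint(text):
    """Split an endpoint string into (host, port); port is None if absent."""
    match = ENDPOINT_RE.match(text)
    if match is None:
        raise ValueError("unrecognised endpoint format: %r" % (text,))
    port = match.group("port")
    return (match.group("host"), int(port) if port is not None else None)
```

A check written this way tolerates both outputs, so it would not catch a `WithPort` attribute that silently omits the port; a stricter test would assert that `port` is not `None` for those attributes.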




[jira] [Commented] (CASSANDRA-16070) Add dtest option to keep ccm test directories for just failed tests

2020-08-24 Thread Brandon Williams (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-16070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183297#comment-17183297
 ] 

Brandon Williams commented on CASSANDRA-16070:
--

+1, though I think in most cases the logs are sufficient.

> Add dtest option to keep ccm test directories for just failed tests
> ---
>
> Key: CASSANDRA-16070
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16070
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest
>Reporter: Michael Semb Wever
>Assignee: Michael Semb Wever
>Priority: Normal
> Fix For: 4.0-beta
>
>
> DTests already have an option {{`--keep-test-dir`}} that keeps the ccm test 
> directories. This is useful for debugging failures, especially those that 
> can't be reproduced locally.
> [Introducing|https://github.com/apache/cassandra-builds/commit/51eb85b57b62a542ca456e52a20bee06955f6ec1#diff-a885314255cf7d5c7c04889bf01aa2ab]
>  this option to ci-cassandra.a.o 
> [failed|https://github.com/apache/cassandra-builds/commit/d1600acde19cdbd906b0ff89318d3e8a3f400a70#diff-a885314255cf7d5c7c04889bf01aa2ab]
>  due to lack of disk space.
> This 
> [patch|https://github.com/apache/cassandra-dtest/compare/master...thelastpickle:mck/keep_failed_test_dir]
>  introduces a new option {{`--keep-failed-test-dir`}} that keeps the ccm test 
> directory only for dtests that fail.
> This should suffice; if disk space is still a problem, a further option of 
> {{`--keep-failed-test-log-dir`}} can be added that only keeps the logs inside 
> the ccm test directory, as most of the space in these directories is taken 
> up by the cassandra data directories.




[jira] [Updated] (CASSANDRA-16071) max_compaction_flush_memory_in_mb is interpreted as bytes

2020-08-24 Thread Michael Semb Wever (Jira)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Semb Wever updated CASSANDRA-16071:
---
Status: In Progress  (was: Patch Available)

> max_compaction_flush_memory_in_mb is interpreted as bytes
> -
>
> Key: CASSANDRA-16071
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16071
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/SASI
>Reporter: Michael Semb Wever
>Assignee: Michael Semb Wever
>Priority: Normal
> Fix For: 3.11.x, 4.0-beta
>
>
> In CASSANDRA-12662, [~scottcarey] 
> [reported|https://issues.apache.org/jira/browse/CASSANDRA-12662?focusedCommentId=17070055&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17070055]
>  that the {{max_compaction_flush_memory_in_mb}} setting gets incorrectly 
> interpreted in bytes rather than megabytes as its name implies.
> {quote}
> 1.  the setting 'max_compaction_flush_memory_in_mb' is a misnomer, it is 
> actually memory in BYTES.  If you take it at face value, and set it to say, 
> '512' thinking that means 512MB,  you will produce a million temp files 
> rather quickly in a large compaction, which will exhaust even large values of 
> max_map_count rapidly, and get the OOM: Map Error issue above and possibly 
> have a very difficult situation to get a cluster back into a place where 
> nodes aren't crashing while initializing or soon after.  This issue is minor 
> if you know about it in advance and set the value IN BYTES.
> {quote}




[jira] [Comment Edited] (CASSANDRA-16071) max_compaction_flush_memory_in_mb is interpreted as bytes

2020-08-24 Thread Michael Semb Wever (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183111#comment-17183111
 ] 

Michael Semb Wever edited comment on CASSANDRA-16071 at 8/24/20, 1:34 PM:
--

Patches
 - 
[3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...thelastpickle:mck/cassandra-3.11_max_compaction_flush_memory_in_mb_fix]
 with CI 
[run|https://ci-cassandra.apache.org/blue/organizations/jenkins/Cassandra-devbranch/detail/Cassandra-devbranch/264/pipeline]
 - 
[trunk|https://github.com/apache/cassandra/compare/trunk...thelastpickle:mck/trunk_max_compaction_flush_memory_in_mb_fix]
 with CI 
[run|https://ci-cassandra.apache.org/blue/organizations/jenkins/Cassandra-devbranch/detail/Cassandra-devbranch/265/pipeline]

(will update with tests …)


was (Author: michaelsembwever):
Patches
 - 
[3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...thelastpickle:mck/cassandra-3.11_max_compaction_flush_memory_in_mb_fix]
 with CI 
[run|https://ci-cassandra.apache.org/blue/organizations/jenkins/Cassandra-devbranch/detail/Cassandra-devbranch/261/pipeline]
 - 
[trunk|https://github.com/apache/cassandra/compare/trunk...thelastpickle:mck/trunk_max_compaction_flush_memory_in_mb_fix]
 with CI 
[run|https://ci-cassandra.apache.org/blue/organizations/jenkins/Cassandra-devbranch/detail/Cassandra-devbranch/262/pipeline]

(will update with tests …)

> max_compaction_flush_memory_in_mb is interpreted as bytes
> -
>
> Key: CASSANDRA-16071
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16071
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/SASI
>Reporter: Michael Semb Wever
>Assignee: Michael Semb Wever
>Priority: Normal
> Fix For: 3.11.x, 4.0-beta
>
>
> In CASSANDRA-12662, [~scottcarey] 
> [reported|https://issues.apache.org/jira/browse/CASSANDRA-12662?focusedCommentId=17070055&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17070055]
>  that the {{max_compaction_flush_memory_in_mb}} setting gets incorrectly 
> interpreted in bytes rather than megabytes as its name implies.
> {quote}
> 1.  the setting 'max_compaction_flush_memory_in_mb' is a misnomer, it is 
> actually memory in BYTES.  If you take it at face value, and set it to say, 
> '512' thinking that means 512MB,  you will produce a million temp files 
> rather quickly in a large compaction, which will exhaust even large values of 
> max_map_count rapidly, and get the OOM: Map Error issue above and possibly 
> have a very difficult situation to get a cluster back into a place where 
> nodes aren't crashing while initializing or soon after.  This issue is minor 
> if you know about it in advance and set the value IN BYTES.
> {quote}




[jira] [Comment Edited] (CASSANDRA-16071) max_compaction_flush_memory_in_mb is interpreted as bytes

2020-08-24 Thread Michael Semb Wever (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183111#comment-17183111
 ] 

Michael Semb Wever edited comment on CASSANDRA-16071 at 8/24/20, 1:03 PM:
--

Patches
 - 
[3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...thelastpickle:mck/cassandra-3.11_max_compaction_flush_memory_in_mb_fix]
 with CI 
[run|https://ci-cassandra.apache.org/blue/organizations/jenkins/Cassandra-devbranch/detail/Cassandra-devbranch/261/pipeline]
 - 
[trunk|https://github.com/apache/cassandra/compare/trunk...thelastpickle:mck/trunk_max_compaction_flush_memory_in_mb_fix]
 with CI 
[run|https://ci-cassandra.apache.org/blue/organizations/jenkins/Cassandra-devbranch/detail/Cassandra-devbranch/262/pipeline]

(will update with tests …)


was (Author: michaelsembwever):
Patches
 - 
[3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...thelastpickle:mck/cassandra-3.11_max_compaction_flush_memory_in_mb_fix]
 
 - 
[trunk|https://github.com/apache/cassandra/compare/trunk...thelastpickle:mck/trunk_max_compaction_flush_memory_in_mb_fix]
 

(will update with tests and CI run later today…)

> max_compaction_flush_memory_in_mb is interpreted as bytes
> -
>
> Key: CASSANDRA-16071
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16071
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/SASI
>Reporter: Michael Semb Wever
>Assignee: Michael Semb Wever
>Priority: Normal
> Fix For: 3.11.x, 4.0-beta
>
>
> In CASSANDRA-12662, [~scottcarey] 
> [reported|https://issues.apache.org/jira/browse/CASSANDRA-12662?focusedCommentId=17070055&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17070055]
>  that the {{max_compaction_flush_memory_in_mb}} setting gets incorrectly 
> interpreted in bytes rather than megabytes as its name implies.
> {quote}
> 1.  the setting 'max_compaction_flush_memory_in_mb' is a misnomer, it is 
> actually memory in BYTES.  If you take it at face value, and set it to say, 
> '512' thinking that means 512MB,  you will produce a million temp files 
> rather quickly in a large compaction, which will exhaust even large values of 
> max_map_count rapidly, and get the OOM: Map Error issue above and possibly 
> have a very difficult situation to get a cluster back into a place where 
> nodes aren't crashing while initializing or soon after.  This issue is minor 
> if you know about it in advance and set the value IN BYTES.
> {quote}




[jira] [Comment Edited] (CASSANDRA-16071) max_compaction_flush_memory_in_mb is interpreted as bytes

2020-08-24 Thread Michael Semb Wever (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183253#comment-17183253
 ] 

Michael Semb Wever edited comment on CASSANDRA-16071 at 8/24/20, 1:03 PM:
--

bq. How about we rename "maxMemMb" to "maxMemBytes" and rename
"IndexMode#maxCompactionFlushMemoryInMb" to "maxCompactionFlushMemoryInBytes".

Done. And CI runs added. Working on a (super simple) unit test…


was (Author: michaelsembwever):
bq. How about we rename "maxMemMb" to "maxMemBytes" and rename
"IndexMode#maxCompactionFlushMemoryInMb" to "maxCompactionFlushMemoryInBytes".

Done. And CI runs added.

> max_compaction_flush_memory_in_mb is interpreted as bytes
> -
>
> Key: CASSANDRA-16071
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16071
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/SASI
>Reporter: Michael Semb Wever
>Assignee: Michael Semb Wever
>Priority: Normal
> Fix For: 3.11.x, 4.0-beta
>
>
> In CASSANDRA-12662, [~scottcarey] 
> [reported|https://issues.apache.org/jira/browse/CASSANDRA-12662?focusedCommentId=17070055&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17070055]
>  that the {{max_compaction_flush_memory_in_mb}} setting gets incorrectly 
> interpreted in bytes rather than megabytes as its name implies.
> {quote}
> 1.  the setting 'max_compaction_flush_memory_in_mb' is a misnomer, it is 
> actually memory in BYTES.  If you take it at face value, and set it to say, 
> '512' thinking that means 512MB,  you will produce a million temp files 
> rather quickly in a large compaction, which will exhaust even large values of 
> max_map_count rapidly, and get the OOM: Map Error issue above and possibly 
> have a very difficult situation to get a cluster back into a place where 
> nodes aren't crashing while initializing or soon after.  This issue is minor 
> if you know about it in advance and set the value IN BYTES.
> {quote}




[jira] [Commented] (CASSANDRA-16071) max_compaction_flush_memory_in_mb is interpreted as bytes

2020-08-24 Thread Michael Semb Wever (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183253#comment-17183253
 ] 

Michael Semb Wever commented on CASSANDRA-16071:


bq. How about we rename "maxMemMb" to "maxMemBytes" and rename
"IndexMode#maxCompactionFlushMemoryInMb" to "maxCompactionFlushMemoryInBytes".

Done. And CI runs added.

> max_compaction_flush_memory_in_mb is interpreted as bytes
> -
>
> Key: CASSANDRA-16071
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16071
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/SASI
>Reporter: Michael Semb Wever
>Assignee: Michael Semb Wever
>Priority: Normal
> Fix For: 3.11.x, 4.0-beta
>
>
> In CASSANDRA-12662, [~scottcarey] 
> [reported|https://issues.apache.org/jira/browse/CASSANDRA-12662?focusedCommentId=17070055&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17070055]
>  that the {{max_compaction_flush_memory_in_mb}} setting gets incorrectly 
> interpreted in bytes rather than megabytes as its name implies.
> {quote}
> 1.  the setting 'max_compaction_flush_memory_in_mb' is a misnomer, it is 
> actually memory in BYTES.  If you take it at face value, and set it to say, 
> '512' thinking that means 512MB,  you will produce a million temp files 
> rather quickly in a large compaction, which will exhaust even large values of 
> max_map_count rapidly, and get the OOM: Map Error issue above and possibly 
> have a very difficult situation to get a cluster back into a place where 
> nodes aren't crashing while initializing or soon after.  This issue is minor 
> if you know about it in advance and set the value IN BYTES.
> {quote}




[jira] [Comment Edited] (CASSANDRA-16072) Reduce thread contention in CommitLogSegment and HintsBuffer by rewriting CAS loops to atomic adds

2020-08-24 Thread Michael Semb Wever (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-16072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183127#comment-17183127
 ] 

Michael Semb Wever edited comment on CASSANDRA-16072 at 8/24/20, 12:38 PM:
---

Patches
 - 
[3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...thelastpickle:mck/cassandra-3.11_cas_improvements]
 with CI 
[run|https://ci-cassandra.apache.org/blue/organizations/jenkins/Cassandra-devbranch/detail/Cassandra-devbranch/259/pipeline]
 - 
[trunk|https://github.com/apache/cassandra/compare/trunk...thelastpickle:mck/trunk_cas_improvements]
 with CI 
[run|https://ci-cassandra.apache.org/blue/organizations/jenkins/Cassandra-devbranch/detail/Cassandra-devbranch/260/pipeline]





was (Author: michaelsembwever):
Patches
 - 
[3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...thelastpickle:mck/cassandra-3.11_cas_improvements]
 - 
[trunk|https://github.com/apache/cassandra/compare/trunk...thelastpickle:mck/trunk_cas_improvements]


(will update with CI later today)

> Reduce thread contention in CommitLogSegment and HintsBuffer by rewriting CAS 
> loops to atomic adds
> --
>
> Key: CASSANDRA-16072
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16072
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Consistency/Hints, Local/Commit Log
>Reporter: Michael Semb Wever
>Assignee: Michael Semb Wever
>Priority: Normal
> Fix For: 3.11.x, 4.0-beta
>
>
> Follow up to CASSANDRA-15922
> Both CommitLogSegment and HintsBuffer use AtomicIntegers for the current 
> offset when allocating. As in CASSANDRA\-15922, the loops on 
> {{.compareAndSet(..)}} can be replaced with atomic adds using the 
> {{.getAndAdd(..)}} method.
> In highly contended environments the CAS failure rate can be high, starving 
> writes in a running Cassandra node. On the same cluster where 
> CASSANDRA\-15922 was found, there were still problems around commit log 
> flushing and hints after the fix for CASSANDRA\-15922 was deployed. No 
> flamegraph was collected that demonstrated the thread contention as clearly 
> as in CASSANDRA\-15922, but the performance fix proposed here is hopefully 
> obvious enough.
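
The difference between the two allocation styles described above can be sketched as follows. This is an illustrative model only, not the Cassandra patch: `AtomicInt` is a hypothetical lock-based stand-in for Java's `AtomicInteger`, and `allocate_cas` / `allocate_add` are made-up names. How the real patch treats a counter that runs past the end of the buffer is not stated in this ticket text, so the handling below is an assumption.

```python
import threading


class AtomicInt:
    """Toy, lock-based stand-in for java.util.concurrent.atomic.AtomicInteger."""

    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()

    def get(self):
        with self._lock:
            return self._value

    def compare_and_set(self, expected, new):
        # Succeeds only if nobody changed the value since it was read.
        with self._lock:
            if self._value != expected:
                return False
            self._value = new
            return True

    def get_and_add(self, delta):
        # Unconditional add; returns the value before the add.
        with self._lock:
            old = self._value
            self._value += delta
            return old


def allocate_cas(offset, size, capacity):
    """CAS loop: retry until this thread's update wins; -1 if the buffer is full."""
    while True:
        prev = offset.get()
        nxt = prev + size
        if nxt > capacity:
            return -1
        if offset.compare_and_set(prev, nxt):
            return prev
        # Another thread won the race: loop and try again.


def allocate_add(offset, size, capacity):
    """Atomic add: a single update with no retry; the counter may overshoot."""
    prev = offset.get_and_add(size)
    if prev + size > capacity:
        return -1
    return prev
```

Under contention the CAS version can spin through many failed attempts per allocation, whereas the add version always completes in one step. The trade-off in this sketch is that a failed `allocate_add` still advances the counter, so callers must tolerate an offset beyond `capacity`.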




[jira] [Commented] (CASSANDRA-16070) Add dtest option to keep ccm test directories for just failed tests

2020-08-24 Thread Michael Semb Wever (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-16070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183210#comment-17183210
 ] 

Michael Semb Wever commented on CASSANDRA-16070:


In the CI run 
[here|https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch-dtest/35/label=cassandra,split=24/]
 is an example of the ccm test directory being available for failed tests.

> Add dtest option to keep ccm test directories for just failed tests
> ---
>
> Key: CASSANDRA-16070
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16070
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest
>Reporter: Michael Semb Wever
>Assignee: Michael Semb Wever
>Priority: Normal
> Fix For: 4.0-beta
>
>
> DTests already have an option {{`--keep-test-dir`}} that keeps the ccm test 
> directories. This is useful for debugging failures, especially those that 
> can't be reproduced locally.
> [Introducing|https://github.com/apache/cassandra-builds/commit/51eb85b57b62a542ca456e52a20bee06955f6ec1#diff-a885314255cf7d5c7c04889bf01aa2ab]
>  this option to ci-cassandra.a.o 
> [failed|https://github.com/apache/cassandra-builds/commit/d1600acde19cdbd906b0ff89318d3e8a3f400a70#diff-a885314255cf7d5c7c04889bf01aa2ab]
>  due to lack of disk space.
> This 
> [patch|https://github.com/apache/cassandra-dtest/compare/master...thelastpickle:mck/keep_failed_test_dir]
>  introduces a new option {{`--keep-failed-test-dir`}} that keeps the ccm test 
> directory only for dtests that fail.
> This should suffice; if disk space is still a problem, a further option of 
> {{`--keep-failed-test-log-dir`}} can be added that only keeps the logs inside 
> the ccm test directory, as most of the space in these directories is taken 
> up by the cassandra data directories.




[jira] [Comment Edited] (CASSANDRA-16071) max_compaction_flush_memory_in_mb is interpreted as bytes

2020-08-24 Thread ZhaoYang (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183177#comment-17183177
 ] 

ZhaoYang edited comment on CASSANDRA-16071 at 8/24/20, 11:57 AM:
-

{code:java}
 long maxMemMb = indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION) == null
  ? (long) (1048576 * INDEX_MAX_FLUSH_DEFAULT_MULTIPLIER) // 1G default for 
memtable
  : Long.parseLong(indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION));
{code}
[~mck] I think the default {{"1048576 * 0.15 mb" (153GB)}} may still cause OOM.

 

How about we rename {{"maxMemMb"}} to {{"maxMemBytes"}} and rename 
 {{"IndexMode#maxCompactionFlushMemoryInMb"}} to 
{{"maxCompactionFlushMemoryInBytes"}}. So it should be :
{code:java}
 long maxMemBytes = indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION) == null
  ? (long) (1073741824 * INDEX_MAX_FLUSH_DEFAULT_MULTIPLIER) // 1G default for 
memtable
  : 1048576L * Long.parseLong(indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION));
{code}
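
The unit arithmetic behind this suggestion can be checked with a short sketch. It is a Python rendering of the proposed Java expression, not the committed code; `max_mem_bytes` is a hypothetical name, and `MULTIPLIER = 0.15` is an assumption inferred from the "1048576 * 0.15" figure quoted above.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB

# Assumed value of INDEX_MAX_FLUSH_DEFAULT_MULTIPLIER (inferred, see lead-in).
MULTIPLIER = 0.15


def max_mem_bytes(option_mb=None):
    """Proposed logic: the default is a fraction of 1 GiB, in bytes;
    a configured value is given in megabytes and converted to bytes."""
    if option_mb is None:
        return int(GIB * MULTIPLIER)
    return MIB * int(option_mb)


# The existing default expression, 1048576 * multiplier, truncated like a
# Java (long) cast. Read as megabytes this is about 153 GB; read as bytes
# it is only about 153 KiB.
old_default = int(1048576 * MULTIPLIER)
old_default_as_gb = old_default / 1024
```

Under these assumptions `max_mem_bytes()` is roughly 153 MiB and `max_mem_bytes("512")` is 536870912 bytes (512 MiB), so both the default and a configured value end up in the same unit.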


was (Author: jasonstack):
{code:java}
 long maxMemMb = indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION) == null
  ? (long) (1048576 * INDEX_MAX_FLUSH_DEFAULT_MULTIPLIER) // 1G default for 
memtable
  : Long.parseLong(indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION));
{code}
[~mck] I think the default {{"1048576 * 0.15 mb" (153GB)}} may still cause OOM.

 

How about we rename {{"maxMemMb"}} to {{"maxMemBytes" and rename 
 {{"IndexMode#maxCompactionFlushMemoryInMb"}} to 
{{"maxCompactionFlushMemoryInBytes"}}. So it should be :
{code:java}
 long maxMemBytes = indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION) == null
  ? (long) (1073741824 * INDEX_MAX_FLUSH_DEFAULT_MULTIPLIER) // 1G default for 
memtable
  : 1048576L * Long.parseLong(indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION));
{code}

> max_compaction_flush_memory_in_mb is interpreted as bytes
> -
>
> Key: CASSANDRA-16071
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16071
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/SASI
>Reporter: Michael Semb Wever
>Assignee: Michael Semb Wever
>Priority: Normal
> Fix For: 3.11.x, 4.0-beta
>
>
> In CASSANDRA-12662, [~scottcarey] 
> [reported|https://issues.apache.org/jira/browse/CASSANDRA-12662?focusedCommentId=17070055&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17070055]
>  that the {{max_compaction_flush_memory_in_mb}} setting gets incorrectly 
> interpreted in bytes rather than megabytes as its name implies.
> {quote}
> 1.  the setting 'max_compaction_flush_memory_in_mb' is a misnomer, it is 
> actually memory in BYTES.  If you take it at face value, and set it to say, 
> '512' thinking that means 512MB,  you will produce a million temp files 
> rather quickly in a large compaction, which will exhaust even large values of 
> max_map_count rapidly, and get the OOM: Map Error issue above and possibly 
> have a very difficult situation to get a cluster back into a place where 
> nodes aren't crashing while initializing or soon after.  This issue is minor 
> if you know about it in advance and set the value IN BYTES.
> {quote}




[jira] [Comment Edited] (CASSANDRA-16071) max_compaction_flush_memory_in_mb is interpreted as bytes

2020-08-24 Thread ZhaoYang (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183177#comment-17183177
 ] 

ZhaoYang edited comment on CASSANDRA-16071 at 8/24/20, 11:53 AM:
-

{code:java}
 long maxMemMb = indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION) == null
  ? (long) (1048576 * INDEX_MAX_FLUSH_DEFAULT_MULTIPLIER) // 1G default for memtable
  : Long.parseLong(indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION));
{code}
[~mck] I think the default {{"1048576 * 0.15 mb" (153GB)}} may still cause OOM.

 

How about we rename {{"maxMemMb"}} to {{"maxMemBytes"}} and rename 
 {{"IndexMode#maxCompactionFlushMemoryInMb"}} to 
{{"maxCompactionFlushMemoryInBytes"}}? So it should be:
{code:java}
 long maxMemBytes = indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION) == null
  ? (long) (1073741824 * INDEX_MAX_FLUSH_DEFAULT_MULTIPLIER) // 1G default for memtable
  : 1048576L * Long.parseLong(indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION));
{code}


was (Author: jasonstack):
{code:java}
 long maxMemMb = indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION) == null
  ? (long) (1048576 * INDEX_MAX_FLUSH_DEFAULT_MULTIPLIER) // 1G default for memtable
  : Long.parseLong(indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION));
{code}
[~mck] I think the default {{"1048576 * 0.15 mb"}} may still cause OOM.

 

How about we rename {{"maxMemMb"}} to {{"maxMemBytes"}} and rename 
 {{"IndexMode#maxCompactionFlushMemoryInMb"}} to 
{{"maxCompactionFlushMemoryInBytes"}}? So it should be:
{code:java}
 long maxMemBytes = indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION) == null
  ? (long) (1073741824 * INDEX_MAX_FLUSH_DEFAULT_MULTIPLIER) // 1G default for memtable
  : 1048576L * Long.parseLong(indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION));
{code}

> max_compaction_flush_memory_in_mb is interpreted as bytes
> -
>
> Key: CASSANDRA-16071
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16071
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/SASI
>Reporter: Michael Semb Wever
>Assignee: Michael Semb Wever
>Priority: Normal
> Fix For: 3.11.x, 4.0-beta
>
>
> In CASSANDRA-12662, [~scottcarey] 
> [reported|https://issues.apache.org/jira/browse/CASSANDRA-12662?focusedCommentId=17070055&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17070055]
>  that the {{max_compaction_flush_memory_in_mb}} setting gets incorrectly 
> interpreted in bytes rather than megabytes as its name implies.
> {quote}
> 1.  the setting 'max_compaction_flush_memory_in_mb' is a misnomer, it is 
> actually memory in BYTES.  If you take it at face value, and set it to say, 
> '512' thinking that means 512MB,  you will produce a million temp files 
> rather quickly in a large compaction, which will exhaust even large values of 
> max_map_count rapidly, and get the OOM: Map Error issue above and possibly 
> have a very difficult situation to get a cluster back into a place where 
> nodes aren't crashing while initializing or soon after.  This issue is minor 
> if you know about it in advance and set the value IN BYTES.
> {quote}




[jira] [Comment Edited] (CASSANDRA-16071) max_compaction_flush_memory_in_mb is interpreted as bytes

2020-08-24 Thread ZhaoYang (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183177#comment-17183177
 ] 

ZhaoYang edited comment on CASSANDRA-16071 at 8/24/20, 11:52 AM:
-

{code:java}
 long maxMemMb = indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION) == null
  ? (long) (1048576 * INDEX_MAX_FLUSH_DEFAULT_MULTIPLIER) // 1G default for memtable
  : Long.parseLong(indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION));
{code}
[~mck] I think the default {{"1048576 * 0.15 mb"}} may still cause OOM.

 

How about we rename {{"maxMemMb"}} to {{"maxMemBytes"}} and rename 
 {{"IndexMode#maxCompactionFlushMemoryInMb"}} to 
{{"maxCompactionFlushMemoryInBytes"}}? So it should be:
{code:java}
 long maxMemBytes = indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION) == null
  ? (long) (1073741824 * INDEX_MAX_FLUSH_DEFAULT_MULTIPLIER) // 1G default for memtable
  : 1048576L * Long.parseLong(indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION));
{code}


was (Author: jasonstack):
{code:java}
 long maxMemMb = indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION) == null
  ? (long) (1048576 * INDEX_MAX_FLUSH_DEFAULT_MULTIPLIER) // 1G default for memtable
  : Long.parseLong(indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION));
{code}
[~mck] I think the default {{"1048576 * 0.15 mb"}} may still cause OOM.

 

How about we rename {{"maxMemMb"}} to {{"maxMemBytes"}} and rename 
{{"IndexMode#maxCompactionFlushMemoryInMb"}} to 
{{"maxCompactionFlushMemoryInBytes"}}? So it should be:

{code:java}
 long maxMemBytes = indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION) == null
  ? (long) (1073741824 * INDEX_MAX_FLUSH_DEFAULT_MULTIPLIER) // 1G default for memtable
  : 1048576L * Long.parseLong(indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION));
{code}

> max_compaction_flush_memory_in_mb is interpreted as bytes
> -
>
> Key: CASSANDRA-16071
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16071
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/SASI
>Reporter: Michael Semb Wever
>Assignee: Michael Semb Wever
>Priority: Normal
> Fix For: 3.11.x, 4.0-beta
>
>
> In CASSANDRA-12662, [~scottcarey] 
> [reported|https://issues.apache.org/jira/browse/CASSANDRA-12662?focusedCommentId=17070055&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17070055]
>  that the {{max_compaction_flush_memory_in_mb}} setting gets incorrectly 
> interpreted in bytes rather than megabytes as its name implies.
> {quote}
> 1.  the setting 'max_compaction_flush_memory_in_mb' is a misnomer, it is 
> actually memory in BYTES.  If you take it at face value, and set it to say, 
> '512' thinking that means 512MB,  you will produce a million temp files 
> rather quickly in a large compaction, which will exhaust even large values of 
> max_map_count rapidly, and get the OOM: Map Error issue above and possibly 
> have a very difficult situation to get a cluster back into a place where 
> nodes aren't crashing while initializing or soon after.  This issue is minor 
> if you know about it in advance and set the value IN BYTES.
> {quote}




[jira] [Comment Edited] (CASSANDRA-16071) max_compaction_flush_memory_in_mb is interpreted as bytes

2020-08-24 Thread ZhaoYang (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183177#comment-17183177
 ] 

ZhaoYang edited comment on CASSANDRA-16071 at 8/24/20, 11:50 AM:
-

{code:java}
 long maxMemMb = indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION) == null
  ? (long) (1048576 * INDEX_MAX_FLUSH_DEFAULT_MULTIPLIER) // 1G default for memtable
  : Long.parseLong(indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION));
{code}
[~mck] I think the default {{"1048576 * 0.15 mb"}} may still cause OOM.

 

How about we rename {{"maxMemMb"}} to {{"maxMemBytes"}} and rename 
{{"IndexMode#maxCompactionFlushMemoryInMb"}} to 
{{"maxCompactionFlushMemoryInBytes"}}? So it should be:

{code:java}
 long maxMemBytes = indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION) == null
  ? (long) (1073741824 * INDEX_MAX_FLUSH_DEFAULT_MULTIPLIER) // 1G default for memtable
  : 1048576L * Long.parseLong(indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION));
{code}


was (Author: jasonstack):
{code:java}
 long maxMemMb = indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION) == null
  ? (long) (1048576 * INDEX_MAX_FLUSH_DEFAULT_MULTIPLIER) // 1G default for memtable
  : Long.parseLong(indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION));
{code}
[~mck] I think the default {{"1048576 * 0.15 mb"}} may still cause OOM.

 

How about we rename {{"maxMemMb"}} to {{"maxMemBytes"}}? So it should be:
{code:java}
 long maxMemBytes = indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION) == null
  ? (long) (1073741824 * INDEX_MAX_FLUSH_DEFAULT_MULTIPLIER) // 1G default for memtable
  : 1048576L * Long.parseLong(indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION));
{code}

> max_compaction_flush_memory_in_mb is interpreted as bytes
> -
>
> Key: CASSANDRA-16071
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16071
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/SASI
>Reporter: Michael Semb Wever
>Assignee: Michael Semb Wever
>Priority: Normal
> Fix For: 3.11.x, 4.0-beta
>
>
> In CASSANDRA-12662, [~scottcarey] 
> [reported|https://issues.apache.org/jira/browse/CASSANDRA-12662?focusedCommentId=17070055&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17070055]
>  that the {{max_compaction_flush_memory_in_mb}} setting gets incorrectly 
> interpreted in bytes rather than megabytes as its name implies.
> {quote}
> 1.  the setting 'max_compaction_flush_memory_in_mb' is a misnomer, it is 
> actually memory in BYTES.  If you take it at face value, and set it to say, 
> '512' thinking that means 512MB,  you will produce a million temp files 
> rather quickly in a large compaction, which will exhaust even large values of 
> max_map_count rapidly, and get the OOM: Map Error issue above and possibly 
> have a very difficult situation to get a cluster back into a place where 
> nodes aren't crashing while initializing or soon after.  This issue is minor 
> if you know about it in advance and set the value IN BYTES.
> {quote}




[jira] [Commented] (CASSANDRA-16071) max_compaction_flush_memory_in_mb is interpreted as bytes

2020-08-24 Thread ZhaoYang (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183177#comment-17183177
 ] 

ZhaoYang commented on CASSANDRA-16071:
--

{code:java}
 long maxMemMb = indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION) == null
  ? (long) (1048576 * INDEX_MAX_FLUSH_DEFAULT_MULTIPLIER) // 1G default for memtable
  : Long.parseLong(indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION));
{code}
[~mck] I think the default {{"1048576 * 0.15 mb"}} may still cause OOM.

 

How about we rename {{"maxMemMb"}} to {{"maxMemBytes"}}? So it should be:
{code:java}
 long maxMemBytes = indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION) == null
  ? (long) (1073741824 * INDEX_MAX_FLUSH_DEFAULT_MULTIPLIER) // 1G default for memtable
  : 1048576L * Long.parseLong(indexOptions.get(INDEX_MAX_FLUSH_MEMORY_OPTION));
{code}

> max_compaction_flush_memory_in_mb is interpreted as bytes
> -
>
> Key: CASSANDRA-16071
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16071
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/SASI
>Reporter: Michael Semb Wever
>Assignee: Michael Semb Wever
>Priority: Normal
> Fix For: 3.11.x, 4.0-beta
>
>
> In CASSANDRA-12662, [~scottcarey] 
> [reported|https://issues.apache.org/jira/browse/CASSANDRA-12662?focusedCommentId=17070055&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17070055]
>  that the {{max_compaction_flush_memory_in_mb}} setting gets incorrectly 
> interpreted in bytes rather than megabytes as its name implies.
> {quote}
> 1.  the setting 'max_compaction_flush_memory_in_mb' is a misnomer, it is 
> actually memory in BYTES.  If you take it at face value, and set it to say, 
> '512' thinking that means 512MB,  you will produce a million temp files 
> rather quickly in a large compaction, which will exhaust even large values of 
> max_map_count rapidly, and get the OOM: Map Error issue above and possibly 
> have a very difficult situation to get a cluster back into a place where 
> nodes aren't crashing while initializing or soon after.  This issue is minor 
> if you know about it in advance and set the value IN BYTES.
> {quote}




[jira] [Updated] (CASSANDRA-16071) max_compaction_flush_memory_in_mb is interpreted as bytes

2020-08-24 Thread ZhaoYang (Jira)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-16071:
-
Test and Documentation Plan: CI running
 Status: Patch Available  (was: In Progress)

> max_compaction_flush_memory_in_mb is interpreted as bytes
> -
>
> Key: CASSANDRA-16071
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16071
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/SASI
>Reporter: Michael Semb Wever
>Assignee: Michael Semb Wever
>Priority: Normal
> Fix For: 3.11.x, 4.0-beta
>
>
> In CASSANDRA-12662, [~scottcarey] 
> [reported|https://issues.apache.org/jira/browse/CASSANDRA-12662?focusedCommentId=17070055&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17070055]
>  that the {{max_compaction_flush_memory_in_mb}} setting gets incorrectly 
> interpreted in bytes rather than megabytes as its name implies.
> {quote}
> 1.  the setting 'max_compaction_flush_memory_in_mb' is a misnomer, it is 
> actually memory in BYTES.  If you take it at face value, and set it to say, 
> '512' thinking that means 512MB,  you will produce a million temp files 
> rather quickly in a large compaction, which will exhaust even large values of 
> max_map_count rapidly, and get the OOM: Map Error issue above and possibly 
> have a very difficult situation to get a cluster back into a place where 
> nodes aren't crashing while initializing or soon after.  This issue is minor 
> if you know about it in advance and set the value IN BYTES.
> {quote}




[jira] [Updated] (CASSANDRA-16071) max_compaction_flush_memory_in_mb is interpreted as bytes

2020-08-24 Thread ZhaoYang (Jira)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhaoYang updated CASSANDRA-16071:
-
Reviewers: ZhaoYang

> max_compaction_flush_memory_in_mb is interpreted as bytes
> -
>
> Key: CASSANDRA-16071
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16071
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/SASI
>Reporter: Michael Semb Wever
>Assignee: Michael Semb Wever
>Priority: Normal
> Fix For: 3.11.x, 4.0-beta
>
>
> In CASSANDRA-12662, [~scottcarey] 
> [reported|https://issues.apache.org/jira/browse/CASSANDRA-12662?focusedCommentId=17070055&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17070055]
>  that the {{max_compaction_flush_memory_in_mb}} setting gets incorrectly 
> interpreted in bytes rather than megabytes as its name implies.
> {quote}
> 1.  the setting 'max_compaction_flush_memory_in_mb' is a misnomer, it is 
> actually memory in BYTES.  If you take it at face value, and set it to say, 
> '512' thinking that means 512MB,  you will produce a million temp files 
> rather quickly in a large compaction, which will exhaust even large values of 
> max_map_count rapidly, and get the OOM: Map Error issue above and possibly 
> have a very difficult situation to get a cluster back into a place where 
> nodes aren't crashing while initializing or soon after.  This issue is minor 
> if you know about it in advance and set the value IN BYTES.
> {quote}




[jira] [Commented] (CASSANDRA-16069) Loss of functionality around null clustering when dropping compact storage

2020-08-24 Thread Romain Hardouin (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-16069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183171#comment-17183171
 ] 

Romain Hardouin commented on CASSANDRA-16069:
-

IMHO Solution #3 seems the safest solution if it's clearly documented.

Dropping COMPACT STORAGE is not something users do in production without 
testing. They must ensure that apps/services work without errors.

If a service relies on this brittle "feature", it will still be able to access 
data using a slice query. There is no data unavailability, which is the most 
important thing I think.

On top of that, it's not a sneaky change with a silent error. INSERT and UPDATE 
will throw InvalidRequest, which should surface during tests.

> Loss of functionality around null clustering when dropping compact storage
> --
>
> Key: CASSANDRA-16069
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16069
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/CQL
>Reporter: Sylvain Lebresne
>Priority: Normal
>
> For backward compatibility reasons[1], it is allowed to insert rows where 
> some of the clustering columns are {{null}} for compact tables. That support 
> is a tad limited/inconsistent[2] but essentially you can do:
> {noformat}
> cqlsh:ks> CREATE TABLE t (k int, c1 int, c2 int, v int, PRIMARY KEY (k, c1, 
> c2)) WITH COMPACT STORAGE;
> cqlsh:ks> INSERT INTO t(k, c1, v) VALUES (1, 1, 1);
> cqlsh:ks> SELECT * FROM t;
>  k | c1 | c2   | v
> ---+----+------+---
>  1 |  1 | null | 1
> (1 rows)
> cqlsh:ks> UPDATE t SET v = 2 WHERE k = 1 AND c1 = 1;
> cqlsh:ks> SELECT * FROM t;
>  k | c1 | c2   | v
> ---+----+------+---
>  1 |  1 | null | 2
> (1 rows)
> {noformat}
> This is not allowed on non-compact tables however:
> {noformat}
> cqlsh:ks> CREATE TABLE t2 (k int, c1 int, c2 int, v int, PRIMARY KEY (k, c1, 
> c2));
> cqlsh:ks> INSERT INTO t2(k, c1, v) VALUES (1, 1, 1);
> InvalidRequest: Error from server: code=2200 [Invalid query] message="Some 
> clustering keys are missing: c2"
> cqlsh:ks> UPDATE t2 SET v = 2 WHERE k = 1 AND c1 = 1;
> InvalidRequest: Error from server: code=2200 [Invalid query] message="Some 
> clustering keys are missing: c2"
> {noformat}
> This means that a user with a compact table that relies on this will not be 
> able to use {{DROP COMPACT STORAGE}}, which is a problem for the 4.0 upgrade 
> story and one to which we need an answer.
>  
> 
> [1]: the underlying {{CompositeType}} used by such tables allows providing 
> only a prefix of components, so Thrift users could have used such 
> functionality. We thus had to support it in CQL, or those users wouldn't have 
> been able to upgrade to CQL easily.
> [2]: building on the example above, the value for {{c2}} is essentially 
> {{null}}, yet none of the following is currently allowed:
> {noformat}
> cqlsh:ks> INSERT INTO t(k, c1, c2, v) VALUES (1, 1, null, 1);
> InvalidRequest: Error from server: code=2200 [Invalid query] message="Invalid 
> null value in condition for column c2"
> cqlsh:ks> UPDATE t SET v = 2 WHERE k = 1 AND c1 = 1 AND c2 = null;
> InvalidRequest: Error from server: code=2200 [Invalid query] message="Invalid 
> null value in condition for column c2"
> cqlsh:ks> SELECT * FROM c WHERE k = 1 AND c1 = 1 AND c2 = null;
> InvalidRequest: Error from server: code=2200 [Invalid query] message="Invalid 
> null value in condition for column c2"
> {noformat}
> Not only is that unintuitive/inconsistent, but the {{SELECT}} restriction means 
> there is no way to select only that row. You can skip specifying {{c2}} in the 
> {{SELECT}}, but this essentially becomes a slice selection, as shown below:
> {noformat}
> cqlsh:ks> INSERT INTO ct(k, c1, c2, v) VALUES (1, 1, 1, 1);
> cqlsh:ks> SELECT * FROM ct WHERE k = 1 AND c1 = 1;
>  k | c1 | c2   | v
> ---+----+------+---
>  1 |  1 | null | 1
>  1 |  1 |    1 | 1
> (2 rows)
> {noformat}
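
The difference between a prefix (slice) selection and an exact match on a null clustering value can be modelled in plain Java. This is only an illustrative sketch with hypothetical names ({{ClusteringPrefixSketch}}, {{selectByPrefix}}, {{selectExact}}); it does not use any Cassandra API, and {{selectExact}} stands for the query the description says CQL currently rejects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Illustrative model only; none of these types exist in Cassandra.
final class ClusteringPrefixSketch
{
    /** A row with partition key k, clustering columns c1 and c2 (c2 may be null), and value v. */
    static final class Row
    {
        final int k;
        final int c1;
        final Integer c2;
        final int v;

        Row(int k, int c1, Integer c2, int v)
        {
            this.k = k;
            this.c1 = c1;
            this.c2 = c2;
            this.v = v;
        }
    }

    /** Restricts only the (k, c1) prefix, so every row sharing that prefix matches: a slice. */
    static List<Row> selectByPrefix(List<Row> table, int k, int c1)
    {
        List<Row> result = new ArrayList<>();
        for (Row row : table)
            if (row.k == k && row.c1 == c1)
                result.add(row);
        return result;
    }

    /** Hypothetical exact match that treats a null c2 as a value; CQL rejects "c2 = null". */
    static List<Row> selectExact(List<Row> table, int k, int c1, Integer c2)
    {
        List<Row> result = new ArrayList<>();
        for (Row row : table)
            if (row.k == k && row.c1 == c1 && Objects.equals(row.c2, c2))
                result.add(row);
        return result;
    }

    public static void main(String[] args)
    {
        List<Row> table = Arrays.asList(new Row(1, 1, null, 1), new Row(1, 1, 1, 1));
        System.out.println(selectByPrefix(table, 1, 1).size());    // both rows share the prefix
        System.out.println(selectExact(table, 1, 1, null).size()); // only the row with null c2
    }
}
```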




[jira] [Updated] (CASSANDRA-16062) Avoid NPE in getCompactionInfo

2020-08-24 Thread Marcus Eriksson (Jira)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-16062:

  Since Version: 4.0-alpha1
Source Control Link: 
https://github.com/apache/cassandra/commit/219eb86fd22805d419667c791af4419cd2b3d00a
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

and committed, thanks

> Avoid NPE in getCompactionInfo
> --
>
> Key: CASSANDRA-16062
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16062
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Normal
> Fix For: 4.0-beta
>
>
> We currently initialize {{sstables}} after calling {{beginCompaction(this)}} 
> in the {{CompactionIterator}} constructor, which creates a window where we can 
> get an NPE when creating the {{CompactionInfo}} for cancelling compactions.
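
The hazard described here, handing {{this}} to another object before a field has been assigned, can be reproduced in isolation. The sketch below uses hypothetical stand-in classes ({{Tracker}}, {{Holder}}), not Cassandra's, and triggers the access synchronously so the failure is deterministic; in the ticket the access would come from another code path during the window.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for the compaction tracker and iterator; not Cassandra classes.
final class ConstructorLeakSketch
{
    /** Inspects a holder as soon as it is registered, as a concurrent observer might. */
    static final class Tracker
    {
        final List<Integer> observedSizes = new ArrayList<>();

        void begin(Holder holder)
        {
            // Dereferences holder.sstables; throws NullPointerException if it is still unassigned.
            observedSizes.add(holder.sstables.size());
        }
    }

    static final class Holder
    {
        final List<String> sstables;

        Holder(List<String> scanners, Tracker tracker, boolean initializeFirst)
        {
            if (initializeFirst)
            {
                sstables = new ArrayList<>(scanners); // assign the field first
                tracker.begin(this);                  // `this` is now safe to observe
            }
            else
            {
                tracker.begin(this);                  // leaks `this` while sstables is still null
                sstables = new ArrayList<>(scanners);
            }
        }
    }

    /** Returns true if constructing a Holder in the given order throws a NullPointerException. */
    static boolean throwsNpe(boolean initializeFirst)
    {
        try
        {
            new Holder(Arrays.asList("a", "b"), new Tracker(), initializeFirst);
            return false;
        }
        catch (NullPointerException e)
        {
            return true;
        }
    }

    public static void main(String[] args)
    {
        System.out.println("register before init, NPE: " + throwsNpe(false));
        System.out.println("init before register, NPE: " + throwsNpe(true));
    }
}
```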




[cassandra] branch trunk updated: Initialize sstables earlier to avoid NPE in CompactionIterator

2020-08-24 Thread marcuse

This is an automated email from the ASF dual-hosted git repository.

marcuse pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git


The following commit(s) were added to refs/heads/trunk by this push:
 new 219eb86  Initialize sstables earlier to avoid NPE in CompactionIterator
219eb86 is described below

commit 219eb86fd22805d419667c791af4419cd2b3d00a
Author: Marcus Eriksson 
AuthorDate: Thu Aug 20 08:51:29 2020 +0200

Initialize sstables earlier to avoid NPE in CompactionIterator

Patch by marcuse; reviewed by Brandon Williams and Jon Meredith for 
CASSANDRA-16062
---
 src/java/org/apache/cassandra/db/compaction/CompactionIterator.java | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/java/org/apache/cassandra/db/compaction/CompactionIterator.java b/src/java/org/apache/cassandra/db/compaction/CompactionIterator.java
index 78bdfb0..ec6a4d4 100644
--- a/src/java/org/apache/cassandra/db/compaction/CompactionIterator.java
+++ b/src/java/org/apache/cassandra/db/compaction/CompactionIterator.java
@@ -99,6 +99,9 @@ public class CompactionIterator extends CompactionInfo.Holder implements Unfilte
 bytes += scanner.getLengthInBytes();
 this.totalBytes = bytes;
 this.mergeCounters = new long[scanners.size()];
+// note that we leak `this` from the constructor when calling beginCompaction below, this means we have to get the sstables before
+// calling that to avoid a NPE.
+sstables = scanners.stream().map(ISSTableScanner::getBackingSSTables).flatMap(Collection::stream).collect(ImmutableSet.toImmutableSet());
 this.activeCompactions = activeCompactions == null ? ActiveCompactionsTracker.NOOP : activeCompactions;
 this.activeCompactions.beginCompaction(this); // note that CompactionTask also calls this, but CT only creates CompactionIterator with a NOOP ActiveCompactions
 
@@ -109,7 +112,6 @@ public class CompactionIterator extends CompactionInfo.Holder implements Unfilte
 merged = Transformation.apply(merged, new Purger(controller, nowInSec));
 merged = DuplicateRowChecker.duringCompaction(merged, type);
 compacted = Transformation.apply(merged, new AbortableUnfilteredPartitionTransformation(this));
-sstables = scanners.stream().map(ISSTableScanner::getBackingSSTables).flatMap(Collection::stream).collect(ImmutableSet.toImmutableSet());
 }
 
 public TableMetadata metadata()



[jira] [Updated] (CASSANDRA-16062) Avoid NPE in getCompactionInfo

2020-08-24 Thread Marcus Eriksson (Jira)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-16062:

Reviewers: Brandon Williams, Jon Meredith  (was: Brandon Williams)

> Avoid NPE in getCompactionInfo
> --
>
> Key: CASSANDRA-16062
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16062
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Normal
> Fix For: 4.0-beta
>
>
> We currently initialize {{sstables}} after calling {{beginCompaction(this)}} 
> in the {{CompactionIterator}} constructor, which creates a window where we can 
> get an NPE when creating the {{CompactionInfo}} for cancelling compactions.




[jira] [Assigned] (CASSANDRA-15582) 4.0 quality testing: metrics

2020-08-24 Thread Stephen Mallette (Jira)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Mallette reassigned CASSANDRA-15582:


Assignee: (was: Stephen Mallette)

As I've mentioned on a few other tickets, I don't believe I will have time to 
take these metrics issues along any further. I've removed myself as the 
"Assignee".

> 4.0 quality testing: metrics
> 
>
> Key: CASSANDRA-15582
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15582
> Project: Cassandra
>  Issue Type: Task
>  Components: Test/dtest
>Reporter: Josh McKenzie
>Priority: Normal
> Fix For: 4.0-beta
>
> Attachments: Screen Shot 2020-04-07 at 5.47.17 PM.png
>
>
> In past releases we've unknowingly broken metrics integrations and introduced 
> performance regressions in metrics collection and reporting. We strive in 4.0 
> to not do that. Metrics should work well!




[jira] [Assigned] (CASSANDRA-15909) Make Table/Keyspace Metric Names Consistent With Each Other

2020-08-24 Thread Stephen Mallette (Jira)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Mallette reassigned CASSANDRA-15909:


Assignee: (was: Stephen Mallette)

I don't think I'm going to have time to make any further changes on 
this. Sorry to leave it a bit unfinished, but perhaps it won't be hard for 
someone else to finish things up. I've removed myself as the "Assignee". 

> Make Table/Keyspace Metric Names Consistent With Each Other
> ---
>
> Key: CASSANDRA-15909
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15909
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability/Metrics
>Reporter: Stephen Mallette
>Priority: Normal
> Fix For: 4.0-beta
>
>
> As part of CASSANDRA-15821 it became apparent that certain metric names found 
> in keyspace and tables had different names but were in fact the same metric - 
> they are as follows:
> * Table.SyncTime == Keyspace.RepairSyncTime
> * Table.RepairedDataTrackingOverreadRows == Keyspace.RepairedOverreadRows
> * Table.RepairedDataTrackingOverreadTime == Keyspace.RepairedOverreadTime
> * Table.AllMemtablesHeapSize == Keyspace.AllMemtablesOnHeapDataSize
> * Table.AllMemtablesOffHeapSize == Keyspace.AllMemtablesOffHeapDataSize
> * Table.MemtableOnHeapSize == Keyspace.MemtableOnHeapDataSize
> * Table.MemtableOffHeapSize == Keyspace.MemtableOffHeapDataSize
> Also, client metrics are the only metrics to start with a lower case letter. 
> Change those to upper case to match all the other metrics.
> Unifying this naming would help make metrics more consistent as part of 
> CASSANDRA-15582
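
The equivalences listed in this description can be captured as a simple lookup table. The sketch below is only an illustration with hypothetical names ({{MetricNameAliases}}, {{keyspaceName}}); the metric names come from the list above, and nothing here reflects how Cassandra registers metrics internally.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative lookup only; class and method names are hypothetical.
final class MetricNameAliases
{
    /** Maps each Table metric name from the ticket to its differently named Keyspace counterpart. */
    static Map<String, String> tableToKeyspace()
    {
        Map<String, String> aliases = new LinkedHashMap<>();
        aliases.put("SyncTime", "RepairSyncTime");
        aliases.put("RepairedDataTrackingOverreadRows", "RepairedOverreadRows");
        aliases.put("RepairedDataTrackingOverreadTime", "RepairedOverreadTime");
        aliases.put("AllMemtablesHeapSize", "AllMemtablesOnHeapDataSize");
        aliases.put("AllMemtablesOffHeapSize", "AllMemtablesOffHeapDataSize");
        aliases.put("MemtableOnHeapSize", "MemtableOnHeapDataSize");
        aliases.put("MemtableOffHeapSize", "MemtableOffHeapDataSize");
        return aliases;
    }

    /** Returns the Keyspace name for a Table metric, or the same name when no alias is listed. */
    static String keyspaceName(String tableMetric)
    {
        return tableToKeyspace().getOrDefault(tableMetric, tableMetric);
    }

    public static void main(String[] args)
    {
        System.out.println(keyspaceName("SyncTime"));    // aliased name
        System.out.println(keyspaceName("ReadLatency")); // not in the list, returned unchanged
    }
}
```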




[jira] [Assigned] (CASSANDRA-15821) Metrics Documentation Enhancements

2020-08-24 Thread Stephen Mallette (Jira)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Mallette reassigned CASSANDRA-15821:


Assignee: (was: Stephen Mallette)

Yes - that makes sense to me...that way all the documentation gets dealt with 
in one place. Unfortunately, I think I'm going to have to back away from 
putting the final touches on this one at this point. I will remove myself as 
the "Assignee". 

> Metrics Documentation Enhancements
> --
>
> Key: CASSANDRA-15821
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15821
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Documentation/Website
>Reporter: Stephen Mallette
>Priority: Normal
> Fix For: 4.0-beta
>
>
> CASSANDRA-15582 involves quality around metrics and it was mentioned that 
> reviewing and [improving 
> documentation|https://github.com/apache/cassandra/blob/trunk/doc/source/operating/metrics.rst]
>  around metrics would fall into that scope. Please consider some of this 
> analysis in determining what improvements to make here:
> Please see [this 
> spreadsheet|https://docs.google.com/spreadsheets/d/1iPWfCMIG75CI6LbYuDtCTjEOvZw-5dyH-e08bc63QnI/edit?usp=sharing]
>  that itemizes almost all of Cassandra's metrics and whether they are 
> documented or not (and other notes).  That spreadsheet is "almost all" 
> because there are some metrics that don't seem to initialize as part of 
> Cassandra startup (I was able to trigger some to initialize, but not all were 
> immediately obvious). The missing metrics seem to be related to the following:
> * ThreadPool metrics - only some initialize at startup; the list of those 
> follows below
> * Streaming Metrics
> * HintedHandoff Metrics
> * HintsService Metrics
> Here are the ThreadPool scopes that get listed:
> {code}
> AntiEntropyStage
> CacheCleanupExecutor
> CompactionExecutor
> GossipStage
> HintsDispatcher
> MemtableFlushWriter
> MemtablePostFlush
> MemtableReclaimMemory
> MigrationStage
> MutationStage
> Native-Transport-Requests
> PendingRangeCalculator
> PerDiskMemtableFlushWriter_0
> ReadStage
> Repair-Task
> RequestResponseStage
> Sampler
> SecondaryIndexManagement
> ValidationExecutor
> ViewBuildExecutor
> {code}
> I noticed that Keyspace Metrics have this note: "Most of these metrics are 
> the same as the Table Metrics above, only they are aggregated at the Keyspace 
> level." I think I've isolated the Table metrics that are not available at the 
> Keyspace level; specifically:
> {code}
> BloomFilterFalsePositives
> BloomFilterFalseRatio
> BytesAnticompacted
> BytesFlushed
> BytesMutatedAnticompaction
> BytesPendingRepair
> BytesRepaired
> BytesUnrepaired
> CompactionBytesWritten
> CompressionRatio
> CoordinatorReadLatency
> CoordinatorScanLatency
> CoordinatorWriteLatency
> EstimatedColumnCountHistogram
> EstimatedPartitionCount
> EstimatedPartitionSizeHistogram
> KeyCacheHitRate
> LiveSSTableCount
> MaxPartitionSize
> MeanPartitionSize
> MinPartitionSize
> MutatedAnticompactionGauge
> PercentRepaired
> RowCacheHitOutOfRange
> RowCacheHit
> RowCacheMiss
> SpeculativeSampleLatencyNanos
> SyncTime
> WaitingOnFreeMemtableSpace
> DroppedMutations
> {code}
> Someone with greater knowledge of this area might consider it worth the 
> effort to see if any of these metrics should be aggregated to the keyspace 
> level in case they were inadvertently missed. In any case, perhaps the 
> documentation could easily now reflect which metric names could be expected 
> on Keyspace.
> The DroppedMessage metrics have a much larger set of scopes than those 
> currently documented:
> {code}
> ASYMMETRIC_SYNC_REQ
> BATCH_REMOVE_REQ
> BATCH_REMOVE_RSP
> BATCH_STORE_REQ
> BATCH_STORE_RSP
> CLEANUP_MSG
> COUNTER_MUTATION_REQ
> COUNTER_MUTATION_RSP
> ECHO_REQ
> ECHO_RSP
> FAILED_SESSION_MSG
> FAILURE_RSP
> FINALIZE_COMMIT_MSG
> FINALIZE_PROMISE_MSG
> FINALIZE_PROPOSE_MSG
> GOSSIP_DIGEST_ACK
> GOSSIP_DIGEST_ACK2
> GOSSIP_DIGEST_SYN
> GOSSIP_SHUTDOWN
> HINT_REQ
> HINT_RSP
> INTERNAL_RSP
> MUTATION_REQ
> MUTATION_RSP
> PAXOS_COMMIT_REQ
> PAXOS_COMMIT_RSP
> PAXOS_PREPARE_REQ
> PAXOS_PREPARE_RSP
> PAXOS_PROPOSE_REQ
> PAXOS_PROPOSE_RSP
> PING_REQ
> PING_RSP
> PREPARE_CONSISTENT_REQ
> PREPARE_CONSISTENT_RSP
> PREPARE_MSG
> RANGE_REQ
> RANGE_RSP
> READ_REPAIR_REQ
> READ_REPAIR_RSP
> READ_REQ
> READ_RSP
> REPAIR_RSP
> REPLICATION_DONE_REQ
> REPLICATION_DONE_RSP
> REQUEST_RSP
> SCHEMA_PULL_REQ
> SCHEMA_PULL_RSP
> SCHEMA_PUSH_REQ
> SCHEMA_PUSH_RSP
> SCHEMA_VERSION_REQ
> SCHEMA_VERSION_RSP
> SNAPSHOT_MSG
> SNAPSHOT_REQ
> SNAPSHOT_RSP
> STATUS_REQ
> STATUS_RSP
> SYNC_REQ
> SYNC_RSP
> TRUNCATE_REQ
> TRUNCATE_RSP
> VALIDATION_REQ
> VALIDATION_RSP
> _SAMPLE
> _TEST_1
> _TEST_2
> _TRACE
> {code}
> I suppose I may yet be missing some metrics, as my knowledge of what's 
> available is limited.
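
A scope listing like the ThreadPool one in the description above could be reproduced by querying a node's MBeans over JMX and collecting the distinct scope keys. The following is only a sketch: it assumes the metric MBeans live in the org.apache.cassandra.metrics domain with "type" and "scope" key properties, as the metrics documentation describes, and the names MetricScopes, scopes and threadPoolScopes are made up for illustration.

```java
import java.io.IOException;
import java.util.Set;
import java.util.TreeSet;
import javax.management.MBeanServerConnection;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

// Hypothetical helper for illustration; not part of Cassandra.
final class MetricScopes {
    private MetricScopes() {}

    /** Collects the distinct, sorted "scope" key properties of the given MBean names. */
    static Set<String> scopes(Set<ObjectName> names) {
        Set<String> result = new TreeSet<>();
        for (ObjectName name : names) {
            String scope = name.getKeyProperty("scope");
            if (scope != null) { // names without a scope key are skipped
                result.add(scope);
            }
        }
        return result;
    }

    /** Lists thread pool scopes, assuming the ObjectName pattern below matches the node's metrics. */
    static Set<String> threadPoolScopes(MBeanServerConnection connection)
            throws IOException, MalformedObjectNameException {
        ObjectName pattern =
            new ObjectName("org.apache.cassandra.metrics:type=ThreadPools,*");
        return scopes(connection.queryNames(pattern, null));
    }
}
```

Run against a JVM that hosts no Cassandra metrics, threadPoolScopes simply returns an empty set. Metrics that have not been initialized yet (the gap described above) will not appear in the query result either.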

[jira] [Updated] (CASSANDRA-14801) calculatePendingRanges no longer safe for multiple adjacent range movements

2020-08-24 Thread Sam Tunnicliffe (Jira)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe updated CASSANDRA-14801:

Reviewers: Blake Eggleston, Sam Tunnicliffe, Sam Tunnicliffe  (was: Blake 
Eggleston, Sam Tunnicliffe)
   Blake Eggleston, Sam Tunnicliffe, Sam Tunnicliffe  (was: Blake 
Eggleston)
   Status: Review In Progress  (was: Patch Available)

> calculatePendingRanges no longer safe for multiple adjacent range movements
> ---
>
> Key: CASSANDRA-14801
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14801
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Coordination, Legacy/Distributed Metadata
>Reporter: Benedict Elliott Smith
>Assignee: Aleksandr Sorokoumov
>Priority: Normal
>  Labels: pull-request-available
> Fix For: 4.0, 4.0-beta
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Correctness depended upon the narrowing to a {{Set}}, 
> which we no longer do - we maintain a collection of all {{Replica}}.  Our 
> {{RangesAtEndpoint}} collection built by {{getPendingRanges}} can as a result 
> contain the same endpoint multiple times; and our {{EndpointsForToken}} 
> obtained by {{TokenMetadata.pendingEndpointsFor}} may fail to be constructed, 
> resulting in cluster-wide failures for writes to the affected token ranges 
> for the duration of the range movement.
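
The failure mode described above can be illustrated with a small sketch. It assumes, purely for illustration, a collection that rejects duplicate endpoints when it is constructed (the role the description gives to EndpointsForToken) and models endpoints as plain strings. PendingEndpoints, strict and narrowed are hypothetical names, not Cassandra APIs.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of the problem; not Cassandra's replica collections.
final class PendingEndpoints {
    private PendingEndpoints() {}

    /** Builds a list that must not contain the same endpoint twice; throws if it does. */
    static List<String> strict(List<String> endpoints) {
        Set<String> seen = new HashSet<>();
        for (String endpoint : endpoints) {
            if (!seen.add(endpoint)) {
                throw new IllegalArgumentException("Duplicate endpoint: " + endpoint);
            }
        }
        return new ArrayList<>(endpoints);
    }

    /** Narrows to a set first, so adjacent range movements naming the same endpoint collapse. */
    static Set<String> narrowed(List<String> endpoints) {
        return new LinkedHashSet<>(endpoints);
    }
}
```

With two adjacent range movements that both name the same endpoint, strict throws while narrowed yields each endpoint once, which is the property the description says correctness used to depend on.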




[jira] [Commented] (CASSANDRA-16072) Reduce thread contention in CommitLogSegment and HintsBuffer by rewriting CAS loops to atomic adds

2020-08-24 Thread Michael Semb Wever (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-16072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183144#comment-17183144
 ] 

Michael Semb Wever commented on CASSANDRA-16072:


[~benedict], given you reviewed 15922, might you be interested in reviewing 
this?

> Reduce thread contention in CommitLogSegment and HintsBuffer by rewriting CAS 
> loops to atomic adds
> --
>
> Key: CASSANDRA-16072
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16072
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Consistency/Hints, Local/Commit Log
>Reporter: Michael Semb Wever
>Assignee: Michael Semb Wever
>Priority: Normal
> Fix For: 3.11.x, 4.0-beta
>
>
> Follow up to CASSANDRA-15922
> Both CommitLogSegment and HintsBuffer use AtomicIntegers for the current 
> offset when allocating. As in CASSANDRA-15922, the loops on 
> {{.compareAndSet(..)}} can be replaced with atomic adds using the 
> {{.getAndAdd(..)}} method.
> In highly contended environments the CAS failure rate can be high, starving 
> writes in a running Cassandra node. On the same cluster where CASSANDRA-15922 
> was found, there were still problems around commit log flushing and hints 
> after the CASSANDRA-15922 fix was deployed. No flamegraph was collected that 
> demonstrated the thread contention as clearly as in CASSANDRA-15922, but the 
> performance fix proposed here is hopefully obvious enough.
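
The change the description proposes can be sketched as follows. This is a minimal illustration, not the code in the linked patches: OffsetAllocator, allocateWithCas and allocateWithAdd are hypothetical names, request sizes are assumed to be positive, and -1 is an assumed "does not fit" signal.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical allocator for illustration; not CommitLogSegment or HintsBuffer.
final class OffsetAllocator {
    private final AtomicInteger offset = new AtomicInteger(0);
    private final int capacity;

    OffsetAllocator(int capacity) {
        this.capacity = capacity;
    }

    /** CAS loop: retries until this thread wins the race or the request no longer fits. */
    int allocateWithCas(int size) {
        while (true) {
            int prev = offset.get();
            long end = (long) prev + size;
            if (end > capacity) {
                return -1; // does not fit; the offset is left untouched
            }
            if (offset.compareAndSet(prev, (int) end)) {
                return prev; // start position of the allocated region
            }
            // Another thread moved the offset first: loop and try again.
        }
    }

    /** Atomic add: one unconditional update, no retries, but the offset may overshoot. */
    int allocateWithAdd(int size) {
        int prev = offset.getAndAdd(size);
        long end = (long) prev + size;
        if (prev < 0 || end > capacity) {
            return -1; // overshot the capacity; the buffer is treated as full from now on
        }
        return prev;
    }
}
```

The trade-off is that a failed getAndAdd still advances the offset, so a request that does not fit effectively closes the buffer to later, smaller requests, whereas the CAS loop leaves the offset untouched on failure. A production version would also need to keep the counter from wrapping after many failed requests.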




[jira] [Commented] (CASSANDRA-15899) Dropping a column can break queries until the schema is fully propagated

2020-08-24 Thread Alex Petrov (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-15899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183133#comment-17183133
 ] 

Alex Petrov commented on CASSANDRA-15899:
-

A minor comment: should we consider implementing {{isPlaceholder}} on the 
{{Placeholder}} class to avoid {{instanceof}} checks? Also, it might be useful to 
add tests for {{*}}, not only {{id, v1}}, in {{SchemaTest.java}}.
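
As a sketch of the first suggestion: a type check can move from call sites into a method that the placeholder type overrides. The class names below (ColumnRef, RegularColumn, PlaceholderChecks) are hypothetical stand-ins, and the real hierarchy around Placeholder in the patch may differ.

```java
// Hypothetical types for illustration; the real hierarchy in the patch may differ.
abstract class ColumnRef {
    /** Default answer for every ordinary column. */
    boolean isPlaceholder() {
        return false;
    }
}

final class RegularColumn extends ColumnRef {
}

final class Placeholder extends ColumnRef {
    @Override
    boolean isPlaceholder() {
        return true;
    }
}

final class PlaceholderChecks {
    private PlaceholderChecks() {}

    /** The style the comment wants to avoid: callers test the concrete type. */
    static boolean viaInstanceof(ColumnRef column) {
        return column instanceof Placeholder;
    }

    /** The suggested style: callers ask the object itself. */
    static boolean viaMethod(ColumnRef column) {
        return column.isPlaceholder();
    }
}
```

Both checks give the same answer; the method form keeps call sites independent of the concrete class.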

> Dropping a column can break queries until the schema is fully propagated
> 
>
> Key: CASSANDRA-15899
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15899
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Schema
>Reporter: Marcus Eriksson
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 3.0.x
>
>
> With a table like:
> {code}
> CREATE TABLE ks.tbl (id int primary key, v1 int, v2 int, v3 int)
> {code}
> and we drop {{v2}}, we get this exception on the replicas which haven't seen 
> the schema change:
> {code}
> ERROR [SharedPool-Worker-1] node2 2020-06-24 09:49:08,107 
> AbstractLocalAwareExecutorService.java:169 - Uncaught exception on thread 
> Thread[SharedPool-Worker-1,5,node2]
> java.lang.IllegalStateException: [ColumnDefinition{name=v1, 
> type=org.apache.cassandra.db.marshal.Int32Type, kind=REGULAR, position=-1}, 
> ColumnDefinition{name=v2, type=org.apache.cassandra.db.marshal.Int32Type, 
> kind=REGULAR, position=-1}, ColumnDefinition{name=v3, 
> type=org.apache.cassandra.db.marshal.Int32Type, kind=REGULAR, position=-1}] 
> is not a subset of [v1 v3]
>   at 
> org.apache.cassandra.db.Columns$Serializer.encodeBitmap(Columns.java:546) 
> ~[main/:na]
>   at 
> org.apache.cassandra.db.Columns$Serializer.serializeSubset(Columns.java:478) 
> ~[main/:na]
>   at 
> org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:184)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:114)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:102)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:132)
>  ~[main/:na]
>   at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:87)
>  ~[main/:na]
> ...
> {code}
> Note that it doesn't matter if we {{SELECT *}} or {{SELECT id, v1}}
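
The "is not a subset of" error in the trace above can be pictured with a simplified model of subset encoding: each column of a row is marked as a bit relative to a known superset, and encoding fails if a column is missing from that superset. This is an assumption-laden sketch, not Cassandra's actual encoding in Columns.Serializer; SubsetBitmap and encode are hypothetical names, columns are plain strings, and the superset is assumed to hold at most 64 columns.

```java
import java.util.List;

// Simplified, hypothetical model; not the real Columns.Serializer logic.
final class SubsetBitmap {
    private SubsetBitmap() {}

    /** Sets bit i for every subset column found at index i of the superset. */
    static long encode(List<String> subset, List<String> superset) {
        long bitmap = 0L;
        for (String column : subset) {
            int index = superset.indexOf(column);
            if (index < 0) {
                // The sender knows a column that the header's column set does not contain.
                throw new IllegalStateException(subset + " is not a subset of " + superset);
            }
            bitmap |= 1L << index;
        }
        return bitmap;
    }
}
```

In this model, encoding [v1, v3] against [v1, v2, v3] succeeds, while encoding [v1, v2, v3] against [v1, v3] throws, which mirrors the situation where one side still has v2 in its schema and the other has already dropped it.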




[jira] [Comment Edited] (CASSANDRA-16071) max_compaction_flush_memory_in_mb is interpreted as bytes

2020-08-24 Thread Michael Semb Wever (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183111#comment-17183111
 ] 

Michael Semb Wever edited comment on CASSANDRA-16071 at 8/24/20, 10:05 AM:
---

Patches
 - 
[3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...thelastpickle:mck/cassandra-3.11_max_compaction_flush_memory_in_mb_fix]
 
 - 
[trunk|https://github.com/apache/cassandra/compare/trunk...thelastpickle:mck/trunk_max_compaction_flush_memory_in_mb_fix]
 

(will update with tests and CI run later today…)


was (Author: michaelsembwever):
Patches
 - 
[3.11|https://github.com/apache/cassandra/compare/trunk...thelastpickle:mck/cassandra-3.11_max_compaction_flush_memory_in_mb_fix]
 
 - 
[trunk|https://github.com/apache/cassandra/compare/trunk...thelastpickle:mck/trunk_max_compaction_flush_memory_in_mb_fix]
 

(will update with tests and CI run later today…)

> max_compaction_flush_memory_in_mb is interpreted as bytes
> -
>
> Key: CASSANDRA-16071
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16071
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/SASI
>Reporter: Michael Semb Wever
>Assignee: Michael Semb Wever
>Priority: Normal
> Fix For: 3.11.x, 4.0-beta
>
>
> In CASSANDRA-12662, [~scottcarey] 
> [reported|https://issues.apache.org/jira/browse/CASSANDRA-12662?focusedCommentId=17070055&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17070055]
>  that the {{max_compaction_flush_memory_in_mb}} setting gets incorrectly 
> interpreted in bytes rather than megabytes as its name implies.
> {quote}
> 1.  the setting 'max_compaction_flush_memory_in_mb' is a misnomer, it is 
> actually memory in BYTES.  If you take it at face value, and set it to say, 
> '512' thinking that means 512MB,  you will produce a million temp files 
> rather quickly in a large compaction, which will exhaust even large values of 
> max_map_count rapidly, and get the OOM: Map Error issue above and possibly 
> have a very difficult situation to get a cluster back into a place where 
> nodes aren't crashing while initializing or soon after.  This issue is minor 
> if you know about it in advance and set the value IN BYTES.
> {quote}
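
To see why the unit matters, here is a rough arithmetic sketch. It assumes, for illustration only, that one temporary file is produced each time the flush threshold is reached; FlushThreshold, megabytesToBytes and segmentsFor are hypothetical names and do not reflect the actual SASI code or the linked patches.

```java
// Hypothetical arithmetic sketch; not the actual SASI flush logic.
final class FlushThreshold {
    static final long BYTES_PER_MB = 1024L * 1024L;

    private FlushThreshold() {}

    /** Converts a value configured in megabytes to bytes, failing loudly on overflow. */
    static long megabytesToBytes(long megabytes) {
        return Math.multiplyExact(megabytes, BYTES_PER_MB);
    }

    /** Number of flushes needed for totalBytes if one flush happens per thresholdBytes. */
    static long segmentsFor(long totalBytes, long thresholdBytes) {
        if (thresholdBytes <= 0) {
            throw new IllegalArgumentException("threshold must be positive");
        }
        return (totalBytes + thresholdBytes - 1) / thresholdBytes; // ceiling division
    }
}
```

For 1 GiB of index data and a configured value of 512, reading the value as bytes gives segmentsFor(1L << 30, 512) = 2,097,152 flushes, while converting it first gives segmentsFor(1L << 30, megabytesToBytes(512)) = 2. That is consistent with the "million temp files" in the quoted report.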




[jira] [Commented] (CASSANDRA-16072) Reduce thread contention in CommitLogSegment and HintsBuffer by rewriting CAS loops to atomic adds

2020-08-24 Thread Michael Semb Wever (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-16072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183127#comment-17183127
 ] 

Michael Semb Wever commented on CASSANDRA-16072:


Patches
 - 
[3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...thelastpickle:mck/cassandra-3.11_cas_improvements]
 - 
[trunk|https://github.com/apache/cassandra/compare/trunk...thelastpickle:mck/trunk_cas_improvements


(will update with CI later today)

> Reduce thread contention in CommitLogSegment and HintsBuffer by rewriting CAS 
> loops to atomic adds
> --
>
> Key: CASSANDRA-16072
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16072
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Consistency/Hints, Local/Commit Log
>Reporter: Michael Semb Wever
>Assignee: Michael Semb Wever
>Priority: Normal
> Fix For: 3.11.x, 4.0-beta
>
>
> Follow up to CASSANDRA-15922
> Both CommitLogSegment and HintsBuffer use AtomicIntegers for the current 
> offset when allocating. As in CASSANDRA-15922, the loops on 
> {{.compareAndSet(..)}} can be replaced with atomic adds using the 
> {{.getAndAdd(..)}} method.
> In highly contended environments the CAS failure rate can be high, starving 
> writes in a running Cassandra node. On the same cluster where CASSANDRA-15922 
> was found, there were still problems around commit log flushing and hints 
> after the CASSANDRA-15922 fix was deployed. No flamegraph was collected that 
> demonstrated the thread contention as clearly as in CASSANDRA-15922, but the 
> performance fix proposed here is hopefully obvious enough.




[jira] [Updated] (CASSANDRA-16072) Reduce thread contention in CommitLogSegment and HintsBuffer by rewriting CAS loops to atomic adds

2020-08-24 Thread Michael Semb Wever (Jira)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Semb Wever updated CASSANDRA-16072:
---
Test and Documentation Plan: ci-cassandra.a.o
 Status: Patch Available  (was: Open)

> Reduce thread contention in CommitLogSegment and HintsBuffer by rewriting CAS 
> loops to atomic adds
> --
>
> Key: CASSANDRA-16072
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16072
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Consistency/Hints, Local/Commit Log
>Reporter: Michael Semb Wever
>Assignee: Michael Semb Wever
>Priority: Normal
> Fix For: 3.11.x, 4.0-beta
>
>
> Follow up to CASSANDRA-15922
> Both CommitLogSegment and HintsBuffer use AtomicIntegers for the current 
> offset when allocating. As in CASSANDRA-15922, the loops on 
> {{.compareAndSet(..)}} can be replaced with atomic adds using the 
> {{.getAndAdd(..)}} method.
> In highly contended environments the CAS failure rate can be high, starving 
> writes in a running Cassandra node. On the same cluster where CASSANDRA-15922 
> was found, there were still problems around commit log flushing and hints 
> after the CASSANDRA-15922 fix was deployed. No flamegraph was collected that 
> demonstrated the thread contention as clearly as in CASSANDRA-15922, but the 
> performance fix proposed here is hopefully obvious enough.




[jira] [Comment Edited] (CASSANDRA-16072) Reduce thread contention in CommitLogSegment and HintsBuffer by rewriting CAS loops to atomic adds

2020-08-24 Thread Michael Semb Wever (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-16072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183127#comment-17183127
 ] 

Michael Semb Wever edited comment on CASSANDRA-16072 at 8/24/20, 10:02 AM:
---

Patches
 - 
[3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...thelastpickle:mck/cassandra-3.11_cas_improvements]
 - 
[trunk|https://github.com/apache/cassandra/compare/trunk...thelastpickle:mck/trunk_cas_improvements]


(will update with CI later today)


was (Author: michaelsembwever):
Patches
 - 
[3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...thelastpickle:mck/cassandra-3.11_cas_improvements]
 - 
[trunk|https://github.com/apache/cassandra/compare/trunk...thelastpickle:mck/trunk_cas_improvements


(will update with CI later today)

> Reduce thread contention in CommitLogSegment and HintsBuffer by rewriting CAS 
> loops to atomic adds
> --
>
> Key: CASSANDRA-16072
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16072
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Consistency/Hints, Local/Commit Log
>Reporter: Michael Semb Wever
>Assignee: Michael Semb Wever
>Priority: Normal
> Fix For: 3.11.x, 4.0-beta
>
>
> Follow up to CASSANDRA-15922
> Both CommitLogSegment and HintsBuffer use AtomicIntegers for the current 
> offset when allocating. As in CASSANDRA-15922, the loops on 
> {{.compareAndSet(..)}} can be replaced with atomic adds using the 
> {{.getAndAdd(..)}} method.
> In highly contended environments the CAS failure rate can be high, starving 
> writes in a running Cassandra node. On the same cluster where CASSANDRA-15922 
> was found, there were still problems around commit log flushing and hints 
> after the CASSANDRA-15922 fix was deployed. No flamegraph was collected that 
> demonstrated the thread contention as clearly as in CASSANDRA-15922, but the 
> performance fix proposed here is hopefully obvious enough.




[jira] [Updated] (CASSANDRA-16072) Reduce thread contention in CommitLogSegment and HintsBuffer by rewriting CAS loops to atomic adds

2020-08-24 Thread Michael Semb Wever (Jira)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Semb Wever updated CASSANDRA-16072:
---
Change Category: Performance
 Complexity: Low Hanging Fruit
  Fix Version/s: 4.0-beta
 3.11.x
 Status: Open  (was: Triage Needed)

> Reduce thread contention in CommitLogSegment and HintsBuffer by rewriting CAS 
> loops to atomic adds
> --
>
> Key: CASSANDRA-16072
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16072
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Consistency/Hints, Local/Commit Log
>Reporter: Michael Semb Wever
>Assignee: Michael Semb Wever
>Priority: Normal
> Fix For: 3.11.x, 4.0-beta
>
>
> Follow up to CASSANDRA-15922
> Both CommitLogSegment and HintsBuffer use AtomicIntegers for the current 
> offset when allocating. As in CASSANDRA-15922, the loops on 
> {{.compareAndSet(..)}} can be replaced with atomic adds using the 
> {{.getAndAdd(..)}} method.
> In highly contended environments the CAS failure rate can be high, starving 
> writes in a running Cassandra node. On the same cluster where CASSANDRA-15922 
> was found, there were still problems around commit log flushing and hints 
> after the CASSANDRA-15922 fix was deployed. No flamegraph was collected that 
> demonstrated the thread contention as clearly as in CASSANDRA-15922, but the 
> performance fix proposed here is hopefully obvious enough.




[jira] [Created] (CASSANDRA-16072) Reduce thread contention in CommitLogSegment and HintsBuffer by rewriting CAS loops to atomic adds

2020-08-24 Thread Michael Semb Wever (Jira)

Michael Semb Wever created CASSANDRA-16072:
--

 Summary: Reduce thread contention in CommitLogSegment and 
HintsBuffer by rewriting CAS loops to atomic adds
 Key: CASSANDRA-16072
 URL: https://issues.apache.org/jira/browse/CASSANDRA-16072
 Project: Cassandra
  Issue Type: Improvement
  Components: Consistency/Hints, Local/Commit Log
Reporter: Michael Semb Wever
Assignee: Michael Semb Wever


Follow up to CASSANDRA-15922

Both CommitLogSegment and HintsBuffer use AtomicIntegers for the current offset 
when allocating. As in CASSANDRA-15922, the loops on {{.compareAndSet(..)}} 
can be replaced with atomic adds using the {{.getAndAdd(..)}} method.

In highly contended environments the CAS failure rate can be high, starving 
writes in a running Cassandra node. On the same cluster where CASSANDRA-15922 
was found, there were still problems around commit log flushing and hints 
after the CASSANDRA-15922 fix was deployed. No flamegraph was collected that 
demonstrated the thread contention as clearly as in CASSANDRA-15922, but the 
performance fix proposed here is hopefully obvious enough.




[jira] [Commented] (CASSANDRA-12662) OOM when using SASI index

2020-08-24 Thread Michael Semb Wever (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-12662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183113#comment-17183113
 ] 

Michael Semb Wever commented on CASSANDRA-12662:


Thanks for the detailed info [~scottcarey]. 

(1) has been filed and a patch raised in CASSANDRA-16071

> OOM when using SASI index
> -
>
> Key: CASSANDRA-12662
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12662
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/SASI
> Environment: Linux, 4 CPU cores, 16Gb RAM, Cassandra process utilizes 
> ~8Gb, of which ~4Gb is Java heap
>Reporter: Maxim Podkolzine
>Priority: Urgent
> Fix For: 3.11.x
>
> Attachments: memory-dump.png
>
>
> 2.8Gb of the heap is taken by the index data, pending for flush (see the 
> screenshot). As a result the node fails with OOM.
> Questions:
> - Why can't Cassandra keep up with the inserted data and flush it?
> - What resources/configuration should be changed to improve the performance?




[jira] [Comment Edited] (CASSANDRA-16071) max_compaction_flush_memory_in_mb is interpreted as bytes

2020-08-24 Thread Michael Semb Wever (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183111#comment-17183111
 ] 

Michael Semb Wever edited comment on CASSANDRA-16071 at 8/24/20, 9:47 AM:
--

Patches
 - 
[3.11|https://github.com/apache/cassandra/compare/trunk...thelastpickle:mck/cassandra-3.11_max_compaction_flush_memory_in_mb_fix]
 
 - 
[trunk|https://github.com/apache/cassandra/compare/trunk...thelastpickle:mck/trunk_max_compaction_flush_memory_in_mb_fix]
 

(will update with tests and CI run later today…)


was (Author: michaelsembwever):
Patches
 - 
[3.11|https://github.com/apache/cassandra/compare/trunk...thelastpickle:mck/cassandra-3.11_max_compaction_flush_memory_in_mb_fix]
 
 - 
[trunk|https://github.com/apache/cassandra/compare/trunk...thelastpickle:mck/trunk_max_compaction_flush_memory_in_mb_fix]
 

(will update with CI later today…)

> max_compaction_flush_memory_in_mb is interpreted as bytes
> -
>
> Key: CASSANDRA-16071
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16071
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/SASI
>Reporter: Michael Semb Wever
>Assignee: Michael Semb Wever
>Priority: Normal
> Fix For: 3.11.x, 4.0-beta
>
>
> In CASSANDRA-12662, [~scottcarey] 
> [reported|https://issues.apache.org/jira/browse/CASSANDRA-12662?focusedCommentId=17070055&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17070055]
>  that the {{max_compaction_flush_memory_in_mb}} setting gets incorrectly 
> interpreted in bytes rather than megabytes as its name implies.
> {quote}
> 1.  the setting 'max_compaction_flush_memory_in_mb' is a misnomer, it is 
> actually memory in BYTES.  If you take it at face value, and set it to say, 
> '512' thinking that means 512MB,  you will produce a million temp files 
> rather quickly in a large compaction, which will exhaust even large values of 
> max_map_count rapidly, and get the OOM: Map Error issue above and possibly 
> have a very difficult situation to get a cluster back into a place where 
> nodes aren't crashing while initializing or soon after.  This issue is minor 
> if you know about it in advance and set the value IN BYTES.
> {quote}




[jira] [Commented] (CASSANDRA-16071) max_compaction_flush_memory_in_mb is interpreted as bytes

2020-08-24 Thread Michael Semb Wever (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183111#comment-17183111
 ] 

Michael Semb Wever commented on CASSANDRA-16071:


Patches
 - 
[3.11|https://github.com/apache/cassandra/compare/trunk...thelastpickle:mck/cassandra-3.11_max_compaction_flush_memory_in_mb_fix]
 
 - 
[trunk|https://github.com/apache/cassandra/compare/trunk...thelastpickle:mck/trunk_max_compaction_flush_memory_in_mb_fix]
 

(will update with CI later today…)

> max_compaction_flush_memory_in_mb is interpreted as bytes
> -
>
> Key: CASSANDRA-16071
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16071
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/SASI
>Reporter: Michael Semb Wever
>Assignee: Michael Semb Wever
>Priority: Normal
> Fix For: 3.11.x, 4.0-beta
>
>
> In CASSANDRA-12662, [~scottcarey] 
> [reported|https://issues.apache.org/jira/browse/CASSANDRA-12662?focusedCommentId=17070055&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17070055]
>  that the {{max_compaction_flush_memory_in_mb}} setting gets incorrectly 
> interpreted in bytes rather than megabytes as its name implies.
> {quote}
> 1.  the setting 'max_compaction_flush_memory_in_mb' is a misnomer, it is 
> actually memory in BYTES.  If you take it at face value, and set it to say, 
> '512' thinking that means 512MB,  you will produce a million temp files 
> rather quickly in a large compaction, which will exhaust even large values of 
> max_map_count rapidly, and get the OOM: Map Error issue above and possibly 
> have a very difficult situation to get a cluster back into a place where 
> nodes aren't crashing while initializing or soon after.  This issue is minor 
> if you know about it in advance and set the value IN BYTES.
> {quote}




[jira] [Updated] (CASSANDRA-16071) max_compaction_flush_memory_in_mb is interpreted as bytes

2020-08-24 Thread Michael Semb Wever (Jira)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Semb Wever updated CASSANDRA-16071:
---
Description: 
In CASSANDRA-12662, [~scottcarey] 
[reported|https://issues.apache.org/jira/browse/CASSANDRA-12662?focusedCommentId=17070055&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17070055]
 that the {{max_compaction_flush_memory_in_mb}} setting gets incorrectly 
interpreted in bytes rather than megabytes as its name implies.

{quote}
1.  the setting 'max_compaction_flush_memory_in_mb' is a misnomer, it is 
actually memory in BYTES.  If you take it at face value, and set it to say, 
'512' thinking that means 512MB,  you will produce a million temp files rather 
quickly in a large compaction, which will exhaust even large values of 
max_map_count rapidly, and get the OOM: Map Error issue above and possibly have 
a very difficult situation to get a cluster back into a place where nodes 
aren't crashing while initializing or soon after.  This issue is minor if you 
know about it in advance and set the value IN BYTES.
{quote}



  was:
In CASSANDRA-12662 [~scottcarey] 
[reported|https://issues.apache.org/jira/browse/CASSANDRA-12662?focusedCommentId=17070055&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17070055]
 that the {{max_compaction_flush_memory_in_mb}} setting gets incorrectly 
interpreted in bytes rather than megabytes as its name implies.

{quote}
1.  the setting 'max_compaction_flush_memory_in_mb' is a misnomer, it is 
actually memory in BYTES.  If you take it at face value, and set it to say, 
'512' thinking that means 512MB,  you will produce a million temp files rather 
quickly in a large compaction, which will exhaust even large values of 
max_map_count rapidly, and get the OOM: Map Error issue above and possibly have 
a very difficult situation to get a cluster back into a place where nodes 
aren't crashing while initializing or soon after.  This issue is minor if you 
know about it in advance and set the value IN BYTES.
{quote}




> max_compaction_flush_memory_in_mb is interpreted as bytes
> -
>
> Key: CASSANDRA-16071
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16071
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/SASI
>Reporter: Michael Semb Wever
>Assignee: Michael Semb Wever
>Priority: Normal
> Fix For: 3.11.x, 4.0-beta
>
>
> In CASSANDRA-12662, [~scottcarey] 
> [reported|https://issues.apache.org/jira/browse/CASSANDRA-12662?focusedCommentId=17070055&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17070055]
>  that the {{max_compaction_flush_memory_in_mb}} setting gets incorrectly 
> interpreted in bytes rather than megabytes as its name implies.
> {quote}
> 1.  the setting 'max_compaction_flush_memory_in_mb' is a misnomer, it is 
> actually memory in BYTES.  If you take it at face value, and set it to say, 
> '512' thinking that means 512MB,  you will produce a million temp files 
> rather quickly in a large compaction, which will exhaust even large values of 
> max_map_count rapidly, and get the OOM: Map Error issue above and possibly 
> have a very difficult situation to get a cluster back into a place where 
> nodes aren't crashing while initializing or soon after.  This issue is minor 
> if you know about it in advance and set the value IN BYTES.
> {quote}




[jira] [Updated] (CASSANDRA-16071) max_compaction_flush_memory_in_mb is interpreted as bytes

2020-08-24 Thread Michael Semb Wever (Jira)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Semb Wever updated CASSANDRA-16071:
---
 Bug Category: Parent values: Correctness(12982)Level 1 values: API / 
Semantic Implementation(12988)
   Complexity: Low Hanging Fruit
Discovered By: User Report
Fix Version/s: 4.0-beta
   3.11.x
 Severity: Normal
   Status: Open  (was: Triage Needed)

> max_compaction_flush_memory_in_mb is interpreted as bytes
> -
>
> Key: CASSANDRA-16071
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16071
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/SASI
>Reporter: Michael Semb Wever
>Assignee: Michael Semb Wever
>Priority: Normal
> Fix For: 3.11.x, 4.0-beta
>
>
> In CASSANDRA-12662 [~scottcarey] 
> [reported|https://issues.apache.org/jira/browse/CASSANDRA-12662?focusedCommentId=17070055&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17070055]
>  that the {{max_compaction_flush_memory_in_mb}} setting gets incorrectly 
> interpreted in bytes rather than megabytes as its name implies.
> {quote}
> 1.  the setting 'max_compaction_flush_memory_in_mb' is a misnomer, it is 
> actually memory in BYTES.  If you take it at face value, and set it to say, 
> '512' thinking that means 512MB,  you will produce a million temp files 
> rather quickly in a large compaction, which will exhaust even large values of 
> max_map_count rapidly, and get the OOM: Map Error issue above and possibly 
> have a very difficult situation to get a cluster back into a place where 
> nodes aren't crashing while initializing or soon after.  This issue is minor 
> if you know about it in advance and set the value IN BYTES.
> {quote}




[jira] [Created] (CASSANDRA-16071) max_compaction_flush_memory_in_mb is interpreted as bytes

2020-08-24 Thread Michael Semb Wever (Jira)

Michael Semb Wever created CASSANDRA-16071:
--

 Summary: max_compaction_flush_memory_in_mb is interpreted as bytes
 Key: CASSANDRA-16071
 URL: https://issues.apache.org/jira/browse/CASSANDRA-16071
 Project: Cassandra
  Issue Type: Bug
  Components: Feature/SASI
Reporter: Michael Semb Wever
Assignee: Michael Semb Wever


In CASSANDRA-12662 [~scottcarey] 
[reported|https://issues.apache.org/jira/browse/CASSANDRA-12662?focusedCommentId=17070055&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17070055]
 that the {{max_compaction_flush_memory_in_mb}} setting gets incorrectly 
interpreted in bytes rather than megabytes as its name implies.

{quote}
1.  the setting 'max_compaction_flush_memory_in_mb' is a misnomer, it is 
actually memory in BYTES.  If you take it at face value, and set it to say, 
'512' thinking that means 512MB,  you will produce a million temp files rather 
quickly in a large compaction, which will exhaust even large values of 
max_map_count rapidly, and get the OOM: Map Error issue above and possibly have 
a very difficult situation to get a cluster back into a place where nodes 
aren't crashing while initializing or soon after.  This issue is minor if you 
know about it in advance and set the value IN BYTES.
{quote}
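
To make the scale of the unit mix-up concrete, here is a minimal Java sketch of the arithmetic. It is not Cassandra code: the class and method names are hypothetical, and it assumes, purely for illustration, that one temp file is flushed each time the configured threshold is reached and that 512 MiB of index data passes through a compaction.

```java
// Illustrative only: hypothetical names, not part of Cassandra.
final class FlushMemoryUnits
{
    static final long BYTES_PER_MB = 1024L * 1024L;

    // What an operator expects a value of a "*_in_mb" setting to mean.
    static long megabytesToBytes(long megabytes)
    {
        return Math.multiplyExact(megabytes, BYTES_PER_MB);
    }

    // Assumed model: one flushed temp file per thresholdBytes of data (ceiling division).
    static long flushCount(long dataBytes, long thresholdBytes)
    {
        return (dataBytes + thresholdBytes - 1) / thresholdBytes;
    }

    public static void main(String[] args)
    {
        long data = 512L * BYTES_PER_MB; // arbitrary example size

        // Setting "512" read as bytes: roughly a million flushes.
        System.out.println(flushCount(data, 512)); // prints 1048576

        // Setting "512" read as megabytes: a single flush.
        System.out.println(flushCount(data, megabytesToBytes(512))); // prints 1
    }
}
```

Under these assumptions the byte interpretation yields about a million temp files for the same data, which matches the order of magnitude described in the quoted report.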






[jira] [Commented] (CASSANDRA-16070) Add dtest option to keep ccm test directories for just failed tests

2020-08-24 Thread Michael Semb Wever (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-16070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183105#comment-17183105
 ] 

Michael Semb Wever commented on CASSANDRA-16070:


DTest 
[patch|https://github.com/apache/cassandra-dtest/compare/master...thelastpickle:mck/keep_failed_test_dir]
Cassandra-builds 
[patch|https://github.com/apache/cassandra-builds/compare/master...thelastpickle:mck/keep_failed_test_dir]
CI 
[run|https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch-dtest/35/]

> Add dtest option to keep ccm test directories for just failed tests
> ---
>
> Key: CASSANDRA-16070
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16070
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest
>Reporter: Michael Semb Wever
>Assignee: Michael Semb Wever
>Priority: Normal
> Fix For: 4.0-beta
>
>
> DTests already have an option {{`--keep-test-dir`}} that keeps the ccm test 
> directories. This is useful for debugging failures, especially those that 
> can't be reproduced locally.
> [Introducing|https://github.com/apache/cassandra-builds/commit/51eb85b57b62a542ca456e52a20bee06955f6ec1#diff-a885314255cf7d5c7c04889bf01aa2ab]
>  this option to ci-cassandra.a.o 
> [failed|https://github.com/apache/cassandra-builds/commit/d1600acde19cdbd906b0ff89318d3e8a3f400a70#diff-a885314255cf7d5c7c04889bf01aa2ab]
>  due to lack of disk space.
> This 
> [patch|https://github.com/apache/cassandra-dtest/compare/master...thelastpickle:mck/keep_failed_test_dir]
>  introduces a new option {{`--keep-failed-test-dir`}} that keeps the ccm test 
> directory only for dtests that fail.
> This should suffice. If disk space is still a problem, a further option, 
> {{`--keep-failed-test-log-dir`}}, can be added that keeps only the logs inside 
> the ccm test directory, as most of the space in these directories is taken 
> up by the cassandra data directories.
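
The retention decision described above can be sketched as follows. The dtest framework itself is Python (pytest), so this Java snippet is only an illustration of the logic. The enum, class and method names are hypothetical and do not come from the patch.

```java
// Illustrative only: hypothetical names, not taken from cassandra-dtest.
final class KeepTestDirPolicy
{
    // KEEP_ALL mirrors --keep-test-dir, KEEP_FAILED mirrors --keep-failed-test-dir.
    enum Mode { KEEP_NONE, KEEP_ALL, KEEP_FAILED }

    // Decide whether the ccm test directory should survive after a test finishes.
    static boolean shouldKeep(Mode mode, boolean testFailed)
    {
        switch (mode)
        {
            case KEEP_ALL:
                return true;
            case KEEP_FAILED:
                return testFailed;
            default:
                return false;
        }
    }

    public static void main(String[] args)
    {
        System.out.println(shouldKeep(Mode.KEEP_FAILED, true));  // prints true
        System.out.println(shouldKeep(Mode.KEEP_FAILED, false)); // prints false
    }
}
```

With the failed-only mode, disk usage grows with the number of failing tests rather than with the size of the whole suite, which is the point of the proposed option.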




[jira] [Comment Edited] (CASSANDRA-15991) 15583 - Add UX tests to intree LHF tooling

2020-08-24 Thread Berenguer Blasi (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-15991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183048#comment-17183048
 ] 

Berenguer Blasi edited comment on CASSANDRA-15991 at 8/24/20, 9:20 AM:
---

[~samt] any chance to get this reviewed? I am back and available for any 
questions that may rise.


was (Author: bereng):
[~samt] any chance to get this reviewed? I am back an available for any 
questions that may rise.

> 15583 - Add UX tests to intree LHF tooling
> --
>
> Key: CASSANDRA-15991
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15991
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/unit
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-beta
>
>
> As per CASSANDRA-15583, many in-tree tools lack proper UX tooling: mandatory 
> params are indeed mandatory, 'help' produces actual help output, return 
> codes, etc.
> This ticket is an attempt to add it to those tools that classify as LHF. 
> Other tools such as nodetool, with many sub-commands, deserve a separate 
> ticket of their own.




[cassandra] branch trunk updated: ninja-fix: remove outdated java11 compiling instructions

2020-08-24 Thread mck

This is an automated email from the ASF dual-hosted git repository.

mck pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git


The following commit(s) were added to refs/heads/trunk by this push:
 new 40d0df4  ninja-fix: remove outdated java11 compiling instructions
40d0df4 is described below

commit 40d0df40f1bc16497bfd34c5c7afacf308b0
Author: Mick Semb Wever 
AuthorDate: Sat Aug 22 13:04:37 2020 +0200

ninja-fix: remove outdated java11 compiling instructions
---
 NEWS.txt | 2 --
 1 file changed, 2 deletions(-)

diff --git a/NEWS.txt b/NEWS.txt
index 7b9676b..7f20a7d 100644
--- a/NEWS.txt
+++ b/NEWS.txt
@@ -52,8 +52,6 @@ New features
 - *Experimental* support for Java 11 has been added. JVM options that differ between or are
   specific for Java 8 and 11 have been moved from jvm.options into jvm8.options and jvm11.options.
   IMPORTANT: Running C* on Java 11 is *experimental* and do it at your own risk.
-  Compilation recommendations: configure Java 11 SDK via JAVA_HOME and Java 8 SDK via JAVA8_HOME.
-  Release builds require Java 11 + Java 8. Development builds can use Java 8 without 11.
 - LCS now respects the max_threshold parameter when compacting - this was hard coded to 32
   before, but now it is possible to do bigger compactions when compacting from L0 to L1.
   This also applies to STCS-compactions in L0 - if there are more than 32 sstables in L0



[jira] [Commented] (CASSANDRA-15991) 15583 - Add UX tests to intree LHF tooling

2020-08-24 Thread Sam Tunnicliffe (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-15991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183049#comment-17183049
 ] 

Sam Tunnicliffe commented on CASSANDRA-15991:
-

[~Bereng] yes, it's on my todo list. Hopefully I can get to it in the next few 
days.

> 15583 - Add UX tests to intree LHF tooling
> --
>
> Key: CASSANDRA-15991
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15991
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/unit
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-beta
>
>
> As per CASSANDRA-15583, many in-tree tools lack proper UX tooling: mandatory 
> params are indeed mandatory, 'help' produces actual help output, return 
> codes, etc.
> This ticket is an attempt to add it to those tools that classify as LHF. 
> Other tools such as nodetool, with many sub-commands, deserve a separate 
> ticket of their own.




[jira] [Commented] (CASSANDRA-15991) 15583 - Add UX tests to intree LHF tooling

2020-08-24 Thread Berenguer Blasi (Jira)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-15991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183048#comment-17183048
 ] 

Berenguer Blasi commented on CASSANDRA-15991:
-

[~samt] any chance to get this reviewed? I am back an available for any 
questions that may rise.

> 15583 - Add UX tests to intree LHF tooling
> --
>
> Key: CASSANDRA-15991
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15991
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/unit
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-beta
>
>
> As per CASSANDRA-15583, many in-tree tools lack proper UX tooling: mandatory 
> params are indeed mandatory, 'help' produces actual help output, return 
> codes, etc.
> This ticket is an attempt to add it to those tools that classify as LHF. 
> Other tools such as nodetool, with many sub-commands, deserve a separate 
> ticket of their own.




[jira] [Updated] (CASSANDRA-15816) Transports are stopped in the wrong order

2020-08-24 Thread Marcus Eriksson (Jira)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-15816:

  Fix Version/s: (was: 3.11.x)
 (was: 3.0.x)
 3.11.8
 3.0.22
  Since Version: 3.0 alpha 1
Source Control Link: 
https://github.com/apache/cassandra/commit/acbaeb1ee8d0aabe9ffb198df76fb6839b23f072
  (was: https://github.com/apache/cassandra/pull/593)
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

and committed, thanks

> Transports are stopped in the wrong order
> -
>
> Key: CASSANDRA-15816
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15816
> Project: Cassandra
>  Issue Type: Bug
>  Components: Messaging/Internode
>Reporter: Jeff Jirsa
>Assignee: Jeff Jirsa
>Priority: Normal
> Fix For: 4.0-beta, 3.0.22, 3.11.8
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Stopping gossip while native is running is almost always wrong, change the 
> order of shutdown and log a warning when done manually
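
The intended ordering can be sketched with a small stand-alone Java model. This is not the real StorageService: the class, fields and event strings below are hypothetical stand-ins, and only the order of operations and the warning on a manual gossip stop reflect the ticket description.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a node's services; not the real StorageService.
final class TransportShutdownSketch
{
    boolean nativeTransportRunning = true;
    boolean gossipActive = true;
    final List<String> events = new ArrayList<>();

    void stopNativeTransport()
    {
        nativeTransportRunning = false;
        events.add("native transport stopped");
    }

    // Models a manual (operator-requested) gossip stop.
    void stopGossiping()
    {
        if (!gossipActive)
        {
            return;
        }
        if (nativeTransportRunning)
        {
            // Clients may still be connected while the node drops out of gossip.
            events.add("WARN gossip disabled while native transport is still active");
        }
        gossipActive = false;
        events.add("gossip stopped");
    }

    // Client-facing transport first, gossip last.
    void stopTransports()
    {
        if (nativeTransportRunning)
        {
            stopNativeTransport();
        }
        if (gossipActive)
        {
            stopGossiping();
        }
    }

    public static void main(String[] args)
    {
        TransportShutdownSketch node = new TransportShutdownSketch();
        node.stopTransports();
        System.out.println(node.events); // prints [native transport stopped, gossip stopped]
    }
}
```

Calling stopGossiping() directly on a fresh instance records the warning first, which corresponds to the "log a warning when done manually" part of the description.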




[cassandra] 01/01: Merge branch 'cassandra-3.11' into trunk

2020-08-24 Thread marcuse

This is an automated email from the ASF dual-hosted git repository.

marcuse pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git

commit 6de09a622d80785a7baa084a400229680f7e8130
Merge: dfd0aeb 63cdfc9
Author: Marcus Eriksson 
AuthorDate: Mon Aug 24 09:27:29 2020 +0200

Merge branch 'cassandra-3.11' into trunk

 CHANGES.txt  |  1 +
 .../org/apache/cassandra/service/StorageService.java | 16 +++-
 2 files changed, 12 insertions(+), 5 deletions(-)

diff --cc CHANGES.txt
index 77a0652,714b104..415f699
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,25 -1,8 +1,26 @@@
 -3.11.8
 +4.0-beta2
 + * Prevent repair from overrunning compaction (CASSANDRA-15817)
 + * fix cqlsh COPY functions in Python 3.8 on Mac (CASSANDRA-16053)
 + * Strip comment blocks from cqlsh input before processing statements (CASSANDRA-15802)
 + * Fix unicode chars error input (CASSANDRA-15990)
 + * Improved testability for CacheMetrics and ChunkCacheMetrics (CASSANDRA-15788)
 + * Handle errors in StreamSession#prepare (CASSANDRA-15852)
 + * FQL replay should have options to ignore DDL statements (CASSANDRA-16039)
 + * Remove COMPACT STORAGE internals (CASSANDRA-13994)
 + * Make TimestampSerializer accept fractional seconds of varying precision (CASSANDRA-15976)
 + * Improve cassandra-stress logging when using a profile file that doesn't exist (CASSANDRA-14425)
 + * Improve logging for socket connection/disconnection (CASSANDRA-15980)
 + * Throw FSWriteError upon write failures in order to apply DiskFailurePolicy (CASSANDRA-15928)
 + * Forbid altering UDTs used in partition keys (CASSANDRA-15933)
 + * Fix version parsing logic when upgrading from 3.0 (CASSANDRA-15973)
 + * Optimize NoSpamLogger use in hot paths (CASSANDRA-15766)
 + * Verify sstable components on startup (CASSANDRA-15945)
 +Merged from 3.11:
   * Fix short read protection for GROUP BY queries (CASSANDRA-15459)
 + * stop_paranoid disk failure policy is ignored on CorruptSSTableException after node is up (CASSANDRA-15191)
   * Frozen RawTuple is not annotated with frozen in the toString method (CASSANDRA-15857)
  Merged from 3.0:
+  * Fix gossip shutdown order (CASSANDRA-15816)
   * Remove broken 'defrag-on-read' optimization (CASSANDRA-15432)
   * Check for endpoint collision with hibernating nodes (CASSANDRA-14599)
   * Operational improvements and hardening for replica filtering protection (CASSANDRA-15907)
diff --cc src/java/org/apache/cassandra/service/StorageService.java
index 0d10418,ab30bfc..6c8d729
--- a/src/java/org/apache/cassandra/service/StorageService.java
+++ b/src/java/org/apache/cassandra/service/StorageService.java
@@@ -397,11 -467,11 +403,6 @@@ public class StorageService extends Not
  
      public void stopTransports()
      {
-         if (isGossipActive())
 -        if (isRPCServerRunning())
--        {
-             logger.error("Stopping gossiper");
-             stopGossiping();
 -            logger.error("Stopping RPC server");
 -            stopRPCServer();
--        }
          if (isNativeTransportRunning())
          {
              logger.error("Stopping native transport");



[cassandra] branch trunk updated (dfd0aeb -> 6de09a6)

2020-08-24 Thread marcuse

This is an automated email from the ASF dual-hosted git repository.

marcuse pushed a change to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git.


from dfd0aeb  Don't allow repair to overrun compaction
 new acbaeb1  Fix gossip shutdown order
 new 63cdfc9  Merge branch 'cassandra-3.0' into cassandra-3.11
 new 6de09a6  Merge branch 'cassandra-3.11' into trunk

The 3 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 CHANGES.txt  |  1 +
 .../org/apache/cassandra/service/StorageService.java | 16 +++-
 2 files changed, 12 insertions(+), 5 deletions(-)



[cassandra] 01/01: Merge branch 'cassandra-3.0' into cassandra-3.11

2020-08-24 Thread marcuse

This is an automated email from the ASF dual-hosted git repository.

marcuse pushed a commit to branch cassandra-3.11
in repository https://gitbox.apache.org/repos/asf/cassandra.git

commit 63cdfc9bf43ad89301e3fc590539d526d225d348
Merge: 0076217 acbaeb1
Author: Marcus Eriksson 
AuthorDate: Mon Aug 24 09:24:12 2020 +0200

Merge branch 'cassandra-3.0' into cassandra-3.11

 CHANGES.txt  |  1 +
 .../org/apache/cassandra/service/StorageService.java | 16 +++-
 2 files changed, 12 insertions(+), 5 deletions(-)

diff --cc CHANGES.txt
index eef996d,5b23245..714b104
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,7 -1,5 +1,8 @@@
 -3.0.22:
 +3.11.8
 + * Fix short read protection for GROUP BY queries (CASSANDRA-15459)
 + * Frozen RawTuple is not annotated with frozen in the toString method (CASSANDRA-15857)
 +Merged from 3.0:
+  * Fix gossip shutdown order (CASSANDRA-15816)
   * Remove broken 'defrag-on-read' optimization (CASSANDRA-15432)
   * Check for endpoint collision with hibernating nodes (CASSANDRA-14599)
   * Operational improvements and hardening for replica filtering protection (CASSANDRA-15907)
diff --cc src/java/org/apache/cassandra/service/StorageService.java
index 240a15e,0aba23c..ab30bfc
--- a/src/java/org/apache/cassandra/service/StorageService.java
+++ b/src/java/org/apache/cassandra/service/StorageService.java
@@@ -318,11 -312,17 +318,17 @@@ public class StorageService extends Not
      // should only be called via JMX
      public void stopGossiping()
      {
 -        if (initialized)
 +        if (gossipActive)
          {
              logger.warn("Stopping gossip by operator request");
+ 
+             if (isNativeTransportRunning())
+             {
+                 logger.warn("Disabling gossip while native transport is still active is unsafe");
+             }
+ 
              Gossiper.instance.stop();
 -            initialized = false;
 +            gossipActive = false;
          }
      }
  
@@@ -476,6 -477,11 +477,11 @@@
              logger.error("Stopping native transport");
              stopNativeTransport();
          }
 -        if (isInitialized())
++        if (isGossipActive())
+         {
+             logger.error("Stopping gossiper");
+             stopGossiping();
+         }
      }
  
      /**



[cassandra] branch cassandra-3.0 updated: Fix gossip shutdown order

2020-08-24 Thread marcuse

This is an automated email from the ASF dual-hosted git repository.

marcuse pushed a commit to branch cassandra-3.0
in repository https://gitbox.apache.org/repos/asf/cassandra.git


The following commit(s) were added to refs/heads/cassandra-3.0 by this push:
 new acbaeb1  Fix gossip shutdown order
acbaeb1 is described below

commit acbaeb1ee8d0aabe9ffb198df76fb6839b23f072
Author: Jeff Jirsa 
AuthorDate: Fri May 15 16:29:45 2020 -0700

Fix gossip shutdown order

Patch by Jeff Jirsa; reviewed by Robert Stupp and marcuse for 
CASSANDRA-15816
---
 CHANGES.txt  |  1 +
 .../org/apache/cassandra/service/StorageService.java | 16 +++-
 2 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/CHANGES.txt b/CHANGES.txt
index 2d66ee2..5b23245 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.0.22:
+ * Fix gossip shutdown order (CASSANDRA-15816)
  * Remove broken 'defrag-on-read' optimization (CASSANDRA-15432)
  * Check for endpoint collision with hibernating nodes (CASSANDRA-14599)
  * Operational improvements and hardening for replica filtering protection (CASSANDRA-15907)
diff --git a/src/java/org/apache/cassandra/service/StorageService.java b/src/java/org/apache/cassandra/service/StorageService.java
index f0b183d..0aba23c 100644
--- a/src/java/org/apache/cassandra/service/StorageService.java
+++ b/src/java/org/apache/cassandra/service/StorageService.java
@@ -315,6 +315,12 @@ public class StorageService extends NotificationBroadcasterSupport implements IE
         if (initialized)
         {
             logger.warn("Stopping gossip by operator request");
+
+            if (isNativeTransportRunning())
+            {
+                logger.warn("Disabling gossip while native transport is still active is unsafe");
+            }
+
             Gossiper.instance.stop();
             initialized = false;
         }
@@ -461,11 +467,6 @@ public class StorageService extends NotificationBroadcasterSupport implements IE
 
     public void stopTransports()
     {
-        if (isInitialized())
-        {
-            logger.error("Stopping gossiper");
-            stopGossiping();
-        }
         if (isRPCServerRunning())
         {
             logger.error("Stopping RPC server");
@@ -476,6 +477,11 @@ public class StorageService extends NotificationBroadcasterSupport implements IE
             logger.error("Stopping native transport");
             stopNativeTransport();
         }
+        if (isInitialized())
+        {
+            logger.error("Stopping gossiper");
+            stopGossiping();
+        }
     }
 
     /**



[cassandra] branch cassandra-3.11 updated (0076217 -> 63cdfc9)

2020-08-24 Thread marcuse

This is an automated email from the ASF dual-hosted git repository.

marcuse pushed a change to branch cassandra-3.11
in repository https://gitbox.apache.org/repos/asf/cassandra.git.


from 0076217  Merge branch 'cassandra-3.0' into cassandra-3.11
 new acbaeb1  Fix gossip shutdown order
 new 63cdfc9  Merge branch 'cassandra-3.0' into cassandra-3.11

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 CHANGES.txt  |  1 +
 .../org/apache/cassandra/service/StorageService.java | 16 +++-
 2 files changed, 12 insertions(+), 5 deletions(-)



[jira] [Updated] (CASSANDRA-16070) Add dtest option to keep ccm test directories for just failed tests

2020-08-24 Thread Berenguer Blasi (Jira)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Berenguer Blasi updated CASSANDRA-16070:

Reviewers: Berenguer Blasi

> Add dtest option to keep ccm test directories for just failed tests
> ---
>
> Key: CASSANDRA-16070
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16070
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest
>Reporter: Michael Semb Wever
>Assignee: Michael Semb Wever
>Priority: Normal
> Fix For: 4.0-beta
>
>
> DTests already have an option {{`--keep-test-dir`}} that keeps the ccm test 
> directories. This is useful for debugging failures, especially those that 
> can't be reproduced locally.
> [Introducing|https://github.com/apache/cassandra-builds/commit/51eb85b57b62a542ca456e52a20bee06955f6ec1#diff-a885314255cf7d5c7c04889bf01aa2ab]
>  this option to ci-cassandra.a.o 
> [failed|https://github.com/apache/cassandra-builds/commit/d1600acde19cdbd906b0ff89318d3e8a3f400a70#diff-a885314255cf7d5c7c04889bf01aa2ab]
>  due to lack of disk space.
> This 
> [patch|https://github.com/apache/cassandra-dtest/compare/master...thelastpickle:mck/keep_failed_test_dir]
>  introduces a new option {{`--keep-failed-test-dir`}} that keeps the ccm test 
> directory only for dtests that fail.
> This should suffice. If disk space is still a problem, a further option, 
> {{`--keep-failed-test-log-dir`}}, can be added that keeps only the logs inside 
> the ccm test directory, as most of the space in these directories is taken 
> up by the cassandra data directories.




[jira] [Updated] (CASSANDRA-16070) Add dtest option to keep ccm test directories for just failed tests

2020-08-24 Thread Michael Semb Wever (Jira)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Semb Wever updated CASSANDRA-16070:
---
Change Category: Quality Assurance
 Complexity: Normal
  Fix Version/s: 4.0-beta
   Assignee: Michael Semb Wever
 Status: Open  (was: Triage Needed)

> Add dtest option to keep ccm test directories for just failed tests
> ---
>
> Key: CASSANDRA-16070
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16070
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest
>Reporter: Michael Semb Wever
>Assignee: Michael Semb Wever
>Priority: Normal
> Fix For: 4.0-beta
>
>
> DTests already have an option {{`--keep-test-dir`}} that keeps the ccm test 
> directories. This is useful for debugging failures, especially those that 
> can't be reproduced locally.
> [Introducing|https://github.com/apache/cassandra-builds/commit/51eb85b57b62a542ca456e52a20bee06955f6ec1#diff-a885314255cf7d5c7c04889bf01aa2ab]
>  this option to ci-cassandra.a.o 
> [failed|https://github.com/apache/cassandra-builds/commit/d1600acde19cdbd906b0ff89318d3e8a3f400a70#diff-a885314255cf7d5c7c04889bf01aa2ab]
>  due to lack of disk space.
> This 
> [patch|https://github.com/apache/cassandra-dtest/compare/master...thelastpickle:mck/keep_failed_test_dir]
>  introduces a new option {{`--keep-failed-test-dir`}} that keeps the ccm test 
> directory only for dtests that fail.
> This should suffice. If disk space is still a problem, a further option, 
> {{`--keep-failed-test-log-dir`}}, can be added that keeps only the logs inside 
> the ccm test directory, as most of the space in these directories is taken 
> up by the cassandra data directories.



