[jira] [Commented] (CASSANDRA-15274) Multiple Corrupt datafiles across entire environment

2019-08-12 Thread feroz shaik (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905808#comment-16905808
 ] 

feroz shaik commented on CASSANDRA-15274:
-

Thank you [~philoconduin]. I want to add the other possibilities we have 
already ruled out or are still checking, so the community is aware of them.
 # Power disruptions - None reported by the infra team.
 # Storage-related glitches/issues - None.
 # Network issues - None. (We have not looked in detail at packet captures, 
drops, etc., but monitoring is clean.)
 # Schema change - It was reported on some forum that dropping a column and 
re-creating it with a different datatype could cause corruption. We checked, 
and no such schema change was made on the cluster.
 # CRC check - We are still investigating this. If CRC checks were not being 
performed effectively, that raises another question: why would it fail for 
only certain data files and not all? From what we have seen, the corruption 
can be on any CF, with no pattern tied to a single compaction strategy or 
similar.
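
For readers following the CRC point, here is a minimal sketch of what a per-chunk checksum comparison looks like in general. It uses the JDK's java.util.zip.CRC32 and a hypothetical ChunkChecksum helper; it is not Cassandra's SSTable code and makes no claim about which checksum or on-disk layout 2.2.13 uses.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

/** Hypothetical helper for illustration only; not part of Cassandra. */
final class ChunkChecksum {
    /** Computes the CRC32 of the given bytes. */
    static long crc32(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    /** Returns true when the recomputed checksum equals the stored one. */
    static boolean matches(byte[] chunk, long storedChecksum) {
        return crc32(chunk) == storedChecksum;
    }

    public static void main(String[] args) {
        byte[] chunk = "123456789".getBytes(StandardCharsets.US_ASCII);
        long stored = crc32(chunk);   // checksum recorded at write time
        chunk[3] ^= 0x01;             // flip one bit to simulate corruption
        System.out.println("chunk still valid: " + matches(chunk, stored));
    }
}
```

A check like this only reports corruption in the chunks that are actually read and verified, which is one possible reason a repair (which reads everything in range) surfaces files that normal traffic never touched.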

 

Another important point is our PROD env, which is the same as PRE-PROD in 
terms of infrastructure, C* config setup, and schema. The only difference is 
the amount of data residing there - it's only 6-10 GB on average, compared to 
about 200 GB on average on pre-prod. We do not have any issues there (PROD). 

> Multiple Corrupt datafiles across entire environment 
> -
>
> Key: CASSANDRA-15274
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15274
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Phil O Conduin
>Priority: Normal
>
> Cassandra Version: 2.2.13
> PRE-PROD environment.
>  * 2 datacenters.
>  * 9 physical servers in each datacenter - (_Cisco UCS C220 M4 SFF_)
>  * 4 Cassandra instances on each server (cass_a, cass_b, cass_c, cass_d)
>  * 72 Cassandra instances across the 2 data centres, 36 in site A, 36 in site 
> B.
> We also have 2 Reaper Nodes we use for repair.  One reaper node in each 
> datacenter each running with its own Cassandra back end in a cluster together.
> OS Details [Red Hat Linux]
> cass_a@x 0 10:53:01 ~ $ uname -a
> Linux x 3.10.0-957.5.1.el7.x86_64 #1 SMP Wed Dec 19 10:46:58 EST 2018 x86_64 
> x86_64 x86_64 GNU/Linux
> cass_a@x 0 10:57:31 ~ $ cat /etc/*release
> NAME="Red Hat Enterprise Linux Server"
> VERSION="7.6 (Maipo)"
> ID="rhel"
> Storage Layout 
> cass_a@xx 0 10:46:28 ~ $ df -h
> Filesystem                         Size  Used Avail Use% Mounted on
> /dev/mapper/vg01-lv_root            20G  2.2G   18G  11% /
> devtmpfs                            63G     0   63G   0% /dev
> tmpfs                               63G     0   63G   0% /dev/shm
> tmpfs                               63G  4.1G   59G   7% /run
> tmpfs                               63G     0   63G   0% /sys/fs/cgroup
> >> 4 cassandra instances
> /dev/sdd                           1.5T  802G  688G  54% /data/ssd4
> /dev/sda                           1.5T  798G  692G  54% /data/ssd1
> /dev/sdb                           1.5T  681G  810G  46% /data/ssd2
> /dev/sdc                           1.5T  558G  932G  38% /data/ssd3
> Cassandra load is about 200GB and the rest of the space is snapshots
> CPU
> cass_a@x 127 10:58:47 ~ $ lscpu | grep -E '^Thread|^Core|^Socket|^CPU\('
> CPU(s):                64
> Thread(s) per core:    2
> Core(s) per socket:    16
> Socket(s):             2
> *Description of problem:*
> During repair of the cluster, we are seeing multiple corruptions in the log 
> files on a lot of instances.  There seems to be no pattern to the corruption. 
>  It seems that the repair job is finding all the corrupted files for us.  The 
> repair will hang on the node where the corrupted file is found.  To fix this 
> we remove/rename the datafile and bounce the Cassandra instance.  Our 
> hardware/OS team have stated there is no problem on their side.  I do not 
> believe it is the repair causing the corruption. 
>  
> So let me give you an example of a corrupted file and maybe someone might be 
> able to work through it with me?
> When this corrupted file was reported in the log it looks like it was the 
> repair that found it.
> $ journalctl -u cassmeta-cass_b.service --since "2019-08-07 22:25:00" --until 
> "2019-08-07 22:45:00"
> Aug 07 22:30:33 cassandra[34611]: INFO  21:30:33 Writing 
> Memtable-compactions_in_progress@830377457(0.008KiB serialized bytes, 1 ops, 
> 0%/0% of on/off-heap limit)
> Aug 07 22:30:33 cassandra[34611]: ERROR 21:30:33 Failed creating a merkle 
> tree for [repair #9587a200-b95a-11e9-8920-9f72868b8375 on KeyspaceMetadata/x, 
> (-1476350953672479093,-1474461
> Aug 07 22:30:33 cassandra[34611]: ERROR 21:30:33 Exception in thread 
> 

[jira] [Commented] (CASSANDRA-15263) LegacyLayout RangeTombstoneList throws java.lang.NullPointerException: null

2019-08-12 Thread feroz shaik (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905770#comment-16905770
 ] 

feroz shaik commented on CASSANDRA-15263:
-

Thank you [~benedict] for those findings. I think what I will try to do next is 
to convince our customer that these logged messages are non-impacting 
exceptions and will not cause any failures for requests (read/write). I hope 
my understanding is correct! In the meantime, can we expect an interim fix for 
this issue from the community? 

I once again convey my deep gratitude for all your support. 

> LegacyLayout RangeTombstoneList throws java.lang.NullPointerException: null
> ---
>
> Key: CASSANDRA-15263
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15263
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Schema
>Reporter: feroz shaik
>Assignee: Benedict
>Priority: Normal
>  Labels: 2.1.16, 3.11.4
> Attachments: sample.system.log, schema.txt, 
> sstabledump_sal_purge_d03.json, sstablemetadata_sal_purge_d03, 
> stack_trace.txt, system.log, system.log, system.log, system.log, 
> system_latest.log
>
>
> We hit a problem today while upgrading from 2.1.16 to 3.11.4.
> We encountered this as soon as the first node started up with 3.11.4.
> The full error stack is attached - [^stack_trace.txt] 
>  
> The below errors continued in the log file as long as the process was up.
> ERROR [Native-Transport-Requests-12] 2019-08-06 03:00:47,135 
> ErrorMessage.java:384 - Unexpected exception during request
>  java.lang.NullPointerException: null
>  ERROR [Native-Transport-Requests-8] 2019-08-06 03:00:48,778 
> ErrorMessage.java:384 - Unexpected exception during request
>  java.lang.NullPointerException: null
>  ERROR [Native-Transport-Requests-13] 2019-08-06 03:00:57,454 
>  
> The nodetool version reports 3.11.4, and the number of connections on the 
> native port (9042) was similar to other nodes. The exceptions were alarming 
> enough that we had to call off the change. Any help with or insight into 
> this problem from the community is appreciated.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15273) cassandra does not start with new systemd version

2019-08-12 Thread maxwellguo (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905753#comment-16905753
 ] 

maxwellguo commented on CASSANDRA-15273:


I think you should attach some logs so that we can get more information about 
this issue. The older version of Cassandra should also be provided.

> cassandra does not start with new systemd version
> -
>
> Key: CASSANDRA-15273
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15273
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Aleksandr Yatskin
>Priority: Normal
>
> After updating systemd to a version that fixes the vulnerability 
> https://access.redhat.com/security/cve/cve-2018-16888, the cassandra service 
> does not start correctly.
> Environment: RHEL 7, systemd-219-67.el7_7.1, cassandra-3.11.4-1 
> (https://www.apache.org/dist/cassandra/redhat/311x/cassandra-3.11.4-1.noarch.rpm)
> ---
> systemctl status cassandra
> ● cassandra.service - LSB: distributed storage system for structured data
>  Loaded: loaded (/etc/rc.d/init.d/cassandra; bad; vendor preset: disabled)
>  Active: failed (Result: resources) since Fri 2019-08-09 17:20:26 MSK; 1s ago
>  Docs: man:systemd-sysv-generator(8)
>  Process: 2414 ExecStop=/etc/rc.d/init.d/cassandra stop (code=exited, 
> status=0/SUCCESS)
>  Process: 2463 ExecStart=/etc/rc.d/init.d/cassandra start (code=exited, 
> status=0/SUCCESS)
>  Main PID: 1884 (code=exited, status=143)
> Aug 09 17:20:23 desktop43.example.com systemd[1]: Unit cassandra.service 
> entered failed state.
> Aug 09 17:20:23 desktop43.example.com systemd[1]: cassandra.service failed.
> Aug 09 17:20:23 desktop43.example.com systemd[1]: Starting LSB: distributed 
> storage system for structured data...
> Aug 09 17:20:23 desktop43.example.com su[2473]: (to cassandra) root on none
> Aug 09 17:20:26 desktop43.example.com cassandra[2463]: Starting Cassandra: OK
> Aug 09 17:20:26 desktop43.example.com systemd[1]: New main PID 2545 does not 
> belong to service, and PID file is not owned by root. Refusing.
> Aug 09 17:20:26 desktop43.example.com systemd[1]: New main PID 2545 does not 
> belong to service, and PID file is not owned by root. Refusing.
> Aug 09 17:20:26 desktop43.example.com systemd[1]: Failed to start LSB: 
> distributed storage system for structured data.
> Aug 09 17:20:26 desktop43.example.com systemd[1]: Unit cassandra.service 
> entered failed state.
> Aug 09 17:20:26 desktop43.example.com systemd[1]: cassandra.service failed.






[jira] [Commented] (CASSANDRA-15210) Streaming with CDC does not honor cdc_enabled

2019-08-12 Thread Andrew Prudhomme (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905662#comment-16905662
 ] 

Andrew Prudhomme commented on CASSANDRA-15210:
--

For some more context, this is causing us problems in the case where a CDC 
tracked table has large partitions. Since the streamed data is played 
through the commit log during bootstrap, the stream will fail because of the 
mutation size limit (0.5 * commit log segment size). This issue means that 
streaming will fail even when CDC is disabled at the node level.
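
To make the size limit concrete, here is a rough sketch of the arithmetic, assuming the limit is half the commit log segment size as described above. The MutationSizeLimit class and the 32 MiB segment size are illustrative assumptions, not Cassandra code or a statement about any particular cluster's configuration.

```java
/** Hypothetical helper for illustration only; not part of Cassandra. */
final class MutationSizeLimit {
    /** Limit assumed here: half of the commit log segment size. */
    static long maxMutationSizeBytes(long commitLogSegmentSizeBytes) {
        return commitLogSegmentSizeBytes / 2;
    }

    /** Returns true when a mutation of the given size stays within the limit. */
    static boolean fitsInCommitLog(long mutationSizeBytes, long commitLogSegmentSizeBytes) {
        return mutationSizeBytes <= maxMutationSizeBytes(commitLogSegmentSizeBytes);
    }

    public static void main(String[] args) {
        long segment = 32L * 1024 * 1024;         // assumed 32 MiB segment
        long largePartition = 20L * 1024 * 1024;  // assumed 20 MiB partition update
        System.out.println("limit bytes: " + maxMutationSizeBytes(segment));
        System.out.println("fits: " + fitsInCommitLog(largePartition, segment));
    }
}
```

Under these assumed numbers, a streamed partition of 20 MiB would exceed the limit once it is sent through the write path, which is the failure mode described in the comment.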

> Streaming with CDC does not honor cdc_enabled
> -
>
> Key: CASSANDRA-15210
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15210
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Streaming, Feature/Change Data Capture
>Reporter: Andrew Prudhomme
>Assignee: Andrew Prudhomme
>Priority: Normal
>
> When SSTables are streamed for a CDC enabled table, the updates are processed 
> through the write path to ensure they are made available through the commit 
> log. However, currently only the CDC state of the table is checked. Since CDC 
> is enabled at both the node and table level, a node with CDC disabled (with 
> cdc_enabled: false) will unnecessarily send updates through the write path if 
> CDC is enabled on the table. This seems like an oversight.
> I'd imagine the fix would be something like
>  
> {code:java}
> -   hasCDC = cfs.metadata.params.cdc;
> +   hasCDC = cfs.metadata.params.cdc && 
> DatabaseDescriptor.isCDCEnabled();{code}
> in
> org.apache.cassandra.db.streaming.CassandraStreamReceiver (4)
> org.apache.cassandra.streaming.StreamReceiveTask (3.11)
>  
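
The intended behaviour of the proposed one-line change can be sketched as a small truth table. CdcStreamingDecision is a hypothetical stand-in that models only the CDC part of the decision; the real receiver classes named above may consider other conditions as well.

```java
/** Hypothetical model of the CDC check only; not the actual Cassandra classes. */
final class CdcStreamingDecision {
    /**
     * Streamed updates need the write path for CDC only when CDC is enabled
     * both on the node (cdc_enabled) and on the table.
     */
    static boolean requiresWritePath(boolean nodeCdcEnabled, boolean tableCdcEnabled) {
        return nodeCdcEnabled && tableCdcEnabled;
    }

    public static void main(String[] args) {
        boolean[] flags = {false, true};
        for (boolean node : flags) {
            for (boolean table : flags) {
                System.out.println("node=" + node + " table=" + table
                                   + " -> write path: " + requiresWritePath(node, table));
            }
        }
    }
}
```

The case the ticket is about is node=false, table=true: the behaviour described above treats it as requiring the write path, while the sketch does not.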






[jira] [Commented] (CASSANDRA-15210) Streaming with CDC does not honor cdc_enabled

2019-08-12 Thread Andrew Prudhomme (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905658#comment-16905658
 ] 

Andrew Prudhomme commented on CASSANDRA-15210:
--

||Branch||Tests||
|[trunk|https://github.com/apache/cassandra/compare/trunk...aprudhomme:15210-trunk]|[cci|https://circleci.com/workflow-run/184e2de1-8893-481a-91f1-910ed4ac246a]|
|[3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...aprudhomme:15210-3.11]|[cci|https://circleci.com/workflow-run/f5fcba5a-e940-4cc7-b02d-540d8887c258]|

I did not have the circleci resources for dtests, so I ran them locally.

[trunk|https://pastebin.com/8aLbh7GF] - The 3 failed tests passed on retry. The 
rebuild_test error also occurred on the base branch.

[3.11|https://pastebin.com/bE9GBCm8] - The counter_test, largecolumn_test, 
offline_tools_test, and replace_address_test failures also occurred on the base 
branch. All other failures passed on retry.

 




[jira] [Commented] (CASSANDRA-15170) Reduce the time needed to release in-JVM dtest cluster resources after close

2019-08-12 Thread Jon Meredith (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905590#comment-16905590
 ] 

Jon Meredith commented on CASSANDRA-15170:
--

Pushed and updated - upgrade tests now work on 3.0, 3.x and trunk
 * fixed internode messaging to serialize/deserialize messages from the 
Cassandra magic number.
 * added missing call to shutdown() to the AbstractCluster.Wrapper.

 

That should be everything now, and it is ready for final review. I have no 
further planned changes.

> Reduce the time needed to release in-JVM dtest cluster resources after close
> 
>
> Key: CASSANDRA-15170
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15170
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest
>Reporter: Jon Meredith
>Assignee: Jon Meredith
>Priority: Normal
>
> There are a few issues that slow the in-JVM dtests from reclaiming metaspace 
> once the cluster is closed.
> IsolatedExecutor issues the shutdown on a SingleExecutorThreadPool, sometimes 
> this thread was still running 10s after the dtest cluster was closed.  
> Instead, switch to a ThreadPoolExecutor with a core pool size of 0 so that 
> the thread executing the class loader close executes sooner.
> If an OutboundTcpConnection is waiting to connect() and the endpoint is not 
> answering, it has to wait for a timeout before it exits. Instead it should 
> check the isShutdown flag and terminate early if shutdown has been requested.
> In 3.0 and above, HintsCatalog.load uses java.nio.Files.list outside of a 
> try-with-resources construct and leaks a file handle for the directory.  This 
> doesn't matter for normal usage, but it leaks a file handle for each dtest 
> Instance created.
> On trunk, Netty global event executor threads are still running and delay GC 
> for the instance class loader.
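
Two of the points above can be sketched with plain JDK classes. This is an illustration under assumptions, not the patch itself: DtestResourceSketch, its method names, and the keep-alive value are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical sketch for illustration only; not the in-JVM dtest code. */
final class DtestResourceSketch {
    /** Lists a directory and closes the underlying directory handle when done. */
    static List<String> listFileNames(Path dir) throws IOException {
        // Files.list returns a Stream that holds an open directory handle,
        // so it is wrapped in try-with-resources.
        try (Stream<Path> entries = Files.list(dir)) {
            return entries.map(p -> p.getFileName().toString())
                          .sorted()
                          .collect(Collectors.toList());
        }
    }

    /** Single-threaded pool with no core threads, so the idle worker exits after keep-alive. */
    static ThreadPoolExecutor newShortLivedExecutor(long keepAliveMillis) {
        return new ThreadPoolExecutor(0, 1, keepAliveMillis, TimeUnit.MILLISECONDS,
                                      new LinkedBlockingQueue<>());
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("hints-sketch");
        Files.createFile(dir.resolve("a.hints"));
        System.out.println(listFileNames(dir));

        ThreadPoolExecutor executor = newShortLivedExecutor(100);
        executor.submit(() -> System.out.println("placeholder for class loader close")).get();
        executor.shutdown();
    }
}
```

With a core pool size of 0 the pool keeps no thread alive once it has been idle for the keep-alive period, which fits the goal of not pinning the instance class loader after close.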






[jira] [Updated] (CASSANDRA-15194) Improve readability of Table metrics Virtual tables units

2019-08-12 Thread Chris Lohfink (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Lohfink updated CASSANDRA-15194:
--
Status: Ready to Commit  (was: Review In Progress)

> Improve readability of Table metrics Virtual tables units
> -
>
> Key: CASSANDRA-15194
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15194
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Virtual Tables
>Reporter: Jon Haddad
>Assignee: Chris Lohfink
>Priority: Normal
> Fix For: 4.0
>
>
> I just noticed this strange output in the coordinator_reads output:
> {code}
> cqlsh:system_views> select * from coordinator_reads ;
>  count | keyspace_name | table_name      | 99th | max | median | per_second
> -------+---------------+-----------------+------+-----+--------+------------
>   7573 |    tlp_stress |        keyvalue |    0 |   0 |      0 | 2.2375e-16
>   6076 |    tlp_stress |   random_access |    0 |   0 |      0 | 7.4126e-12
>    390 |    tlp_stress | sensor_data_udt |    0 |   0 |      0 | 1.7721e-64
>     30 |        system |           local |    0 |   0 |      0 |   0.006406
>     11 | system_schema |         columns |    0 |   0 |      0 | 1.1192e-16
>     11 | system_schema |         indexes |    0 |   0 |      0 | 1.1192e-16
>     11 | system_schema |          tables |    0 |   0 |      0 | 1.1192e-16
>     11 | system_schema |           views |    0 |   0 |      0 | 1.1192e-16
> {code}
> cc [~cnlwsu]
> btw I realize the output is technically correct, but it's not very readable.  
> For practical purposes this should just say 0.
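
One way to get the "should just say 0" behaviour is to round the rate to a fixed number of decimals before display. The sketch below is only an illustration of that idea: RateDisplay and the choice of three decimals are assumptions, and it is not the change that was committed for this ticket.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

/** Hypothetical display helper for illustration only; not the committed fix. */
final class RateDisplay {
    /** Rounds a finite value to the given number of decimals and drops trailing zeros. */
    static String format(double value, int decimals) {
        return BigDecimal.valueOf(value)
                         .setScale(decimals, RoundingMode.HALF_UP)
                         .stripTrailingZeros()
                         .toPlainString();
    }

    public static void main(String[] args) {
        System.out.println(format(2.2375e-16, 3)); // a decayed rate from the output above
        System.out.println(format(0.006406, 3));   // the one non-negligible rate above
    }
}
```

Rounding this way hides the decayed values that are effectively zero while keeping the one rate that is large enough to matter.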






[jira] [Updated] (CASSANDRA-15194) Improve readability of Table metrics Virtual tables units

2019-08-12 Thread Chris Lohfink (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Lohfink updated CASSANDRA-15194:
--
Reviewers: Benedict, Jon Haddad  (was: Benedict)
   Status: Review In Progress  (was: Patch Available)




[jira] [Updated] (CASSANDRA-15194) Improve readability of Table metrics Virtual tables units

2019-08-12 Thread Chris Lohfink (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Lohfink updated CASSANDRA-15194:
--
Source Control Link: 
https://github.com/apache/cassandra/commit/9a175a1697b1107fb63480fb86ffe37b02122267
  Since Version: 4.x
 Status: Resolved  (was: Ready to Commit)
 Resolution: Fixed




[jira] [Comment Edited] (CASSANDRA-15272) Enhance & reenable RepairTest

2019-08-12 Thread Dinesh Joshi (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16904704#comment-16904704
 ] 

Dinesh Joshi edited comment on CASSANDRA-15272 at 8/12/19 5:24 PM:
---

Patch: 
https://github.com/apache/cassandra/compare/trunk...dineshjoshi:15272-trunk?expand=1
CircleCI Test Run: 
https://circleci.com/workflow-run/03f4cd80-e5b9-481e-8620-943de5d72707


was (Author: djoshi3):
Patch: 
https://github.com/apache/cassandra/compare/trunk...dineshjoshi:15272-trunk?expand=1
CircleCI Test Run: 
https://circleci.com/workflow-run/dcb04e6a-5f3f-4528-8f00-ec85324e63d0

> Enhance & reenable RepairTest
> -
>
> Key: CASSANDRA-15272
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15272
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Consistency/Repair
>Reporter: Dinesh Joshi
>Assignee: Dinesh Joshi
>Priority: Normal
>
> Currently the In-JVM RepairTest is not enabled on trunk (See for more info: 
> CASSANDRA-13938). This patch enables the In JVM RepairTest. It adds a new 
> test that tests the compression=off path for SSTables. It will help catch any 
> regressions in repair on this path. This does not fix the issue with the 
> compressed sstable streaming (CASSANDRA-13938). That should be addressed in 
> the original ticket.






[cassandra] 01/01: Merge branch 'cassandra-3.11' into trunk

2019-08-12 Thread mck
This is an automated email from the ASF dual-hosted git repository.

mck pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git

commit 4a27d482eeb5f94fef534de5ae332978a5a5dae7
Merge: 9a175a1 bb126c0
Author: Mick Semb Wever 
AuthorDate: Mon Aug 12 18:19:01 2019 +0200

Merge branch 'cassandra-3.11' into trunk

 CHANGES.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)






[cassandra] branch cassandra-3.11 updated (5d72cdd -> bb126c0)

2019-08-12 Thread mck
This is an automated email from the ASF dual-hosted git repository.

mck pushed a change to branch cassandra-3.11
in repository https://gitbox.apache.org/repos/asf/cassandra.git.


from 5d72cdd  Merge branch 'cassandra-3.0' into cassandra-3.11
 new 54aeb50  ninja fix CHANGES.txt for #14952
 new bb126c0  Merge branch 'cassandra-3.0' into cassandra-3.11

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 CHANGES.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)





[cassandra] branch cassandra-3.0 updated: ninja fix CHANGES.txt for #14952

2019-08-12 Thread mck
This is an automated email from the ASF dual-hosted git repository.

mck pushed a commit to branch cassandra-3.0
in repository https://gitbox.apache.org/repos/asf/cassandra.git


The following commit(s) were added to refs/heads/cassandra-3.0 by this push:
 new 54aeb50  ninja fix CHANGES.txt for #14952
54aeb50 is described below

commit 54aeb507593dd4e3d5b8db34bc9fa6164ba504bc
Author: Mick Semb Wever 
AuthorDate: Mon Aug 12 18:10:04 2019 +0200

ninja fix CHANGES.txt for #14952
---
 CHANGES.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/CHANGES.txt b/CHANGES.txt
index e4f4d22..41ddef6 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,5 +1,5 @@
 3.0.19
- * Fix NPE when using allocate_tokens_for_keyspace on new DC/rack 
(CASSANDRA-14592)
+ * Fix NPE when using allocate_tokens_for_keyspace on new DC/rack 
(CASSANDRA-14952)
  * Filter sstables earlier when running cleanup (CASSANDRA-15100)
  * Use mean row count instead of mean column count for index selectivity 
calculation (CASSANDRA-15259)
  * Avoid updating unchanged gossip states (CASSANDRA-15097)





[cassandra] 01/01: Merge branch 'cassandra-3.0' into cassandra-3.11

2019-08-12 Thread mck
This is an automated email from the ASF dual-hosted git repository.

mck pushed a commit to branch cassandra-3.11
in repository https://gitbox.apache.org/repos/asf/cassandra.git

commit bb126c038b23df8b5daaba64cd17079b7af999df
Merge: 5d72cdd 54aeb50
Author: Mick Semb Wever 
AuthorDate: Mon Aug 12 18:17:34 2019 +0200

Merge branch 'cassandra-3.0' into cassandra-3.11

 CHANGES.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --cc CHANGES.txt
index 5da96e6,41ddef6..617bcc4
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,10 -1,5 +1,10 @@@
 -3.0.19
 +3.11.5
 + * Make sure user defined compaction transactions are always closed 
(CASSANDRA-15123)
 + * Fix cassandra-env.sh to use $CASSANDRA_CONF to find cassandra-jaas.config 
(CASSANDRA-14305)
 + * Fixed nodetool cfstats printing index name twice (CASSANDRA-14903)
 + * Add flag to disable SASI indexes, and warnings on creation 
(CASSANDRA-14866)
 +Merged from 3.0:
-  * Fix NPE when using allocate_tokens_for_keyspace on new DC/rack 
(CASSANDRA-14592)
+  * Fix NPE when using allocate_tokens_for_keyspace on new DC/rack 
(CASSANDRA-14952)
   * Filter sstables earlier when running cleanup (CASSANDRA-15100)
   * Use mean row count instead of mean column count for index selectivity 
calculation (CASSANDRA-15259)
   * Avoid updating unchanged gossip states (CASSANDRA-15097)





[cassandra] branch trunk updated (9a175a1 -> 4a27d48)

2019-08-12 Thread mck
This is an automated email from the ASF dual-hosted git repository.

mck pushed a change to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git.


from 9a175a1  Improve readability of Table metrics Virtual tables units
 new 54aeb50  ninja fix CHANGES.txt for #14952
 new bb126c0  Merge branch 'cassandra-3.0' into cassandra-3.11
 new 4a27d48  Merge branch 'cassandra-3.11' into trunk

The 3 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 CHANGES.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)





[jira] [Commented] (CASSANDRA-15260) Add `allocate_tokens_for_dc_rf` yaml option for token allocation

2019-08-12 Thread mck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905336#comment-16905336
 ] 

mck commented on CASSANDRA-15260:
-

Thanks [~blambov]. The rename is done.


||branch||circleci||asf jenkins testall||
|[CASSANDRA-15260|https://github.com/thelastpickle/cassandra/commit/4513af58a532b91ab4449161a79e70f78b7ebcfc]|[circleci|https://circleci.com/gh/thelastpickle/workflows/cassandra/tree/mck%2Ftrunk__allocate_tokens_for_dc_rf]|[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-testall/43//badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-testall/43/]|

I've opened the ticket, and will transition it to 'Submit Patch' after I get 
some unit tests in.

> Add `allocate_tokens_for_dc_rf` yaml option for token allocation
> 
>
> Key: CASSANDRA-15260
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15260
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Config
>Reporter: mck
>Assignee: mck
>Priority: Normal
> Fix For: 4.x
>
>
> Similar to DSE's option: {{allocate_tokens_for_local_replication_factor}}
> Currently the 
> [ReplicationAwareTokenAllocator|https://www.datastax.com/dev/blog/token-allocation-algorithm]
>  requires a defined keyspace and a replication factor specified in the current 
> datacenter.
> This is problematic in a number of ways. The real keyspace cannot be used 
> when adding a new datacenter because, in practice, all of that datacenter's 
> nodes need to be up and running before it has the capacity to have data 
> replicated into it. Adding a new datacenter (or lifting and shifting a cluster 
> via datacenter migration) therefore has to be done using a dummy keyspace that 
> duplicates the replication strategy and factor of the real keyspace. This gets 
> even more difficult in version 4.0, as the replication factor cannot even be 
> defined for new datacenters before those datacenters are up and running. 
> These issues are avoided by skipping the keyspace definition and lookup, and 
> presuming the replication strategy is per datacenter, i.e. NTS 
> (NetworkTopologyStrategy). This can be done with an 
> {{allocate_tokens_for_dc_rf}} option.
> It may also be worth considering whether {{allocate_tokens_for_dc_rf=3}} 
> should become the default, as this is the replication factor for the vast 
> majority of datacenters in production. I suspect this would be a good 
> improvement over the existing randomly generated tokens algorithm.
> Initial patch is available in 
> [https://github.com/thelastpickle/cassandra/commit/fc4865b0399570e58f11215565ba17dc4a53da97]
> The patch does not remove the existing {{allocate_tokens_for_keyspace}} 
> option, as that provides the codebase for handling different replication 
> strategies.
>  
> fyi [~blambov] [~jay.zhuang] [~chovatia.jayd...@gmail.com] [~alokamvenki] 
> [~alexchueshev]



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-15260) Add `allocate_tokens_for_dc_rf` yaml option for token allocation

2019-08-12 Thread mck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905336#comment-16905336
 ] 

mck edited comment on CASSANDRA-15260 at 8/12/19 4:05 PM:
--

Thanks [~blambov]. The rename is done.


||branch||circleci||asf jenkins testall||
|[CASSANDRA-15260|https://github.com/thelastpickle/cassandra/commit/4513af58a532b91ab4449161a79e70f78b7ebcfc]|[circleci|https://circleci.com/gh/thelastpickle/workflows/cassandra/tree/mck%2Ftrunk__allocate_tokens_for_dc_rf]|[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-testall/43//badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-testall/43/]|

I've opened the ticket, and will 'Submit Patch' it after I get some unit tests 
in.


was (Author: michaelsembwever):
Thanks [~blambov]. The rename is done.


||branch||circleci||asf jenkins testall||
|[CASSANDRA-15260|https://github.com/thelastpickle/cassandra/commit/4513af58a532b91ab4449161a79e70f78b7ebcfc]|[circleci|https://circleci.com/gh/thelastpickle/workflows/cassandra/tree/mck%2Ftrunk__allocate_tokens_for_dc_rf]|[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-testall/43//badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-testall/43/]|

I've opened the ticket, and will transition it to 'Submit Patch' after I get 
some unit tests in.







[jira] [Updated] (CASSANDRA-15260) Add `allocate_tokens_for_dc_rf` yaml option for token allocation

2019-08-12 Thread mck (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck updated CASSANDRA-15260:

     Complexity: Low Hanging Fruit
Change Category: Operability
         Status: Open  (was: Triage Needed)







[cassandra] branch trunk updated: Improve readability of Table metrics Virtual tables units

2019-08-12 Thread clohfink
This is an automated email from the ASF dual-hosted git repository.

clohfink pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git


The following commit(s) were added to refs/heads/trunk by this push:
 new 9a175a1  Improve readability of Table metrics Virtual tables units
9a175a1 is described below

commit 9a175a1697b1107fb63480fb86ffe37b02122267
Author: Chris Lohfink 
AuthorDate: Thu Aug 8 12:43:18 2019 -0700

Improve readability of Table metrics Virtual tables units

Patch by Chris Lohfink; reviewed by Jon Haddad and Benedict Elliott Smith 
for CASSANDRA-15194
---
 CHANGES.txt|   1 +
 .../cassandra/db/virtual/AbstractVirtualTable.java |   2 +-
 .../apache/cassandra/db/virtual/SimpleDataSet.java |   7 +
 .../cassandra/db/virtual/TableMetricTables.java| 246 ++---
 4 files changed, 180 insertions(+), 76 deletions(-)

diff --git a/CHANGES.txt b/CHANGES.txt
index fb246ff..389569b 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 4.0
+ * Improve readability of Table metrics Virtual tables units (CASSANDRA-15194)
  * Fix error with non-existent table for nodetool tablehistograms (CASSANDRA-14410)
  * Catch non-IOException in FileUtils.close to make sure that all resources are closed (CASSANDRA-15225)
  * Align load column in nodetool status output (CASSANDRA-14787)
diff --git a/src/java/org/apache/cassandra/db/virtual/AbstractVirtualTable.java b/src/java/org/apache/cassandra/db/virtual/AbstractVirtualTable.java
index 2998b77..6c49b9a 100644
--- a/src/java/org/apache/cassandra/db/virtual/AbstractVirtualTable.java
+++ b/src/java/org/apache/cassandra/db/virtual/AbstractVirtualTable.java
@@ -42,7 +42,7 @@ import org.apache.cassandra.schema.TableMetadata;
  */
 public abstract class AbstractVirtualTable implements VirtualTable
 {
-private final TableMetadata metadata;
+protected final TableMetadata metadata;
 
 protected AbstractVirtualTable(TableMetadata metadata)
 {
diff --git a/src/java/org/apache/cassandra/db/virtual/SimpleDataSet.java b/src/java/org/apache/cassandra/db/virtual/SimpleDataSet.java
index bf40140..6cead97 100644
--- a/src/java/org/apache/cassandra/db/virtual/SimpleDataSet.java
+++ b/src/java/org/apache/cassandra/db/virtual/SimpleDataSet.java
@@ -73,6 +73,8 @@ public class SimpleDataSet extends AbstractVirtualTable.AbstractDataSet
 {
 if (null == currentRow)
 throw new IllegalStateException();
+if (null == value || columnName == null)
+throw new IllegalStateException(String.format("Invalid column: %s=%s for %s", columnName, value, currentRow));
 currentRow.add(columnName, value);
 return this;
 }
@@ -181,6 +183,11 @@ public class SimpleDataSet extends AbstractVirtualTable.AbstractDataSet
 
 return builder.build();
 }
+
+public String toString()
+{
+return "Row[...:" + clustering.toString(metadata)+']';
+}
 }
 
 @SuppressWarnings("unchecked")
diff --git a/src/java/org/apache/cassandra/db/virtual/TableMetricTables.java b/src/java/org/apache/cassandra/db/virtual/TableMetricTables.java
index acae2d0..4a043ad 100644
--- a/src/java/org/apache/cassandra/db/virtual/TableMetricTables.java
+++ b/src/java/org/apache/cassandra/db/virtual/TableMetricTables.java
@@ -18,29 +18,25 @@
 
 package org.apache.cassandra.db.virtual;
 
+import java.math.BigDecimal;
 import java.util.Collection;
 import java.util.function.Function;
 
-import com.google.common.base.Preconditions;
 import com.google.common.collect.ImmutableList;
+import org.apache.commons.math3.util.Precision;
 
-import com.codahale.metrics.Counter;
 import com.codahale.metrics.Counting;
 import com.codahale.metrics.Gauge;
-import com.codahale.metrics.Histogram;
 import com.codahale.metrics.Metered;
 import com.codahale.metrics.Metric;
 import com.codahale.metrics.Sampling;
 import com.codahale.metrics.Snapshot;
-import com.codahale.metrics.Timer;
 import org.apache.cassandra.db.ColumnFamilyStore;
 import org.apache.cassandra.db.Keyspace;
 import org.apache.cassandra.db.marshal.AbstractType;
 import org.apache.cassandra.db.marshal.CompositeType;
 import org.apache.cassandra.db.marshal.DoubleType;
-import org.apache.cassandra.db.marshal.Int32Type;
 import org.apache.cassandra.db.marshal.LongType;
-import org.apache.cassandra.db.marshal.ReversedType;
 import org.apache.cassandra.db.marshal.UTF8Type;
 import org.apache.cassandra.dht.IPartitioner;
 import org.apache.cassandra.dht.LocalPartitioner;
@@ -55,13 +51,14 @@ public class TableMetricTables
 {
 private final static String KEYSPACE_NAME = "keyspace_name";
 private final static String TABLE_NAME = "table_name";
-private final static String MEDIAN = "median";
+private final static String P50 = "50th";
 private final static String P99 = "99th";
 private final static String MAX = "max";
 private final 

[jira] [Commented] (CASSANDRA-15194) Improve readability of Table metrics Virtual tables units

2019-08-12 Thread Benedict (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905230#comment-16905230
 ] 

Benedict commented on CASSANDRA-15194:
--

bq. While we're here, should we consider renaming median to 50th, so it sorts 
correctly wrt 99th? For consistency I'd love to see 100th, but this would mess 
with order. It might be clearer to name them p50, p99, though, so we can also 
introduce p999 and maintain sort order.

I realise this was really unclear, but I ended up suggesting p50, p99, etc. as 
names, so that if we introduce p999 it makes sense (though I guess we could 
always call it 99.9th, and this should still sort correctly).

Not essential, just making sure my lack of clarity wasn't obscuring the 
discussion.

LGTM, +1 either way

> Improve readability of Table metrics Virtual tables units
> -
>
> Key: CASSANDRA-15194
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15194
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Virtual Tables
>Reporter: Jon Haddad
>Assignee: Chris Lohfink
>Priority: Normal
> Fix For: 4.0
>
>
> I just noticed this strange output from coordinator_reads:
> {code}
> cqlsh:system_views> select * from coordinator_reads ;
>  count | keyspace_name |      table_name | 99th | max | median | per_second
> -------+---------------+-----------------+------+-----+--------+------------
>   7573 |    tlp_stress |        keyvalue |    0 |   0 |      0 | 2.2375e-16
>   6076 |    tlp_stress |   random_access |    0 |   0 |      0 | 7.4126e-12
>    390 |    tlp_stress | sensor_data_udt |    0 |   0 |      0 | 1.7721e-64
>     30 |        system |           local |    0 |   0 |      0 |   0.006406
>     11 | system_schema |         columns |    0 |   0 |      0 | 1.1192e-16
>     11 | system_schema |         indexes |    0 |   0 |      0 | 1.1192e-16
>     11 | system_schema |          tables |    0 |   0 |      0 | 1.1192e-16
>     11 | system_schema |           views |    0 |   0 |      0 | 1.1192e-16
> {code}
> cc [~cnlwsu]
> btw I realize the output is technically correct, but it's not very readable.  
> For practical purposes this should just say 0.
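
As a rough illustration of the readability problem described in the ticket, the Python sketch below rounds a decaying per-second rate to a fixed number of decimals, so that values such as 2.2375e-16 display as 0 while meaningful rates survive. It is not the code from the patch; the helper name `readable_rate` and the three-decimal precision are assumptions made for the example.

```python
def readable_rate(value, decimals=3):
    """Round a per-second rate so vanishingly small values display as 0."""
    rounded = round(value, decimals)
    # Normalise -0.0 to 0.0 so the output never shows a negative zero.
    return rounded if rounded != 0 else 0.0


# Sample values taken from the per_second column quoted in the ticket.
for rate in (2.2375e-16, 7.4126e-12, 1.7721e-64, 0.006406):
    print(readable_rate(rate))
```

With three decimals, the first three sample values collapse to 0.0 and the last becomes 0.006.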






[jira] [Updated] (CASSANDRA-15232) Arithmetic operators over decimal truncate results

2019-08-12 Thread Benedict (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-15232:
-
Status: Ready to Commit  (was: Review In Progress)

Thanks [~Override], the patch looks great to me.  I'm just running it through 
our CI [here|https://circleci.com/gh/belliottsmith/cassandra/2796] before 
committing.

> Arithmetic operators over decimal truncate results
> --
>
> Key: CASSANDRA-15232
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15232
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL/Semantics
>Reporter: Benedict
>Assignee: Liudmila Kornilova
>Priority: Normal
>  Labels: pull-request-available
> Fix For: 4.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> The decimal operators hard-code a 128-bit precision for their computations.  
> Probably a precision needs to be configured or decided somehow, but it’s not 
> clear why 128-bit was chosen.  Particularly for multiplication and addition, 
> it’s very unclear why we truncate, which is different from our behaviour for 
> e.g. sum() aggregates.  Probably for division we should also ensure that we 
> do not reduce the precision of the two operands.  A minimum of decimal128 
> seems reasonable, but a maximum does not.
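
The truncation described in the ticket can be reproduced outside Cassandra. The Python sketch below uses the standard `decimal` module, with a 34-digit context standing in for decimal128 and a 100-digit context standing in for "enough precision"; it is an analogy for the behaviour, not Cassandra's implementation, and the operand values are made up for the example.

```python
from decimal import Context, Decimal

# decimal128 carries 34 significant digits; 100 digits is ample for this product.
decimal128 = Context(prec=34)
wide = Context(prec=100)

a = Decimal("1234567890123456789012345.12345")  # 30 significant digits
b = Decimal("1000000.000001")                   # 13 significant digits

exact = wide.multiply(a, b)            # keeps every digit of the product
truncated = decimal128.multiply(a, b)  # rounded to 34 significant digits

print(exact)
print(truncated)
print(exact == truncated)  # False: digits were lost under the fixed precision
```

Neither operand exceeds 34 digits, yet their product does, which is why a fixed maximum precision loses information for multiplication.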






[jira] [Updated] (CASSANDRA-15172) LegacyLayout RangeTombstoneList throws IndexOutOfBoundsException

2019-08-12 Thread Benedict (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-15172:
-
Test and Documentation Plan: unit test included
                     Status: Patch Available  (was: Open)

This bug appears to be similar to CASSANDRA-15263, in that a reverse query with 
the RTBoundCloser is the likely source of asymmetric range tombstone bounds.  
However, in this case the problem is much easier to solve: we simply must not 
assume the bounds have the same length.

I have pushed a patch 
[here|https://github.com/belliottsmith/cassandra/tree/15172-3.0]
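
The idea of not assuming equal-length bounds can be sketched as follows in Python. This is only an illustration of the failure mode and the general remedy; the function names and the list-of-bytes representation of a bound are hypothetical and are not taken from `LegacyLayout` or from the patch.

```python
def unsafe_size(start, end):
    """Buggy pattern: walks start's components but indexes into end as well."""
    total = 0
    for i in range(len(start)):
        total += len(start[i]) + len(end[i])  # IndexError if end is shorter
    return total


def safe_size(start, end):
    """Size each bound independently, using its own component count."""
    return sum(len(c) for c in start) + sum(len(c) for c in end)


start = [b"pk", b"ck1"]  # two clustering components
end = [b"pk"]            # one component: an asymmetric bound

print(safe_size(start, end))  # 7

try:
    unsafe_size(start, end)
except IndexError as exc:
    print("unsafe_size failed:", exc)
```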

> LegacyLayout RangeTombstoneList throws IndexOutOfBoundsException
> 
>
> Key: CASSANDRA-15172
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15172
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Other
>Reporter: Shalom
>Assignee: Benedict
>Priority: Normal
>
> Hi All,
> This is the first time I open an issue, so apologies if I'm not following the 
> rules properly.
>  
> After upgrading a node from version 2.1.21 to 3.11.4, we've started seeing a 
> lot of AbstractLocalAwareExecutorService exceptions. This happened right 
> after the node successfully started up with the new 3.11.4 binaries. 
> INFO  [main] 2019-06-05 04:41:37,730 Gossiper.java:1715 - No gossip backlog; 
> proceeding
> INFO  [main] 2019-06-05 04:41:38,036 NativeTransportService.java:70 - Netty 
> using native Epoll event loop
> INFO  [main] 2019-06-05 04:41:38,117 Server.java:155 - Using Netty Version: 
> [netty-buffer=netty-buffer-4.0.44.Final.452812a, 
> netty-codec=netty-codec-4.0.44.Final.452812a, 
> netty-codec-haproxy=netty-codec-haproxy-4.0.44.Final.452812a, 
> netty-codec-http=netty-codec-http-4.0.44.Final.452812a, 
> netty-codec-socks=netty-codec-socks-4.0.44.Final.452812a, 
> netty-common=netty-common-4.0.44.Final.452812a, 
> netty-handler=netty-handler-4.0.44.Final.452812a, 
> netty-tcnative=netty-tcnative-1.1.33.Fork26.142ecbb, 
> netty-transport=netty-transport-4.0.44.Final.452812a, 
> netty-transport-native-epoll=netty-transport-native-epoll-4.0.44.Final.452812a,
>  netty-transport-rxtx=netty-transport-rxtx-4.0.44.Final.452812a, 
> netty-transport-sctp=netty-transport-sctp-4.0.44.Final.452812a, 
> netty-transport-udt=netty-transport-udt-4.0.44.Final.452812a]
> INFO  [main] 2019-06-05 04:41:38,118 Server.java:156 - Starting listening for 
> CQL clients on /0.0.0.0:9042 (unencrypted)...
> INFO  [main] 2019-06-05 04:41:38,179 CassandraDaemon.java:556 - Not starting 
> RPC server as requested. Use JMX (StorageService->startRPCServer()) or 
> nodetool (enablethrift) to start it
> INFO  [Native-Transport-Requests-21] 2019-06-05 04:41:39,145 
> AuthCache.java:161 - (Re)initializing PermissionsCache (validity 
> period/update interval/max entries) (2000/2000/1000)
> INFO  [OptionalTasks:1] 2019-06-05 04:41:39,729 CassandraAuthorizer.java:409 
> - Converting legacy permissions data
> INFO  [HANDSHAKE-/10.10.10.8] 2019-06-05 04:41:39,808 
> OutboundTcpConnection.java:561 - Handshaking version with /10.10.10.8
> INFO  [HANDSHAKE-/10.10.10.9] 2019-06-05 04:41:39,808 
> OutboundTcpConnection.java:561 - Handshaking version with /10.10.10.9
> INFO  [HANDSHAKE-dc1_02/10.10.10.6] 2019-06-05 04:41:39,809 
> OutboundTcpConnection.java:561 - Handshaking version with dc1_02/10.10.10.6
> WARN  [ReadStage-2] 2019-06-05 04:41:39,857 
> AbstractLocalAwareExecutorService.java:167 - Uncaught exception on thread 
> Thread[ReadStage-2,5,main]: {}
> java.lang.ArrayIndexOutOfBoundsException: 1
>     at 
> org.apache.cassandra.db.AbstractBufferClusteringPrefix.get(AbstractBufferClusteringPrefix.java:55)
>     at 
> org.apache.cassandra.db.LegacyLayout$LegacyRangeTombstoneList.serializedSizeCompound(LegacyLayout.java:2545)
>     at 
> org.apache.cassandra.db.LegacyLayout$LegacyRangeTombstoneList.serializedSize(LegacyLayout.java:2522)
>     at 
> org.apache.cassandra.db.LegacyLayout.serializedSizeAsLegacyPartition(LegacyLayout.java:565)
>     at 
> org.apache.cassandra.db.ReadResponse$Serializer.serializedSize(ReadResponse.java:446)
>     at 
> org.apache.cassandra.db.ReadResponse$Serializer.serializedSize(ReadResponse.java:352)
>     at 
> org.apache.cassandra.net.MessageOut.payloadSize(MessageOut.java:171)
>     at 
> org.apache.cassandra.net.OutboundTcpConnectionPool.getConnection(OutboundTcpConnectionPool.java:77)
>     at 
> org.apache.cassandra.net.MessagingService.getConnection(MessagingService.java:802)
>     at 
> org.apache.cassandra.net.MessagingService.sendOneWay(MessagingService.java:953)
>     at 
> org.apache.cassandra.net.MessagingService.sendReply(MessagingService.java:929)
>     at 
> 

[jira] [Created] (CASSANDRA-15274) Multiple Corrupt datafiles across entire environment

2019-08-12 Thread Phil O Conduin (JIRA)
Phil O Conduin created CASSANDRA-15274:
--

 Summary: Multiple Corrupt datafiles across entire environment 
 Key: CASSANDRA-15274
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15274
 Project: Cassandra
  Issue Type: Bug
  Components: Local/Compaction
Reporter: Phil O Conduin


Cassandra Version: 2.2.13

PRE-PROD environment.
 * 2 datacenters.
 * 9 physical servers in each datacenter - (_Cisco UCS C220 M4 SFF_)
 * 4 Cassandra instances on each server (cass_a, cass_b, cass_c, cass_d)
 * 72 Cassandra instances across the 2 data centres, 36 in site A, 36 in site B.

We also have 2 Reaper nodes that we use for repair: one in each datacenter, 
each running with its own Cassandra back end, and the two back ends clustered 
together.

OS Details [Red Hat Linux]
cass_a@x 0 10:53:01 ~ $ uname -a
Linux x 3.10.0-957.5.1.el7.x86_64 #1 SMP Wed Dec 19 10:46:58 EST 2018 x86_64 
x86_64 x86_64 GNU/Linux

cass_a@x 0 10:57:31 ~ $ cat /etc/*release
NAME="Red Hat Enterprise Linux Server"
VERSION="7.6 (Maipo)"
ID="rhel"

Storage Layout 
cass_a@xx 0 10:46:28 ~ $ df -h
Filesystem                         Size  Used Avail Use% Mounted on
/dev/mapper/vg01-lv_root            20G  2.2G   18G  11% /
devtmpfs                            63G     0   63G   0% /dev
tmpfs                               63G     0   63G   0% /dev/shm
tmpfs                               63G  4.1G   59G   7% /run
tmpfs                               63G     0   63G   0% /sys/fs/cgroup
>> 4 cassandra instances
/dev/sdd                           1.5T  802G  688G  54% /data/ssd4
/dev/sda                           1.5T  798G  692G  54% /data/ssd1
/dev/sdb                           1.5T  681G  810G  46% /data/ssd2
/dev/sdc                           1.5T  558G  932G  38% /data/ssd3

Cassandra load is about 200GB; the rest of the used space is snapshots.

CPU
cass_a@x 127 10:58:47 ~ $ lscpu | grep -E '^Thread|^Core|^Socket|^CPU\('
CPU(s):                64
Thread(s) per core:    2
Core(s) per socket:    16
Socket(s):             2

*Description of problem:*
During repair of the cluster, we are seeing multiple corruptions reported in 
the log files on a lot of instances.  There seems to be no pattern to the 
corruption.  It seems that the repair job is finding all the corrupted files 
for us.  The repair will hang on the node where the corrupted file is found.  
To fix this we remove/rename the datafile and bounce the Cassandra instance.  
Our hardware/OS team have stated there is no problem on their side.  I do not 
believe it is the repair causing the corruption. 

 

So let me give you an example of a corrupted file and maybe someone might be 
able to work through it with me?

When this corrupted file was reported in the log it looks like it was the 
repair that found it.

$ journalctl -u cassmeta-cass_b.service --since "2019-08-07 22:25:00" --until 
"2019-08-07 22:45:00"

Aug 07 22:30:33 cassandra[34611]: INFO  21:30:33 Writing 
Memtable-compactions_in_progress@830377457(0.008KiB serialized bytes, 1 ops, 
0%/0% of on/off-heap limit)
Aug 07 22:30:33 cassandra[34611]: ERROR 21:30:33 Failed creating a merkle tree 
for [repair #9587a200-b95a-11e9-8920-9f72868b8375 on KeyspaceMetadata/x, 
(-1476350953672479093,-1474461
Aug 07 22:30:33 cassandra[34611]: ERROR 21:30:33 Exception in thread 
Thread[ValidationExecutor:825,1,main]
Aug 07 22:30:33 cassandra[34611]: org.apache.cassandra.io.FSReadError: 
org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: 
/x/ssd2/data/KeyspaceMetadata/x-1e453cb0
Aug 07 22:30:33 cassandra[34611]: at 
org.apache.cassandra.io.util.RandomAccessReader.readBytes(RandomAccessReader.java:365)
 ~[apache-cassandra-2.2.13.jar:2.2.13]
Aug 07 22:30:33 cassandra[34611]: at 
org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:361) 
~[apache-cassandra-2.2.13.jar:2.2.13]
Aug 07 22:30:33 cassandra[34611]: at 
org.apache.cassandra.utils.ByteBufferUtil.readWithShortLength(ByteBufferUtil.java:340)
 ~[apache-cassandra-2.2.13.jar:2.2.13]
Aug 07 22:30:33 cassandra[34611]: at 
org.apache.cassandra.db.composites.AbstractCType$Serializer.deserialize(AbstractCType.java:382)
 ~[apache-cassandra-2.2.13.jar:2.2.13]
Aug 07 22:30:33 cassandra[34611]: at 
org.apache.cassandra.db.composites.AbstractCType$Serializer.deserialize(AbstractCType.java:366)
 ~[apache-cassandra-2.2.13.jar:2.2.13]
Aug 07 22:30:33 cassandra[34611]: at 
org.apache.cassandra.db.OnDiskAtom$Serializer.deserializeFromSSTable(OnDiskAtom.java:81)
 ~[apache-cassandra-2.2.13.jar:2.2.13]
Aug 07 22:30:33 cassandra[34611]: at 
org.apache.cassandra.db.AbstractCell$1.computeNext(AbstractCell.java:52) 
~[apache-cassandra-2.2.13.jar:2.2.13]
Aug 07 22:30:33 cassandra[34611]: at 
org.apache.cassandra.db.AbstractCell$1.computeNext(AbstractCell.java:46) 
~[apache-cassandra-2.2.13.jar:2.2.13]
Aug 07 22:30:33 cassandra[34611]: at 

[jira] [Updated] (CASSANDRA-15172) LegacyLayout RangeTombstoneList throws IndexOutOfBoundsException

2019-08-12 Thread Benedict (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-15172:
-
     Severity: Normal
   Complexity: Low Hanging Fruit
Discovered By: User Report
 Bug Category: Parent values: Availability(12983) Level 1 values: Response Crash(12991)
  Component/s: Local/Other
       Status: Open  (was: Triage Needed)

> LegacyLayout RangeTombstoneList throws IndexOutOfBoundsException
> 
>
> Key: CASSANDRA-15172
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15172
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Other
>Reporter: Shalom
>Assignee: Benedict
>Priority: Normal
>
> Hi All,
> This is the first time I open an issue, so apologies if I'm not following the 
> rules properly.
>  
> After upgrading a node from version 2.1.21 to 3.11.4, we've started seeing a 
> lot of AbstractLocalAwareExecutorService exceptions. This happened right 
> after the node successfully started up with the new 3.11.4 binaries. 
> INFO  [main] 2019-06-05 04:41:37,730 Gossiper.java:1715 - No gossip backlog; 
> proceeding
> INFO  [main] 2019-06-05 04:41:38,036 NativeTransportService.java:70 - Netty 
> using native Epoll event loop
> INFO  [main] 2019-06-05 04:41:38,117 Server.java:155 - Using Netty Version: 
> [netty-buffer=netty-buffer-4.0.44.Final.452812a, 
> netty-codec=netty-codec-4.0.44.Final.452812a, 
> netty-codec-haproxy=netty-codec-haproxy-4.0.44.Final.452812a, 
> netty-codec-http=netty-codec-http-4.0.44.Final.452812a, 
> netty-codec-socks=netty-codec-socks-4.0.44.Final.452812a, 
> netty-common=netty-common-4.0.44.Final.452812a, 
> netty-handler=netty-handler-4.0.44.Final.452812a, 
> netty-tcnative=netty-tcnative-1.1.33.Fork26.142ecbb, 
> netty-transport=netty-transport-4.0.44.Final.452812a, 
> netty-transport-native-epoll=netty-transport-native-epoll-4.0.44.Final.452812a,
>  netty-transport-rxtx=netty-transport-rxtx-4.0.44.Final.452812a, 
> netty-transport-sctp=netty-transport-sctp-4.0.44.Final.452812a, 
> netty-transport-udt=netty-transport-udt-4.0.44.Final.452812a]
> INFO  [main] 2019-06-05 04:41:38,118 Server.java:156 - Starting listening for 
> CQL clients on /0.0.0.0:9042 (unencrypted)...
> INFO  [main] 2019-06-05 04:41:38,179 CassandraDaemon.java:556 - Not starting 
> RPC server as requested. Use JMX (StorageService->startRPCServer()) or 
> nodetool (enablethrift) to start it
> INFO  [Native-Transport-Requests-21] 2019-06-05 04:41:39,145 
> AuthCache.java:161 - (Re)initializing PermissionsCache (validity 
> period/update interval/max entries) (2000/2000/1000)
> INFO  [OptionalTasks:1] 2019-06-05 04:41:39,729 CassandraAuthorizer.java:409 
> - Converting legacy permissions data
> INFO  [HANDSHAKE-/10.10.10.8] 2019-06-05 04:41:39,808 
> OutboundTcpConnection.java:561 - Handshaking version with /10.10.10.8
> INFO  [HANDSHAKE-/10.10.10.9] 2019-06-05 04:41:39,808 
> OutboundTcpConnection.java:561 - Handshaking version with /10.10.10.9
> INFO  [HANDSHAKE-dc1_02/10.10.10.6] 2019-06-05 04:41:39,809 
> OutboundTcpConnection.java:561 - Handshaking version with dc1_02/10.10.10.6
> WARN  [ReadStage-2] 2019-06-05 04:41:39,857 
> AbstractLocalAwareExecutorService.java:167 - Uncaught exception on thread 
> Thread[ReadStage-2,5,main]: {}
> java.lang.ArrayIndexOutOfBoundsException: 1
>     at 
> org.apache.cassandra.db.AbstractBufferClusteringPrefix.get(AbstractBufferClusteringPrefix.java:55)
>     at 
> org.apache.cassandra.db.LegacyLayout$LegacyRangeTombstoneList.serializedSizeCompound(LegacyLayout.java:2545)
>     at 
> org.apache.cassandra.db.LegacyLayout$LegacyRangeTombstoneList.serializedSize(LegacyLayout.java:2522)
>     at 
> org.apache.cassandra.db.LegacyLayout.serializedSizeAsLegacyPartition(LegacyLayout.java:565)
>     at 
> org.apache.cassandra.db.ReadResponse$Serializer.serializedSize(ReadResponse.java:446)
>     at 
> org.apache.cassandra.db.ReadResponse$Serializer.serializedSize(ReadResponse.java:352)
>     at 
> org.apache.cassandra.net.MessageOut.payloadSize(MessageOut.java:171)
>     at 
> org.apache.cassandra.net.OutboundTcpConnectionPool.getConnection(OutboundTcpConnectionPool.java:77)
>     at 
> org.apache.cassandra.net.MessagingService.getConnection(MessagingService.java:802)
>     at 
> org.apache.cassandra.net.MessagingService.sendOneWay(MessagingService.java:953)
>     at 
> org.apache.cassandra.net.MessagingService.sendReply(MessagingService.java:929)
>     at 
> org.apache.cassandra.db.ReadCommandVerbHandler.doVerb(ReadCommandVerbHandler.java:62)
>     at 
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:66)
>     at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     

[jira] [Commented] (CASSANDRA-15230) Resizing window aborts cqlsh COPY: Interrupted system call

2019-08-12 Thread JIRA


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905096#comment-16905096
 ] 

Johannes Weißl commented on CASSANDRA-15230:


I am unassigning myself; maybe my being assigned is the reason why nobody has 
commented on this yet?

> Resizing window aborts cqlsh COPY: Interrupted system call
> --
>
> Key: CASSANDRA-15230
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15230
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/cqlsh
>Reporter: Johannes Weißl
>Priority: Normal
> Attachments: 15230-2.1.txt
>
>
> When resizing a terminal window running cqlsh COPY, the Python program aborts 
> immediately with:
> {{:1:(4, 'Interrupted system call')}}
> This is very annoying, as COPY commands usually run for a long time, and e.g. 
> re-attaching to a screen session with a different terminal size aborts the 
> command. This bug affects versions 2.1, 2.2, 3.0, 3.x, and trunk.
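
For context, this is the classic EINTR case: a signal (here SIGWINCH, sent on terminal resize) interrupts a blocking system call. A common remedy, sketched below in Python, is to retry the call whenever `errno` is `EINTR`. This is a general pattern, not the attached patch, and `retry_on_eintr` is a hypothetical helper name. Python 3.5 and later already retry most system calls automatically (PEP 475), so a wrapper like this mainly matters for the Python 2 interpreters cqlsh ran on at the time.

```python
import errno


def retry_on_eintr(func, *args, **kwargs):
    """Call func, retrying for as long as it fails with EINTR."""
    while True:
        try:
            return func(*args, **kwargs)
        except (OSError, IOError) as exc:
            # Only swallow "Interrupted system call"; re-raise anything else.
            if exc.errno != errno.EINTR:
                raise


# Simulated blocking call that is interrupted twice before succeeding.
calls = {"n": 0}


def flaky_read():
    calls["n"] += 1
    if calls["n"] < 3:
        raise OSError(errno.EINTR, "Interrupted system call")
    return "done"


print(retry_on_eintr(flaky_read))
```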






[jira] [Assigned] (CASSANDRA-15230) Resizing window aborts cqlsh COPY: Interrupted system call

2019-08-12 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Johannes Weißl reassigned CASSANDRA-15230:
--

Assignee: (was: Johannes Weißl)







[jira] [Commented] (CASSANDRA-15260) Add `allocate_tokens_for_dc_rf` yaml option for token allocation

2019-08-12 Thread Branimir Lambov (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905041#comment-16905041
 ] 

Branimir Lambov commented on CASSANDRA-15260:
-

The code looks good to me. As mentioned, it would be good to change the name to 
match DSE's.

 

On making algorithmic allocation the default, let's continue the discussion on 
CASSANDRA-13701 once this is committed.

> Add `allocate_tokens_for_dc_rf` yaml option for token allocation
> 
>
> Key: CASSANDRA-15260
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15260
> Project: Cassandra
> Issue Type: Improvement
> Components: Local/Config
> Reporter: mck
> Assignee: mck
> Priority: Normal
> Fix For: 4.x
>
>
> Similar to DSE's option: {{allocate_tokens_for_local_replication_factor}}
> Currently the 
> [ReplicationAwareTokenAllocator|https://www.datastax.com/dev/blog/token-allocation-algorithm]
>  requires a defined keyspace and a replication factor specified in the 
> current datacenter.
> This is problematic in a number of ways. The real keyspace cannot be used 
> when adding new datacenters as, in practice, all of a new datacenter's nodes 
> need to be up and running before it has the capacity to receive replicated 
> data. Adding new datacenters (or lift-and-shifting a cluster via datacenter 
> migration) therefore has to be done using a dummy keyspace that duplicates 
> the replication strategy and factor of the real keyspace. This gets even more 
> difficult come version 4.0, as the replication factor cannot even be defined 
> in new datacenters before those datacenters are up and running. 
> These issues are removed by avoiding the keyspace definition and lookup, and 
> presuming the replication strategy is by datacenter, i.e. 
> NetworkTopologyStrategy (NTS). This can be done with the use of an 
> {{allocate_tokens_for_dc_rf}} option.
> It may also be worth considering whether {{allocate_tokens_for_dc_rf=3}} 
> should become the default, as this is the replication factor for the vast 
> majority of datacenters in production. I suspect this would be a good 
> improvement over the existing randomly generated tokens algorithm.
> Initial patch is available in 
> [https://github.com/thelastpickle/cassandra/commit/fc4865b0399570e58f11215565ba17dc4a53da97]
> The patch does not remove the existing {{allocate_tokens_for_keyspace}} 
> option, as that provides the codebase for handling different replication 
> strategies.
>  
> fyi [~blambov] [~jay.zhuang] [~chovatia.jayd...@gmail.com] [~alokamvenki] 
> [~alexchueshev]






[jira] [Updated] (CASSANDRA-15263) LegacyLayout RangeTombstoneList throws java.lang.NullPointerException: null

2019-08-12 Thread Benedict (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-15263:
-
     Severity: Normal
   Complexity: Challenging
Discovered By: User Report
 Bug Category: Parent values: Availability(12983) Level 1 values: Response Crash(12991)
       Status: Open  (was: Triage Needed)

> LegacyLayout RangeTombstoneList throws java.lang.NullPointerException: null
> ---
>
> Key: CASSANDRA-15263
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15263
> Project: Cassandra
> Issue Type: Bug
> Components: Cluster/Schema
> Reporter: feroz shaik
> Assignee: Benedict
> Priority: Normal
> Labels: 2.1.16, 3.11.4
> Attachments: sample.system.log, schema.txt, 
> sstabledump_sal_purge_d03.json, sstablemetadata_sal_purge_d03, 
> stack_trace.txt, system.log, system.log, system.log, system.log, 
> system_latest.log
>
>
> We hit a problem today while upgrading from 2.1.16 to 3.11.4.
> We encountered this as soon as the first node started up with 3.11.4. 
> The full error stack is attached - [^stack_trace.txt] 
>  
> The errors below continued in the log file for as long as the process was up.
> ERROR [Native-Transport-Requests-12] 2019-08-06 03:00:47,135 
> ErrorMessage.java:384 - Unexpected exception during request
>  java.lang.NullPointerException: null
>  ERROR [Native-Transport-Requests-8] 2019-08-06 03:00:48,778 
> ErrorMessage.java:384 - Unexpected exception during request
>  java.lang.NullPointerException: null
>  ERROR [Native-Transport-Requests-13] 2019-08-06 03:00:57,454 
>  
> The nodetool version says 3.11.4, and the number of connections on native 
> port 9042 was similar to other nodes. The exceptions were alarming enough 
> that we had to call off the change. Any help and insights into this problem 
> from the community are appreciated.






[jira] [Commented] (CASSANDRA-15263) LegacyLayout RangeTombstoneList throws java.lang.NullPointerException: null

2019-08-12 Thread Benedict (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16904954#comment-16904954
 ] 

Benedict commented on CASSANDRA-15263:
--

Thanks [~ferozshaik...@gmail.com].  I can see what's happening now, and it 
looks benign.  It should resolve when you finish upgrading the nodes in your 
cluster.

The error is caused by the rare scenario of rows not using all of the 
declared clustering columns, which inserts a {{null}} clustering value for 
{{column2}}.  The author of the legacy converter did not expect that a range 
tombstone (RT) clustering component could be {{null}}, and ordinarily they 
would have been correct, as row deletions are no longer stored as range 
tombstones in 3.0.  However, synthetic range tombstone bounds can be built 
from row clusterings, and since the row has a null component, the synthetic 
RT does too.  

The fix for 3.0 would be simple, namely to ignore the {{null}} value when 
computing a digest.  However, it looks like this {{null}} is also incompatible 
with 2.1, since it could never legitimately arise there without the new 
machinery of 3.0 that synthesises them.  So sending this synthetic clustering 
to a 2.1 node could be more harmful than throwing this exception.

I will have to think about the best way to address this in 3.0 without 
adversely impacting a 2.1 node.
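
To illustrate what "ignore the null value when computing a digest" could look like, here is a rough sketch in Python rather than Cassandra's Java. The function name clustering_digest, the use of MD5, and the byte-string inputs are assumptions made only for this example; it is not the project's code, and as noted above it would not by itself address the 2.1 compatibility concern.

```python
import hashlib


def clustering_digest(components):
    """Hash a sequence of clustering components, skipping None entries.

    Hypothetical illustration: components are bytes or None, and MD5 is
    used only as a stand-in hash function.
    """
    digest = hashlib.md5()
    for value in components:
        if value is None:
            # A missing clustering component contributes nothing to the
            # hash instead of triggering an error on dereference.
            continue
        digest.update(value)
    return digest.hexdigest()
```

With this sketch, a clustering of (b"a", None) hashes the same as (b"a",), which is the behaviour the one-line description of the fix implies.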







[jira] [Created] (CASSANDRA-15273) cassandra does not start with new systemd version

2019-08-12 Thread Aleksandr Yatskin (JIRA)
Aleksandr Yatskin created CASSANDRA-15273:
-

 Summary: cassandra does not start with new systemd version
 Key: CASSANDRA-15273
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15273
 Project: Cassandra
  Issue Type: Bug
Reporter: Aleksandr Yatskin


After updating systemd to a version that fixes the vulnerability 
https://access.redhat.com/security/cve/cve-2018-16888, the cassandra service 
does not start correctly.

Environment: RHEL 7, systemd-219-67.el7_7.1, cassandra-3.11.4-1 
(https://www.apache.org/dist/cassandra/redhat/311x/cassandra-3.11.4-1.noarch.rpm)

---

systemctl status cassandra
● cassandra.service - LSB: distributed storage system for structured data
 Loaded: loaded (/etc/rc.d/init.d/cassandra; bad; vendor preset: disabled)
 Active: failed (Result: resources) since Fri 2019-08-09 17:20:26 MSK; 1s ago
 Docs: man:systemd-sysv-generator(8)
 Process: 2414 ExecStop=/etc/rc.d/init.d/cassandra stop (code=exited, 
status=0/SUCCESS)
 Process: 2463 ExecStart=/etc/rc.d/init.d/cassandra start (code=exited, 
status=0/SUCCESS)
 Main PID: 1884 (code=exited, status=143)

Aug 09 17:20:23 desktop43.example.com systemd[1]: Unit cassandra.service 
entered failed state.
Aug 09 17:20:23 desktop43.example.com systemd[1]: cassandra.service failed.
Aug 09 17:20:23 desktop43.example.com systemd[1]: Starting LSB: distributed 
storage system for structured data...
Aug 09 17:20:23 desktop43.example.com su[2473]: (to cassandra) root on none
Aug 09 17:20:26 desktop43.example.com cassandra[2463]: Starting Cassandra: OK
Aug 09 17:20:26 desktop43.example.com systemd[1]: New main PID 2545 does not 
belong to service, and PID file is not owned by root. Refusing.
Aug 09 17:20:26 desktop43.example.com systemd[1]: New main PID 2545 does not 
belong to service, and PID file is not owned by root. Refusing.
Aug 09 17:20:26 desktop43.example.com systemd[1]: Failed to start LSB: 
distributed storage system for structured data.
Aug 09 17:20:26 desktop43.example.com systemd[1]: Unit cassandra.service 
entered failed state.
Aug 09 17:20:26 desktop43.example.com systemd[1]: cassandra.service failed.
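
The "PID file is not owned by root" message above points at the ownership of the service's PID file. As a small diagnostic sketch (the helper names are hypothetical, and the PID file location must be taken from the init script in use, since it is not shown in this report), the owner can be checked like this:

```python
import os


def pid_file_owner_uid(path):
    """Return the numeric user id that owns the file at path."""
    return os.stat(path).st_uid


def owned_by_root(path):
    """Return True when the file at path is owned by uid 0 (root)."""
    return pid_file_owner_uid(path) == 0
```

Calling owned_by_root on the PID file written by the init script shows whether the file's ownership matches the condition systemd reports in the log.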


