[jira] [Commented] (CASSANDRA-10751) "Pool is shutdown" error when running Hadoop jobs on Yarn

2018-05-03 Thread mck (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16462227#comment-16462227
 ] 

mck commented on CASSANDRA-10751:
-

Looking green. Trunk's {{test-all}} is already broken, waiting on 
CASSANDRA-14428.
And comparing failed dtests to their respective base branches:
 - 2.2: rebuild_test.TestRebuild.test_resumable_rebuild: no relation to the 
patch, already {{Flakiness: 37%}}
 - 3.0: stable {{bootstrap_test.TestBootstrap.test_simultaneous_bootstrap}} 
failed, tested locally ok.
 - 3.11: stable 
{{repair_tests.repair_test.TestRepair.test_thread_count_repair}} failed, tested 
locally ok.
 - trunk: 8 down to 2 existing failures.

Committed.

> "Pool is shutdown" error when running Hadoop jobs on Yarn
> -
>
> Key: CASSANDRA-10751
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10751
> Project: Cassandra
>  Issue Type: Bug
> Environment: Hadoop 2.7.1 (HDP 2.3.2)
> Cassandra 2.1.11
>Reporter: Cyril Scetbon
>Assignee: Cyril Scetbon
>Priority: Major
> Fix For: 4.0, 2.2.13, 3.0.17, 3.11.3
>
> Attachments: CASSANDRA-10751-2.2.patch, CASSANDRA-10751-3.0.patch, 
> output.log
>
>
> Trying to execute a Hadoop job on Yarn, I get errors from Cassandra's 
> internal code. It seems that connections are shut down, but we can't understand 
> why.
> Here is an extract of the errors. I have also attached a file with the complete debug 
> logs.
> {code}
> 15/11/22 20:05:54 [main]: DEBUG core.RequestHandler: Error querying 
> node006.internal.net/192.168.12.22:9042, trying next host (error is: 
> com.datastax.driver.core.ConnectionException: 
> [node006.internal.net/192.168.12.22:9042] Pool is shutdown)
> Failed with exception java.io.IOException:java.io.IOException: 
> com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) 
> tried for query failed (tried: node006.internal.net/192.168.12.22:9042 
> (com.datastax.driver.core.ConnectionException: 
> [node006.internal.net/192.168.12.22:9042] Pool is shutdown))
> 15/11/22 20:05:54 [main]: ERROR CliDriver: Failed with exception 
> java.io.IOException:java.io.IOException: 
> com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) 
> tried for query failed (tried: node006.internal.net/192.168.12.22:9042 
> (com.datastax.driver.core.ConnectionException: 
> [node006.internal.net/192.168.12.22:9042] Pool is shutdown))
> java.io.IOException: java.io.IOException: 
> com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) 
> tried for query failed (tried: node006.internal.net/192.168.12.22:9042 
> (com.datastax.driver.core.ConnectionException: 
> [node006.internal.net/192.168.12.22:9042] Pool is shutdown))
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:508)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:415)
>   at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:140)
>   at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1672)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> Caused by: java.io.IOException: 
> com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) 
> tried for query failed (tried: node006.internal.net/192.168.12.22:9042 
> (com.datastax.driver.core.ConnectionException: 
> [node006.internal.net/192.168.12.22:9042] Pool is shutdown))
>   at 
> org.apache.hadoop.hive.cassandra.input.cql.HiveCqlInputFormat.getRecordReader(HiveCqlInputFormat.java:132)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator$FetchInputFormatSplit.getRecordReader(FetchOperator.java:674)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:324)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:446)
>   ... 15 more
> Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException: All 

[jira] [Comment Edited] (CASSANDRA-12244) progress in compactionstats is reported wrongly for view builds

2018-05-03 Thread mck (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16460524#comment-16460524
 ] 

mck edited comment on CASSANDRA-12244 at 5/3/18 10:33 AM:
--

Patch looks good. Note in trunk it was fixed in a different manner, but the 
clash with the human readable flag was still there so I kept the introduction 
of the {{Unit}} enum.


I've put your patch into relevant branches, and will commit once they go green.
In the meantime [~jasonstack], could you please check that I've applied your patch 
appropriately in each branch and commit.

|| Branch || uTest || dTest ||
|[cassandra-3.0_12244|https://github.com/thelastpickle/cassandra/tree/mck/cassandra-3.0_12244]|[!https://circleci.com/gh/thelastpickle/cassandra/tree/mck%2Fcassandra-3.0_12244.svg?style=svg!|https://circleci.com/gh/thelastpickle/cassandra/tree/mck%2Fcassandra-3.0_12244]| https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/540/ |
|[cassandra-3.11_12244|https://github.com/thelastpickle/cassandra/tree/mck/cassandra-3.11_12244]|[!https://circleci.com/gh/thelastpickle/cassandra/tree/mck%2Fcassandra-3.11_12244.svg?style=svg!|https://circleci.com/gh/thelastpickle/cassandra/tree/mck%2Fcassandra-3.11_12244]| https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/541/ |
|[trunk_12244|https://github.com/thelastpickle/cassandra/tree/mck/trunk_12244]|[!https://circleci.com/gh/thelastpickle/cassandra/tree/mck%2Ftrunk_12244.svg?style=svg!|https://circleci.com/gh/thelastpickle/cassandra/tree/mck%2Ftrunk_12244]| https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/542/ |


was (Author: michaelsembwever):
Patch looks good. Note in trunk it was fixed in a different manner, but the 
clash with the human readable flag was still there so I kept the introduction 
of the {{Unit}} enum.


I've put your patch into relevant branches, and will commit once they go green.
In the meantime [~jasonstack], could you please check that I've applied your patch 
appropriately in each branch and commit.

|| Branch || uTest || dTest ||
|[cassandra-3.0_12244|https://github.com/thelastpickle/cassandra/tree/mck/cassandra-3.0_12244]|[!https://circleci.com/gh/thelastpickle/cassandra/tree/mck%2Fcassandra-3.0_12244.svg?style=svg!|https://circleci.com/gh/thelastpickle/cassandra/tree/mck%2Fcassandra-3.0_12244]| https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/540/ |
|[cassandra-3.11_12244|https://github.com/thelastpickle/cassandra/tree/mck/cassandra-3.11_12244]|[!https://circleci.com/gh/thelastpickle/cassandra/tree/mck%2Fcassandra-3.11_12244.svg?style=svg!|https://circleci.com/gh/thelastpickle/cassandra/tree/mck%2Fcassandra-3.11_12244]| https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/541/ |
|[trunk_12244|https://github.com/thelastpickle/cassandra/tree/mck/trunk_12244]|[!https://circleci.com/gh/thelastpickle/cassandra/tree/mck%2Ftrunk_12244.svg?style=svg!|https://circleci.com/gh/thelastpickle/cassandra/tree/mck%2Ftrunk_12244]| https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/541/ |

> progress in compactionstats is reported wrongly for view builds
> ---
>
> Key: CASSANDRA-12244
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12244
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Tom van der Woerdt
>Assignee: ZhaoYang
>Priority: Minor
>  Labels: lhf
> Fix For: 3.0.x
>
>
> In the view build progress given by compactionstats, there are several issues:
> {code}
> id                                     compaction type   keyspace     table     completed   total       unit     progress
> 038d3690-4dbe-11e6-b207-21ec388d48e6   View build        mykeyspace   mytable   844 bytes   967 bytes   ranges   87.28%
> Active compaction remaining time :        n/a
> {code}
> 1) those are ranges, not bytes
> 2) it's not at 87.28%, it's at ~4%. The method for calculating progress in 
> Cassandra is wrong: it neglects to sort the tokens it's iterating through 
> (ViewBuilder.java) and thus ends up with a random number.
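
A minimal sketch of why the ordering in point 2 matters, assuming progress is meant to be counted in whole token ranges. Everything here is hypothetical illustration: {{RangeProgress}}, its methods, and the modelling of tokens as plain longs are assumptions, and this is not the code in ViewBuilder.java.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; not Cassandra's ViewBuilder.
final class RangeProgress {

    /**
     * Returns {completed, total}, both counted in ranges.
     * A range counts as completed when its end token is at or before
     * the last token that has been processed.
     */
    static long[] completedRanges(List<Long> rangeEndTokens, long lastProcessedToken) {
        // Sort a copy first: the early exit below is only valid on sorted input.
        List<Long> sorted = new ArrayList<>(rangeEndTokens);
        Collections.sort(sorted);

        long completed = 0;
        for (long end : sorted) {
            if (end > lastProcessedToken) {
                break; // every later range ends even further away
            }
            completed++;
        }
        return new long[] { completed, sorted.size() };
    }

    /** Renders the progress with "ranges" as the unit rather than bytes. */
    static String format(long[] progress) {
        double pct = progress[1] == 0 ? 100.0 : 100.0 * progress[0] / progress[1];
        return String.format(java.util.Locale.ROOT, "%d ranges / %d ranges (%.2f%%)",
                             progress[0], progress[1], pct);
    }
}
```

With unsorted input the same loop would stop at whichever range happened to come first, which is one way a progress figure can end up unrelated to the work actually done.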



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-12244) progress in compactionstats is reported wrongly for view builds

2018-05-03 Thread mck (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16462348#comment-16462348
 ] 

mck commented on CASSANDRA-12244:
-

Looking green. Trunk's {{test-all}} is already broken, waiting on 
CASSANDRA-14428.
And comparing failed dtests to their respective base branches:
 - 3.0: stable {{bootstrap_test.TestBootstrap.test_simultaneous_bootstrap}} 
failed, tested locally ok.
 - 3.11: no difference.
 - trunk: two additional failures: 
{{bootstrap_test.TestBootstrap.test_decommissioned_wiped_node_can_join}} and 
{{repair_tests.repair_test.TestRepair.test_dc_repair}}; neither is flaky, but 
both tested OK locally.
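
The comparison against a base branch amounts to a set difference over failed test names. A small sketch, assuming the failure lists have already been collected as sets of strings; {{DtestDiff}} and {{newFailures}} are hypothetical names, not part of any Cassandra tooling.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper; the failure sets are assumed to come from the CI reports.
final class DtestDiff {

    /** Returns the tests that fail on the patched branch but not on its base branch. */
    static Set<String> newFailures(Set<String> baseFailures, Set<String> patchedFailures) {
        // Copy so the caller's sets are left untouched; TreeSet gives a stable order.
        Set<String> result = new TreeSet<>(patchedFailures);
        result.removeAll(baseFailures);
        return result;
    }
}
```

Tests that remain in the result still need the flakiness history and a local run, as above, before they can be ruled out as unrelated to the patch.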

> progress in compactionstats is reported wrongly for view builds
> ---
>
> Key: CASSANDRA-12244
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12244
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Tom van der Woerdt
>Assignee: ZhaoYang
>Priority: Minor
>  Labels: lhf
> Fix For: 3.0.x
>
>
> In the view build progress given by compactionstats, there are several issues:
> {code}
> id                                     compaction type   keyspace     table     completed   total       unit     progress
> 038d3690-4dbe-11e6-b207-21ec388d48e6   View build        mykeyspace   mytable   844 bytes   967 bytes   ranges   87.28%
> Active compaction remaining time :        n/a
> {code}
> 1) those are ranges, not bytes
> 2) it's not at 87.28%, it's at ~4%. The method for calculating progress in 
> Cassandra is wrong: it neglects to sort the tokens it's iterating through 
> (ViewBuilder.java) and thus ends up with a random number.






[jira] [Updated] (CASSANDRA-12244) progress in compactionstats is reported wrongly for view builds

2018-05-03 Thread mck (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck updated CASSANDRA-12244:

   Resolution: Fixed
Fix Version/s: (was: 3.0.x)
   3.11.3
   3.0.17
   4.0
   Status: Resolved  (was: Patch Available)

Committed.

> progress in compactionstats is reported wrongly for view builds
> ---
>
> Key: CASSANDRA-12244
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12244
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Tom van der Woerdt
>Assignee: ZhaoYang
>Priority: Minor
>  Labels: lhf
> Fix For: 4.0, 3.0.17, 3.11.3
>
>
> In the view build progress given by compactionstats, there are several issues:
> {code}
> id                                     compaction type   keyspace     table     completed   total       unit     progress
> 038d3690-4dbe-11e6-b207-21ec388d48e6   View build        mykeyspace   mytable   844 bytes   967 bytes   ranges   87.28%
> Active compaction remaining time :        n/a
> {code}
> 1) those are ranges, not bytes
> 2) it's not at 87.28%, it's at ~4%. The method for calculating progress in 
> Cassandra is wrong: it neglects to sort the tokens it's iterating through 
> (ViewBuilder.java) and thus ends up with a random number.






[jira] [Created] (CASSANDRA-14191) Bootstrap/Streaming fails with missing CompressionInfo

2018-01-24 Thread mck (JIRA)
mck created CASSANDRA-14191:
---

 Summary: Bootstrap/Streaming fails with missing CompressionInfo
 Key: CASSANDRA-14191
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14191
 Project: Cassandra
  Issue Type: Bug
  Components: Streaming and Messaging
Reporter: mck


Multiple attempts at bootstrapping a new node fail, with streaming failing 
(either hanging or stopping the bootstrap node) always from the same node.

 

The original node throws the following exception during the streaming process:
{noformat}

ERROR [STREAM-OUT-/10.83.74.236:47220] 2018-01-24 19:25:22,532 
StreamSession.java:512 - [Stream #90c1c8b0-013a-11e8-b5f0-9323de372ca2] 
Streaming error occurred on session with peer X.X.X.X
java.lang.AssertionError: null
at 
org.apache.cassandra.io.compress.CompressionMetadata$Chunk.<init>(CompressionMetadata.java:473)
 ~[cassandra-all-2.1.18.1463.jar:2.1.18.1463]
at 
org.apache.cassandra.io.compress.CompressionMetadata.getChunksForSections(CompressionMetadata.java:287)
 ~[cassandra-all-2.1.18.1463.jar:2.1.18.1463]
at 
org.apache.cassandra.streaming.messages.FileMessageHeader$FileMessageHeaderSerializer.serialize(FileMessageHeader.java:172)
 ~[cassandra-all-2.1.18.1463.jar:2.1.18.1463]
at 
org.apache.cassandra.streaming.messages.OutgoingFileMessage.serialize(OutgoingFileMessage.java:82)
 ~[cassandra-all-2.1.18.1463.jar:2.1.18.1463]
at 
org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:49)
 ~[cassandra-all-2.1.18.1463.jar:2.1.18.1463]
at 
org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:41)
 ~[cassandra-all-2.1.18.1463.jar:2.1.18.1463]
at 
org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:45)
 ~[cassandra-all-2.1.18.1463.jar:2.1.18.1463]
at 
org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:377)
 ~[cassandra-all-2.1.18.1463.jar:2.1.18.1463]
at 
org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:349)
 ~[cassandra-all-2.1.18.1463.jar:2.1.18.1463]
at java.lang.Thread.run(Thread.java:748) [na:1.8.0_151]
{noformat}

The bootstrapping node's reaction to this failure is
{noformat}
ERROR [STREAM-IN-/10.83.74.234:7001] 2018-01-24 19:25:22,957 
StreamSession.java:512 - [Stream #90c1c8b0-013a-11e8-b5f0-9323de372ca2] 
Streaming error occurred on session with peer X.X.X.X
java.io.EOFException: null
at java.io.DataInputStream.readInt(DataInputStream.java:392) 
~[na:1.8.0_151]
at 
org.apache.cassandra.streaming.compress.CompressionInfo$CompressionInfoSerializer.deserialize(CompressionInfo.java:68)
 ~[cassandra-all-2.1.18.1463.jar:2.1.18.1463]
at 
org.apache.cassandra.streaming.compress.CompressionInfo$CompressionInfoSerializer.deserialize(CompressionInfo.java:47)
 ~[cassandra-all-2.1.18.1463.jar:2.1.18.1463]
at 
org.apache.cassandra.streaming.messages.FileMessageHeader$FileMessageHeaderSerializer.deserialize(FileMessageHeader.java:188)
 ~[cassandra-all-2.1.18.1463.jar:2.1.18.1463]
at 
org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:42)
 ~[cassandra-all-2.1.18.1463.jar:2.1.18.1463]
at 
org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:38)
 ~[cassandra-all-2.1.18.1463.jar:2.1.18.1463]
at 
org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:56)
 ~[cassandra-all-2.1.18.1463.jar:2.1.18.1463]
at 
org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:276)
 ~[cassandra-all-2.1.18.1463.jar:2.1.18.1463]
at java.lang.Thread.run(Thread.java:748) [na:1.8.0_151]
{noformat}

Other observations:
 - it is always the same node that fails,
 - multiple bootstrap attempts (using different ec2 instances) all fail,
 - the exception occurs on {{\-tmp-}} sstables that have no CompressionInfo 
component,
 - it's a different {{\-tmp-}} sstable each time,
 - running either {{nodetool cleanup}} or {{nodetool scrub}} made no difference.
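
One way to check the third observation on a node is to list the {{-tmp-}} data files that have no sibling CompressionInfo component. The sketch below is a hypothetical helper, not part of Cassandra or nodetool. It assumes the 2.1-style on-disk naming, where all components of one sstable share a prefix and end in {{-Data.db}} and {{-CompressionInfo.db}}.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical diagnostic; the file-name suffixes below are assumptions about the 2.1 layout.
final class TmpSSTableCheck {
    private static final String DATA = "-Data.db";
    private static final String COMPRESSION_INFO = "-CompressionInfo.db";

    /** Lists -tmp- data files in one table directory that lack a CompressionInfo sibling. */
    static List<String> tmpDataFilesWithoutCompressionInfo(Path tableDir) throws IOException {
        List<String> missing = new ArrayList<>();
        try (Stream<Path> files = Files.list(tableDir)) {
            files.forEach(p -> {
                String name = p.getFileName().toString();
                if (name.contains("-tmp-") && name.endsWith(DATA)) {
                    // Swap the component suffix to find the expected sibling file.
                    String prefix = name.substring(0, name.length() - DATA.length());
                    if (!Files.exists(tableDir.resolve(prefix + COMPRESSION_INFO))) {
                        missing.add(name);
                    }
                }
            });
        }
        Collections.sort(missing);
        return missing;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        for (String name : tmpDataFilesWithoutCompressionInfo(dir)) {
            System.out.println(name);
        }
    }
}
```

A table with compression disabled has no CompressionInfo component for any of its sstables, so a hit is only meaningful for compressed tables.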








[jira] [Updated] (CASSANDRA-14191) Bootstrap/Streaming fails with missing CompressionInfo

2018-01-24 Thread mck (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck updated CASSANDRA-14191:

Reproduced In: 2.1.18

> Bootstrap/Streaming fails with missing CompressionInfo
> --
>
> Key: CASSANDRA-14191
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14191
> Project: Cassandra
>  Issue Type: Bug
>  Components: Streaming and Messaging
>Reporter: mck
>Priority: Major
>
> Multiple attempts at bootstrapping a new node fail, with streaming failing 
> (either hanging or stopping the bootstrap node) always from the same node.
>  
> The original node throws the following exception during the streaming process:
> {noformat}
> ERROR [STREAM-OUT-/10.83.74.236:47220] 2018-01-24 19:25:22,532 
> StreamSession.java:512 - [Stream #90c1c8b0-013a-11e8-b5f0-9323de372ca2] 
> Streaming error occurred on session with peer X.X.X.X
> java.lang.AssertionError: null
>   at 
> org.apache.cassandra.io.compress.CompressionMetadata$Chunk.<init>(CompressionMetadata.java:473)
>  ~[cassandra-all-2.1.18.1463.jar:2.1.18.1463]
>   at 
> org.apache.cassandra.io.compress.CompressionMetadata.getChunksForSections(CompressionMetadata.java:287)
>  ~[cassandra-all-2.1.18.1463.jar:2.1.18.1463]
>   at 
> org.apache.cassandra.streaming.messages.FileMessageHeader$FileMessageHeaderSerializer.serialize(FileMessageHeader.java:172)
>  ~[cassandra-all-2.1.18.1463.jar:2.1.18.1463]
>   at 
> org.apache.cassandra.streaming.messages.OutgoingFileMessage.serialize(OutgoingFileMessage.java:82)
>  ~[cassandra-all-2.1.18.1463.jar:2.1.18.1463]
>   at 
> org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:49)
>  ~[cassandra-all-2.1.18.1463.jar:2.1.18.1463]
>   at 
> org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:41)
>  ~[cassandra-all-2.1.18.1463.jar:2.1.18.1463]
>   at 
> org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:45)
>  ~[cassandra-all-2.1.18.1463.jar:2.1.18.1463]
>   at 
> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:377)
>  ~[cassandra-all-2.1.18.1463.jar:2.1.18.1463]
>   at 
> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:349)
>  ~[cassandra-all-2.1.18.1463.jar:2.1.18.1463]
>   at java.lang.Thread.run(Thread.java:748) [na:1.8.0_151]
> {noformat}
> The bootstrapping node's reaction to this failure is
> {noformat}
> ERROR [STREAM-IN-/10.83.74.234:7001] 2018-01-24 19:25:22,957 
> StreamSession.java:512 - [Stream #90c1c8b0-013a-11e8-b5f0-9323de372ca2] 
> Streaming error occurred on session with peer X.X.X.X
> java.io.EOFException: null
>   at java.io.DataInputStream.readInt(DataInputStream.java:392) 
> ~[na:1.8.0_151]
>   at 
> org.apache.cassandra.streaming.compress.CompressionInfo$CompressionInfoSerializer.deserialize(CompressionInfo.java:68)
>  ~[cassandra-all-2.1.18.1463.jar:2.1.18.1463]
>   at 
> org.apache.cassandra.streaming.compress.CompressionInfo$CompressionInfoSerializer.deserialize(CompressionInfo.java:47)
>  ~[cassandra-all-2.1.18.1463.jar:2.1.18.1463]
>   at 
> org.apache.cassandra.streaming.messages.FileMessageHeader$FileMessageHeaderSerializer.deserialize(FileMessageHeader.java:188)
>  ~[cassandra-all-2.1.18.1463.jar:2.1.18.1463]
>   at 
> org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:42)
>  ~[cassandra-all-2.1.18.1463.jar:2.1.18.1463]
>   at 
> org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:38)
>  ~[cassandra-all-2.1.18.1463.jar:2.1.18.1463]
>   at 
> org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:56)
>  ~[cassandra-all-2.1.18.1463.jar:2.1.18.1463]
>   at 
> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:276)
>  ~[cassandra-all-2.1.18.1463.jar:2.1.18.1463]
>   at java.lang.Thread.run(Thread.java:748) [na:1.8.0_151]
> {noformat}
> Other observations:
>  - it is always the same node that fails,
>  - multiple bootstrap attempts (using different ec2 instances) all fail,
>  - the exception occurs on {{\-tmp-}} sstables that have no CompressionInfo 
> component,
>  - it's a different {{\-tmp-}} sstable each time,
>  - running either {{nodetool cleanup}} or {{nodetool scrub}} made no 
> difference.






[jira] [Updated] (CASSANDRA-14212) Back port CASSANDRA-13080 to 3.11.2 (Use new token allocation for non bootstrap case as well)

2018-02-01 Thread mck (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck updated CASSANDRA-14212:

Description: 
Backport CASSANDRA-13080 to 3.11.x

 

The patch applies without conflict to the {{cassandra-3.11}} branch and is equally 
relevant to users of Cassandra 3.11.1.

 

  was:
Backport CASSANDRA-13080 to 3.11.x

 


> Back port CASSANDRA-13080 to 3.11.2 (Use new token allocation for non 
> bootstrap case as well)
> -
>
> Key: CASSANDRA-14212
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14212
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: mck
>Priority: Major
>
> Backport CASSANDRA-13080 to 3.11.x
>  
> The patch applies without conflict to the {{cassandra-3.11}} branch and is equally 
> relevant to users of Cassandra 3.11.1.
>  






[jira] [Created] (CASSANDRA-14212) Back port CASSANDRA-13080 to 3.11.2 (Use new token allocation for non bootstrap case as well)

2018-02-01 Thread mck (JIRA)
mck created CASSANDRA-14212:
---

 Summary: Back port CASSANDRA-13080 to 3.11.2 (Use new token 
allocation for non bootstrap case as well)
 Key: CASSANDRA-14212
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14212
 Project: Cassandra
  Issue Type: Improvement
Reporter: mck


Backport CASSANDRA-13080 to 3.11.x

 






[jira] [Assigned] (CASSANDRA-14212) Back port CASSANDRA-13080 to 3.11.2 (Use new token allocation for non bootstrap case as well)

2018-02-01 Thread mck (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck reassigned CASSANDRA-14212:
---

Assignee: mck

> Back port CASSANDRA-13080 to 3.11.2 (Use new token allocation for non 
> bootstrap case as well)
> -
>
> Key: CASSANDRA-14212
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14212
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: mck
>Assignee: mck
>Priority: Major
>
> Backport CASSANDRA-13080 to 3.11.x
>  
> The patch applies without conflict to the {{cassandra-3.11}} branch and is equally 
> relevant to users of Cassandra 3.11.1.
>  






[jira] [Commented] (CASSANDRA-14212) Back port CASSANDRA-13080 to 3.11.2 (Use new token allocation for non bootstrap case as well)

2018-02-01 Thread mck (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349912#comment-16349912
 ] 

mck commented on CASSANDRA-14212:
-

|| branch || testall || dtest ||
| 
[cassandra-3.11_13080|https://github.com/thelastpickle/cassandra/tree/mck/cassandra-3.11_13080]
   | 
[testall|https://circleci.com/gh/thelastpickle/cassandra/tree/mck%2Fcassandra-3.11_13080]
 | 
[dtest|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/266]
 |

> Back port CASSANDRA-13080 to 3.11.2 (Use new token allocation for non 
> bootstrap case as well)
> -
>
> Key: CASSANDRA-14212
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14212
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: mck
>Assignee: mck
>Priority: Major
>
> Backport CASSANDRA-13080 to 3.11.x
>  
> The patch applies without conflict to the {{cassandra-3.11}} branch and is equally 
> relevant to users of Cassandra 3.11.1.
>  






[jira] [Updated] (CASSANDRA-14212) Back port CASSANDRA-13080 to 3.11.2 (Use new token allocation for non bootstrap case as well)

2018-02-06 Thread mck (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck updated CASSANDRA-14212:

Fix Version/s: 3.11.x
   Status: Patch Available  (was: In Progress)

> Back port CASSANDRA-13080 to 3.11.2 (Use new token allocation for non 
> bootstrap case as well)
> -
>
> Key: CASSANDRA-14212
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14212
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: mck
>Assignee: mck
>Priority: Major
> Fix For: 3.11.x
>
>
> Backport CASSANDRA-13080 to 3.11.x
>  
> The patch applies without conflict to the {{cassandra-3.11}} branch and is equally 
> relevant to users of Cassandra 3.11.1.
>  






[jira] [Updated] (CASSANDRA-14212) Back port CASSANDRA-13080 to 3.11.2 (Use new token allocation for non bootstrap case as well)

2018-02-12 Thread mck (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck updated CASSANDRA-14212:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed.

> Back port CASSANDRA-13080 to 3.11.2 (Use new token allocation for non 
> bootstrap case as well)
> -
>
> Key: CASSANDRA-14212
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14212
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: mck
>Assignee: mck
>Priority: Major
> Fix For: 3.11.x
>
>
> Backport CASSANDRA-13080 to 3.11.x
>  
> The patch applies without conflict to the {{cassandra-3.11}} branch and is equally 
> relevant to users of Cassandra 3.11.1.
>  






[jira] [Created] (CASSANDRA-14247) SASI tokenizer for simple delimiter based entries

2018-02-21 Thread mck (JIRA)
mck created CASSANDRA-14247:
---

 Summary: SASI tokenizer for simple delimiter based entries
 Key: CASSANDRA-14247
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14247
 Project: Cassandra
  Issue Type: Improvement
  Components: sasi
Reporter: mck


Currently SASI offers only two tokenizer options:
 - NonTokenizingAnalyzer
 - StandardAnalyzer

The latter is built upon Snowball, which is powerful for human languages but overkill 
for simple tokenization.

A simple tokenizer is proposed here. The need for this arose as a workaround 
for CASSANDRA-11182, and to avoid the disk usage explosion when having to 
resort to {{CONTAINS}}. See https://github.com/openzipkin/zipkin/issues/1861
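
To make "simple tokenization" concrete, here is a rough sketch of splitting a value on a single delimiter character. It is not the proposed patch and does not use the SASI analyzer API; {{DelimiterSplitSketch}} and {{tokenize}} are hypothetical names, and dropping empty tokens is an assumption made for the illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of delimiter-based tokenization; not SASI code.
final class DelimiterSplitSketch {

    /** Splits value on one delimiter character, skipping empty tokens. */
    static List<String> tokenize(String value, char delimiter) {
        List<String> tokens = new ArrayList<>();
        int start = 0;
        // Run one step past the end so the final token is flushed too.
        for (int i = 0; i <= value.length(); i++) {
            if (i == value.length() || value.charAt(i) == delimiter) {
                if (i > start) {
                    tokens.add(value.substring(start, i));
                }
                start = i + 1;
            }
        }
        return tokens;
    }
}
```

Unlike a stemming analyzer, this does no language processing at all: each entry between delimiters is kept exactly as written.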






[jira] [Assigned] (CASSANDRA-14247) SASI tokenizer for simple delimiter based entries

2018-02-21 Thread mck (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck reassigned CASSANDRA-14247:
---

Assignee: mck

> SASI tokenizer for simple delimiter based entries
> -
>
> Key: CASSANDRA-14247
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14247
> Project: Cassandra
>  Issue Type: Improvement
>  Components: sasi
>Reporter: mck
>Assignee: mck
>Priority: Major
> Fix For: 4.0, 3.11.x
>
>
> Currently SASI offers only two tokenizer options:
>  - NonTokenizingAnalyzer
>  - StandardAnalyzer
> The latter is built upon Snowball, which is powerful for human languages but overkill 
> for simple tokenization.
> A simple tokenizer is proposed here. The need for this arose as a workaround 
> for CASSANDRA-11182, and to avoid the disk usage explosion when having to 
> resort to {{CONTAINS}}. See https://github.com/openzipkin/zipkin/issues/1861






[jira] [Updated] (CASSANDRA-14247) SASI tokenizer for simple delimiter based entries

2018-02-21 Thread mck (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck updated CASSANDRA-14247:

Fix Version/s: 3.11.x
   4.0

> SASI tokenizer for simple delimiter based entries
> -
>
> Key: CASSANDRA-14247
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14247
> Project: Cassandra
>  Issue Type: Improvement
>  Components: sasi
>Reporter: mck
>Priority: Major
> Fix For: 4.0, 3.11.x
>
>
> Currently SASI offers only two tokenizer options:
>  - NonTokenizingAnalyzer
>  - StandardAnalyzer
> The latter is built upon Snowball, which is powerful for human languages but overkill 
> for simple tokenization.
> A simple tokenizer is proposed here. The need for this arose as a workaround 
> for CASSANDRA-11182, and to avoid the disk usage explosion when having to 
> resort to {{CONTAINS}}. See https://github.com/openzipkin/zipkin/issues/1861






[jira] [Commented] (CASSANDRA-14247) SASI tokenizer for simple delimiter based entries

2018-02-21 Thread mck (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371188#comment-16371188
 ] 

mck commented on CASSANDRA-14247:
-

WIP…

|| branch || testall || dtest ||
| [cassandra-3.11_14247|https://github.com/thelastpickle/cassandra/tree/mck/cassandra-3.11_14247] | [testall|https://circleci.com/gh/thelastpickle/cassandra/tree/mck%2Fcassandra-3.11_14247] | [dtest|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/XXX] |
| [trunk_14247|https://github.com/thelastpickle/cassandra/tree/mck/trunk_14247] | [testall|https://circleci.com/gh/thelastpickle/cassandra/tree/mck%2Ftrunk_14247] | [dtest|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/XXX] |

> SASI tokenizer for simple delimiter based entries
> -
>
> Key: CASSANDRA-14247
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14247
> Project: Cassandra
>  Issue Type: Improvement
>  Components: sasi
>Reporter: mck
>Assignee: mck
>Priority: Major
> Fix For: 4.0, 3.11.x
>
>
> Currently SASI offers only two tokenizer options:
>  - NonTokenizerAnalyser
>  - StandardAnalyzer
> The latter is built upon Snowball, powerful for human languages but overkill 
> for simple tokenization.
> A simple tokenizer is proposed here. The need for this arose as a workaround 
> around CASSANDRA-11182, and to avoid the disk usage explosion when having to 
> resort to {{CONTAINS}}. See https://github.com/openzipkin/zipkin/issues/1861
> Example use of this would be:
> {code}
> CREATE CUSTOM INDEX span_annotation_query_idx 
> ON zipkin2.span (annotation_query) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' 
> WITH OPTIONS = {
> 'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.DelimiterTokenizerAnalyzer', 
> 'delimiter': '░',
> 'case_sensitive': 'true', 
> 'mode': 'prefix', 
> 'analyzed': 'true'};
> {code}
> Original credit for this work goes to https://github.com/zuochangan
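
As a rough illustration of the delimiter-based tokenization proposed in the description, the following Java sketch splits a column value on a single delimiter character and drops empty tokens. It is not the analyzer from the patch: the class name `DelimiterTokenizerSketch`, the `tokenize` method, and the sample value are assumptions made for illustration, and the internals of the real analyzer may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; NOT the analyzer class from the CASSANDRA-14247 patch.
final class DelimiterTokenizerSketch {

    // Splits `value` on a single delimiter character, skipping empty tokens.
    static List<String> tokenize(String value, char delimiter) {
        List<String> tokens = new ArrayList<>();
        int start = 0;
        for (int i = 0; i <= value.length(); i++) {
            // A token ends at a delimiter or at the end of the input.
            if (i == value.length() || value.charAt(i) == delimiter) {
                if (i > start) {
                    tokens.add(value.substring(start, i));
                }
                start = i + 1;
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Hypothetical annotation_query value; '\u2591' is the '░' delimiter
        // used in the example index definition.
        String value = "error\u2591http.method=GET\u2591retry";
        System.out.println(tokenize(value, '\u2591'));
    }
}
```

Each resulting token would then be indexed on its own, which is what lets a query match one delimited entry without scanning the whole column value.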






[jira] [Commented] (CASSANDRA-14247) SASI tokenizer for simple delimiter based entries

2018-02-23 Thread mck (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16374075#comment-16374075
 ] 

mck commented on CASSANDRA-14247:
-

[~xedin], are you able to spare a review?

> SASI tokenizer for simple delimiter based entries
> -
>
> Key: CASSANDRA-14247
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14247
> Project: Cassandra
>  Issue Type: Improvement
>  Components: sasi
>Reporter: mck
>Assignee: mck
>Priority: Major
> Fix For: 4.0, 3.11.x
>
>
> Currently SASI offers only two tokenizer options:
>  - NonTokenizerAnalyser
>  - StandardAnalyzer
> The latter is built upon Snowball, powerful for human languages but overkill 
> for simple tokenization.
> A simple tokenizer is proposed here. The need for this arose as a workaround 
> of CASSANDRA-11182, and to avoid the disk usage explosion when having to 
> resort to {{CONTAINS}}. See https://github.com/openzipkin/zipkin/issues/1861
> Example use of this would be:
> {code}
> CREATE CUSTOM INDEX span_annotation_query_idx 
> ON zipkin2.span (annotation_query) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' 
> WITH OPTIONS = {
> 'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.DelimiterTokenizerAnalyzer', 
> 'delimiter': '░',
> 'case_sensitive': 'true', 
> 'mode': 'prefix', 
> 'analyzed': 'true'};
> {code}
> Original credit for this work goes to https://github.com/zuochangan






[jira] [Updated] (CASSANDRA-14247) SASI tokenizer for simple delimiter based entries

2018-02-21 Thread mck (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck updated CASSANDRA-14247:

Description: 
Currently SASI offers only two tokenizer options:
 - NonTokenizerAnalyser
 - StandardAnalyzer

The latter is built upon Snowball, powerful for human languages but overkill 
for simple tokenization.

A simple tokenizer is proposed here. The need for this arose as a workaround of 
CASSANDRA-11182, and to avoid the disk usage explosion when having to resort to 
{{CONTAINS}}. See https://github.com/openzipkin/zipkin/issues/1861

Example use of this would be:
{code}
CREATE CUSTOM INDEX span_annotation_query_idx 
ON zipkin2.span (annotation_query) USING 
'org.apache.cassandra.index.sasi.SASIIndex' 
WITH OPTIONS = {
'analyzer_class': 
'org.apache.cassandra.index.sasi.analyzer.DelimiterTokenizerAnalyzer', 
'delimiter': '░',
'case_sensitive': 'true', 
'mode': 'prefix', 
'analyzed': 'true'};
{code}

Original credit for this work goes to https://github.com/zuochangan

  was:
Currently SASI offers only two tokenizer options:
 - NonTokenizerAnalyser
 - StandardAnalyzer

The latter is built upon Snowball, powerful for human languages but overkill 
for simple tokenization.

A simple tokenizer is proposed here. The need for this arose as a workaround 
around CASSANDRA-11182, and to avoid the disk usage explosion when having to 
resort to {{CONTAINS}}. See https://github.com/openzipkin/zipkin/issues/1861

Example use of this would be:
{code}
CREATE CUSTOM INDEX span_annotation_query_idx 
ON zipkin2.span (annotation_query) USING 
'org.apache.cassandra.index.sasi.SASIIndex' 
WITH OPTIONS = {
'analyzer_class': 
'org.apache.cassandra.index.sasi.analyzer.DelimiterTokenizerAnalyzer', 
'delimiter': '░',
'case_sensitive': 'true', 
'mode': 'prefix', 
'analyzed': 'true'};
{code}

Original credit for this work goes to https://github.com/zuochangan


> SASI tokenizer for simple delimiter based entries
> -
>
> Key: CASSANDRA-14247
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14247
> Project: Cassandra
>  Issue Type: Improvement
>  Components: sasi
>Reporter: mck
>Assignee: mck
>Priority: Major
> Fix For: 4.0, 3.11.x
>
>
> Currently SASI offers only two tokenizer options:
>  - NonTokenizerAnalyser
>  - StandardAnalyzer
> The latter is built upon Snowball, powerful for human languages but overkill 
> for simple tokenization.
> A simple tokenizer is proposed here. The need for this arose as a workaround 
> of CASSANDRA-11182, and to avoid the disk usage explosion when having to 
> resort to {{CONTAINS}}. See https://github.com/openzipkin/zipkin/issues/1861
> Example use of this would be:
> {code}
> CREATE CUSTOM INDEX span_annotation_query_idx 
> ON zipkin2.span (annotation_query) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' 
> WITH OPTIONS = {
> 'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.DelimiterTokenizerAnalyzer', 
> 'delimiter': '░',
> 'case_sensitive': 'true', 
> 'mode': 'prefix', 
> 'analyzed': 'true'};
> {code}
> Original credit for this work goes to https://github.com/zuochangan






[jira] [Updated] (CASSANDRA-14247) SASI tokenizer for simple delimiter based entries

2018-02-21 Thread mck (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck updated CASSANDRA-14247:

Description: 
Currently SASI offers only two tokenizer options:
 - NonTokenizerAnalyser
 - StandardAnalyzer

The latter is built upon Snowball, powerful for human languages but overkill 
for simple tokenization.

A simple tokenizer is proposed here. The need for this arose as a workaround 
around CASSANDRA-11182, and to avoid the disk usage explosion when having to 
resort to {{CONTAINS}}. See https://github.com/openzipkin/zipkin/issues/1861

Example use of this would be:
{code}
CREATE CUSTOM INDEX span_annotation_query_idx 
ON zipkin2.span (annotation_query) USING 
'org.apache.cassandra.index.sasi.SASIIndex' 
WITH OPTIONS = {
'analyzer_class': 
'org.apache.cassandra.index.sasi.analyzer.DelimiterTokenizerAnalyzer', 
'delimiter': '░',
'case_sensitive': 'true', 
'mode': 'prefix', 
'analyzed': 'true'};
{code}

Original credit for this work goes to https://github.com/zuochangan

  was:
Currently SASI offers only two tokenizer options:
 - NonTokenizerAnalyser
 - StandardAnalyzer

The latter is built upon Snowball, powerful for human languages but overkill 
for simple tokenization.

A simple tokenizer is proposed here. The need for this arose as a workaround 
around CASSANDRA-11182, and to avoid the disk usage explosion when having to 
resort to {{CONTAINS}}. See https://github.com/openzipkin/zipkin/issues/1861

Example use of this would be:
{code}
CREATE CUSTOM INDEX span_annotation_query_idx 
ON zipkin2.span (annotation_query) USING 
'org.apache.cassandra.index.sasi.SASIIndex' 
WITH OPTIONS = {
'analyzer_class': 
'org.apache.cassandra.index.sasi.analyzer.DelimiterTokenizerAnalyzer', 
'delimiter': '░',
'case_sensitive': 'true', 
'mode': 'prefix', 
'analyzed': 'true'};
{code}


> SASI tokenizer for simple delimiter based entries
> -
>
> Key: CASSANDRA-14247
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14247
> Project: Cassandra
>  Issue Type: Improvement
>  Components: sasi
>Reporter: mck
>Assignee: mck
>Priority: Major
> Fix For: 4.0, 3.11.x
>
>
> Currently SASI offers only two tokenizer options:
>  - NonTokenizerAnalyser
>  - StandardAnalyzer
> The latter is built upon Snowball, powerful for human languages but overkill 
> for simple tokenization.
> A simple tokenizer is proposed here. The need for this arose as a workaround 
> around CASSANDRA-11182, and to avoid the disk usage explosion when having to 
> resort to {{CONTAINS}}. See https://github.com/openzipkin/zipkin/issues/1861
> Example use of this would be:
> {code}
> CREATE CUSTOM INDEX span_annotation_query_idx 
> ON zipkin2.span (annotation_query) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' 
> WITH OPTIONS = {
> 'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.DelimiterTokenizerAnalyzer', 
> 'delimiter': '░',
> 'case_sensitive': 'true', 
> 'mode': 'prefix', 
> 'analyzed': 'true'};
> {code}
> Original credit for this work goes to https://github.com/zuochangan






[jira] [Updated] (CASSANDRA-14247) SASI tokenizer for simple delimiter based entries

2018-02-21 Thread mck (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck updated CASSANDRA-14247:

Description: 
Currently SASI offers only two tokenizer options:
 - NonTokenizerAnalyser
 - StandardAnalyzer

The latter is built upon Snowball, powerful for human languages but overkill 
for simple tokenization.

A simple tokenizer is proposed here. The need for this arose as a workaround 
around CASSANDRA-11182, and to avoid the disk usage explosion when having to 
resort to {{CONTAINS}}. See https://github.com/openzipkin/zipkin/issues/1861

Example use of this would be:
{code}
CREATE CUSTOM INDEX span_annotation_query_idx 
ON zipkin2.span (annotation_query) USING 
'org.apache.cassandra.index.sasi.SASIIndex' 
WITH OPTIONS = {
'analyzer_class': 
'org.apache.cassandra.index.sasi.analyzer.DelimiterTokenizerAnalyzer', 
'delimiter': '░',
'case_sensitive': 'true', 
'mode': 'prefix', 
'analyzed': 'true'};
{code}

  was:
Currently SASI offers only two tokenizer options:
 - NonTokenizerAnalyser
 - StandardAnalyzer

The latter is built upon Snowball, powerful for human languages but overkill 
for simple tokenization.

A simple tokenizer is proposed here. The need for this arose as a workaround 
around CASSANDRA-11182, and to avoid the disk usage explosion when having to 
resort to {{CONTAINS}}. See https://github.com/openzipkin/zipkin/issues/1861


> SASI tokenizer for simple delimiter based entries
> -
>
> Key: CASSANDRA-14247
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14247
> Project: Cassandra
>  Issue Type: Improvement
>  Components: sasi
>Reporter: mck
>Assignee: mck
>Priority: Major
> Fix For: 4.0, 3.11.x
>
>
> Currently SASI offers only two tokenizer options:
>  - NonTokenizerAnalyser
>  - StandardAnalyzer
> The latter is built upon Snowball, powerful for human languages but overkill 
> for simple tokenization.
> A simple tokenizer is proposed here. The need for this arose as a workaround 
> around CASSANDRA-11182, and to avoid the disk usage explosion when having to 
> resort to {{CONTAINS}}. See https://github.com/openzipkin/zipkin/issues/1861
> Example use of this would be:
> {code}
> CREATE CUSTOM INDEX span_annotation_query_idx 
> ON zipkin2.span (annotation_query) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' 
> WITH OPTIONS = {
> 'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.DelimiterTokenizerAnalyzer', 
> 'delimiter': '░',
> 'case_sensitive': 'true', 
> 'mode': 'prefix', 
> 'analyzed': 'true'};
> {code}






[jira] [Updated] (CASSANDRA-14247) SASI tokenizer for simple delimiter based entries

2018-02-25 Thread mck (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck updated CASSANDRA-14247:

Description: 
Currently SASI offers only two tokenizer options:
 - NonTokenizerAnalyser
 - StandardAnalyzer

The latter is built upon Snowball, powerful for human languages but overkill 
for simple tokenization.

A simple tokenizer is proposed here. The need for this arose as a workaround of 
CASSANDRA-11182, and to avoid the disk usage explosion when having to resort to 
{{CONTAINS}}. See https://github.com/openzipkin/zipkin/issues/1861

Example use of this would be:
{code}
CREATE CUSTOM INDEX span_annotation_query_idx 
ON zipkin2.span (annotation_query) USING 
'org.apache.cassandra.index.sasi.SASIIndex' 
WITH OPTIONS = {
'analyzer_class': 
'org.apache.cassandra.index.sasi.analyzer.DelimiterAnalyzer', 
'delimiter': '░',
'case_sensitive': 'true', 
'mode': 'prefix', 
'analyzed': 'true'};
{code}

Original credit for this work goes to https://github.com/zuochangan

  was:
Currently SASI offers only two tokenizer options:
 - NonTokenizerAnalyser
 - StandardAnalyzer

The latter is built upon Snowball, powerful for human languages but overkill 
for simple tokenization.

A simple tokenizer is proposed here. The need for this arose as a workaround of 
CASSANDRA-11182, and to avoid the disk usage explosion when having to resort to 
{{CONTAINS}}. See https://github.com/openzipkin/zipkin/issues/1861

Example use of this would be:
{code}
CREATE CUSTOM INDEX span_annotation_query_idx 
ON zipkin2.span (annotation_query) USING 
'org.apache.cassandra.index.sasi.SASIIndex' 
WITH OPTIONS = {
'analyzer_class': 
'org.apache.cassandra.index.sasi.analyzer.DelimiterTokenizerAnalyzer', 
'delimiter': '░',
'case_sensitive': 'true', 
'mode': 'prefix', 
'analyzed': 'true'};
{code}

Original credit for this work goes to https://github.com/zuochangan


> SASI tokenizer for simple delimiter based entries
> -
>
> Key: CASSANDRA-14247
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14247
> Project: Cassandra
>  Issue Type: Improvement
>  Components: sasi
>Reporter: mck
>Assignee: mck
>Priority: Major
> Fix For: 4.0, 3.11.x
>
>
> Currently SASI offers only two tokenizer options:
>  - NonTokenizerAnalyser
>  - StandardAnalyzer
> The latter is built upon Snowball, powerful for human languages but overkill 
> for simple tokenization.
> A simple tokenizer is proposed here. The need for this arose as a workaround 
> of CASSANDRA-11182, and to avoid the disk usage explosion when having to 
> resort to {{CONTAINS}}. See https://github.com/openzipkin/zipkin/issues/1861
> Example use of this would be:
> {code}
> CREATE CUSTOM INDEX span_annotation_query_idx 
> ON zipkin2.span (annotation_query) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' 
> WITH OPTIONS = {
> 'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.DelimiterAnalyzer', 
> 'delimiter': '░',
> 'case_sensitive': 'true', 
> 'mode': 'prefix', 
> 'analyzed': 'true'};
> {code}
> Original credit for this work goes to https://github.com/zuochangan






[jira] [Commented] (CASSANDRA-14247) SASI tokenizer for simple delimiter based entries

2018-02-25 Thread mck (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16376253#comment-16376253
 ] 

mck commented on CASSANDRA-14247:
-

[~mkjellman], I have force-pushed the branch again. (Let me know if you would 
prefer that I add checkpoint commits instead of overwriting the existing commit.)

This adds the test file {{test/resources/tokenization/world_cities_a.csv}}, and 
a unit test to match. The other unit test methods have been updated to use 
different delimiters as appropriate for the existing test data files.

Example corridor testing…
{code:sql}
CREATE TABLE test ( one text, two int, three text, PRIMARY KEY (one, two) );

-- Insert a new row, with the contents of
-- test/resources/tokenization/world_cities_a.csv going into column 'three'.

CREATE CUSTOM INDEX ON test (three) USING 
'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = { 'analyzer_class': 
'org.apache.cassandra.index.sasi.analyzer.DelimiterAnalyzer', 'delimiter': ',', 
'mode': 'prefix', 'analyzed': 'true'};

SELECT one, two FROM test WHERE three LIKE 'azzazl' ALLOW FILTERING;
{code}

> SASI tokenizer for simple delimiter based entries
> -
>
> Key: CASSANDRA-14247
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14247
> Project: Cassandra
>  Issue Type: Improvement
>  Components: sasi
>Reporter: mck
>Assignee: mck
>Priority: Major
> Fix For: 4.0, 3.11.x
>
>
> Currently SASI offers only two tokenizer options:
>  - NonTokenizerAnalyser
>  - StandardAnalyzer
> The latter is built upon Snowball, powerful for human languages but overkill 
> for simple tokenization.
> A simple tokenizer is proposed here. The need for this arose as a workaround 
> of CASSANDRA-11182, and to avoid the disk usage explosion when having to 
> resort to {{CONTAINS}}. See https://github.com/openzipkin/zipkin/issues/1861
> Example use of this would be:
> {code}
> CREATE CUSTOM INDEX span_annotation_query_idx 
> ON zipkin2.span (annotation_query) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' 
> WITH OPTIONS = {
> 'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.DelimiterAnalyzer', 
> 'delimiter': '░',
> 'case_sensitive': 'true', 
> 'mode': 'prefix', 
> 'analyzed': 'true'};
> {code}
> Original credit for this work goes to https://github.com/zuochangan






[jira] [Comment Edited] (CASSANDRA-14247) SASI tokenizer for simple delimiter based entries

2018-02-25 Thread mck (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16376253#comment-16376253
 ] 

mck edited comment on CASSANDRA-14247 at 2/25/18 9:30 PM:
--

[~mkjellman], I have force-pushed the branch again. (Let me know if you would 
prefer that I add checkpoint commits instead of overwriting the existing commit.)

This adds the test file {{test/resources/tokenization/world_cities_a.csv}}, and 
a unit test to match. The other unit test methods have been updated to use 
different delimiters as appropriate for the existing test data files.

Example corridor testing…
{code:sql}
CREATE TABLE test ( one text, two int, three text, PRIMARY KEY (one, two) );

-- Insert a new row, with the contents of
-- test/resources/tokenization/world_cities_a.csv going into column 'three'.

CREATE CUSTOM INDEX ON test (three) USING 
'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = { 'analyzer_class': 
'org.apache.cassandra.index.sasi.analyzer.DelimiterAnalyzer', 'delimiter': ',', 
'mode': 'prefix', 'analyzed': 'true'};

SELECT one, two FROM test WHERE three LIKE 'azzazl' ALLOW FILTERING;
{code}

Aside: this tokenizer raises the need for an "exact" mode. Querying a CSV inside 
a column like this is one example where the user may never need a wildcard 
LIKE clause (using %), and an 'exact' mode would be significantly more 
performant and use less disk.


was (Author: michaelsembwever):
[~mkjellman], have forced pushed the branch again. (let me know if you want to 
be adding checkpoint commits rather than overwriting the existing commit.)

This adds the test file {{test/resources/tokenization/world_cities_a.csv}}, and 
a unit test to match. The other unit test methods have been updated to use 
different delimiters as appropriate for the existing test data files.

Example corridor testing…
{code:java}
create table test ( one text, two int, three text, PRIMARY KEY (one,two) );

# insert a new row, with the contents of 
test/resources/tokenization/world_cities_a.csv going into column 'three'.

create CUSTOM INDEX on test (three) USING 
'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = { 'analyzer_class': 
'org.apache.cassandra.index.sasi.analyzer.DelimiterAnalyzer', 'delimiter': ',', 
'mode': 'prefix', 'analyzed': 'true'};

select one,two from test where three LIKE 'azzazl' ALLOW FILTERING;
{code}

> SASI tokenizer for simple delimiter based entries
> -
>
> Key: CASSANDRA-14247
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14247
> Project: Cassandra
>  Issue Type: Improvement
>  Components: sasi
>Reporter: mck
>Assignee: mck
>Priority: Major
> Fix For: 4.0, 3.11.x
>
>
> Currently SASI offers only two tokenizer options:
>  - NonTokenizerAnalyser
>  - StandardAnalyzer
> The latter is built upon Snowball, powerful for human languages but overkill 
> for simple tokenization.
> A simple tokenizer is proposed here. The need for this arose as a workaround 
> of CASSANDRA-11182, and to avoid the disk usage explosion when having to 
> resort to {{CONTAINS}}. See https://github.com/openzipkin/zipkin/issues/1861
> Example use of this would be:
> {code}
> CREATE CUSTOM INDEX span_annotation_query_idx 
> ON zipkin2.span (annotation_query) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' 
> WITH OPTIONS = {
> 'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.DelimiterAnalyzer', 
> 'delimiter': '░',
> 'case_sensitive': 'true', 
> 'mode': 'prefix', 
> 'analyzed': 'true'};
> {code}
> Original credit for this work goes to https://github.com/zuochangan






[jira] [Updated] (CASSANDRA-11075) Consider making SASI the default index implementation

2018-02-25 Thread mck (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck updated CASSANDRA-11075:

Labels: sasi  (was: )

> Consider making SASI the default index implementation
> -
>
> Key: CASSANDRA-11075
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11075
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Sylvain Lebresne
>Assignee: Pavel Yaskevich
>Priority: Major
>  Labels: sasi
>
> We now have 2 secondary index implementations in tree: the old native ones and 
> SASI. Moving forward, that feels like one too many to maintain, especially 
> since it seems that SASI is an overall better implementation.
> So we should gather enough data to decide if SASI is indeed always better (or 
> at least sufficiently better that we're convinced no-one would want to stick 
> with the native implementation), and if that's the case, we should consider 
> making it the default (and ultimately get rid of the current implementation).
> So first, we should at least:
> # double check that SASI handles all cases that the native implementation 
> handles. A good start would probably be to run all our dtests and utests on a 
> version where SASI is hard-coded as default.
> # compare the performance of SASI and native indexes. In particular our 
> native indexes, for all their weaknesses, have the advantage of not doing a 
> read-before-write. I haven't looked at SASI much so I don't know if that's 
> the case there, but either way we need numbers on both reads and writes.
> Once we have that, if we do decide to make SASI the default, then we need to 
> figure out what the upgrade path is (and whether we add extra syntax for 
> SASI-specific options).






[jira] [Comment Edited] (CASSANDRA-14247) SASI tokenizer for simple delimiter based entries

2018-02-25 Thread mck (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16376253#comment-16376253
 ] 

mck edited comment on CASSANDRA-14247 at 2/26/18 12:35 AM:
---

[~mkjellman], I have force-pushed the branch again. (Let me know if you would 
prefer that I add checkpoint commits instead of overwriting the existing commit.)

This adds the test file {{test/resources/tokenization/world_cities_a.csv}}, and 
a unit test to match. The other unit test methods have been updated to use 
different delimiters as appropriate for the existing test data files.

Example corridor testing…
{code:sql}
CREATE TABLE test ( one text, two int, three text, PRIMARY KEY (one, two) );

-- Insert a new row, with the contents of
-- test/resources/tokenization/world_cities_a.csv going into column 'three'.

CREATE CUSTOM INDEX ON test (three) USING 
'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = { 'analyzer_class': 
'org.apache.cassandra.index.sasi.analyzer.DelimiterAnalyzer', 'delimiter': ',', 
'mode': 'prefix', 'analyzed': 'true'};

SELECT one, two FROM test WHERE three LIKE 'azzazl' ALLOW FILTERING;
{code}

Aside: this tokenizer raises the need for an "exact" mode. Querying a CSV inside 
a column like this is one example where the user may never need a wildcard 
LIKE clause (using %), and an 'exact' mode would be significantly more 
performant and use less disk. (By the way, I suspect that {{is_literal: false}} 
would have the same impact as an 'exact' mode…)


was (Author: michaelsembwever):
[~mkjellman], have forced pushed the branch again. (let me know if you want to 
be adding checkpoint commits rather than overwriting the existing commit.)

This adds the test file {{test/resources/tokenization/world_cities_a.csv}}, and 
a unit test to match. The other unit test methods have been updated to use 
different delimiters as appropriate for the existing test data files.

Example corridor testing…
{code:java}
create table test ( one text, two int, three text, PRIMARY KEY (one,two) );

# insert a new row, with the contents of 
test/resources/tokenization/world_cities_a.csv going into column 'three'.

create CUSTOM INDEX on test (three) USING 
'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = { 'analyzer_class': 
'org.apache.cassandra.index.sasi.analyzer.DelimiterAnalyzer', 'delimiter': ',', 
'mode': 'prefix', 'analyzed': 'true'};

select one,two from test where three LIKE 'azzazl' ALLOW FILTERING;
{code}

Aside: this tokenizer raises the need for a "exact" mode. Querying a csv inside 
a column like this is one example where the user may never require wildcarding 
LIKE clause (using %) and an 'exact' mode would be significantly more 
performant and use less disk.

> SASI tokenizer for simple delimiter based entries
> -
>
> Key: CASSANDRA-14247
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14247
> Project: Cassandra
>  Issue Type: Improvement
>  Components: sasi
>Reporter: mck
>Assignee: mck
>Priority: Major
> Fix For: 4.0, 3.11.x
>
>
> Currently SASI offers only two tokenizer options:
>  - NonTokenizerAnalyser
>  - StandardAnalyzer
> The latter is built upon Snowball, powerful for human languages but overkill 
> for simple tokenization.
> A simple tokenizer is proposed here. The need for this arose as a workaround 
> of CASSANDRA-11182, and to avoid the disk usage explosion when having to 
> resort to {{CONTAINS}}. See https://github.com/openzipkin/zipkin/issues/1861
> Example use of this would be:
> {code}
> CREATE CUSTOM INDEX span_annotation_query_idx 
> ON zipkin2.span (annotation_query) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' 
> WITH OPTIONS = {
> 'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.DelimiterAnalyzer', 
> 'delimiter': '░',
> 'case_sensitive': 'true', 
> 'mode': 'prefix', 
> 'analyzed': 'true'};
> {code}
> Original credit for this work goes to https://github.com/zuochangan






[jira] [Comment Edited] (CASSANDRA-14247) SASI tokenizer for simple delimiter based entries

2018-02-26 Thread mck (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16376253#comment-16376253
 ] 

mck edited comment on CASSANDRA-14247 at 2/26/18 9:29 AM:
--

[~mkjellman], I have force-pushed the branch again. (Let me know if you would 
prefer that I add checkpoint commits instead of overwriting the existing commit.)

This adds the test file {{test/resources/tokenization/world_cities_a.csv}}, and 
a unit test to match. The other unit test methods have been updated to use 
different delimiters as appropriate for the existing test data files.

Example {{cqlsh}} corridor testing…
{code:sql}
CREATE TABLE test ( one text, two int, three text, PRIMARY KEY (one, two) );

-- Insert a new row, with the contents of
-- test/resources/tokenization/world_cities_a.csv going into column 'three'.

CREATE CUSTOM INDEX ON test (three) USING 
'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = { 'analyzer_class': 
'org.apache.cassandra.index.sasi.analyzer.DelimiterAnalyzer', 'delimiter': ',', 
'mode': 'prefix', 'analyzed': 'true'};

SELECT one, two FROM test WHERE three LIKE 'azzazl' ALLOW FILTERING;
{code}

Aside: this tokenizer raises the need for an "exact" mode. Querying a CSV inside 
a column like this is one example where the user may never need a wildcard 
LIKE clause (using %), and an 'exact' mode would be significantly more 
performant and use less disk. (By the way, I suspect that {{is_literal: false}} 
would have the same impact as an 'exact' mode…)
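
To make the distinction in the aside concrete, here is a small Java sketch of the two lookup semantics over an already tokenized cell: an exact token match (what a LIKE without % asks for) versus a prefix match (what a trailing % asks for). It models query semantics only and says nothing about how SASI stores or searches its index; the class name `TokenMatchSketch`, both method names, and the sample tokens are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative only: models matching over tokens, not SASI's index structures.
final class TokenMatchSketch {

    // Models a LIKE 'term' query (no wildcard): some token equals the term.
    static boolean matchesExact(List<String> tokens, String term) {
        return tokens.contains(term);
    }

    // Models a LIKE 'term%' query: some token starts with the term.
    static boolean matchesPrefix(List<String> tokens, String prefix) {
        for (String token : tokens) {
            if (token.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Hypothetical tokens from one comma-delimited cell; not taken from
        // the actual world_cities_a.csv test file.
        List<String> tokens = Arrays.asList("aachen", "azzazl", "aarhus");
        System.out.println(matchesExact(tokens, "azzazl"));
        System.out.println(matchesExact(tokens, "azz"));
        System.out.println(matchesPrefix(tokens, "azz"));
    }
}
```

An index that only ever has to answer the first kind of question needs to support equality on whole tokens, which is the intuition behind the suggestion that an 'exact' mode could be cheaper than 'prefix'.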


was (Author: michaelsembwever):
[~mkjellman], have forced pushed the branch again. (let me know if you want to 
be adding checkpoint commits rather than overwriting the existing commit.)

This adds the test file {{test/resources/tokenization/world_cities_a.csv}}, and 
a unit test to match. The other unit test methods have been updated to use 
different delimiters as appropriate for the existing test data files.

Example corridor testing…
{code:java}
create table test ( one text, two int, three text, PRIMARY KEY (one,two) );

# insert a new row, with the contents of 
test/resources/tokenization/world_cities_a.csv going into column 'three'.

create CUSTOM INDEX on test (three) USING 
'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = { 'analyzer_class': 
'org.apache.cassandra.index.sasi.analyzer.DelimiterAnalyzer', 'delimiter': ',', 
'mode': 'prefix', 'analyzed': 'true'};

select one,two from test where three LIKE 'azzazl' ALLOW FILTERING;
{code}

Aside: this tokenizer raises the need for an "exact" mode. Querying a csv inside 
a column like this is one example where the user may never require a wildcarding 
LIKE clause (using %), and an 'exact' mode would be significantly more 
performant and use less disk. (btw I'm suspecting that {{is_literal: false}} 
would have the same impact as an 'exact' mode…)

> SASI tokenizer for simple delimiter based entries
> -
>
> Key: CASSANDRA-14247
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14247
> Project: Cassandra
>  Issue Type: Improvement
>  Components: sasi
>Reporter: mck
>Assignee: mck
>Priority: Major
> Fix For: 4.0, 3.11.x
>
>
> Currently SASI offers only two tokenizer options:
>  - NonTokenizingAnalyzer
>  - StandardAnalyzer
> The latter is built upon Snowball, powerful for human languages but overkill 
> for simple tokenization.
> A simple tokenizer is proposed here. The need for this arose as a workaround 
> of CASSANDRA-11182, and to avoid the disk usage explosion when having to 
> resort to {{CONTAINS}}. See https://github.com/openzipkin/zipkin/issues/1861
> Example use of this would be:
> {code}
> CREATE CUSTOM INDEX span_annotation_query_idx 
> ON zipkin2.span (annotation_query) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' 
> WITH OPTIONS = {
> 'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.DelimiterAnalyzer', 
> 'delimiter': '░',
> 'case_sensitive': 'true', 
> 'mode': 'prefix', 
> 'analyzed': 'true'};
> {code}
> Original credit for this work goes to https://github.com/zuochangan
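
As a rough illustration of the delimiter-based tokenization proposed above, here is a minimal Java sketch. It is not the DelimiterAnalyzer from the patch: the class name DelimiterTokenizerSketch and the decision to skip empty tokens are assumptions made for this example only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not the DelimiterAnalyzer shipped with Cassandra.
final class DelimiterTokenizerSketch
{
    // Splits the input on a single delimiter character.
    // Assumption: empty tokens (adjacent or trailing delimiters) are skipped.
    static List<String> tokenize(String input, char delimiter)
    {
        List<String> tokens = new ArrayList<>();
        int start = 0;
        for (int i = 0; i <= input.length(); i++)
        {
            // The end of the input is treated like one final delimiter.
            if (i == input.length() || input.charAt(i) == delimiter)
            {
                if (i > start)
                    tokens.add(input.substring(start, i));
                start = i + 1;
            }
        }
        return tokens;
    }
}
```

Tokenizing a value such as a░b░░c with '░' as the delimiter yields the three terms a, b and c. No stemming or language analysis is involved, which is the point of the proposal.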



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Resolved] (CASSANDRA-14191) Bootstrap/Streaming fails with missing CompressionInfo

2018-02-23 Thread mck (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck resolved CASSANDRA-14191.
-
Resolution: Cannot Reproduce

Closing this ticket as 'cannot reproduce', as I doubt more information on it 
will arise.

If it does, or anyone has any thoughts or suspicions about it, please do 
re-open the ticket and speak up.

> Bootstrap/Streaming fails with missing CompressionInfo
> --
>
> Key: CASSANDRA-14191
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14191
> Project: Cassandra
>  Issue Type: Bug
>  Components: Streaming and Messaging
>Reporter: mck
>Priority: Major
>
> Multiple attempts at bootstrapping a new node fail, with streaming failing 
> (either hanging or stopping the bootstrap node) always from the same node.
>  
> The original node throws the following exception during the streaming process:
> {noformat}
> ERROR [STREAM-OUT-/10.83.74.236:47220] 2018-01-24 19:25:22,532 
> StreamSession.java:512 - [Stream #90c1c8b0-013a-11e8-b5f0-9323de372ca2] 
> Streaming error occurred on session with peer X.X.X.X
> java.lang.AssertionError: null
>   at 
> org.apache.cassandra.io.compress.CompressionMetadata$Chunk.<init>(CompressionMetadata.java:473)
>  ~[cassandra-all-2.1.18.1463.jar:2.1.18.1463]
>   at 
> org.apache.cassandra.io.compress.CompressionMetadata.getChunksForSections(CompressionMetadata.java:287)
>  ~[cassandra-all-2.1.18.1463.jar:2.1.18.1463]
>   at 
> org.apache.cassandra.streaming.messages.FileMessageHeader$FileMessageHeaderSerializer.serialize(FileMessageHeader.java:172)
>  ~[cassandra-all-2.1.18.1463.jar:2.1.18.1463]
>   at 
> org.apache.cassandra.streaming.messages.OutgoingFileMessage.serialize(OutgoingFileMessage.java:82)
>  ~[cassandra-all-2.1.18.1463.jar:2.1.18.1463]
>   at 
> org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:49)
>  ~[cassandra-all-2.1.18.1463.jar:2.1.18.1463]
>   at 
> org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:41)
>  ~[cassandra-all-2.1.18.1463.jar:2.1.18.1463]
>   at 
> org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:45)
>  ~[cassandra-all-2.1.18.1463.jar:2.1.18.1463]
>   at 
> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:377)
>  ~[cassandra-all-2.1.18.1463.jar:2.1.18.1463]
>   at 
> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:349)
>  ~[cassandra-all-2.1.18.1463.jar:2.1.18.1463]
>   at java.lang.Thread.run(Thread.java:748) [na:1.8.0_151]
> {noformat}
> The bootstrapping node's reaction to this failure is
> {noformat}
> ERROR [STREAM-IN-/10.83.74.234:7001] 2018-01-24 19:25:22,957 
> StreamSession.java:512 - [Stream #90c1c8b0-013a-11e8-b5f0-9323de372ca2] 
> Streaming error occurred on session with peer X.X.X.X
> java.io.EOFException: null
>   at java.io.DataInputStream.readInt(DataInputStream.java:392) 
> ~[na:1.8.0_151]
>   at 
> org.apache.cassandra.streaming.compress.CompressionInfo$CompressionInfoSerializer.deserialize(CompressionInfo.java:68)
>  ~[cassandra-all-2.1.18.1463.jar:2.1.18.1463]
>   at 
> org.apache.cassandra.streaming.compress.CompressionInfo$CompressionInfoSerializer.deserialize(CompressionInfo.java:47)
>  ~[cassandra-all-2.1.18.1463.jar:2.1.18.1463]
>   at 
> org.apache.cassandra.streaming.messages.FileMessageHeader$FileMessageHeaderSerializer.deserialize(FileMessageHeader.java:188)
>  ~[cassandra-all-2.1.18.1463.jar:2.1.18.1463]
>   at 
> org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:42)
>  ~[cassandra-all-2.1.18.1463.jar:2.1.18.1463]
>   at 
> org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:38)
>  ~[cassandra-all-2.1.18.1463.jar:2.1.18.1463]
>   at 
> org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:56)
>  ~[cassandra-all-2.1.18.1463.jar:2.1.18.1463]
>   at 
> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:276)
>  ~[cassandra-all-2.1.18.1463.jar:2.1.18.1463]
>   at java.lang.Thread.run(Thread.java:748) [na:1.8.0_151]
> {noformat}
> Other observations:
>  - always the one node that fails,
>  - multiple bootstrap attempts (using different ec2 instances) all fail,
>  - the exception occurs to {{\-tmp-}} sstables that have no CompressionInfo 
> component,
>  - it's a different {{\-tmp-}} sstable each time,
>  - running either {{nodetool cleanup}} or {{nodetool scrub}} made no 
> difference,





[jira] [Updated] (CASSANDRA-14444) Got NPE when querying Cassandra 3.11.2

2018-06-18 Thread mck (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck updated CASSANDRA-14444:

Reproduced In: 3.11.2
Since Version: 3.11.2

> Got NPE when querying Cassandra 3.11.2
> --
>
> Key: CASSANDRA-14444
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14444
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL
> Environment: Ubuntu 14.04, JDK 1.8.0_171. 
> Cassandra 3.11.2
>Reporter: Xiaodong Xie
>Priority: Blocker
>
> We just upgraded our Cassandra cluster from 2.2.6 to 3.11.2
> After upgrading, we immediately got exceptions in Cassandra like this one: 
>  
> {code}
> ERROR [Native-Transport-Requests-1] 2018-05-11 17:10:21,994 
> QueryMessage.java:129 - Unexpected error during query
> java.lang.NullPointerException: null
> at 
> org.apache.cassandra.dht.RandomPartitioner.getToken(RandomPartitioner.java:248)
>  ~[apache-cassandra-3.11.2.jar:3.11.2]
> at 
> org.apache.cassandra.dht.RandomPartitioner.decorateKey(RandomPartitioner.java:92)
>  ~[apache-cassandra-3.11.2.jar:3.11.2]
> at org.apache.cassandra.config.CFMetaData.decorateKey(CFMetaData.java:666) 
> ~[apache-cassandra-3.11.2.jar:3.11.2]
> at 
> org.apache.cassandra.service.pager.PartitionRangeQueryPager.<init>(PartitionRangeQueryPager.java:44)
>  ~[apache-cassandra-3.11.2.jar:3.11.2]
> at 
> org.apache.cassandra.db.PartitionRangeReadCommand.getPager(PartitionRangeReadCommand.java:268)
>  ~[apache-cassandra-3.11.2.jar:3.11.2]
> at 
> org.apache.cassandra.cql3.statements.SelectStatement.getPager(SelectStatement.java:475)
>  ~[apache-cassandra-3.11.2.jar:3.11.2]
> at 
> org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:288)
>  ~[apache-cassandra-3.11.2.jar:3.11.2]
> at 
> org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:118)
>  ~[apache-cassandra-3.11.2.jar:3.11.2]
> at 
> org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:224)
>  ~[apache-cassandra-3.11.2.jar:3.11.2]
> at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:255) 
> ~[apache-cassandra-3.11.2.jar:3.11.2]
> at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:240) 
> ~[apache-cassandra-3.11.2.jar:3.11.2]
> at 
> org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:116)
>  ~[apache-cassandra-3.11.2.jar:3.11.2]
> at 
> org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:517)
>  [apache-cassandra-3.11.2.jar:3.11.2]
> at 
> org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:410)
>  [apache-cassandra-3.11.2.jar:3.11.2]
> at 
> io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>  [netty-all-4.0.44.Final.jar:4.0.44.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
>  [netty-all-4.0.44.Final.jar:4.0.44.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:35)
>  [netty-all-4.0.44.Final.jar:4.0.44.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:348)
>  [netty-all-4.0.44.Final.jar:4.0.44.Final]
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> [na:1.8.0_171]
> at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162)
>  [apache-cassandra-3.11.2.jar:3.11.2]
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) 
> [apache-cassandra-3.11.2.jar:3.11.2]
> at java.lang.Thread.run(Thread.java:748) [na:1.8.0_171]
> {code}
>  
> The table schema is like:
> {code}
> CREATE TABLE example.example_table (
>  id bigint,
>  hash text,
>  json text,
>  PRIMARY KEY (id, hash)
> ) WITH COMPACT STORAGE
> {code}
>  
> The query is something like:
> {code}
> "select * from example.example_table;" // (We do know this is bad practise, 
> and we are trying to fix that right now)
> {code}
> with fetch-size as 200, using DataStax Java driver. 
> This table contains about 20k rows. 
>  
> Actually, the fix is quite simple, 
>  
> {code}
> --- a/src/java/org/apache/cassandra/service/pager/PagingState.java
> +++ b/src/java/org/apache/cassandra/service/pager/PagingState.java
> @@ -46,7 +46,7 @@ public class PagingState
> public PagingState(ByteBuffer partitionKey, RowMark rowMark, int remaining, 
> int remainingInPartition)
>  {
> - this.partitionKey = partitionKey;
> + this.partitionKey = partitionKey == null ? ByteBufferUtil.EMPTY_BYTE_BUFFER 
> : partitionKey;
>  this.rowMark = rowMark;
>  this.remaining = remaining;
>  this.remainingInPartition = remainingInPartition;
> {code}
>  
> "partitionKey == null ? ByteBufferUtil.EMPTY_BYTE_BUFFER : 
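
The guard in the proposed patch follows a common pattern: normalise a null buffer to an empty one at construction time, so that later code never dereferences null. A minimal, self-contained sketch of that pattern follows. PagingKeySketch and its EMPTY constant are hypothetical stand-ins for PagingState and ByteBufferUtil.EMPTY_BYTE_BUFFER. Whether an empty key is the right value for every caller is not something this sketch establishes.

```java
import java.nio.ByteBuffer;

// Hypothetical stand-in for the constructor touched by the patch.
final class PagingKeySketch
{
    // Stand-in for ByteBufferUtil.EMPTY_BYTE_BUFFER (assumption for this example).
    private static final ByteBuffer EMPTY = ByteBuffer.allocate(0);

    final ByteBuffer partitionKey;

    PagingKeySketch(ByteBuffer partitionKey)
    {
        // Replace null with an empty buffer so callers can always read the field.
        // duplicate() gives each instance its own position and limit.
        this.partitionKey = partitionKey == null ? EMPTY.duplicate() : partitionKey;
    }

    // True when a real (non-empty) key was supplied.
    boolean hasPartitionKey()
    {
        return partitionKey.hasRemaining();
    }
}
```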

[jira] [Updated] (CASSANDRA-14515) Short read protection in presence of almost-purgeable range tombstones may cause permanent data loss

2018-06-18 Thread mck (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck updated CASSANDRA-14515:

Priority: Blocker  (was: Major)

> Short read protection in presence of almost-purgeable range tombstones may 
> cause permanent data loss
> 
>
> Key: CASSANDRA-14515
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14515
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Aleksey Yeschenko
>Assignee: Aleksey Yeschenko
>Priority: Blocker
> Fix For: 3.0.x, 3.11.x, 4.0.x
>
>
> Because read responses don't necessarily close their open RT bounds, it's 
> possible to lose data during short read protection, if a closing bound is 
> compacted away between two adjacent reads from a node.






[jira] [Commented] (CASSANDRA-14563) Add animalsniffer to build to ensure runtime jdk compatbility

2018-07-27 Thread mck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560601#comment-16560601
 ] 

mck commented on CASSANDRA-14563:
-

{quote}one of the alternate approaches (to animal sniffer), to add java7 
specific "build_test" workflow in addition to java8 workflow, in circleCI 
config.{quote}

Sure, but it's an expensive way for us to support jdk1.7 at runtime, as afaik 
we have no actual need to support compiling with jdk1.7.
And AnimalSniffer catches any incompatibilities immediately at compile-time. 
(Our feedback loop from the CI systems is not ideal atm.)
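
To make the concern concrete, here is a small Java sketch of the kind of mistake a signature check is meant to catch. The class and method names are made up for this example. String.join was added in Java 8, so javac from JDK 8 accepts the first method even with -source 1.7 -target 1.7 (unless a JDK 7 boot classpath is supplied), and the call then fails on a Java 7 runtime.

```java
import java.util.List;

// Hypothetical example class; not taken from the Cassandra code base.
final class Jdk7CompatSketch
{
    // Uses String.join, which only exists from Java 8 onwards.
    // On a Java 7 runtime this call fails with NoSuchMethodError.
    static String joinJava8(List<String> parts)
    {
        return String.join(",", parts);
    }

    // Same result using only APIs that are present in Java 7.
    static String joinJava7(List<String> parts)
    {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < parts.size(); i++)
        {
            if (i > 0)
                sb.append(',');
            sb.append(parts.get(i));
        }
        return sb.toString();
    }
}
```

Both methods return the same string on Java 8 and later. Only a check of the compiled classes against the Java 7 API signatures, or a run on a real Java 7 JVM, reveals the difference.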


> Add animalsniffer to build to ensure runtime jdk compatbility
> -
>
> Key: CASSANDRA-14563
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14563
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Build
>Reporter: mck
>Assignee: Sumanth Pasupuleti
>Priority: Minor
>  Labels: lhf
>
> Cassandra-2.2 still supports running on JDK1.7
> No tests check this though, as all build and test with JDK1.8
> Adding the ant animalsniffer task can check that jdk1.8 classes or methods 
> are not used accidentally.
> ref: http://www.mojohaus.org/animal-sniffer/animal-sniffer/index.html






[jira] [Commented] (CASSANDRA-14563) Add animalsniffer to build to ensure runtime jdk compatbility

2018-08-02 Thread mck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16567652#comment-16567652
 ] 

mck commented on CASSANDRA-14563:
-

{quote}One thing I like about the docker approach is that we really know that 
Cassandra will really work with that JVM. We can build with whichever JVMs we 
want to support building on and we can run the unit tests using whichever JVMs 
we claim to support running on.
{quote}
[~jolynch], I agree with this notion. It demonstrates the separation of 
concerns within jdk compatibility: source compatibility versus 
functional/behavioural compatibility.

We only (afaik so far) want to build with jdk8.
 We do want to test (more integration tests) against multiple JVMs.

AnimalSniffer allows us to drop concerns about source compatibility, and drop 
having to run integration tests over artefacts built with jdk7. That's kinda 
important because the community never publishes jdk7 built artefacts.

An improved testing environment would definitely be welcome (eg with docker) 
and integration tests can (should) be run over all jdk patch versions that we 
say we support (because functional compatibility is not just defined by major 
versions).

But the intention of this ticket was to introduce AnimalSniffer so as to deal with 
(and simplify) source and binary compatibility, and provide a much faster dev 
feedback loop than CircleCi ever could. As stated, I really don't think that 
there's ever a need to compile Cassandra with multiple JDKs; it doesn't give us 
anything and only adds an additional variable into the testing matrix.

While we should be doing more for functional testing against different JVM 
versions, that's really outside the scope of this ticket. 
[~sumanth.pasupuleti], would you mind creating a separate jira ticket for your 
pull request?

> Add animalsniffer to build to ensure runtime jdk compatbility
> -
>
> Key: CASSANDRA-14563
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14563
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Build
>Reporter: mck
>Assignee: Sumanth Pasupuleti
>Priority: Minor
>  Labels: lhf
>
> Cassandra-2.2 still supports running on JDK1.7
> No tests check this though, as all build and test with JDK1.8
> Adding the ant animalsniffer task can check that jdk1.8 classes or methods 
> are not used accidentally.
> ref: http://www.mojohaus.org/animal-sniffer/animal-sniffer/index.html






[jira] [Comment Edited] (CASSANDRA-14563) Add animalsniffer to build to ensure runtime jdk compatbility

2018-07-29 Thread mck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16561032#comment-16561032
 ] 

mck edited comment on CASSANDRA-14563 at 7/29/18 6:31 AM:
--

By "compile-time" I was thinking about local compile-time, before the developer 
does a 'git push'.

 - Providing a CircleCI build job ensures we support compiling Cassandra-2.2 with 
jdk1.7.
 - Using AnimalSniffer in the ant build ensures we support running Cassandra-2.2 
with jdk1.7.

These are not the same thing. And the latter is the specific requirement here.

While using CircleCI does solve both, it is a slower feedback loop for 
developers. Developers don't typically remember to *also* compile with jdk1.7 
locally before pushing their patch.

AnimalSniffer solves just the second problem, and provides a faster feedback 
loop for developers. 

And we do not need to support compiling with jdk1.7, and we do not cut releases 
with jdk1.7 afaik. 
With the use of AnimalSniffer we could altogether forbid building Cassandra-2.2 
with jdk1.7, making life simpler for us by taking a variable out of the 
building environment.

[~jasobrown], having added support for java11, what are your thoughts here? 
Should we be trying to make life simpler for ourselves by focusing on what 
runtime java versions we support while narrowing which versions we support 
building with?


was (Author: michaelsembwever):
By "compile-time" I was thinking about local compile-time, before the developer 
does a 'git push'.

 - Providing a CircleCI build job ensures we support compiling Cassandra-2.2 with 
jdk1.7.
 - Using AnimalSniffer in the ant build ensures we support running Cassandra-2.2 
with jdk1.7.

These are not the same thing. And the latter is the specific requirement here.

While using CircleCI does solve both, it is a slower feedback loop for 
developers. Developers don't typically remember to *also* compile with jdk1.7 
locally before pushing their patch.

AnimalSniffer solves just the second problem, and provides a faster feedback 
loop for developers. 

And we do not need to support compiling with jdk1.7, and we do no cut releases 
with jdk1.7 afaik. 
With the use of AnimalSniffer we could altogether forbid building Cassandra-2.2 
with jdk1.7, making life simpler for us by taking a variable out of the 
building environment.

[~jasobrown], having added support for java11, what are your thoughts here? 
Should we be trying to make life simpler for ourselves by focusing on what 
runtime java versions we support while narrowing which versions we support 
building with?

> Add animalsniffer to build to ensure runtime jdk compatbility
> -
>
> Key: CASSANDRA-14563
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14563
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Build
>Reporter: mck
>Assignee: Sumanth Pasupuleti
>Priority: Minor
>  Labels: lhf
>
> Cassandra-2.2 still supports running on JDK1.7
> No tests check this though, as all build and test with JDK1.8
> Adding the ant animalsniffer task can check that jdk1.8 classes or methods 
> are not used accidentally.
> ref: http://www.mojohaus.org/animal-sniffer/animal-sniffer/index.html






[jira] [Commented] (CASSANDRA-14563) Add animalsniffer to build to ensure runtime jdk compatbility

2018-07-29 Thread mck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16561032#comment-16561032
 ] 

mck commented on CASSANDRA-14563:
-

By "compile-time" I was thinking about local compile-time, before the developer 
does a 'git push'.

 - Providing a CircleCI build job ensures we support compiling Cassandra-2.2 with 
jdk1.7.
 - Using AnimalSniffer in the ant build ensures we support running Cassandra-2.2 
with jdk1.7.

These are not the same thing. And the latter is the specific requirement here.

While using CircleCI does solve both, it is a slower feedback loop for 
developers. Developers don't typically remember to *also* compile with jdk1.7 
locally before pushing their patch.

AnimalSniffer solves just the second problem, and provides a faster feedback 
loop for developers. 

And we do not need to support compiling with jdk1.7, and we do not cut releases 
with jdk1.7 afaik. 
With the use of AnimalSniffer we could altogether forbid building Cassandra-2.2 
with jdk1.7, making life simpler for us by taking a variable out of the 
building environment.

[~jasobrown], having added support for java11, what are your thoughts here? 
Should we be trying to make life simpler for ourselves by focusing on what 
runtime java versions we support while narrowing which versions we support 
building with?

> Add animalsniffer to build to ensure runtime jdk compatbility
> -
>
> Key: CASSANDRA-14563
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14563
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Build
>Reporter: mck
>Assignee: Sumanth Pasupuleti
>Priority: Minor
>  Labels: lhf
>
> Cassandra-2.2 still supports running on JDK1.7
> No tests check this though, as all build and test with JDK1.8
> Adding the ant animalsniffer task can check that jdk1.8 classes or methods 
> are not used accidentally.
> ref: http://www.mojohaus.org/animal-sniffer/animal-sniffer/index.html






[jira] [Created] (CASSANDRA-14632) cqlsh can't describe when index options contain unicode

2018-08-09 Thread mck (JIRA)
mck created CASSANDRA-14632:
---

 Summary: cqlsh can't describe when index options contain unicode 
 Key: CASSANDRA-14632
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14632
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: mck


The following `describe` fails in `cqlsh`:

 
{code:java}
$ cqlsh

cqlsh> CREATE KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 
'replication_factor': 1};

cqlsh> CREATE TABLE test.test ( one text, two int, three text, PRIMARY KEY (one,two) 
);

cqlsh> CREATE CUSTOM INDEX ON test.test (three) USING 
'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = { 'delimiter': '░'};

cqlsh> DESCRIBE KEYSPACE test ;

'ascii' codec can't decode byte 0xe2 in position 31: ordinal not in range(128)

{code}
 

Full error is 
{noformat}
Traceback (most recent call last):
  File "/usr/bin/cqlsh.py", line 919, in onecmd
self.handle_statement(st, statementtext)
  File "/usr/bin/cqlsh.py", line 956, in handle_statement
return custom_handler(parsed)
  File "/usr/bin/cqlsh.py", line 1539, in do_describe
self.describe_keyspace(ksname)
  File "/usr/bin/cqlsh.py", line 1275, in describe_keyspace
self.print_recreate_keyspace(self.get_keyspace_meta(ksname), sys.stdout)
  File "/usr/bin/cqlsh.py", line 1225, in print_recreate_keyspace
out.write(ksdef.export_as_string())
  File 
"/usr/share/cassandra/lib/cassandra-driver-internal-only-3.10.zip/cassandra-driver-3.10/cassandra/metadata.py",
 line 661, in export_as_string
+ [t.export_as_string() for t in self.tables.values()])
  File 
"/usr/share/cassandra/lib/cassandra-driver-internal-only-3.10.zip/cassandra-driver-3.10/cassandra/metadata.py",
 line 1116, in export_as_string
ret = self._all_as_cql()
  File 
"/usr/share/cassandra/lib/cassandra-driver-internal-only-3.10.zip/cassandra-driver-3.10/cassandra/metadata.py",
 line 1125, in _all_as_cql
ret += "\n%s;" % index.as_cql_query()
  File 
"/usr/share/cassandra/lib/cassandra-driver-internal-only-3.10.zip/cassandra-driver-3.10/cassandra/metadata.py",
 line 1402, in as_cql_query
ret += " WITH OPTIONS = %s" % 
Encoder().cql_encode_all_types(options)
{noformat}
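
The byte named in the error message can be checked independently of cqlsh. The sketch below is in Java only to stay consistent with the other examples in this thread (cqlsh itself is Python), and the class name is invented for illustration. It shows that the '░' delimiter (U+2591) cannot be represented in US-ASCII and that its UTF-8 encoding begins with 0xE2, which is consistent with the "can't decode byte 0xe2" message above.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, used only to inspect the delimiter's encoding.
final class DelimiterBytesSketch
{
    // The delimiter character from the index options, written as an escape
    // so the source file's own encoding does not matter.
    static final String DELIMITER = "\u2591";

    // UTF-8 bytes of the delimiter.
    static byte[] utf8()
    {
        return DELIMITER.getBytes(StandardCharsets.UTF_8);
    }

    // Whether the delimiter can be represented in US-ASCII at all.
    static boolean asciiEncodable()
    {
        return StandardCharsets.US_ASCII.newEncoder().canEncode(DELIMITER);
    }
}
```

The UTF-8 form is the three bytes E2 96 91, and asciiEncodable() returns false, so any code path that treats the option value as ASCII will fail on the first byte.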






[jira] [Updated] (CASSANDRA-14632) cqlsh can't describe when index options contain unicode

2018-08-09 Thread mck (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck updated CASSANDRA-14632:

Reproduced In: 3.11.3
Since Version: 3.11.3

> cqlsh can't describe when index options contain unicode 
> 
>
> Key: CASSANDRA-14632
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14632
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: mck
>Priority: Major
>
> The following `describe` fails in `cqlsh`:
>  
> {code:java}
> $ cqlsh
> cqlsh> CREATE KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 1};
> cqlsh> CREATE TABLE test.test ( one text, two int, three text, PRIMARY KEY 
> (one,two) );
> cqlsh> CREATE CUSTOM INDEX ON test.test (three) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = { 'delimiter': 
> '░'};
> cqlsh> DESCRIBE KEYSPACE test ;
> 'ascii' codec can't decode byte 0xe2 in position 31: ordinal not in range(128)
> {code}
>  
> Full error is 
> {noformat}
> Traceback (most recent call last):
>   File "/usr/bin/cqlsh.py", line 919, in onecmd
> self.handle_statement(st, statementtext)
>   File "/usr/bin/cqlsh.py", line 956, in handle_statement
> return custom_handler(parsed)
>   File "/usr/bin/cqlsh.py", line 1539, in do_describe
> self.describe_keyspace(ksname)
>   File "/usr/bin/cqlsh.py", line 1275, in describe_keyspace
> self.print_recreate_keyspace(self.get_keyspace_meta(ksname), sys.stdout)
>   File "/usr/bin/cqlsh.py", line 1225, in print_recreate_keyspace
> out.write(ksdef.export_as_string())
>   File 
> "/usr/share/cassandra/lib/cassandra-driver-internal-only-3.10.zip/cassandra-driver-3.10/cassandra/metadata.py",
>  line 661, in export_as_string
> + [t.export_as_string() for t in self.tables.values()])
>   File 
> "/usr/share/cassandra/lib/cassandra-driver-internal-only-3.10.zip/cassandra-driver-3.10/cassandra/metadata.py",
>  line 1116, in export_as_string
> ret = self._all_as_cql()
>   File 
> "/usr/share/cassandra/lib/cassandra-driver-internal-only-3.10.zip/cassandra-driver-3.10/cassandra/metadata.py",
>  line 1125, in _all_as_cql
> ret += "\n%s;" % index.as_cql_query()
>   File 
> "/usr/share/cassandra/lib/cassandra-driver-internal-only-3.10.zip/cassandra-driver-3.10/cassandra/metadata.py",
>  line 1402, in as_cql_query
> ret += " WITH OPTIONS = %s" % 
> Encoder().cql_encode_all_types(options)
> {noformat}






[jira] [Commented] (CASSANDRA-13457) Diag. Events: Add base classes

2018-08-16 Thread mck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16583319#comment-16583319
 ] 

mck commented on CASSANDRA-13457:
-

My +1 stands. 

> Diag. Events: Add base classes
> --
>
> Key: CASSANDRA-13457
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13457
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Core, Observability
>Reporter: Stefan Podkowinski
>Assignee: Stefan Podkowinski
>Priority: Major
>
> Base ticket for adding classes that will allow you to implement and subscribe 
> to events.
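
For readers unfamiliar with the idea, "implement and subscribe to events" can be pictured with a very small publish/subscribe sketch in Java. This is not the API added by the ticket: EventBusSketch, subscribe and publish are hypothetical names, and matching on the exact event class is an assumption made to keep the example short.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical sketch of a type-keyed event bus; not Cassandra's implementation.
final class EventBusSketch
{
    private final Map<Class<?>, List<Consumer<Object>>> subscribers = new ConcurrentHashMap<>();

    // Registers a consumer for events of exactly the given class.
    <E> void subscribe(Class<E> type, Consumer<? super E> consumer)
    {
        subscribers.computeIfAbsent(type, k -> new CopyOnWriteArrayList<>())
                   .add(event -> consumer.accept(type.cast(event)));
    }

    // Delivers the event to every consumer registered for its class.
    void publish(Object event)
    {
        List<Consumer<Object>> consumers = subscribers.get(event.getClass());
        if (consumers != null)
            for (Consumer<Object> consumer : consumers)
                consumer.accept(event);
    }
}
```

A subscriber registered for String.class sees published strings and nothing else. Events with no subscriber are simply dropped in this sketch.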






[jira] [Updated] (CASSANDRA-13457) Diag. Events: Add base classes

2018-08-16 Thread mck (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck updated CASSANDRA-13457:

Reviewers: Jason Brown

> Diag. Events: Add base classes
> --
>
> Key: CASSANDRA-13457
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13457
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Core, Observability
>Reporter: Stefan Podkowinski
>Assignee: Stefan Podkowinski
>Priority: Major
>
> Base ticket for adding classes that will allow you to implement and subscribe 
> to events.






[jira] [Comment Edited] (CASSANDRA-13458) Diag. Events: Add unit testing support

2018-08-16 Thread mck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16526227#comment-16526227
 ] 

mck edited comment on CASSANDRA-13458 at 8/17/18 3:48 AM:
--

(so far) this is a +1 from me.


was (Author: michaelsembwever):
this is a +1 from me.

> Diag. Events: Add unit testing support
> --
>
> Key: CASSANDRA-13458
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13458
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Testing
>Reporter: Stefan Podkowinski
>Assignee: Stefan Podkowinski
>Priority: Major
>
> Diagnostic events will improve unit testing by
> * providing test execution control instances based on CompletableFutures (see 
> [PendingRangeCalculatorServiceTest.java|https://github.com/spodkowinski/cassandra/blob/WIP-13458/test/unit/org/apache/cassandra/gms/PendingRangeCalculatorServiceTest.java])
>  
> * validate state and behavior by allowing you to inspect generated events 
> (see 
> [HintsServiceEventsTest.java|https://github.com/spodkowinski/cassandra/blob/WIP-13458/test/unit/org/apache/cassandra/hints/HintsServiceEventsTest.java])
> See included 
> [testing.rst|https://github.com/spodkowinski/cassandra/blob/WIP-13458/doc/source/development/testing.rst#diagnostic-events-40]
>  draft for more details. Let me know if this would be useful for you as a 
> developer.
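
The description mentions test execution control based on CompletableFutures. One way to picture that, under assumptions, is a test that blocks until an event of interest has been observed. The sketch below uses only java.util.concurrent. AwaitEventSketch, onEvent and await are hypothetical names and do not come from the linked test classes.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch: a test waits on a future that an event callback completes.
final class AwaitEventSketch
{
    private final CompletableFuture<String> eventSeen = new CompletableFuture<>();

    // Called by the code under test (here a stand-in) when the event fires.
    void onEvent(String event)
    {
        eventSeen.complete(event);
    }

    // Blocks the test thread until the event arrives or the timeout elapses.
    String await(long timeoutMillis)
    {
        try
        {
            return eventSeen.get(timeoutMillis, TimeUnit.MILLISECONDS);
        }
        catch (InterruptedException | ExecutionException | TimeoutException e)
        {
            throw new IllegalStateException("event was not observed in time", e);
        }
    }
}
```

The test thread calls await while another thread triggers onEvent. The test then proceeds only once the event has been seen, instead of relying on sleeps.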






[jira] [Commented] (CASSANDRA-14435) Diag. Events: JMX events

2018-08-07 Thread mck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16571508#comment-16571508
 ] 

mck commented on CASSANDRA-14435:
-

reviewed. +1 from me.

> Diag. Events: JMX events
> 
>
> Key: CASSANDRA-14435
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14435
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Observability
>Reporter: Stefan Podkowinski
>Assignee: Stefan Podkowinski
>Priority: Major
> Fix For: 4.x
>
>
> Nodes currently use JMX events for progress reporting on bootstrap and 
> repairs. This might also be an option to expose diagnostic events to external 
> subscribers.






[jira] [Updated] (CASSANDRA-14435) Diag. Events: JMX events

2018-08-07 Thread mck (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck updated CASSANDRA-14435:

Reviewers: mck

> Diag. Events: JMX events
> 
>
> Key: CASSANDRA-14435
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14435
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Observability
>Reporter: Stefan Podkowinski
>Assignee: Stefan Podkowinski
>Priority: Major
> Fix For: 4.x
>
>
> Nodes currently use JMX events for progress reporting on bootstrap and 
> repairs. This might also be an option to expose diagnostic events to external 
> subscribers.






[jira] [Commented] (CASSANDRA-13668) Diag. events for user audit logging

2018-08-07 Thread mck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16571577#comment-16571577
 ] 

mck commented on CASSANDRA-13668:
-

reviewed, +1 from me.

> Diag. events for user audit logging
> ---
>
> Key: CASSANDRA-13668
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13668
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability
>Reporter: Stefan Podkowinski
>Assignee: Stefan Podkowinski
>Priority: Major
> Fix For: 4.x
>
>
> With the availability of CASSANDRA-13459, any native transport enabled client 
> will be able to subscribe to internal Cassandra events. External tools can 
> take advantage by monitoring these events in various ways. Use-cases for this 
> can be e.g. auditing tools for compliance and security purposes.
> The scope of this ticket is to add diagnostic events that are raised around 
> authentication and CQL operations. These events can then be consumed and used 
> by external tools to implement a Cassandra user auditing solution.






[jira] [Updated] (CASSANDRA-13668) Diag. events for user audit logging

2018-08-07 Thread mck (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck updated CASSANDRA-13668:

Reviewers: mck

> Diag. events for user audit logging
> ---
>
> Key: CASSANDRA-13668
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13668
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability
>Reporter: Stefan Podkowinski
>Assignee: Stefan Podkowinski
>Priority: Major
> Fix For: 4.x
>
>
> With the availability of CASSANDRA-13459, any native transport enabled client 
> will be able to subscribe to internal Cassandra events. External tools can 
> take advantage by monitoring these events in various ways. Use-cases for this 
> can be e.g. auditing tools for compliance and security purposes.
> The scope of this ticket is to add diagnostic events that are raised around 
> authentication and CQL operations. These events can then be consumed and used 
> by external tools to implement a Cassandra user auditing solution.






[jira] [Commented] (CASSANDRA-13457) Diag. Events: Add base classes

2018-08-07 Thread mck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16571334#comment-16571334
 ] 

mck commented on CASSANDRA-13457:
-

reviewed again. +1 from me.

> Diag. Events: Add base classes
> --
>
> Key: CASSANDRA-13457
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13457
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Core, Observability
>Reporter: Stefan Podkowinski
>Assignee: Stefan Podkowinski
>Priority: Major
>
> Base ticket for adding classes that will allow you to implement and subscribe 
> to events.






[jira] [Commented] (CASSANDRA-13457) Diag. Events: Add base classes

2018-08-12 Thread mck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577497#comment-16577497
 ] 

mck commented on CASSANDRA-13457:
-

added a few comments on the current commit: 
https://github.com/spodkowinski/cassandra/commit/a4100605a3da7a3b55096462f5b43cc7ab17fd77#diff-048abc7de8f43039fff6f4c21bf1cb93

> Diag. Events: Add base classes
> --
>
> Key: CASSANDRA-13457
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13457
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Core, Observability
>Reporter: Stefan Podkowinski
>Assignee: Stefan Podkowinski
>Priority: Major
>
> Base ticket for adding classes that will allow you to implement and subscribe 
> to events.






[jira] [Commented] (CASSANDRA-14346) Scheduled Repair in Cassandra

2018-08-20 Thread mck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586930#comment-16586930
 ] 

mck commented on CASSANDRA-14346:
-

[~kohlisankalp],
> any timeline when we can expect a patch for it? 

We hope to have an answer to this by next week.

> Scheduled Repair in Cassandra
> -
>
> Key: CASSANDRA-14346
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14346
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Repair
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Major
>  Labels: 4.0-feature-freeze-review-requested, 
> CommunityFeedbackRequested
> Fix For: 4.0
>
> Attachments: ScheduledRepairV1_20180327.pdf
>
>
> There have been many attempts to automate repair in Cassandra, which makes 
> sense given that it is necessary to give our users eventual consistency. Most 
> recently CASSANDRA-10070, CASSANDRA-8911 and CASSANDRA-13924 have all looked 
> for ways to solve this problem.
> At Netflix we've built a scheduled repair service within Priam (our sidecar), 
> which we spoke about last year at NGCC. Given the positive feedback at NGCC 
> we focussed on getting it production ready and have now been using it in 
> production to repair hundreds of clusters, tens of thousands of nodes, and 
> petabytes of data for the past six months. Also based on feedback at NGCC we 
> have invested effort in figuring out how to integrate this natively into 
> Cassandra rather than open sourcing it as an external service (e.g. in Priam).
> As such, [~vinaykumarcse] and I would like to re-work and merge our 
> implementation into Cassandra, and have created a [design 
> document|https://docs.google.com/document/d/1RV4rOrG1gwlD5IljmrIq_t45rz7H3xs9GbFSEyGzEtM/edit?usp=sharing]
>  showing how we plan to make it happen, including the user interface.
> As we work on the code migration from Priam to Cassandra, any feedback would 
> be greatly appreciated about the interface or v1 implementation features. I 
> have tried to call out in the document features which we explicitly consider 
> future work (as well as a path forward to implement them in the future) 
> because I would very much like to get this done before the 4.0 merge window 
> closes, and to do that I think aggressively pruning scope is going to be a 
> necessity.






[jira] [Assigned] (CASSANDRA-14563) Add animalsniffer to build to ensure runtime jdk compatibility

2018-07-16 Thread mck (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck reassigned CASSANDRA-14563:
---

Assignee: Sumanth Pasupuleti

> Add animalsniffer to build to ensure runtime jdk compatibility
> -
>
> Key: CASSANDRA-14563
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14563
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Build
>Reporter: mck
>Assignee: Sumanth Pasupuleti
>Priority: Minor
>  Labels: lhf
>
> Cassandra-2.2 still supports running on JDK1.7
> No tests check this though, as all builds and tests run with JDK1.8
> Adding the ant animalsniffer task can check that jdk1.8 classes or methods 
> are not used accidentally.
> ref: http://www.mojohaus.org/animal-sniffer/animal-sniffer/index.html






[jira] [Comment Edited] (CASSANDRA-13262) Incorrect cqlsh results when selecting same columns multiple times

2018-08-31 Thread mck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16597316#comment-16597316
 ] 

mck edited comment on CASSANDRA-13262 at 8/31/18 11:56 PM:
---

New dtests running…

|| branch || testall || dtest ||
| [cassandra-2.2_13262|https://github.com/thelastpickle/cassandra/tree/mck/cassandra-2.2_13262] | [!https://circleci.com/gh/thelastpickle/cassandra/tree/mck%2Fcassandra-2.2_13262.svg?style=svg!|https://circleci.com/gh/thelastpickle/cassandra/tree/mck%2Fcassandra-2.2_13262] | [!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/625/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/625/] |
| [cassandra-3.0_13262|https://github.com/thelastpickle/cassandra/tree/mck/cassandra-3.0_13262] | [!https://circleci.com/gh/thelastpickle/cassandra/tree/mck%2Fcassandra-3.0_13262.svg?style=svg!|https://circleci.com/gh/thelastpickle/cassandra/tree/mck%2Fcassandra-3.0_13262] | [!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/626/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/626/] |
| [cassandra-3.11_13262|https://github.com/thelastpickle/cassandra/tree/mck/cassandra-3.11_13262] | [!https://circleci.com/gh/thelastpickle/cassandra/tree/mck%2Fcassandra-3.11_13262.svg?style=svg!|https://circleci.com/gh/thelastpickle/cassandra/tree/mck%2Fcassandra-3.11_13262] | [!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/627/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/627/] |


EDIT: rebased branches.


was (Author: michaelsembwever):
New dtests running…

|| branch || testall || dtest ||
| [cassandra-2.2_13262|https://github.com/michaelsembwever/cassandra/tree/mck/cassandra-2.2_13262] | [!https://circleci.com/gh/michaelsembwever/cassandra/tree/mck%2Fcassandra-2.2_13262.svg?style=svg!|https://circleci.com/gh/michaelsembwever/cassandra/tree/mck%2Fcassandra-2.2_13262] | [!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/618/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/618/] |
| [cassandra-3.0_13262|https://github.com/michaelsembwever/cassandra/tree/mck/cassandra-3.0_13262] | [!https://circleci.com/gh/michaelsembwever/cassandra/tree/mck%2Fcassandra-3.0_13262.svg?style=svg!|https://circleci.com/gh/michaelsembwever/cassandra/tree/mck%2Fcassandra-3.0_13262] | [!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/619/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/619/] |
| [cassandra-3.11_13262|https://github.com/michaelsembwever/cassandra/tree/mck/cassandra-3.11_13262] | [!https://circleci.com/gh/michaelsembwever/cassandra/tree/mck%2Fcassandra-3.11_13262.svg?style=svg!|https://circleci.com/gh/michaelsembwever/cassandra/tree/mck%2Fcassandra-3.11_13262] | [!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/620/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/620/] |


> Incorrect cqlsh results when selecting same columns multiple times
> --
>
> Key: CASSANDRA-13262
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13262
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Stefan Podkowinski
>Assignee: Murukesh Mohanan
>Priority: Minor
>  Labels: lhf
> Fix For: 4.0
>
> Attachments: 
> 0001-Fix-incorrect-cqlsh-results-when-selecting-same-colu.patch, 
> CASSANDRA-13262-v2.2.txt, CASSANDRA-13262-v3.0.txt, CASSANDRA-13262-v3.11.txt
>
>
> Just stumbled over this on trunk:
> {quote}
> cqlsh:test1> select a, b, c from table1;
>  a | b| c
> ---+--+-
>  1 |b |   2
>  2 | null | 2.2
> (2 rows)
> cqlsh:test1> select a, a, b, c from table1;
>  a | a| b   | c
> ---+--+-+--
>  1 |b |   2 | null
>  2 | null | 2.2 | null
> (2 rows)
> cqlsh:test1> select a, a, a, b, c from table1;
>  a | a| a | b| c
> ---+--+---+--+--
>  1 |b |   2.0 | null | null
>  2 | null | 2.2004768 | null | null
> {quote}
> My guess is that this is on the Python side, but I haven't really looked into it.
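
One way to read the output above (an assumption; the ticket text does not confirm the cause): if each result row is turned into a name-keyed mapping, repeated column names collapse into a single key, so the row ends up with fewer values than there are headers and the missing trailing cells render as null. The sketch below is plain Python and does not use cqlsh or the driver. The helpers `rows_as_dicts` and `render`, and the sample data, are hypothetical and chosen only to mirror the `select a, a, b, c` case.

```python
from collections import OrderedDict


def rows_as_dicts(column_names, raw_rows):
    """Build one name-keyed mapping per row (hypothetical helper).

    Repeated column names overwrite each other, so the mapping can
    hold fewer entries than the select list has columns.
    """
    return [OrderedDict(zip(column_names, raw)) for raw in raw_rows]


def render(column_names, dict_rows):
    """Line the row values up under the headers by position.

    Rows that are shorter than the header list are padded with None,
    which a shell would display as null.
    """
    rendered = []
    for row in dict_rows:
        values = list(row.values())
        values += [None] * (len(column_names) - len(values))
        rendered.append(values)
    return rendered


# Assumed sample data for: select a, a, b, c from table1;
names = ["a", "a", "b", "c"]
raw = [(1, 1, "b", 2.0), (2, 2, None, 2.2)]

# Name-keyed rows lose the duplicate 'a' and shift every later value left.
broken = render(names, rows_as_dicts(names, raw))

# Keeping rows positional preserves one value per selected column.
correct = [list(r) for r in raw]

print(broken)
print(correct)
```

Under these assumptions, the shifted values and trailing nulls in `broken` match the shape of the second query's output, while the positional rows in `correct` keep one value per selected column.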






[jira] [Comment Edited] (CASSANDRA-13262) Incorrect cqlsh results when selecting same columns multiple times

2018-09-04 Thread mck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16597316#comment-16597316
 ] 

mck edited comment on CASSANDRA-13262 at 9/5/18 1:28 AM:
-

New dtests running…

|| branch || testall || dtest ||
| [cassandra-2.2_13262|https://github.com/thelastpickle/cassandra/tree/mck/cassandra-2.2_13262] | [!https://circleci.com/gh/thelastpickle/cassandra/tree/mck%2Fcassandra-2.2_13262.svg?style=svg!|https://circleci.com/gh/thelastpickle/cassandra/tree/mck%2Fcassandra-2.2_13262] | [!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/628/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/628/] |
| [cassandra-3.0_13262|https://github.com/thelastpickle/cassandra/tree/mck/cassandra-3.0_13262] | [!https://circleci.com/gh/thelastpickle/cassandra/tree/mck%2Fcassandra-3.0_13262.svg?style=svg!|https://circleci.com/gh/thelastpickle/cassandra/tree/mck%2Fcassandra-3.0_13262] | [!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/632/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/632/] |
| [cassandra-3.11_13262|https://github.com/thelastpickle/cassandra/tree/mck/cassandra-3.11_13262] | [!https://circleci.com/gh/thelastpickle/cassandra/tree/mck%2Fcassandra-3.11_13262.svg?style=svg!|https://circleci.com/gh/thelastpickle/cassandra/tree/mck%2Fcassandra-3.11_13262] | [!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/630/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/630/] |


EDIT: rebased branches.


was (Author: michaelsembwever):
New dtests running…

|| branch || testall || dtest ||
| [cassandra-2.2_13262|https://github.com/thelastpickle/cassandra/tree/mck/cassandra-2.2_13262] | [!https://circleci.com/gh/thelastpickle/cassandra/tree/mck%2Fcassandra-2.2_13262.svg?style=svg!|https://circleci.com/gh/thelastpickle/cassandra/tree/mck%2Fcassandra-2.2_13262] | [!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/628/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/628/] |
| [cassandra-3.0_13262|https://github.com/thelastpickle/cassandra/tree/mck/cassandra-3.0_13262] | [!https://circleci.com/gh/thelastpickle/cassandra/tree/mck%2Fcassandra-3.0_13262.svg?style=svg!|https://circleci.com/gh/thelastpickle/cassandra/tree/mck%2Fcassandra-3.0_13262] | [!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/629/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/629/] |
| [cassandra-3.11_13262|https://github.com/thelastpickle/cassandra/tree/mck/cassandra-3.11_13262] | [!https://circleci.com/gh/thelastpickle/cassandra/tree/mck%2Fcassandra-3.11_13262.svg?style=svg!|https://circleci.com/gh/thelastpickle/cassandra/tree/mck%2Fcassandra-3.11_13262] | [!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/630/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/630/] |


EDIT: rebased branches.

> Incorrect cqlsh results when selecting same columns multiple times
> --
>
> Key: CASSANDRA-13262
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13262
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Stefan Podkowinski
>Assignee: Murukesh Mohanan
>Priority: Minor
>  Labels: lhf
> Fix For: 4.0
>
> Attachments: 
> 0001-Fix-incorrect-cqlsh-results-when-selecting-same-colu.patch, 
> CASSANDRA-13262-v2.2.txt, CASSANDRA-13262-v3.0.txt, CASSANDRA-13262-v3.11.txt
>
>
> Just stumbled over this on trunk:
> {quote}
> cqlsh:test1> select a, b, c from table1;
>  a | b| c
> ---+--+-
>  1 |b |   2
>  2 | null | 2.2
> (2 rows)
> cqlsh:test1> select a, a, b, c from table1;
>  a | a| b   | c
> ---+--+-+--
>  1 |b |   2 | null
>  2 | null | 2.2 | null
> (2 rows)
> cqlsh:test1> select a, a, a, b, c from table1;
>  a | a| a | b| c
> ---+--+---+--+--
>  1 |b |   2.0 | null | null
>  2 | null | 2.2004768 | null | null
> {quote}
> My guess is that this is on the Python side, but I haven't really looked into it.






[jira] [Commented] (CASSANDRA-13262) Incorrect cqlsh results when selecting same columns multiple times

2018-09-04 Thread mck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603809#comment-16603809
 ] 

mck commented on CASSANDRA-13262:
-

Committed to 2.2, 3.0 and 3.11, with 62e48c5f3f818d1e841178d7365d208435a63537

> Incorrect cqlsh results when selecting same columns multiple times
> --
>
> Key: CASSANDRA-13262
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13262
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Stefan Podkowinski
>Assignee: Murukesh Mohanan
>Priority: Minor
>  Labels: lhf
> Fix For: 2.2.14, 3.0.18, 3.11.4, 4.0
>
> Attachments: 
> 0001-Fix-incorrect-cqlsh-results-when-selecting-same-colu.patch, 
> CASSANDRA-13262-v2.2.txt, CASSANDRA-13262-v3.0.txt, CASSANDRA-13262-v3.11.txt
>
>
> Just stumbled over this on trunk:
> {quote}
> cqlsh:test1> select a, b, c from table1;
>  a | b| c
> ---+--+-
>  1 |b |   2
>  2 | null | 2.2
> (2 rows)
> cqlsh:test1> select a, a, b, c from table1;
>  a | a| b   | c
> ---+--+-+--
>  1 |b |   2 | null
>  2 | null | 2.2 | null
> (2 rows)
> cqlsh:test1> select a, a, a, b, c from table1;
>  a | a| a | b| c
> ---+--+---+--+--
>  1 |b |   2.0 | null | null
>  2 | null | 2.2004768 | null | null
> {quote}
> My guess is that this is on the Python side, but I haven't really looked into it.






[jira] [Updated] (CASSANDRA-13262) Incorrect cqlsh results when selecting same columns multiple times

2018-09-04 Thread mck (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck updated CASSANDRA-13262:

Fix Version/s: 3.11.4
   3.0.18
   2.2.14

> Incorrect cqlsh results when selecting same columns multiple times
> --
>
> Key: CASSANDRA-13262
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13262
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Stefan Podkowinski
>Assignee: Murukesh Mohanan
>Priority: Minor
>  Labels: lhf
> Fix For: 2.2.14, 3.0.18, 3.11.4, 4.0
>
> Attachments: 
> 0001-Fix-incorrect-cqlsh-results-when-selecting-same-colu.patch, 
> CASSANDRA-13262-v2.2.txt, CASSANDRA-13262-v3.0.txt, CASSANDRA-13262-v3.11.txt
>
>
> Just stumbled over this on trunk:
> {quote}
> cqlsh:test1> select a, b, c from table1;
>  a | b| c
> ---+--+-
>  1 |b |   2
>  2 | null | 2.2
> (2 rows)
> cqlsh:test1> select a, a, b, c from table1;
>  a | a| b   | c
> ---+--+-+--
>  1 |b |   2 | null
>  2 | null | 2.2 | null
> (2 rows)
> cqlsh:test1> select a, a, a, b, c from table1;
>  a | a| a | b| c
> ---+--+---+--+--
>  1 |b |   2.0 | null | null
>  2 | null | 2.2004768 | null | null
> {quote}
> My guess is that this is on the Python side, but I haven't really looked into it.






[jira] [Created] (CASSANDRA-14689) Add developer docs for creating releases

2018-09-05 Thread mck (JIRA)
mck created CASSANDRA-14689:
---

 Summary: Add developer docs for creating releases
 Key: CASSANDRA-14689
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14689
 Project: Cassandra
  Issue Type: Task
  Components: Documentation and Website
Reporter: mck



Provide an initial outline on the steps Release Managers follow for creating, 
voting and publishing a release for Apache Cassandra.


ASF has the following guidelines:
 * `ASF Release Policy `_.
 * `ASF Release Distribution Policy 
`_.
 * `ASF Release Best Practices 
`_.

The project is still doing some things in an outdated manner, e.g. using 
people.apache.org URLs for staging artefacts. There is no urgent need to fix 
these things, but by having the docs published it can be improved incrementally 
over time.






[jira] [Commented] (CASSANDRA-14689) Add developer docs for creating releases

2018-09-05 Thread mck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16604034#comment-16604034
 ] 

mck commented on CASSANDRA-14689:
-

Pull Request @ https://github.com/apache/cassandra/pull/230


|| branch || testall ||
| [trunk|https://github.com/thelastpickle/cassandra/tree/mck/docs--release-process] | [!https://circleci.com/gh/thelastpickle/workflows/cassandra/tree/mck%2Fdocs--release-process.svg?style=svg!|https://circleci.com/gh/thelastpickle/workflows/cassandra/tree/mck%2Fdocs--release-process] |


> Add developer docs for creating releases
> 
>
> Key: CASSANDRA-14689
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14689
> Project: Cassandra
>  Issue Type: Task
>  Components: Documentation and Website
>Reporter: mck
>Assignee: mck
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Provide an initial outline on the steps Release Managers follow for creating, 
> voting and publishing a release for Apache Cassandra.
> ASF has the following guidelines:
>  * `ASF Release Policy `_.
>  * `ASF Release Distribution Policy 
> `_.
>  * `ASF Release Best Practices 
> `_.
> The project is still doing some things in an outdated manner, e.g. using 
> people.apache.org URLs for staging artefacts. There is no urgent need to fix 
> these things, but by having the docs published it can be improved incrementally 
> over time.
> fyi [~mshuler]






[jira] [Comment Edited] (CASSANDRA-14689) Add developer docs for creating releases

2018-09-05 Thread mck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16604034#comment-16604034
 ] 

mck edited comment on CASSANDRA-14689 at 9/5/18 7:23 AM:
-

Pull Request @ https://github.com/apache/cassandra/pull/230


|| branch || CircleCI ||
| [trunk|https://github.com/thelastpickle/cassandra/tree/mck/docs--release-process] | [!https://circleci.com/gh/thelastpickle/workflows/cassandra/tree/mck%2Fdocs--release-process.svg?style=svg!|https://circleci.com/gh/thelastpickle/workflows/cassandra/tree/mck%2Fdocs--release-process] |



was (Author: michaelsembwever):
Pull Request @ https://github.com/apache/cassandra/pull/230


|| branch || testall ||
| [trunk|https://github.com/thelastpickle/cassandra/tree/mck/docs--release-process] | [!https://circleci.com/gh/thelastpickle/workflows/cassandra/tree/mck%2Fdocs--release-process.svg?style=svg!|https://circleci.com/gh/thelastpickle/workflows/cassandra/tree/mck%2Fdocs--release-process] |


> Add developer docs for creating releases
> 
>
> Key: CASSANDRA-14689
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14689
> Project: Cassandra
>  Issue Type: Task
>  Components: Documentation and Website
>Reporter: mck
>Assignee: mck
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Provide an initial outline on the steps Release Managers follow for creating, 
> voting and publishing a release for Apache Cassandra.
> ASF has the following guidelines:
>  * `ASF Release Policy `_.
>  * `ASF Release Distribution Policy 
> `_.
>  * `ASF Release Best Practices 
> `_.
> The project is still doing some things in an outdated manner, e.g. using 
> people.apache.org URLs for staging artefacts. There is no urgent need to fix 
> these things, but by having the docs published it can be improved incrementally 
> over time.
> fyi [~mshuler]






[jira] [Updated] (CASSANDRA-14689) Add developer docs for creating releases

2018-09-05 Thread mck (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck updated CASSANDRA-14689:

Reviewers: Michael Shuler, Stefan Podkowinski

> Add developer docs for creating releases
> 
>
> Key: CASSANDRA-14689
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14689
> Project: Cassandra
>  Issue Type: Task
>  Components: Documentation and Website
>Reporter: mck
>Assignee: mck
>Priority: Minor
> Fix For: 4.x
>
>
> Provide an initial outline on the steps Release Managers follow for creating, 
> voting and publishing a release for Apache Cassandra.
> ASF has the following guidelines:
>  * `ASF Release Policy `_.
>  * `ASF Release Distribution Policy 
> `_.
>  * `ASF Release Best Practices 
> `_.
> The project is still doing some things in an outdated manner, e.g. using 
> people.apache.org URLs for staging artefacts. There is no urgent need to fix 
> these things, but by having the docs published it can be improved incrementally 
> over time.
> fyi [~mshuler]






[jira] [Updated] (CASSANDRA-14689) Add developer docs for creating releases

2018-09-05 Thread mck (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck updated CASSANDRA-14689:

Description: 
Provide an initial outline on the steps Release Managers follow for creating, 
voting and publishing a release for Apache Cassandra.


ASF has the following guidelines:
 * `ASF Release Policy `_.
 * `ASF Release Distribution Policy 
`_.
 * `ASF Release Best Practices 
`_.

The project is still doing some things in an outdated manner, e.g. using 
people.apache.org URLs for staging artefacts. There is no urgent need to fix 
these things, but by having the docs published it can be improved incrementally 
over time.

fyi [~mshuler]

  was:

Provide an initial outline on the steps Release Managers follow for creating, 
voting and publishing a release for Apache Cassandra.


ASF has the following guidelines:
 * `ASF Release Policy `_.
 * `ASF Release Distribution Policy 
`_.
 * `ASF Release Best Practices 
`_.

The project is still doing some things in an outdated manner, e.g. using 
people.apache.org URLs for staging artefacts. There is no urgent need to fix 
these things, but by having the docs published it can be improved incrementally 
over time.


> Add developer docs for creating releases
> 
>
> Key: CASSANDRA-14689
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14689
> Project: Cassandra
>  Issue Type: Task
>  Components: Documentation and Website
>Reporter: mck
>Priority: Minor
> Fix For: 4.x
>
>
> Provide an initial outline on the steps Release Managers follow for creating, 
> voting and publishing a release for Apache Cassandra.
> ASF has the following guidelines:
>  * `ASF Release Policy `_.
>  * `ASF Release Distribution Policy 
> `_.
>  * `ASF Release Best Practices 
> `_.
> The project is still doing some things in an outdated manner, e.g. using 
> people.apache.org URLs for staging artefacts. There is no urgent need to fix 
> these things, but by having the docs published it can be improved incrementally 
> over time.
> fyi [~mshuler]






[jira] [Assigned] (CASSANDRA-14689) Add developer docs for creating releases

2018-09-05 Thread mck (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck reassigned CASSANDRA-14689:
---

Assignee: mck

> Add developer docs for creating releases
> 
>
> Key: CASSANDRA-14689
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14689
> Project: Cassandra
>  Issue Type: Task
>  Components: Documentation and Website
>Reporter: mck
>Assignee: mck
>Priority: Minor
> Fix For: 4.x
>
>
> Provide an initial outline on the steps Release Managers follow for creating, 
> voting and publishing a release for Apache Cassandra.
> ASF has the following guidelines:
>  * `ASF Release Policy `_.
>  * `ASF Release Distribution Policy 
> `_.
>  * `ASF Release Best Practices 
> `_.
> The project is still doing some things in an outdated manner, e.g. using 
> people.apache.org URLs for staging artefacts. There is no urgent need to fix 
> these things, but by having the docs published it can be improved incrementally 
> over time.
> fyi [~mshuler]






[jira] [Updated] (CASSANDRA-14689) Add developer docs for creating releases

2018-09-05 Thread mck (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck updated CASSANDRA-14689:

Fix Version/s: 4.x

> Add developer docs for creating releases
> 
>
> Key: CASSANDRA-14689
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14689
> Project: Cassandra
>  Issue Type: Task
>  Components: Documentation and Website
>Reporter: mck
>Priority: Minor
> Fix For: 4.x
>
>
> Provide an initial outline of the steps Release Managers follow for creating, 
> voting on, and publishing a release for Apache Cassandra.
> ASF has the following guidelines:
>  * ASF Release Policy
>  * ASF Release Distribution Policy
>  * ASF Release Best Practices
> The project is still doing some things in an outdated manner, e.g. using 
> people.apache.org URLs for staging artefacts. There is no urgent need to fix 
> these things, but by having the docs published they can be improved incrementally 
> over time.
> fyi [~mshuler]






[jira] [Commented] (CASSANDRA-14689) Add developer docs for creating releases

2018-09-06 Thread mck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16606487#comment-16606487
 ] 

mck commented on CASSANDRA-14689:
-

Committed to trunk as 63e5763

> Add developer docs for creating releases
> 
>
> Key: CASSANDRA-14689
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14689
> Project: Cassandra
>  Issue Type: Task
>  Components: Documentation and Website
>Reporter: mck
>Assignee: mck
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Provide an initial outline of the steps Release Managers follow for creating, 
> voting on, and publishing a release for Apache Cassandra.
> ASF has the following guidelines:
>  * ASF Release Policy
>  * ASF Release Distribution Policy
>  * ASF Release Best Practices
> The project is still doing some things in an outdated manner, e.g. using 
> people.apache.org URLs for staging artefacts. There is no urgent need to fix 
> these things, but by having the docs published they can be improved incrementally 
> over time.
> fyi [~mshuler]






[jira] [Updated] (CASSANDRA-14689) Add developer docs for creating releases

2018-09-06 Thread mck (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck updated CASSANDRA-14689:

   Resolution: Fixed
Fix Version/s: (was: 4.x)
   4.0
   Status: Resolved  (was: Patch Available)

> Add developer docs for creating releases
> 
>
> Key: CASSANDRA-14689
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14689
> Project: Cassandra
>  Issue Type: Task
>  Components: Documentation and Website
>Reporter: mck
>Assignee: mck
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Provide an initial outline of the steps Release Managers follow for creating, 
> voting on, and publishing a release for Apache Cassandra.
> ASF has the following guidelines:
>  * ASF Release Policy
>  * ASF Release Distribution Policy
>  * ASF Release Best Practices
> The project is still doing some things in an outdated manner, e.g. using 
> people.apache.org URLs for staging artefacts. There is no urgent need to fix 
> these things, but by having the docs published they can be improved incrementally 
> over time.
> fyi [~mshuler]






[jira] [Comment Edited] (CASSANDRA-13262) Incorrect cqlsh results when selecting same columns multiple times

2018-08-30 Thread mck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16597316#comment-16597316
 ] 

mck edited comment on CASSANDRA-13262 at 8/31/18 3:02 AM:
--

New dtests running…

|| branch || testall || dtest ||
| 
[cassandra-2.2_13262|https://github.com/michaelsembwever/cassandra/tree/mck/cassandra-2.2_13262]
  | 
[!https://circleci.com/gh/michaelsembwever/cassandra/tree/mck%2Fcassandra-2.2_13262.svg?style=svg!|https://circleci.com/gh/michaelsembwever/cassandra/tree/mck%2Fcassandra-2.2_13262]
 | 
[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/618/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/618/]
 |
| 
[cassandra-3.0_13262|https://github.com/michaelsembwever/cassandra/tree/mck/cassandra-3.0_13262]
  | 
[!https://circleci.com/gh/michaelsembwever/cassandra/tree/mck%2Fcassandra-3.0_13262.svg?style=svg!|https://circleci.com/gh/michaelsembwever/cassandra/tree/mck%2Fcassandra-3.0_13262]
 | 
[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/619/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/619/]
 |
| 
[cassandra-3.11_13262|https://github.com/michaelsembwever/cassandra/tree/mck/cassandra-3.11_13262]
| 
[!https://circleci.com/gh/michaelsembwever/cassandra/tree/mck%2Fcassandra-3.11_13262.svg?style=svg!|https://circleci.com/gh/michaelsembwever/cassandra/tree/mck%2Fcassandra-3.11_13262]
   | 
[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/620/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/620/]
 |



was (Author: michaelsembwever):
still waiting on builds.apache.org to come back…

|| branch || testall || dtest ||
| 
[cassandra-2.2_13262|https://github.com/michaelsembwever/cassandra/tree/mck/cassandra-2.2_13262]
  | 
[!https://circleci.com/gh/michaelsembwever/cassandra/tree/mck%2Fcassandra-2.2_13262.svg?style=svg!|https://circleci.com/gh/michaelsembwever/cassandra/tree/mck%2Fcassandra-2.2_13262]
 | 
[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/XX/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/XX/]
 |
| 
[cassandra-3.0_13262|https://github.com/michaelsembwever/cassandra/tree/mck/cassandra-3.0_13262]
  | 
[!https://circleci.com/gh/michaelsembwever/cassandra/tree/mck%2Fcassandra-3.0_13262.svg?style=svg!|https://circleci.com/gh/michaelsembwever/cassandra/tree/mck%2Fcassandra-3.0_13262]
 | 
[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/XX/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/XX/]
 |
| 
[cassandra-3.11_13262|https://github.com/michaelsembwever/cassandra/tree/mck/cassandra-3.11_13262]
| 
[!https://circleci.com/gh/michaelsembwever/cassandra/tree/mck%2Fcassandra-3.11_13262.svg?style=svg!|https://circleci.com/gh/michaelsembwever/cassandra/tree/mck%2Fcassandra-3.11_13262]
   | 
[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/XX/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/XX/]
 |


> Incorrect cqlsh results when selecting same columns multiple times
> --
>
> Key: CASSANDRA-13262
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13262
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Stefan Podkowinski
>Assignee: Murukesh Mohanan
>Priority: Minor
>  Labels: lhf
> Fix For: 4.0
>
> Attachments: 
> 0001-Fix-incorrect-cqlsh-results-when-selecting-same-colu.patch, 
> CASSANDRA-13262-v2.2.txt, CASSANDRA-13262-v3.0.txt, CASSANDRA-13262-v3.11.txt
>
>
> Just stumbled over this on trunk:
> {quote}
> cqlsh:test1> select a, b, c from table1;
>  a | b| c
> ---+--+-
>  1 |b |   2
>  2 | null | 2.2
> (2 rows)
> cqlsh:test1> select a, a, b, c from table1;
>  a | a| b   | c
> ---+--+-+--
>  1 |b |   2 | null
>  2 | null | 2.2 | null
> (2 rows)
> cqlsh:test1> select a, a, a, b, c from table1;
>  a | a| a | b| c
> ---+--+---+--+--
>  1 |b |   2.0 | null | null
>  2 | null | 2.2004768 | null | null
> {quote}
> My guess is that this is on the Python side, but I haven't really looked into it.
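
The symptom in the quoted output (values shifting left and trailing nulls appearing once a column is selected twice) is consistent with rows being keyed by column name somewhere on the client side. The following is a minimal Python sketch of that mechanism only. It is a hypothetical reproduction, not cqlsh's actual code: the function names and the sample row values are assumptions, and the issue itself only guesses that the problem is on the Python side.

```python
def render_by_name(names, row):
    """Key the row by column name, then lay the values out under the headers.

    Duplicate names collapse into a single dict entry, so fewer values than
    headers survive and the remainder is padded with None (shown as null).
    """
    by_name = dict(zip(names, row))          # "a" appears once, not twice
    values = list(by_name.values())
    values += [None] * (len(names) - len(values))
    return values


def render_by_position(names, row):
    """Keep values in their original positions, ignoring name clashes."""
    return list(row)


# Illustrative values for: select a, a, b, c from table1;
names = ["a", "a", "b", "c"]
row = (1, 1, "b", 2.0)

print(render_by_name(names, row))      # values shifted left, None at the end
print(render_by_position(names, row))  # one value per selected column
```

Keying by position rather than by name avoids the collapse, at the cost of callers no longer being able to look a value up by column name alone.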






[jira] [Updated] (CASSANDRA-14668) Diag events for read repairs

2018-08-30 Thread mck (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck updated CASSANDRA-14668:

Reviewer: mck

> Diag events for read repairs
> 
>
> Key: CASSANDRA-14668
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14668
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability
>Reporter: Stefan Podkowinski
>Assignee: Stefan Podkowinski
>Priority: Major
> Fix For: 4.x
>
>
> Read repairs have been a highly discussed topic over the last few months and 
> have also seen some significant code changes. I'd like to be better prepared in 
> case we need to investigate any further RR issues in the future, by adding 
> diagnostic events that can be enabled to expose information such as:
>  * contacted endpoints
>  * digest responses by endpoint
>  * affected partition keys
>  * speculated reads / writes
>  






[jira] [Comment Edited] (CASSANDRA-14668) Diag events for read repairs

2018-08-30 Thread mck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598250#comment-16598250
 ] 

mck edited comment on CASSANDRA-14668 at 8/31/18 5:05 AM:
--

builds.apache.org came back up today, so kick off the dtests for this…

|| branch || testall || dtest ||
| [WIP-14668|https://github.com/spodkowinski/cassandra/tree/WIP-14668]  | 
[!https://circleci.com/gh/spodkowinski/cassandra/tree/WIP-14668.svg?style=svg!|https://circleci.com/gh/spodkowinski/cassandra/tree/WIP-14668]
 | 
[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/621/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/621/]
 |


was (Author: michaelsembwever):
builds.apache.org came back up today, so kick off the dtests for this…

|| branch || testall || dtest ||
| [WIP-14668|spodkowinski/cassandra/tree/WIP-14668] | 
[!https://circleci.com/gh/spodkowinski/cassandra/tree/WIP-14668.svg?style=svg!|https://circleci.com/gh/spodkowinski/cassandra/tree/WIP-14668]
 | 
[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/621/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/621/]
 |

> Diag events for read repairs
> 
>
> Key: CASSANDRA-14668
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14668
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability
>Reporter: Stefan Podkowinski
>Assignee: Stefan Podkowinski
>Priority: Major
> Fix For: 4.x
>
>
> Read repairs have been a highly discussed topic over the last few months and 
> have also seen some significant code changes. I'd like to be better prepared in 
> case we need to investigate any further RR issues in the future, by adding 
> diagnostic events that can be enabled to expose information such as:
>  * contacted endpoints
>  * digest responses by endpoint
>  * affected partition keys
>  * speculated reads / writes
>  






[jira] [Comment Edited] (CASSANDRA-14668) Diag events for read repairs

2018-08-30 Thread mck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598250#comment-16598250
 ] 

mck edited comment on CASSANDRA-14668 at 8/31/18 5:07 AM:
--

builds.apache.org came back up today, so kicked off the dtests for this…

|| branch || testall || dtest ||
| [WIP-14668|https://github.com/spodkowinski/cassandra/tree/WIP-14668]  | 
[!https://circleci.com/gh/spodkowinski/cassandra/tree/WIP-14668.svg?style=svg!|https://circleci.com/gh/spodkowinski/cassandra/tree/WIP-14668]
 | 
[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/621/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/621/]
 |


was (Author: michaelsembwever):
builds.apache.org came back up today, so kick off the dtests for this…

|| branch || testall || dtest ||
| [WIP-14668|https://github.com/spodkowinski/cassandra/tree/WIP-14668]  | 
[!https://circleci.com/gh/spodkowinski/cassandra/tree/WIP-14668.svg?style=svg!|https://circleci.com/gh/spodkowinski/cassandra/tree/WIP-14668]
 | 
[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/621/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/621/]
 |

> Diag events for read repairs
> 
>
> Key: CASSANDRA-14668
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14668
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability
>Reporter: Stefan Podkowinski
>Assignee: Stefan Podkowinski
>Priority: Major
> Fix For: 4.x
>
>
> Read repairs have been a highly discussed topic over the last few months and 
> have also seen some significant code changes. I'd like to be better prepared in 
> case we need to investigate any further RR issues in the future, by adding 
> diagnostic events that can be enabled to expose information such as:
>  * contacted endpoints
>  * digest responses by endpoint
>  * affected partition keys
>  * speculated reads / writes
>  






[jira] [Commented] (CASSANDRA-14668) Diag events for read repairs

2018-08-30 Thread mck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598250#comment-16598250
 ] 

mck commented on CASSANDRA-14668:
-

builds.apache.org came back up today, so kick off the dtests for this…

|| branch || testall || dtest ||
| [WIP-14668|spodkowinski/cassandra/tree/WIP-14668] | 
[!https://circleci.com/gh/spodkowinski/cassandra/tree/WIP-14668.svg?style=svg!|https://circleci.com/gh/spodkowinski/cassandra/tree/WIP-14668]
 | 
[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/621/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/621/]
 |

> Diag events for read repairs
> 
>
> Key: CASSANDRA-14668
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14668
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability
>Reporter: Stefan Podkowinski
>Assignee: Stefan Podkowinski
>Priority: Major
> Fix For: 4.x
>
>
> Read repairs have been a highly discussed topic over the last few months and 
> have also seen some significant code changes. I'd like to be better prepared in 
> case we need to investigate any further RR issues in the future, by adding 
> diagnostic events that can be enabled to expose information such as:
>  * contacted endpoints
>  * digest responses by endpoint
>  * affected partition keys
>  * speculated reads / writes
>  






[jira] [Created] (CASSANDRA-14679) Prevent generating new tokens on a node when data exists

2018-08-30 Thread mck (JIRA)
mck created CASSANDRA-14679:
---

 Summary: Prevent generating new tokens on a node when data exists
 Key: CASSANDRA-14679
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14679
 Project: Cassandra
  Issue Type: Bug
Reporter: mck


Data loss is possible if a node starts up without {{system.local}} data 
available.

If a node restarts and its {{system.local}} data is unavailable, it will 
generate new tokens. This will cause range movements in the cluster causing 
potential data loss, and these range movements are not part of a 
bootstrap/decommission and leave orphaned data around the cluster.

This can happen if a node restarts without a JBOD entry available, or if the 
cassandra.yaml changes and leaves a JBOD entry out.

If a node starts up, finds data but not its {{system.local}} it should not 
generate new tokens. Neither should it assign itself a new Host ID.

This is described in more detail in 
http://thelastpickle.com/blog/2018/08/22/the-fine-print-when-using-multiple-data-directories.html
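
The guard proposed above could look roughly like the following. This is a minimal Python sketch for illustration only, not Cassandra's startup code (which is Java). The function names, the assumed on-disk layout `<data_dir>/<keyspace>/<table>-<id>/`, and the choice to raise an error are all assumptions made for the example.

```python
from pathlib import Path


def has_system_local(data_dirs):
    """True if any configured data directory holds files for system.local."""
    for data_dir in data_dirs:
        system_dir = Path(data_dir) / "system"
        if not system_dir.is_dir():
            continue
        for table_dir in system_dir.iterdir():
            # Assumed layout: <data_dir>/system/local-<table id>/
            is_local = table_dir.name == "local" or table_dir.name.startswith("local-")
            if is_local and table_dir.is_dir() and any(table_dir.iterdir()):
                return True
    return False


def has_user_data(data_dirs):
    """True if any non-system keyspace directory contains at least one file."""
    for data_dir in data_dirs:
        root = Path(data_dir)
        if not root.is_dir():
            continue
        for keyspace_dir in root.iterdir():
            if not keyspace_dir.is_dir() or keyspace_dir.name.startswith("system"):
                continue
            if any(p.is_file() for p in keyspace_dir.rglob("*")):
                return True
    return False


def check_startup(data_dirs):
    """Refuse to start (and so to generate tokens) if data exists without system.local."""
    if has_user_data(data_dirs) and not has_system_local(data_dirs):
        raise RuntimeError(
            "data found but system.local is missing; refusing to generate "
            "new tokens or a new Host ID"
        )
```

A brand-new node passes the check because it has no data at all, and a healthy restart passes because `system.local` is present. Only the case described in this issue, where a JBOD entry holding `system.local` has gone missing, is rejected.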






[jira] [Updated] (CASSANDRA-14679) Prevent generating new tokens on a node when data exists

2018-08-30 Thread mck (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck updated CASSANDRA-14679:

Description: 
Data loss is possible if a node starts up without {{system.local}} data 
available.

If a node restarts and its {{system.local}} data is unavailable, it will 
generate new tokens. This will cause range movements in the cluster causing 
potential data loss, as these range movements are not part of a 
bootstrap/decommission and leave orphaned data around the cluster.

This can happen if a node restarts without a JBOD entry available, or if the 
cassandra.yaml changes and leaves a JBOD entry out.

If a node starts up, finds data but not its {{system.local}} it should not 
generate new tokens. Neither should it assign itself a new Host ID.

This is described in more detail in 
http://thelastpickle.com/blog/2018/08/22/the-fine-print-when-using-multiple-data-directories.html

  was:
Data loss is possible if a node starts up without {{system.local}} data 
available.

If a node restarts and its {{system.local}} data is unavailable, it will 
generate new tokens. This will cause range movements in the cluster causing 
potential data loss, and these range movements are not part of a 
bootstrap/decommission and leave orphaned data around the cluster.

This can happen if a node restarts without a JBOD entry available, or if the 
cassandra.yaml changes and leaves a JBOD entry out.

If a node starts up, finds data but not its {{system.local}} it should not 
generate new tokens. Neither should it assign itself a new Host ID.

This is described in more detail in 
http://thelastpickle.com/blog/2018/08/22/the-fine-print-when-using-multiple-data-directories.html


> Prevent generating new tokens on a node when data exists
> 
>
> Key: CASSANDRA-14679
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14679
> Project: Cassandra
>  Issue Type: Bug
>Reporter: mck
>Priority: Critical
>
> Data loss is possible if a node starts up without {{system.local}} data 
> available.
> If a node restarts and its {{system.local}} data is unavailable, it will 
> generate new tokens. This will cause range movements in the cluster causing 
> potential data loss, as these range movements are not part of a 
> bootstrap/decommission and leave orphaned data around the cluster.
> This can happen if a node restarts without a JBOD entry available, or if the 
> cassandra.yaml changes and leaves a JBOD entry out.
> If a node starts up, finds data but not its {{system.local}} it should not 
> generate new tokens. Neither should it assign itself a new Host ID.
> This is described in more detail in 
> http://thelastpickle.com/blog/2018/08/22/the-fine-print-when-using-multiple-data-directories.html






[jira] [Updated] (CASSANDRA-14679) Prevent generating new tokens on a node when data exists

2018-08-30 Thread mck (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck updated CASSANDRA-14679:

Reproduced In: 2.1.12
Since Version: 2.1.12

> Prevent generating new tokens on a node when data exists
> 
>
> Key: CASSANDRA-14679
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14679
> Project: Cassandra
>  Issue Type: Bug
>Reporter: mck
>Priority: Critical
>
> Data loss is possible if a node starts up without {{system.local}} data 
> available.
> If a node restarts and its {{system.local}} data is unavailable, it will 
> generate new tokens. This will cause range movements in the cluster causing 
> potential data loss, as these range movements are not part of a 
> bootstrap/decommission and leave orphaned data around the cluster.
> This can happen if a node restarts without a JBOD entry available, or if the 
> cassandra.yaml changes and leaves a JBOD entry out.
> If a node starts up, finds data but not its {{system.local}} it should not 
> generate new tokens. Neither should it assign itself a new Host ID.
> This is described in more detail in 
> http://thelastpickle.com/blog/2018/08/22/the-fine-print-when-using-multiple-data-directories.html






[jira] [Updated] (CASSANDRA-14679) Prevent generating new tokens on a node when data exists

2018-08-30 Thread mck (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck updated CASSANDRA-14679:

Since Version:   (was: 2.1.12)

> Prevent generating new tokens on a node when data exists
> 
>
> Key: CASSANDRA-14679
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14679
> Project: Cassandra
>  Issue Type: Bug
>Reporter: mck
>Priority: Critical
>
> Data loss is possible if a node starts up without {{system.local}} data 
> available.
> If a node restarts and its {{system.local}} data is unavailable, it will 
> generate new tokens. This will cause range movements in the cluster causing 
> potential data loss, as these range movements are not part of a 
> bootstrap/decommission and leave orphaned data around the cluster.
> This can happen if a node restarts without a JBOD entry available, or if the 
> cassandra.yaml changes and leaves a JBOD entry out.
> If a node starts up, finds data but not its {{system.local}} it should not 
> generate new tokens. Neither should it assign itself a new Host ID.
> This is described in more detail in 
> http://thelastpickle.com/blog/2018/08/22/the-fine-print-when-using-multiple-data-directories.html






[jira] [Commented] (CASSANDRA-13262) Incorrect cqlsh results when selecting same columns multiple times

2018-08-30 Thread mck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16597273#comment-16597273
 ] 

mck commented on CASSANDRA-13262:
-

{quote} what is the status of the patch for the 2.2, 3.0 and 3.11 branches?
{quote}
[~blerer], the issue was resolved before I added the comment with Anthony's 
patches. They were never committed because they were waiting on the [above 
comment|https://issues.apache.org/jira/browse/CASSANDRA-13262?focusedCommentId=16015553=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16015553].

Would you like me to go ahead and commit the back-ports? Or run them through 
the tests again?

> Incorrect cqlsh results when selecting same columns multiple times
> --
>
> Key: CASSANDRA-13262
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13262
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Stefan Podkowinski
>Assignee: Murukesh Mohanan
>Priority: Minor
>  Labels: lhf
> Fix For: 4.0
>
> Attachments: 
> 0001-Fix-incorrect-cqlsh-results-when-selecting-same-colu.patch, 
> CASSANDRA-13262-v2.2.txt, CASSANDRA-13262-v3.0.txt, CASSANDRA-13262-v3.11.txt
>
>
> Just stumbled over this on trunk:
> {quote}
> cqlsh:test1> select a, b, c from table1;
>  a | b| c
> ---+--+-
>  1 |b |   2
>  2 | null | 2.2
> (2 rows)
> cqlsh:test1> select a, a, b, c from table1;
>  a | a| b   | c
> ---+--+-+--
>  1 |b |   2 | null
>  2 | null | 2.2 | null
> (2 rows)
> cqlsh:test1> select a, a, a, b, c from table1;
>  a | a| a | b| c
> ---+--+---+--+--
>  1 |b |   2.0 | null | null
>  2 | null | 2.2004768 | null | null
> {quote}
> My guess is that this is on the Python side, but I haven't really looked into it.






[jira] [Comment Edited] (CASSANDRA-14689) Add developer docs for creating releases

2018-09-05 Thread mck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16604034#comment-16604034
 ] 

mck edited comment on CASSANDRA-14689 at 9/5/18 7:28 AM:
-

Pull Request @ https://github.com/apache/cassandra/pull/230


|| branch || CircleCI || 
| 
[trunk|https://github.com/thelastpickle/cassandra/tree/mck/docs--release-process]
 | 
[!https://circleci.com/gh/thelastpickle/cassandra/tree/mck%2Fdocs--release-process.svg?style=svg!|https://circleci.com/gh/thelastpickle/workflows/cassandra/tree/mck%2Fdocs--release-process]
 |

^not like the CircleCi builds checks the website build, but one can dream…


was (Author: michaelsembwever):
Pull Request @ https://github.com/apache/cassandra/pull/230


|| branch || CircleCI || 
| 
[trunk|https://github.com/thelastpickle/cassandra/tree/mck/docs--release-process]
 | 
[!https://circleci.com/gh/thelastpickle/workflows/cassandra/tree/mck%2Fdocs--release-process.svg?style=svg!|https://circleci.com/gh/thelastpickle/workflows/cassandra/tree/mck%2Fdocs--release-process]
   |


> Add developer docs for creating releases
> 
>
> Key: CASSANDRA-14689
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14689
> Project: Cassandra
>  Issue Type: Task
>  Components: Documentation and Website
>Reporter: mck
>Assignee: mck
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Provide an initial outline of the steps Release Managers follow for creating, 
> voting on, and publishing a release for Apache Cassandra.
> ASF has the following guidelines:
>  * ASF Release Policy
>  * ASF Release Distribution Policy
>  * ASF Release Best Practices
> The project is still doing some things in an outdated manner, e.g. using 
> people.apache.org URLs for staging artefacts. There is no urgent need to fix 
> these things, but by having the docs published they can be improved incrementally 
> over time.
> fyi [~mshuler]






[jira] [Comment Edited] (CASSANDRA-14689) Add developer docs for creating releases

2018-09-05 Thread mck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16604034#comment-16604034
 ] 

mck edited comment on CASSANDRA-14689 at 9/5/18 7:28 AM:
-

Pull Request @ https://github.com/apache/cassandra/pull/230


|| branch || CircleCI || 
| 
[trunk|https://github.com/thelastpickle/cassandra/tree/mck/docs--release-process]
 | 
[!https://circleci.com/gh/thelastpickle/cassandra/tree/mck%2Fdocs--release-process.svg?style=svg!|https://circleci.com/gh/thelastpickle/workflows/cassandra/tree/mck%2Fdocs--release-process]
 |

^not like the CircleCi builds check the website build, but one can dream…


was (Author: michaelsembwever):
Pull Request @ https://github.com/apache/cassandra/pull/230


|| branch || CircleCI || 
| 
[trunk|https://github.com/thelastpickle/cassandra/tree/mck/docs--release-process]
 | 
[!https://circleci.com/gh/thelastpickle/cassandra/tree/mck%2Fdocs--release-process.svg?style=svg!|https://circleci.com/gh/thelastpickle/workflows/cassandra/tree/mck%2Fdocs--release-process]
 |

^not like the CircleCi builds checks the website build, but one can dream…

> Add developer docs for creating releases
> 
>
> Key: CASSANDRA-14689
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14689
> Project: Cassandra
>  Issue Type: Task
>  Components: Documentation and Website
>Reporter: mck
>Assignee: mck
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Provide an initial outline of the steps Release Managers follow for creating, 
> voting on, and publishing a release for Apache Cassandra.
> ASF has the following guidelines:
>  * ASF Release Policy
>  * ASF Release Distribution Policy
>  * ASF Release Best Practices
> The project is still doing some things in an outdated manner, e.g. using 
> people.apache.org URLs for staging artefacts. There is no urgent need to fix 
> these things, but by having the docs published they can be improved incrementally 
> over time.
> fyi [~mshuler]






[jira] [Updated] (CASSANDRA-14689) Add developer docs for creating releases

2018-09-05 Thread mck (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck updated CASSANDRA-14689:

Status: Patch Available  (was: In Progress)

> Add developer docs for creating releases
> 
>
> Key: CASSANDRA-14689
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14689
> Project: Cassandra
>  Issue Type: Task
>  Components: Documentation and Website
>Reporter: mck
>Assignee: mck
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Provide an initial outline of the steps Release Managers follow for creating, 
> voting on, and publishing a release for Apache Cassandra.
> ASF has the following guidelines:
>  * ASF Release Policy
>  * ASF Release Distribution Policy
>  * ASF Release Best Practices
> The project is still doing some things in an outdated manner, e.g. using 
> people.apache.org URLs for staging artefacts. There is no urgent need to fix 
> these things, but by having the docs published they can be improved incrementally 
> over time.
> fyi [~mshuler]






[jira] [Commented] (CASSANDRA-14679) Prevent generating new tokens on a node when data exists

2018-08-30 Thread mck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16597160#comment-16597160
 ] 

mck commented on CASSANDRA-14679:
-

{quote}If the operator misconfigures the node by removing a directory from 
data_file_directories, I don't really think Cassandra can tell whether it was 
intentional or unintentional.{quote}

If a node starts up with data but without {{system.local}} I think it's safe to 
say this is not intentional, and we should halt the process and prevent new 
tokens from being generated. But I haven't looked through the code to see 
what's cheap to patch.
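
A rough sketch of the kind of startup check meant here, assuming the usual {{<data_dir>/<keyspace>/<table>-<id>/}} on-disk layout with {{*-Data.db}} SSTable files. The function names and the list of system keyspaces are illustrative only; this is not Cassandra's startup code, and a real check would also have to consider unflushed data in the commit log.

```python
from pathlib import Path

# Assumption: keyspaces treated as "system" data for the purpose of this sketch.
SYSTEM_KEYSPACES = {"system", "system_schema", "system_auth",
                    "system_distributed", "system_traces"}


def _has_sstables(table_dir):
    """True if the table directory holds at least one SSTable data file."""
    return table_dir.is_dir() and any(table_dir.glob("*-Data.db"))


def has_system_local(data_dirs):
    """True if any configured data directory contains SSTables for system.local."""
    return any(_has_sstables(table_dir)
               for data_dir in data_dirs
               for table_dir in Path(data_dir).glob("system/local-*"))


def has_user_data(data_dirs):
    """True if any data directory contains SSTables outside the system keyspaces."""
    for data_dir in data_dirs:
        root = Path(data_dir)
        if not root.is_dir():
            continue
        for keyspace_dir in root.iterdir():
            if not keyspace_dir.is_dir() or keyspace_dir.name in SYSTEM_KEYSPACES:
                continue
            if any(_has_sstables(table_dir) for table_dir in keyspace_dir.iterdir()):
                return True
    return False


def should_refuse_new_tokens(data_dirs):
    """Data present but no system.local: likely a misconfiguration, not a fresh node."""
    return has_user_data(data_dirs) and not has_system_local(data_dirs)
```

With a check like this the node would stop at startup instead of silently picking new tokens when a JBOD entry is missing.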

{quote}I think the best would be to store a small file alongside cassandra.yaml 
on the node to remember state information.{quote}

For a better solution I'm left wondering… maybe something through gossip that 
can reject the range movements? Something along the lines of "the other nodes 
know you already and are not (automatically) letting you change your Host ID".
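
To make the gossip idea a bit more concrete, here is a toy model of it. Everything in it ({{PeerView}}, {{accept_announcement}}, the {{operator_override}} flag) is hypothetical and only illustrates the rule "a known endpoint may not change its Host ID automatically"; it says nothing about how gossip is actually implemented.

```python
class PeerView:
    """What one existing node remembers about its peers (hypothetical model)."""

    def __init__(self):
        # endpoint address -> Host ID last accepted for that endpoint
        self.known_host_ids = {}

    def accept_announcement(self, endpoint, host_id, operator_override=False):
        """Accept a node's announced Host ID unless it conflicts with what we know.

        A changed Host ID for an already known endpoint is rejected unless an
        operator explicitly allows it.
        """
        known = self.known_host_ids.get(endpoint)
        if known is not None and known != host_id and not operator_override:
            return False
        self.known_host_ids[endpoint] = host_id
        return True
```

For example, a first announcement from 10.0.0.1 with Host ID "A" is accepted, a later one from the same endpoint with Host ID "B" is rejected, and the same announcement with the override set is accepted.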

 

> Prevent generating new tokens on a node when data exists
> 
>
> Key: CASSANDRA-14679
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14679
> Project: Cassandra
>  Issue Type: Bug
>Reporter: mck
>Priority: Critical
>
> Data loss is possible if a node starts up without {{system.local}} data 
> available.
> If a node restarts and its {{system.local}} data is unavailable it will 
> generate new tokens. This will cause range movements in the cluster, causing 
> potential data loss, as these range movements are not part of a 
> bootstrap/decommission and leave orphaned data around the cluster.
> This can happen if a node restarts without a JBOD entry available, or if the 
> cassandra.yaml changes and leaves a JBOD entry out.
> If a node starts up and finds data but not its {{system.local}}, it should not 
> generate new tokens. Neither should it assign itself a new Host ID.
> This is described in more detail in 
> http://thelastpickle.com/blog/2018/08/22/the-fine-print-when-using-multiple-data-directories.html






[jira] [Comment Edited] (CASSANDRA-14679) Prevent generating new tokens on a node when data exists

2018-08-30 Thread mck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16597160#comment-16597160
 ] 

mck edited comment on CASSANDRA-14679 at 8/30/18 7:54 AM:
--

{quote}If the operator misconfigures the node by removing a directory from 
data_file_directories, I don't really think Cassandra can tell whether it was 
intentional or unintentional.
{quote}
If a node starts up with data but without {{system.local}} I think it's safe to 
say this is not intentional, and we should prevent new tokens from being 
generated. But I haven't looked through the code to see what's cheap to patch.
{quote}I think the best would be to store a small file alongside cassandra.yaml 
on the node to remember state information.
{quote}
For a better solution I'm left wondering… maybe something through gossip that 
can reject the range movements? Something along the lines of "the other nodes 
know you already and are not (automatically) letting you change your Host ID".


was (Author: michaelsembwever):
{quote}If the operator misconfigures the node by removing a directory from 
data_file_directories, I don't really think Cassandra can tell whether it was 
intentional or unintentional.{quote}

If a node starts up with data but without {{system.local}} I think it's safe to 
say this is not intentional, and we should halt the process and prevent new 
tokens from being generated. But I haven't looked through the code to see 
what's cheap to patch.

{quote}I think the best would be to store a small file alongside cassandra.yaml 
on the node to remember state information.{quote}

For a better solution I'm left wondering… maybe something through gossip that 
can reject the range movements? Something along the lines of "the other nodes 
know you already and are not (automatically) letting you change your Host ID".

 

> Prevent generating new tokens on a node when data exists
> 
>
> Key: CASSANDRA-14679
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14679
> Project: Cassandra
>  Issue Type: Bug
>Reporter: mck
>Priority: Critical
>
> Data loss is possible if a node starts up without {{system.local}} data 
> available.
> If a node restarts and its {{system.local}} data is unavailable it will 
> generate new tokens. This will cause range movements in the cluster, causing 
> potential data loss, as these range movements are not part of a 
> bootstrap/decommission and leave orphaned data around the cluster.
> This can happen if a node restarts without a JBOD entry available, or if the 
> cassandra.yaml changes and leaves a JBOD entry out.
> If a node starts up and finds data but not its {{system.local}}, it should not 
> generate new tokens. Neither should it assign itself a new Host ID.
> This is described in more detail in 
> http://thelastpickle.com/blog/2018/08/22/the-fine-print-when-using-multiple-data-directories.html






[jira] [Commented] (CASSANDRA-13262) Incorrect cqlsh results when selecting same columns multiple times

2018-08-30 Thread mck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16597316#comment-16597316
 ] 

mck commented on CASSANDRA-13262:
-

still waiting on builds.apache.org to come back…

|| branch || testall || dtest ||
| [cassandra-2.2_13262|https://github.com/michaelsembwever/cassandra/tree/mck/cassandra-2.2_13262] | [!https://circleci.com/gh/michaelsembwever/cassandra/tree/mck%2Fcassandra-2.2_13262?style=svg!|https://circleci.com/gh/michaelsembwever/cassandra/tree/mck%2Fcassandra-2.2_13262] | [!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/XX/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/XX/] |
| [cassandra-3.0_13262|https://github.com/michaelsembwever/cassandra/tree/mck/cassandra-3.0_13262] | [!https://circleci.com/gh/michaelsembwever/cassandra/tree/mck%2Fcassandra-3.0_13262?style=svg!|https://circleci.com/gh/michaelsembwever/cassandra/tree/mck%2Fcassandra-3.0_13262] | [!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/XX/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/XX/] |
| [cassandra-3.11_13262|https://github.com/michaelsembwever/cassandra/tree/mck/cassandra-3.11_13262] | [!https://circleci.com/gh/michaelsembwever/cassandra/tree/mck%2Fcassandra-3.11_13262?style=svg!|https://circleci.com/gh/michaelsembwever/cassandra/tree/mck%2Fcassandra-3.11_13262] | [!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/XX/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/XX/] |


> Incorrect cqlsh results when selecting same columns multiple times
> --
>
> Key: CASSANDRA-13262
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13262
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Stefan Podkowinski
>Assignee: Murukesh Mohanan
>Priority: Minor
>  Labels: lhf
> Fix For: 4.0
>
> Attachments: 
> 0001-Fix-incorrect-cqlsh-results-when-selecting-same-colu.patch, 
> CASSANDRA-13262-v2.2.txt, CASSANDRA-13262-v3.0.txt, CASSANDRA-13262-v3.11.txt
>
>
> Just stumbled over this on trunk:
> {quote}
> cqlsh:test1> select a, b, c from table1;
>  a | b    | c
> ---+------+-----
>  1 |    b |   2
>  2 | null | 2.2
> (2 rows)
> cqlsh:test1> select a, a, b, c from table1;
>  a | a    | b   | c
> ---+------+-----+------
>  1 |    b |   2 | null
>  2 | null | 2.2 | null
> (2 rows)
> cqlsh:test1> select a, a, a, b, c from table1;
>  a | a    | a         | b    | c
> ---+------+-----------+------+------
>  1 |    b |       2.0 | null | null
>  2 | null | 2.2004768 | null | null
> {quote}
> My guess is that this is on the Python side, but haven't really looked into it.
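
One way such output could arise on the Python side, offered as a guess rather than a diagnosis: if each row is keyed by column name before rendering, repeated names collapse into one key and the remaining values shift left. The sketch below is plain Python and does not use any cqlsh or driver code; both function names are made up for illustration.

```python
from collections import OrderedDict


def render_keyed_by_name(column_names, row):
    """Key the row by column name first; duplicate names overwrite each other."""
    keyed = OrderedDict(zip(column_names, row))
    values = list(keyed.values())
    # Pad the missing trailing cells, which would be displayed as "null".
    return values + [None] * (len(column_names) - len(values))


def render_by_position(column_names, row):
    """Keep the row positional, so repeated columns keep their own cell."""
    return list(row)
```

For {{select a, a, b, c}} with the first row from the report, {{render_keyed_by_name(['a', 'a', 'b', 'c'], (1, 1, 'b', 2.0))}} gives {{[1, 'b', 2.0, None]}}, which has the same shape as the shifted row quoted above, while {{render_by_position}} returns the row unchanged.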






[jira] [Comment Edited] (CASSANDRA-13262) Incorrect cqlsh results when selecting same columns multiple times

2018-08-30 Thread mck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16597316#comment-16597316
 ] 

mck edited comment on CASSANDRA-13262 at 8/30/18 11:04 AM:
---

still waiting on builds.apache.org to come back…

|| branch || testall || dtest ||
| [cassandra-2.2_13262|https://github.com/michaelsembwever/cassandra/tree/mck/cassandra-2.2_13262] | [!https://circleci.com/gh/michaelsembwever/cassandra/tree/mck%2Fcassandra-2.2_13262.svg?style=svg!|https://circleci.com/gh/michaelsembwever/cassandra/tree/mck%2Fcassandra-2.2_13262] | [!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/XX/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/XX/] |
| [cassandra-3.0_13262|https://github.com/michaelsembwever/cassandra/tree/mck/cassandra-3.0_13262] | [!https://circleci.com/gh/michaelsembwever/cassandra/tree/mck%2Fcassandra-3.0_13262.svg?style=svg!|https://circleci.com/gh/michaelsembwever/cassandra/tree/mck%2Fcassandra-3.0_13262] | [!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/XX/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/XX/] |
| [cassandra-3.11_13262|https://github.com/michaelsembwever/cassandra/tree/mck/cassandra-3.11_13262] | [!https://circleci.com/gh/michaelsembwever/cassandra/tree/mck%2Fcassandra-3.11_13262.svg?style=svg!|https://circleci.com/gh/michaelsembwever/cassandra/tree/mck%2Fcassandra-3.11_13262] | [!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/XX/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/XX/] |



was (Author: michaelsembwever):
still waiting on builds.apache.org to come back…

|| branch || testall || dtest ||
| [cassandra-2.2_13262|https://github.com/michaelsembwever/cassandra/tree/mck/cassandra-2.2_13262] | [!https://circleci.com/gh/michaelsembwever/cassandra/tree/mck%2Fcassandra-2.2_13262?style=svg!|https://circleci.com/gh/michaelsembwever/cassandra/tree/mck%2Fcassandra-2.2_13262] | [!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/XX/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/XX/] |
| [cassandra-3.0_13262|https://github.com/michaelsembwever/cassandra/tree/mck/cassandra-3.0_13262] | [!https://circleci.com/gh/michaelsembwever/cassandra/tree/mck%2Fcassandra-3.0_13262?style=svg!|https://circleci.com/gh/michaelsembwever/cassandra/tree/mck%2Fcassandra-3.0_13262] | [!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/XX/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/XX/] |
| [cassandra-3.11_13262|https://github.com/michaelsembwever/cassandra/tree/mck/cassandra-3.11_13262] | [!https://circleci.com/gh/michaelsembwever/cassandra/tree/mck%2Fcassandra-3.11_13262?style=svg!|https://circleci.com/gh/michaelsembwever/cassandra/tree/mck%2Fcassandra-3.11_13262] | [!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/XX/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/XX/] |


> Incorrect cqlsh results when selecting same columns multiple times
> --
>
> Key: CASSANDRA-13262
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13262
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Stefan Podkowinski
>Assignee: Murukesh Mohanan
>Priority: Minor
>  Labels: lhf
> Fix For: 4.0
>
> Attachments: 
> 0001-Fix-incorrect-cqlsh-results-when-selecting-same-colu.patch, 
> CASSANDRA-13262-v2.2.txt, CASSANDRA-13262-v3.0.txt, CASSANDRA-13262-v3.11.txt
>
>
> Just stumbled over this on trunk:
> {quote}
> cqlsh:test1> select a, b, c from table1;
>  a | b    | c
> ---+------+-----
>  1 |    b |   2
>  2 | null | 2.2
> (2 rows)
> cqlsh:test1> select a, a, b, c from table1;
>  a | a    | b   | c
> ---+------+-----+------
>  1 |    b |   2 | null
>  2 | null | 2.2 | null
> (2 rows)
> cqlsh:test1> select a, a, a, b, c from table1;
>  a | a    | a         | b    | c
> ---+------+-----------+------+------
>  1 |    b |       2.0 | null | null
>  2 | null | 2.2004768 | null | null
> {quote}
> My guess is that this is on the Python side, but haven't really looked into it.






[jira] [Comment Edited] (CASSANDRA-11105) cassandra-stress tool - InvalidQueryException: Batch too large

2018-07-05 Thread mck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-11105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16533307#comment-16533307
 ] 

mck edited comment on CASSANDRA-11105 at 7/5/18 6:34 AM:
-

for reference sake: in newer versions the syntax is {{-insert 
visits=FIXED\(10M\)}}

for example:
{code}./cassandra-stress user profile=../batch_too_large.yaml ops\(insert=1\) 
-insert visits=FIXED\(10M\) -log level=verbose 
file=~/centos_event_by_patient_session_event_timestamp_insert_only.log -node 
10.211.55.8{code}
"FIXED" can also be any of the specifications found 
[here|https://github.com/apache/cassandra/blob/8b3a60b9a7dbefeecc06bace617279612ec7092d/tools/stress/src/org/apache/cassandra/stress/settings/OptionDistribution.java#L158-L170]


was (Author: michaelsembwever):
for reference sake: in newer versions the syntax is {{-insert 
visits=FIXED\(10M\)}}

for example:
./cassandra-stress user profile=../batch_too_large.yaml ops\(insert=1\) -insert 
visits=FIXED\(10M\) -log level=verbose 
file=~/centos_event_by_patient_session_event_timestamp_insert_only.log -node 
10.211.55.8
"FIXED" can also be any of the specifications found 
[here|https://github.com/apache/cassandra/blob/8b3a60b9a7dbefeecc06bace617279612ec7092d/tools/stress/src/org/apache/cassandra/stress/settings/OptionDistribution.java#L158-L170]

> cassandra-stress tool - InvalidQueryException: Batch too large
> --
>
> Key: CASSANDRA-11105
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11105
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
> Environment: Cassandra 2.2.4, Java 8, CentOS 6.5
>Reporter: Ralf Steppacher
>Priority: Major
> Fix For: 4.0
>
> Attachments: 11105-trunk.txt, batch_too_large.yaml
>
>
> I am using Cassandra 2.2.4 and I am struggling to get the cassandra-stress 
> tool to work for my test scenario. I have followed the example on 
> http://www.datastax.com/dev/blog/improved-cassandra-2-1-stress-tool-benchmark-any-schema
>  to create a yaml file describing my test (attached).
> I am collecting events per user id (text, partition key). Events have a 
> session type (text), event type (text), and creation time (timestamp) 
> (clustering keys, in that order). Plus some more attributes required for 
> rendering the events in a UI. For testing purposes I ended up with the 
> following column spec and insert distribution:
> {noformat}
> columnspec:
>   - name: created_at
>     cluster: uniform(10..1)
>   - name: event_type
>     size: uniform(5..10)
>     population: uniform(1..30)
>     cluster: uniform(1..30)
>   - name: session_type
>     size: fixed(5)
>     population: uniform(1..4)
>     cluster: uniform(1..4)
>   - name: user_id
>     size: fixed(15)
>     population: uniform(1..100)
>   - name: message
>     size: uniform(10..100)
>     population: uniform(1..100B)
> insert:
>   partitions: fixed(1)
>   batchtype: UNLOGGED
>   select: fixed(1)/120
> {noformat}
> Running stress tool for just the insert prints 
> {noformat}
> Generating batches with [1..1] partitions and [0..1] rows (of [10..120] 
> total rows in the partitions)
> {noformat}
> and then immediately starts flooding me with 
> {{com.datastax.driver.core.exceptions.InvalidQueryException: Batch too 
> large}}. 
> Why I should be exceeding the {{batch_size_fail_threshold_in_kb: 50}} in the 
> {{cassandra.yaml}} I do not understand. My understanding is that the stress 
> tool should generate one row per batch. The size of a single row should not 
> exceed {{8+10*3+5*3+15*3+100*3 = 398 bytes}}. Assuming a worst case of all 
> text characters being 3 byte unicode characters. 
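
The estimate above is easy to re-check. A small sketch, taking the maximum text lengths from the column spec and the 3 bytes per character worst case from the report, and assuming the 50 KB threshold means 50 * 1024 bytes. Per-row and per-batch protocol overhead is ignored, so this is only a lower bound on the real batch size.

```python
TIMESTAMP_BYTES = 8            # created_at
WORST_CASE_BYTES_PER_CHAR = 3  # worst case assumed in the report

# Maximum text lengths taken from the column spec in the report.
MAX_TEXT_LENGTHS = {
    "event_type": 10,
    "session_type": 5,
    "user_id": 15,
    "message": 100,
}

# Assumption: batch_size_fail_threshold_in_kb: 50 is counted as 50 * 1024 bytes.
FAIL_THRESHOLD_BYTES = 50 * 1024


def worst_case_row_bytes():
    """Upper bound on the payload of one row, ignoring any protocol overhead."""
    text_bytes = sum(length * WORST_CASE_BYTES_PER_CHAR
                     for length in MAX_TEXT_LENGTHS.values())
    return TIMESTAMP_BYTES + text_bytes


def rows_below_threshold():
    """How many worst-case rows fit under the fail threshold."""
    return FAIL_THRESHOLD_BYTES // worst_case_row_bytes()


print(worst_case_row_bytes())   # 398
print(rows_below_threshold())   # 128
```

So by this estimate a single row is nowhere near the threshold; a batch would need well over a hundred worst-case rows to reach it.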
> This is how I start the attached user scenario:
> {noformat}
> [rsteppac@centos bin]$ ./cassandra-stress user 
> profile=../batch_too_large.yaml ops\(insert=1\) -log level=verbose 
> file=~/centos_event_by_patient_session_event_timestamp_insert_only.log -node 
> 10.211.55.8
> INFO  08:00:07 Did not find Netty's native epoll transport in the classpath, 
> defaulting to NIO.
> INFO  08:00:08 Using data-center name 'datacenter1' for 
> DCAwareRoundRobinPolicy (if this is incorrect, please provide the correct 
> datacenter name with DCAwareRoundRobinPolicy constructor)
> INFO  08:00:08 New Cassandra host /10.211.55.8:9042 added
> Connected to cluster: Titan_DEV
> Datatacenter: datacenter1; Host: /10.211.55.8; Rack: rack1
> Created schema. Sleeping 1s for propagation.
> Generating batches with [1..1] partitions and [0..1] rows (of [10..120] 
> total rows in the partitions)
> com.datastax.driver.core.exceptions.InvalidQueryException: Batch too large
>   at 
> com.datastax.driver.core.exceptions.InvalidQueryException.copy(InvalidQueryException.java:35)
>   at 
> 

[jira] [Commented] (CASSANDRA-11105) cassandra-stress tool - InvalidQueryException: Batch too large

2018-07-05 Thread mck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-11105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16533307#comment-16533307
 ] 

mck commented on CASSANDRA-11105:
-

for reference sake: in newer versions the syntax is {{-insert 
visits=FIXED\(10M\)}}

for example:
./cassandra-stress user profile=../batch_too_large.yaml ops\(insert=1\) -insert 
visits=FIXED\(10M\) -log level=verbose 
file=~/centos_event_by_patient_session_event_timestamp_insert_only.log -node 
10.211.55.8
"FIXED" can also be any of the specifications found 
[here|https://github.com/apache/cassandra/blob/8b3a60b9a7dbefeecc06bace617279612ec7092d/tools/stress/src/org/apache/cassandra/stress/settings/OptionDistribution.java#L158-L170]

> cassandra-stress tool - InvalidQueryException: Batch too large
> --
>
> Key: CASSANDRA-11105
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11105
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
> Environment: Cassandra 2.2.4, Java 8, CentOS 6.5
>Reporter: Ralf Steppacher
>Priority: Major
> Fix For: 4.0
>
> Attachments: 11105-trunk.txt, batch_too_large.yaml
>
>
> I am using Cassandra 2.2.4 and I am struggling to get the cassandra-stress 
> tool to work for my test scenario. I have followed the example on 
> http://www.datastax.com/dev/blog/improved-cassandra-2-1-stress-tool-benchmark-any-schema
>  to create a yaml file describing my test (attached).
> I am collecting events per user id (text, partition key). Events have a 
> session type (text), event type (text), and creation time (timestamp) 
> (clustering keys, in that order). Plus some more attributes required for 
> rendering the events in a UI. For testing purposes I ended up with the 
> following column spec and insert distribution:
> {noformat}
> columnspec:
>   - name: created_at
>     cluster: uniform(10..1)
>   - name: event_type
>     size: uniform(5..10)
>     population: uniform(1..30)
>     cluster: uniform(1..30)
>   - name: session_type
>     size: fixed(5)
>     population: uniform(1..4)
>     cluster: uniform(1..4)
>   - name: user_id
>     size: fixed(15)
>     population: uniform(1..100)
>   - name: message
>     size: uniform(10..100)
>     population: uniform(1..100B)
> insert:
>   partitions: fixed(1)
>   batchtype: UNLOGGED
>   select: fixed(1)/120
> {noformat}
> Running stress tool for just the insert prints 
> {noformat}
> Generating batches with [1..1] partitions and [0..1] rows (of [10..120] 
> total rows in the partitions)
> {noformat}
> and then immediately starts flooding me with 
> {{com.datastax.driver.core.exceptions.InvalidQueryException: Batch too 
> large}}. 
> Why I should be exceeding the {{batch_size_fail_threshold_in_kb: 50}} in the 
> {{cassandra.yaml}} I do not understand. My understanding is that the stress 
> tool should generate one row per batch. The size of a single row should not 
> exceed {{8+10*3+5*3+15*3+100*3 = 398 bytes}}. Assuming a worst case of all 
> text characters being 3 byte unicode characters. 
> This is how I start the attached user scenario:
> {noformat}
> [rsteppac@centos bin]$ ./cassandra-stress user 
> profile=../batch_too_large.yaml ops\(insert=1\) -log level=verbose 
> file=~/centos_event_by_patient_session_event_timestamp_insert_only.log -node 
> 10.211.55.8
> INFO  08:00:07 Did not find Netty's native epoll transport in the classpath, 
> defaulting to NIO.
> INFO  08:00:08 Using data-center name 'datacenter1' for 
> DCAwareRoundRobinPolicy (if this is incorrect, please provide the correct 
> datacenter name with DCAwareRoundRobinPolicy constructor)
> INFO  08:00:08 New Cassandra host /10.211.55.8:9042 added
> Connected to cluster: Titan_DEV
> Datatacenter: datacenter1; Host: /10.211.55.8; Rack: rack1
> Created schema. Sleeping 1s for propagation.
> Generating batches with [1..1] partitions and [0..1] rows (of [10..120] 
> total rows in the partitions)
> com.datastax.driver.core.exceptions.InvalidQueryException: Batch too large
>   at 
> com.datastax.driver.core.exceptions.InvalidQueryException.copy(InvalidQueryException.java:35)
>   at 
> com.datastax.driver.core.DefaultResultSetFuture.extractCauseFromExecutionException(DefaultResultSetFuture.java:271)
>   at 
> com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:185)
>   at 
> com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:55)
>   at 
> org.apache.cassandra.stress.operations.userdefined.SchemaInsert$JavaDriverRun.run(SchemaInsert.java:87)
>   at 
> org.apache.cassandra.stress.Operation.timeWithRetry(Operation.java:159)
>   at 
> org.apache.cassandra.stress.operations.userdefined.SchemaInsert.run(SchemaInsert.java:119)
>   at 
> 

[jira] [Commented] (CASSANDRA-14423) SSTables stop being compacted

2018-07-09 Thread mck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16537820#comment-16537820
 ] 

mck commented on CASSANDRA-14423:
-

This will break running Cassandra-2.2 on jdk1.7


> SSTables stop being compacted
> -
>
> Key: CASSANDRA-14423
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14423
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Kurt Greaves
>Assignee: Kurt Greaves
>Priority: Blocker
> Fix For: 2.2.13, 3.0.17, 3.11.3
>
>
> So seeing a problem in 3.11.0 where SSTables are being lost from the view and 
> not being included in compactions/as candidates for compaction. It seems to 
> get progressively worse until there's only 1-2 SSTables in the view which 
> happen to be the most recent SSTables and thus compactions completely stop 
> for that table.
> The SSTables seem to still be included in reads, just not compactions.
> The issue can be fixed by restarting C*, as it will reload all SSTables into 
> the view, but this is only a temporary fix. User-defined/major compactions 
> still work; it is not clear whether they include the result back in the view, 
> but this is not a good workaround.
> This also results in a discrepancy between SSTable count and SSTables in 
> levels for any table using LCS.
> {code:java}
> Keyspace : xxx
> Read Count: 57761088
> Read Latency: 0.10527088681224288 ms.
> Write Count: 2513164
> Write Latency: 0.018211106398149903 ms.
> Pending Flushes: 0
> Table: xxx
> SSTable count: 10
> SSTables in each level: [2, 0, 0, 0, 0, 0, 0, 0, 0]
> Space used (live): 894498746
> Space used (total): 894498746
> Space used by snapshots (total): 0
> Off heap memory used (total): 11576197
> SSTable Compression Ratio: 0.6956629530569777
> Number of keys (estimate): 3562207
> Memtable cell count: 0
> Memtable data size: 0
> Memtable off heap memory used: 0
> Memtable switch count: 87
> Local read count: 57761088
> Local read latency: 0.108 ms
> Local write count: 2513164
> Local write latency: NaN ms
> Pending flushes: 0
> Percent repaired: 86.33
> Bloom filter false positives: 43
> Bloom filter false ratio: 0.0
> Bloom filter space used: 8046104
> Bloom filter off heap memory used: 8046024
> Index summary off heap memory used: 3449005
> Compression metadata off heap memory used: 81168
> Compacted partition minimum bytes: 104
> Compacted partition maximum bytes: 5722
> Compacted partition mean bytes: 175
> Average live cells per slice (last five minutes): 1.0
> Maximum live cells per slice (last five minutes): 1
> Average tombstones per slice (last five minutes): 1.0
> Maximum tombstones per slice (last five minutes): 1
> Dropped Mutations: 0
> {code}
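
The mismatch in the output above (10 SSTables counted, only 2 accounted for in the levels) can be spotted mechanically. A minimal sketch that parses the two relevant lines of this stats output; the function names are made up for illustration and the parsing assumes the line formats shown here.

```python
import re


def sstable_counts(stats_text):
    """Return (sstable_count, sstables_in_levels), or None if either line is missing."""
    count_match = re.search(r"SSTable count:\s*(\d+)", stats_text)
    levels_match = re.search(r"SSTables in each level:\s*\[([^\]]*)\]", stats_text)
    if not count_match or not levels_match:
        return None
    total = int(count_match.group(1))
    # Tolerate entries written as "count/limit" by keeping only the count part.
    in_levels = sum(int(entry.split("/")[0])
                    for entry in levels_match.group(1).split(",")
                    if entry.strip())
    return total, in_levels


def missing_from_levels(stats_text):
    """Number of SSTables counted for the table but absent from the level listing."""
    counts = sstable_counts(stats_text)
    if counts is None:
        return None
    total, in_levels = counts
    return total - in_levels
```

Run against the LCS table above this reports 8 SSTables missing from the levels, which is the discrepancy described.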
> Also for STCS we've confirmed that the SSTable count will differ from the 
> number of SSTables reported in the compaction buckets. In the example below 
> there are only 3 SSTables in a single bucket; no more are listed for this 
> table. Compaction thresholds haven't been modified for this table and it's a 
> very basic KV schema.
> {code:java}
> Keyspace : yyy
> Read Count: 30485
> Read Latency: 0.06708991307200263 ms.
> Write Count: 57044
> Write Latency: 0.02204061776873992 ms.
> Pending Flushes: 0
> Table: yyy
> SSTable count: 19
> Space used (live): 18195482
> Space used (total): 18195482
> Space used by snapshots (total): 0
> Off heap memory used (total): 747376
> SSTable Compression Ratio: 0.7607394576769735
> Number of keys (estimate): 116074
> Memtable cell count: 0
> Memtable data size: 0
> Memtable off heap memory used: 0
> Memtable switch count: 39
> Local read count: 30485
> Local read latency: NaN ms
> Local write count: 57044
> Local write latency: NaN ms
> Pending flushes: 0
> Percent repaired: 79.76
> Bloom filter false positives: 0
> Bloom filter false ratio: 0.0
> Bloom filter space used: 690912
> Bloom filter off heap memory used: 690760
> Index summary off heap memory used: 54736
> Compression metadata off heap memory used: 1880
> Compacted partition minimum bytes: 73
> Compacted partition maximum bytes: 124
> Compacted partition mean bytes: 96
> Average live cells per slice (last five minutes): NaN
> Maximum live cells per slice (last five minutes): 0
> Average tombstones per slice (last five minutes): NaN
> Maximum tombstones per slice (last five minutes): 0
> Dropped Mutations: 0 
> {code}
> {code:java}
> Apr 27 03:10:39 cassandra[9263]: TRACE o.a.c.d.c.SizeTieredCompactionStrategy 
> Compaction buckets are 
> 

[jira] [Updated] (CASSANDRA-14563) Add animalsniffer to build to ensure runtime jdk compatibility

2018-07-09 Thread mck (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck updated CASSANDRA-14563:

Labels: lhf  (was: )

> Add animalsniffer to build to ensure runtime jdk compatibility
> -
>
> Key: CASSANDRA-14563
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14563
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Build
>Reporter: mck
>Priority: Minor
>  Labels: lhf
>
> Cassandra-2.2 still supports running on JDK1.7
> No tests check this though, as all build and test with JDK1.8
> Adding the ant animalsniffer task can check that jdk1.8 classes or methods 
> are not used accidentally.
> ref: http://www.mojohaus.org/animal-sniffer/animal-sniffer/index.html






[jira] [Created] (CASSANDRA-14563) Add animalsniffer to build to ensure runtime jdk compatibility

2018-07-09 Thread mck (JIRA)
mck created CASSANDRA-14563:
---

 Summary: Add animalsniffer to build to ensure runtime jdk 
compatibility
 Key: CASSANDRA-14563
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14563
 Project: Cassandra
  Issue Type: Improvement
  Components: Build
Reporter: mck


Cassandra-2.2 still supports running on JDK1.7

No tests check this though, as all build and test with JDK1.8

Adding the ant animalsniffer task can check that jdk1.8 classes or methods are 
not used accidentally.

ref: http://www.mojohaus.org/animal-sniffer/animal-sniffer/index.html
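
The idea behind such a check, reduced to a toy: compare the API references found in the compiled classes against the set of signatures the oldest supported JDK provides, and fail the build on anything outside that set. The sketch below only illustrates that set comparison with hand-written class names; it does not read class files and is not how the Animal Sniffer task is invoked.

```python
def unsupported_references(referenced_api, supported_api):
    """Return the referenced names that the target JDK does not provide."""
    return sorted(set(referenced_api) - set(supported_api))


# Hypothetical, hand-written excerpt of what a JDK 1.7 signature set contains.
JDK7_API_EXCERPT = {
    "java.lang.String",
    "java.util.List",
    "java.util.concurrent.ConcurrentHashMap",
}

# Hypothetical references collected from compiled classes.
referenced = [
    "java.lang.String",
    "java.util.List",
    "java.util.Optional",        # added in JDK 1.8
    "java.util.stream.Stream",   # added in JDK 1.8
]

violations = unsupported_references(referenced, JDK7_API_EXCERPT)
print(violations)  # ['java.util.Optional', 'java.util.stream.Stream']
```

A build step along these lines would fail as soon as {{violations}} is non-empty, which is the accidental jdk1.8 usage this ticket wants to catch.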






[jira] [Comment Edited] (CASSANDRA-14423) SSTables stop being compacted

2018-07-09 Thread mck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16537833#comment-16537833
 ] 

mck edited comment on CASSANDRA-14423 at 7/10/18 12:36 AM:
---

||Branch||uTest||
|[cassandra-2.2_14423.1|https://github.com/thelastpickle/cassandra/tree/mck/cassandra-2.2_14423.1]|[!https://circleci.com/gh/thelastpickle/cassandra/tree/mck%2Fcassandra-2.2_14423.1.svg?style=svg!|https://circleci.com/gh/thelastpickle/cassandra/tree/mck%2Fcassandra-2.2_14423.1]|

[~mshuler], can you quickly review this?


was (Author: michaelsembwever):

||Branch||uTest||
|[cassandra-2.2_14423.1|https://github.com/thelastpickle/cassandra/tree/mck/cassandra-2.2_14423.1]|[!https://circleci.com/gh/thelastpickle/cassandra/tree/mck/cassandra-2.2_14423.1.svg?style=svg!|https://circleci.com/gh/thelastpickle/cassandra/tree/mck/cassandra-2.2_14423.1]|

[~mshuler], can you quickly review this?

> SSTables stop being compacted
> -
>
> Key: CASSANDRA-14423
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14423
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Kurt Greaves
>Assignee: Kurt Greaves
>Priority: Blocker
> Fix For: 2.2.13, 3.0.17, 3.11.3
>
>
> So seeing a problem in 3.11.0 where SSTables are being lost from the view and 
> not being included in compactions/as candidates for compaction. It seems to 
> get progressively worse until there's only 1-2 SSTables in the view which 
> happen to be the most recent SSTables and thus compactions completely stop 
> for that table.
> The SSTables seem to still be included in reads, just not compactions.
> The issue can be fixed by restarting C*, as it will reload all SSTables into 
> the view, but this is only a temporary fix. User-defined/major compactions 
> still work; it is not clear whether they include the result back in the view, 
> but this is not a good workaround.
> This also results in a discrepancy between SSTable count and SSTables in 
> levels for any table using LCS.
> {code:java}
> Keyspace : xxx
> Read Count: 57761088
> Read Latency: 0.10527088681224288 ms.
> Write Count: 2513164
> Write Latency: 0.018211106398149903 ms.
> Pending Flushes: 0
> Table: xxx
> SSTable count: 10
> SSTables in each level: [2, 0, 0, 0, 0, 0, 0, 0, 0]
> Space used (live): 894498746
> Space used (total): 894498746
> Space used by snapshots (total): 0
> Off heap memory used (total): 11576197
> SSTable Compression Ratio: 0.6956629530569777
> Number of keys (estimate): 3562207
> Memtable cell count: 0
> Memtable data size: 0
> Memtable off heap memory used: 0
> Memtable switch count: 87
> Local read count: 57761088
> Local read latency: 0.108 ms
> Local write count: 2513164
> Local write latency: NaN ms
> Pending flushes: 0
> Percent repaired: 86.33
> Bloom filter false positives: 43
> Bloom filter false ratio: 0.0
> Bloom filter space used: 8046104
> Bloom filter off heap memory used: 8046024
> Index summary off heap memory used: 3449005
> Compression metadata off heap memory used: 81168
> Compacted partition minimum bytes: 104
> Compacted partition maximum bytes: 5722
> Compacted partition mean bytes: 175
> Average live cells per slice (last five minutes): 1.0
> Maximum live cells per slice (last five minutes): 1
> Average tombstones per slice (last five minutes): 1.0
> Maximum tombstones per slice (last five minutes): 1
> Dropped Mutations: 0
> {code}
> Also for STCS we've confirmed that SSTable count will be different to the 
> number of SSTables reported in the Compaction Bucket's. In the below example 
> there's only 3 SSTables in a single bucket - no more are listed for this 
> table. Compaction thresholds haven't been modified for this table and it's a 
> very basic KV schema.
> {code:java}
> Keyspace : yyy
> Read Count: 30485
> Read Latency: 0.06708991307200263 ms.
> Write Count: 57044
> Write Latency: 0.02204061776873992 ms.
> Pending Flushes: 0
> Table: yyy
> SSTable count: 19
> Space used (live): 18195482
> Space used (total): 18195482
> Space used by snapshots (total): 0
> Off heap memory used (total): 747376
> SSTable Compression Ratio: 0.7607394576769735
> Number of keys (estimate): 116074
> Memtable cell count: 0
> Memtable data size: 0
> Memtable off heap memory used: 0
> Memtable switch count: 39
> Local read count: 30485
> Local read latency: NaN ms
> Local write count: 57044
> Local write latency: NaN ms
> Pending flushes: 0
> Percent repaired: 79.76
> Bloom filter false positives: 0
> Bloom filter false ratio: 0.0
> Bloom filter space used: 690912
> Bloom filter off heap memory used: 690760
> Index summary off 

[jira] [Commented] (CASSANDRA-14423) SSTables stop being compacted

2018-07-09 Thread mck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16537833#comment-16537833
 ] 

mck commented on CASSANDRA-14423:
-


||Branch||uTest||
|[cassandra-2.2_14423.1|https://github.com/thelastpickle/cassandra/tree/mck/cassandra-2.2_14423.1]|[!https://circleci.com/gh/thelastpickle/cassandra/tree/mck/cassandra-2.2_14423.1.svg?style=svg!|https://circleci.com/gh/thelastpickle/cassandra/tree/mck/cassandra-2.2_14423.1]|

[~mshuler], can you do a quick review of this?

> SSTables stop being compacted
> -
>
> Key: CASSANDRA-14423
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14423
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Kurt Greaves
>Assignee: Kurt Greaves
>Priority: Blocker
> Fix For: 2.2.13, 3.0.17, 3.11.3
>
>

[jira] [Comment Edited] (CASSANDRA-14423) SSTables stop being compacted

2018-07-09 Thread mck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16537833#comment-16537833
 ] 

mck edited comment on CASSANDRA-14423 at 7/10/18 12:46 AM:
---

||Branch||uTest||
|[cassandra-2.2_14423.1|https://github.com/thelastpickle/cassandra/tree/mck/cassandra-2.2_14423.1]|[!https://circleci.com/gh/thelastpickle/cassandra/tree/mck%2Fcassandra-2.2_14423.1.svg?style=svg!|https://circleci.com/gh/thelastpickle/cassandra/tree/mck%2Fcassandra-2.2_14423.1]|

[~mshuler], can you do a quick review of this?

(also created CASSANDRA-14563)


was (Author: michaelsembwever):
||Branch||uTest||
|[cassandra-2.2_14423.1|https://github.com/thelastpickle/cassandra/tree/mck/cassandra-2.2_14423.1]|[!https://circleci.com/gh/thelastpickle/cassandra/tree/mck%2Fcassandra-2.2_14423.1.svg?style=svg!|https://circleci.com/gh/thelastpickle/cassandra/tree/mck%2Fcassandra-2.2_14423.1]|

[~mshuler], can you quick review this?

> SSTables stop being compacted
> -
>
> Key: CASSANDRA-14423
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14423
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Kurt Greaves
>Assignee: Kurt Greaves
>Priority: Blocker
> Fix For: 2.2.13, 3.0.17, 3.11.3
>
>

[jira] [Commented] (CASSANDRA-14423) SSTables stop being compacted

2018-07-10 Thread mck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539403#comment-16539403
 ] 

mck commented on CASSANDRA-14423:
-

Committed as 3482370df5672c9337a16a8a52baba53b70a4fe8

> SSTables stop being compacted
> -
>
> Key: CASSANDRA-14423
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14423
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Kurt Greaves
>Assignee: Kurt Greaves
>Priority: Blocker
> Fix For: 2.2.13, 3.0.17, 3.11.3
>
>
> So seeing a problem in 3.11.0 where SSTables are being lost from the view and 
> not being included in compactions/as candidates for compaction. It seems to 
> get progressively worse until there's only 1-2 SSTables in the view which 
> happen to be the most recent SSTables and thus compactions completely stop 
> for that table.
> The SSTables seem to still be included in reads, just not compactions.
> The issue can be fixed by restarting C*, as it will reload all SSTables into 
> the view, but this is only a temporary fix. User defined/major compactions 
> still work - not clear if they include the result back in the view but is not 
> a good work around.
> This also results in a discrepancy between SSTable count and SSTables in 
> levels for any table using LCS.
> {code:java}
> Keyspace : xxx
> Read Count: 57761088
> Read Latency: 0.10527088681224288 ms.
> Write Count: 2513164
> Write Latency: 0.018211106398149903 ms.
> Pending Flushes: 0
> Table: xxx
> SSTable count: 10
> SSTables in each level: [2, 0, 0, 0, 0, 0, 0, 0, 0]
> Space used (live): 894498746
> Space used (total): 894498746
> Space used by snapshots (total): 0
> Off heap memory used (total): 11576197
> SSTable Compression Ratio: 0.6956629530569777
> Number of keys (estimate): 3562207
> Memtable cell count: 0
> Memtable data size: 0
> Memtable off heap memory used: 0
> Memtable switch count: 87
> Local read count: 57761088
> Local read latency: 0.108 ms
> Local write count: 2513164
> Local write latency: NaN ms
> Pending flushes: 0
> Percent repaired: 86.33
> Bloom filter false positives: 43
> Bloom filter false ratio: 0.0
> Bloom filter space used: 8046104
> Bloom filter off heap memory used: 8046024
> Index summary off heap memory used: 3449005
> Compression metadata off heap memory used: 81168
> Compacted partition minimum bytes: 104
> Compacted partition maximum bytes: 5722
> Compacted partition mean bytes: 175
> Average live cells per slice (last five minutes): 1.0
> Maximum live cells per slice (last five minutes): 1
> Average tombstones per slice (last five minutes): 1.0
> Maximum tombstones per slice (last five minutes): 1
> Dropped Mutations: 0
> {code}
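The mismatch in the output above can be checked mechanically: the table reports an SSTable count of 10, while the per-level list accounts for only 2. What follows is a minimal sketch, assuming the two stats lines appear in the text format shown above. The class and method names are hypothetical and not part of Cassandra or nodetool, and the handling of a "count/max" level entry is an assumption about how over-full levels may be printed.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Cassandra or nodetool.
final class LevelCountCheck {
    private static final Pattern COUNT =
        Pattern.compile("SSTable count:\\s*(\\d+)");
    private static final Pattern LEVELS =
        Pattern.compile("SSTables in each level:\\s*\\[([^\\]]*)\\]");

    /** Returns "SSTable count" minus the sum of the per-level counts. */
    static int missingFromLevels(String stats) {
        Matcher count = COUNT.matcher(stats);
        Matcher levels = LEVELS.matcher(stats);
        if (!count.find() || !levels.find()) {
            throw new IllegalArgumentException("expected both SSTable lines");
        }
        int inLevels = 0;
        for (String entry : levels.group(1).split(",")) {
            String value = entry.trim();
            if (value.isEmpty()) continue;
            // Assumption: an entry may be printed as "count/max"; keep the count.
            int slash = value.indexOf('/');
            inLevels += Integer.parseInt(slash < 0 ? value : value.substring(0, slash));
        }
        return Integer.parseInt(count.group(1)) - inLevels;
    }
}
```

For the {{xxx}} table above, the two lines give 10 - 2, so the method would report 8 SSTables that are counted but not placed in any level.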
> Also for STCS we've confirmed that SSTable count will be different to the 
> number of SSTables reported in the Compaction Bucket's. In the below example 
> there's only 3 SSTables in a single bucket - no more are listed for this 
> table. Compaction thresholds haven't been modified for this table and it's a 
> very basic KV schema.
> {code:java}
> Keyspace : yyy
> Read Count: 30485
> Read Latency: 0.06708991307200263 ms.
> Write Count: 57044
> Write Latency: 0.02204061776873992 ms.
> Pending Flushes: 0
> Table: yyy
> SSTable count: 19
> Space used (live): 18195482
> Space used (total): 18195482
> Space used by snapshots (total): 0
> Off heap memory used (total): 747376
> SSTable Compression Ratio: 0.7607394576769735
> Number of keys (estimate): 116074
> Memtable cell count: 0
> Memtable data size: 0
> Memtable off heap memory used: 0
> Memtable switch count: 39
> Local read count: 30485
> Local read latency: NaN ms
> Local write count: 57044
> Local write latency: NaN ms
> Pending flushes: 0
> Percent repaired: 79.76
> Bloom filter false positives: 0
> Bloom filter false ratio: 0.0
> Bloom filter space used: 690912
> Bloom filter off heap memory used: 690760
> Index summary off heap memory used: 54736
> Compression metadata off heap memory used: 1880
> Compacted partition minimum bytes: 73
> Compacted partition maximum bytes: 124
> Compacted partition mean bytes: 96
> Average live cells per slice (last five minutes): NaN
> Maximum live cells per slice (last five minutes): 0
> Average tombstones per slice (last five minutes): NaN
> Maximum tombstones per slice (last five minutes): 0
> Dropped Mutations: 0 
> {code}
> {code:java}
> Apr 27 03:10:39 cassandra[9263]: TRACE o.a.c.d.c.SizeTieredCompactionStrategy 
> Compaction buckets are 
> 

[jira] [Commented] (CASSANDRA-13457) Diag. Events: Add base classes

2018-03-07 Thread mck (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16390850#comment-16390850
 ] 

mck commented on CASSANDRA-13457:
-

{quote}Can we avoid the static fields, so as not to add to the 
CASSANDRA-7837 problems? I don't think C* has a better habit in place for this, 
but a singleton would be one better than all static fields…
{quote}
I can't see DiagnosticEventService not being a singleton, given the breadth of 
its context.
But if the static fields were removed (in the spirit of CASSANDRA-7840), it 
would at least provide more options for unit testing and the ability to swap out 
the DiagnosticEventService implementation (e.g. as is done in {{Tracing}}).
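
To illustrate the difference being discussed, here is a rough sketch of a singleton whose only static member is the instance reference, so a test can swap in its own implementation. Everything in it is hypothetical: the class name, the String event type and the methods are assumptions for illustration and do not reflect the actual patch or the real {{Tracing}} code.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical sketch; not the DiagnosticEventService from the patch.
class SketchEventService {
    // The single static field: the current instance, replaceable in tests.
    private static volatile SketchEventService instance = new SketchEventService();

    static SketchEventService instance() {
        return instance;
    }

    static void setInstance(SketchEventService replacement) {
        instance = replacement;
    }

    // Subscriber state is held per instance rather than in static fields.
    private final List<Consumer<String>> subscribers = new CopyOnWriteArrayList<>();

    void subscribe(Consumer<String> subscriber) {
        subscribers.add(subscriber);
    }

    void publish(String event) {
        for (Consumer<String> subscriber : subscribers) {
            subscriber.accept(event);
        }
    }
}
```

A unit test could then call {{setInstance(..)}} with a fresh or stubbed service and restore the previous one afterwards, which is not possible when all state sits in static fields.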

> Diag. Events: Add base classes
> --
>
> Key: CASSANDRA-13457
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13457
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Core, Observability
>Reporter: Stefan Podkowinski
>Assignee: Stefan Podkowinski
>Priority: Major
>
> Base ticket for adding classes that will allow you to implement and subscribe 
> to events.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-11163) Summaries are needlessly rebuilt when the BF FP ratio is changed

2018-03-11 Thread mck (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16394747#comment-16394747
 ] 

mck commented on CASSANDRA-11163:
-

Rebased the trunk patch and ran the dtests.

I don't believe the failed test has anything to do with this ticket: 
[repair_tests.repair_test.TestRepair.test_dc_parallel_repair|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/511/testReport/junit/repair_tests.repair_test/TestRepair/test_dc_parallel_repair/]

> Summaries are needlessly rebuilt when the BF FP ratio is changed
> 
>
> Key: CASSANDRA-11163
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11163
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Brandon Williams
>Assignee: Kurt Greaves
>Priority: Major
> Fix For: 3.0.x, 3.11.x, 4.x
>
>

[jira] [Commented] (CASSANDRA-11163) Summaries are needlessly rebuilt when the BF FP ratio is changed

2018-03-12 Thread mck (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16394947#comment-16394947
 ] 

mck commented on CASSANDRA-11163:
-

Committed.

> Summaries are needlessly rebuilt when the BF FP ratio is changed
> 
>
> Key: CASSANDRA-11163
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11163
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Brandon Williams
>Assignee: Kurt Greaves
>Priority: Major
> Fix For: 4.0, 3.0.17, 3.11.3
>
>
> This is from trunk, but I also saw this happen on 2.0:
> Before:
> {noformat}
> root@bw-1:/srv/cassandra# ls -ltr 
> /var/lib/cassandra/data/keyspace1/standard1-071efdc0d11811e590c3413ee28a6c90/
> total 221460
> drwxr-xr-x 2 root root      4096 Feb 11 23:34 backups
> -rw-r--r-- 1 root root        80 Feb 11 23:50 ma-6-big-TOC.txt
> -rw-r--r-- 1 root root     26518 Feb 11 23:50 ma-6-big-Summary.db
> -rw-r--r-- 1 root root     10264 Feb 11 23:50 ma-6-big-Statistics.db
> -rw-r--r-- 1 root root   2607705 Feb 11 23:50 ma-6-big-Index.db
> -rw-r--r-- 1 root root    192440 Feb 11 23:50 ma-6-big-Filter.db
> -rw-r--r-- 1 root root        10 Feb 11 23:50 ma-6-big-Digest.crc32
> -rw-r--r-- 1 root root  35212125 Feb 11 23:50 ma-6-big-Data.db
> -rw-r--r-- 1 root root      2156 Feb 11 23:50 ma-6-big-CRC.db
> -rw-r--r-- 1 root root        80 Feb 11 23:50 ma-7-big-TOC.txt
> -rw-r--r-- 1 root root     26518 Feb 11 23:50 ma-7-big-Summary.db
> -rw-r--r-- 1 root root     10264 Feb 11 23:50 ma-7-big-Statistics.db
> -rw-r--r-- 1 root root   2607614 Feb 11 23:50 ma-7-big-Index.db
> -rw-r--r-- 1 root root    192432 Feb 11 23:50 ma-7-big-Filter.db
> -rw-r--r-- 1 root root         9 Feb 11 23:50 ma-7-big-Digest.crc32
> -rw-r--r-- 1 root root  35190400 Feb 11 23:50 ma-7-big-Data.db
> -rw-r--r-- 1 root root      2152 Feb 11 23:50 ma-7-big-CRC.db
> -rw-r--r-- 1 root root        80 Feb 11 23:50 ma-5-big-TOC.txt
> -rw-r--r-- 1 root root    104178 Feb 11 23:50 ma-5-big-Summary.db
> -rw-r--r-- 1 root root     10264 Feb 11 23:50 ma-5-big-Statistics.db
> -rw-r--r-- 1 root root  10289077 Feb 11 23:50 ma-5-big-Index.db
> -rw-r--r-- 1 root root    757384 Feb 11 23:50 ma-5-big-Filter.db
> -rw-r--r-- 1 root root         9 Feb 11 23:50 ma-5-big-Digest.crc32
> -rw-r--r-- 1 root root 139201355 Feb 11 23:50 ma-5-big-Data.db
> -rw-r--r-- 1 root root      8508 Feb 11 23:50 ma-5-big-CRC.db
> root@bw-1:/srv/cassandra# md5sum 
> /var/lib/cassandra/data/keyspace1/standard1-071efdc0d11811e590c3413ee28a6c90/ma-5-big-Summary.db
> 5fca154fc790f7cfa37e8ad6d1c7552c
> {noformat}
> BF ratio changed, node restarted:
> {noformat}
> root@bw-1:/srv/cassandra# ls -ltr 
> /var/lib/cassandra/data/keyspace1/standard1-071efdc0d11811e590c3413ee28a6c90/
> total 242168
> drwxr-xr-x 2 root root      4096 Feb 11 23:34 backups
> -rw-r--r-- 1 root root        80 Feb 11 23:50 ma-6-big-TOC.txt
> -rw-r--r-- 1 root root     10264 Feb 11 23:50 ma-6-big-Statistics.db
> -rw-r--r-- 1 root root   2607705 Feb 11 23:50 ma-6-big-Index.db
> -rw-r--r-- 1 root root    192440 Feb 11 23:50 ma-6-big-Filter.db
> -rw-r--r-- 1 root root        10 Feb 11 23:50 ma-6-big-Digest.crc32
> -rw-r--r-- 1 root root  35212125 Feb 11 23:50 ma-6-big-Data.db
> -rw-r--r-- 1 root root      2156 Feb 11 23:50 ma-6-big-CRC.db
> -rw-r--r-- 1 root root        80 Feb 11 23:50 ma-7-big-TOC.txt
> -rw-r--r-- 1 root root     10264 Feb 11 23:50 ma-7-big-Statistics.db
> -rw-r--r-- 1 root root   2607614 Feb 11 23:50 ma-7-big-Index.db
> -rw-r--r-- 1 root root    192432 Feb 11 23:50 ma-7-big-Filter.db
> -rw-r--r-- 1 root root         9 Feb 11 23:50 ma-7-big-Digest.crc32
> -rw-r--r-- 1 root root  35190400 Feb 11 23:50 ma-7-big-Data.db
> -rw-r--r-- 1 root root      2152 Feb 11 23:50 ma-7-big-CRC.db
> -rw-r--r-- 1 root root        80 Feb 11 23:50 ma-5-big-TOC.txt
> -rw-r--r-- 1 root root     10264 Feb 11 23:50 ma-5-big-Statistics.db
> -rw-r--r-- 1 root root  10289077 Feb 11 23:50 ma-5-big-Index.db
> -rw-r--r-- 1 root root    757384 Feb 11 23:50 ma-5-big-Filter.db
> -rw-r--r-- 1 root root         9 Feb 11 23:50 ma-5-big-Digest.crc32
> -rw-r--r-- 1 root root 139201355 Feb 11 23:50 ma-5-big-Data.db
> -rw-r--r-- 1 root root      8508 Feb 11 23:50 ma-5-big-CRC.db
> -rw-r--r-- 1 root root        80 Feb 12 00:03 ma-8-big-TOC.txt
> -rw-r--r-- 1 root root     14902 Feb 12 00:03 ma-8-big-Summary.db
> -rw-r--r-- 1 root root     10264 Feb 12 00:03 ma-8-big-Statistics.db
> -rw-r--r-- 1 root root   1458631 Feb 12 00:03 ma-8-big-Index.db
> -rw-r--r-- 1 root root     10808 Feb 12 00:03 ma-8-big-Filter.db
> -rw-r--r-- 1 root root        10 Feb 12 00:03 ma-8-big-Digest.crc32
> -rw-r--r-- 1 root root  19660275 Feb 12 00:03 ma-8-big-Data.db
> -rw-r--r-- 1 root root      1204 Feb 12 00:03 ma-8-big-CRC.db
> -rw-r--r-- 1 root root   

[jira] [Updated] (CASSANDRA-11163) Summaries are needlessly rebuilt when the BF FP ratio is changed

2018-03-12 Thread mck (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck updated CASSANDRA-11163:

       Resolution: Fixed
    Fix Version/s:     (was: 3.11.x)
                       (was: 4.x)
                       (was: 3.0.x)
                   3.11.3
                   3.0.17
                   4.0
           Status: Resolved  (was: Ready to Commit)

> Summaries are needlessly rebuilt when the BF FP ratio is changed
> 
>
> Key: CASSANDRA-11163
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11163
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Brandon Williams
>Assignee: Kurt Greaves
>Priority: Major
> Fix For: 4.0, 3.0.17, 3.11.3
>
>

[jira] [Comment Edited] (CASSANDRA-14247) SASI tokenizer for simple delimiter based entries

2018-02-28 Thread mck (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16381446#comment-16381446
 ] 

mck edited comment on CASSANDRA-14247 at 3/1/18 5:40 AM:
-

Approaches to (2) are found 
[here|https://github.com/thelastpickle/cassandra/commit/0d6c8117120ef444e1aa52e49ab66aafa159677e]
 and 
[here|https://github.com/thelastpickle/cassandra/commit/c1f66d7c389ab5816b36d7d02ca2b8043bab0ecf].

The former was just my first attempt at removing the overhead of the 
{{string.split(..)}} call. The second re-codes it to use nio buffers.
It's the latter I presume we are aiming for. Is it what you had in mind 
[~mkjellman]?
A few quick stress tests showed that it was 60% (±5%) faster than the original 
patches above, working with {{world_cities_a.csv}} as input.

{quote}iterate the text left to right or right to left{quote}
Can we put that in the too-hard basket for now? 
I would think a better next step (in a new ticket) would be to improve the 
other analysers to also use ByteBuffers, as there's an obvious performance win 
here.
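
For readers without the linked commits to hand, the general idea of tokenizing directly on an nio buffer instead of calling {{string.split(..)}} can be sketched roughly as below. This is only an illustration, not the code from either commit: the class and method names are hypothetical, it assumes UTF-8 input with a single-byte (ASCII) delimiter, and it chooses to drop empty tokens.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the committed DelimiterAnalyzer.
class DelimiterSplitSketch {

    // Splits the readable bytes of `input` on `delimiter` without copying:
    // each token is a slice sharing the input's backing storage.
    // Assumes UTF-8 text and an ASCII delimiter, so the delimiter byte can
    // never occur inside a multi-byte character. Empty tokens are skipped.
    static List<ByteBuffer> split(ByteBuffer input, byte delimiter) {
        List<ByteBuffer> tokens = new ArrayList<>();
        int start = input.position();
        for (int i = start; i < input.limit(); i++) {
            if (input.get(i) == delimiter) {   // absolute get: position is untouched
                if (i > start) tokens.add(slice(input, start, i));
                start = i + 1;
            }
        }
        if (start < input.limit()) tokens.add(slice(input, start, input.limit()));
        return tokens;
    }

    // Returns a view over [from, to) of `input`, leaving `input` itself unchanged.
    private static ByteBuffer slice(ByteBuffer input, int from, int to) {
        ByteBuffer view = input.duplicate();
        view.limit(to);
        view.position(from);
        return view.slice();
    }

    // Decodes each token to a String; only needed for display or testing.
    static List<String> splitToStrings(ByteBuffer input, byte delimiter) {
        List<String> out = new ArrayList<>();
        for (ByteBuffer token : split(input, delimiter))
            out.add(StandardCharsets.UTF_8.decode(token).toString());
        return out;
    }

    public static void main(String[] args) {
        ByteBuffer row = ByteBuffer.wrap("oslo,,rome,lima".getBytes(StandardCharsets.UTF_8));
        System.out.println(splitToStrings(row, (byte) ','));   // prints [oslo, rome, lima]
    }
}
```

The saving in a sketch like this comes from avoiding the intermediate String and array allocations of a split call; whether that accounts for the 60% figure above is not something this example can show.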


was (Author: michaelsembwever):
Approaches to (2) are found 
[here|https://github.com/thelastpickle/cassandra/commit/0d6c8117120ef444e1aa52e49ab66aafa159677e]
 and 
[here|https://github.com/thelastpickle/cassandra/commit/c1f66d7c389ab5816b36d7d02ca2b8043bab0ecf].

The former was just my first attempt at removing the overhead of the 
{{string.split(..)}} call. The second re-codes it to use nio buffers.
It's the latter i presume we are aiming for. Is it what you had in mind 
[~mkjellman]?
A few quick stress test showed that it was 60% (±5%) faster than the original 
patches above, working with {{world_cities_a.csv}} as input.

{quote}iterate the text left or right or right{quote}
Can we put that in the too-hard basket for now? 
I would think a better next step (in a new ticket) would be to improve the 
other analysers to also use ByteBuffers. as there's an obvious win here.

> SASI tokenizer for simple delimiter based entries
> -
>
> Key: CASSANDRA-14247
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14247
> Project: Cassandra
>  Issue Type: Improvement
>  Components: sasi
>Reporter: mck
>Assignee: mck
>Priority: Major
> Fix For: 4.0, 3.11.x
>
>
> Currently SASI offers only two tokenizer options:
>  - NonTokenizerAnalyser
>  - StandardAnalyzer
> The latter is built upon Snowball, powerful for human languages but overkill 
> for simple tokenization.
> A simple tokenizer is proposed here. The need for this arose as a workaround 
> of CASSANDRA-11182, and to avoid the disk usage explosion when having to 
> resort to {{CONTAINS}}. See https://github.com/openzipkin/zipkin/issues/1861
> Example use of this would be:
> {code}
> CREATE CUSTOM INDEX span_annotation_query_idx 
> ON zipkin2.span (annotation_query) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' 
> WITH OPTIONS = {
> 'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.DelimiterAnalyzer', 
> 'delimiter': '░',
> 'case_sensitive': 'true', 
> 'mode': 'prefix', 
> 'analyzed': 'true'};
> {code}
> Original credit for this work goes to https://github.com/zuochangan



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-14247) SASI tokenizer for simple delimiter based entries

2018-02-28 Thread mck (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16381446#comment-16381446
 ] 

mck edited comment on CASSANDRA-14247 at 3/1/18 3:10 AM:
-

Approaches to (2) are found 
[here|https://github.com/thelastpickle/cassandra/commit/0d6c8117120ef444e1aa52e49ab66aafa159677e]
 and 
[here|https://github.com/thelastpickle/cassandra/commit/c1f66d7c389ab5816b36d7d02ca2b8043bab0ecf].

The former was just my first attempt at removing the overhead of the 
{{string.split(..)}} call. The second re-codes it to use nio buffers.
It's the latter I presume we are aiming for. Is it what you had in mind 
[~mkjellman]?
A few quick stress tests showed that it was 60% (±5%) faster than the original 
patches above, working with {{world_cities_a.csv}} as input.

{quote}iterate the text left or right or right{quote}
Can we put that in the too-hard basket for now? 
I would think a better next step (in a new ticket) would be to improve the 
other analysers to also use ByteBuffers, as there's an obvious win here.


was (Author: michaelsembwever):
Approaches to (2) are found 
[here|https://github.com/thelastpickle/cassandra/commit/0d6c8117120ef444e1aa52e49ab66aafa159677e]
 and 
[here|https://github.com/thelastpickle/cassandra/commit/c1f66d7c389ab5816b36d7d02ca2b8043bab0ecf].

The former was just my first attempt at removing the overhead of the 
{{string.split(..)}} call. The second re-codes it to use nio buffers.
It's the latter i presume we are aiming for. Is it what you had in mind 
[~mkjellman]?
A few quick stress test showed that it was 60% (±5%) faster than the original 
patches above, working with {{world_cities_a.csv}} as input.

{quote}iterate the text left or right or right{quote}
Can we put that in the too-hard basket for now? 
I think the next step would be to improve the other analysers to also use 
ByteBuffers. as there's an obvious win here.

> SASI tokenizer for simple delimiter based entries
> -
>
> Key: CASSANDRA-14247
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14247
> Project: Cassandra
>  Issue Type: Improvement
>  Components: sasi
>Reporter: mck
>Assignee: mck
>Priority: Major
> Fix For: 4.0, 3.11.x
>
>
> Currently SASI offers only two tokenizer options:
>  - NonTokenizerAnalyser
>  - StandardAnalyzer
> The latter is built upon Snowball, powerful for human languages but overkill 
> for simple tokenization.
> A simple tokenizer is proposed here. The need for this arose as a workaround 
> of CASSANDRA-11182, and to avoid the disk usage explosion when having to 
> resort to {{CONTAINS}}. See https://github.com/openzipkin/zipkin/issues/1861
> Example use of this would be:
> {code}
> CREATE CUSTOM INDEX span_annotation_query_idx 
> ON zipkin2.span (annotation_query) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' 
> WITH OPTIONS = {
> 'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.DelimiterAnalyzer', 
> 'delimiter': '░',
> 'case_sensitive': 'true', 
> 'mode': 'prefix', 
> 'analyzed': 'true'};
> {code}
> Original credit for this work goes to https://github.com/zuochangan






[jira] [Commented] (CASSANDRA-14247) SASI tokenizer for simple delimiter based entries

2018-02-28 Thread mck (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16381446#comment-16381446
 ] 

mck commented on CASSANDRA-14247:
-

Approaches to (2) are found 
[here|https://github.com/thelastpickle/cassandra/commit/0d6c8117120ef444e1aa52e49ab66aafa159677e]
 and 
[here|https://github.com/thelastpickle/cassandra/commit/c1f66d7c389ab5816b36d7d02ca2b8043bab0ecf].

The former was just my first attempt at removing the overhead of the 
{{string.split(..)}} call. The second re-codes it to use nio buffers.
It's the latter I presume we are aiming for. Is it what you had in mind 
[~mkjellman]?
A few quick stress tests showed that it was 60% (±5%) faster than the original 
patches above, working with {{world_cities_a.csv}} as input.

{quote}iterate the text left or right or right{quote}
Can we put that in the too-hard basket for now? 
I think the next step would be to improve the other analysers to also use 
ByteBuffers, as there's an obvious win here.

> SASI tokenizer for simple delimiter based entries
> -
>
> Key: CASSANDRA-14247
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14247
> Project: Cassandra
>  Issue Type: Improvement
>  Components: sasi
>Reporter: mck
>Assignee: mck
>Priority: Major
> Fix For: 4.0, 3.11.x
>
>
> Currently SASI offers only two tokenizer options:
>  - NonTokenizerAnalyser
>  - StandardAnalyzer
> The latter is built upon Snowball, powerful for human languages but overkill 
> for simple tokenization.
> A simple tokenizer is proposed here. The need for this arose as a workaround 
> of CASSANDRA-11182, and to avoid the disk usage explosion when having to 
> resort to {{CONTAINS}}. See https://github.com/openzipkin/zipkin/issues/1861
> Example use of this would be:
> {code}
> CREATE CUSTOM INDEX span_annotation_query_idx 
> ON zipkin2.span (annotation_query) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' 
> WITH OPTIONS = {
> 'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.DelimiterAnalyzer', 
> 'delimiter': '░',
> 'case_sensitive': 'true', 
> 'mode': 'prefix', 
> 'analyzed': 'true'};
> {code}
> Original credit for this work goes to https://github.com/zuochangan






[jira] [Commented] (CASSANDRA-11163) Summaries are needlessly rebuilt when the BF FP ratio is changed

2018-03-08 Thread mck (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16392525#comment-16392525
 ] 

mck commented on CASSANDRA-11163:
-


|| branch || testall || dtest ||
| [14166-3.0|https://github.com/kgreav/cassandra/tree/14166-3.0] | [testall|https://circleci.com/gh/kgreav/cassandra/tree/14166-3.0] | [dtest|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/506/] |
| [14166-3.11|https://github.com/kgreav/cassandra/tree/14166-3.11] | [testall|https://circleci.com/gh/kgreav/cassandra/tree/14166-3.11] | [dtest|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/507] |
| [14166-trunk|https://github.com/kgreav/cassandra/tree/14166-trunk] | [testall|https://circleci.com/gh/kgreav/cassandra/tree/14166-trunk] | [dtest|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/508] |

> Summaries are needlessly rebuilt when the BF FP ratio is changed
> 
>
> Key: CASSANDRA-11163
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11163
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Brandon Williams
>Assignee: Kurt Greaves
>Priority: Major
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> This is from trunk, but I also saw this happen on 2.0:
> Before:
> {noformat}
> root@bw-1:/srv/cassandra# ls -ltr 
> /var/lib/cassandra/data/keyspace1/standard1-071efdc0d11811e590c3413ee28a6c90/
> total 221460
> drwxr-xr-x 2 root root      4096 Feb 11 23:34 backups
> -rw-r--r-- 1 root root        80 Feb 11 23:50 ma-6-big-TOC.txt
> -rw-r--r-- 1 root root     26518 Feb 11 23:50 ma-6-big-Summary.db
> -rw-r--r-- 1 root root     10264 Feb 11 23:50 ma-6-big-Statistics.db
> -rw-r--r-- 1 root root   2607705 Feb 11 23:50 ma-6-big-Index.db
> -rw-r--r-- 1 root root    192440 Feb 11 23:50 ma-6-big-Filter.db
> -rw-r--r-- 1 root root        10 Feb 11 23:50 ma-6-big-Digest.crc32
> -rw-r--r-- 1 root root  35212125 Feb 11 23:50 ma-6-big-Data.db
> -rw-r--r-- 1 root root      2156 Feb 11 23:50 ma-6-big-CRC.db
> -rw-r--r-- 1 root root        80 Feb 11 23:50 ma-7-big-TOC.txt
> -rw-r--r-- 1 root root     26518 Feb 11 23:50 ma-7-big-Summary.db
> -rw-r--r-- 1 root root     10264 Feb 11 23:50 ma-7-big-Statistics.db
> -rw-r--r-- 1 root root   2607614 Feb 11 23:50 ma-7-big-Index.db
> -rw-r--r-- 1 root root    192432 Feb 11 23:50 ma-7-big-Filter.db
> -rw-r--r-- 1 root root         9 Feb 11 23:50 ma-7-big-Digest.crc32
> -rw-r--r-- 1 root root  35190400 Feb 11 23:50 ma-7-big-Data.db
> -rw-r--r-- 1 root root      2152 Feb 11 23:50 ma-7-big-CRC.db
> -rw-r--r-- 1 root root        80 Feb 11 23:50 ma-5-big-TOC.txt
> -rw-r--r-- 1 root root    104178 Feb 11 23:50 ma-5-big-Summary.db
> -rw-r--r-- 1 root root     10264 Feb 11 23:50 ma-5-big-Statistics.db
> -rw-r--r-- 1 root root  10289077 Feb 11 23:50 ma-5-big-Index.db
> -rw-r--r-- 1 root root    757384 Feb 11 23:50 ma-5-big-Filter.db
> -rw-r--r-- 1 root root         9 Feb 11 23:50 ma-5-big-Digest.crc32
> -rw-r--r-- 1 root root 139201355 Feb 11 23:50 ma-5-big-Data.db
> -rw-r--r-- 1 root root      8508 Feb 11 23:50 ma-5-big-CRC.db
> root@bw-1:/srv/cassandra# md5sum 
> /var/lib/cassandra/data/keyspace1/standard1-071efdc0d11811e590c3413ee28a6c90/ma-5-big-Summary.db
> 5fca154fc790f7cfa37e8ad6d1c7552c
> {noformat}
> BF ratio changed, node restarted:
> {noformat}
> root@bw-1:/srv/cassandra# ls -ltr 
> /var/lib/cassandra/data/keyspace1/standard1-071efdc0d11811e590c3413ee28a6c90/
> total 242168
> drwxr-xr-x 2 root root      4096 Feb 11 23:34 backups
> -rw-r--r-- 1 root root        80 Feb 11 23:50 ma-6-big-TOC.txt
> -rw-r--r-- 1 root root     10264 Feb 11 23:50 ma-6-big-Statistics.db
> -rw-r--r-- 1 root root   2607705 Feb 11 23:50 ma-6-big-Index.db
> -rw-r--r-- 1 root root    192440 Feb 11 23:50 ma-6-big-Filter.db
> -rw-r--r-- 1 root root        10 Feb 11 23:50 ma-6-big-Digest.crc32
> -rw-r--r-- 1 root root  35212125 Feb 11 23:50 ma-6-big-Data.db
> -rw-r--r-- 1 root root      2156 Feb 11 23:50 ma-6-big-CRC.db
> -rw-r--r-- 1 root root        80 Feb 11 23:50 ma-7-big-TOC.txt
> -rw-r--r-- 1 root root     10264 Feb 11 23:50 ma-7-big-Statistics.db
> -rw-r--r-- 1 root root   2607614 Feb 11 23:50 ma-7-big-Index.db
> -rw-r--r-- 1 root root    192432 Feb 11 23:50 ma-7-big-Filter.db
> -rw-r--r-- 1 root root         9 Feb 11 23:50 ma-7-big-Digest.crc32
> -rw-r--r-- 1 root root  35190400 Feb 11 23:50 ma-7-big-Data.db
> -rw-r--r-- 1 root root      2152 Feb 11 23:50 ma-7-big-CRC.db
> -rw-r--r-- 1 root root        80 Feb 11 23:50 ma-5-big-TOC.txt
> -rw-r--r-- 1 root root     10264 Feb 11 23:50 ma-5-big-Statistics.db
> -rw-r--r-- 1 root root  10289077 Feb 11 23:50 ma-5-big-Index.db
> -rw-r--r-- 1 root root    757384 Feb 11 23:50 ma-5-big-Filter.db
> 

[jira] [Comment Edited] (CASSANDRA-11163) Summaries are needlessly rebuilt when the BF FP ratio is changed

2018-03-10 Thread mck (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16392525#comment-16392525
 ] 

mck edited comment on CASSANDRA-11163 at 3/11/18 2:17 AM:
--

||branch||testall||dtest||
|[14166-3.0|https://github.com/kgreav/cassandra/tree/14166-3.0]|[testall|https://circleci.com/gh/kgreav/cassandra/tree/14166-3.0]|[dtest|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/506/]|
|[14166-3.11|https://github.com/kgreav/cassandra/tree/14166-3.11]|[testall|https://circleci.com/gh/kgreav/cassandra/tree/14166-3.11]|[dtest|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/507]|
|[14166-trunk|https://github.com/kgreav/cassandra/tree/14166-trunk]|[testall|https://circleci.com/gh/kgreav/cassandra/tree/14166-trunk]|[dtest|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/508]|

EDIT: I had 
[troubles|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/510/]
 getting the trunk patch dtests to run. As upstream trunk dtests now appear to 
be 
[working|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-trunk-dtest/462/],
 I rebased that patch onto trunk in a 
[thelastpickle|https://github.com/thelastpickle/cassandra/tree/14166-trunk] 
fork and am running the dtests 
[again|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/511/]…


was (Author: michaelsembwever):

|| branch || testall || dtest ||
| [14166-3.0|https://github.com/kgreav/cassandra/tree/14166-3.0] | [testall|https://circleci.com/gh/kgreav/cassandra/tree/14166-3.0] | [dtest|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/506/] |
| [14166-3.11|https://github.com/kgreav/cassandra/tree/14166-3.11] | [testall|https://circleci.com/gh/kgreav/cassandra/tree/14166-3.11] | [dtest|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/507] |
| [14166-trunk|https://github.com/kgreav/cassandra/tree/14166-trunk] | [testall|https://circleci.com/gh/kgreav/cassandra/tree/14166-trunk] | [dtest|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/508] |

> Summaries are needlessly rebuilt when the BF FP ratio is changed
> 
>
> Key: CASSANDRA-11163
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11163
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Brandon Williams
>Assignee: Kurt Greaves
>Priority: Major
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> This is from trunk, but I also saw this happen on 2.0:
> Before:
> {noformat}
> root@bw-1:/srv/cassandra# ls -ltr 
> /var/lib/cassandra/data/keyspace1/standard1-071efdc0d11811e590c3413ee28a6c90/
> total 221460
> drwxr-xr-x 2 root root      4096 Feb 11 23:34 backups
> -rw-r--r-- 1 root root        80 Feb 11 23:50 ma-6-big-TOC.txt
> -rw-r--r-- 1 root root     26518 Feb 11 23:50 ma-6-big-Summary.db
> -rw-r--r-- 1 root root     10264 Feb 11 23:50 ma-6-big-Statistics.db
> -rw-r--r-- 1 root root   2607705 Feb 11 23:50 ma-6-big-Index.db
> -rw-r--r-- 1 root root    192440 Feb 11 23:50 ma-6-big-Filter.db
> -rw-r--r-- 1 root root        10 Feb 11 23:50 ma-6-big-Digest.crc32
> -rw-r--r-- 1 root root  35212125 Feb 11 23:50 ma-6-big-Data.db
> -rw-r--r-- 1 root root      2156 Feb 11 23:50 ma-6-big-CRC.db
> -rw-r--r-- 1 root root        80 Feb 11 23:50 ma-7-big-TOC.txt
> -rw-r--r-- 1 root root     26518 Feb 11 23:50 ma-7-big-Summary.db
> -rw-r--r-- 1 root root     10264 Feb 11 23:50 ma-7-big-Statistics.db
> -rw-r--r-- 1 root root   2607614 Feb 11 23:50 ma-7-big-Index.db
> -rw-r--r-- 1 root root    192432 Feb 11 23:50 ma-7-big-Filter.db
> -rw-r--r-- 1 root root         9 Feb 11 23:50 ma-7-big-Digest.crc32
> -rw-r--r-- 1 root root  35190400 Feb 11 23:50 ma-7-big-Data.db
> -rw-r--r-- 1 root root      2152 Feb 11 23:50 ma-7-big-CRC.db
> -rw-r--r-- 1 root root        80 Feb 11 23:50 ma-5-big-TOC.txt
> -rw-r--r-- 1 root root    104178 Feb 11 23:50 ma-5-big-Summary.db
> -rw-r--r-- 1 root root     10264 Feb 11 23:50 ma-5-big-Statistics.db
> -rw-r--r-- 1 root root  10289077 Feb 11 23:50 ma-5-big-Index.db
> -rw-r--r-- 1 root root    757384 Feb 11 23:50 ma-5-big-Filter.db
> -rw-r--r-- 1 root root         9 Feb 11 23:50 ma-5-big-Digest.crc32
> -rw-r--r-- 1 root root 139201355 Feb 11 23:50 ma-5-big-Data.db
> -rw-r--r-- 1 root root      8508 Feb 11 23:50 ma-5-big-CRC.db
> root@bw-1:/srv/cassandra# md5sum 
> /var/lib/cassandra/data/keyspace1/standard1-071efdc0d11811e590c3413ee28a6c90/ma-5-big-Summary.db
> 5fca154fc790f7cfa37e8ad6d1c7552c
> {noformat}
> BF ratio changed, node restarted:
> {noformat}
> root@bw-1:/srv/cassandra# ls -ltr 
> 

[jira] [Updated] (CASSANDRA-14247) SASI tokenizer for simple delimiter based entries

2018-03-15 Thread mck (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck updated CASSANDRA-14247:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed.

> SASI tokenizer for simple delimiter based entries
> -
>
> Key: CASSANDRA-14247
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14247
> Project: Cassandra
>  Issue Type: Improvement
>  Components: sasi
>Reporter: mck
>Assignee: mck
>Priority: Major
>  Labels: sasi
> Fix For: 4.0, 3.11.3
>
>
> Currently SASI offers only two tokenizer options:
>  - NonTokenizerAnalyser
>  - StandardAnalyzer
> The latter is built upon Snowball, powerful for human languages but overkill 
> for simple tokenization.
> A simple tokenizer is proposed here. The need for this arose as a workaround 
> of CASSANDRA-11182, and to avoid the disk usage explosion when having to 
> resort to {{CONTAINS}}. See https://github.com/openzipkin/zipkin/issues/1861
> Example use of this would be:
> {code}
> CREATE CUSTOM INDEX span_annotation_query_idx 
> ON zipkin2.span (annotation_query) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' 
> WITH OPTIONS = {
> 'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.DelimiterAnalyzer', 
> 'delimiter': '░',
> 'case_sensitive': 'true', 
> 'mode': 'prefix', 
> 'analyzed': 'true'};
> {code}
> Original credit for this work goes to https://github.com/zuochangan






[jira] [Updated] (CASSANDRA-14247) SASI tokenizer for simple delimiter based entries

2018-03-15 Thread mck (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck updated CASSANDRA-14247:

Fix Version/s: (was: 3.11.x)
   3.11.3

> SASI tokenizer for simple delimiter based entries
> -
>
> Key: CASSANDRA-14247
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14247
> Project: Cassandra
>  Issue Type: Improvement
>  Components: sasi
>Reporter: mck
>Assignee: mck
>Priority: Major
>  Labels: sasi
> Fix For: 4.0, 3.11.3
>
>
> Currently SASI offers only two tokenizer options:
>  - NonTokenizerAnalyser
>  - StandardAnalyzer
> The latter is built upon Snowball, powerful for human languages but overkill 
> for simple tokenization.
> A simple tokenizer is proposed here. The need for this arose as a workaround 
> of CASSANDRA-11182, and to avoid the disk usage explosion when having to 
> resort to {{CONTAINS}}. See https://github.com/openzipkin/zipkin/issues/1861
> Example use of this would be:
> {code}
> CREATE CUSTOM INDEX span_annotation_query_idx 
> ON zipkin2.span (annotation_query) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' 
> WITH OPTIONS = {
> 'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.DelimiterAnalyzer', 
> 'delimiter': '░',
> 'case_sensitive': 'true', 
> 'mode': 'prefix', 
> 'analyzed': 'true'};
> {code}
> Original credit for this work goes to https://github.com/zuochangan






[jira] [Comment Edited] (CASSANDRA-14247) SASI tokenizer for simple delimiter based entries

2018-03-14 Thread mck (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399884#comment-16399884
 ] 

mck edited comment on CASSANDRA-14247 at 3/15/18 3:56 AM:
--

[~mkjellman],
{quote}one thing that stuck out to me was this while loop that didn't actually 
"do" anything but it does do something... could you at least throw a comment in 
just to make it a bit more readable?{quote}

comment thrown in :-)
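
The loop in question is not quoted in this thread, so the following is only a guess at the pattern being discussed: a {{while}} whose body is empty because the condition itself advances the buffer. The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of an empty-bodied scanning loop; not the patch's code.
class EmptyLoopSketch {

    // Advances `buf` to just past the next `delimiter` byte, or to its limit
    // if no delimiter remains, and returns the resulting position.
    static int skipPastDelimiter(ByteBuffer buf, byte delimiter) {
        // The body is intentionally empty: the relative buf.get() in the
        // condition consumes one byte per evaluation, so the condition
        // itself performs the scan.
        while (buf.hasRemaining() && buf.get() != delimiter) { }
        return buf.position();
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.wrap("ab,cd".getBytes(StandardCharsets.US_ASCII));
        System.out.println(skipPastDelimiter(buf, (byte) ','));   // prints 3
    }
}
```

Without the comment, a reader can easily take such a loop for dead code, which is presumably why the reviewer asked for one.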

The byte buffer approach is pushed to the trunk_14247 and cassandra-3.11_14247 
branches. 

The rationale for also adding this patch to cassandra-3.11 is that it's an 
important stability workaround for {{\{mode:CONTAINS\}}}, and it is a standalone 
class, annotated as {{@Beta}}, that does not touch any other code.

The following patches have been submitted:

|| branch || testall || dtest ||
| [cassandra-3.11_14247|https://github.com/thelastpickle/cassandra/tree/mck/cassandra-3.11_14247] | [testall|https://circleci.com/gh/thelastpickle/cassandra/tree/mck%2Fcassandra-3.11_14247] | [dtest|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/512] |
| [trunk_14247|https://github.com/thelastpickle/cassandra/tree/mck/trunk_14247] | [testall|https://circleci.com/gh/thelastpickle/cassandra/tree/mck%2Ftrunk_14247] | [dtest|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/513] |



was (Author: michaelsembwever):
[~mkjellman],
{quote}one thing that stuck out to me was this while loop that didn't actually 
"do" anything but it does do something... could you at least throw a comment in 
just to make it a bit more readable?{quote}

comment thrown in :-)

The byte buffer approach is pushed to the trunk_14247 and cassandra-3.11_14247 
branches. 

The rationale to adding this patch also to cassandra-3.11 is it's an important 
stability workaround to {{mode: CONTAINS}}, and it is an additional standalone 
class that does not touch other code which has been annotated as {{@Beta}}.

The following patches have been submitted:

|| branch || testall || dtest ||
| [cassandra-3.11_14247|https://github.com/thelastpickle/cassandra/tree/mck/cassandra-3.11_14247] | [testall|https://circleci.com/gh/thelastpickle/cassandra/tree/mck%2Fcassandra-3.11_14247] | [dtest|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/512] |
| [trunk_14247|https://github.com/thelastpickle/cassandra/tree/mck/trunk_14247] | [testall|https://circleci.com/gh/thelastpickle/cassandra/tree/mck%2Ftrunk_14247] | [dtest|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/513] |


> SASI tokenizer for simple delimiter based entries
> -
>
> Key: CASSANDRA-14247
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14247
> Project: Cassandra
>  Issue Type: Improvement
>  Components: sasi
>Reporter: mck
>Assignee: mck
>Priority: Major
> Fix For: 4.0, 3.11.x
>
>
> Currently SASI offers only two tokenizer options:
>  - NonTokenizerAnalyser
>  - StandardAnalyzer
> The latter is built upon Snowball, powerful for human languages but overkill 
> for simple tokenization.
> A simple tokenizer is proposed here. The need for this arose as a workaround 
> of CASSANDRA-11182, and to avoid the disk usage explosion when having to 
> resort to {{CONTAINS}}. See https://github.com/openzipkin/zipkin/issues/1861
> Example use of this would be:
> {code}
> CREATE CUSTOM INDEX span_annotation_query_idx 
> ON zipkin2.span (annotation_query) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' 
> WITH OPTIONS = {
> 'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.DelimiterAnalyzer', 
> 'delimiter': '░',
> 'case_sensitive': 'true', 
> 'mode': 'prefix', 
> 'analyzed': 'true'};
> {code}
> Original credit for this work goes to https://github.com/zuochangan






[jira] [Updated] (CASSANDRA-14247) SASI tokenizer for simple delimiter based entries

2018-03-14 Thread mck (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck updated CASSANDRA-14247:

  Labels: sasi  (was: )
Reviewer: Michael Kjellman
  Status: Patch Available  (was: In Progress)

> SASI tokenizer for simple delimiter based entries
> -
>
> Key: CASSANDRA-14247
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14247
> Project: Cassandra
>  Issue Type: Improvement
>  Components: sasi
>Reporter: mck
>Assignee: mck
>Priority: Major
>  Labels: sasi
> Fix For: 4.0, 3.11.x
>
>
> Currently SASI offers only two tokenizer options:
>  - NonTokenizerAnalyser
>  - StandardAnalyzer
> The latter is built upon Snowball, powerful for human languages but overkill 
> for simple tokenization.
> A simple tokenizer is proposed here. The need for this arose as a workaround 
> of CASSANDRA-11182, and to avoid the disk usage explosion when having to 
> resort to {{CONTAINS}}. See https://github.com/openzipkin/zipkin/issues/1861
> Example use of this would be:
> {code}
> CREATE CUSTOM INDEX span_annotation_query_idx 
> ON zipkin2.span (annotation_query) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' 
> WITH OPTIONS = {
> 'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.DelimiterAnalyzer', 
> 'delimiter': '░',
> 'case_sensitive': 'true', 
> 'mode': 'prefix', 
> 'analyzed': 'true'};
> {code}
> Original credit for this work goes to https://github.com/zuochangan






[jira] [Commented] (CASSANDRA-14247) SASI tokenizer for simple delimiter based entries

2018-03-15 Thread mck (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399964#comment-16399964
 ] 

mck commented on CASSANDRA-14247:
-

{quote}i'll add the relevant section to doc/SASI.md{quote}

Done. Added the docs update to the {{trunk_14247}} branch only.

> SASI tokenizer for simple delimiter based entries
> -
>
> Key: CASSANDRA-14247
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14247
> Project: Cassandra
>  Issue Type: Improvement
>  Components: sasi
>Reporter: mck
>Assignee: mck
>Priority: Major
>  Labels: sasi
> Fix For: 4.0, 3.11.x
>
>
> Currently SASI offers only two tokenizer options:
>  - NonTokenizerAnalyser
>  - StandardAnalyzer
> The latter is built upon Snowball, powerful for human languages but overkill 
> for simple tokenization.
> A simple tokenizer is proposed here. The need for this arose as a workaround 
> of CASSANDRA-11182, and to avoid the disk usage explosion when having to 
> resort to {{CONTAINS}}. See https://github.com/openzipkin/zipkin/issues/1861
> Example use of this would be:
> {code}
> CREATE CUSTOM INDEX span_annotation_query_idx 
> ON zipkin2.span (annotation_query) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' 
> WITH OPTIONS = {
> 'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.DelimiterAnalyzer', 
> 'delimiter': '░',
> 'case_sensitive': 'true', 
> 'mode': 'prefix', 
> 'analyzed': 'true'};
> {code}
> Original credit for this work goes to https://github.com/zuochangan






[jira] [Comment Edited] (CASSANDRA-14247) SASI tokenizer for simple delimiter based entries

2018-03-14 Thread mck (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399888#comment-16399888
 ] 

mck edited comment on CASSANDRA-14247 at 3/15/18 4:20 AM:
--

{quote}the only other thought i have is if the in-tree documentation needs to 
be updated give this is something people interact with via CQL and schema 
updates.{quote}

Yes, I'd better do that. Good catch! 

EDIT: there are actually no CQL/schema docs that go down to the details of SASI 
options. But I'll add the relevant section to {{doc/SASI.md}}.


was (Author: michaelsembwever):
{quote}the only other thought i have is if the in-tree documentation needs to 
be updated give this is something people interact with via CQL and schema 
updates.{quote}

Yes I better do that. Good catch! 

> SASI tokenizer for simple delimiter based entries
> -
>
> Key: CASSANDRA-14247
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14247
> Project: Cassandra
>  Issue Type: Improvement
>  Components: sasi
>Reporter: mck
>Assignee: mck
>Priority: Major
>  Labels: sasi
> Fix For: 4.0, 3.11.x
>
>
> Currently SASI offers only two tokenizer options:
>  - NonTokenizerAnalyser
>  - StandardAnalyzer
> The latter is built upon Snowball, powerful for human languages but overkill 
> for simple tokenization.
> A simple tokenizer is proposed here. The need for this arose as a workaround 
> of CASSANDRA-11182, and to avoid the disk usage explosion when having to 
> resort to {{CONTAINS}}. See https://github.com/openzipkin/zipkin/issues/1861
> Example use of this would be:
> {code}
> CREATE CUSTOM INDEX span_annotation_query_idx 
> ON zipkin2.span (annotation_query) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex' 
> WITH OPTIONS = {
> 'analyzer_class': 
> 'org.apache.cassandra.index.sasi.analyzer.DelimiterAnalyzer', 
> 'delimiter': '░',
> 'case_sensitive': 'true', 
> 'mode': 'prefix', 
> 'analyzed': 'true'};
> {code}
> Original credit for this work goes to https://github.com/zuochangan





