[jira] [Commented] (CASSANDRA-15030) Add support for SSL and bindable address to sidecar

2019-02-21 Thread Vinay Chella (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16774858#comment-16774858
 ] 

Vinay Chella commented on CASSANDRA-15030:
--

The latest change looks good; it now accepts both a keystore and a 
truststore.

> Add support for SSL and bindable address to sidecar
> ---
>
> Key: CASSANDRA-15030
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15030
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Sidecar
>Reporter: Dinesh Joshi
>Assignee: Dinesh Joshi
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We need to support SSL for the sidecar's REST interface. We should also have 
> the ability to bind the sidecar's API to a specific network interface. This 
> patch adds support for both.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14298) cqlshlib tests broken on b.a.o

2019-02-21 Thread Dinesh Joshi (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16774844#comment-16774844
 ] 

Dinesh Joshi commented on CASSANDRA-14298:
--

Thank you for the patch, [~ptbannister]. I have imported your patch into a git 
branch to make it easier to review. I have opened this PR for review: 
https://github.com/apache/cassandra-dtest/pull/45/files I will go over it and 
add my comments. [~spo...@gmail.com] please feel free to review as well.

> cqlshlib tests broken on b.a.o
> --
>
> Key: CASSANDRA-14298
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14298
> Project: Cassandra
>  Issue Type: Bug
>  Components: Build, Legacy/Testing
>Reporter: Stefan Podkowinski
>Assignee: Patrick Bannister
>Priority: Major
>  Labels: cqlsh, dtest, pull-request-available
> Attachments: CASSANDRA-14298.txt, cqlsh_tests_notes.md
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> It appears that cqlsh-tests on builds.apache.org on all branches stopped 
> working since we removed nosetests from the system environment. See e.g. 
> [here|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-trunk-cqlsh-tests/458/cython=no,jdk=JDK%201.8%20(latest),label=cassandra/console].
>  Looks like we either have to make nosetests available again or migrate to 
> pytest as we did with dtests. Giving pytest a quick try resulted in many 
> errors locally, but I haven't inspected them in detail yet. 






[jira] [Comment Edited] (CASSANDRA-14298) cqlshlib tests broken on b.a.o

2019-02-21 Thread Dinesh Joshi (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16774844#comment-16774844
 ] 

Dinesh Joshi edited comment on CASSANDRA-14298 at 2/22/19 6:45 AM:
---

Thank you for the patch, [~ptbannister]. I have imported your patch into a git 
branch and rebased it on the current master to make it easier to review. I 
have opened this PR for review: 
https://github.com/apache/cassandra-dtest/pull/45/files I will go over it and 
add my comments. [~spo...@gmail.com] please feel free to review as well.


was (Author: djoshi3):
Thank you for the patch, [~ptbannister]. I have imported your patch into a git 
branch to make it easier to review. I have opened this PR for review: 
https://github.com/apache/cassandra-dtest/pull/45/files I will go over it and 
add my comments. [~spo...@gmail.com] please feel free to review as well.

> cqlshlib tests broken on b.a.o
> --
>
> Key: CASSANDRA-14298
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14298
> Project: Cassandra
>  Issue Type: Bug
>  Components: Build, Legacy/Testing
>Reporter: Stefan Podkowinski
>Assignee: Patrick Bannister
>Priority: Major
>  Labels: cqlsh, dtest, pull-request-available
> Attachments: CASSANDRA-14298.txt, cqlsh_tests_notes.md
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> It appears that cqlsh-tests on builds.apache.org on all branches stopped 
> working since we removed nosetests from the system environment. See e.g. 
> [here|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-trunk-cqlsh-tests/458/cython=no,jdk=JDK%201.8%20(latest),label=cassandra/console].
>  Looks like we either have to make nosetests available again or migrate to 
> pytest as we did with dtests. Giving pytest a quick try resulted in many 
> errors locally, but I haven't inspected them in detail yet. 






[jira] [Updated] (CASSANDRA-14298) cqlshlib tests broken on b.a.o

2019-02-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated CASSANDRA-14298:
---
Labels: cqlsh dtest pull-request-available  (was: cqlsh dtest)

> cqlshlib tests broken on b.a.o
> --
>
> Key: CASSANDRA-14298
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14298
> Project: Cassandra
>  Issue Type: Bug
>  Components: Build, Legacy/Testing
>Reporter: Stefan Podkowinski
>Assignee: Patrick Bannister
>Priority: Major
>  Labels: cqlsh, dtest, pull-request-available
> Attachments: CASSANDRA-14298.txt, cqlsh_tests_notes.md
>
>
> It appears that cqlsh-tests on builds.apache.org on all branches stopped 
> working since we removed nosetests from the system environment. See e.g. 
> [here|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-trunk-cqlsh-tests/458/cython=no,jdk=JDK%201.8%20(latest),label=cassandra/console].
>  Looks like we either have to make nosetests available again or migrate to 
> pytest as we did with dtests. Giving pytest a quick try resulted in many 
> errors locally, but I haven't inspected them in detail yet. 






[jira] [Commented] (CASSANDRA-15030) Add support for SSL and bindable address to sidecar

2019-02-21 Thread Dinesh Joshi (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16774831#comment-16774831
 ] 

Dinesh Joshi commented on CASSANDRA-15030:
--

After an offline conversation with [~vinaykumarcse], I have added the ability 
to specify a truststore as well, since it may be useful in cases where you'd 
like to restrict the CA roots.

> Add support for SSL and bindable address to sidecar
> ---
>
> Key: CASSANDRA-15030
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15030
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Sidecar
>Reporter: Dinesh Joshi
>Assignee: Dinesh Joshi
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We need to support SSL for the sidecar's REST interface. We should also have 
> the ability to bind the sidecar's API to a specific network interface. This 
> patch adds support for both.






[jira] [Comment Edited] (CASSANDRA-15030) Add support for SSL and bindable address to sidecar

2019-02-21 Thread Dinesh Joshi (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16774778#comment-16774778
 ] 

Dinesh Joshi edited comment on CASSANDRA-15030 at 2/22/19 5:29 AM:
---

Thanks for the comments [~vinaykumarcse]. We can specify custom truststores as 
a JVM arg. Besides, this is only useful if we expect the client to present an 
SSL certificate signed by a non-standard CA root, which is out of scope here.

[~cnlwsu], For tests it would be better to leave the mock CA Root specification 
at the JVM level. It would be cumbersome and error prone to have everyone 
specify the same root all over the place.


was (Author: djoshi3):
Thanks for the comments [~vinaykumarcse]. We can specify custom truststores as 
a JVM arg. If you feel strongly about it, I can add it.

[~cnlwsu], For tests it would be better to leave the mock CA Root specification 
at the JVM level. It would be cumbersome and error prone to have everyone 
specify the same root all over the place.

> Add support for SSL and bindable address to sidecar
> ---
>
> Key: CASSANDRA-15030
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15030
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Sidecar
>Reporter: Dinesh Joshi
>Assignee: Dinesh Joshi
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We need to support SSL for the sidecar's REST interface. We should also have 
> the ability to bind the sidecar's API to a specific network interface. This 
> patch adds support for both.






[jira] [Commented] (CASSANDRA-15030) Add support for SSL and bindable address to sidecar

2019-02-21 Thread Dinesh Joshi (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16774778#comment-16774778
 ] 

Dinesh Joshi commented on CASSANDRA-15030:
--

Thanks for the comments [~vinaykumarcse]. We can specify custom truststores as 
a JVM arg. If you feel strongly about it, I can add it.

[~cnlwsu], For tests it would be better to leave the mock CA Root specification 
at the JVM level. It would be cumbersome and error prone to have everyone 
specify the same root all over the place.

> Add support for SSL and bindable address to sidecar
> ---
>
> Key: CASSANDRA-15030
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15030
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Sidecar
>Reporter: Dinesh Joshi
>Assignee: Dinesh Joshi
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We need to support SSL for the sidecar's REST interface. We should also have 
> the ability to bind the sidecar's API to a specific network interface. This 
> patch adds support for both.






[jira] [Commented] (CASSANDRA-15030) Add support for SSL and bindable address to sidecar

2019-02-21 Thread Chris Lohfink (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16774724#comment-16774724
 ] 

Chris Lohfink commented on CASSANDRA-15030:
---

* The Configuration constructor is getting unwieldy; can you add a fluent 
Builder inner class for constructing it?
* In the tests, instead of globally setting the CA path with system properties 
in the gradle build script, you can set the cert path for the WebClient with 
its WebClientOptions, i.e.:

{code}
WebClientOptions clientOpts = new WebClientOptions()
    .setSsl(config.isSslEnabled())
    .setTrustStoreOptions(new JksOptions()
        .setPath(config.getKeyStorePath())
        .setPassword(config.getKeystorePassword()));
WebClient client = WebClient.create(vertx, clientOpts);
{code}

That would make it possible to add tests for invalid or missing certs in the 
future.

* NP: add a newline at the end of the config
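
For illustration, the fluent Builder suggested in the first point could look 
roughly like the sketch below. The field names, defaults, and the keystore 
check are assumptions made for the example; they are not taken from the 
sidecar's actual Configuration class.

```java
// Hypothetical sketch: fields and defaults are illustrative assumptions,
// not the sidecar's actual Configuration class.
final class Configuration
{
    private final String host;
    private final int port;
    private final boolean sslEnabled;
    private final String keyStorePath;
    private final String trustStorePath;

    // Only the Builder can create instances, so every instance is validated.
    private Configuration(Builder builder)
    {
        this.host = builder.host;
        this.port = builder.port;
        this.sslEnabled = builder.sslEnabled;
        this.keyStorePath = builder.keyStorePath;
        this.trustStorePath = builder.trustStorePath;
    }

    String getHost() { return host; }
    int getPort() { return port; }
    boolean isSslEnabled() { return sslEnabled; }
    String getKeyStorePath() { return keyStorePath; }
    String getTrustStorePath() { return trustStorePath; }

    static final class Builder
    {
        // Assumed defaults, chosen only for this example.
        private String host = "0.0.0.0";
        private int port = 8080;
        private boolean sslEnabled = false;
        private String keyStorePath;
        private String trustStorePath;

        // Each setter returns the Builder so calls can be chained.
        Builder setHost(String host) { this.host = host; return this; }
        Builder setPort(int port) { this.port = port; return this; }
        Builder setSslEnabled(boolean sslEnabled) { this.sslEnabled = sslEnabled; return this; }
        Builder setKeyStorePath(String path) { this.keyStorePath = path; return this; }
        Builder setTrustStorePath(String path) { this.trustStorePath = path; return this; }

        Configuration build()
        {
            // Fail at construction time rather than when the server starts.
            if (sslEnabled && keyStorePath == null)
                throw new IllegalStateException("SSL is enabled but no keystore path was set");
            return new Configuration(this);
        }
    }
}
```

A caller would then write something like 
new Configuration.Builder().setHost("127.0.0.1").setSslEnabled(true).setKeyStorePath(path).build(), 
naming only the options it cares about instead of passing every constructor 
argument positionally.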

> Add support for SSL and bindable address to sidecar
> ---
>
> Key: CASSANDRA-15030
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15030
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Sidecar
>Reporter: Dinesh Joshi
>Assignee: Dinesh Joshi
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We need to support SSL for the sidecar's REST interface. We should also have 
> the ability to bind the sidecar's API to a specific network interface. This 
> patch adds support for both.






[jira] [Updated] (CASSANDRA-15028) Autogenerate api docs in sidecar

2019-02-21 Thread Chris Lohfink (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Lohfink updated CASSANDRA-15028:
--
Resolution: Fixed
Status: Resolved  (was: Ready to Commit)

> Autogenerate api docs in sidecar
> 
>
> Key: CASSANDRA-15028
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15028
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Sidecar
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Minor
>  Labels: pull-request-available
> Attachments: screenshot-1.png, screenshot-2.png
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Swagger can provide us a way to define the API, autogenerate nice-looking 
> docs, and have an interactive UI for trying the APIs. This can also be used 
> to autogenerate client libraries in many different languages.






[cassandra-sidecar] branch master updated: Autogenerate API docs for sidecar Patch by Chris Lohfink, reviewed by Dinesh Joshi for CASSANDRA-15028

2019-02-21 Thread clohfink
This is an automated email from the ASF dual-hosted git repository.

clohfink pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/cassandra-sidecar.git


The following commit(s) were added to refs/heads/master by this push:
 new 5712fb4  Autogenerate API docs for sidecar Patch by Chris Lohfink, 
reviewed by Dinesh Joshi for CASSANDRA-15028
5712fb4 is described below

commit 5712fb486b8e49b9b136d28fe2b9915e1d081689
Author: Chris Lohfink 
AuthorDate: Thu Feb 21 20:14:52 2019 -0600

Autogenerate API docs for sidecar
Patch by Chris Lohfink, reviewed by Dinesh Joshi for CASSANDRA-15028
---
 api.yaml   | 57 ++
 build.gradle   | 33 ++---
 .../org/apache/cassandra/sidecar/MainModule.java   |  9 
 3 files changed, 92 insertions(+), 7 deletions(-)

diff --git a/api.yaml b/api.yaml
new file mode 100644
index 000..a49e6a8
--- /dev/null
+++ b/api.yaml
@@ -0,0 +1,57 @@
+openapi: 3.0.0
+
+info:
+  description: Apache Cassandra sidecar
+  version: "1.0.0"
+  title: Apache Cassandra Sidecar API
+  license:
+    name: Apache 2.0
+    url: 'http://www.apache.org/licenses/LICENSE-2.0.html'
+
+tags:
+  - name: visibility
+    description: See status of Cassandra
+  - name: management
+    description: Execute, coordinate, or schedule operations
+
+paths:
+  /api/v1/__health:
+    get:
+      tags:
+        - visibility
+      summary: Check Cassandra Health
+      operationId: health
+      description: |
+        Lists status of Cassandra Daemon and its services
+      responses:
+        '200':
+          description: Current status if Cassandra is up and returning OK status
+          content:
+            application/json:
+              schema:
+                type: object
+                items:
+                  $ref: '#/components/schemas/HealthStatus'
+        '503':
+          description: Health check failed and returning NOT_OK
+          content:
+            application/json:
+              schema:
+                type: object
+                items:
+                  $ref: '#/components/schemas/HealthStatus'
+
+components:
+  schemas:
+    HealthStatus:
+      type: object
+      required:
+        - status
+      properties:
+        status:
+          type: string
+          enum:
+            - 'OK'
+            - 'NOT_OK'
+          description: if reads are able to run through binary interface. 'OK' or 'NOT_OK'
+          example: 'OK'
diff --git a/build.gradle b/build.gradle
index b7659a8..5d4c771 100644
--- a/build.gradle
+++ b/build.gradle
@@ -1,10 +1,13 @@
+plugins {
+    id 'java'
+    id 'application'
+    id 'idea'
+    id 'org.hidetake.swagger.generator' version '2.16.0'
+}
+
 group 'org.apache.cassandra'
 version '1.0-SNAPSHOT'
 
-apply plugin: 'java'
-apply plugin: 'application'
-apply plugin: 'idea'
-
 sourceCompatibility = 1.8
 
 repositories {
@@ -41,7 +44,7 @@ sourceSets {
     // This is needed as gradle considers `src/main/resources` as the default resources folder
     main {
         resources {
-            srcDirs = ['conf', 'setup']
+            srcDirs = ['conf', 'setup', 'src/main/resources']
         }
     }
     test {
@@ -70,7 +73,9 @@ dependencies {
 
     runtime group: 'commons-beanutils', name: 'commons-beanutils', version: '1.9.3'
     runtime group: 'org.yaml', name: 'snakeyaml', version: '1.23'
+
     jolokia 'org.jolokia:jolokia-jvm:1.6.0:agent'
+    swaggerUI 'org.webjars:swagger-ui:3.10.0'
 
     testCompile group: 'org.cassandraunit', name: 'cassandra-unit-shaded', version: '3.3.0.2'
     testCompile 'com.datastax.cassandra:cassandra-driver-core:3.6+:tests'
@@ -79,6 +84,19 @@ dependencies {
     testCompile group: 'io.vertx', name: 'vertx-junit5', version: '3.6.3'
 }
 
+swaggerSources {
+    apidoc {
+        inputFile = file('api.yaml')
+        reDoc {
+            outputDir = file('src/main/resources/docs')
+            title = 'Cassandra Sidecar API Documentation'
+        }
+        ui {
+            outputDir = file('src/main/resources/docs/swagger')
+        }
+    }
+}
+
 task copyCodeStyle(type: Copy) {
     from "ide/idea/codeStyleSettings.xml"
     into ".idea"
@@ -104,6 +122,8 @@ clean {
     delete "$projectDir/lib"
     println "Deleting agents $projectDir/src/dist/agents"
     delete "$projectDir/src/dist/agents"
+    println "Deleting generated docs $projectDir/src/main/resources/docs"
+    delete "$projectDir/src/main/resources/docs"
 
 }
 
@@ -113,5 +133,4 @@ test {
 
 // copyDist gets called on every build
 copyDist.dependsOn installDist
-build.dependsOn copyDist
-build.dependsOn copyJolokia
+build.dependsOn copyDist, generateReDoc, generateSwaggerUI, copyJolokia
diff --git a/src/main/java/org/apache/cassandra/sidecar/MainModule.java 
b/src/main/java/org/apache/cassandra/sidecar/MainModule.java
index a6950a7..59113f2 100644
--- a/src/main/java/org/apache/cassandra/sidecar/MainModule.java
++

[jira] [Commented] (CASSANDRA-15028) Autogenerate api docs in sidecar

2019-02-21 Thread Chris Lohfink (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16774701#comment-16774701
 ] 

Chris Lohfink commented on CASSANDRA-15028:
---

thanks, committed as 5712fb486b8e49b9b136d28fe2b9915e1d081689

> Autogenerate api docs in sidecar
> 
>
> Key: CASSANDRA-15028
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15028
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Sidecar
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Minor
>  Labels: pull-request-available
> Attachments: screenshot-1.png, screenshot-2.png
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Swagger can provide us a way to define the API, autogenerate nice-looking 
> docs, and have an interactive UI for trying the APIs. This can also be used 
> to autogenerate client libraries in many different languages.






[jira] [Commented] (CASSANDRA-15030) Add support for SSL and bindable address to sidecar

2019-02-21 Thread Vinay Chella (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16774650#comment-16774650
 ] 

Vinay Chella commented on CASSANDRA-15030:
--

First pass review:

AbstractHealthServiceTest:
* Unused references
* Unused {{Router router = injector.getInstance(Router.class);}}
* Avoid {{sout}}, use of loggers might be a good idea here

TestModule:
* {{bind(CassandraSidecarDaemon.class).in(Singleton.class)}}, you can simplify 
this by using class level scope @Singleton

MainModule:
* Should we also add truststore context 
[here|https://github.com/dineshjoshi/cassandra-sidecar/commit/d9cdb088f2efdb8e537d35f3f9c492e51f55c3d1#diff-a54ca631e55a83c55242baa44ed6e271R42]?
 I believe this 
[path|https://github.com/dineshjoshi/cassandra-sidecar/commit/d9cdb088f2efdb8e537d35f3f9c492e51f55c3d1#diff-a54ca631e55a83c55242baa44ed6e271R64]
 can be either truststore or keystore here?

Also, you might want to run code style formatting on this changeset.


> Add support for SSL and bindable address to sidecar
> ---
>
> Key: CASSANDRA-15030
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15030
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Sidecar
>Reporter: Dinesh Joshi
>Assignee: Dinesh Joshi
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We need to support SSL for the sidecar's REST interface. We should also have 
> the ability to bind the sidecar's API to a specific network interface. This 
> patch adds support for both.






[jira] [Comment Edited] (CASSANDRA-14482) ZSTD Compressor support in Cassandra

2019-02-21 Thread Dinesh Joshi (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16774646#comment-16774646
 ] 

Dinesh Joshi edited comment on CASSANDRA-14482 at 2/22/19 1:23 AM:
---

Thanks, [~benedict] for the insightful comment. I patched the {{zstd-jni}} to 
add the ability to enable checksumming on the methods that [~bdeggleston] 
suggested. It was accepted upstream and is now available starting with 
{{zstd-jni-1.3.8-5}}. I have pulled it in and enabled it. I think that resolves 
Blake's concerns regarding GC and we get checksumming as well.


was (Author: djoshi3):
Thanks, [~benedict] for the insightful comment. I patched the {{zstd-jni}} to 
add the ability to enable compression on the methods that [~bdeggleston] 
suggested. It was accepted upstream and is now available starting with 
{{zstd-jni-1.3.8-5}}. I have pulled it in and enabled it. I think that resolves 
Blake's concerns regarding GC and we get checksumming as well.

> ZSTD Compressor support in Cassandra
> 
>
> Key: CASSANDRA-14482
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14482
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Dependencies, Feature/Compression
>Reporter: Sushma A Devendrappa
>Assignee: Dinesh Joshi
>Priority: Major
>  Labels: performance, pull-request-available
> Fix For: 4.x
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> ZStandard has a great speed and compression ratio tradeoff. 
> ZStandard is open source compression from Facebook.
> More about ZSTD
> [https://github.com/facebook/zstd]
> https://code.facebook.com/posts/1658392934479273/smaller-and-faster-data-compression-with-zstandard/
>  






[jira] [Commented] (CASSANDRA-14482) ZSTD Compressor support in Cassandra

2019-02-21 Thread Dinesh Joshi (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16774646#comment-16774646
 ] 

Dinesh Joshi commented on CASSANDRA-14482:
--

Thanks, [~benedict] for the insightful comment. I patched the {{zstd-jni}} to 
add the ability to enable compression on the methods that [~bdeggleston] 
suggested. It was accepted upstream and is now available starting with 
{{zstd-jni-1.3.8-5}}. I have pulled it in and enabled it. I think that resolves 
Blake's concerns regarding GC and we get checksumming as well.

> ZSTD Compressor support in Cassandra
> 
>
> Key: CASSANDRA-14482
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14482
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Dependencies, Feature/Compression
>Reporter: Sushma A Devendrappa
>Assignee: Dinesh Joshi
>Priority: Major
>  Labels: performance, pull-request-available
> Fix For: 4.x
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> ZStandard has a great speed and compression ratio tradeoff. 
> ZStandard is open source compression from Facebook.
> More about ZSTD
> [https://github.com/facebook/zstd]
> https://code.facebook.com/posts/1658392934479273/smaller-and-faster-data-compression-with-zstandard/
>  






[jira] [Updated] (CASSANDRA-15031) Add integration tests task to sidecar

2019-02-21 Thread Vinay Chella (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinay Chella updated CASSANDRA-15031:
-
Component/s: Sidecar

> Add integration tests task to sidecar
> -
>
> Key: CASSANDRA-15031
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15031
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Sidecar
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Major
>
> Add support for running longer-running tasks, and include some tests for the 
> existing health service






[jira] [Updated] (CASSANDRA-15032) Adjust transient replication keyspace replication options to be more friendly to naive clients

2019-02-21 Thread Andy Tolbert (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Tolbert updated CASSANDRA-15032:
-
Description: 
To specify the number of transient replicas, replication options are specified 
like:

{code}
ALTER KEYSPACE foo WITH REPLICATION = {'class' : 'NetworkTopologyStrategy', 
'DC1' : '3/1'}; 
ALTER KEYSPACE foo WITH REPLICATION = {'class' : 'SimpleStrategy', 
'replication_factor' : '3/1'}
{code}

It occurred to me that existing client drivers that parse keyspace options may 
not handle this gracefully.

For example, the DataStax Java driver tries to parse {{3/1}} as a number and 
fails.  In this case, the parsing error is not fatal; it's just that the 
metadata for that keyspace in the driver is incomplete, and things like 
token-aware routing can't be utilized.

It is possible that other libraries do not handle this well either.

As an alternative, I propose adding a separate option like: 
{{'transient_replicas': 1}}.  {{replication_factor}} would represent the total 
number of replicas (full and transient) in this case. Something similar could 
be done for the NTS case, but might be slightly clumsy to express.

This would allow existing client libraries to continue working, and while 
things like routing may be suboptimal (i.e. driver won't know to differentiate 
between replicas and transient replicas), at least parsing won't fail in 
possibly fatal ways.
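
To make the failure mode concrete, here is a minimal sketch of why a naive 
numeric parse breaks on {{3/1}} and what a tolerant client-side parser could 
look like. The ReplicationFactor class is hypothetical and is not taken from 
any driver; it also assumes that the first number is the total replica count 
and the second is the number of transient replicas.

```java
// Hypothetical helper, not part of any existing driver.
// Assumption: in "3/1" the first number is the total replica count and
// the second is how many of those replicas are transient.
final class ReplicationFactor
{
    final int total;             // full + transient replicas
    final int transientReplicas; // 0 when no "/n" suffix is present

    private ReplicationFactor(int total, int transientReplicas)
    {
        this.total = total;
        this.transientReplicas = transientReplicas;
    }

    int fullReplicas()
    {
        return total - transientReplicas;
    }

    // Accepts both the plain form ("3") and the transient form ("3/1").
    static ReplicationFactor parse(String value)
    {
        String[] parts = value.trim().split("/", -1);
        if (parts.length > 2)
            throw new IllegalArgumentException("Unexpected replication factor: " + value);

        // Integer.parseInt throws NumberFormatException on malformed parts.
        int total = Integer.parseInt(parts[0].trim());
        int trans = parts.length == 2 ? Integer.parseInt(parts[1].trim()) : 0;
        if (total < 0 || trans < 0 || trans > total)
            throw new IllegalArgumentException("Invalid replication factor: " + value);

        return new ReplicationFactor(total, trans);
    }
}
```

A client that calls Integer.parseInt("3/1") directly gets a 
NumberFormatException, which is the behavior described above. With a separate 
{{'transient_replicas'}} option, {{replication_factor}} would stay a plain 
integer and such clients would keep working unchanged.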



  was:
To specify the number of transient replicas, replication options are specified 
like:

{code}
ALTER KEYSPACE foo WITH REPLICATION = {'class' : 'NetworkTopologyStrategy', 
'DC1' : '3/1'}; ALTER KEYSPACE foo WITH REPLICATION = {'class' : 
'SimpleStrategy', 'replication_factor' : '3/1'}
{code}

It occurred to me that existing client drivers that parse keyspace options may 
not handle this gracefully.

For example, the DataStax Java driver tries to parse {{3/1}} as a number and 
fails.  In this case, the parsing error is not fatal; it's just that the 
metadata for that keyspace in the driver is incomplete, and things like 
token-aware routing can't be utilized.

It is possible that other libraries do not handle this well either.

As an alternative, I propose adding a separate option like: 
{{'transient_replicas': 1}}.  {{replication_factor}} would represent the total 
number of replicas (full and transient) in this case.

This would allow existing client libraries to continue working, and while 
things like routing may be suboptimal (i.e. driver won't know to differentiate 
between replicas and transient replicas), at least parsing won't fail in 
possibly fatal ways.




> Adjust transient replication keyspace replication options to be more friendly 
> to naive clients
> --
>
> Key: CASSANDRA-15032
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15032
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Transient Replication
>Reporter: Andy Tolbert
>Priority: Major
>
> To specify the number of transient replicas, replication options are 
> specified like:
> {code}
> ALTER KEYSPACE foo WITH REPLICATION = {'class' : 'NetworkTopologyStrategy', 
> 'DC1' : '3/1'}; 
> ALTER KEYSPACE foo WITH REPLICATION = {'class' : 'SimpleStrategy', 
> 'replication_factor' : '3/1'}
> {code}
> It occurred to me that existing client drivers that parse keyspace options 
> may not handle this gracefully.
> For example, the DataStax Java driver tries to parse {{3/1}} as a number and 
> fails.  In this case, the parsing error is not fatal; it's just that the 
> metadata for that keyspace in the driver is incomplete, and things like 
> token-aware routing can't be utilized.
> It is possible that other libraries do not handle this well either.
> As an alternative, I propose adding a separate option like: 
> {{'transient_replicas': 1}}.  {{replication_factor}} would represent the 
> total number of replicas (full and transient) in this case. Something similar 
> could be done for the NTS case, but might be slightly clumsy to express.
> This would allow existing client libraries to continue working, and while 
> things like routing may be suboptimal (i.e. driver won't know to 
> differentiate between replicas and transient replicas), at least parsing 
> won't fail in possibly fatal ways.






[jira] [Created] (CASSANDRA-15032) Adjust transient replication keyspace replication options to be more friendly to naive clients

2019-02-21 Thread Andy Tolbert (JIRA)
Andy Tolbert created CASSANDRA-15032:


 Summary: Adjust transient replication keyspace replication options 
to be more friendly to naive clients
 Key: CASSANDRA-15032
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15032
 Project: Cassandra
  Issue Type: Improvement
  Components: Feature/Transient Replication
Reporter: Andy Tolbert


To specify the number of transient replicas, replication options are specified 
like:

{code}
ALTER KEYSPACE foo WITH REPLICATION = {'class' : 'NetworkTopologyStrategy', 
'DC1' : '3/1'}; ALTER KEYSPACE foo WITH REPLICATION = {'class' : 
'SimpleStrategy', 'replication_factor' : '3/1'}
{code}

It occurred to me that existing client drivers that parse keyspace options may 
not handle this gracefully.

For example, the DataStax Java driver tries to parse {{3/1}} as a number and 
fails.  In this case, the parsing error is not fatal; it's just that the 
metadata for that keyspace in the driver is incomplete, and things like 
token-aware routing can't be utilized.

It is possible that other libraries do not handle this well either.

As an alternative, I propose adding a separate option like: 
{{'transient_replicas': 1}}.  {{replication_factor}} would represent the total 
number of replicas (full and transient) in this case.

This would allow existing client libraries to continue working, and while 
things like routing may be suboptimal (i.e. driver won't know to differentiate 
between replicas and transient replicas), at least parsing won't fail in 
possibly fatal ways.
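
To illustrate the failure mode, here is a minimal Python sketch. It is not the DataStax driver's actual code; the function {{parse_rf_naive}} and the option dictionaries are hypothetical stand-ins for a client that parses {{replication_factor}} as a plain integer.

```python
def parse_rf_naive(options):
    """Parse replication_factor the way a naive client might: as a plain int."""
    return int(options["replication_factor"])


# Current syntax: total and transient counts packed into one string.
current = {"class": "SimpleStrategy", "replication_factor": "3/1"}

# Proposed syntax: the total stays numeric, the transient count is separate.
proposed = {"class": "SimpleStrategy", "replication_factor": "3",
            "transient_replicas": "1"}

try:
    parse_rf_naive(current)
    current_parses = True
except ValueError:
    current_parses = False  # int("3/1") raises ValueError

# A naive client still gets a usable total from the proposed layout.
total = parse_rf_naive(proposed)
# A client that knows about the new option can derive the full replica count.
transient = int(proposed.get("transient_replicas", 0))
full = total - transient
```

A client unaware of {{transient_replicas}} would simply treat all 3 replicas as full replicas, which is suboptimal for routing but does not break parsing.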








[jira] [Created] (CASSANDRA-15031) Add integration tests task to sidecar

2019-02-21 Thread Chris Lohfink (JIRA)
Chris Lohfink created CASSANDRA-15031:
-

 Summary: Add integration tests task to sidecar
 Key: CASSANDRA-15031
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15031
 Project: Cassandra
  Issue Type: Improvement
Reporter: Chris Lohfink
Assignee: Chris Lohfink


Add support for running longer-running tasks, and include some tests for the 
existing health service.






[jira] [Commented] (CASSANDRA-15027) Handle IR prepare phase failures less race prone by waiting for all results

2019-02-21 Thread Blake Eggleston (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16774515#comment-16774515
 ] 

Blake Eggleston commented on CASSANDRA-15027:
-

Thanks [~spo...@gmail.com]. I’ve extended your code so that in addition to 
waiting for other anti-compactions to complete, the coordinator also 
proactively cancels ongoing anti-compactions on the other participants. This 
avoids wasting time waiting for anti-compactions on other machines. The code 
does 3 things:
 * Adds a session state check to the {{isStopRequested}} method in the 
anti-compaction iterator.
 * The coordinator now sends failure messages to all participants when it 
receives a failure message from one of them in the prepare phase. It does not 
mark these participants as having failed internally though, since that would 
cause the nodetool session to immediately complete. Instead, it waits until 
it’s received messages from all the other nodes.
 * The participants will now respond with a failed prepare message if the 
anti-compaction completes, but the session was failed in the meantime. This 
prevents a deadlock on the coordinator in the case where the participant 
received a failure message between the time the anti-compaction completes and 
the callback fires.
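
The coordinator behaviour described in the second bullet can be sketched roughly as follows. This is a toy Python model under my own assumptions, not the patch itself: the class {{PrepareCoordinator}}, its fields, and the {{send_failure}} callback are hypothetical names.

```python
class PrepareCoordinator:
    """Toy model: report a prepare failure only after every participant answered."""

    def __init__(self, participants, send_failure):
        self.pending = set(participants)   # participants we still wait on
        self.failed = False                # did any participant report failure?
        self.cancel_sent = False           # failure messages are sent only once
        self.send_failure = send_failure   # callback: notify one participant
        self.result = None                 # stays None until all replies are in

    def handle_prepare_response(self, participant, success):
        if participant not in self.pending:
            return  # duplicate or unknown response; ignore it
        self.pending.discard(participant)
        if not success:
            self.failed = True
            if not self.cancel_sent:
                # Ask the remaining participants to stop their anti-compactions,
                # but keep waiting for their replies instead of failing fast.
                self.cancel_sent = True
                for other in sorted(self.pending):
                    self.send_failure(other)
        if not self.pending:
            self.result = "failed" if self.failed else "prepared"


sent = []
coord = PrepareCoordinator(["n1", "n2", "n3"], sent.append)
coord.handle_prepare_response("n1", success=False)
state_after_first = coord.result   # still undecided: n2 and n3 have not replied
coord.handle_prepare_response("n2", success=False)
coord.handle_prepare_response("n3", success=False)
```

The point of the design is that the session result is only set once {{pending}} is empty, so a new repair cannot start while anti-compactions from the failed session may still be running.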

Let me know what you think. If everything looks ok to you, I’m +1 on committing.

[trunk|https://github.com/bdeggleston/cassandra/tree/15027-trunk]
 
[circle|https://circleci.com/gh/bdeggleston/workflows/cassandra/tree/15027-trunk]

> Handle IR prepare phase failures less race prone by waiting for all results
> ---
>
> Key: CASSANDRA-15027
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15027
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair, Local/Compaction
>Reporter: Stefan Podkowinski
>Assignee: Stefan Podkowinski
>Priority: Major
> Fix For: 4.x
>
>
> Handling incremental repairs as a coordinator begins by sending a 
> {{PrepareConsistentRequest}} message to all participants, which may also 
> include the coordinator itself. Participants will run anti-compactions upon 
> receiving such a message and report the result of the operation back to the 
> coordinator.
> Once we receive a failure response from any of the participants, we fail-fast 
> in {{CoordinatorSession.handlePrepareResponse()}}, which will in turn 
> complete the {{prepareFuture}} that {{RepairRunnable}} is blocking on. Then 
> the repair command will terminate with an error status, as expected.
> The issue is that when a node is both coordinator and participant, 
> we may end up with a local session and submitted anti-compactions, which will 
> be executed without any coordination with the coordinator session (on the same 
> node). This may result in situations where running repair commands one right 
> after another causes overlapping execution of anti-compactions, which will 
> cause the following (misleading) message to show up in the logs and will 
> cause the repair to fail again:
>  "Prepare phase for incremental repair session %s has failed because it 
> encountered intersecting sstables belonging to another incremental repair 
> session (%s). This is caused by starting an incremental repair session before a 
> previous one has completed. Check nodetool repair_admin for hung sessions and 
> fix them."






[jira] [Commented] (CASSANDRA-14939) fix some operational holes in incremental repair

2019-02-21 Thread Marcus Eriksson (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16774116#comment-16774116
 ] 

Marcus Eriksson commented on CASSANDRA-14939:
-

In general, this looks very nice and will be a huge help.

General stuff
 * We keep the LocalSessions around for 1 day in system.repairs - I guess it is 
possible to give incomplete information for summarizeRepaired after bounce for 
example? (I have no real solution other than keeping them around for a longer 
time)

ColumnFamilyStore;
 * test(s) for getPendingRepairStats
 * in getPendingRepairStats there is a potential NPE - the pending repair 
status can be removed between the isPendingRepair check and getting the 
pendingRepair UUID from the sstable
 * when we use {{runWithCompactionsDisabled}} we could potentially pass in the 
ranges we need to cancel compactions for (totally ok leaving this for a future 
ticket though)
 * potential NPE - {{sst.isPendingRepair}} call first, then 
{{sst.getPendingRepair}} in the {{runWithCompactionsDisabled}} call
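
The two potential NPEs above share one pattern: a check ({{isPendingRepair}}) followed by a separate read ({{getPendingRepair}}), with the value possibly cleared in between. A minimal Python sketch of the usual fix, reading the value once and testing the local copy, is below. The class {{SSTable}} and the function {{count_by_session}} are hypothetical stand-ins, not Cassandra code.

```python
class SSTable:
    """Stand-in for an sstable whose pending-repair id may be cleared concurrently."""

    def __init__(self, pending_repair=None):
        self._pending_repair = pending_repair

    def is_pending_repair(self):
        # The racy pattern calls this first and get_pending_repair() afterwards.
        return self._pending_repair is not None

    def get_pending_repair(self):
        return self._pending_repair


def count_by_session(sstables):
    """Count sstables per pending repair session, reading the id exactly once."""
    counts = {}
    for sst in sstables:
        session_id = sst.get_pending_repair()  # single read, no check-then-get
        if session_id is None:
            continue  # moved out of pending since it was listed; skip it
        counts[session_id] = counts.get(session_id, 0) + 1
    return counts
```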

CompactionStrategyManager;
 * in {{releaseRepairedData}} - we should probably make sure all the callables 
are cancelled if we catch that exception there, otherwise we might keep the 
sstables marked as compacting forever

PendingRepairManager;
 * I think strategies can be removed even though we have the read lock - 
{{Set sstables = get(sessionID).getSSTables();}} could NPE in 
that case

Range;
 * new {{Range.intersects(..)}} method should probably have a test
 * same with new {{subtract}} method - both get tested indirectly, but might be 
good with a few direct ones

LocalSessions;
 * in {{getPendingStats}} sessionIDs set is not used
 * in {{getPendingStats}} when checking {{if (!Iterables.any(ranges, r -> 
r.intersects(session.ranges)))}} session could be null - there is a null check 
right below which we could move up.

RepairedState;
 * This class could use more comments

PendingStat;
 * In {{addSSTable}}, instead of checking 
{{Preconditions.checkArgument(sessionID != null);}} we should probably just 
skip the sstable as it means it has been moved out of pending

PendingStats;
 * seems to be a mismatch in the columns in {{to/fromComposite}}

SchemaArgsParser;
 * Untested

I changed {{RepairAdmin}} nodetool command to use subcommands to reduce some of 
the manual parameter verification;
 [https://github.com/krummas/cassandra/commits/blake/14939-trunk-nodetool]

> fix some operational holes in incremental repair
> 
>
> Key: CASSANDRA-14939
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14939
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Major
> Fix For: 4.0
>
>
> Incremental repair has a few operational rough spots that make it more 
> difficult to fully automate and operate at scale than it should be.
> * Visibility into whether pending repair data exists for a given token range.
> * Ability to force promotion/demotion of data for completed sessions instead 
> of waiting for compaction.
> * Get the most recent repairedAt timestamp for a given token range.






[jira] [Commented] (CASSANDRA-10190) Python 3 support for cqlsh

2019-02-21 Thread aman raparia (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-10190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16774015#comment-16774015
 ] 

aman raparia commented on CASSANDRA-10190:
--

Patrick, can you please post the patch link for that? Also, has the new cqlsh 
code for Python 3 been pushed to the GitHub repo?

> Python 3 support for cqlsh
> --
>
> Key: CASSANDRA-10190
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10190
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/Tools
>Reporter: Andrew Pennebaker
>Priority: Major
>  Labels: cqlsh
> Attachments: coverage_notes.txt
>
>
> Users who operate in a Python 3 environment may have trouble launching cqlsh. 
> Could we please update cqlsh's syntax to run in Python 3?
> As a workaround, users can set up pyenv, and cd to a directory with a 
> .python-version containing "2.7". But it would be nice if cqlsh supported 
> modern Python versions out of the box.






[jira] [Commented] (CASSANDRA-10190) Python 3 support for cqlsh

2019-02-21 Thread Patrick Bannister (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-10190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16774002#comment-16774002
 ] 

Patrick Bannister commented on CASSANDRA-10190:
---

We've resumed CASSANDRA-14298 in the past week; hopefully we'll get that
done soon, and that would clear the way for this ticket.

I had a patch that got cqlsh working on Python 3 and 2.7 in October. The
saferscanner problems are solvable.

Patrick Bannister




> Python 3 support for cqlsh
> --
>
> Key: CASSANDRA-10190
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10190
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/Tools
>Reporter: Andrew Pennebaker
>Priority: Major
>  Labels: cqlsh
> Attachments: coverage_notes.txt
>
>
> Users who operate in a Python 3 environment may have trouble launching cqlsh. 
> Could we please update cqlsh's syntax to run in Python 3?
> As a workaround, users can set up pyenv, and cd to a directory with a 
> .python-version containing "2.7". But it would be nice if cqlsh supported 
> modern Python versions out of the box.






[jira] [Comment Edited] (CASSANDRA-14482) ZSTD Compressor support in Cassandra

2019-02-21 Thread Benedict (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773915#comment-16773915
 ] 

Benedict edited comment on CASSANDRA-14482 at 2/21/19 10:15 AM:


Going over the data twice is unlikely to incur much greater penalty than going 
over it once and doing both things.  In fact, if the two behaviours are 
designed to behave optimally with the CPU pipeline (which compression and 
checksumming algorithms each certainly are) then mixing the two simultaneously 
would very likely be slower than running each independently.

Looking at the ZStd code, it looks like it does the sensible thing and executes 
the checksum independently.  It appears to checksum the input stream rather 
than the output, though, which is odd given that the latter should be smaller 
(and modulo any bugs in the compressor, should be just as good).  

The only possible advantage ZStd could probably have over us would be to perform 
the checksum incrementally on, say, pages of data it is also compressing so 
that it is guaranteed to be in L1, and to guarantee no TLB misses.  However, it 
doesn't *seem* to do this - it seems to assume you provide the data in 
reasonable chunks.  Anyway, there should be no TLB misses on the size of data 
we're operating over when visiting it twice, and the data should be in L3 at 
worst, and prefetched to L2/L1.  We could also probably do this ourselves, by 
providing only page-sized frames to compress and performing the checksum 
incrementally, though this would mean tighter integration with the C API, and 
is unlikely to be worth the effort.

I have, though, made some assumptions about the ZStd code on reading it, as I 
didn't make time to fully read the codebase.
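
As a rough illustration of "compress, then checksum the (smaller) output in a second pass", here is a minimal Python sketch. It uses {{zlib}} as a stand-in for Zstd and CRC32 as a stand-in for the hash; the frame layout (compressed bytes followed by a 4-byte big-endian checksum) and both function names are my own assumptions.

```python
import zlib


def compress_with_checksum(data: bytes) -> bytes:
    """Compress, then append a CRC32 of the compressed bytes (second pass)."""
    compressed = zlib.compress(data)
    crc = zlib.crc32(compressed) & 0xFFFFFFFF
    return compressed + crc.to_bytes(4, "big")


def decompress_verified(frame: bytes) -> bytes:
    """Verify the trailing CRC32 before handing the bytes to the decompressor."""
    compressed, stored = frame[:-4], int.from_bytes(frame[-4:], "big")
    if zlib.crc32(compressed) & 0xFFFFFFFF != stored:
        raise ValueError("checksum mismatch")
    return zlib.decompress(compressed)
```

Checksumming the output means corruption is caught before the decompressor ever sees the bytes, and the second pass runs over the smaller buffer.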


was (Author: benedict):
Going over the data twice is unlikely to incur much greater penalty than going 
over it once and doing both things.  In fact, if the two behaviours are 
designed to behave optimally with the CPU pipeline (which compression and 
checksumming algorithms each certainly are) then mixing the two simultaneously 
would very likely be slower than running each independently.

Looking at the ZStd code, it looks like it does the sensible thing and executes 
the checksum independently.  It appears to checksum the input stream rather 
than the output, though, which is odd given that the latter should be smaller 
(and modulo any bugs in the compressor, should be just as good).  

The only possible advantage ZStd could probably have over us would be to perform 
the checksum incrementally on, say, pages of data it is also compressing so 
that it is guaranteed to be in L1, and to guarantee no TLB misses.  However, it 
doesn't *seem* to do this - it seems to assume you provide the data in 
reasonable chunks.  Anyway, there should be no TLB misses on the size of data 
we're operating over when visiting it twice, and the data should be in L3 at 
worst, and prefetched to L2.  We could also probably do this ourselves, by 
providing only page-sized frames to compress and performing the checksum 
incrementally, though this would mean tighter integration with the C API, and 
is unlikely to be worth the effort.

I have, though, made some assumptions about the ZStd code on reading it, as I 
didn't make time to fully read the codebase.

> ZSTD Compressor support in Cassandra
> 
>
> Key: CASSANDRA-14482
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14482
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Dependencies, Feature/Compression
>Reporter: Sushma A Devendrappa
>Assignee: Dinesh Joshi
>Priority: Major
>  Labels: performance, pull-request-available
> Fix For: 4.x
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> ZStandard has a great speed and compression ratio tradeoff. 
> ZStandard is open source compression from Facebook.
> More about ZSTD
> [https://github.com/facebook/zstd]
> https://code.facebook.com/posts/1658392934479273/smaller-and-faster-data-compression-with-zstandard/
>  






[jira] [Comment Edited] (CASSANDRA-14482) ZSTD Compressor support in Cassandra

2019-02-21 Thread Benedict (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773915#comment-16773915
 ] 

Benedict edited comment on CASSANDRA-14482 at 2/21/19 10:13 AM:


Going over the data twice is unlikely to incur much greater penalty than going 
over it once and doing both things.  In fact, if the two behaviours are 
designed to behave optimally with the CPU pipeline (which compression and 
checksumming algorithms each certainly are) then mixing the two simultaneously 
would very likely be slower than running each independently.

Looking at the ZStd code, it looks like it does the sensible thing and executes 
the checksum independently.  It appears to checksum the input stream rather 
than the output, though, which is odd given that the latter should be smaller 
(and modulo any bugs in the compressor, should be just as good).  

The only possible advantage ZStd could probably have over us would be to perform 
the checksum incrementally on, say, pages of data it is also compressing so 
that it is guaranteed to be in L1, and to guarantee no TLB misses.  However, it 
doesn't *seem* to do this - it seems to assume you provide the data in 
reasonable chunks.  Anyway, there should be no TLB misses on the size of data 
we're operating over when visiting it twice, and the data should be in L3 at 
worst, and prefetched to L2.  We could also probably do this ourselves, by 
providing only page-sized frames to compress and performing the checksum 
incrementally, though this would mean tighter integration with the C API, and 
is unlikely to be worth the effort.

I have, though, made some assumptions about the ZStd code on reading it, as I 
didn't make time to fully read the codebase.


was (Author: benedict):
Going over the data twice is unlikely to incur much greater penalty than going 
over it once and doing both things.  In fact, if the two behaviours are 
designed to behave optimally with the CPU pipeline (which compression and 
checksumming algorithms each certainly are) then mixing the two simultaneously 
would almost certainly be slower than running each independently.

Looking at the ZStd code, it looks like it does the sensible thing and executes 
the checksum independently.  It appears to checksum the input stream rather 
than the output, though, which is odd given that the latter should be smaller 
(and modulo any bugs in the compressor, should be just as good).  

The only possible advantage ZStd could probably have over us would be to perform 
the checksum incrementally on, say, pages of data it is also compressing so 
that it is guaranteed to be in L1, and to guarantee no TLB misses.  However, it 
doesn't *seem* to do this - it seems to assume you provide the data in 
reasonable chunks.  Anyway, there should be no TLB misses on the size of data 
we're operating over when visiting it twice, and the data should be in L3 at 
worst, and prefetched to L2.  We could also probably do this ourselves, by 
providing only page-sized frames to compress and performing the checksum 
incrementally, though this would mean tighter integration with the C API, and 
is unlikely to be worth the effort.

I have, though, made some assumptions about the ZStd code on reading it, as I 
didn't make time to fully read the codebase.

> ZSTD Compressor support in Cassandra
> 
>
> Key: CASSANDRA-14482
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14482
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Dependencies, Feature/Compression
>Reporter: Sushma A Devendrappa
>Assignee: Dinesh Joshi
>Priority: Major
>  Labels: performance, pull-request-available
> Fix For: 4.x
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> ZStandard has a great speed and compression ratio tradeoff. 
> ZStandard is open source compression from Facebook.
> More about ZSTD
> [https://github.com/facebook/zstd]
> https://code.facebook.com/posts/1658392934479273/smaller-and-faster-data-compression-with-zstandard/
>  






[jira] [Commented] (CASSANDRA-14482) ZSTD Compressor support in Cassandra

2019-02-21 Thread Benedict (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773915#comment-16773915
 ] 

Benedict commented on CASSANDRA-14482:
--

Going over the data twice is unlikely to incur much greater penalty than going 
over it once and doing both things.  In fact, if the two behaviours are 
designed to behave optimally with the CPU pipeline (which compression and 
checksumming algorithms each certainly are) then mixing the two simultaneously 
would almost certainly be slower than running each independently.

Looking at the ZStd code, it looks like it does the sensible thing and executes 
the checksum independently.  It appears to checksum the input stream rather 
than the output, though, which is odd given that the latter should be smaller 
(and modulo any bugs in the compressor, should be just as good).  

The only possible advantage ZStd could probably have over us would be to perform 
the checksum incrementally on, say, pages of data it is also compressing so 
that it is guaranteed to be in L1, and to guarantee no TLB misses.  However, it 
doesn't *seem* to do this - it seems to assume you provide the data in 
reasonable chunks.  Anyway, there should be no TLB misses on the size of data 
we're operating over when visiting it twice, and the data should be in L3 at 
worst, and prefetched to L2.  We could also probably do this ourselves, by 
providing only page-sized frames to compress and performing the checksum 
incrementally, though this would mean tighter integration with the C API, and 
is unlikely to be worth the effort.

I have, though, made some assumptions about the ZStd code on reading it, as I 
didn't make time to fully read the codebase.

> ZSTD Compressor support in Cassandra
> 
>
> Key: CASSANDRA-14482
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14482
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Dependencies, Feature/Compression
>Reporter: Sushma A Devendrappa
>Assignee: Dinesh Joshi
>Priority: Major
>  Labels: performance, pull-request-available
> Fix For: 4.x
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> ZStandard has a great speed and compression ratio tradeoff. 
> ZStandard is open source compression from Facebook.
> More about ZSTD
> [https://github.com/facebook/zstd]
> https://code.facebook.com/posts/1658392934479273/smaller-and-faster-data-compression-with-zstandard/
>  






[jira] [Commented] (CASSANDRA-14482) ZSTD Compressor support in Cassandra

2019-02-21 Thread Dinesh Joshi (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773883#comment-16773883
 ] 

Dinesh Joshi commented on CASSANDRA-14482:
--

Adding our own checksumming incurs additional overhead not just in terms of the 
additional CPU that we would use going over the data twice (once for 
compression inside Zstd and then once in the compressor to compute the hash) 
but also additional code and maintaining that code. From my digging around in 
the code it seems [they're clobbering 
parameters|https://github.com/facebook/zstd/issues/1534] which should be an 
easy fix.

> ZSTD Compressor support in Cassandra
> 
>
> Key: CASSANDRA-14482
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14482
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Dependencies, Feature/Compression
>Reporter: Sushma A Devendrappa
>Assignee: Dinesh Joshi
>Priority: Major
>  Labels: performance, pull-request-available
> Fix For: 4.x
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> ZStandard has a great speed and compression ratio tradeoff. 
> ZStandard is open source compression from Facebook.
> More about ZSTD
> [https://github.com/facebook/zstd]
> https://code.facebook.com/posts/1658392934479273/smaller-and-faster-data-compression-with-zstandard/
>  






[jira] [Commented] (CASSANDRA-14482) ZSTD Compressor support in Cassandra

2019-02-21 Thread Benedict (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773871#comment-16773871
 ] 

Benedict commented on CASSANDRA-14482:
--

I don't see any issue with doing our own checksumming?  It's not like they'll 
be doing anything magical.  In fact, according to 
[this|https://facebook.github.io/zstd/zstd_manual.html] they're just using the 
lower 32 bits of xxhash64, which we already utilise elsewhere I think?  There's 
also a good case to be made for permitting the checksum to be configurable, so 
that the user can decide on their preferred level of protection, both in 
algorithm and number of bits.
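
A configurable checksum along those lines might look like the Python sketch below. Everything here is an assumption for illustration: the {{ALGORITHMS}} registry, the {{checksum}} function and its parameters are hypothetical, and since xxhash64 is not in the Python standard library, a 64-bit BLAKE2b digest is swapped in as a stand-in for it.

```python
import hashlib
import zlib

# Hypothetical registry of checksum functions returning unsigned integers.
ALGORITHMS = {
    "crc32": lambda data: zlib.crc32(data) & 0xFFFFFFFF,
    "adler32": lambda data: zlib.adler32(data) & 0xFFFFFFFF,
    # 64-bit BLAKE2b digest, used here only as a stand-in for xxhash64.
    "blake2b64": lambda data: int.from_bytes(
        hashlib.blake2b(data, digest_size=8).digest(), "big"),
}


def checksum(data: bytes, algorithm: str = "blake2b64", bits: int = 32) -> int:
    """Return the lower `bits` bits of the chosen checksum."""
    if not 1 <= bits <= 64:
        raise ValueError("bits must be between 1 and 64")
    return ALGORITHMS[algorithm](data) & ((1 << bits) - 1)
```

With the defaults this mirrors the "lower 32 bits of a 64-bit hash" scheme; a user wanting more protection could ask for all 64 bits, and one wanting less overhead could pick a cheaper algorithm.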

> ZSTD Compressor support in Cassandra
> 
>
> Key: CASSANDRA-14482
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14482
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Dependencies, Feature/Compression
>Reporter: Sushma A Devendrappa
>Assignee: Dinesh Joshi
>Priority: Major
>  Labels: performance, pull-request-available
> Fix For: 4.x
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> ZStandard has a great speed and compression ratio tradeoff. 
> ZStandard is open source compression from Facebook.
> More about ZSTD
> [https://github.com/facebook/zstd]
> https://code.facebook.com/posts/1658392934479273/smaller-and-faster-data-compression-with-zstandard/
>  






[jira] [Commented] (CASSANDRA-15015) Cassandra metrics documentation is not correct for Hint_delays metric

2019-02-21 Thread Dinesh Joshi (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773825#comment-16773825
 ] 

Dinesh Joshi commented on CASSANDRA-15015:
--

Thanks, Anup. At first glance this looks ok. Let me double check though.

> Cassandra metrics documentation is not correct for Hint_delays metric
> -
>
> Key: CASSANDRA-15015
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15015
> Project: Cassandra
>  Issue Type: Bug
>  Components: Documentation/Blog
>Reporter: Anup Shirolkar
>Assignee: Anup Shirolkar
>Priority: Minor
>  Labels: documentation, easyfix, low-hanging-fruit
> Fix For: 3.0.19, 3.11.5, 4.0
>
> Attachments: 150150-trunk.txt
>
>
> The Cassandra metrics for hint delays are not referred to correctly on the 
> documentation web page: 
> [http://cassandra.apache.org/doc/latest/operating/metrics.html#hintsservice-metrics]
>  
> The metrics are defined in the 
> [code|https://github.com/apache/cassandra/blob/06209037ea56b5a2a49615a99f1542d6ea1b2947/src/java/org/apache/cassandra/metrics/HintsServiceMetrics.java#L45-L52]
>  as 'Hint_delays' and 'Hint_delays-' but those are listed on the 
> website as 'Hints_delays' and 'Hints_delays-'.
> The documentation should be fixed by removing the extra 's' in Hints to match 
> it with code.
> The Jira for adding hint_delays: CASSANDRA-13234 






[jira] [Comment Edited] (CASSANDRA-14482) ZSTD Compressor support in Cassandra

2019-02-21 Thread Dinesh Joshi (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773821#comment-16773821
 ] 

Dinesh Joshi edited comment on CASSANDRA-14482 at 2/21/19 8:40 AM:
---

[~bdeggleston] thanks for the review. The reason I used the streams was that 
{{Zstd}} does not enable setting the checksumming flag via the 
{{ZStd::compress}} JNI static helper. I confirmed this with the JNI author and 
it goes deeper than just the JNI bindings. However using the compression stream 
causes GC.

So here are the options we have right now -

# Move forward with checksumming & accept the GC overhead
# Move forward withOUT checksumming
# Allow user to turn on/off checksumming using a compression preference 
parameter (turning on will incur GC, turning off won't)
# Add our own checksumming (ugly, burns additional CPU and still generates some 
garbage)
# Work with Zstd & Zstd JNI to enable passing in flags such as checksumming flag

I personally think in the near term we should pick one of options 1-3, move 
forward, and open a follow-up ticket to address the GC issue. I am opposed to doing our 
own checksumming especially because Zstd already supports it and it is just a 
matter of plumbing and adding the appropriate APIs to make it happen in a 
performant manner for JNI. If anybody has any other ideas, I am all ears.

[~aweisberg] [~iamaleksey] [~benedict] [~jjirsa] please feel free to chime in.

I am already discussing this issue in the Zstd community and have a working 
prototype of what we need but I think it is incomplete. I have reached out to 
[~dikanggu] to help surface it with the Zstd team as well.


was (Author: djoshi3):
[~bdeggleston] thanks for the review. The reason I used the streams was that 
{{Zstd}} does not enable setting the checksumming flag via the 
{{ZStd::compress}} JNI static helper. I confirmed this with the JNI author and 
it goes deeper than just the JNI bindings. However using the compression stream 
causes GC.

So here are the options we have right now -

# Move forward with checksumming & accept the GC overhead
# Move forward withOUT checksumming
# Allow user to turn on/off checksumming using a compression preference 
parameter (turning on will incur GC, turning off won't)
# Add our own checksumming (ugly, burns additional CPU but still generates some 
garbage)
# Work with Zstd & Zstd JNI to enable passing in flags such as checksumming flag

I personally think in the near term we should pick one of options 1-3, move 
forward, and open a follow-up ticket to address the GC issue. I am opposed to doing our 
own checksumming especially because Zstd already supports it and it is just a 
matter of plumbing and adding the appropriate APIs to make it happen in a 
performant manner for JNI. If anybody has any other ideas, I am all ears.

[~aweisberg] [~iamaleksey] [~benedict] [~jjirsa] please feel free to chime in.

I am already discussing this issue in the Zstd community and have a working 
prototype of what we need but I think it is incomplete. I have reached out to 
[~dikanggu] to help surface it with the Zstd team as well.

> ZSTD Compressor support in Cassandra
> 
>
> Key: CASSANDRA-14482
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14482
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Dependencies, Feature/Compression
>Reporter: Sushma A Devendrappa
>Assignee: Dinesh Joshi
>Priority: Major
>  Labels: performance, pull-request-available
> Fix For: 4.x
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> ZStandard has a great speed and compression ratio tradeoff. 
> ZStandard is open source compression from Facebook.
> More about ZSTD
> [https://github.com/facebook/zstd]
> https://code.facebook.com/posts/1658392934479273/smaller-and-faster-data-compression-with-zstandard/
>  






[jira] [Comment Edited] (CASSANDRA-14482) ZSTD Compressor support in Cassandra

2019-02-21 Thread Dinesh Joshi (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773821#comment-16773821
 ] 

Dinesh Joshi edited comment on CASSANDRA-14482 at 2/21/19 8:40 AM:
---

[~bdeggleston] thanks for the review. The reason I used the streams was that 
{{Zstd}} does not enable setting the checksumming flag via the 
{{ZStd::compress}} JNI static helper. I confirmed this with the JNI author and 
it goes deeper than just the JNI bindings. However, using the compression stream 
generates garbage and adds GC pressure.

So here are the options we have right now -

# Move forward with checksumming & accept the GC overhead
# Move forward withOUT checksumming
# Allow the user to turn checksumming on/off using a compression preference 
parameter (turning it on will incur GC, turning it off won't)
# Add our own checksumming (ugly, burns additional CPU and still generates some 
garbage)
# Work with Zstd & Zstd JNI to enable passing in flags such as checksumming 
flag (no GC overhead)

I personally think in the near term we should pick one of options 1-3, move 
forward, and open a follow-on ticket to address the GC issue. I am opposed to 
doing our own checksumming, especially because Zstd already supports it and it 
is just a matter of plumbing and adding the appropriate APIs to make it happen 
in a performant manner for JNI. If anybody has any other ideas, I am all ears.
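
For anyone weighing options 3 and 4, here is a rough sketch of what "our own 
checksumming" behind an on/off parameter could look like. It is only an 
illustration under stated assumptions: the class name {{ChecksummedCodec}} and 
its methods are hypothetical and not part of the patch, the JDK's 
{{Deflater}}/{{Inflater}} stand in for Zstd so that it runs without the JNI 
library, and the 4-byte CRC32 trailer is an invented layout, not the Zstd frame 
checksum.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.zip.CRC32;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper, not Cassandra code. Deflater/Inflater are stand-ins
// for Zstd so the sketch runs with the JDK alone.
final class ChecksummedCodec {
    private static final int TRAILER_BYTES = 4; // a CRC32 value fits in 32 bits

    // Compresses src. When checksum is true, appends the CRC32 of the
    // uncompressed bytes as a 4-byte big-endian trailer (assumed layout).
    static byte[] compress(byte[] src, boolean checksum) {
        Deflater deflater = new Deflater();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            deflater.setInput(src);
            deflater.finish();
            byte[] buf = new byte[4096];
            while (!deflater.finished()) {
                int n = deflater.deflate(buf);
                out.write(buf, 0, n);
            }
        } finally {
            deflater.end();
        }
        if (checksum) {
            CRC32 crc = new CRC32();
            crc.update(src, 0, src.length);
            byte[] trailer = ByteBuffer.allocate(TRAILER_BYTES)
                                       .putInt((int) crc.getValue())
                                       .array();
            out.write(trailer, 0, TRAILER_BYTES);
        }
        return out.toByteArray();
    }

    // Decompresses a buffer produced by compress() with the same checksum
    // setting, verifying the trailer when checksum is true.
    static byte[] decompress(byte[] framed, boolean checksum) throws DataFormatException {
        int payloadLen = checksum ? framed.length - TRAILER_BYTES : framed.length;
        if (payloadLen < 0) {
            throw new DataFormatException("frame too short");
        }
        Inflater inflater = new Inflater();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            inflater.setInput(framed, 0, payloadLen);
            byte[] buf = new byte[4096];
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new DataFormatException("truncated or corrupt payload");
                }
                out.write(buf, 0, n);
            }
        } finally {
            inflater.end();
        }
        byte[] result = out.toByteArray();
        if (checksum) {
            CRC32 crc = new CRC32();
            crc.update(result, 0, result.length);
            int stored = ByteBuffer.wrap(framed, payloadLen, TRAILER_BYTES).getInt();
            if (stored != (int) crc.getValue()) {
                throw new DataFormatException("checksum mismatch");
            }
        }
        return result;
    }
}
```

Calling {{ChecksummedCodec.compress(data, true)}} and then 
{{ChecksummedCodec.decompress(framed, true)}} round-trips the data, and a flipped 
trailer byte is rejected. Note that the sketch allocates intermediate buffers 
and makes an extra CRC pass over the data, which is the extra CPU and garbage 
that option 4 would cost.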

[~aweisberg] [~iamaleksey] [~benedict] [~jjirsa] please feel free to chime in.

I am already discussing this issue in the Zstd community and have a working 
prototype of what we need but I think it is incomplete. I have reached out to 
[~dikanggu] to help surface it with the Zstd team as well.









[jira] [Comment Edited] (CASSANDRA-14482) ZSTD Compressor support in Cassandra

2019-02-21 Thread Dinesh Joshi (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773821#comment-16773821
 ] 

Dinesh Joshi edited comment on CASSANDRA-14482 at 2/21/19 8:38 AM:
---

[~bdeggleston] thanks for the review. The reason I used the streams was that 
{{Zstd}} does not enable setting the checksumming flag via the 
{{ZStd::compress}} JNI static helper. I confirmed this with the JNI author and 
it goes deeper than just the JNI bindings. However, using the compression stream 
generates garbage and adds GC pressure.

So here are the options we have right now -

# Move forward with checksumming & accept the GC overhead
# Move forward withOUT checksumming
# Allow the user to turn checksumming on/off using a compression preference 
parameter (turning it on will incur GC, turning it off won't)
# Add our own checksumming (ugly, burns additional CPU but still generates some 
garbage)
# Work with Zstd & Zstd JNI to enable passing in flags such as checksumming flag

I personally think in the near term we should pick one of options 1-3, move 
forward, and open a follow-on ticket to address the GC issue. I am opposed to 
doing our own checksumming, especially because Zstd already supports it and it 
is just a matter of plumbing and adding the appropriate APIs to make it happen 
in a performant manner for JNI. If anybody has any other ideas, I am all ears.

[~aweisberg] [~iamaleksey] [~benedict] [~jjirsa] please feel free to chime in.

I am already discussing this issue in the Zstd community and have a working 
prototype of what we need but I think it is incomplete. I have reached out to 
[~dikanggu] to help surface it with the Zstd team as well.









[jira] [Commented] (CASSANDRA-14482) ZSTD Compressor support in Cassandra

2019-02-21 Thread Dinesh Joshi (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773821#comment-16773821
 ] 

Dinesh Joshi commented on CASSANDRA-14482:
--

[~bdeggleston] thanks for the review. The reason I used the streams was that 
{{Zstd}} does not enable setting the checksumming flag via the `ZStd::compress` 
JNI static helper. I confirmed this with the JNI author and it goes deeper than 
just the JNI bindings. However, using the compression stream generates garbage 
and adds GC pressure.

So here are the options we have right now -

# Move forward with checksumming & accept the GC overhead
# Move forward withOUT checksumming
# Allow the user to turn checksumming on/off using a compression preference 
parameter (turning it on will incur GC, turning it off won't)
# Add our own checksumming (ugly, burns additional CPU but still generates some 
garbage)
# Work with Zstd & Zstd JNI to enable passing in flags such as checksumming flag

I personally think in the near term we should pick one of options 1-3, move 
forward, and open a follow-on ticket to address the GC issue. I am opposed to 
doing our own checksumming, especially because Zstd already supports it and it 
is just a matter of plumbing and adding the appropriate APIs to make it happen 
in a performant manner for JNI. If anybody has any other ideas, I am all ears.

[~aweisberg] [~iamaleksey] [~benedict] [~jjirsa] please feel free to chime in.

I am already discussing this issue in the Zstd community and have a working 
prototype of what we need but I think it is incomplete. I have reached out to 
[~dikanggu] to help surface it with the Zstd team as well.







[jira] [Commented] (CASSANDRA-10190) Python 3 support for cqlsh

2019-02-21 Thread aman raparia (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-10190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773785#comment-16773785
 ] 

aman raparia commented on CASSANDRA-10190:
--

Has the port of cqlsh to Python 3.7 been completed? There have been no updates 
on this issue since August 2018.

I am also trying to make cqlsh compatible with Python 3.7, but I am stuck on an 
issue in the saferscanner file of cqlshlib.

 

> Python 3 support for cqlsh
> --
>
> Key: CASSANDRA-10190
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10190
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/Tools
>  Reporter: Andrew Pennebaker
>  Priority: Major
>  Labels: cqlsh
> Attachments: coverage_notes.txt
>
>
> Users who operate in a Python 3 environment may have trouble launching cqlsh. 
> Could we please update cqlsh's syntax to run in Python 3?
> As a workaround, users can set up pyenv, and cd to a directory with a 
> .python-version containing "2.7". But it would be nice if cqlsh supported 
> modern Python versions out of the box.


