[jira] [Commented] (CASSANDRA-15297) nodetool can not create snapshot with snapshot name that have special character

2021-12-10 Thread Saranya Krishnakumar (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17457492#comment-17457492
 ] 

Saranya Krishnakumar commented on CASSANDRA-15297:
--

[~blerer] I see this PR is still open, can I take it up?

> nodetool can not create snapshot with snapshot name that have special 
> character
> ---
>
> Key: CASSANDRA-15297
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15297
> Project: Cassandra
> Issue Type: Bug
> Components: Tool/nodetool
> Reporter: maxwellguo
> Assignee: maxwellguo
> Priority: Normal
> Labels: pull-request-available
> Fix For: 4.x
>
> Attachments: after-fix.jpg, image.png, listsnapshots-p-s.jpg, 
> snapshot-listsnapshot-.jpg, snapshot-p-s.jpg
>
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> We create a snapshot with "nodetool snapshot -t snapshotname". When 
> snapshotname contains special characters such as "/", the snapshot command 
> succeeds, but the result is not what was requested, as can be seen by 
> checking the data file directory or running "nodetool listsnapshots".
> Here are some cases:
> 1. nodetool snapshot -t "p/s"
> listsnapshots returns snapshot "p" for all tables, although the actual 
> snapshot name is "p/s"; the data directory also has the layout:
> datapath/snapshots/p/s/snapshot-datafile-link
>   !snapshot-p-s.jpg! 
>  !listsnapshots-p-s.jpg! 
> 2. nodetool snapshot -t "/"
> listsnapshots returns "there is not snapshot", but the snapshot command 
> returns successfully and the data directory has the layout:
> datapath/snapshots/snapshot-datafile-link
>  !snapshot-listsnapshot-.jpg! 
> The attachments show the results in our environment.
> We therefore suggest that the snapshot name should not contain special 
> characters: throw an exception and tell the user not to use special 
> characters.
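
The suggestion above, rejecting such names before anything is written to disk, could look roughly like the following. This is only a minimal sketch and not the patch attached to this ticket: the class SnapshotNameValidator, its validate method, and the decision to reject "/", "\" and the platform file separator are all assumptions made for illustration.

```java
import java.io.File;

// Hypothetical helper for illustration; not part of Cassandra.
final class SnapshotNameValidator
{
    private SnapshotNameValidator() {}

    // Returns the name unchanged if it is usable as a single directory name,
    // otherwise throws so the caller can report the problem to the user.
    static String validate(String name)
    {
        if (name == null || name.isEmpty())
            throw new IllegalArgumentException("snapshot name must not be empty");
        if (name.contains("/") || name.contains("\\") || name.contains(File.separator))
            throw new IllegalArgumentException("snapshot name must not contain a path separator: " + name);
        return name;
    }

    public static void main(String[] args)
    {
        System.out.println("accepted: " + validate("before-upgrade"));
        // The two names from the report are both rejected.
        for (String candidate : new String[] { "p/s", "/" })
        {
            try
            {
                validate(candidate);
            }
            catch (IllegalArgumentException e)
            {
                System.out.println("rejected: " + e.getMessage());
            }
        }
    }
}
```

Failing early like this keeps the name reported by listsnapshots and the directory created under snapshots/ in agreement, which is the mismatch described in cases 1 and 2.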



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-15297) nodetool can not create snapshot with snapshot name that have special character

2021-12-10 Thread Saranya Krishnakumar (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17457492#comment-17457492
 ] 

Saranya Krishnakumar edited comment on CASSANDRA-15297 at 12/11/21, 12:52 AM:
--

[~blerer] I see this PR is still open, can I take it up and complete it?


was (Author: saranya_k):
[~blerer] I see this PR is still open, can I take it up?




[jira] [Comment Edited] (CASSANDRA-17189) Guardrail for page size

2021-12-10 Thread Bartlomiej (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17457359#comment-17457359
 ] 

Bartlomiej edited comment on CASSANDRA-17189 at 12/10/21, 11:27 PM:


{code:java}
Why would it crash?{code}
That was because, in the forPaging method, state was null:
{code:java}
public DataLimits forPaging(int pageSize)
{
    // state is passed as null here
    Guardrails.pageSize.guard(pageSize, "?", null);
    return new CQLLimits(pageSize, perPartitionLimit, isDistinct);
}{code}
so the guard was also applied to superusers:
{code:java}
(state == null || state.isOrdinaryUser()); {code}
This is no longer the case, because in execute I am able to get the state, so 
superusers are not aborted :)
{code:java}
By the way, that SelectStatement#columnFamily method uses the old terminology 
for tables, which were originally called column families. Maybe we can rename 
the method to table(). {code}
As you suggested, I renamed two methods to table() :)

I also had to move the guard call from *public execute()* to *private execute()* 
because otherwise *executeInternal()* would be left without a guard check, or 
the guard call would have to be duplicated. Now, after reading the code, I 
understand that all Select statements will be handled by that guard. Please 
tell me what you think about my patch; if it is, let's say, OK, I will write 
tests :)

Thanks !
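
To make the null-state problem concrete, here is a small self-contained sketch of an applicability check with the same shape as the condition quoted in the comment. It is illustrative only: ClientState below is a stand-in interface with a single method, and PageSizeGuardSketch and appliesTo are hypothetical names, not the actual Cassandra classes.

```java
// Hypothetical stand-in types for illustration; not the real Cassandra classes.
final class PageSizeGuardSketch
{
    // Minimal stand-in for the client state: only the method used by the check.
    interface ClientState
    {
        boolean isOrdinaryUser();
    }

    // Same shape as the quoted condition: a missing state is treated like an ordinary user.
    static boolean appliesTo(ClientState state)
    {
        return state == null || state.isOrdinaryUser();
    }

    public static void main(String[] args)
    {
        ClientState superUser = () -> false;
        ClientState ordinaryUser = () -> true;

        System.out.println(appliesTo(null));         // true: without a state the guard cannot exempt anyone
        System.out.println(appliesTo(superUser));    // false: a known superuser is exempt
        System.out.println(appliesTo(ordinaryUser)); // true
    }
}
```

With a check of this shape, any call site that passes null guards every caller, superusers included, which matches the behaviour described in the comment.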


was (Author: bkowalczyyk):
{code:java}
Why would it crash?{code}
that was because, in forPaging method, state was null
{code:java}
public DataLimits forPaging(int pageSize)
{
Guardrails.pageSize.guard(pageSize, "?", null);
return new CQLLimits(pageSize, perPartitionLimit, isDistinct);
}{code}
so guard was also applied for super user
{code:java}
(state == null || state.isOrdinaryUser()); {code}
Now it is not the case anymore because in execute I am able to get state and 
super users are not aborded :)
{code:java}
By the way, that SelectStatement#columnFamily method uses the old terminology 
for tables, which were originally called column families. Maybe we can rename 
the method to table(). {code}
as you suggested I changed two methods to table() :)

I also had to move guard call from 
[https://github.com/apache/cassandra/blob/e99a8da161ed599c1a22a853c9c7f9caf6c1eb79/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java#L236|public
 execute()] to 
[https://github.com/apache/cassandra/blob/e99a8da161ed599c1a22a853c9c7f9caf6c1eb79/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java#L296|private
 execute()]  because otherwise 
[https://github.com/apache/cassandra/blob/e99a8da161ed599c1a22a853c9c7f9caf6c1eb79/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java#L443|executeInternal()]
 would stay without guard check or this guard call would have be duplicated. 
Now, after reading the code, I understand that all Select statements will be 
handled by that guard. Please say what you think about my patch, if it is "lets 
say ok" I will write tests :)

Thanks !

> Guardrail for page size
> ---
>
> Key: CASSANDRA-17189
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17189
> Project: Cassandra
> Issue Type: New Feature
> Components: Feature/Guardrails
> Reporter: Andres de la Peña
> Assignee: Bartlomiej
> Priority: Normal
> Labels: AdventCalendar2021, lhf
> Fix For: 4.1
>
> Attachments: CASSANDRA-17189-trunk.diff
>
>
> Add guardrail limiting the query page size, for example:
> {code}
> # Guardrail to warn about or reject page sizes greater than threshold.
> # The two thresholds default to -1 to disable.
> page_size:
>     warn_threshold: -1
>     abort_threshold: -1
> {code}
> Initially this can be based on the specified number of rows used as page 
> size, although it would be ideal to also limit the actual size in bytes of 
> the returned pages.
> +Additional information for newcomers:+
> # Add the configuration for the new guardrail on page size in the guardrails 
> section of cassandra.yaml.
> # Add a getPageSize method in GuardrailsConfig returning a Threshold.Config 
> object
> # Implement that method in GuardrailsOptions, which is the default yaml-based 
> implementation of GuardrailsConfig
> # Add a Threshold guardrail named pageSize in Guardrails, using the 
> previously created config
> # Define JMX-friendly getters and setters for the previously created config 
> in GuardrailsMBean
> # Implement the JMX-friendly getters and setters in Guardrails
> # Now that we have the guardrail ready, it’s time to use it. We should search 
> for a place to invoke the Guardrails.pageSize#guard method with the page size 
> that each query is going to use. The DataLimits#forPaging methods look like 
> good candidates for this.
> # Finally, add some tests for the new guardrail. Given that the new guardrail 
> is a Threshold, our new test should probably extend ThresholdTester.

[jira] [Comment Edited] (CASSANDRA-17189) Guardrail for page size

2021-12-10 Thread Bartlomiej (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17457359#comment-17457359
 ] 

Bartlomiej edited comment on CASSANDRA-17189 at 12/10/21, 11:22 PM:


{code:java}
Why would it crash?{code}
That was because, in the forPaging method, state was null:
{code:java}
public DataLimits forPaging(int pageSize)
{
    // state is passed as null here
    Guardrails.pageSize.guard(pageSize, "?", null);
    return new CQLLimits(pageSize, perPartitionLimit, isDistinct);
}{code}
so the guard was also applied to superusers:
{code:java}
(state == null || state.isOrdinaryUser()); {code}
This is no longer the case, because in execute I am able to get the state, so 
superusers are not aborted :)
{code:java}
By the way, that SelectStatement#columnFamily method uses the old terminology 
for tables, which were originally called column families. Maybe we can rename 
the method to table(). {code}
As you suggested, I renamed two methods to table() :)

I also had to move the guard call from 
[public execute()|https://github.com/apache/cassandra/blob/e99a8da161ed599c1a22a853c9c7f9caf6c1eb79/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java#L236] 
to 
[private execute()|https://github.com/apache/cassandra/blob/e99a8da161ed599c1a22a853c9c7f9caf6c1eb79/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java#L296] 
because otherwise 
[executeInternal()|https://github.com/apache/cassandra/blob/e99a8da161ed599c1a22a853c9c7f9caf6c1eb79/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java#L443] 
would be left without a guard check, or the guard call would have to be 
duplicated. Now, after reading the code, I understand that all Select 
statements will be handled by that guard. Please tell me what you think about 
my patch; if it is, let's say, OK, I will write tests :)

Thanks !


was (Author: bkowalczyyk):
{code:java}
Why would it crash?{code}
that was because, in forPaging method, state was null
{code:java}
public DataLimits forPaging(int pageSize)
{
Guardrails.pageSize.guard(pageSize, "?", null);
return new CQLLimits(pageSize, perPartitionLimit, isDistinct);
}{code}
so guard was also applied for super user
{code:java}
(state == null || state.isOrdinaryUser()); {code}
Now it is not the case anymore because in execute I am able to get state and 
super users are not aborded :)
{code:java}
By the way, that SelectStatement#columnFamily method uses the old terminology 
for tables, which were originally called column families. Maybe we can rename 
the method to table(). {code}
as you suggested I changed two methods to table() :)

I also had to move guard call from
{code:java}
public ResultMessage.Rows execute() {code}
to
{code:java}
private ResultMessage.Rows execute(){code}
because otherwise
{code:java}
public ResultMessage.Rows executeInternal() {code}
would stay without guard check or this guard call would have be duplicated. 
Now, after reading the code, I understand that all Select statements will be 
handled by that guard. Please say what you think about my patch, if it is "lets 
say ok" I will write tests :)

Thanks !


[jira] [Comment Edited] (CASSANDRA-17189) Guardrail for page size

2021-12-10 Thread Bartlomiej (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17457359#comment-17457359
 ] 

Bartlomiej edited comment on CASSANDRA-17189 at 12/10/21, 9:07 PM:
---

{code:java}
Why would it crash?{code}
That was because, in the forPaging method, state was null:
{code:java}
public DataLimits forPaging(int pageSize)
{
    // state is passed as null here
    Guardrails.pageSize.guard(pageSize, "?", null);
    return new CQLLimits(pageSize, perPartitionLimit, isDistinct);
}{code}
so the guard was also applied to superusers:
{code:java}
(state == null || state.isOrdinaryUser()); {code}
This is no longer the case, because in execute I am able to get the state, so 
superusers are not aborted :)
{code:java}
By the way, that SelectStatement#columnFamily method uses the old terminology 
for tables, which were originally called column families. Maybe we can rename 
the method to table(). {code}
As you suggested, I renamed two methods to table() :)

I also had to move the guard call from
{code:java}
public ResultMessage.Rows execute() {code}
to
{code:java}
private ResultMessage.Rows execute(){code}
because otherwise
{code:java}
public ResultMessage.Rows executeInternal() {code}
would be left without a guard check, or the guard call would have to be 
duplicated. Now, after reading the code, I understand that all Select 
statements will be handled by that guard. Please tell me what you think about 
my patch; if it is, let's say, OK, I will write tests :)

Thanks !


was (Author: bkowalczyyk):
{code:java}
Why would it crash?{code}
that was because, in forPaging method, state was null
{code:java}
public DataLimits forPaging(int pageSize)
{
Guardrails.pageSize.guard(pageSize, "?", null);
return new CQLLimits(pageSize, perPartitionLimit, isDistinct);
}{code}
so guard was also applied for super user
{code:java}
(state == null || state.isOrdinaryUser()); {code}
Now it is not the case anymore because in execute I am able to get state and 
super users are not aborded :)
{code:java}
By the way, that SelectStatement#columnFamily method uses the old terminology 
for tables, which were originally called column families. Maybe we can rename 
the method to table(). {code}
as you suggested I changed two methods to table() :)

I also had to move guard call from
{code:java}
public ResultMessage.Rows execute() {code}
to
{code:java}
private ResultMessage.Rows execute(){code}
because otherwise
{code:java}
public ResultMessage.Rows executeInternal() {code}
would stay without guard check or this guard call would have be duplicated. 
Now, after reading the code, I understand that all Select statements will be 
handled by that guard. Please check what you think about my change, if it is 
"lets say ok" I will write tests :)

Thanks !

> Guardrail for page size
> ---
>
> Key: CASSANDRA-17189
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17189
> Project: Cassandra
> Issue Type: New Feature
> Components: Feature/Guardrails
> Reporter: Andres de la Peña
> Assignee: Bartlomiej
> Priority: Normal
> Labels: AdventCalendar2021, lhf
> Fix For: 4.1
>
> Attachments: CASSANDRA-17189-trunk.diff
>
>
> Add guardrail limiting the query page size, for example:
> {code}
> # Guardrail to warn about or reject page sizes greater than threshold.
> # The two thresholds default to -1 to disable.
> page_size:
>     warn_threshold: -1
>     abort_threshold: -1
> {code}
> Initially this can be based on the specified number of rows used as page 
> size, although it would be ideal to also limit the actual size in bytes of 
> the returned pages.
> +Additional information for newcomers:+
> # Add the configuration for the new guardrail on page size in the guardrails 
> section of cassandra.yaml.
> # Add a getPageSize method in GuardrailsConfig returning a Threshold.Config 
> object
> # Implement that method in GuardrailsOptions, which is the default yaml-based 
> implementation of GuardrailsConfig
> # Add a Threshold guardrail named pageSize in Guardrails, using the 
> previously created config
> # Define JMX-friendly getters and setters for the previously created config 
> in GuardrailsMBean
> # Implement the JMX-friendly getters and setters in Guardrails
> # Now that we have the guardrail ready, it’s time to use it. We should search 
> for a place to invoke the Guardrails.pageSize#guard method with the page size 
> that each query is going to use. The DataLimits#forPaging methods look like 
> good candidates for this.
> # Finally, add some tests for the new guardrail. Given that the new guardrail 
> is a Threshold, our new test should probably extend ThresholdTester.
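
The warn/abort semantics described in the configuration fragment above can be sketched as follows. This is a simplified illustration under stated assumptions, not the Threshold class from the Cassandra code base: ThresholdSketch, Outcome and check are hypothetical names, and the sketch returns an enum where a real guardrail would emit a client warning or fail the query.

```java
// Hypothetical, simplified model of a warn/abort threshold; not the Cassandra Threshold class.
final class ThresholdSketch
{
    enum Outcome { OK, WARN, ABORT }

    private final long warnThreshold;
    private final long abortThreshold;

    ThresholdSketch(long warnThreshold, long abortThreshold)
    {
        this.warnThreshold = warnThreshold;
        this.abortThreshold = abortThreshold;
    }

    // A threshold of -1 is disabled, matching the defaults in the yaml fragment.
    private static boolean exceeds(long value, long threshold)
    {
        return threshold != -1 && value > threshold;
    }

    // The abort threshold is checked first so that it takes precedence over the warning.
    Outcome check(long value)
    {
        if (exceeds(value, abortThreshold))
            return Outcome.ABORT;
        if (exceeds(value, warnThreshold))
            return Outcome.WARN;
        return Outcome.OK;
    }

    public static void main(String[] args)
    {
        ThresholdSketch pageSize = new ThresholdSketch(1000, 5000);
        System.out.println(pageSize.check(500));   // OK
        System.out.println(pageSize.check(2000));  // WARN
        System.out.println(pageSize.check(10000)); // ABORT

        ThresholdSketch disabled = new ThresholdSketch(-1, -1);
        System.out.println(disabled.check(10000)); // OK
    }
}
```

The values 1000 and 5000 are arbitrary example thresholds chosen for the demonstration.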




[jira] [Comment Edited] (CASSANDRA-17189) Guardrail for page size

2021-12-10 Thread Bartlomiej (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17457359#comment-17457359
 ] 

Bartlomiej edited comment on CASSANDRA-17189 at 12/10/21, 9:06 PM:
---

{code:java}
Why would it crash?{code}
That was because, in the forPaging method, state was null:
{code:java}
public DataLimits forPaging(int pageSize)
{
    // state is passed as null here
    Guardrails.pageSize.guard(pageSize, "?", null);
    return new CQLLimits(pageSize, perPartitionLimit, isDistinct);
}{code}
so the guard was also applied to superusers:
{code:java}
(state == null || state.isOrdinaryUser()); {code}
This is no longer the case, because in execute I am able to get the state, so 
superusers are not aborted :)
{code:java}
By the way, that SelectStatement#columnFamily method uses the old terminology 
for tables, which were originally called column families. Maybe we can rename 
the method to table(). {code}
As you suggested, I renamed two methods to table() :)

I also had to move the guard call from
{code:java}
public ResultMessage.Rows execute() {code}
to
{code:java}
private ResultMessage.Rows execute(){code}
because otherwise
{code:java}
public ResultMessage.Rows executeInternal() {code}
would be left without a guard check, or the guard call would have to be 
duplicated. Now, after reading the code, I understand that all Select 
statements will be handled by that guard. Please let me know what you think 
about my change; if it is, let's say, OK, I will write tests :)

Thanks !


was (Author: bkowalczyyk):
{code:java}
Why would it crash?{code}
that was because, in forPaging method, state was null
{code:java}
public DataLimits forPaging(int pageSize)
{
Guardrails.pageSize.guard(pageSize, "?", null);
return new CQLLimits(pageSize, perPartitionLimit, isDistinct);
}{code}
so guard was also applied for super user
{code:java}
(state == null || state.isOrdinaryUser()); {code}
Now it is not the case anymore because in execute I am able to get state and 
super users are not aborded :)
{code:java}
By the way, that SelectStatement#columnFamily method uses the old terminology 
for tables, which were originally called column families. Maybe we can rename 
the method to table(). {code}
as you suggested I changed two method to table() :)

I also had to move guard call from
{code:java}
public ResultMessage.Rows execute() {code}
to
{code:java}
private ResultMessage.Rows execute(){code}
because otherwise
{code:java}
public ResultMessage.Rows executeInternal() {code}
would stay without guard check or this guard call would have be duplicated. 
Now, after reading the code, I understand that all Select statements will be 
handled by that guard. Please check what you think about my change, if it is 
"lets say ok" I will write tests :)

Thanks !


[jira] [Commented] (CASSANDRA-17189) Guardrail for page size

2021-12-10 Thread Bartlomiej (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17457359#comment-17457359
 ] 

Bartlomiej commented on CASSANDRA-17189:


{code:java}
Why would it crash?{code}
That was because, in the forPaging method, state was null:
{code:java}
public DataLimits forPaging(int pageSize)
{
    // state is passed as null here
    Guardrails.pageSize.guard(pageSize, "?", null);
    return new CQLLimits(pageSize, perPartitionLimit, isDistinct);
}{code}
so the guard was also applied to superusers:
{code:java}
(state == null || state.isOrdinaryUser()); {code}
This is no longer the case, because in execute I am able to get the state, so 
superusers are not aborted :)
{code:java}
By the way, that SelectStatement#columnFamily method uses the old terminology 
for tables, which were originally called column families. Maybe we can rename 
the method to table(). {code}
As you suggested, I renamed two methods to table() :)

I also had to move the guard call from
{code:java}
public ResultMessage.Rows execute() {code}
to
{code:java}
private ResultMessage.Rows execute(){code}
because otherwise
{code:java}
public ResultMessage.Rows executeInternal() {code}
would be left without a guard check, or the guard call would have to be 
duplicated. Now, after reading the code, I understand that all Select 
statements will be handled by that guard. Please let me know what you think 
about my change; if it is, let's say, OK, I will write tests :)

Thanks !
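
The reasoning about where to place the guard call can be illustrated with a toy class. This is a sketch that assumes, as the comment implies, that both public entry points funnel into one private method; SelectSketch, its fields, and the string results are hypothetical and do not mirror the real SelectStatement signatures.

```java
// Hypothetical toy model of two public entry points sharing one private path.
final class SelectSketch
{
    private final int maxPageSize;
    int guardCalls = 0; // counts guard invocations so the single call site is observable

    SelectSketch(int maxPageSize)
    {
        this.maxPageSize = maxPageSize;
    }

    // Client-facing entry point.
    public String execute(int pageSize)
    {
        return execute(pageSize, "client");
    }

    // Internal entry point; it would bypass a guard placed only in the public execute above.
    public String executeInternal(int pageSize)
    {
        return execute(pageSize, "internal");
    }

    // Shared path: guarding here covers both callers without duplicating the call.
    private String execute(int pageSize, String origin)
    {
        guard(pageSize);
        return origin + " query with page size " + pageSize;
    }

    private void guard(int pageSize)
    {
        guardCalls++;
        if (pageSize > maxPageSize)
            throw new IllegalArgumentException("page size " + pageSize + " exceeds " + maxPageSize);
    }

    public static void main(String[] args)
    {
        SelectSketch select = new SelectSketch(100);
        System.out.println(select.execute(10));
        System.out.println(select.executeInternal(20));
        System.out.println("guard calls: " + select.guardCalls);
    }
}
```

Each query passes through the guard exactly once regardless of which entry point was used, which is the property the comment is after.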




[jira] [Updated] (CASSANDRA-17189) Guardrail for page size

2021-12-10 Thread Bartlomiej (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bartlomiej updated CASSANDRA-17189:
---
Attachment: CASSANDRA-17189-trunk.diff




[jira] [Updated] (CASSANDRA-17189) Guardrail for page size

2021-12-10 Thread Bartlomiej (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bartlomiej updated CASSANDRA-17189:
---
Attachment: (was: CASSANDRA-17189-trunk.diff)




[jira] [Updated] (CASSANDRA-17189) Guardrail for page size

2021-12-10 Thread Bartlomiej (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bartlomiej updated CASSANDRA-17189:
---
Attachment: CASSANDRA-17189-trunk.diff

> Guardrail for page size
> ---
>
> Key: CASSANDRA-17189
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17189
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Feature/Guardrails
>Reporter: Andres de la Peña
>Assignee: Bartlomiej
>Priority: Normal
>  Labels: AdventCalendar2021, lhf
> Fix For: 4.1
>
> Attachments: CASSANDRA-17189-trunk.diff
>
>
> Add guardrail limiting the query page size, for example:
> {code}
> # Guardrail to warn about or reject page sizes greater than threshold.
> # The two thresholds default to -1 to disable.
> page_size:
>     warn_threshold: -1
>     abort_threshold: -1
> {code}
> Initially this can be based on the specified number of rows used as page 
> size, although it would be ideal to also limit the actual size in bytes of 
> the returned pages.
> +Additional information for newcomers:+
> # Add the configuration for the new guardrail on page size in the guardrails 
> section of cassandra.yaml.
> # Add a getPageSize method in GuardrailsConfig returning a Threshold.Config 
> object
> # Implement that method in GuardrailsOptions, which is the default yaml-based 
> implementation of GuardrailsConfig
> # Add a Threshold guardrail named pageSize in Guardrails, using the 
> previously created config
> # Define JMX-friendly getters and setters for the previously created config 
> in GuardrailsMBean
> # Implement the JMX-friendly getters and setters in Guardrails
> # Now that we have the guardrail ready, it’s time to use it. We should search 
> for a place to invoke the Guardrails.pageSize#guard method with the page size 
> that each query is going to use. The DataLimits#forPaging methods look like 
> good candidates for this.
> # Finally, add some tests for the new guardrail. Given that the new guardrail 
> is a Threshold, our new test should probably extend ThresholdTester.






[jira] [Updated] (CASSANDRA-17189) Guardrail for page size

2021-12-10 Thread Bartlomiej (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bartlomiej updated CASSANDRA-17189:
---
Attachment: (was: CASSANDRA-17189-trunk.txt)

> Guardrail for page size
> ---
>
> Key: CASSANDRA-17189
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17189
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Feature/Guardrails
>Reporter: Andres de la Peña
>Assignee: Bartlomiej
>Priority: Normal
>  Labels: AdventCalendar2021, lhf
> Fix For: 4.1
>
>
> Add guardrail limiting the query page size, for example:
> {code}
> # Guardrail to warn about or reject page sizes greater than threshold.
> # The two thresholds default to -1 to disable.
> page_size:
>     warn_threshold: -1
>     abort_threshold: -1
> {code}
> Initially this can be based on the specified number of rows used as page 
> size, although it would be ideal to also limit the actual size in bytes of 
> the returned pages.
> +Additional information for newcomers:+
> # Add the configuration for the new guardrail on page size in the guardrails 
> section of cassandra.yaml.
> # Add a getPageSize method in GuardrailsConfig returning a Threshold.Config 
> object
> # Implement that method in GuardrailsOptions, which is the default yaml-based 
> implementation of GuardrailsConfig
> # Add a Threshold guardrail named pageSize in Guardrails, using the 
> previously created config
> # Define JMX-friendly getters and setters for the previously created config 
> in GuardrailsMBean
> # Implement the JMX-friendly getters and setters in Guardrails
> # Now that we have the guardrail ready, it’s time to use it. We should search 
> for a place to invoke the Guardrails.pageSize#guard method with the page size 
> that each query is going to use. The DataLimits#forPaging methods look like 
> good candidates for this.
> # Finally, add some tests for the new guardrail. Given that the new guardrail 
> is a Threshold, our new test should probably extend ThresholdTester.






[jira] [Commented] (CASSANDRA-17031) Add support for PEM based key material for SSL

2021-12-10 Thread Maulin Vasavada (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17457346#comment-17457346
 ] 

Maulin Vasavada commented on CASSANDRA-17031:
-

FYI I changed the log level from DEBUG to INFO for the certificate details 
printing.

> Add support for PEM based key material for SSL
> --
>
> Key: CASSANDRA-17031
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17031
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Messaging/Internode
>Reporter: Maulin Vasavada
>Assignee: Maulin Vasavada
>Priority: Normal
> Fix For: 4.1
>
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> h1. Scope
> Currently Cassandra supports standard keystore types for SSL 
> keys/certificates. The scope of this enhancement is to add support for PEM 
> based key material (keys/certificates), given that PEM is a widely used 
> format for such material. We intend to add support for Unencrypted and Password 
> Based Encrypted (PBE) PKCS#8 formatted Private Keys in PEM format with 
> standard algorithms (RSA, DSA and EC) along with the certificate chain for 
> the private key and PEM based X509 certificates. The work here is going to be 
> built on top of [CEP-9: Make SSLContext creation 
> pluggable|https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-9%3A+Make+SSLContext+creation+pluggable]
>  for which the code is merged for Apache Cassandra 4.1 release.
> We intend to let the key material be configured either as direct PEM values 
> or via a file (configured with the keystore and truststore configurations 
> today). We are not going to model PEM as a valid 'store_type' given that 
> 'store_type' has a [specific 
> definition|https://docs.oracle.com/en/java/javase/11/security/java-cryptography-architecture-jca-reference-guide.html#GUID-AB51DEFD-5238-4F96-967F-082F6D34FBEA].
>  
> h1. Approach
> Create an implementation for 
> [ISslContextFactory|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/security/ISslContextFactory.java]
>  extending 
> [FileBasedSslContextFactory|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/security/FileBasedSslContextFactory.java]
>  implementation to add PEM formatted key/certificates.
> h1. Motivation
> PEM is a widely used format for encoding Private Keys and X.509 Certificates 
> and Apache Cassandra's current implementation lacks the support for 
> specifying the PEM formatted key material for SSL configurations. This means 
> operators have to re-create the key material to comply with the supported 
> formats (using key/trust store types - jks, pkcs12 etc) and deal with an 
> operational task for managing it. This is an operational overhead we can 
> avoid by supporting the PEM format, making Apache Cassandra even more customer 
> friendly and driving more adoption.
> h1. Proposed Changes
>  # A new implementation for ISslContextFactory - PEMBasedSslContextFactory 
> with the following supported configuration
> {panel:title=New configurations}
> encryption_options:
>     ssl_context_factory:
>         class_name: org.apache.cassandra.security.PEMBasedSslContextFactory
>         parameters:
>             private_key:  certificate chain>
>             private_key_password:  private key if it is encrypted>
>             trusted_certificates: 
> {panel}
> *NOTE:* We could reuse 'keystore_password' instead of the 
> 'private_key_password'. However, a PEM encoded private key is not a 'keystore' 
> in itself, so piggybacking on that option would be inappropriate; the only 
> benefit would be avoiding duplication of similar fields.
>  # The PEMBasedSslContextFactory will also support file based key material 
> (and the corresponding HOT Reloading based on file timestamp updates) for the 
> PEM format via existing  'keystore' and 'truststore' encryption options. 
> However in that case the 'truststore_password' configuration won't be used 
> since generally PEM formatted certificates for truststore don't get encrypted 
> with a password.
>  # The PEMBasedSslContextFactory will internally create PKCS12 keystore for 
> private key and the trusted certificates. However, this doesn't impact the 
> user of the implementation in any way and it is mentioned for clarity only.
>  
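
Since the proposal expects a single PEM value to carry a private key followed by its certificate chain, a rough sketch of splitting such text into its blocks may help. This is not the PEMBasedSslContextFactory implementation: the class PemBlocksSketch, its Block holder and the parse method are hypothetical names, and the sketch only separates and Base64-decodes the blocks. It does not decrypt keys or build certificates.

```java
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch; not the PEMBasedSslContextFactory implementation. */
final class PemBlocksSketch
{
    /** One PEM block: its label (e.g. "CERTIFICATE") and the decoded payload bytes. */
    static final class Block
    {
        final String label;
        final byte[] content;

        Block(String label, byte[] content)
        {
            this.label = label;
            this.content = content;
        }
    }

    // "-----BEGIN <label>-----", the Base64 body, then "-----END <same label>-----".
    private static final Pattern PEM =
        Pattern.compile("-----BEGIN ([A-Z0-9 ]+)-----(.*?)-----END \\1-----", Pattern.DOTALL);

    /** Returns the blocks in the order they appear in the input. */
    static List<Block> parse(String pem)
    {
        List<Block> blocks = new ArrayList<>();
        Matcher matcher = PEM.matcher(pem);
        while (matcher.find())
        {
            // The MIME decoder skips the line breaks inside the Base64 body.
            byte[] content = Base64.getMimeDecoder().decode(matcher.group(2));
            blocks.add(new Block(matcher.group(1), content));
        }
        return blocks;
    }
}
```

The decoded bytes of a "CERTIFICATE" block could then be handed to java.security.cert.CertificateFactory, and those of an unencrypted "PRIVATE KEY" block to a KeyFactory via PKCS8EncodedKeySpec; both steps are left out here to keep the sketch short.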






[jira] [Commented] (CASSANDRA-17201) Arrow to SSTable converter

2021-12-10 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17457328#comment-17457328
 ] 

Michaël Ughetto commented on CASSANDRA-17201:
-

[~jmckenzie] Ok I will bring it up on the mailing list then. It's just that 
usually the "new feature" templates are used for this kind of discussion in my 
experience :)

> Arrow to SSTable converter
> --
>
> Key: CASSANDRA-17201
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17201
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Michaël Ughetto
>Priority: Normal
>
> Hi,
> I hope this is the right project to ask for this. Recently this project to 
> convert SSTables to Arrow made it possible to analyse Cassandra data on GPU:
> [https://developer.nvidia.com/blog/analyzing-cassandra-data-using-gpus-part-2/]
> I'm wondering if an Arrow to SSTable converter would be feasible? In practice we 
> envision using it to quickly process our parquet files on GPU and upload 
> them faster to Cassandra.
> I also brought this up with the RAPIDS team and Datastax:
> [https://github.com/rapidsai/cudf/issues/9811]
> [https://github.com/datastax/sstable-to-arrow/issues/1]
> Cheers,
> Michaël
>  






[jira] [Updated] (CASSANDRA-17181) SchemaCQLHelperTest methods can be simplified

2021-12-10 Thread Ekaterina Dimitrova (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ekaterina Dimitrova updated CASSANDRA-17181:

Reviewers: Benjamin Lerer, Ekaterina Dimitrova  (was: Benjamin Lerer)

> SchemaCQLHelperTest methods can be simplified
> -
>
> Key: CASSANDRA-17181
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17181
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Snapshots
>Reporter: Benjamin Lerer
>Assignee: Bartlomiej
>Priority: Normal
>  Labels: AdventCalendar2021, lhf
> Fix For: 4.1
>
> Attachments: CASSANDRA-17181-4.0.patch
>
>
> {{SchemaCQLHelperTest}} is used during a snapshot to generate the 
> {{schema.cql}} file. The methods accept the following parameters: 
> {{includeDroppedColumns}}, {{internals}} and {{ifNotExists}}.
> Those parameters are in practice always set to true by the calling code and 
> therefore can be removed.






[jira] [Commented] (CASSANDRA-17201) Arrow to SSTable converter

2021-12-10 Thread Josh McKenzie (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17457315#comment-17457315
 ] 

Josh McKenzie commented on CASSANDRA-17201:
---

{quote}can we imagine
{quote}
Sure! We can imagine a lot of things. Question is, are they valuable enough to 
someone to do the legwork to implement them and then to maintain them long 
term. That's a much better discussion for... *drumroll*

The dev mailing list. :)

> Arrow to SSTable converter
> --
>
> Key: CASSANDRA-17201
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17201
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Michaël Ughetto
>Priority: Normal
>
> Hi,
> I hope this is the right project to ask for this. Recently this project to 
> convert SSTables to Arrow made it possible to analyse Cassandra data on GPU:
> [https://developer.nvidia.com/blog/analyzing-cassandra-data-using-gpus-part-2/]
> I'm wondering if an Arrow to SSTable converter would be feasible? In practice we 
> envision using it to quickly process our parquet files on GPU and upload 
> them faster to Cassandra.
> I also brought this up with the RAPIDS team and Datastax:
> [https://github.com/rapidsai/cudf/issues/9811]
> [https://github.com/datastax/sstable-to-arrow/issues/1]
> Cheers,
> Michaël
>  






[jira] [Commented] (CASSANDRA-17181) SchemaCQLHelperTest methods can be simplified

2021-12-10 Thread Benjamin Lerer (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17457314#comment-17457314
 ] 

Benjamin Lerer commented on CASSANDRA-17181:


The patch looks good to me. I triggered a CI run 
[here|https://app.circleci.com/pipelines/github/blerer/cassandra/249/workflows/b7573c32-defb-4a98-a051-025971e548a0].

[~e.dimitrova] would you have time for the second review?

> SchemaCQLHelperTest methods can be simplified
> -
>
> Key: CASSANDRA-17181
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17181
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Snapshots
>Reporter: Benjamin Lerer
>Assignee: Bartlomiej
>Priority: Normal
>  Labels: AdventCalendar2021, lhf
> Fix For: 4.1
>
> Attachments: CASSANDRA-17181-4.0.patch
>
>
> {{SchemaCQLHelperTest}} is used during a snapshot to generate the 
> {{schema.cql}} file. The methods accept the following parameters: 
> {{includeDroppedColumns}}, {{internals}} and {{ifNotExists}}.
> Those parameters are in practice always set to true by the calling code and 
> therefore can be removed.






[jira] [Updated] (CASSANDRA-17181) SchemaCQLHelperTest methods can be simplified

2021-12-10 Thread Benjamin Lerer (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Lerer updated CASSANDRA-17181:
---
Status: Review In Progress  (was: Patch Available)

> SchemaCQLHelperTest methods can be simplified
> -
>
> Key: CASSANDRA-17181
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17181
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Snapshots
>Reporter: Benjamin Lerer
>Assignee: Bartlomiej
>Priority: Normal
>  Labels: AdventCalendar2021, lhf
> Fix For: 4.1
>
> Attachments: CASSANDRA-17181-4.0.patch
>
>
> {{SchemaCQLHelperTest}} is used during a snapshot to generate the 
> {{schema.cql}} file. The methods accept the following parameters: 
> {{includeDroppedColumns}}, {{internals}} and {{ifNotExists}}.
> Those parameters are in practice always set to true by the calling code and 
> therefore can be removed.






[jira] [Updated] (CASSANDRA-17181) SchemaCQLHelperTest methods can be simplified

2021-12-10 Thread Benjamin Lerer (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Lerer updated CASSANDRA-17181:
---
Test and Documentation Plan: The patch updates existing tests
 Status: Patch Available  (was: Open)

> SchemaCQLHelperTest methods can be simplified
> -
>
> Key: CASSANDRA-17181
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17181
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Snapshots
>Reporter: Benjamin Lerer
>Assignee: Bartlomiej
>Priority: Normal
>  Labels: AdventCalendar2021, lhf
> Fix For: 4.1
>
> Attachments: CASSANDRA-17181-4.0.patch
>
>
> {{SchemaCQLHelperTest}} is used during a snapshot to generate the 
> {{schema.cql}} file. The methods accept the following parameters: 
> {{includeDroppedColumns}}, {{internals}} and {{ifNotExists}}.
> Those parameters are in practice always set to true by the calling code and 
> therefore can be removed.






[jira] [Commented] (CASSANDRA-17133) Broken test_timeuuid - upgrade_tests.cql_tests

2021-12-10 Thread Kanthi Subramanian (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17457311#comment-17457311
 ] 

Kanthi Subramanian commented on CASSANDRA-17133:


Thanks [~blerer] and [~brandon.williams] for reviewing it again. Should I 
create a PR to remove the functions and tests, or is there any way you can revert 
the commit? Please let me know.

 

> Broken test_timeuuid - upgrade_tests.cql_tests
> --
>
> Key: CASSANDRA-17133
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17133
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL/Semantics
>Reporter: Yifan Cai
>Assignee: Kanthi Subramanian
>Priority: Normal
> Fix For: 4.x
>
>
> Both CircleCI and Jenkins builds failed at test_timeuuid with the following 
> error.
> {quote}cassandra.InvalidRequest: Error from server: code=2200 [Invalid query] 
> message="Ambiguous call to function maxtimeuuid (can be matched by following 
> signatures: system.maxtimeuuid : (bigint) -> timeuuid, system.maxtimeuuid : 
> (timestamp) -> timeuuid): use type casts to disambiguate"{quote}
> https://app.circleci.com/pipelines/github/yifan-c/cassandra/273/workflows/7a855174-823a-4553-ad09-25623747a58e/jobs/1884/tests#failed-test-0
> https://ci-cassandra.apache.org/blue/organizations/jenkins/Cassandra-devbranch/detail/Cassandra-devbranch/1272/tests/
> The change was added in CASSANDRA-17029. 






[jira] [Comment Edited] (CASSANDRA-17201) Arrow to SSTable converter

2021-12-10 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17457299#comment-17457299
 ] 

Michaël Ughetto edited comment on CASSANDRA-17201 at 12/10/21, 6:13 PM:


[~jmckenzie] : My question is "can we imagine an arrow to sstable converter? 
And which Apache project would it fall under if any?". This would simplify 
ingestion from parquet files into Cassandra for example.

PS: this being said GPU analytics integration would be absolutely awesome.

[~brandon.williams]: My original request contains a link to the datastax work 
that does the opposite of what I'm interested in... and I actually made explicit the 
direction of the conversion I'm interested in...


was (Author: JIRAUSER281525):
[~jmckenzie] : My question is "can we imagine an arrow to sstable converter? 
And which Apache project would it fall under if any?". This would simplify 
ingestion from parquet files into Cassandra for example.

[~brandon.williams]: My original request contains a link to the datastax work 
that does the opposite of what I'm interested in... and I actually made explicit the 
direction of the conversion I'm interested in...

> Arrow to SSTable converter
> --
>
> Key: CASSANDRA-17201
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17201
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Michaël Ughetto
>Priority: Normal
>
> Hi,
> I hope this is the right project to ask for this. Recently this project to 
> convert SSTables to Arrow made it possible to analyse Cassandra data on GPU:
> [https://developer.nvidia.com/blog/analyzing-cassandra-data-using-gpus-part-2/]
> I'm wondering if an Arrow to SSTable converter would be feasible? In practice we 
> envision using it to quickly process our parquet files on GPU and upload 
> them faster to Cassandra.
> I also brought this up with the RAPIDS team and Datastax:
> [https://github.com/rapidsai/cudf/issues/9811]
> [https://github.com/datastax/sstable-to-arrow/issues/1]
> Cheers,
> Michaël
>  






[jira] [Comment Edited] (CASSANDRA-17201) Arrow to SSTable converter

2021-12-10 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17457299#comment-17457299
 ] 

Michaël Ughetto edited comment on CASSANDRA-17201 at 12/10/21, 6:10 PM:


[~jmckenzie] : My question is "can we imagine an arrow to sstable converter? 
And which Apache project would it fall under if any?". This would simplify 
ingestion from parquet files into Cassandra for example.

[~brandon.williams]: My original request contains a link to the datastax work 
that does the opposite of what I'm interested in... and I actually made explicit the 
direction of the conversion I'm interested in...


was (Author: JIRAUSER281525):
[~jmckenzie] : My question is "can we imagine an arrow to sstable converter? 
And which Apache project would it fall under if any?". This would simply 
ingestion from parquet files into Cassandra for example.

[~brandon.williams]: My original request contains a link to the datastax work 
that does the opposite of what I'm interested in... and I actually made explicit the 
direction of the conversion I'm interested in...

> Arrow to SSTable converter
> --
>
> Key: CASSANDRA-17201
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17201
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Michaël Ughetto
>Priority: Normal
>
> Hi,
> I hope this is the right project to ask for this. Recently this project to 
> convert SSTables to Arrow made it possible to analyse Cassandra data on GPU:
> [https://developer.nvidia.com/blog/analyzing-cassandra-data-using-gpus-part-2/]
> I'm wondering if an Arrow to SSTable converter would be feasible? In practice we 
> envision using it to quickly process our parquet files on GPU and upload 
> them faster to Cassandra.
> I also brought this up with the RAPIDS team and Datastax:
> [https://github.com/rapidsai/cudf/issues/9811]
> [https://github.com/datastax/sstable-to-arrow/issues/1]
> Cheers,
> Michaël
>  






[jira] [Comment Edited] (CASSANDRA-17201) Arrow to SSTable converter

2021-12-10 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17457299#comment-17457299
 ] 

Michaël Ughetto edited comment on CASSANDRA-17201 at 12/10/21, 5:56 PM:


[~jmckenzie] : My question is "can we imagine an arrow to sstable converter? 
And which Apache project would it fall under if any?". This would simply 
ingestion from parquet files into Cassandra for example.

[~brandon.williams]: My original request contains a link to the datastax work 
that does the opposite of what I'm interested in... and I actually made explicit the 
direction of the conversion I'm interested in...


was (Author: JIRAUSER281525):
[~jmckenzie] : My question is "can we imagine an arrow to sstable converter? 
And which Apache project would it fall under if any?". This would simply 
ingestion from parquet files into Cassandra for example.

[~brandon.williams]: My original request contains a link to the datastax work 
that does the opposite of what I'm interested in...

> Arrow to SSTable converter
> --
>
> Key: CASSANDRA-17201
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17201
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Michaël Ughetto
>Priority: Normal
>
> Hi,
> I hope this is the right project to ask for this. Recently this project to 
> convert SSTables to Arrow made it possible to analyse Cassandra data on GPU:
> [https://developer.nvidia.com/blog/analyzing-cassandra-data-using-gpus-part-2/]
> I'm wondering if an Arrow to SSTable converter would be feasible? In practice we 
> envision using it to quickly process our parquet files on GPU and upload 
> them faster to Cassandra.
> I also brought this up with the RAPIDS team and Datastax:
> [https://github.com/rapidsai/cudf/issues/9811]
> [https://github.com/datastax/sstable-to-arrow/issues/1]
> Cheers,
> Michaël
>  






[jira] [Commented] (CASSANDRA-17201) Arrow to SSTable converter

2021-12-10 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17457299#comment-17457299
 ] 

Michaël Ughetto commented on CASSANDRA-17201:
-

[~jmckenzie] : My question is "can we imagine an arrow to sstable converter? 
And which Apache project would it fall under if any?". This would simply 
ingestion from parquet files into Cassandra for example.

[~brandon.williams]: My original request contains a link to the datastax work 
that does the opposite of what I'm interested in...

> Arrow to SSTable converter
> --
>
> Key: CASSANDRA-17201
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17201
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Michaël Ughetto
>Priority: Normal
>
> Hi,
> I hope this is the right project to ask for this. Recently this project to 
> convert SSTables to Arrow made it possible to analyse Cassandra data on GPU:
> [https://developer.nvidia.com/blog/analyzing-cassandra-data-using-gpus-part-2/]
> I'm wondering if an Arrow to SSTable converter would be feasible? In practice we 
> envision using it to quickly process our parquet files on GPU and upload 
> them faster to Cassandra.
> I also brought this up with the RAPIDS team and Datastax:
> [https://github.com/rapidsai/cudf/issues/9811]
> [https://github.com/datastax/sstable-to-arrow/issues/1]
> Cheers,
> Michaël
>  






[jira] [Commented] (CASSANDRA-17201) Arrow to SSTable converter

2021-12-10 Thread Josh McKenzie (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17457293#comment-17457293
 ] 

Josh McKenzie commented on CASSANDRA-17201:
---

If the question is "could we integrate sstable to arrow exporting for GPU 
accelerated analytics more natively to the C* ecosystem", that'd be a great 
DISCUSS thread on the dev list [~mughetto]. Probably premature (and less 
visibility) to open a JIRA about it though.

> Arrow to SSTable converter
> --
>
> Key: CASSANDRA-17201
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17201
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Michaël Ughetto
>Priority: Normal
>
> Hi,
> I hope this is the right project to ask for this. Recently this project to 
> convert SSTables to Arrow made it possible to analyse Cassandra data on GPU:
> [https://developer.nvidia.com/blog/analyzing-cassandra-data-using-gpus-part-2/]
> I'm wondering if an Arrow to SSTable converter would be feasible? In practice we 
> envision using it to quickly process our parquet files on GPU and upload 
> them faster to Cassandra.
> I also brought this up with the RAPIDS team and Datastax:
> [https://github.com/rapidsai/cudf/issues/9811]
> [https://github.com/datastax/sstable-to-arrow/issues/1]
> Cheers,
> Michaël
>  






[jira] [Commented] (CASSANDRA-15234) Standardise config and JVM parameters

2021-12-10 Thread Caleb Rackliffe (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17457271#comment-17457271
 ] 

Caleb Rackliffe commented on CASSANDRA-15234:
-

CCM changes look reasonable, and I didn't have trouble running a cluster with 
it against your branch.

> Standardise config and JVM parameters
> -
>
> Key: CASSANDRA-15234
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15234
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Config
>Reporter: Benedict Elliott Smith
>Assignee: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 5.x
>
> Attachments: CASSANDRA-15234-3-DTests-JAVA8.txt
>
>
> We have a bunch of inconsistent names and config patterns in the codebase, 
> both from the yaml and JVM properties.  It would be nice to standardise the 
> naming (such as otc_ vs internode_) as well as the provision of values with 
> units - while maintaining perpetual backwards compatibility with the old 
> parameter names, of course.
> For temporal units, I would propose parsing strings with suffixes of:
> {code}
> u|micros(econds?)?
> ms|millis(econds?)?
> s(econds?)?
> m(inutes?)?
> h(ours?)?
> d(ays?)?
> mo(nths?)?
> {code}
> For rate units, I would propose parsing any of the standard {{B/s, KiB/s, 
> MiB/s, GiB/s, TiB/s}}.
> Perhaps for avoiding ambiguity we could not accept bauds {{bs, Mbps}} or 
> powers of 1000 such as {{KB/s}}, given these are regularly used for either 
> their old or new definition e.g. {{KiB/s}}, or we could support them and 
> simply log the value in bytes/s.
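
To make the proposed temporal suffixes concrete, here is a small sketch that parses values such as "10ms" or "3 hours" using exactly the patterns listed in the ticket. It is not the parser that ended up in Cassandra: DurationSuffixSketch, Parsed and parse are hypothetical names, the mapping to java.time.temporal.ChronoUnit is an assumption made for illustration, and only lower-case suffixes are accepted.

```java
import java.time.temporal.ChronoUnit;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch; not the parser that Cassandra ships. */
final class DurationSuffixSketch
{
    /** A parsed amount together with its unit. */
    static final class Parsed
    {
        final long amount;
        final ChronoUnit unit;

        Parsed(long amount, ChronoUnit unit)
        {
            this.amount = amount;
            this.unit = unit;
        }
    }

    // Digits, optional whitespace, then a lower-case suffix.
    private static final Pattern VALUE = Pattern.compile("(\\d+)\\s*([a-z]+)");

    // Suffix patterns copied from the proposal, each mapped to a java.time unit.
    private static final Map<Pattern, ChronoUnit> UNITS = new LinkedHashMap<>();
    static
    {
        UNITS.put(Pattern.compile("u|micros(econds?)?"), ChronoUnit.MICROS);
        UNITS.put(Pattern.compile("ms|millis(econds?)?"), ChronoUnit.MILLIS);
        UNITS.put(Pattern.compile("s(econds?)?"), ChronoUnit.SECONDS);
        UNITS.put(Pattern.compile("m(inutes?)?"), ChronoUnit.MINUTES);
        UNITS.put(Pattern.compile("h(ours?)?"), ChronoUnit.HOURS);
        UNITS.put(Pattern.compile("d(ays?)?"), ChronoUnit.DAYS);
        UNITS.put(Pattern.compile("mo(nths?)?"), ChronoUnit.MONTHS);
    }

    static Parsed parse(String text)
    {
        Matcher value = VALUE.matcher(text.trim());
        if (!value.matches())
            throw new IllegalArgumentException("Not a duration: " + text);

        // matches() requires the whole suffix to match, so "m", "ms" and "mo" cannot be confused.
        for (Map.Entry<Pattern, ChronoUnit> unit : UNITS.entrySet())
            if (unit.getKey().matcher(value.group(2)).matches())
                return new Parsed(Long.parseLong(value.group(1)), unit.getValue());

        throw new IllegalArgumentException("Unknown unit in: " + text);
    }
}
```

Returning the amount and the unit separately, rather than converting to a fixed number of nanoseconds, sidesteps the question of how long a month is, which the ticket leaves open.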






[jira] [Updated] (CASSANDRA-17202) Avoid unnecessary String.format in QueryProcessor when getting stored prepared statement

2021-12-10 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-17202:
-
Reviewers: Brandon Williams

> Avoid unnecessary String.format in QueryProcessor when getting stored 
> prepared statement 
> -
>
> Key: CASSANDRA-17202
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17202
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Messaging/Client
>Reporter: Ivan Senic
>Assignee: Ivan Senic
>Priority: Low
> Fix For: 4.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In the _QueryProcessor#getStoredPreparedStatement_ if the statement is found 
> in the prepared statements cache, there is always unnecessary string creation 
> using String.format in order to execute the _checkTrue_ assertion. The string 
> construction is necessary only when the queries are not equal.
> {code:java}
> public static ResultMessage.Prepared getStoredPreparedStatement(String queryString, String clientKeyspace)
> throws InvalidRequestException
> {
>     MD5Digest statementId = computeId(queryString, clientKeyspace);
>     Prepared existing = preparedStatements.getIfPresent(statementId);
>     if (existing == null)
>         return null;
>     checkTrue(queryString.equals(existing.rawCQLStatement),
>               String.format("MD5 hash collision: query with the same MD5 hash was already prepared. \n Existing: '%s'", existing.rawCQLStatement));
>  {code}
> Hopefully the JIT can optimize this once the _checkTrue_ is inlined, but it's 
> getting on my nerves as it's popping up on my flame graphs all the time.
>  
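
The cost described in the quoted description comes from Java evaluating method arguments before the call, so the message is formatted even when the check passes. The sketch below shows one way to defer that work. It is a stand-alone illustration, not the patch under review: LazyMessageSketch, checkTrueLazy and the messagesBuilt counter are hypothetical names, and checkTrue here is a local stand-in rather than Cassandra's own helper.

```java
import java.util.function.Supplier;

/** Stand-alone illustration; the methods below are local stand-ins, not Cassandra's helpers. */
final class LazyMessageSketch
{
    // Counts how many times the message was actually formatted.
    static int messagesBuilt = 0;

    /** Eager variant: the caller has already paid for the message before the check runs. */
    static void checkTrue(boolean expression, String message)
    {
        if (!expression)
            throw new IllegalStateException(message);
    }

    /** Lazy variant: the supplier is only invoked on the failure path. */
    static void checkTrueLazy(boolean expression, Supplier<String> message)
    {
        if (!expression)
            throw new IllegalStateException(message.get());
    }

    static String collisionMessage(String existing)
    {
        messagesBuilt++;
        return String.format("MD5 hash collision: query with the same MD5 hash was already prepared. \n Existing: '%s'", existing);
    }
}
```

With matching queries, checkTrue(..., collisionMessage(q)) still formats the string once, while checkTrueLazy(..., () -> collisionMessage(q)) does not format it at all. A plain if statement around the throw would have the same effect without a lambda.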






[jira] [Updated] (CASSANDRA-17202) Avoid unnecessary String.format in QueryProcessor when getting stored prepared statement

2021-12-10 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-17202:
-
Status: Needs Committer  (was: Patch Available)

> Avoid unnecessary String.format in QueryProcessor when getting stored 
> prepared statement 
> -
>
> Key: CASSANDRA-17202
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17202
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Messaging/Client
>Reporter: Ivan Senic
>Assignee: Ivan Senic
>Priority: Low
> Fix For: 4.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In the _QueryProcessor#getStoredPreparedStatement_ if the statement is found 
> in the prepared statements cache, there is always unnecessary string creation 
> using String.format in order to execute the _checkTrue_ assertion. The string 
> construction is necessary only when the queries are not equal.
> {code:java}
> public static ResultMessage.Prepared getStoredPreparedStatement(String queryString, String clientKeyspace)
> throws InvalidRequestException
> {
>     MD5Digest statementId = computeId(queryString, clientKeyspace);
>     Prepared existing = preparedStatements.getIfPresent(statementId);
>     if (existing == null)
>         return null;
>     checkTrue(queryString.equals(existing.rawCQLStatement),
>               String.format("MD5 hash collision: query with the same MD5 hash was already prepared. \n Existing: '%s'", existing.rawCQLStatement));
>  {code}
> Hopefully the JIT can optimize this once the _checkTrue_ is inlined, but it's 
> getting on my nerves as it's popping up on my flame graphs all the time.
>  






[jira] [Commented] (CASSANDRA-17202) Avoid unnecessary String.format in QueryProcessor when getting stored prepared statement

2021-12-10 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17457263#comment-17457263
 ] 

Brandon Williams commented on CASSANDRA-17202:
--

Only known failures in CI, +1 from me.

> Avoid unnecessary String.format in QueryProcessor when getting stored 
> prepared statement 
> -
>
> Key: CASSANDRA-17202
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17202
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Messaging/Client
>Reporter: Ivan Senic
>Assignee: Ivan Senic
>Priority: Low
> Fix For: 4.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In the _QueryProcessor#getStoredPreparedStatement_ if the statement is found 
> in the prepared statements cache, there is always unnecessary string creation 
> using String.format in order to execute the _checkTrue_ assertion. The string 
> construction is necessary only when the queries are not equal.
> {code:java}
> public static ResultMessage.Prepared getStoredPreparedStatement(String queryString, String clientKeyspace)
> throws InvalidRequestException
> {
>     MD5Digest statementId = computeId(queryString, clientKeyspace);
>     Prepared existing = preparedStatements.getIfPresent(statementId);
>     if (existing == null)
>         return null;
>     checkTrue(queryString.equals(existing.rawCQLStatement),
>               String.format("MD5 hash collision: query with the same MD5 hash was already prepared. \n Existing: '%s'", existing.rawCQLStatement));
>  {code}
> Hopefully the JIT can optimize this once the _checkTrue_ is inlined, but it's 
> getting on my nerves as it's popping up on my flame graphs all the time.
>  






[jira] [Updated] (CASSANDRA-10023) Emit a metric for number of local read and write calls

2021-12-10 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-10023:
-
Reviewers: Brandon Williams

> Emit a metric for number of local read and write calls
> --
>
> Key: CASSANDRA-10023
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10023
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability/Metrics
>Reporter: Sankalp Kohli
>Assignee: Stefan Miklosovic
>Priority: Low
>  Labels: 4.0-feature-freeze-review-requested, lhf
> Fix For: 4.x
>
> Attachments: 10023-trunk-dtests.txt, 10023-trunk.txt, 
> CASSANDRA-10023.patch
>
>
> Many C* drivers have a feature to be replica aware and choose a coordinator 
> which is a replica. We should add a metric which tells us whether all calls 
> to the coordinator are replica aware.
> We have seen issues where clients think they are replica aware when they 
> forget to add the routing key at various places in the code. 
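
The idea behind such a metric can be sketched as two counters: one for requests whose coordinator is also a replica for the data, and one for requests that had to be forwarded. Everything below is hypothetical and only illustrates the counting; ReplicaAwareCounterSketch is not a Cassandra class, and plain strings stand in for real endpoints.

```java
import java.util.Set;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical counter pair, not Cassandra's metrics API. */
final class ReplicaAwareCounterSketch
{
    private final LongAdder local = new LongAdder();
    private final LongAdder remote = new LongAdder();

    /** Records one coordinated request; node names stand in for real endpoints. */
    void record(String coordinator, Set<String> replicas)
    {
        if (replicas.contains(coordinator))
            local.increment();
        else
            remote.increment();
    }

    long localCount()  { return local.sum(); }
    long remoteCount() { return remote.sum(); }

    /** Fraction of requests served by a coordinator that was also a replica; 0.0 when nothing was recorded. */
    double localRatio()
    {
        long l = local.sum();
        long total = l + remote.sum();
        return total == 0 ? 0.0 : (double) l / total;
    }
}
```

A ratio well below 1.0 would suggest that some client code paths are missing the routing key, which is the situation the description mentions.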






[jira] [Commented] (CASSANDRA-17202) Avoid unnecessary String.format in QueryProcessor when getting stored prepared statement

2021-12-10 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17457234#comment-17457234
 ] 

Brandon Williams commented on CASSANDRA-17202:
--

[Circle|https://app.circleci.com/pipelines/github/driftx/cassandra?branch=CASSANDRA-17202]
 running.

> Avoid unnecessary String.format in QueryProcessor when getting stored 
> prepared statement 
> -
>
> Key: CASSANDRA-17202
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17202
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Messaging/Client
>Reporter: Ivan Senic
>Assignee: Ivan Senic
>Priority: Low
> Fix For: 4.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In the _QueryProcessor#getStoredPreparedStatement_ if the statement is found 
> in the prepared statements cache, there is always unnecessary string creation 
> using String.format in order to execute the _checkTrue_ assertion. The string 
> construction is necessary only when the queries are not equal.
> {code:java}
> public static ResultMessage.Prepared getStoredPreparedStatement(String queryString, String clientKeyspace)
> throws InvalidRequestException
> {
>     MD5Digest statementId = computeId(queryString, clientKeyspace);
>     Prepared existing = preparedStatements.getIfPresent(statementId);
>     if (existing == null)
>         return null;
>     checkTrue(queryString.equals(existing.rawCQLStatement),
>               String.format("MD5 hash collision: query with the same MD5 hash was already prepared. \n Existing: '%s'", existing.rawCQLStatement));
>  {code}
> Hopefully the JIT can optimize this once the _checkTrue_ is inlined, but it's 
> getting on my nerves as it's popping up on my flame graphs all the time.
>  






[jira] [Updated] (CASSANDRA-17202) Avoid unnecessary String.format in QueryProcessor when getting stored prepared statement

2021-12-10 Thread Ivan Senic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Senic updated CASSANDRA-17202:
---
Description: 
In the _QueryProcessor#getStoredPreparedStatement_ if the statement is found in 
the prepared statements cache, there is always unnecessary string creation 
using String.format in order to execute the _checkTrue_ assertion. The string 
construction is necessary only when the queries are not equal.
{code:java}
public static ResultMessage.Prepared getStoredPreparedStatement(String queryString, String clientKeyspace)
throws InvalidRequestException
{
    MD5Digest statementId = computeId(queryString, clientKeyspace);
    Prepared existing = preparedStatements.getIfPresent(statementId);
    if (existing == null)
        return null;

    checkTrue(queryString.equals(existing.rawCQLStatement),
              String.format("MD5 hash collision: query with the same MD5 hash was already prepared. \n Existing: '%s'", existing.rawCQLStatement));
 {code}
Hopefully the JIT can optimize this once the _checkTrue_ is inlined, but it's 
getting on my nerves as it's popping up on my flame graphs all the time.

 

  was:
In the _QueryProcessor#getStoredPreparedStatement_ if the statement is found in 
the prepared statements cache, there is always unnecessary string creation 
using String.format in order to execute the _checkTrue_ assertion. The string 
construction is necessary only when the queries are not equal.


{code:java}
public static ResultMessage.Prepared getStoredPreparedStatement(String 
queryString, String clientKeyspace)
throws InvalidRequestException
{
MD5Digest statementId = computeId(queryString, clientKeyspace);
Prepared existing = preparedStatements.getIfPresent(statementId);
if (existing == null)
return null;

checkTrue(queryString.equals(existing.rawCQLStatement),
String.format("MD5 hash collision: query with the same MD5 hash was 
already prepared. \n Existing: '%s'", existing.rawCQLStatement));
 {code}
Hopefully the JIT can optimize this once the _checkTrue_ is inlined, but it's 
getting in my nerves as it's popping up on my flame graphs all the time.

 


> Avoid unnecessary String.format in QueryProcessor when getting stored 
> prepared statement 
> -
>
> Key: CASSANDRA-17202
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17202
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Messaging/Client
>Reporter: Ivan Senic
>Assignee: Ivan Senic
>Priority: Low
> Fix For: 4.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In the _QueryProcessor#getStoredPreparedStatement_ if the statement is found 
> in the prepared statements cache, there is always unnecessary string creation 
> using String.format in order to execute the _checkTrue_ assertion. The string 
> construction is necessary only when the queries are not equal.
> {code:java}
> public static ResultMessage.Prepared getStoredPreparedStatement(String queryString, String clientKeyspace)
> throws InvalidRequestException
> {
>     MD5Digest statementId = computeId(queryString, clientKeyspace);
>     Prepared existing = preparedStatements.getIfPresent(statementId);
>     if (existing == null)
>         return null;
>     checkTrue(queryString.equals(existing.rawCQLStatement),
>               String.format("MD5 hash collision: query with the same MD5 hash was already prepared. \n Existing: '%s'", existing.rawCQLStatement));
>  {code}
> Hopefully the JIT can optimize this once the _checkTrue_ is inlined, but it's 
> getting on my nerves as it's popping up on my flame graphs all the time.
>  






[jira] [Updated] (CASSANDRA-17202) Avoid unnecessary String.format in QueryProcessor when getting stored prepared statement

2021-12-10 Thread Ekaterina Dimitrova (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ekaterina Dimitrova updated CASSANDRA-17202:

Test and Documentation Plan: [https://github.com/apache/cassandra/pull/1358]
 Status: Patch Available  (was: In Progress)

> Avoid unnecessary String.format in QueryProcessor when getting stored 
> prepared statement 
> -
>
> Key: CASSANDRA-17202
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17202
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Messaging/Client
>Reporter: Ivan Senic
>Assignee: Ivan Senic
>Priority: Low
> Fix For: 4.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In the _QueryProcessor#getStoredPreparedStatement_ if the statement is found 
> in the prepared statements cache, there is always unnecessary string creation 
> using String.format in order to execute the _checkTrue_ assertion. The string 
> construction is necessary only when the queries are not equal.
> {code:java}
> public static ResultMessage.Prepared getStoredPreparedStatement(String queryString, String clientKeyspace)
> throws InvalidRequestException
> {
>     MD5Digest statementId = computeId(queryString, clientKeyspace);
>     Prepared existing = preparedStatements.getIfPresent(statementId);
>     if (existing == null)
>         return null;
>     checkTrue(queryString.equals(existing.rawCQLStatement),
>               String.format("MD5 hash collision: query with the same MD5 hash was already prepared. \n Existing: '%s'", existing.rawCQLStatement));
>  {code}
> Hopefully the JIT can optimize this once the _checkTrue_ is inlined, but it's 
> getting on my nerves as it's popping up on my flame graphs all the time.
>  






[jira] [Commented] (CASSANDRA-17202) Avoid unnecessary String.format in QueryProcessor when getting stored prepared statement

2021-12-10 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17457230#comment-17457230
 ] 

Ekaterina Dimitrova commented on CASSANDRA-17202:
-

Moving to Patch Available to get the reviewers' attention. Thank you.

> Avoid unnecessary String.format in QueryProcessor when getting stored 
> prepared statement 
> -
>
> Key: CASSANDRA-17202
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17202
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Messaging/Client
>Reporter: Ivan Senic
>Assignee: Ivan Senic
>Priority: Low
> Fix For: 4.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In the _QueryProcessor#getStoredPreparedStatement_ if the statement is found 
> in the prepared statements cache, there is always unnecessary string creation 
> using String.format in order to execute the _checkTrue_ assertion. The string 
> construction is necessary only when the queries are not equal.
> {code:java}
> public static ResultMessage.Prepared getStoredPreparedStatement(String queryString, String clientKeyspace)
> throws InvalidRequestException
> {
>     MD5Digest statementId = computeId(queryString, clientKeyspace);
>     Prepared existing = preparedStatements.getIfPresent(statementId);
>     if (existing == null)
>         return null;
>     checkTrue(queryString.equals(existing.rawCQLStatement),
>               String.format("MD5 hash collision: query with the same MD5 hash was already prepared. \n Existing: '%s'", existing.rawCQLStatement));
>  {code}
> Hopefully the JIT can optimize this once the _checkTrue_ is inlined, but it's 
> getting on my nerves as it's popping up on my flame graphs all the time.
>  






[jira] [Commented] (CASSANDRA-14752) serializers/BooleanSerializer.java is using static bytebuffers which may cause problem for subsequent operations

2021-12-10 Thread Marcus Eriksson (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17457224#comment-17457224
 ] 

Marcus Eriksson commented on CASSANDRA-14752:
-

bq. should we apply the 3.11 patch and fix the bug for our users and open a 
follow-up ticket if you think it is really worth it to pursue something more at 
this point?
Yes, this sounds good to me.

> serializers/BooleanSerializer.java is using static bytebuffers which may 
> cause problem for subsequent operations
> 
>
> Key: CASSANDRA-14752
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14752
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core
>Reporter: Varun Barala
>Assignee: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 3.11.x, 4.0.x, 4.x
>
> Attachments: patch, patch-modified
>
>
> [https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/serializers/BooleanSerializer.java#L26]
>  It has two static ByteBuffer variables:
> {code:java}
> private static final ByteBuffer TRUE = ByteBuffer.wrap(new byte[]{1});
> private static final ByteBuffer FALSE = ByteBuffer.wrap(new byte[]{0});{code}
> What will happen if the position of these ByteBuffers is changed by 
> some other operation? It will affect subsequent operations. -IMO using 
> static is not a good idea here.-
> A potential place where it can become problematic: 
> [https://github.com/apache/cassandra/blob/cassandra-2.1.13/src/java/org/apache/cassandra/db/marshal/AbstractCompositeType.java#L243]
>  Since we are calling *`.remaining()`*, it may give wrong results (_i.e. 0_) if 
> these ByteBuffers have been used previously.
> Solution: 
>  
> [https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/serializers/BooleanSerializer.java#L42]
>  Every time, we return a new ByteBuffer object. Please do let me know if there 
> is a better way. I'd like to contribute. Thanks!!
> {code:java}
> public ByteBuffer serialize(Boolean value)
> {
>     return (value == null) ? ByteBufferUtil.EMPTY_BYTE_BUFFER
>                            : value ? ByteBuffer.wrap(new byte[] {1}) : ByteBuffer.wrap(new byte[] {0}); // false
> }
> {code}
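
The hazard in the quoted description can be reproduced with plain java.nio: a relative get() on a shared ByteBuffer advances its position for every later user. The sketch below contrasts handing out the shared instance with handing out a duplicate(), which has its own position and limit while sharing the backing bytes. SharedBufferSketch is a hypothetical class for illustration only, and duplicate() is shown as one possible alternative, not as the fix chosen in the ticket.

```java
import java.nio.ByteBuffer;

/** Hypothetical demonstration class, not Cassandra's BooleanSerializer. */
final class SharedBufferSketch
{
    // Shared instance, mirroring the static field quoted above.
    static final ByteBuffer TRUE = ByteBuffer.wrap(new byte[]{ 1 });

    /** Returns the shared instance: a caller's relative get() moves the position for everyone. */
    static ByteBuffer shared()
    {
        return TRUE;
    }

    /** Returns a view with an independent position and limit over the same backing byte. */
    static ByteBuffer isolated()
    {
        return TRUE.duplicate();
    }
}
```

Reading from isolated() leaves TRUE.remaining() at 1, whereas reading once from shared() drops it to 0 for all subsequent callers. Note that a duplicate still shares content, so it protects only against position changes, not against writes into the buffer.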






[jira] [Commented] (CASSANDRA-17189) Guardrail for page size

2021-12-10 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17457222#comment-17457222
 ] 

Andres de la Peña commented on CASSANDRA-17189:
---

Hi [~bkowalczyyk],

bq. Should I extract the guard outside forPaging (for example, in 
SelectStatement.execute)?

Right, I also think that 
[{{SelectStatement#execute}}|https://github.com/apache/cassandra/blob/e99a8da161ed599c1a22a853c9c7f9caf6c1eb79/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java#L246]
 might be a better place for the call to the guardrail. There you 
can easily get the table name and use it as the {{what}} argument, with 
something like:
{code:java}
Guardrails.pageSize.guard(pageSize, columnFamily(), state.getClientState());
{code}
By the way, that {{SelectStatement#columnFamily}} method uses the old 
terminology for tables, which were originally called column families. Maybe we 
can rename the method to {{table()}}.
{quote}I also wonder, should we care about too small a value for abort_threshold? 
If someone sets it to, for example, 100, Cassandra will crash.
{quote}
Why would it crash? High page limits can be a huge problem because they put too 
much pressure on memory, but IIRC queries with a very low page size are just 
slow, and could even make some sense if the rows are huge (although other 
guardrails will try to limit the size of rows).

> Guardrail for page size
> ---
>
> Key: CASSANDRA-17189
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17189
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Feature/Guardrails
>Reporter: Andres de la Peña
>Assignee: Bartlomiej
>Priority: Normal
>  Labels: AdventCalendar2021, lhf
> Fix For: 4.1
>
> Attachments: CASSANDRA-17189-trunk.txt
>
>
> Add guardrail limiting the query page size, for example:
> {code}
> # Guardrail to warn about or reject page sizes greater than threshold.
> # The two thresholds default to -1 to disable.
> page_size:
> warn_threshold: -1
> abort_threshold: -1
> {code}
> Initially this can be based on the specified number of rows used as page 
> size, although it would be ideal to also limit the actual size in bytes of 
> the returned pages.
> +Additional information for newcomers:+
> # Add the configuration for the new guardrail on page size in the guardrails 
> section of cassandra.yaml.
> # Add a getPageSize method in GuardrailsConfig returning a Threshold.Config 
> object
> # Implement that method in GuardrailsOptions, which is the default yaml-based 
> implementation of GuardrailsConfig
> # Add a Threshold guardrail named pageSize in Guardrails, using the 
> previously created config
> # Define JMX-friendly getters and setters for the previously created config 
> in GuardrailsMBean
> # Implement the JMX-friendly getters and setters in Guardrails
> # Now that we have the guardrail ready, it’s time to use it. We should search 
> for a place to invoke the Guardrails.pageSize#guard method with the page size 
> that each query is going to use. The DataLimits#forPaging methods look like 
> good candidates for this.
> # Finally, add some tests for the new guardrail. Given that the new guardrail 
> is a Threshold, our new test should probably extend ThresholdTester.
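
The warn/abort semantics described above can be sketched independently of the Cassandra classes. The following is a minimal illustration under stated assumptions: PageSizeThresholdSketch and its Outcome enum are hypothetical and do not reproduce Cassandra's Threshold or Guardrails API, any negative threshold is treated as disabled (the description only specifies -1), and a value triggers only when it is strictly greater than the threshold.

```java
/** Hypothetical stand-in for a threshold guardrail; not Cassandra's Threshold class. */
final class PageSizeThresholdSketch
{
    enum Outcome { OK, WARN, ABORT }

    private final int warnThreshold;  // negative disables the warning (assumption: -1 in the yaml)
    private final int abortThreshold; // negative disables the rejection

    PageSizeThresholdSketch(int warnThreshold, int abortThreshold)
    {
        this.warnThreshold = warnThreshold;
        this.abortThreshold = abortThreshold;
    }

    /** Classifies a requested page size; the abort check takes precedence over the warning. */
    Outcome check(int pageSize)
    {
        if (abortThreshold >= 0 && pageSize > abortThreshold)
            return Outcome.ABORT;
        if (warnThreshold >= 0 && pageSize > warnThreshold)
            return Outcome.WARN;
        return Outcome.OK;
    }
}
```

For example, with a warn threshold of 1000 and an abort threshold of 5000, a page size of 2000 yields WARN and 6000 yields ABORT, while both thresholds at -1 accept any page size.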






[jira] [Comment Edited] (CASSANDRA-14752) serializers/BooleanSerializer.java is using static bytebuffers which may cause problem for subsequent operations

2021-12-10 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17457212#comment-17457212
 ] 

Ekaterina Dimitrova edited comment on CASSANDRA-14752 at 12/10/21, 3:35 PM:


Thank you both. Actually, Benjamin made a pass and there was also another issue 
with the trunk patch that the tests didn't catch, but I got side-tracked and 
left this for a follow-up when there is more time to work on it without jumping 
between other tasks, as it is important to do it right. 

Quick suggestion: should we apply the 3.11 patch and fix the bug for our users, 
and open a follow-up ticket if you think it is really worth it to pursue 
something more at this point? [~marcuse], [~blerer], [~benedict], what do 
you think about that?


was (Author: e.dimitrova):
Thank you both, actually Benjamin made a pass and there was also another issue 
with the trunk that the tests didn't catch but I got side-tracked and left to 
follow up on this when there is more time to work on it and not jumping in 
between other tasks as it is important to do it right. 

Quick suggestion - should we apply the 3.11 patch and fix the bug for our users 
and open follow up ticket if you think it is really worth it to pursue 
something more at this point? [~marcuse] , [~blerer] , [~benedict] , what do 
you think about that?

> serializers/BooleanSerializer.java is using static bytebuffers which may 
> cause problem for subsequent operations
> 
>
> Key: CASSANDRA-14752
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14752
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core
>Reporter: Varun Barala
>Assignee: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 3.11.x, 4.0.x, 4.x
>
> Attachments: patch, patch-modified
>
>
> [https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/serializers/BooleanSerializer.java#L26]
>  It has two static ByteBuffer variables:
> {code:java}
> private static final ByteBuffer TRUE = ByteBuffer.wrap(new byte[]{1});
> private static final ByteBuffer FALSE = ByteBuffer.wrap(new byte[]{0});{code}
> What will happen if the position of these ByteBuffers is changed by 
> some other operation? It will affect subsequent operations. -IMO using 
> static is not a good idea here.-
> A potential place where it can become problematic: 
> [https://github.com/apache/cassandra/blob/cassandra-2.1.13/src/java/org/apache/cassandra/db/marshal/AbstractCompositeType.java#L243]
>  Since we are calling *`.remaining()`*, it may give wrong results (_i.e. 0_) if 
> these ByteBuffers have been used previously.
> Solution: 
>  
> [https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/serializers/BooleanSerializer.java#L42]
>  Every time, we return a new ByteBuffer object. Please do let me know if there 
> is a better way. I'd like to contribute. Thanks!!
> {code:java}
> public ByteBuffer serialize(Boolean value)
> {
>     return (value == null) ? ByteBufferUtil.EMPTY_BYTE_BUFFER
>                            : value ? ByteBuffer.wrap(new byte[] {1}) : ByteBuffer.wrap(new byte[] {0}); // false
> }
> {code}






[jira] [Comment Edited] (CASSANDRA-14752) serializers/BooleanSerializer.java is using static bytebuffers which may cause problem for subsequent operations

2021-12-10 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17457212#comment-17457212
 ] 

Ekaterina Dimitrova edited comment on CASSANDRA-14752 at 12/10/21, 3:33 PM:


Thank you both. Actually, Benjamin made a pass and there was also another issue 
with the trunk that the tests didn't catch, but I got side-tracked and left 
this for a follow-up when there is more time to work on it without jumping 
between other tasks, as it is important to do it right. 

Quick suggestion: should we apply the 3.11 patch and fix the bug for our users, 
and open a follow-up ticket if you think it is really worth it to pursue 
something more at this point? [~marcuse], [~blerer], [~benedict], what do 
you think about that?


was (Author: e.dimitrova):
Thank you both, actually Benjamin made a pass and there was also another issue 
that the tests didn't catch but I got side-tracked and left to follow up on 
this when there is more time to work on it and not jumping in between other 
tasks as it is important to do it right. 

Quick suggestion - should we apply the 3.11 patch and fix the bug for our users 
and open follow up ticket if you think it is really worth it to pursue 
something more at this point? [~marcuse] , [~blerer] , [~benedict] , what do 
you think about that?

> serializers/BooleanSerializer.java is using static bytebuffers which may 
> cause problem for subsequent operations
> 
>
> Key: CASSANDRA-14752
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14752
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core
>Reporter: Varun Barala
>Assignee: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 3.11.x, 4.0.x, 4.x
>
> Attachments: patch, patch-modified
>
>
> [https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/serializers/BooleanSerializer.java#L26]
>  It has two static ByteBuffer variables:
> {code:java}
> private static final ByteBuffer TRUE = ByteBuffer.wrap(new byte[]{1});
> private static final ByteBuffer FALSE = ByteBuffer.wrap(new byte[]{0});{code}
> What will happen if the position of these ByteBuffers is changed by 
> some other operation? It will affect subsequent operations. -IMO using 
> static is not a good idea here.-
> A potential place where it can become problematic: 
> [https://github.com/apache/cassandra/blob/cassandra-2.1.13/src/java/org/apache/cassandra/db/marshal/AbstractCompositeType.java#L243]
>  Since we are calling *`.remaining()`*, it may give wrong results (_i.e. 0_) if 
> these ByteBuffers have been used previously.
> Solution: 
>  
> [https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/serializers/BooleanSerializer.java#L42]
>  Every time, we return a new ByteBuffer object. Please do let me know if there 
> is a better way. I'd like to contribute. Thanks!!
> {code:java}
> public ByteBuffer serialize(Boolean value)
> {
>     return (value == null) ? ByteBufferUtil.EMPTY_BYTE_BUFFER
>                            : value ? ByteBuffer.wrap(new byte[] {1}) : ByteBuffer.wrap(new byte[] {0}); // false
> }
> {code}






[jira] [Commented] (CASSANDRA-14752) serializers/BooleanSerializer.java is using static bytebuffers which may cause problem for subsequent operations

2021-12-10 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17457212#comment-17457212
 ] 

Ekaterina Dimitrova commented on CASSANDRA-14752:
-

Thank you both. Actually, Benjamin made a pass and there was also another issue 
that the tests didn't catch, but I got side-tracked and left this for a 
follow-up when there is more time to work on it without jumping between other 
tasks, as it is important to do it right. 

Quick suggestion: should we apply the 3.11 patch and fix the bug for our users, 
and open a follow-up ticket if you think it is really worth it to pursue 
something more at this point? [~marcuse], [~blerer], [~benedict], what do 
you think about that?

> serializers/BooleanSerializer.java is using static bytebuffers which may 
> cause problem for subsequent operations
> 
>
> Key: CASSANDRA-14752
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14752
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core
>Reporter: Varun Barala
>Assignee: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 3.11.x, 4.0.x, 4.x
>
> Attachments: patch, patch-modified
>
>
> [https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/serializers/BooleanSerializer.java#L26]
>  It has two static Bytebuffer variables:-
> {code:java}
> private static final ByteBuffer TRUE = ByteBuffer.wrap(new byte[]{1});
> private static final ByteBuffer FALSE = ByteBuffer.wrap(new byte[]{0});{code}
> What will happen if the position of these Bytebuffers is being changed by 
> some other operations? It'll affect other subsequent operations. -IMO Using 
> static is not a good idea here.-
> A potential place where it can become problematic: 
> [https://github.com/apache/cassandra/blob/cassandra-2.1.13/src/java/org/apache/cassandra/db/marshal/AbstractCompositeType.java#L243]
>  Since we are calling *`.remaining()`* It may give wrong results _i.e 0_ if 
> these Bytebuffers have been used previously.
> Solution: 
>  
> [https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/serializers/BooleanSerializer.java#L42]
>  Every time we return new bytebuffer object. Please do let me know If there 
> is a better way. I'd like to contribute. Thanks!!
> {code:java}
> public ByteBuffer serialize(Boolean value)
> {
>     return (value == null) ? ByteBufferUtil.EMPTY_BYTE_BUFFER
>             : value ? ByteBuffer.wrap(new byte[] {1}) : ByteBuffer.wrap(new byte[] {0}); // false
> }
> {code}
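
The concern described above, a shared static buffer whose position is moved by one caller and then observed by the next, can be reproduced with plain java.nio. The following is only a minimal sketch and not the patch attached to this ticket: the class name SharedBufferSketch and the methods shared() and isolated() are made up for illustration, and handing out duplicate() views is shown as one possible alternative to allocating a new array per call.

```java
import java.nio.ByteBuffer;

final class SharedBufferSketch
{
    // Same shape as the static constant quoted in the ticket.
    static final ByteBuffer TRUE = ByteBuffer.wrap(new byte[]{ 1 });

    /** Hypothetical accessor: hands out the shared instance itself. */
    static ByteBuffer shared()
    {
        return TRUE;
    }

    /** Hypothetical accessor: hands out a view with its own position and limit. */
    static ByteBuffer isolated()
    {
        return TRUE.duplicate();
    }

    public static void main(String[] args)
    {
        ByteBuffer first = shared();
        first.get();                                  // relative read moves the shared position to 1
        System.out.println(shared().remaining());     // prints 0: the next caller sees an "empty" buffer

        TRUE.rewind();                                // reset the shared position for the second half
        ByteBuffer view = isolated();
        view.get();                                   // only the duplicate's position moves
        System.out.println(isolated().remaining());   // prints 1
    }
}
```

Note that duplicate() still shares the underlying bytes, so it only protects against position and limit changes, not against writes into the buffer.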






[jira] [Updated] (CASSANDRA-14752) serializers/BooleanSerializer.java is using static bytebuffers which may cause problem for subsequent operations

2021-12-10 Thread Ekaterina Dimitrova (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ekaterina Dimitrova updated CASSANDRA-14752:

Status: In Progress  (was: Patch Available)

> serializers/BooleanSerializer.java is using static bytebuffers which may 
> cause problem for subsequent operations
> 
>
> Key: CASSANDRA-14752
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14752
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core
>Reporter: Varun Barala
>Assignee: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 3.11.x, 4.0.x, 4.x
>
> Attachments: patch, patch-modified
>
>
> [https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/serializers/BooleanSerializer.java#L26]
>  It has two static Bytebuffer variables:-
> {code:java}
> private static final ByteBuffer TRUE = ByteBuffer.wrap(new byte[]{1});
> private static final ByteBuffer FALSE = ByteBuffer.wrap(new byte[]{0});{code}
> What will happen if the position of these Bytebuffers is being changed by 
> some other operations? It'll affect other subsequent operations. -IMO Using 
> static is not a good idea here.-
> A potential place where it can become problematic: 
> [https://github.com/apache/cassandra/blob/cassandra-2.1.13/src/java/org/apache/cassandra/db/marshal/AbstractCompositeType.java#L243]
>  Since we are calling *`.remaining()`* It may give wrong results _i.e 0_ if 
> these Bytebuffers have been used previously.
> Solution: 
>  
> [https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/serializers/BooleanSerializer.java#L42]
>  Every time we return new bytebuffer object. Please do let me know If there 
> is a better way. I'd like to contribute. Thanks!!
> {code:java}
> public ByteBuffer serialize(Boolean value)
> {
>     return (value == null) ? ByteBufferUtil.EMPTY_BYTE_BUFFER
>             : value ? ByteBuffer.wrap(new byte[] {1}) : ByteBuffer.wrap(new byte[] {0}); // false
> }
> {code}






[jira] [Updated] (CASSANDRA-17202) Avoid unnecessary String.format in QueryProcessor when getting stored prepared statement

2021-12-10 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-17202:
-
Change Category: Performance
 Complexity: Low Hanging Fruit
Component/s: Messaging/Client
  Fix Version/s: 4.x
   Priority: Low  (was: Normal)
 Status: Open  (was: Triage Needed)

> Avoid unnecessary String.format in QueryProcessor when getting stored 
> prepared statement 
> -
>
> Key: CASSANDRA-17202
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17202
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Messaging/Client
>Reporter: Ivan Senic
>Assignee: Ivan Senic
>Priority: Low
> Fix For: 4.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In the _QueryProcessor#getStoredPreparedStatement_ if the statement is found 
> in the prepared statements cache, there is always unnecessary string creation 
> using String.format in order to execute the _checkTrue_ assertion. The string 
> construction is necessary only when the queries are not equal.
> {code:java}
> public static ResultMessage.Prepared getStoredPreparedStatement(String queryString, String clientKeyspace)
> throws InvalidRequestException
> {
>     MD5Digest statementId = computeId(queryString, clientKeyspace);
>     Prepared existing = preparedStatements.getIfPresent(statementId);
>     if (existing == null)
>         return null;
>
>     checkTrue(queryString.equals(existing.rawCQLStatement),
>               String.format("MD5 hash collision: query with the same MD5 hash was already prepared. \n Existing: '%s'", existing.rawCQLStatement));
> {code}
> Hopefully the JIT can optimize this once the _checkTrue_ is inlined, but it's 
> getting on my nerves as it's popping up on my flame graphs all the time.
>  






[jira] [Updated] (CASSANDRA-17199) Provide summary of failed SessionInfo's in StreamResultFuture

2021-12-10 Thread Benjamin Lerer (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Lerer updated CASSANDRA-17199:
---
Mentor: Brandon Williams

> Provide summary of failed SessionInfo's in StreamResultFuture
> -
>
> Key: CASSANDRA-17199
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17199
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability/Logging
>Reporter: Brendan Cicchi
>Priority: Normal
> Fix For: 4.0.x
>
>
> Currently, we warn about the presence of one or more failed sessions existing 
> in the final state and then an operator/user traces back through the logs to 
> find any failed streams for troubleshooting.
> {code:java}
> private synchronized void maybeComplete()
> {
>     if (finishedAllSessions())
>     {
>         StreamState finalState = getCurrentState();
>         if (finalState.hasFailedSession())
>         {
>             logger.warn("[Stream #{}] Stream failed", planId);
>             tryFailure(new StreamException(finalState, "Stream failed"));
>         }
>         else
>         {
>             logger.info("[Stream #{}] All sessions completed", planId);
>             trySuccess(finalState);
>         }
>     }
> }
> {code}
> It would be helpful to log out a summary of the SessionInfo for each failed 
> session since that should be accessible via the StreamState.
>  
> This can be especially helpful for longer streaming operations like bootstrap, 
> where the failure could have happened a long time ago and all recent streams 
> leading up to the warning were actually successful.
>  
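
The kind of summary requested above could look roughly like the sketch below. It is deliberately self-contained and does not use Cassandra's own classes: the nested Session type (with peer and failed fields) is a hypothetical stand-in for SessionInfo, and summarizeFailed is an invented helper whose output could be appended to the existing "Stream failed" warning.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class FailedSessionSummarySketch
{
    /** Hypothetical stand-in for a stream session's info; not Cassandra's SessionInfo. */
    static final class Session
    {
        final String peer;
        final boolean failed;

        Session(String peer, boolean failed)
        {
            this.peer = peer;
            this.failed = failed;
        }
    }

    /** Builds a one-line summary listing only the peers whose session failed. */
    static String summarizeFailed(List<Session> sessions)
    {
        return sessions.stream()
                       .filter(s -> s.failed)
                       .map(s -> s.peer)
                       .collect(Collectors.joining(", ", "failed sessions: [", "]"));
    }

    public static void main(String[] args)
    {
        List<Session> sessions = Arrays.asList(new Session("10.0.0.1", false),
                                               new Session("10.0.0.2", true),
                                               new Session("10.0.0.3", false));
        // prints: failed sessions: [10.0.0.2]
        System.out.println(summarizeFailed(sessions));
    }
}
```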






[jira] [Commented] (CASSANDRA-17202) Avoid unnecessary String.format in QueryProcessor when getting stored prepared statement

2021-12-10 Thread Ivan Senic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17457170#comment-17457170
 ] 

Ivan Senic commented on CASSANDRA-17202:


PR: https://github.com/apache/cassandra/pull/1358

> Avoid unnecessary String.format in QueryProcessor when getting stored 
> prepared statement 
> -
>
> Key: CASSANDRA-17202
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17202
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Ivan Senic
>Assignee: Ivan Senic
>Priority: Normal
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In the _QueryProcessor#getStoredPreparedStatement_ if the statement is found 
> in the prepared statements cache, there is always unnecessary string creation 
> using String.format in order to execute the _checkTrue_ assertion. The string 
> construction is necessary only when the queries are not equal.
> {code:java}
> public static ResultMessage.Prepared getStoredPreparedStatement(String queryString, String clientKeyspace)
> throws InvalidRequestException
> {
>     MD5Digest statementId = computeId(queryString, clientKeyspace);
>     Prepared existing = preparedStatements.getIfPresent(statementId);
>     if (existing == null)
>         return null;
>
>     checkTrue(queryString.equals(existing.rawCQLStatement),
>               String.format("MD5 hash collision: query with the same MD5 hash was already prepared. \n Existing: '%s'", existing.rawCQLStatement));
> {code}
> Hopefully the JIT can optimize this once the _checkTrue_ is inlined, but it's 
> getting on my nerves as it's popping up on my flame graphs all the time.
>  






[jira] [Created] (CASSANDRA-17202) Avoid unnecessary String.format in QueryProcessor when getting stored prepared statement

2021-12-10 Thread Ivan Senic (Jira)
Ivan Senic created CASSANDRA-17202:
--

 Summary: Avoid unnecessary String.format in QueryProcessor when 
getting stored prepared statement 
 Key: CASSANDRA-17202
 URL: https://issues.apache.org/jira/browse/CASSANDRA-17202
 Project: Cassandra
  Issue Type: Improvement
Reporter: Ivan Senic
Assignee: Ivan Senic


In the _QueryProcessor#getStoredPreparedStatement_ if the statement is found in 
the prepared statements cache, there is always unnecessary string creation 
using String.format in order to execute the _checkTrue_ assertion. The string 
construction is necessary only when the queries are not equal.


{code:java}
public static ResultMessage.Prepared getStoredPreparedStatement(String queryString, String clientKeyspace)
throws InvalidRequestException
{
    MD5Digest statementId = computeId(queryString, clientKeyspace);
    Prepared existing = preparedStatements.getIfPresent(statementId);
    if (existing == null)
        return null;

    checkTrue(queryString.equals(existing.rawCQLStatement),
              String.format("MD5 hash collision: query with the same MD5 hash was already prepared. \n Existing: '%s'", existing.rawCQLStatement));
{code}
Hopefully the JIT can optimize this once the _checkTrue_ is inlined, but it's 
getting on my nerves as it's popping up on my flame graphs all the time.
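
The idea of building the message only on the failure path can be sketched as follows. This is a minimal, self-contained illustration and not necessarily what the linked PR does: the class LazyCheckSketch, the methods checkTrueEager and checkTrueLazy, the messagesBuilt counter, and the use of IllegalStateException in place of InvalidRequestException are all assumptions made for the example.

```java
import java.util.function.Supplier;

final class LazyCheckSketch
{
    /** Counts how many times the error message was actually formatted. */
    static int messagesBuilt = 0;

    /** Eager variant: the caller has already paid for the message string. */
    static void checkTrueEager(boolean expression, String message)
    {
        if (!expression)
            throw new IllegalStateException(message);
    }

    /** Lazy variant: the message is only formatted when the check fails. */
    static void checkTrueLazy(boolean expression, Supplier<String> message)
    {
        if (!expression)
            throw new IllegalStateException(message.get());
    }

    static String collisionMessage(String existing)
    {
        messagesBuilt++;
        return String.format("MD5 hash collision: query with the same MD5 hash was already prepared. \n Existing: '%s'", existing);
    }

    public static void main(String[] args)
    {
        String query = "SELECT * FROM ks.t WHERE id = ?";
        String existing = "SELECT * FROM ks.t WHERE id = ?";

        // Formats the message even though the check passes.
        checkTrueEager(query.equals(existing), collisionMessage(existing));
        // Skips the formatting entirely on the success path.
        checkTrueLazy(query.equals(existing), () -> collisionMessage(existing));

        // prints: messages built: 1
        System.out.println("messages built: " + messagesBuilt);
    }
}
```

An even simpler option is a plain if statement around the existing call, so that String.format is only reached when the two query strings differ.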

 






[jira] [Comment Edited] (CASSANDRA-17133) Broken test_timeuuid - upgrade_tests.cql_tests

2021-12-10 Thread Benjamin Lerer (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17457138#comment-17457138
 ] 

Benjamin Lerer edited comment on CASSANDRA-17133 at 12/10/21, 1:08 PM:
---

I had a look at the code and it seems that the mistake is mine. We did not need 
to add those new functions in CASSANDRA-17029, as the functionality was already 
there.
We had the following functions:
* toDate(timestamp)
* toTimestamp(date)
* mintimeuuid(timestamp)
* maxtimeuuid(timestamp)

Those functions were not overloaded, so they would be the ones picked up, 
but the code allows deserialization from an integer to the timestamp type or 
the date type.
As a consequence, adding new functions was not needed and in fact led to a 
slightly different behavior. What I would suggest is to remove the methods 
added by CASSANDRA-17029 and modify the tests to verify the behavior with int 
and long inputs.
Sorry for not realizing that sooner. :-( 


  


was (Author: blerer):
I had a look at the code and it seems that the mistake is mine. We did not need 
to add those new functions in CASSANDRA-17029, as the functionality was already 
there.
We had the following functions:
* toDate(timestamp)
* toTimestamp(date)
* mintimeuuid(timestamp)
* maxtimeuuid(timestamp)
Those functions were not overloaded, so they would be the ones picked up, 
but the code allows deserialization from an integer to the timestamp type or 
the date type.
As a consequence, adding new functions was not needed and in fact led to a 
slightly different behavior. What I would suggest is to remove the methods 
added by CASSANDRA-17029 and modify the tests to verify the behavior with int 
and long inputs.
Sorry for not realizing that sooner. :-( 


  

> Broken test_timeuuid - upgrade_tests.cql_tests
> --
>
> Key: CASSANDRA-17133
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17133
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL/Semantics
>Reporter: Yifan Cai
>Assignee: Kanthi Subramanian
>Priority: Normal
> Fix For: 4.x
>
>
> Both CircleCI and Jenkins build failed at test_timeuuid with the following 
> error.
> {quote}cassandra.InvalidRequest: Error from server: code=2200 [Invalid query] 
> message="Ambiguous call to function maxtimeuuid (can be matched by following 
> signatures: system.maxtimeuuid : (bigint) -> timeuuid, system.maxtimeuuid : 
> (timestamp) -> timeuuid): use type casts to disambiguate"{quote}
> https://app.circleci.com/pipelines/github/yifan-c/cassandra/273/workflows/7a855174-823a-4553-ad09-25623747a58e/jobs/1884/tests#failed-test-0
> https://ci-cassandra.apache.org/blue/organizations/jenkins/Cassandra-devbranch/detail/Cassandra-devbranch/1272/tests/
> The change was added in CASSANDRA-17029. 






[jira] [Commented] (CASSANDRA-17133) Broken test_timeuuid - upgrade_tests.cql_tests

2021-12-10 Thread Benjamin Lerer (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17457138#comment-17457138
 ] 

Benjamin Lerer commented on CASSANDRA-17133:


I had a look at the code and it seems that the mistake is mine. We did not need 
to add those new functions in CASSANDRA-17029, as the functionality was already 
there.
We had the following functions:
* toDate(timestamp)
* toTimestamp(date)
* mintimeuuid(timestamp)
* maxtimeuuid(timestamp)
Those functions were not overloaded, so they would be the ones picked up, 
but the code allows deserialization from an integer to the timestamp type or 
the date type.
As a consequence, adding new functions was not needed and in fact led to a 
slightly different behavior. What I would suggest is to remove the methods 
added by CASSANDRA-17029 and modify the tests to verify the behavior with int 
and long inputs.
Sorry for not realizing that sooner. :-( 


  

> Broken test_timeuuid - upgrade_tests.cql_tests
> --
>
> Key: CASSANDRA-17133
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17133
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL/Semantics
>Reporter: Yifan Cai
>Assignee: Kanthi Subramanian
>Priority: Normal
> Fix For: 4.x
>
>
> Both CircleCI and Jenkins build failed at test_timeuuid with the following 
> error.
> {quote}cassandra.InvalidRequest: Error from server: code=2200 [Invalid query] 
> message="Ambiguous call to function maxtimeuuid (can be matched by following 
> signatures: system.maxtimeuuid : (bigint) -> timeuuid, system.maxtimeuuid : 
> (timestamp) -> timeuuid): use type casts to disambiguate"{quote}
> https://app.circleci.com/pipelines/github/yifan-c/cassandra/273/workflows/7a855174-823a-4553-ad09-25623747a58e/jobs/1884/tests#failed-test-0
> https://ci-cassandra.apache.org/blue/organizations/jenkins/Cassandra-devbranch/detail/Cassandra-devbranch/1272/tests/
> The change was added in CASSANDRA-17029. 






[jira] [Updated] (CASSANDRA-17201) Arrow to SSTable converter

2021-12-10 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-17201:
-
Resolution: Invalid
Status: Resolved  (was: Triage Needed)

I am not sure what your question is, but arrow-to-sstable is obviously feasible 
since it exists (https://github.com/datastax/sstable-to-arrow/) but is not 
associated with the Apache Cassandra project.

> Arrow to SSTable converter
> --
>
> Key: CASSANDRA-17201
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17201
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Michaël Ughetto
>Priority: Normal
>
> Hi,
> I hope this is the right project to ask for this. Recently, this project to 
> convert SSTables to Arrow made it possible to analyse Cassandra data on GPU:
> [https://developer.nvidia.com/blog/analyzing-cassandra-data-using-gpus-part-2/]
> I'm wondering if an Arrow to SSTable converter would be feasible? In practice, 
> we envision using it to quickly process our parquet files on GPU and upload 
> them faster to Cassandra.
> I also brought this up with the RAPIDS team and Datastax:
> [https://github.com/rapidsai/cudf/issues/9811]
> [https://github.com/datastax/sstable-to-arrow/issues/1]
> Cheers,
> Michaël
>  






[jira] [Updated] (CASSANDRA-14752) serializers/BooleanSerializer.java is using static bytebuffers which may cause problem for subsequent operations

2021-12-10 Thread Marcus Eriksson (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-14752:

Status: Patch Available  (was: Ready to Commit)

yeah my bad, only checked the 3.11 patch, assumed they were the same

> serializers/BooleanSerializer.java is using static bytebuffers which may 
> cause problem for subsequent operations
> 
>
> Key: CASSANDRA-14752
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14752
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core
>Reporter: Varun Barala
>Assignee: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 3.11.x, 4.0.x, 4.x
>
> Attachments: patch, patch-modified
>
>
> [https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/serializers/BooleanSerializer.java#L26]
>  It has two static Bytebuffer variables:-
> {code:java}
> private static final ByteBuffer TRUE = ByteBuffer.wrap(new byte[]{1});
> private static final ByteBuffer FALSE = ByteBuffer.wrap(new byte[]{0});{code}
> What will happen if the position of these Bytebuffers is being changed by 
> some other operations? It'll affect other subsequent operations. -IMO Using 
> static is not a good idea here.-
> A potential place where it can become problematic: 
> [https://github.com/apache/cassandra/blob/cassandra-2.1.13/src/java/org/apache/cassandra/db/marshal/AbstractCompositeType.java#L243]
>  Since we are calling *`.remaining()`* It may give wrong results _i.e 0_ if 
> these Bytebuffers have been used previously.
> Solution: 
>  
> [https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/serializers/BooleanSerializer.java#L42]
>  Every time we return new bytebuffer object. Please do let me know If there 
> is a better way. I'd like to contribute. Thanks!!
> {code:java}
> public ByteBuffer serialize(Boolean value)
> {
>     return (value == null) ? ByteBufferUtil.EMPTY_BYTE_BUFFER
>             : value ? ByteBuffer.wrap(new byte[] {1}) : ByteBuffer.wrap(new byte[] {0}); // false
> }
> {code}






[jira] [Commented] (CASSANDRA-14752) serializers/BooleanSerializer.java is using static bytebuffers which may cause problem for subsequent operations

2021-12-10 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17457109#comment-17457109
 ] 

Benedict Elliott Smith commented on CASSANDRA-14752:


This patch could potentially have serious performance consequences, by making 
many call-sites megamorphic that were previously bimorphic for clusters using 
e.g. offheap_buffers (and perhaps offheap_objects)...

> serializers/BooleanSerializer.java is using static bytebuffers which may 
> cause problem for subsequent operations
> 
>
> Key: CASSANDRA-14752
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14752
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core
>Reporter: Varun Barala
>Assignee: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 3.11.x, 4.0.x, 4.x
>
> Attachments: patch, patch-modified
>
>
> [https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/serializers/BooleanSerializer.java#L26]
>  It has two static Bytebuffer variables:-
> {code:java}
> private static final ByteBuffer TRUE = ByteBuffer.wrap(new byte[]{1});
> private static final ByteBuffer FALSE = ByteBuffer.wrap(new byte[]{0});{code}
> What will happen if the position of these Bytebuffers is being changed by 
> some other operations? It'll affect other subsequent operations. -IMO Using 
> static is not a good idea here.-
> A potential place where it can become problematic: 
> [https://github.com/apache/cassandra/blob/cassandra-2.1.13/src/java/org/apache/cassandra/db/marshal/AbstractCompositeType.java#L243]
>  Since we are calling *`.remaining()`* It may give wrong results _i.e 0_ if 
> these Bytebuffers have been used previously.
> Solution: 
>  
> [https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/serializers/BooleanSerializer.java#L42]
>  Every time we return new bytebuffer object. Please do let me know If there 
> is a better way. I'd like to contribute. Thanks!!
> {code:java}
> public ByteBuffer serialize(Boolean value)
> {
>     return (value == null) ? ByteBufferUtil.EMPTY_BYTE_BUFFER
>             : value ? ByteBuffer.wrap(new byte[] {1}) : ByteBuffer.wrap(new byte[] {0}); // false
> }
> {code}






[jira] [Comment Edited] (CASSANDRA-17140) Broken test_rolling_upgrade - upgrade_tests.upgrade_through_versions_test.TestUpgrade_indev_3_0_x_To_indev_4_0_x

2021-12-10 Thread Berenguer Blasi (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17457026#comment-17457026
 ] 

Berenguer Blasi edited comment on CASSANDRA-17140 at 12/10/21, 10:16 AM:
-

Ok so this is a bit crazy. I went back as far as April in 3.0 and 
{{upgrade_tests/upgrade_through_versions_test.py::TestProtoV3Upgrade_AllVersions_EndsAt_3_11_X::test_rolling_upgrade}}
 will fail locally, whereas other tests like the ones above start failing at 
15252. So some tests will start failing sooner or later depending on which one 
you run.

Maybe my initial intuition that this has something to do with dtests was not 
wrong. Maybe we need to bisect with matching java and dtest versions from that 
time. Other than that, I can only think of deep-diving into 15252 on 4.0 for 
the test that reliably switches from pass to fail at that commit. Maybe that 
will reveal more info, as that is the only pass -> fail transition we know of 
at the moment.


was (Author: bereng):
Ok so this is a bit crazy. I went back as far as April in 3.0 and 
{{upgrade_tests/upgrade_through_versions_test.py::TestProtoV3Upgrade_AllVersions_EndsAt_3_11_X::test_rolling_upgrade}}
 will fail locally, whereas other tests like the ones above start failing at 
15252. So some tests will start failing sooner or later depending on which one 
you run.

Maybe my initial intuition that this has something to do with dtests was not 
wrong. Maybe we need to bisect with matching java and dtest versions from that 
time. Other than that, I can only think of deep-diving into 15252 on 4.0 for 
the test that reliably switches from pass to fail at that commit. Maybe that 
will reveal more info.

> Broken test_rolling_upgrade - 
> upgrade_tests.upgrade_through_versions_test.TestUpgrade_indev_3_0_x_To_indev_4_0_x
> 
>
> Key: CASSANDRA-17140
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17140
> Project: Cassandra
>  Issue Type: Bug
>  Components: CI
>Reporter: Yifan Cai
>Priority: Normal
> Fix For: 4.0.x
>
>
> The tests "test_rolling_upgrade" fail with the below error. 
>  
> [https://app.circleci.com/pipelines/github/yifan-c/cassandra/279/workflows/6340cd42-0b27-42c2-8418-9f8b56c57bea/jobs/1990]
>  
> I am always able to reproduce it by running the test locally too. 
> {{$ pytest --execute-upgrade-tests-only --upgrade-target-version-only 
> --upgrade-version-selection all --cassandra-version=4.0 
> upgrade_tests/upgrade_through_versions_test.py::TestUpgrade_indev_3_11_x_To_indev_4_0_x::test_rolling_upgrade}}
>  
> {code:java}
> self = 
>   object at 0x7ffba4242fd0>
> subprocs = [, 
> ]
> def _check_on_subprocs(self, subprocs):
>     """
>     Check on given subprocesses.
>
>     If any are not alive, we'll go ahead and terminate any remaining
>     alive subprocesses since this test is going to fail.
>     """
>     subproc_statuses = [s.is_alive() for s in subprocs]
>     if not all(subproc_statuses):
>         message = "A subprocess has terminated early. Subprocess statuses: "
>         for s in subprocs:
>             message += "{name} (is_alive: {aliveness}), ".format(name=s.name, aliveness=s.is_alive())
>         message += "attempting to terminate remaining subprocesses now."
>         self._terminate_subprocs()
> >       raise RuntimeError(message)
> E       RuntimeError: A subprocess has terminated early. Subprocess statuses: Process-1 (is_alive: True), Process-2 (is_alive: False), attempting to terminate remaining subprocesses now.
> {code}






[jira] [Commented] (CASSANDRA-17140) Broken test_rolling_upgrade - upgrade_tests.upgrade_through_versions_test.TestUpgrade_indev_3_0_x_To_indev_4_0_x

2021-12-10 Thread Berenguer Blasi (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17457026#comment-17457026
 ] 

Berenguer Blasi commented on CASSANDRA-17140:
-

Ok so this is a bit crazy. I went back as far as April in 3.0 and 
{{upgrade_tests/upgrade_through_versions_test.py::TestProtoV3Upgrade_AllVersions_EndsAt_3_11_X::test_rolling_upgrade}}
 will fail locally, whereas other tests like the ones above start failing at 
15252. So some tests will start failing sooner or later depending on which one 
you run.

Maybe my initial intuition that this has something to do with dtests was not 
wrong. Maybe we need to bisect with matching java and dtest versions from that 
time. Other than that, I can only think of deep-diving into 15252 on 4.0 for 
the test that reliably switches from pass to fail at that commit. Maybe that 
will reveal more info.

> Broken test_rolling_upgrade - 
> upgrade_tests.upgrade_through_versions_test.TestUpgrade_indev_3_0_x_To_indev_4_0_x
> 
>
> Key: CASSANDRA-17140
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17140
> Project: Cassandra
>  Issue Type: Bug
>  Components: CI
>Reporter: Yifan Cai
>Priority: Normal
> Fix For: 4.0.x
>
>
> The tests "test_rolling_upgrade" fail with the below error. 
>  
> [https://app.circleci.com/pipelines/github/yifan-c/cassandra/279/workflows/6340cd42-0b27-42c2-8418-9f8b56c57bea/jobs/1990]
>  
> I am always able to reproduce it by running the test locally too. 
> {{$ pytest --execute-upgrade-tests-only --upgrade-target-version-only 
> --upgrade-version-selection all --cassandra-version=4.0 
> upgrade_tests/upgrade_through_versions_test.py::TestUpgrade_indev_3_11_x_To_indev_4_0_x::test_rolling_upgrade}}
>  
> {code:java}
> self = 
>   object at 0x7ffba4242fd0>
> subprocs = [, 
> ]
> def _check_on_subprocs(self, subprocs):
>     """
>     Check on given subprocesses.
>
>     If any are not alive, we'll go ahead and terminate any remaining
>     alive subprocesses since this test is going to fail.
>     """
>     subproc_statuses = [s.is_alive() for s in subprocs]
>     if not all(subproc_statuses):
>         message = "A subprocess has terminated early. Subprocess statuses: "
>         for s in subprocs:
>             message += "{name} (is_alive: {aliveness}), ".format(name=s.name, aliveness=s.is_alive())
>         message += "attempting to terminate remaining subprocesses now."
>         self._terminate_subprocs()
> >       raise RuntimeError(message)
> E       RuntimeError: A subprocess has terminated early. Subprocess statuses: Process-1 (is_alive: True), Process-2 (is_alive: False), attempting to terminate remaining subprocesses now.
> {code}






[jira] [Created] (CASSANDRA-17201) Arrow to SSTable converter

2021-12-10 Thread Michaël Ughetto (Jira)
Michaël Ughetto created CASSANDRA-17201:
---

 Summary: Arrow to SSTable converter
 Key: CASSANDRA-17201
 URL: https://issues.apache.org/jira/browse/CASSANDRA-17201
 Project: Cassandra
  Issue Type: New Feature
Reporter: Michaël Ughetto


Hi,

I hope this is the right project to ask for this. Recently, this project to 
convert SSTables to Arrow made it possible to analyse Cassandra data on GPU:
[https://developer.nvidia.com/blog/analyzing-cassandra-data-using-gpus-part-2/]

I'm wondering if an Arrow to SSTable converter would be feasible? In practice, 
we envision using it to quickly process our parquet files on GPU and upload 
them faster to Cassandra.

I also brought this up with the RAPIDS team and Datastax:
[https://github.com/rapidsai/cudf/issues/9811]

[https://github.com/datastax/sstable-to-arrow/issues/1]

Cheers,

Michaël

 






[jira] [Updated] (CASSANDRA-14752) serializers/BooleanSerializer.java is using static bytebuffers which may cause problem for subsequent operations

2021-12-10 Thread Marcus Eriksson (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-14752:

Status: Ready to Commit  (was: Review In Progress)

> serializers/BooleanSerializer.java is using static bytebuffers which may 
> cause problem for subsequent operations
> 
>
> Key: CASSANDRA-14752
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14752
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core
>Reporter: Varun Barala
>Assignee: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 3.11.x, 4.0.x, 4.x
>
> Attachments: patch, patch-modified
>
>
> [https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/serializers/BooleanSerializer.java#L26]
> It has two static ByteBuffer variables:
> {code:java}
> private static final ByteBuffer TRUE = ByteBuffer.wrap(new byte[]{1});
> private static final ByteBuffer FALSE = ByteBuffer.wrap(new byte[]{0});
> {code}
> What happens if some other operation changes the position of these 
> ByteBuffers? Every subsequent operation that reads them is affected. -IMO 
> using static is not a good idea here.-
> A potential place where it can become problematic: 
> [https://github.com/apache/cassandra/blob/cassandra-2.1.13/src/java/org/apache/cassandra/db/marshal/AbstractCompositeType.java#L243]
> Since we call *`.remaining()`* there, it may return a wrong result (_i.e._ 0) 
> if these ByteBuffers have already been read from.
> Solution: 
>  
> [https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/serializers/BooleanSerializer.java#L42]
> Return a new ByteBuffer object on every call. Please let me know if there 
> is a better way. I'd like to contribute. Thanks!!
> {code:java}
> public ByteBuffer serialize(Boolean value)
> {
>     return (value == null) ? ByteBufferUtil.EMPTY_BYTE_BUFFER
>          : value ? ByteBuffer.wrap(new byte[] {1})  // true
>                  : ByteBuffer.wrap(new byte[] {0}); // false
> }
> {code}
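
The concern in the description can be reproduced outside Cassandra with plain java.nio. The sketch below is only an illustration, not code from the ticket or its patch: the class and method names are hypothetical, and the use of duplicate() is an assumption about one possible alternative to allocating a new array on each call. It shows that a relative get() on a shared buffer moves its position, so remaining() drops to 0, while a read through a duplicate leaves the shared buffer's position untouched.

```java
import java.nio.ByteBuffer;

// Hypothetical demo class; not part of Cassandra.
public class SharedBufferDemo
{
    // A relative read on the shared buffer itself consumes it for everyone.
    static int remainingAfterRelativeRead()
    {
        ByteBuffer shared = ByteBuffer.wrap(new byte[]{ 1 });
        shared.get();               // relative get: position moves from 0 to 1
        return shared.remaining();  // limit - position
    }

    // A duplicate shares the bytes but keeps its own position and limit.
    static int remainingAfterReadOnDuplicate()
    {
        ByteBuffer shared = ByteBuffer.wrap(new byte[]{ 1 });
        ByteBuffer view = shared.duplicate();
        view.get();                 // only the duplicate's position moves
        return shared.remaining();  // the shared buffer is still unread
    }

    public static void main(String[] args)
    {
        System.out.println("after relative read on shared buffer: " + remainingAfterRelativeRead());
        System.out.println("after relative read on a duplicate:   " + remainingAfterReadOnDuplicate());
    }
}
```

Note that duplicate() still shares the backing array, so it guards against position changes only, not against a caller overwriting the byte itself.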






[jira] [Commented] (CASSANDRA-14752) serializers/BooleanSerializer.java is using static bytebuffers which may cause problem for subsequent operations

2021-12-10 Thread Marcus Eriksson (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17456940#comment-17456940
 ] 

Marcus Eriksson commented on CASSANDRA-14752:
-

lgtm, +1



