date:20190402

[jira] [Commented] (CASSANDRA-13357) A possible NPE in nodetool getendpoints

2019-04-02 Thread Eduard Tudenhoefner (JIRA)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-13357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16807536#comment-16807536
 ] 

Eduard Tudenhoefner commented on CASSANDRA-13357:
-

LGTM

> A possible NPE in nodetool getendpoints
> ---
>
> Key: CASSANDRA-13357
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13357
> Project: Cassandra
> Issue Type: Bug
> Components: Tool/nodetool
> Reporter: Hao Zhong
> Assignee: Hao Zhong
> Priority: Normal
> Fix For: 4.x
>
> Attachments: cassandra.patch
>
>
> The GetEndpoints.execute method has the following code:
> {code:title=GetEndpoints.java|borderStyle=solid}
> List<InetAddress> endpoints = probe.getEndpoints(ks, table, key);
> for (InetAddress endpoint : endpoints)
> {
>     System.out.println(endpoint.getHostAddress());
> }
> {code}
> This code can throw an NPE. A similar bug was fixed in CASSANDRA-8950. The buggy 
> code was:
> {code:title=NodeCmd.java|borderStyle=solid}
> List<InetAddress> endpoints = this.probe.getEndpoints(keySpace, cf, key);
> for (InetAddress anEndpoint : endpoints)
> {
>     output.println(anEndpoint.getHostAddress());
> }
> {code}
> The fixed code is:
> {code:title=NodeCmd.java|borderStyle=solid}
> try
> {
>     List<InetAddress> endpoints = probe.getEndpoints(keySpace, cf, key);
>     for (InetAddress anEndpoint : endpoints)
>         output.println(anEndpoint.getHostAddress());
> }
> catch (IllegalArgumentException ex)
> {
>     output.println(ex.getMessage());
>     probe.failed();
> }
> {code}
> The GetEndpoints.execute method should be modified in the same way as in CASSANDRA-8950.
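
For illustration only, here is a minimal, self-contained sketch of the guard pattern described in the ticket. EndpointSource and GetEndpointsSketch are hypothetical stand-ins and not the real nodetool classes; the sketch simply mirrors the try/catch shape of the quoted CASSANDRA-8950 fix and collects the lines that would be printed.

```java
import java.net.InetAddress;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the nodetool probe; not the real NodeProbe API.
interface EndpointSource
{
    List<InetAddress> getEndpoints(String keyspace, String table, String key);
}

final class GetEndpointsSketch
{
    // Returns the lines that would be printed, so the behaviour is easy to check.
    static List<String> describe(EndpointSource probe, String ks, String table, String key)
    {
        List<String> lines = new ArrayList<>();
        try
        {
            List<InetAddress> endpoints = probe.getEndpoints(ks, table, key);
            for (InetAddress endpoint : endpoints)
                lines.add(endpoint.getHostAddress());
        }
        catch (IllegalArgumentException ex)
        {
            // Same shape as the quoted fix: report the message instead of crashing.
            lines.add(ex.getMessage());
        }
        return lines;
    }
}
```

Returning the lines instead of printing them is a choice made for this sketch only; the real command writes to standard output and marks the probe as failed.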



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Created] (CASSANDRA-15073) Apache NetBeans project files

2019-04-02 Thread mck (JIRA)

mck created CASSANDRA-15073:
---

 Summary: Apache NetBeans project files
 Key: CASSANDRA-15073
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15073
 Project: Cassandra
  Issue Type: Task
  Components: Build
Reporter: mck


Provide the project files necessary to open the Cassandra project in 
Apache NetBeans.

No additional project functionality is required beyond being able to edit the 
project's source files. Building the project is still expected to be done via 
`ant` on the command-line.




[jira] [Assigned] (CASSANDRA-15073) Apache NetBeans project files

2019-04-02 Thread mck (JIRA)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck reassigned CASSANDRA-15073:
---

Assignee: mck

> Apache NetBeans project files
> -
>
> Key: CASSANDRA-15073
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15073
> Project: Cassandra
> Issue Type: Task
> Components: Build
> Reporter: mck
> Assignee: mck
> Priority: Low
>
> Provide the project files necessary to open the Cassandra project 
> in Apache NetBeans.
> No additional project functionality is required beyond being able to edit the 
> project's source files. Building the project is still expected to be done via 
> `ant` on the command-line.




[jira] [Commented] (CASSANDRA-15073) Apache NetBeans project files

2019-04-02 Thread mck (JIRA)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-15073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16807567#comment-16807567
 ] 

mck commented on CASSANDRA-15073:
-

Patch in progress at 
https://github.com/thelastpickle/cassandra/tree/mck/trunk_15073


[jira] [Updated] (CASSANDRA-15073) Apache NetBeans project files

2019-04-02 Thread Benedict (JIRA)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-15073:
-
 Complexity: Low Hanging Fruit
Change Category: Parent values: Quality Assurance(12981)
 Status: Open  (was: Triage Needed)


[jira] [Updated] (CASSANDRA-15073) Apache NetBeans project files

2019-04-02 Thread Benedict (JIRA)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-15073:
-
Change Category: Quality Assurance


[jira] [Commented] (CASSANDRA-15005) Configurable whitelist for UDFs

2019-04-02 Thread Jon Meredith (JIRA)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-15005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16807919#comment-16807919
 ] 

Jon Meredith commented on CASSANDRA-15005:
--

Thanks for the docs and the additional tests - your modifications look good to 
me. I'll find somebody to review it and then we'll have to work out where to 
park it until trunk opens up for feature contributions.

Do you have any plans to use it before it lands in a public release?

> Configurable whitelist for UDFs
> ---
>
> Key: CASSANDRA-15005
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15005
> Project: Cassandra
> Issue Type: Improvement
> Components: CQL/Interpreter
> Reporter: A. Soroka
> Priority: Low
>
> I would like to use the UDF system to distribute some simple calculations on 
> values. For some use cases, this would require access only to some Java API 
> classes that aren't on the (hardcoded) whitelist (e.g. 
> {{java.security.MessageDigest}}). In other cases, it would require access to 
> a little non-C* library code, pre-distributed to nodes by out-of-band means.
> As I understand the situation now, the whitelist for types UDFs can use is 
> hardcoded in Java in 
> [UDFunction|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/cql3/functions/UDFunction.java#L99].
> This ticket, then, is a request for a facility that would allow that list to 
> be extended via some kind of deployment-time configuration. I realize that 
> serious security concerns immediately arise for this kind of functionality, 
> but I hope that by restricting it (only used during startup, not exposing the 
> whitelist for introspection, etc.) it could be quite practical.
> I'd like very much to assist with this ticket if it is accepted. (I believe I 
> have sufficient Java skill to do that, but no real familiarity with C*'s 
> codebase, yet. :) )
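
As a rough sketch of what the request amounts to, the class below merges a built-in prefix list with extra patterns supplied once at startup. UdfWhitelistSketch, its constructor argument, and the two built-in entries are assumptions made for illustration; the real list and matching logic live in UDFunction.java and may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class UdfWhitelistSketch
{
    // Illustrative built-in entries only; not copied from UDFunction.java.
    private static final List<String> BUILT_IN = Arrays.asList("java/lang/", "java/util/");

    private final List<String> allowed;

    // extraPatterns would come from deployment-time configuration, read once at startup.
    UdfWhitelistSketch(List<String> extraPatterns)
    {
        List<String> merged = new ArrayList<>(BUILT_IN);
        merged.addAll(extraPatterns);
        // The merged list is immutable, so it cannot be changed after startup.
        this.allowed = Collections.unmodifiableList(merged);
    }

    // A resource name is allowed when it starts with any whitelisted prefix.
    boolean isAllowed(String resourceName)
    {
        for (String prefix : allowed)
            if (resourceName.startsWith(prefix))
                return true;
        return false;
    }
}
```

With an empty extension list, a name such as java/security/MessageDigest.class would be rejected by this sketch; adding the prefix java/security/MessageDigest through configuration would admit it without opening the rest of java/security.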




[jira] [Commented] (CASSANDRA-15013) Message Flusher queue can grow unbounded, potentially running JVM out of memory

2019-04-02 Thread Sumanth Pasupuleti (JIRA)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-15013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16807988#comment-16807988
 ] 

Sumanth Pasupuleti commented on CASSANDRA-15013:


Thanks for the feedback [~benedict]

+1 on unconditionally enqueuing the message to the executor when we 
setAutoRead(false), and throwing OverloadedException each time a message is 
discarded. In other words, we would never discard a message if the client chose 
to go with backpressure option, rather we just setAutoRead(false) and process 
the message.

Regarding the in-flight per-endpoint limit and the application identifier: I 
like the suggestion, as it offers better guarantees on throttling client 
instances. However, I propose cutting a separate ticket for that work, and 
keeping the scope limited for this current ticket.
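
To make the agreed behaviour concrete, here is a small sketch of the decision being discussed. InflightLimiterSketch and its Action values are hypothetical names, not classes from the patch; the sketch only models the rule that a client who opted into backpressure never has a message discarded, while other clients get an overloaded error once the limit is exceeded.

```java
final class InflightLimiterSketch
{
    enum Action { PROCESS, PROCESS_AND_PAUSE_READS, DISCARD_WITH_OVERLOADED }

    private final long maxInflightBytes;
    private long inflightBytes;

    InflightLimiterSketch(long maxInflightBytes)
    {
        this.maxInflightBytes = maxInflightBytes;
    }

    // Decide what to do with an incoming request of the given size.
    synchronized Action onRequest(long requestBytes, boolean clientWantsBackpressure)
    {
        boolean overLimit = inflightBytes + requestBytes > maxInflightBytes;
        if (overLimit && !clientWantsBackpressure)
            return Action.DISCARD_WITH_OVERLOADED; // caller would answer with an overloaded error

        // On this path the message is always enqueued, even when over the limit.
        inflightBytes += requestBytes;
        // Over the limit with backpressure: the caller would also stop reading
        // from the channel (the setAutoRead(false) step mentioned above).
        return overLimit ? Action.PROCESS_AND_PAUSE_READS : Action.PROCESS;
    }

    // Called when a response has been flushed; returns true if reads may resume.
    synchronized boolean onResponse(long requestBytes)
    {
        inflightBytes -= requestBytes;
        return inflightBytes <= maxInflightBytes;
    }
}
```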

> Message Flusher queue can grow unbounded, potentially running JVM out of 
> memory
> ---
>
> Key: CASSANDRA-15013
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15013
> Project: Cassandra
> Issue Type: Bug
> Components: Messaging/Client
> Reporter: Sumanth Pasupuleti
> Assignee: Sumanth Pasupuleti
> Priority: Normal
> Labels: pull-request-available
> Fix For: 4.0, 3.0.x, 3.11.x
>
> Attachments: BlockedEpollEventLoopFromHeapDump.png, 
> BlockedEpollEventLoopFromThreadDump.png, RequestExecutorQueueFull.png, heap 
> dump showing each ImmediateFlusher taking upto 600MB.png
>
>
> This is a follow-up ticket out of CASSANDRA-14855, to make the Flusher queue 
> bounded. In the current state, items get added to the queue without any check 
> on the queue size or on the isWritable state of the netty outbound buffer.
> We are seeing this issue hit our production 3.0 clusters quite often.




[jira] [Commented] (CASSANDRA-15013) Message Flusher queue can grow unbounded, potentially running JVM out of memory

2019-04-02 Thread Benedict (JIRA)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-15013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16807994#comment-16807994
 ] 

Benedict commented on CASSANDRA-15013:
--

{quote}In other words, we would never discard a message if the client chose to 
go with backpressure option
{quote}
+1
{quote}I propose cutting a separate ticket for that work, and keeping the scope 
limited for this current ticket
{quote}
How about a middle ground: we implement the per-endpoint (IP address) limit 
(which would be easily generalised to incorporate an application identifier) in 
this patch. That way the logical behaviour of the message control flow isn't 
really revisited later; the follow-up patch only has to change the inputs and 
introduce any client API changes.

I personally have a preference for trying to get all of the logical semantics 
settled in the first patch, though I'm not deeply wed to that.
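
One way to picture a limit that is keyed by IP address now and by application identifier later is a limiter that is generic in its key type. This is only a sketch under that assumption; PerKeyInflightSketch and its methods are hypothetical and do not come from the patch.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

// K is the throttling key: an IP address today, possibly an application identifier later.
final class PerKeyInflightSketch<K>
{
    private final long maxBytesPerKey;
    private final ConcurrentMap<K, AtomicLong> inflight = new ConcurrentHashMap<>();

    PerKeyInflightSketch(long maxBytesPerKey)
    {
        this.maxBytesPerKey = maxBytesPerKey;
    }

    // Reserve capacity for a request; false means this key is over its limit.
    boolean tryAcquire(K key, long bytes)
    {
        AtomicLong counter = inflight.computeIfAbsent(key, k -> new AtomicLong());
        while (true)
        {
            long current = counter.get();
            if (current + bytes > maxBytesPerKey)
                return false;
            // Retry if another thread changed the counter in the meantime.
            if (counter.compareAndSet(current, current + bytes))
                return true;
        }
    }

    // Release capacity once the response for the request has been written.
    void release(K key, long bytes)
    {
        AtomicLong counter = inflight.get(key);
        if (counter != null)
            counter.addAndGet(-bytes);
    }
}
```

Because only the key type changes, switching from addresses to application identifiers would leave the acquire and release logic untouched, which is the point of settling the semantics in the first patch.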


[jira] [Commented] (CASSANDRA-15072) Incomplete range results during 2.X -> 3.11.4 upgrade

2019-04-02 Thread Blake Eggleston (JIRA)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-15072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16808019#comment-16808019
 ] 

Blake Eggleston commented on CASSANDRA-15072:
-

This is a great repro script, thanks. 

A couple of observations:
 * test.test has 2 columns, and uses compact storage, which shouldn’t be 
possible
 * node1 & node3 are the replicas of the missing partition (we’re querying from 
the un-upgraded node2, for those following along).
 * doing a point read ({{select * from test.test where id='1';}}) returns the 
expected partition
 * using LIMIT 2 instead of PAGING 2 has the same problem
 * LIMIT 3 returns a partial row: {{1 | there | null}}
 * LIMIT 4 returns the entire row: {{1 | there |  hi}}

Tables with compact storage can only have a single column, so you shouldn’t be 
able to create a compact storage table with 2 columns. Instead of throwing an 
error though, it seems like it just silently treats the table as a normal 
table. This might be why no one has noticed that our ddl validation is broken.

It looks like the mixed mode read path is treating the table as a proper 
compact storage table though, and treating each cell as a row, which is why you 
see partial rows start to appear as you increase the limit. If you remove 
compact storage from the ddl, or only use a single column, everything works 
normally.
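
The hypothesis above can be checked against the observed LIMIT results with a toy model. The sketch below is not Cassandra's read path; CellLimitSketch is a hypothetical class that applies a limit counting individual cells instead of whole rows, using the two rows from the repro in the order the range query returned them.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class CellLimitSketch
{
    // Rows from the repro, in result order; columns are id, bar, foo.
    static final String[][] ROWS = { { "2", "there", "hi" }, { "1", "there", "hi" } };

    // Apply a limit that counts individual cells rather than whole rows.
    static List<String> readCountingCells(int limit)
    {
        Map<String, String[]> result = new LinkedHashMap<>();
        int counted = 0;
        for (String[] row : ROWS)
        {
            // Each non-key cell (bar, then foo) consumes one unit of the limit.
            for (int col = 1; col <= 2 && counted < limit; col++, counted++)
                result.computeIfAbsent(row[0], id -> new String[] { id, "null", "null" })[col] = row[col];
        }
        List<String> lines = new ArrayList<>();
        for (String[] r : result.values())
            lines.add(String.join(" | ", r));
        return lines;
    }
}
```

In this model a limit of 2 yields only row 2, a limit of 3 adds a partial row 1 with a null foo, and a limit of 4 returns both rows in full, which matches the observations listed in this comment.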

I'll think on the best way to address this.

> Incomplete range results during 2.X -> 3.11.4 upgrade
> -
>
> Key: CASSANDRA-15072
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15072
> Project: Cassandra
> Issue Type: Bug
> Reporter: Muir Manders
> Assignee: Blake Eggleston
> Priority: High
> Attachments: eriksw-repro.sh
>
>
> Hello
> During an upgrade from 2.1.17 to 3.11.4, our application started getting 
> back incomplete results for range queries. When all nodes were upgraded 
> (before upgrading sstables), we stopped getting incomplete results. I was 
> able to reproduce it and listed steps below. It seems to require the random 
> partitioner and compact storage to reproduce reliably. It also reproduces 
> coming from 2.1.21 and 2.2.14. You seem to get the bad behavior when an old 
> node is your coordinator and it has to talk to an upgraded replica.
> {noformat}
> ccm create test -v 2.1.17 -n 3
> ccm updateconf 'partitioner: org.apache.cassandra.dht.RandomPartitioner'
> ccm node1 updateconf 'initial_token: 0'
> ccm node2 updateconf 'initial_token: 56713727820156410577229101238628035242'
> ccm node3 updateconf 'initial_token: 113427455640312821154458202477256070484'
> ccm start
> ccm node1 cqlsh <<SCHEMA
> CREATE KEYSPACE test WITH REPLICATION = {'class': 'SimpleStrategy', 'replication_factor': 3};
> CREATE COLUMNFAMILY test.test (
>   id text,
>   foo text,
>   bar text,
>   PRIMARY KEY (id)
> ) WITH COMPACT STORAGE;
> CONSISTENCY QUORUM;
> INSERT INTO test.test (id, foo, bar) values ('1', 'hi', 'there');
> INSERT INTO test.test (id, foo, bar) values ('2', 'hi', 'there');
> SCHEMA
> ccm node1 stop
> ccm node1 setdir -v 3.11.4
> ccm node1 start
> ccm node2 stop
> ccm node2 setdir -v 3.11.4
> ccm node2 start
> # here I use 3.X cqlsh to connect to 2.X node so I can lower the page size (to
> # allow for simpler test setup)
> cqlsh 127.0.0.3 <<QUERY
> CONSISTENCY QUORUM;
> PAGING 2;
> select * from test.test;
> QUERY
> {noformat}
> This results in:
> {noformat}
> Page size: 2
>  id | bar   | foo
> ----+-------+-----
>   2 | there |  hi
> (1 rows)
> {noformat}
> Running it against the upgraded node (node1):
> {noformat}
> Page size: 2
>  id | bar   | foo
> ----+-------+-----
>   2 | there |  hi
>   1 | there |  hi
> (2 rows)
> {noformat}




[jira] [Commented] (CASSANDRA-15072) Incomplete range results during 2.X -> 3.11.4 upgrade

2019-04-02 Thread Peter Sanford (JIRA)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-15072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16808049#comment-16808049
 ] 

Peter Sanford commented on CASSANDRA-15072:
---

{quote}Tables with compact storage can only have a single column, so you 
shouldn’t be able to create a compact storage table with 2 columns.
{quote}
According to [http://cassandra.apache.org/doc/latest/cql/ddl.html] that 
restriction is only for tables with clustering columns:
{quote}if a compact table has at least one clustering column, then it must have 
exactly one column outside of the primary key ones.
{quote}
We have a lot of tables created from Thrift (compact storage) that do not have 
clustering columns and have more than one column in the CQL schema.
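
Read literally, the quoted documentation sentence can be written as a small check. CompactStorageRuleSketch is a hypothetical helper that encodes only that one sentence; it is not Cassandra's actual DDL validation, which may apply further rules.

```java
final class CompactStorageRuleSketch
{
    // Returns null when the definition satisfies the quoted rule, otherwise a reason.
    static String validate(int clusteringColumns, int columnsOutsidePrimaryKey)
    {
        // The restriction applies only when there is at least one clustering column.
        if (clusteringColumns >= 1 && columnsOutsidePrimaryKey != 1)
            return "a compact table with clustering columns must have exactly one column outside the primary key";
        return null;
    }
}
```

Under this reading, the test.test table from the repro (no clustering columns, two columns outside the primary key) is valid, which is consistent with the Thrift-created tables described here.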


[jira] [Commented] (CASSANDRA-15072) Incomplete range results during 2.X -> 3.11.4 upgrade

2019-04-02 Thread Muir Manders (JIRA)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-15072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16808060#comment-16808060
 ] 

Muir Manders commented on CASSANDRA-15072:
--

[https://docs.datastax.com/en/cql/3.3/cql/cql_using/useCompactStorage.html] 
also explicitly states the implied inverse:
{quote}
A compact table with a primary key that is not compound can have multiple 
columns that are not part of the primary key.
{quote}


[jira] [Commented] (CASSANDRA-15005) Configurable whitelist for UDFs

2019-04-02 Thread A. Soroka (JIRA)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-15005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16808074#comment-16808074
 ] 

A. Soroka commented on CASSANDRA-15005:
---

I'm not sure whether I'll be using it before a release, because I plan to use 
it experimentally this spring, but I don't know when there will be a new 
Cassandra release. (Soon I hope! :grin:) Production use of this feature would 
be many, many months away for me. I can't imagine that happening before a 
release, but I know very little about the larger schedule.


[jira] [Commented] (CASSANDRA-15072) Incomplete range results during 2.X -> 3.11.4 upgrade

2019-04-02 Thread Blake Eggleston (JIRA)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-15072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16808114#comment-16808114
 ] 

Blake Eggleston commented on CASSANDRA-15072:
-

Huh, I did not know that. I guess that makes sense though. So then this is just 
an upgrade bug.


[jira] [Commented] (CASSANDRA-15072) Incomplete range results during 2.X -> 3.11.4 upgrade

2019-04-02 Thread Muir Manders (JIRA)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-15072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16808142#comment-16808142
 ] 

Muir Manders commented on CASSANDRA-15072:
--

Thanks for helping us investigate this issue. Do you think you understand the 
exact cause at this point?

{quote}It looks like the mixed mode read path is treating the table as a proper 
compact storage table though, and treating each cell as a row
{quote}
Does "mixed mode" refer to the mixed 2.X <=> 3.X cassandra versions?

From a high level it sounds like a 2.X coordinator and a 3.X replica have some 
confusion regarding compact storage cells vs. rows, and how many are needed to 
satisfy a limit or page quota. Is that still what you think is going on?

> Incomplete range results during 2.X -> 3.11.4 upgrade
> -
>
> Key: CASSANDRA-15072
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15072
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Muir Manders
>Assignee: Blake Eggleston
>Priority: High
> Attachments: eriksw-repro.sh
>
>
> Hello
> During an upgrade from 2.1.17 to 3.11.4, our application starting getting 
> back incomplete results for range queries. When all nodes were upgraded 
> (before upgrading sstables), we stopped getting incomplete results. I was 
> able to reproduce it and listed steps below. It seems to require the random 
> partitioner and compact storage to reproduce reliably. It also reproduces 
> coming from 2.1.21 and 2.2.14. You seem to get the bad behavior when an old 
> node is your coordinator and it has to talk to an upgraded replica.
> {noformat}
> ccm create test -v 2.1.17 -n 3
> ccm updateconf 'partitioner: org.apache.cassandra.dht.RandomPartitioner'
> ccm node1 updateconf 'initial_token: 0'
> ccm node2 updateconf 'initial_token: 56713727820156410577229101238628035242'
> ccm node3 updateconf 'initial_token: 113427455640312821154458202477256070484'
> ccm start
> ccm node1 cqlsh < CREATE KEYSPACE test WITH REPLICATION = {'class': 'SimpleStrategy', 
> 'replication_factor': 3};
> CREATE COLUMNFAMILY test.test (
>   id text,
>   foo text,
>   bar text,
>   PRIMARY KEY (id)
> ) WITH COMPACT STORAGE;
> CONSISTENCY QUORUM;
> INSERT INTO test.test (id, foo, bar) values ('1', 'hi', 'there');
> INSERT INTO test.test (id, foo, bar) values ('2', 'hi', 'there');
> SCHEMA
> ccm node1 stop
> ccm node1 setdir -v 3.11.4
> ccm node1 start
> ccm node2 stop
> ccm node2 setdir -v 3.11.4
> ccm node2 start
> # here I use 3.X cqlsh to connect to 2.X node so I can lower the page size (to
> # allow for simpler test setup)
> cqlsh 127.0.0.3 < CONSISTENCY QUORUM;
> PAGING 2;
> select * from test.test;
> QUERY
> {noformat}
> This results in:
> {noformat}
> Page size: 2
>  id | bar   | foo
> +---+-
>   2 | there |  hi
> (1 rows)
> {noformat}
> Running it against the upgraded node (node1):
> {noformat}
> Page size: 2
>  id | bar   | foo
> ----+-------+-----
>   2 | there |  hi
>   1 | there |  hi
> (2 rows)
> {noformat}
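
The three initial_token values in the reproduction steps are consistent with dividing the RandomPartitioner token range (taken here as 0 to 2^127) into three equal parts. A minimal sketch of that arithmetic follows; the helper name evenly_spaced_tokens is hypothetical and is not part of ccm or Cassandra.

```python
# Illustrative only: evenly_spaced_tokens is a hypothetical helper, not a ccm
# or Cassandra API. It assumes RandomPartitioner tokens span 0 .. 2**127.
RANDOM_PARTITIONER_RANGE = 2 ** 127


def evenly_spaced_tokens(node_count):
    """Return one token per node, spaced by floor(2**127 / node_count)."""
    step = RANDOM_PARTITIONER_RANGE // node_count
    return [i * step for i in range(node_count)]


tokens = evenly_spaced_tokens(3)
for node, token in enumerate(tokens, start=1):
    # Prints the same style of command used in the reproduction steps.
    print(f"ccm node{node} updateconf 'initial_token: {token}'")
```

For a three-node cluster this yields the same tokens as the ones assigned to node1, node2 and node3 in the steps above.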




[jira] [Updated] (CASSANDRA-15074) Allow table property defaults (e.g. compaction, compression) to be specified for a cluster/keyspace

2019-04-02 Thread Joseph Lynch (JIRA)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15074:
-
Summary: Allow table property defaults (e.g. compaction, compression) to be 
specified for a cluster/keyspace  (was: Allow table property defaults (e.g. 
compaction, compression) to be specified in schema)

> Allow table property defaults (e.g. compaction, compression) to be specified 
> for a cluster/keyspace
> ---
>
> Key: CASSANDRA-15074
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15074
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Cluster/Schema
>Reporter: Joseph Lynch
>Priority: Low
> Fix For: 4.x
>
>
> During an IRC discussion in 
> [cassandra-dev|https://wilderness.apache.org/channels/?f=cassandra-dev/2019-04-02#1554224083]
>  it was proposed that we could have table property defaults stored on a 
> Keyspace or globally within the cluster. For example, this would allow users 
> to specify "All new tables on this cluster should default to LCS with SSTable 
> size of 320MiB" or "all new tables in Keyspace XYZ should have Zstd 
> compression with an 8 KiB block size" or "default_time_to_live should default 
> to 3 days" etc ... This way operators can choose the default that makes sense 
> for their organization once (e.g. LCS if they are running on fast SSDs), 
> rather than requiring developers creating the Keyspaces/Tables to make the 
> decision on every creation (often without context of which choices are right).
> A few implementation options were discussed including:
>  * A YAML option
>  * Schema provided at the Keyspace level that would be inherited by any 
> tables automatically
>  * Schema provided at the Cluster level that would be inherited by any 
> Keyspaces or Tables automatically
> In IRC it appears that rough consensus was found in having global -> keyspace 
> -> table defaults which would be stored in schema (no YAML configuration 
> since this isn't node level really, it's a cluster level config).




[jira] [Created] (CASSANDRA-15074) Allow table property defaults (e.g. compaction, compression) to be specified in schema

2019-04-02 Thread Joseph Lynch (JIRA)

Joseph Lynch created CASSANDRA-15074:


 Summary: Allow table property defaults (e.g. compaction, 
compression) to be specified in schema
 Key: CASSANDRA-15074
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15074
 Project: Cassandra
  Issue Type: Improvement
  Components: Cluster/Schema
Reporter: Joseph Lynch
 Fix For: 4.x


During an IRC discussion in 
[cassandra-dev|https://wilderness.apache.org/channels/?f=cassandra-dev/2019-04-02#1554224083]
 it was proposed that we could have table property defaults stored on a 
Keyspace or globally within the cluster. For example, this would allow users to 
specify "All new tables on this cluster should default to LCS with SSTable size 
of 320MiB" or "all new tables in Keyspace XYZ should have Zstd compression 
with an 8 KiB block size" or "default_time_to_live should default to 3 days" etc 
... This way operators can choose the default that makes sense for their 
organization once (e.g. LCS if they are running on fast SSDs), rather than 
requiring developers creating the Keyspaces/Tables to make the decision on 
every creation (often without context of which choices are right).

A few implementation options were discussed including:
 * A YAML option
 * Schema provided at the Keyspace level that would be inherited by any tables 
automatically
 * Schema provided at the Cluster level that would be inherited by any 
Keyspaces or Tables automatically

In IRC it appears that rough consensus was found in having global -> keyspace 
-> table defaults which would be stored in schema (no YAML configuration since 
this isn't node level really, it's a cluster level config).
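
The global -> keyspace -> table precedence described above could be modelled roughly as follows. This is only a sketch of the lookup order: the function resolve_table_properties and the property names in the dictionaries are hypothetical and do not correspond to any existing Cassandra schema API.

```python
# Hypothetical model of "global -> keyspace -> table" defaults. None of these
# names exist in Cassandra; they only illustrate the proposed precedence.
def resolve_table_properties(cluster_defaults, keyspace_defaults, table_properties):
    """Merge property layers; more specific layers override broader ones."""
    resolved = dict(cluster_defaults)      # broadest: cluster-wide defaults
    resolved.update(keyspace_defaults)     # keyspace overrides the cluster
    resolved.update(table_properties)      # explicit table settings win
    return resolved


cluster_defaults = {
    "compaction": "LeveledCompactionStrategy",
    "default_time_to_live_days": 3,
}
keyspace_defaults = {"compression": "Zstd", "chunk_length_kib": 8}
table_properties = {"default_time_to_live_days": 0}

resolved = resolve_table_properties(cluster_defaults, keyspace_defaults, table_properties)
print(resolved)
```

A table that sets nothing explicitly would inherit the keyspace values, and a keyspace that sets nothing would fall back to the cluster-wide values.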




[jira] [Commented] (CASSANDRA-15072) Incomplete range results during 2.X -> 3.11.4 upgrade

2019-04-02 Thread Blake Eggleston (JIRA)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-15072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16808193#comment-16808193
 ] 

Blake Eggleston commented on CASSANDRA-15072:
-

No problem. Yes, mixed mode just means you're upgrading your cluster.

I don't know the exact cause, but you've summarized what I think is probably 
happening. Specifically, the legacy read path on the 3.0 nodes is probably 
always interpreting single cells as rows for compact storage tables, even ones 
without clustering columns.





[jira] [Updated] (CASSANDRA-15073) Apache NetBeans project files

2019-04-02 Thread mck (JIRA)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck updated CASSANDRA-15073:

Test and Documentation Plan: 
To test:
{code}
git clone --single-branch --branch mck/trunk_15073 https://github.com/thelastpickle/cassandra.git
{code}
and open the project in NetBeans.

(You might have to open the `ide/` folder for it to be recognised as a project.)
 Status: Patch Available  (was: In Progress)

Patch complete.

> Apache NetBeans project files
> -
>
> Key: CASSANDRA-15073
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15073
> Project: Cassandra
>  Issue Type: Task
>  Components: Build
>Reporter: mck
>Assignee: mck
>Priority: Low
>
> Provide the necessary project files so that the Cassandra project can be 
> opened in Apache NetBeans.
> No additional project functionality is required beyond being able to edit the 
> project's source files. Building the project is still expected to be done via 
> `ant` on the command-line.




[jira] [Comment Edited] (CASSANDRA-15073) Apache NetBeans project files

2019-04-02 Thread mck (JIRA)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-15073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16808267#comment-16808267
 ] 

mck edited comment on CASSANDRA-15073 at 4/3/19 12:52 AM:
--

patch complete.

to test:
{code}
git clone --single-branch --branch mck/trunk_15073 g...@github.com:thelastpickle/cassandra.git
cd cassandra
ant 
{code}
then open in NetBeans. (Open the {{ide/}} subfolder for it to be recognised as a 
project.)


was (Author: michaelsembwever):
patch complete.





[jira] [Updated] (CASSANDRA-15073) Apache NetBeans project files

2019-04-02 Thread mck (JIRA)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck updated CASSANDRA-15073:

Change Category:   (was: Quality Assurance)





[jira] [Updated] (CASSANDRA-15073) Apache NetBeans project files

2019-04-02 Thread mck (JIRA)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck updated CASSANDRA-15073:

Status: In Progress  (was: Patch Available)





[jira] [Updated] (CASSANDRA-14654) Reduce heap pressure during compactions

2019-04-02 Thread Dinesh Joshi (JIRA)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dinesh Joshi updated CASSANDRA-14654:
-
Status: Review In Progress  (was: Patch Available)

> Reduce heap pressure during compactions
> ---
>
> Key: CASSANDRA-14654
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14654
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Compaction
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Normal
>  Labels: Performance, pull-request-available
> Fix For: 4.x
>
> Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png, 
> screenshot-4.png
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Small partition compactions are painfully slow, with a lot of overhead per 
> partition. There also tends to be an excess of objects created (i.e. 
> 200-700 MB/s) per compaction thread.
> The EncodingStats walks through all the partitions, and with mergeWith it 
> creates a new instance per partition as it walks the potentially millions of 
> partitions. In a test scenario of about 600-byte partitions and a couple 
> hundred MB of data, this accounted for ~16% of the heap pressure. Changing 
> this to instead mutably track the min values and create a single instance in 
> an EncodingStats.Collector brought this down considerably (but not 100%, since 
> UnfilteredRowIterator.stats() still creates one per partition).
> The KeyCacheKey makes a full copy of the underlying byte array in 
> ByteBufferUtil.getArray in its constructor. This becomes the dominating 
> source of heap pressure as the number of sstables grows. Changing this to 
> just keep the original completely eliminates the current dominator of the 
> compactions and also improves read performance.
> A minor tweak is also included for operators whose compactions are behind on 
> low-read clusters: the preemptive opening setting becomes a hot property 
> (hotprop).
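
The difference between merging immutable stats once per partition and tracking the minimums in a mutable collector can be illustrated with a small model. This is a conceptual sketch and not Cassandra's code: the Stats and Collector classes below are hypothetical stand-ins, written in Python rather than Java, and the two tracked fields are chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stats:
    """Hypothetical immutable stand-in for per-partition encoding stats."""
    min_timestamp: int
    min_ttl: int

    def merge_with(self, other):
        # Immutable merge: every call allocates a new Stats object.
        return Stats(min(self.min_timestamp, other.min_timestamp),
                     min(self.min_ttl, other.min_ttl))


class Collector:
    """Hypothetical mutable collector: tracks minimums, builds one Stats."""

    def __init__(self):
        self.min_timestamp = None
        self.min_ttl = None

    def update(self, timestamp, ttl):
        # Update in place; no intermediate Stats objects are created.
        if self.min_timestamp is None or timestamp < self.min_timestamp:
            self.min_timestamp = timestamp
        if self.min_ttl is None or ttl < self.min_ttl:
            self.min_ttl = ttl

    def get(self):
        return Stats(self.min_timestamp, self.min_ttl)


partitions = [(50, 7), (20, 9), (90, 3)]  # (timestamp, ttl) per partition

# Approach 1: a new merged Stats is allocated for every partition visited.
merged = Stats(*partitions[0])
for ts, ttl in partitions[1:]:
    merged = merged.merge_with(Stats(ts, ttl))

# Approach 2: values are tracked in place and one Stats is built at the end.
collector = Collector()
for ts, ttl in partitions:
    collector.update(ts, ttl)
collected = collector.get()

print(merged == collected)
```

Both approaches arrive at the same minimums; the second simply avoids creating a short-lived object for each partition walked.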




[jira] [Commented] (CASSANDRA-14654) Reduce heap pressure during compactions

2019-04-02 Thread Dinesh Joshi (JIRA)



[ 
https://issues.apache.org/jira/browse/CASSANDRA-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16808384#comment-16808384
 ] 

Dinesh Joshi commented on CASSANDRA-14654:
--

Hi [~cnlwsu], I went over the PR once more and the latest set of changes looks 
good to me.





[jira] [Updated] (CASSANDRA-14654) Reduce heap pressure during compactions

2019-04-02 Thread Dinesh Joshi (JIRA)



 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dinesh Joshi updated CASSANDRA-14654:
-
Status: Ready to Commit  (was: Review In Progress)





