[jira] [Updated] (CASSANDRA-14654) Reduce heap pressure during compactions
[ https://issues.apache.org/jira/browse/CASSANDRA-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dinesh Joshi updated CASSANDRA-14654:
-------------------------------------
    Status: Ready to Commit  (was: Review In Progress)

> Reduce heap pressure during compactions
> ---------------------------------------
>
>                 Key: CASSANDRA-14654
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14654
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local/Compaction
>            Reporter: Chris Lohfink
>            Assignee: Chris Lohfink
>            Priority: Normal
>              Labels: Performance, pull-request-available
>             Fix For: 4.x
>
>         Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png, screenshot-4.png
>
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> Small partition compactions are painfully slow, with a lot of overhead per
> partition. There also tends to be an excess of objects created (i.e.
> 200-700 MB/s) per compaction thread.
> The EncodingStats tracking walks through all the partitions, and with
> mergeWith it creates a new instance per partition as it walks the potentially
> millions of partitions. In a test scenario with partitions of about 600 bytes
> and a couple hundred MB of data, this accounted for ~16% of the heap pressure.
> Changing this to track the min values mutably and create a single instance in
> an EncodingStats.Collector brought this down considerably (but not by 100%,
> since UnfilteredRowIterator.stats() still creates one per partition).
> The KeyCacheKey makes a full copy of the underlying byte array via
> ByteBufferUtil.getArray in its constructor. This becomes the dominant source
> of heap pressure as the number of sstables grows. Keeping the original array
> instead completely eliminates the current dominator of compaction heap
> pressure and also improves read performance.
> A minor tweak is also included for operators whose compactions are behind on
> low-read clusters: the preemptive opening setting becomes a hot property
> (hotprop).
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
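The allocation pattern described in CASSANDRA-14654 can be illustrated with a small sketch. This is not the Cassandra code (which is Java); `Stats`, `merge_with` and `StatsCollector` are hypothetical stand-ins, and only two minimum fields are modelled. It assumes the fix works roughly as the ticket says: fold each partition's values into one mutable accumulator and build a single immutable result at the end.

```python
import sys
from dataclasses import dataclass


@dataclass(frozen=True)
class Stats:
    """Immutable per-partition minimums (hypothetical stand-in, not Cassandra's class)."""
    min_timestamp: int
    min_ttl: int

    def merge_with(self, other: "Stats") -> "Stats":
        # Allocates a brand-new Stats on every call, i.e. once per partition.
        return Stats(min(self.min_timestamp, other.min_timestamp),
                     min(self.min_ttl, other.min_ttl))


class StatsCollector:
    """Mutable accumulator: updates fields in place and allocates one result at the end."""

    def __init__(self) -> None:
        self.min_timestamp = sys.maxsize
        self.min_ttl = sys.maxsize

    def update(self, stats: Stats) -> None:
        # No allocation here; only two integer comparisons per partition.
        if stats.min_timestamp < self.min_timestamp:
            self.min_timestamp = stats.min_timestamp
        if stats.min_ttl < self.min_ttl:
            self.min_ttl = stats.min_ttl

    def get(self) -> Stats:
        return Stats(self.min_timestamp, self.min_ttl)


def merge_immutably(partitions):
    """Old pattern: one intermediate Stats object per partition."""
    merged = partitions[0]
    for stats in partitions[1:]:
        merged = merged.merge_with(stats)
    return merged


def merge_with_collector(partitions):
    """New pattern: a single accumulator, a single final Stats object."""
    collector = StatsCollector()
    for stats in partitions:
        collector.update(stats)
    return collector.get()
```

Both functions return the same result; the difference is only in how many short-lived objects are created along the way. The per-partition `Stats` inputs still exist in this sketch, which mirrors the ticket's caveat that one object per partition is still created upstream.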
[jira] [Updated] (CASSANDRA-14654) Reduce heap pressure during compactions
[ https://issues.apache.org/jira/browse/CASSANDRA-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dinesh Joshi updated CASSANDRA-14654:
-------------------------------------
    Status: Review In Progress  (was: Patch Available)
[jira] [Commented] (CASSANDRA-14654) Reduce heap pressure during compactions
[ https://issues.apache.org/jira/browse/CASSANDRA-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808384#comment-16808384 ]

Dinesh Joshi commented on CASSANDRA-14654:
------------------------------------------

Hi [~cnlwsu], I went over the PR once more and the latest set of changes look good to me.
[jira] [Updated] (CASSANDRA-15073) Apache NetBeans project files
[ https://issues.apache.org/jira/browse/CASSANDRA-15073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

mck updated CASSANDRA-15073:
----------------------------
    Status: In Progress  (was: Patch Available)

> Apache NetBeans project files
> -----------------------------
>
>                 Key: CASSANDRA-15073
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15073
>             Project: Cassandra
>          Issue Type: Task
>          Components: Build
>            Reporter: mck
>            Assignee: mck
>            Priority: Low
>
> Provide the project files necessary to open the Cassandra project in
> Apache NetBeans.
> No additional project functionality is required beyond being able to edit the
> project's source files. Building the project is still expected to be done via
> `ant` on the command-line.
[jira] [Updated] (CASSANDRA-15073) Apache NetBeans project files
[ https://issues.apache.org/jira/browse/CASSANDRA-15073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

mck updated CASSANDRA-15073:
----------------------------
    Change Category:   (was: Quality Assurance)
[jira] [Comment Edited] (CASSANDRA-15073) Apache NetBeans project files
[ https://issues.apache.org/jira/browse/CASSANDRA-15073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808267#comment-16808267 ]

mck edited comment on CASSANDRA-15073 at 4/3/19 12:52 AM:
----------------------------------------------------------

patch complete.

to test:
{code}
git clone --single-branch --branch mck/trunk_15073 g...@github.com:thelastpickle/cassandra.git
cd cassandra
ant
{code}
then open in netbeans. (open the {{ide/}} subfolder for it to be recognised as a project.)

was (Author: michaelsembwever):
patch complete.
[jira] [Updated] (CASSANDRA-15073) Apache NetBeans project files
[ https://issues.apache.org/jira/browse/CASSANDRA-15073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

mck updated CASSANDRA-15073:
----------------------------
    Test and Documentation Plan:
to test:
{code}
git clone https://github.com/thelastpickle:mck/trunk_15073
{code}
and open the project in netbeans. (might have to open the `ide/` folder for it to be recognised as a project.)
                         Status: Patch Available  (was: In Progress)

patch complete.
[jira] [Commented] (CASSANDRA-15072) Incomplete range results during 2.X -> 3.11.4 upgrade
[ https://issues.apache.org/jira/browse/CASSANDRA-15072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808193#comment-16808193 ]

Blake Eggleston commented on CASSANDRA-15072:
---------------------------------------------

No problem. Yes, mixed mode just means you're upgrading your cluster. I don't know the exact cause, but you've summarized what I think is probably happening. Specifically, the legacy read path on the 3.0 nodes is probably always interpreting single cells as rows for compact storage tables, even ones without clustering columns.

> Incomplete range results during 2.X -> 3.11.4 upgrade
> -----------------------------------------------------
>
>                 Key: CASSANDRA-15072
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15072
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Muir Manders
>            Assignee: Blake Eggleston
>            Priority: High
>         Attachments: eriksw-repro.sh
>
>
> Hello
> During an upgrade from 2.1.17 to 3.11.4, our application started getting
> back incomplete results for range queries. When all nodes were upgraded
> (before upgrading sstables), we stopped getting incomplete results. I was
> able to reproduce it and listed steps below. It seems to require the random
> partitioner and compact storage to reproduce reliably. It also reproduces
> coming from 2.1.21 and 2.2.14. You seem to get the bad behavior when an old
> node is your coordinator and it has to talk to an upgraded replica.
> {noformat}
> ccm create test -v 2.1.17 -n 3
> ccm updateconf 'partitioner: org.apache.cassandra.dht.RandomPartitioner'
> ccm node1 updateconf 'initial_token: 0'
> ccm node2 updateconf 'initial_token: 56713727820156410577229101238628035242'
> ccm node3 updateconf 'initial_token: 113427455640312821154458202477256070484'
> ccm start
> ccm node1 cqlsh <<SCHEMA
> CREATE KEYSPACE test WITH REPLICATION = {'class': 'SimpleStrategy', 'replication_factor': 3};
> CREATE COLUMNFAMILY test.test (
>   id text,
>   foo text,
>   bar text,
>   PRIMARY KEY (id)
> ) WITH COMPACT STORAGE;
> CONSISTENCY QUORUM;
> INSERT INTO test.test (id, foo, bar) values ('1', 'hi', 'there');
> INSERT INTO test.test (id, foo, bar) values ('2', 'hi', 'there');
> SCHEMA
> ccm node1 stop
> ccm node1 setdir -v 3.11.4
> ccm node1 start
> ccm node2 stop
> ccm node2 setdir -v 3.11.4
> ccm node2 start
> # here I use 3.X cqlsh to connect to 2.X node so I can lower the page size (to
> # allow for simpler test setup)
> cqlsh 127.0.0.3 <<QUERY
> CONSISTENCY QUORUM;
> PAGING 2;
> select * from test.test;
> QUERY
> {noformat}
> This results in:
> {noformat}
> Page size: 2
>  id | bar   | foo
> ----+-------+-----
>   2 | there |  hi
> (1 rows)
> {noformat}
> Running it against the upgraded node (node1):
> {noformat}
> Page size: 2
>  id | bar   | foo
> ----+-------+-----
>   2 | there |  hi
>   1 | there |  hi
> (2 rows)
> {noformat}
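The hypothesis in this thread (each cell of a compact storage table being counted as a row against the limit or page size) can be checked against the reported symptoms with a toy model. This is only an illustrative simulation, not Cassandra's read path; the function names and the in-memory `PARTITIONS` list are assumptions made for the sketch, and the partition order is taken from the query output above.

```python
from typing import Dict, List, Optional, Tuple

Row = Tuple[str, Optional[str], Optional[str]]  # (id, bar, foo)

# Partitions in the order the range query in the ticket returned them.
PARTITIONS: List[Tuple[str, Dict[str, str]]] = [
    ("2", {"bar": "there", "foo": "hi"}),
    ("1", {"bar": "there", "foo": "hi"}),
]


def scan_counting_rows(limit: int) -> List[Row]:
    """Expected behaviour: each partition counts once against the limit."""
    rows: List[Row] = []
    for key, cells in PARTITIONS:
        if len(rows) == limit:
            break
        rows.append((key, cells.get("bar"), cells.get("foo")))
    return rows


def scan_counting_cells(limit: int) -> List[Row]:
    """Hypothesised behaviour: every cell consumes one unit of the limit."""
    rows: List[Row] = []
    budget = limit
    for key, cells in PARTITIONS:
        if budget == 0:
            break
        kept: Dict[str, str] = {}
        for name in sorted(cells):  # "bar" sorts before "foo"
            if budget == 0:
                break
            kept[name] = cells[name]
            budget -= 1
        rows.append((key, kept.get("bar"), kept.get("foo")))
    return rows
```

Under this model a limit of 2 is exhausted by the two cells of partition `2`, a limit of 3 yields a partial row for partition `1`, and a limit of 4 returns both rows in full, which is consistent with the LIMIT 2, 3 and 4 observations reported elsewhere in this thread.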
[jira] [Created] (CASSANDRA-15074) Allow table property defaults (e.g. compaction, compression) to be specified in schema
Joseph Lynch created CASSANDRA-15074:
----------------------------------------

             Summary: Allow table property defaults (e.g. compaction, compression) to be specified in schema
                 Key: CASSANDRA-15074
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15074
             Project: Cassandra
          Issue Type: Improvement
          Components: Cluster/Schema
            Reporter: Joseph Lynch
             Fix For: 4.x


During an IRC discussion in [cassandra-dev|https://wilderness.apache.org/channels/?f=cassandra-dev/2019-04-02#1554224083] it was proposed that we could have table property defaults stored on a Keyspace or globally within the cluster. For example, this would allow users to specify "All new tables on this cluster should default to LCS with SSTable size of 320MiB" or "all new tables in Keyspace XYZ should have Zstd compression with an 8 KiB block size" or "default_time_to_live should default to 3 days" etc ... This way operators can choose the default that makes sense for their organization once (e.g. LCS if they are running on fast SSDs), rather than requiring developers creating the Keyspaces/Tables to make the decision on every creation (often without context of which choices are right).

A few implementation options were discussed including:
 * A YAML option
 * Schema provided at the Keyspace level that would be inherited by any tables automatically
 * Schema provided at the Cluster level that would be inherited by any Keyspaces or Tables automatically

On IRC, rough consensus appeared to form around global -> keyspace -> table defaults stored in schema (no YAML configuration, since this is really a cluster-level setting rather than a node-level one).
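The global -> keyspace -> table inheritance proposed above could resolve properties along these lines. This is a hedged sketch of the lookup order only; the function name and the dictionaries are hypothetical, the property values are just the examples from the ticket, and nothing here reflects an actual Cassandra schema API.

```python
from collections import ChainMap
from typing import Any, Dict, Mapping


def effective_table_properties(cluster_defaults: Mapping[str, Any],
                               keyspace_defaults: Mapping[str, Any],
                               table_properties: Mapping[str, Any]) -> Dict[str, Any]:
    """Resolve each property from the most specific level that sets it.

    ChainMap searches its mappings left to right, so an explicit table
    setting wins over a keyspace default, which wins over a cluster default.
    """
    return dict(ChainMap(table_properties, keyspace_defaults, cluster_defaults))


# Example values taken from the ticket text; the key names are illustrative.
cluster = {
    "compaction": {"class": "LeveledCompactionStrategy", "sstable_size_in_mb": 320},
    "default_time_to_live": 3 * 24 * 60 * 60,  # 3 days, in seconds
}
keyspace_xyz = {
    "compression": {"class": "ZstdCompressor", "chunk_length_in_kb": 8},
}
table = {
    "default_time_to_live": 0,  # explicit per-table override
}

resolved = effective_table_properties(cluster, keyspace_xyz, table)
```

Note that the override in this sketch is shallow: a table that sets `compaction` replaces the whole inherited compaction map rather than merging individual sub-options. Whether sub-options should merge is a design question the ticket leaves open.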
[jira] [Updated] (CASSANDRA-15074) Allow table property defaults (e.g. compaction, compression) to be specified for a cluster/keyspace
[ https://issues.apache.org/jira/browse/CASSANDRA-15074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph Lynch updated CASSANDRA-15074:
-------------------------------------
    Summary: Allow table property defaults (e.g. compaction, compression) to be specified for a cluster/keyspace  (was: Allow table property defaults (e.g. compaction, compression) to be specified in schema)

> Allow table property defaults (e.g. compaction, compression) to be specified for a cluster/keyspace
> ---------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-15074
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15074
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Cluster/Schema
>            Reporter: Joseph Lynch
>            Priority: Low
>             Fix For: 4.x
[jira] [Commented] (CASSANDRA-15072) Incomplete range results during 2.X -> 3.11.4 upgrade
[ https://issues.apache.org/jira/browse/CASSANDRA-15072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808142#comment-16808142 ]

Muir Manders commented on CASSANDRA-15072:
------------------------------------------

Thanks for helping us investigate this issue. Do you think you understand the exact cause at this point?
{quote}It looks like the mixed mode read path is treating the table as a proper compact storage table though, and treating each cell as a row
{quote}
Does "mixed mode" refer to the mixed 2.X <=> 3.X cassandra versions?

From a high level it sounds like a 2.X coordinator and a 3.X replica have some confusion regarding compact storage cells vs. rows, and how many are needed to satisfy a limit or page quota. Is that still what you think is going on?
[jira] [Commented] (CASSANDRA-15072) Incomplete range results during 2.X -> 3.11.4 upgrade
[ https://issues.apache.org/jira/browse/CASSANDRA-15072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808114#comment-16808114 ]

Blake Eggleston commented on CASSANDRA-15072:
---------------------------------------------

Huh, I did not know that. I guess that makes sense though. So then this is just an upgrade bug.
[jira] [Commented] (CASSANDRA-15005) Configurable whitelist for UDFs
[ https://issues.apache.org/jira/browse/CASSANDRA-15005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808074#comment-16808074 ]

A. Soroka commented on CASSANDRA-15005:
---------------------------------------

I'm not sure whether I'll be using it before a release, because I plan to use it experimentally this spring, but I don't know when there will be a new Cassandra release. (Soon I hope! :grin:)

Production use of this feature would be many, many months away for me. I can't imagine that happening before a release, but I know very little about the larger schedule.

> Configurable whitelist for UDFs
> -------------------------------
>
>                 Key: CASSANDRA-15005
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15005
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: CQL/Interpreter
>            Reporter: A. Soroka
>            Priority: Low
>
> I would like to use the UDF system to distribute some simple calculations on
> values. For some use cases, this would require access only to some Java API
> classes that aren't on the (hardcoded) whitelist (e.g.
> {{java.security.MessageDigest}}). In other cases, it would require access to
> a little non-C* library code, pre-distributed to nodes by out-of-band means.
> As I understand the situation now, the whitelist for types UDFs can use is
> hardcoded in Java in
> [UDFunction|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/cql3/functions/UDFunction.java#L99].
> This ticket, then, is a request for a facility that would allow that list to
> be extended via some kind of deployment-time configuration. I realize that
> serious security concerns immediately arise for this kind of functionality,
> but I hope that by restricting it (only used during startup, not exposing the
> whitelist for introspection, etc.) it could be quite practical.
> I'd like very much to assist with this ticket if it is accepted. (I believe I
> have sufficient Java skill to do that, but no real familiarity with C*'s
> codebase, yet. :) )
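One possible shape for the facility requested in CASSANDRA-15005 is a hardcoded allow-list that is extended once from configuration at startup and then frozen. The sketch below is purely illustrative: the pattern strings, function names and prefix-matching rule are assumptions for the example and do not reproduce Cassandra's actual list or its Java implementation.

```python
from typing import Iterable, Tuple

# Illustrative built-in entries only; NOT the real list from UDFunction.java.
HARDCODED_ALLOWED: Tuple[str, ...] = ("java/lang/", "java/math/", "java/util/")


def build_allowed_patterns(extra_from_config: Iterable[str]) -> Tuple[str, ...]:
    """Combine built-in patterns with operator-supplied ones, once, at startup.

    Returning an immutable tuple models the "only used during startup"
    restriction suggested in the ticket: nothing can append to it later.
    """
    return HARDCODED_ALLOWED + tuple(extra_from_config)


def is_class_allowed(resource_name: str, allowed_patterns: Tuple[str, ...]) -> bool:
    """Return True if the class resource name starts with any allowed pattern."""
    return any(resource_name.startswith(pattern) for pattern in allowed_patterns)


default_patterns = build_allowed_patterns([])
extended_patterns = build_allowed_patterns(["java/security/MessageDigest"])
```

With the default patterns, `java/security/MessageDigest.class` is rejected; with the extended patterns it is accepted while the rest of `java/security/` stays blocked. Narrow, per-class entries like this are one way to limit the security exposure the ticket mentions.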
[jira] [Commented] (CASSANDRA-15072) Incomplete range results during 2.X -> 3.11.4 upgrade
[ https://issues.apache.org/jira/browse/CASSANDRA-15072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808060#comment-16808060 ]

Muir Manders commented on CASSANDRA-15072:
------------------------------------------

[https://docs.datastax.com/en/cql/3.3/cql/cql_using/useCompactStorage.html] also explicitly states the implied inverse:
{quote}
A compact table with a primary key that is not compound can have multiple columns that are not part of the primary key.
{quote}
[jira] [Commented] (CASSANDRA-15072) Incomplete range results during 2.X -> 3.11.4 upgrade
[ https://issues.apache.org/jira/browse/CASSANDRA-15072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808049#comment-16808049 ]

Peter Sanford commented on CASSANDRA-15072:
-------------------------------------------

{quote}Tables with compact storage can only have a single column, so you shouldn’t be able to create a compact storage table with 2 columns.
{quote}
According to [http://cassandra.apache.org/doc/latest/cql/ddl.html] that restriction is only for tables with clustering columns:
{quote}if a compact table has at least one clustering column, then it must have exactly one column outside of the primary key ones.
{quote}
We have a lot of tables created from thrift (compact storage) that do not have clustering columns and have > 1 column in the CQL schema.
[jira] [Commented] (CASSANDRA-15072) Incomplete range results during 2.X -> 3.11.4 upgrade
[ https://issues.apache.org/jira/browse/CASSANDRA-15072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808019#comment-16808019 ]

Blake Eggleston commented on CASSANDRA-15072:
-
This is a great repro script, thanks. A couple of observations:
* test.test has 2 columns, and uses compact storage, which shouldn’t be possible
* node1 & node3 are the replicas of the missing partition (we’re querying from the un-upgraded node3, for those following along).
* doing a point read ({{select * from test.test where id='1';}}) returns the expected partition
* using LIMIT 2 instead of PAGING 2 has the same problem
* LIMIT 3 returns a partial row: {{1 | there | null}}
* LIMIT 4 returns the entire row: {{1 | there | hi}}

Tables with compact storage can only have a single column, so you shouldn’t be able to create a compact storage table with 2 columns. Instead of throwing an error though, it seems like it just silently treats the table as a normal table. This might be why no one has noticed that our ddl validation is broken. It looks like the mixed mode read path is treating the table as a proper compact storage table though, and treating each cell as a row, which is why you see partial rows start to appear as you increase the limit. If you remove compact storage from the ddl, or only use a single column, everything works normally.

I'll think on the best way to address this.

> Incomplete range results during 2.X -> 3.11.4 upgrade
> -
>
> Key: CASSANDRA-15072
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15072
> Project: Cassandra
> Issue Type: Bug
> Reporter: Muir Manders
> Assignee: Blake Eggleston
> Priority: High
> Attachments: eriksw-repro.sh
>
>
> Hello
> During an upgrade from 2.1.17 to 3.11.4, our application started getting
> back incomplete results for range queries. When all nodes were upgraded
> (before upgrading sstables), we stopped getting incomplete results. I was
> able to reproduce it and listed steps below.
> It seems to require the random
> partitioner and compact storage to reproduce reliably. It also reproduces
> coming from 2.1.21 and 2.2.14. You seem to get the bad behavior when an old
> node is your coordinator and it has to talk to an upgraded replica.
> {noformat}
> ccm create test -v 2.1.17 -n 3
> ccm updateconf 'partitioner: org.apache.cassandra.dht.RandomPartitioner'
> ccm node1 updateconf 'initial_token: 0'
> ccm node2 updateconf 'initial_token: 56713727820156410577229101238628035242'
> ccm node3 updateconf 'initial_token: 113427455640312821154458202477256070484'
> ccm start
> ccm node1 cqlsh <<SCHEMA
> CREATE KEYSPACE test WITH REPLICATION = {'class': 'SimpleStrategy', 'replication_factor': 3};
> CREATE COLUMNFAMILY test.test (
>   id text,
>   foo text,
>   bar text,
>   PRIMARY KEY (id)
> ) WITH COMPACT STORAGE;
> CONSISTENCY QUORUM;
> INSERT INTO test.test (id, foo, bar) values ('1', 'hi', 'there');
> INSERT INTO test.test (id, foo, bar) values ('2', 'hi', 'there');
> SCHEMA
> ccm node1 stop
> ccm node1 setdir -v 3.11.4
> ccm node1 start
> ccm node2 stop
> ccm node2 setdir -v 3.11.4
> ccm node2 start
> # here I use 3.X cqlsh to connect to 2.X node so I can lower the page size (to
> # allow for simpler test setup)
> cqlsh 127.0.0.3 <<QUERY
> CONSISTENCY QUORUM;
> PAGING 2;
> select * from test.test;
> QUERY
> {noformat}
> This results in:
> {noformat}
> Page size: 2
>  id | bar   | foo
> ----+-------+-----
>   2 | there |  hi
> (1 rows)
> {noformat}
> Running it against the upgraded node (node1):
> {noformat}
> Page size: 2
>  id | bar   | foo
> ----+-------+-----
>   2 | there |  hi
>   1 | there |  hi
> (2 rows)
> {noformat}

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
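The LIMIT observations in the comment above are consistent with a limit that is counted in cells rather than in rows. The toy model below is only an illustration of that hypothesis and is not Cassandra's read path. It assumes the partitions come back in the order shown in the ticket (id 2, then id 1) and that cells within a row are ordered bar, then foo. `CellLimitModel` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: counts LIMIT against individual cells instead of whole rows.
final class CellLimitModel {
    /** Keeps at most `limit` cells in iteration order, then regroups the survivors by id. */
    static List<String> query(Map<String, Map<String, String>> table, int limit) {
        Map<String, Map<String, String>> kept = new LinkedHashMap<>();
        int remaining = limit;
        for (Map.Entry<String, Map<String, String>> row : table.entrySet()) {
            for (Map.Entry<String, String> cell : row.getValue().entrySet()) {
                if (remaining == 0) {
                    break; // budget exhausted: later cells are dropped
                }
                kept.computeIfAbsent(row.getKey(), k -> new LinkedHashMap<>())
                    .put(cell.getKey(), cell.getValue());
                remaining--;
            }
        }
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> row : kept.entrySet()) {
            Map<String, String> cells = row.getValue();
            // A dropped cell renders as "null", as in the ticket's partial row.
            out.add(row.getKey() + " | " + cells.get("bar") + " | " + cells.get("foo"));
        }
        return out;
    }

    /** The two rows from the repro, in the assumed return order. */
    static Map<String, Map<String, String>> sampleTable() {
        Map<String, Map<String, String>> table = new LinkedHashMap<>();
        for (String id : new String[] {"2", "1"}) {
            Map<String, String> cells = new LinkedHashMap<>();
            cells.put("bar", "there");
            cells.put("foo", "hi");
            table.put(id, cells);
        }
        return table;
    }

    public static void main(String[] args) {
        for (int limit = 2; limit <= 4; limit++) {
            System.out.println("LIMIT " + limit + " -> " + query(sampleTable(), limit));
        }
    }
}
```

Under these assumptions, a limit of 2 keeps only row 2, a limit of 3 adds `1 | there | null`, and a limit of 4 returns both rows in full, which lines up with the behaviour reported in the comment.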
[jira] [Commented] (CASSANDRA-15013) Message Flusher queue can grow unbounded, potentially running JVM out of memory
[ https://issues.apache.org/jira/browse/CASSANDRA-15013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807994#comment-16807994 ] Benedict commented on CASSANDRA-15013: -- {quote}In other words, we would never discard a message if the client chose to go with backpressure option {quote} +1 {quote}I propose cutting a separate ticket for that work, and keeping the scope limited for this current ticket {quote} How about a middle ground: we implement the per-endpoint (IP address) limit (which would be easily generalised to incorporate an application identifier) in this patch, so that the logical behaviour of the message control flow isn't really revisited, we just have to change the inputs and introduce any client API changes in the follow-up patch? I personally have a preference for trying to get all of the logical semantics settled in the first patch, though I'm not deeply wed to that. > Message Flusher queue can grow unbounded, potentially running JVM out of > memory > --- > > Key: CASSANDRA-15013 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15013 > Project: Cassandra > Issue Type: Bug > Components: Messaging/Client >Reporter: Sumanth Pasupuleti >Assignee: Sumanth Pasupuleti >Priority: Normal > Labels: pull-request-available > Fix For: 4.0, 3.0.x, 3.11.x > > Attachments: BlockedEpollEventLoopFromHeapDump.png, > BlockedEpollEventLoopFromThreadDump.png, RequestExecutorQueueFull.png, heap > dump showing each ImmediateFlusher taking upto 600MB.png > > > This is a follow-up ticket out of CASSANDRA-14855, to make the Flusher queue > bounded, since, in the current state, items get added to the queue without > any checks on queue size, nor with any checks on netty outbound buffer to > check the isWritable state. > We are seeing this issue hit our production 3.0 clusters quite often. 
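One way to picture the per-endpoint limit suggested above is a counter per key, where the key is an IP address today and could later be an application identifier. The sketch below is an assumption-laden illustration and not the patch under review: it assumes the limit is expressed in bytes in flight, and `PerKeyInflightLimiter` and its methods are hypothetical names.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: one in-flight byte counter per key (for example a client IP address).
final class PerKeyInflightLimiter<K> {
    private final long maxBytesPerKey;
    private final ConcurrentHashMap<K, AtomicLong> inflight = new ConcurrentHashMap<>();

    PerKeyInflightLimiter(long maxBytesPerKey) {
        this.maxBytesPerKey = maxBytesPerKey;
    }

    /** Reserves `bytes` for `key`; returns false and reserves nothing if the cap would be exceeded. */
    boolean tryAcquire(K key, long bytes) {
        AtomicLong counter = inflight.computeIfAbsent(key, k -> new AtomicLong());
        while (true) {
            long current = counter.get();
            long next = current + bytes;
            if (next > maxBytesPerKey) {
                return false;
            }
            if (counter.compareAndSet(current, next)) {
                return true;
            }
        }
    }

    /** Returns previously reserved bytes once the response has been flushed. */
    void release(K key, long bytes) {
        AtomicLong counter = inflight.get(key);
        if (counter != null) {
            counter.addAndGet(-bytes);
        }
    }

    /** Current reservation for `key`, or 0 if the key has never been seen. */
    long inflightBytes(K key) {
        AtomicLong counter = inflight.get(key);
        return counter == null ? 0 : counter.get();
    }
}
```

Because the key type is generic, switching from an IP address to an application identifier would only change what the caller passes in, which is the kind of generalisation the comment describes.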
[jira] [Commented] (CASSANDRA-15013) Message Flusher queue can grow unbounded, potentially running JVM out of memory
[ https://issues.apache.org/jira/browse/CASSANDRA-15013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807988#comment-16807988 ] Sumanth Pasupuleti commented on CASSANDRA-15013: Thanks for the feedback [~benedict] +1 on unconditionally enqueuing the message to the executor when we setAutoRead(false), and throwing OverloadedException each time a message is discarded. In other words, we would never discard a message if the client chose to go with backpressure option, rather we just setAutoRead(false) and process the message. Regarding in-flight per-endpoint, and having an application identifier, I like the suggestion, as it offers better guarantees on throttling client instances, however, I propose cutting a separate ticket for that work, and keeping the scope limited for this current ticket. > Message Flusher queue can grow unbounded, potentially running JVM out of > memory > --- > > Key: CASSANDRA-15013 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15013 > Project: Cassandra > Issue Type: Bug > Components: Messaging/Client >Reporter: Sumanth Pasupuleti >Assignee: Sumanth Pasupuleti >Priority: Normal > Labels: pull-request-available > Fix For: 4.0, 3.0.x, 3.11.x > > Attachments: BlockedEpollEventLoopFromHeapDump.png, > BlockedEpollEventLoopFromThreadDump.png, RequestExecutorQueueFull.png, heap > dump showing each ImmediateFlusher taking upto 600MB.png > > > This is a follow-up ticket out of CASSANDRA-14855, to make the Flusher queue > bounded, since, in the current state, items get added to the queue without > any checks on queue size, nor with any checks on netty outbound buffer to > check the isWritable state. > We are seeing this issue hit our production 3.0 clusters quite often. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
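The semantics agreed in the comment above can be summarised as a small decision function. This is a sketch under stated assumptions and not the actual patch: the enum names and the `admit` method are hypothetical, and the real code would call `setAutoRead(false)` on the channel and throw `OverloadedException` where this sketch only returns a label.

```java
// Hypothetical decision table for an incoming message when an in-flight limit exists.
final class AdmissionSketch {
    /** What the client asked for when the server is over its limit. */
    enum Policy { THROW_ON_OVERLOAD, BACKPRESSURE }

    /** What the server does with the message. */
    enum Decision { PROCESS, PROCESS_AND_PAUSE_READS, DISCARD_AND_REPORT_OVERLOADED }

    static Decision admit(long inflightBytes, long messageBytes, long limitBytes, Policy policy) {
        if (inflightBytes + messageBytes <= limitBytes) {
            return Decision.PROCESS; // under the limit: nothing special happens
        }
        if (policy == Policy.BACKPRESSURE) {
            // Never discard: still enqueue this message, but stop reading further input.
            return Decision.PROCESS_AND_PAUSE_READS;
        }
        // Discarding always comes with an overloaded error back to the client.
        return Decision.DISCARD_AND_REPORT_OVERLOADED;
    }

    public static void main(String[] args) {
        System.out.println(admit(90, 20, 100, Policy.BACKPRESSURE));
        System.out.println(admit(90, 20, 100, Policy.THROW_ON_OVERLOAD));
    }
}
```

The point of the table is that a message is discarded only under the throwing policy, and every discard is reported to the client.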
[jira] [Commented] (CASSANDRA-15005) Configurable whilelist for UDFs
[ https://issues.apache.org/jira/browse/CASSANDRA-15005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807919#comment-16807919 ] Jon Meredith commented on CASSANDRA-15005: -- Thanks for the docs and the additional tests - your modifications look good to me. I'll find somebody to review it and then we'll have to work out where to park it until trunk opens up for feature contributions. Do you have any plans to use it before it lands in a public release? > Configurable whilelist for UDFs > --- > > Key: CASSANDRA-15005 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15005 > Project: Cassandra > Issue Type: Improvement > Components: CQL/Interpreter >Reporter: A. Soroka >Priority: Low > > I would like to use the UDF system to distribute some simple calculations on > values. For some use cases, this would require access only to some Java API > classes that aren't on the (hardcoded) whitelist (e.g. > {{java.security.MessageDigest}}). In other cases, it would require access to > a little non-C* library code, pre-distributed to nodes by out-of-band means. > As I understand the situation now, the whitelist for types UDFs can use is > hardcoded in java in > [UDFunction|[https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/cql3/functions/UDFunction.java#L99].] > This ticket, then, is a request for a facility that would allow that list to > be extended via some kind of deployment-time configuration. I realize that > serious security concerns immediately arise for this kind of functionality, > but I hope that by restricting it (only used during startup, no exposing the > whitelist for introspection, etc.) it could be quite practical. > I'd like very much to assist with this ticket if it is accepted. (I believe I > have sufficient Java skill to do that, but no real familiarity with C*'s > codebase, yet. 
:) )
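The request in this ticket amounts to combining a built-in list with entries supplied at deployment time. The sketch below illustrates that idea only. It does not reflect how `UDFunction` stores or checks its list, the built-in entries are placeholders, and `UdfAllowList` is a hypothetical name. Entries ending in `.` are treated as package prefixes and anything else as an exact class name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: a fixed allow-list extended once, at startup, from configuration.
final class UdfAllowList {
    // Placeholder entries standing in for the hardcoded list.
    private static final List<String> BUILT_IN = Arrays.asList("java.lang.", "java.math.", "java.util.");

    private final List<String> allowed = new ArrayList<>(BUILT_IN);

    /** `extraEntries` would come from deployment-time configuration and is read only once. */
    UdfAllowList(List<String> extraEntries) {
        allowed.addAll(extraEntries);
    }

    /** Entries ending in '.' match a package and its subpackages; other entries match one class exactly. */
    boolean isAllowed(String className) {
        for (String entry : allowed) {
            boolean matches = entry.endsWith(".") ? className.startsWith(entry) : className.equals(entry);
            if (matches) {
                return true;
            }
        }
        return false;
    }
}
```

With no extra entries, `java.security.MessageDigest` is rejected. Passing `java.security.MessageDigest` as an extra entry admits that one class without opening the rest of `java.security`.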
[jira] [Updated] (CASSANDRA-15073) Apache NetBeans project files
[ https://issues.apache.org/jira/browse/CASSANDRA-15073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict updated CASSANDRA-15073: - Change Category: Quality Assurance > Apache NetBeans project files > - > > Key: CASSANDRA-15073 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15073 > Project: Cassandra > Issue Type: Task > Components: Build >Reporter: mck >Assignee: mck >Priority: Low > > Provide necessary project files so to be able to open the Cassandra project > in Apache NetBeans. > No additional project functionality is required beyond being able to edit the > project's source files. Building the project is still expected to be done via > `ant` on the command-line. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15073) Apache NetBeans project files
[ https://issues.apache.org/jira/browse/CASSANDRA-15073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict updated CASSANDRA-15073: - Complexity: Low Hanging Fruit Change Category: Parent values: Quality Assurance(12981) Status: Open (was: Triage Needed) > Apache NetBeans project files > - > > Key: CASSANDRA-15073 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15073 > Project: Cassandra > Issue Type: Task > Components: Build >Reporter: mck >Assignee: mck >Priority: Low > > Provide necessary project files so to be able to open the Cassandra project > in Apache NetBeans. > No additional project functionality is required beyond being able to edit the > project's source files. Building the project is still expected to be done via > `ant` on the command-line. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15073) Apache NetBeans project files
[ https://issues.apache.org/jira/browse/CASSANDRA-15073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807567#comment-16807567 ] mck commented on CASSANDRA-15073: - Patch in progress at https://github.com/thelastpickle/cassandra/tree/mck/trunk_15073 > Apache NetBeans project files > - > > Key: CASSANDRA-15073 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15073 > Project: Cassandra > Issue Type: Task > Components: Build >Reporter: mck >Assignee: mck >Priority: Low > > Provide necessary project files so to be able to open the Cassandra project > in Apache NetBeans. > No additional project functionality is required beyond being able to edit the > project's source files. Building the project is still expected to be done via > `ant` on the command-line. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Assigned] (CASSANDRA-15073) Apache NetBeans project files
[ https://issues.apache.org/jira/browse/CASSANDRA-15073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mck reassigned CASSANDRA-15073: --- Assignee: mck > Apache NetBeans project files > - > > Key: CASSANDRA-15073 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15073 > Project: Cassandra > Issue Type: Task > Components: Build >Reporter: mck >Assignee: mck >Priority: Low > > Provide necessary project files so to be able to open the Cassandra project > in Apache NetBeans. > No additional project functionality is required beyond being able to edit the > project's source files. Building the project is still expected to be done via > `ant` on the command-line. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-15073) Apache NetBeans project files
mck created CASSANDRA-15073: --- Summary: Apache NetBeans project files Key: CASSANDRA-15073 URL: https://issues.apache.org/jira/browse/CASSANDRA-15073 Project: Cassandra Issue Type: Task Components: Build Reporter: mck Provide necessary project files so to be able to open the Cassandra project in Apache NetBeans. No additional project functionality is required beyond being able to edit the project's source files. Building the project is still expected to be done via `ant` on the command-line. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13357) A possible NPE in nodetool getendpoints
[ https://issues.apache.org/jira/browse/CASSANDRA-13357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807536#comment-16807536 ]

Eduard Tudenhoefner commented on CASSANDRA-13357:
-
LGTM

> A possible NPE in nodetool getendpoints
> ---
>
> Key: CASSANDRA-13357
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13357
> Project: Cassandra
> Issue Type: Bug
> Components: Tool/nodetool
> Reporter: Hao Zhong
> Assignee: Hao Zhong
> Priority: Normal
> Fix For: 4.x
>
> Attachments: cassandra.patch
>
>
> The GetEndpoints.execute method has the following code:
> {code:title=GetEndpoints.java|borderStyle=solid}
> List<InetAddress> endpoints = probe.getEndpoints(ks, table, key);
> for (InetAddress endpoint : endpoints)
> {
>     System.out.println(endpoint.getHostAddress());
> }
> {code}
> This code can throw NPE. A similar bug is fixed in CASSANDRA-8950. The buggy
> code is
> {code:title=NodeCmd.java|borderStyle=solid}
> List<InetAddress> endpoints = this.probe.getEndpoints(keySpace, cf, key);
> for (InetAddress anEndpoint : endpoints)
> {
>     output.println(anEndpoint.getHostAddress());
> }
> {code}
> The fixed code is:
> {code:title=NodeCmd.java|borderStyle=solid}
> try
> {
>     List<InetAddress> endpoints = probe.getEndpoints(keySpace, cf, key);
>     for (InetAddress anEndpoint : endpoints)
>         output.println(anEndpoint.getHostAddress());
> }
> catch (IllegalArgumentException ex)
> {
>     output.println(ex.getMessage());
>     probe.failed();
> }
> {code}
> The GetEndpoints.execute method shall be modified as CASSANDRA-8950 does.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
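The pattern the ticket asks to carry over from CASSANDRA-8950 can be shown in isolation. The sketch below is self-contained and does not use the real `NodeProbe`: `EndpointSource` is a hypothetical stand-in for it, the boolean return value replaces `probe.failed()`, and the explicit null check is an extra assumption that does not appear in the ticket's fixed code.

```java
import java.io.PrintStream;
import java.net.InetAddress;
import java.util.List;

// Hypothetical stand-in for the nodetool command; not the attached patch.
final class GetEndpointsSketch {
    /** Stand-in for the probe: may throw IllegalArgumentException or return null. */
    interface EndpointSource {
        List<InetAddress> getEndpoints(String keyspace, String table, String key);
    }

    /** Prints one address per line; returns false instead of letting a failure escape. */
    static boolean printEndpoints(EndpointSource probe, String ks, String table, String key, PrintStream out) {
        try {
            List<InetAddress> endpoints = probe.getEndpoints(ks, table, key);
            if (endpoints == null) {
                // Assumption: treat a missing result as a reportable failure rather than an NPE.
                out.println("No endpoints found for key " + key);
                return false;
            }
            for (InetAddress endpoint : endpoints) {
                out.println(endpoint.getHostAddress());
            }
            return true;
        } catch (IllegalArgumentException ex) {
            // Same handling as the CASSANDRA-8950 fix: report the message, mark the command failed.
            out.println(ex.getMessage());
            return false;
        }
    }
}
```

A caller would map a `false` return to a non-zero exit status, which is the role `probe.failed()` plays in the quoted fix.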