[jira] [Commented] (CASSANDRA-14103) Fix potential race during compaction strategy reload
[ https://issues.apache.org/jira/browse/CASSANDRA-14103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16434981#comment-16434981 ] Marcus Eriksson commented on CASSANDRA-14103: - (sorry for the delay on this) LGTM, just a few minor comments; * Could we make {{CompactionStrategyManager}} take the initial sstables as a parameter to the constructor instead of calling {{cfs.getSSTables(...CANONICAL..)}} there? It feels like that makes it clearer that the tracker has to be populated before we can create the CSM * Make {{maybeReloadDiskBoundaries}} return {{void}}; the only user of the return value is the test case, and that could probably be refactored to check that the boundaries changed instead? > Fix potential race during compaction strategy reload > > > Key: CASSANDRA-14103 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14103 > Project: Cassandra > Issue Type: Bug >Reporter: Paulo Motta >Assignee: Paulo Motta >Priority: Minor > Attachments: 3.11-14103-dtest.png, 3.11-14103-testall.png, > trunk-14103-dtest.png, trunk-14103-testall.png > > > When the compaction strategies are reloaded after disk boundary changes > (CASSANDRA-13948), it's possible that a recently finished SSTable is added > twice to the compaction strategy: once when the compaction strategies are > reloaded due to the disk boundary change ({{maybeReloadDiskBoundaries}}), and > again when the {{CompactionStrategyManager}} is processing the > {{SSTableAddedNotification}}. > This should be quite unlikely, because a compaction must finish right as the > disk boundary changes, and even if it happens, most compaction strategies > would not be affected by it, since they deduplicate sstables internally, but > we should protect against such a scenario. > For more context see [this > comment|https://issues.apache.org/jira/browse/CASSANDRA-13948?focusedCommentId=16280448&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16280448] > from Marcus.
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
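Marcus's first suggestion can be sketched roughly as follows. This is a hypothetical, minimal illustration in plain Java, not Cassandra's actual {{CompactionStrategyManager}}; the class and method names are invented, and strings stand in for SSTableReader instances:

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch, not Cassandra code: names are invented and strings
// stand in for SSTableReader instances.
class CompactionStrategyManagerSketch {
    private final Set<String> sstables;

    // Taking the initial sstables as a constructor parameter (instead of the
    // CSM fetching them from the cfs itself) makes the ordering requirement
    // explicit: the tracker must already be populated before the CSM exists.
    CompactionStrategyManagerSketch(Collection<String> initialSSTables) {
        this.sstables = new HashSet<>(initialSSTables);
    }

    // Backing the strategy with a Set makes the double-add described in the
    // ticket (boundary reload + SSTableAddedNotification) a harmless no-op.
    boolean add(String sstable) {
        return sstables.add(sstable);
    }
}
```

The Set here is the same internal deduplication the ticket says most real strategies already perform.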
[jira] [Updated] (CASSANDRA-14160) maxPurgeableTimestamp should traverse tables in order of minTimestamp
[ https://issues.apache.org/jira/browse/CASSANDRA-14160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Jirsa updated CASSANDRA-14160: --- Reviewer: Jeff Jirsa > maxPurgeableTimestamp should traverse tables in order of minTimestamp > - > > Key: CASSANDRA-14160 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14160 > Project: Cassandra > Issue Type: Bug > Components: Compaction >Reporter: Josh Snyder >Assignee: Josh Snyder >Priority: Major > Labels: performance > Fix For: 4.x > > > In maxPurgeableTimestamp, we iterate over the bloom filters of each > overlapping SSTable. Of the bloom filter hits, we take the SSTable with the > lowest minTimestamp. If we kept the SSTables in sorted order of minTimestamp, > then we could short-circuit the operation at the first bloom filter hit, > reducing cache pressure (or worse, I/O) and CPU time. > I've written (but not yet benchmarked) [some > code|https://github.com/hashbrowncipher/cassandra/commit/29859a4a2e617f6775be49448858bc59fdafab44] > to demonstrate this possibility. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
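The short-circuit described in the ticket can be sketched as follows. This is a hypothetical toy model, not the linked patch: a Set of keys stands in for the bloom filter, and the names are invented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of an sstable: a Set of keys stands in for the bloom filter,
// which can only answer "definitely not" / "maybe".
class SSTable {
    final long minTimestamp;
    final Set<String> maybeKeys;

    SSTable(long minTimestamp, String... keys) {
        this.minTimestamp = minTimestamp;
        this.maybeKeys = new HashSet<>(Arrays.asList(keys));
    }

    boolean mayContain(String key) {
        return maybeKeys.contains(key);
    }
}

public class MaxPurgeableSketch {
    // With the overlapping sstables kept sorted by minTimestamp, the first
    // bloom filter hit is already the minimum, so we can stop there instead
    // of probing every remaining filter.
    static long maxPurgeableTimestamp(List<SSTable> overlapping, String key) {
        overlapping.sort(Comparator.comparingLong((SSTable s) -> s.minTimestamp));
        for (SSTable s : overlapping)
            if (s.mayContain(key))
                return s.minTimestamp; // short-circuit: no later table is lower
        return Long.MAX_VALUE; // no overlapping table may contain the key
    }

    public static void main(String[] args) {
        List<SSTable> tables = new ArrayList<>(Arrays.asList(
                new SSTable(300, "k1"), new SSTable(100, "k1"), new SSTable(200, "k2")));
        System.out.println(maxPurgeableTimestamp(tables, "k1")); // 100
    }
}
```

In the unsorted version, every overlapping sstable's filter is probed; here the loop exits at the first hit, which is the cache-pressure and CPU saving the ticket describes.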
[jira] [Commented] (CASSANDRA-14160) maxPurgeableTimestamp should traverse tables in order of minTimestamp
[ https://issues.apache.org/jira/browse/CASSANDRA-14160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16434939#comment-16434939 ] Jeff Jirsa commented on CASSANDRA-14160: Re-pushed [here|https://github.com/jeffjirsa/cassandra/commits/14160] , tests running [here|https://circleci.com/gh/jeffjirsa/cassandra/tree/14160] (unit tests + dtests) > maxPurgeableTimestamp should traverse tables in order of minTimestamp > - > > Key: CASSANDRA-14160 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14160 > Project: Cassandra > Issue Type: Bug > Components: Compaction >Reporter: Josh Snyder >Assignee: Josh Snyder >Priority: Major > Labels: performance > Fix For: 4.x > > > In maxPurgeableTimestamp, we iterate over the bloom filters of each > overlapping SSTable. Of the bloom filter hits, we take the SSTable with the > lowest minTimestamp. If we kept the SSTables in sorted order of minTimestamp, > then we could short-circuit the operation at the first bloom filter hit, > reducing cache pressure (or worse, I/O) and CPU time. > I've written (but not yet benchmarked) [some > code|https://github.com/hashbrowncipher/cassandra/commit/29859a4a2e617f6775be49448858bc59fdafab44] > to demonstrate this possibility. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14367) prefer Collections.singletonList to Arrays.asList(one_element)
[ https://issues.apache.org/jira/browse/CASSANDRA-14367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16434870#comment-16434870 ] Dave Brosius commented on CASSANDRA-14367: -- Ok, thanks. Although I would expect the exact same situation before the change. Not arguing for the ticket, just learning. > prefer Collections.singletonList to Arrays.asList(one_element) > -- > > Key: CASSANDRA-14367 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14367 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Dave Brosius >Assignee: Dave Brosius >Priority: Trivial > Fix For: 4.x > > Attachments: 14367.txt > > > small improvement, but Arrays.asList first creates an array, then wraps it > with a collections instance, whereas Collections.singletonList just creates > one small (one field) bean instance. > so a small cut down on garbage generated. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13851) Allow existing nodes to use all peers in shadow round
[ https://issues.apache.org/jira/browse/CASSANDRA-13851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16434866#comment-16434866 ] Kurt Greaves commented on CASSANDRA-13851: -- ping [~beobal]. keen on getting this in before something breaks it again. :) > Allow existing nodes to use all peers in shadow round > - > > Key: CASSANDRA-13851 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13851 > Project: Cassandra > Issue Type: Bug > Components: Lifecycle >Reporter: Kurt Greaves >Assignee: Kurt Greaves >Priority: Major > Fix For: 3.11.x, 4.x > > > In CASSANDRA-10134 we made collision checks necessary on every startup. A > side-effect was introduced that then requires a node's seeds to be contacted > on every startup. Prior to this change an existing node could start up > regardless of whether it could contact a seed node or not (because > checkForEndpointCollision() was only called for bootstrapping nodes). > Now if a node's seeds are removed/deleted/fail, it will no longer be able to > start up until live seeds are configured (or it is itself made a seed), even > though it already knows about the rest of the ring. This is inconvenient for > operators and has the potential to cause some nasty surprises and increase > downtime. > One solution would be to use all of a node's existing peers as seeds in the > shadow round. I'm not a Gossip guru though, so I'm not sure of the implications. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14367) prefer Collections.singletonList to Arrays.asList(one_element)
[ https://issues.apache.org/jira/browse/CASSANDRA-14367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16434808#comment-16434808 ] Jeremiah Jordan commented on CASSANDRA-14367: - The static method call is not the non-monomorphic part; the later uses of the List will be, since there will now be multiple List implementations in use. > prefer Collections.singletonList to Arrays.asList(one_element) > -- > > Key: CASSANDRA-14367 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14367 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Dave Brosius >Assignee: Dave Brosius >Priority: Trivial > Fix For: 4.x > > Attachments: 14367.txt > > > small improvement, but Arrays.asList first creates an array, then wraps it > with a collections instance, whereas Collections.singletonList just creates > one small (one field) bean instance. > so a small cut down on garbage generated. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
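The difference being discussed can be seen directly in a small stand-alone illustration (not from the attached patch):

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class SingletonListDemo {
    public static void main(String[] args) {
        // Arrays.asList allocates a varargs array, then wraps it in
        // java.util.Arrays$ArrayList.
        List<String> viaAsList = Arrays.asList("only");

        // Collections.singletonList allocates one small (one-field) holder,
        // java.util.Collections$SingletonList.
        List<String> viaSingleton = Collections.singletonList("only");

        // Same contents...
        System.out.println(viaAsList.equals(viaSingleton)); // true
        // ...but different List implementations, which is the call-site
        // polymorphism concern raised in the comment above.
        System.out.println(viaAsList.getClass() == viaSingleton.getClass()); // false
    }
}
```

The static factory calls themselves are monomorphic; it is the later virtual calls on the returned lists that now see more than one receiver class.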
[jira] [Commented] (CASSANDRA-14367) prefer Collections.singletonList to Arrays.asList(one_element)
[ https://issues.apache.org/jira/browse/CASSANDRA-14367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16434789#comment-16434789 ] Dave Brosius commented on CASSANDRA-14367: -- Perfectly fine with not accepting this, no biggie. But I am curious as to what makes a static method call non-monomorphic? > prefer Collections.singletonList to Arrays.asList(one_element) > -- > > Key: CASSANDRA-14367 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14367 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Dave Brosius >Assignee: Dave Brosius >Priority: Trivial > Fix For: 4.x > > Attachments: 14367.txt > > > small improvement, but Arrays.asList first creates an array, then wraps it > with a collections instance, whereas Collections.singletonList just creates > one small (one field) bean instance. > so a small cut down on garbage generated. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-14367) prefer Collections.singletonList to Arrays.asList(one_element)
[ https://issues.apache.org/jira/browse/CASSANDRA-14367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Brosius updated CASSANDRA-14367: - Resolution: Won't Do Status: Resolved (was: Patch Available) > prefer Collections.singletonList to Arrays.asList(one_element) > -- > > Key: CASSANDRA-14367 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14367 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Dave Brosius >Assignee: Dave Brosius >Priority: Trivial > Fix For: 4.x > > Attachments: 14367.txt > > > small improvement, but Arrays.asList first creates an array, then wraps it > with a collections instance, whereas Collections.singletonList just creates > one small (one field) bean instance. > so a small cut down on garbage generated. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-14379) Better handling of missing partition columns in system_schema.columns during startup
Jay Zhuang created CASSANDRA-14379: -- Summary: Better handling of missing partition columns in system_schema.columns during startup Key: CASSANDRA-14379 URL: https://issues.apache.org/jira/browse/CASSANDRA-14379 Project: Cassandra Issue Type: Improvement Components: Distributed Metadata Reporter: Jay Zhuang Assignee: Jay Zhuang Follow-up for CASSANDRA-13180: during table deletion/creation, we saw one table with partially deleted columns (no partition column, only a regular column). It's blocking the node from starting up: {noformat} java.lang.AssertionError: null at org.apache.cassandra.db.marshal.CompositeType.getInstance(CompositeType.java:103) ~[apache-cassandra-3.0.14.x.jar:3.0.14.x] at org.apache.cassandra.config.CFMetaData.rebuild(CFMetaData.java:308) ~[apache-cassandra-3.0.14.x.jar:3.0.14.x] at org.apache.cassandra.config.CFMetaData.<init>(CFMetaData.java:288) ~[apache-cassandra-3.0.14.x.jar:3.0.14.x] at org.apache.cassandra.config.CFMetaData.create(CFMetaData.java:363) ~[apache-cassandra-3.0.14.x.jar:3.0.14.x] at org.apache.cassandra.schema.SchemaKeyspace.fetchTable(SchemaKeyspace.java:1028) ~[apache-cassandra-3.0.14.x.jar:3.0.14.x] at org.apache.cassandra.schema.SchemaKeyspace.fetchTables(SchemaKeyspace.java:987) ~[apache-cassandra-3.0.14.x.jar:3.0.14.x] at org.apache.cassandra.schema.SchemaKeyspace.fetchKeyspace(SchemaKeyspace.java:945) ~[apache-cassandra-3.0.14.x.jar:3.0.14.x] at org.apache.cassandra.schema.SchemaKeyspace.fetchKeyspacesWithout(SchemaKeyspace.java:922) ~[apache-cassandra-3.0.14.x.jar:3.0.14.x] at org.apache.cassandra.schema.SchemaKeyspace.fetchNonSystemKeyspaces(SchemaKeyspace.java:910) ~[apache-cassandra-3.0.14.x.jar:3.0.14.x] at org.apache.cassandra.config.Schema.loadFromDisk(Schema.java:138) ~[apache-cassandra-3.0.14.x.jar:3.0.14.x] at org.apache.cassandra.config.Schema.loadFromDisk(Schema.java:128) ~[apache-cassandra-3.0.14.x.jar:3.0.14.x] at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:241) 
[apache-cassandra-3.0.14.x.jar:3.0.14.x] at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:567) [apache-cassandra-3.0.14.x.jar:3.0.14.x] at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:695) [apache-cassandra-3.0.14.x.jar:3.0.14.x] {noformat} As the partition column is mandatory, it should throw [{{MissingColumns}}|https://github.com/apache/cassandra/blob/60563f4e8910fb59af141fd24f1fc1f98f34f705/src/java/org/apache/cassandra/schema/SchemaKeyspace.java#L1351], the same as in CASSANDRA-13180, so the user is able to clean up the schema. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13889) cfstats should take sorting and limit parameters
[ https://issues.apache.org/jira/browse/CASSANDRA-13889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16434756#comment-16434756 ] Patrick Bannister commented on CASSANDRA-13889: --- Thanks for the review! > cfstats should take sorting and limit parameters > > > Key: CASSANDRA-13889 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13889 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Jon Haddad >Assignee: Patrick Bannister >Priority: Major > Fix For: 4.0 > > Attachments: 13889-trunk.txt, sample_output_normal.txt, > sample_output_sorted.txt, sample_output_sorted_top3.txt > > > When looking at a problematic node I'm not familiar with, one of the first > things I do is check cfstats to identify the tables with the most reads, > writes, and data. This is fine as long as there aren't a lot of tables but > once it goes above a dozen it's quite difficult. cfstats should allow me to > sort the results and limit to top K tables. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
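The sort-and-limit behaviour requested in the ticket can be sketched with plain Java streams. All names here are hypothetical; this is not the attached patch's actual implementation:

```java
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class TopTablesSketch {
    // Order tables by some per-table stat (e.g. local read count), highest
    // first, and keep only the top k -- the "sort and limit" from the ticket.
    static List<String> topK(Map<String, Long> statByTable, int k) {
        return statByTable.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue(Comparator.reverseOrder()))
                .limit(k)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, Long> reads = new LinkedHashMap<>();
        reads.put("users", 5L);
        reads.put("events", 9L);
        reads.put("audit", 1L);
        System.out.println(topK(reads, 2)); // [events, users]
    }
}
```

With dozens of tables, the operator sees only the top K rows instead of scanning the full cfstats dump by eye.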
[jira] [Commented] (CASSANDRA-7622) Implement virtual tables
[ https://issues.apache.org/jira/browse/CASSANDRA-7622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16434732#comment-16434732 ] Dinesh Joshi commented on CASSANDRA-7622: - Hey [~cnlwsu], Here's my feedback on the code so far - * {{Schema#getVirtualTable}} - {{containsKey()}} followed by a {{get()}} is unnecessary. Using {{get}} will have the same effect. * {{VirtualSchema}} - the initializers for {{key}} and {{clustering}} fields are unnecessary as you're overwriting them in the constructor. It would be a good idea to make the fields final. * {{VirtualTable#classFromName}}, {{TableMetaData#virtualClass}} - shouldn't the error read differently? VirtualTable strategy class instead of Compaction strategy class? Also there is no {{AbstractVirtualColumnFamilyStore}}. Am I missing something here? * {{InMemoryVirtualTable$SimpleVirtualCommand}} - Use finals for fields * {{InMemoryVirtualTable$ResultReadState}} - Use finals for fields * {{InMemoryVirtualTable$ResultReadState}} - line 258 - isn't the if check redundant? Nits - * {{InMemoryVirtualTable}} - get rid of unused import {{org.apache.cassandra.db.marshal.AbstractType;}} > Implement virtual tables > > > Key: CASSANDRA-7622 > URL: https://issues.apache.org/jira/browse/CASSANDRA-7622 > Project: Cassandra > Issue Type: Improvement > Components: CQL >Reporter: Tupshin Harper >Assignee: Chris Lohfink >Priority: Major > Fix For: 4.x > > > There are a variety of reasons to want virtual tables, which would be any > table that would be backed by an API, rather than data explicitly managed and > stored as sstables. > One possible use case would be to expose JMX data through CQL as a > resurrection of CASSANDRA-3527. > Another is a more general framework to implement the ability to expose yaml > configuration information. So it would be an alternate approach to > CASSANDRA-7370. > A possible implementation would be in terms of CASSANDRA-7443, but I am not > presupposing. 
[jira] [Commented] (CASSANDRA-13853) nodetool describecluster should be more informative
[ https://issues.apache.org/jira/browse/CASSANDRA-13853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16434503#comment-16434503 ] Preetika Tyagi commented on CASSANDRA-13853: [~aweisberg] Thank you for taking care of that. Here is my email id: [preetika.ty...@intel.com|mailto:preetika.ty...@intel.com] > nodetool describecluster should be more informative > --- > > Key: CASSANDRA-13853 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13853 > Project: Cassandra > Issue Type: Improvement > Components: Observability, Tools >Reporter: Jon Haddad >Assignee: Preetika Tyagi >Priority: Minor > Labels: lhf > Fix For: 4.x > > Attachments: cassandra-13853-v6.patch, jira_13853_dtest_v2.patch > > > Additional information we should be displaying: > * Total node count > * List of datacenters, RF, with number of nodes per dc, how many are down, > * Version(s) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13853) nodetool describecluster should be more informative
[ https://issues.apache.org/jira/browse/CASSANDRA-13853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16434482#comment-16434482 ] Ariel Weisberg commented on CASSANDRA-13853: Test failures look unrelated https://circleci.com/gh/aweisberg/cassandra/tree/cassandra-13853-trunk I had to clean up the imports a bit, but other than that I didn't change anything. > nodetool describecluster should be more informative > --- > > Key: CASSANDRA-13853 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13853 > Project: Cassandra > Issue Type: Improvement > Components: Observability, Tools >Reporter: Jon Haddad >Assignee: Preetika Tyagi >Priority: Minor > Labels: lhf > Fix For: 4.x > > Attachments: cassandra-13853-v6.patch, jira_13853_dtest_v2.patch > > > Additional information we should be displaying: > * Total node count > * List of datacenters, RF, with number of nodes per dc, how many are down, > * Version(s) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-13853) nodetool describecluster should be more informative
[ https://issues.apache.org/jira/browse/CASSANDRA-13853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ariel Weisberg updated CASSANDRA-13853: --- Status: Ready to Commit (was: Patch Available) > nodetool describecluster should be more informative > --- > > Key: CASSANDRA-13853 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13853 > Project: Cassandra > Issue Type: Improvement > Components: Observability, Tools >Reporter: Jon Haddad >Assignee: Preetika Tyagi >Priority: Minor > Labels: lhf > Fix For: 4.x > > Attachments: cassandra-13853-v6.patch, jira_13853_dtest_v2.patch > > > Additional information we should be displaying: > * Total node count > * List of datacenters, RF, with number of nodes per dc, how many are down, > * Version(s) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-13853) nodetool describecluster should be more informative
[ https://issues.apache.org/jira/browse/CASSANDRA-13853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ariel Weisberg updated CASSANDRA-13853: --- Reviewer: Ariel Weisberg > nodetool describecluster should be more informative > --- > > Key: CASSANDRA-13853 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13853 > Project: Cassandra > Issue Type: Improvement > Components: Observability, Tools >Reporter: Jon Haddad >Assignee: Preetika Tyagi >Priority: Minor > Labels: lhf > Fix For: 4.x > > Attachments: cassandra-13853-v6.patch, jira_13853_dtest_v2.patch > > > Additional information we should be displaying: > * Total node count > * List of datacenters, RF, with number of nodes per dc, how many are down, > * Version(s) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13853) nodetool describecluster should be more informative
[ https://issues.apache.org/jira/browse/CASSANDRA-13853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16434437#comment-16434437 ] Ariel Weisberg commented on CASSANDRA-13853: [~pree] what email address do you use for your github account? I want to set the author tag for the commits correctly. > nodetool describecluster should be more informative > --- > > Key: CASSANDRA-13853 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13853 > Project: Cassandra > Issue Type: Improvement > Components: Observability, Tools >Reporter: Jon Haddad >Assignee: Preetika Tyagi >Priority: Minor > Labels: lhf > Fix For: 4.x > > Attachments: cassandra-13853-v6.patch, jira_13853_dtest_v2.patch > > > Additional information we should be displaying: > * Total node count > * List of datacenters, RF, with number of nodes per dc, how many are down, > * Version(s) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13459) Diag. Events: Native transport integration
[ https://issues.apache.org/jira/browse/CASSANDRA-13459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16434431#comment-16434431 ] Ariel Weisberg commented on CASSANDRA-13459: One of the differences between other events and diagnostic events is that other events are relatively infrequent, so we might have been fine until now with not implementing proper backpressure. I checked quickly and I didn't see what looked like backpressure for clients. Looking at some of the events like gossip or hints, it seems like it's going to be a steady stream all the time. Most messages sent to clients are responses to requests, and if the client can't communicate with the server it won't be sending new requests, so it's a self-limiting problem except for the less common cases where communication fails in one direction. I think that may be why we have gotten away without proper backpressure until now. It's also possible that there is a hidden bit of code somewhere that disables read on a client if we can't write. bq. We could specify a subscription mechanism for native transport that is not specific to diag events. But what would the subject look like to subscribe to? Looking at what you have now, there is no query language, correct? You subscribe to these events via the wire protocol, not the query language? Devil's advocate: we could have a flat namespace of events to subscribe to right now (which is how it seems to work?). I am just saying that for the wire protocol and internal implementation we should differentiate between subscription and debug/diagnostic. Those are two different concerns. > Diag. 
Events: Native transport integration > -- > > Key: CASSANDRA-13459 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13459 > Project: Cassandra > Issue Type: Sub-task > Components: CQL >Reporter: Stefan Podkowinski >Assignee: Stefan Podkowinski >Priority: Major > Labels: client-impacting > > Events should be consumable by clients that would receive subscribed events > from the connected node. This functionality is designed to work on top of > native transport with minor modifications to the protocol standard (see > [original > proposal|https://docs.google.com/document/d/1uEk7KYgxjNA0ybC9fOuegHTcK3Yi0hCQN5nTp5cNFyQ/edit?usp=sharing] > for further considered options). First we have to add another value for > existing event types. Also, we have to extend the protocol a bit to be able > to specify a sub-class and sub-type value. E.g. > {{DIAGNOSTIC_EVENT(GossiperEvent, MAJOR_STATE_CHANGE_HANDLED)}}. This still > has to be worked out and I'd appreciate any feedback. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
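As a purely hypothetical illustration of the slow-subscriber problem discussed above: one simple alternative to full backpressure is to bound each subscriber's queue and shed the oldest events. This is a sketch with invented names, not Cassandra's actual event code:

```java
import java.util.ArrayDeque;

// Hypothetical sketch: a bounded per-subscriber buffer that drops the oldest
// event rather than blocking the producer. This sheds load instead of
// applying true backpressure, but it keeps a slow client from growing an
// unbounded queue of diagnostic events.
class EventBuffer<E> {
    private final ArrayDeque<E> queue = new ArrayDeque<>();
    private final int capacity;
    private long dropped = 0;

    EventBuffer(int capacity) {
        this.capacity = capacity;
    }

    synchronized void publish(E event) {
        if (queue.size() == capacity) {
            queue.pollFirst(); // shed the oldest event
            dropped++;
        }
        queue.addLast(event);
    }

    synchronized E poll() {
        return queue.pollFirst();
    }

    synchronized long droppedCount() {
        return dropped;
    }
}
```

True backpressure would instead stop reading from (or stop notifying) the slow subscriber, as the comment speculates some part of the transport may already do for writes.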
[jira] [Updated] (CASSANDRA-14378) Simplify TableParams defaults
[ https://issues.apache.org/jira/browse/CASSANDRA-14378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko updated CASSANDRA-14378: -- Fix Version/s: (was: 4.x) 4.0 Status: Patch Available (was: Open) > Simplify TableParams defaults > - > > Key: CASSANDRA-14378 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14378 > Project: Cassandra > Issue Type: Improvement >Reporter: Aleksey Yeschenko >Assignee: Aleksey Yeschenko >Priority: Trivial > Fix For: 4.0 > > > There is a block of unnecessary constants - only used once - that only > introduce indirection and make the code harder to read. And almost introduce > a static initialization order issue. We can get rid of that. > A trivial change. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14378) Simplify TableParams defaults
[ https://issues.apache.org/jira/browse/CASSANDRA-14378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16434350#comment-16434350 ] Aleksey Yeschenko commented on CASSANDRA-14378: --- Code [here|https://github.com/iamaleksey/cassandra/commits/14378-4.0]. > Simplify TableParams defaults > - > > Key: CASSANDRA-14378 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14378 > Project: Cassandra > Issue Type: Improvement >Reporter: Aleksey Yeschenko >Assignee: Aleksey Yeschenko >Priority: Trivial > Fix For: 4.x > > > There is a block of unnecessary constants - only used once - that only > introduce indirection and make the code harder to read. And almost introduce > a static initialization order issue. We can get rid of that. > A trivial change. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-14378) Simplify TableParams defaults
Aleksey Yeschenko created CASSANDRA-14378: - Summary: Simplify TableParams defaults Key: CASSANDRA-14378 URL: https://issues.apache.org/jira/browse/CASSANDRA-14378 Project: Cassandra Issue Type: Improvement Reporter: Aleksey Yeschenko Assignee: Aleksey Yeschenko Fix For: 4.x There is a block of unnecessary constants - only used once - that only introduce indirection and make the code harder to read. And almost introduce a static initialization order issue. We can get rid of that. A trivial change. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
cassandra git commit: Ninja reorder static constants in TableParams to avoid uninitialized speculativeRetry (hypothetical)
Repository: cassandra Updated Branches: refs/heads/trunk a831b99f9 -> 60563f4e8 Ninja reorder static constants in TableParams to avoid uninitialized speculativeRetry (hypothetical) Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/60563f4e Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/60563f4e Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/60563f4e Branch: refs/heads/trunk Commit: 60563f4e8910fb59af141fd24f1fc1f98f34f705 Parents: a831b99 Author: Aleksey Yeshchenko Authored: Wed Apr 11 18:51:18 2018 +0100 Committer: Aleksey Yeshchenko Committed: Wed Apr 11 18:51:18 2018 +0100 -- src/java/org/apache/cassandra/schema/TableParams.java | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/60563f4e/src/java/org/apache/cassandra/schema/TableParams.java -- diff --git a/src/java/org/apache/cassandra/schema/TableParams.java b/src/java/org/apache/cassandra/schema/TableParams.java index 895e3a7..1489c81 100644 --- a/src/java/org/apache/cassandra/schema/TableParams.java +++ b/src/java/org/apache/cassandra/schema/TableParams.java @@ -34,8 +34,6 @@ import static java.lang.String.format; public final class TableParams { -public static final TableParams DEFAULT = TableParams.builder().build(); - public enum Option { BLOOM_FILTER_FP_CHANCE, @@ -73,6 +71,8 @@ public final class TableParams public static final double DEFAULT_CRC_CHECK_CHANCE = 1.0; public static final SpeculativeRetryPolicy DEFAULT_SPECULATIVE_RETRY = new PercentileSpeculativeRetryPolicy(99.0); +public static final TableParams DEFAULT = TableParams.builder().build(); + public final String comment; public final double readRepairChance; public final double dcLocalReadRepairChance; - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
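The hazard this ninja commit guards against can be reproduced in miniature. A hypothetical stand-alone example, not the real TableParams: Java runs static initializers in textual order, so a DEFAULT built before the constants it depends on silently captures their uninitialized values:

```java
// Hypothetical miniature of the TableParams ordering issue, not Cassandra code.
class Params {
    // Built while RETRY below is still null: static initializers run in
    // textual order, so this constructor reads an uninitialized field.
    static final Params DEFAULT_TOO_EARLY = new Params();

    // new String(...) keeps this from being a compile-time constant that
    // javac would inline; a real object like the commit's
    // PercentileSpeculativeRetryPolicy behaves the same way.
    static final String RETRY = new String("99PERCENTILE");

    // Declared after its dependency, like the reordered DEFAULT in the diff.
    static final Params DEFAULT_OK = new Params();

    final String retry;

    Params() {
        this.retry = RETRY;
    }
}

public class StaticOrderDemo {
    public static void main(String[] args) {
        System.out.println(Params.DEFAULT_TOO_EARLY.retry); // null
        System.out.println(Params.DEFAULT_OK.retry); // 99PERCENTILE
    }
}
```

Moving {{DEFAULT}} below the other constants, as the diff does, is the same fix as declaring {{DEFAULT_OK}} last here.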
[jira] [Commented] (CASSANDRA-14118) Refactor write path
[ https://issues.apache.org/jira/browse/CASSANDRA-14118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16434232#comment-16434232 ] Aleksey Yeschenko commented on CASSANDRA-14118: --- +1, this looks correct. I was expecting a bigger patch, with more things abstracted, which is partly the reason I procrastinated a little on this review. FWIW, I think this changeset makes sense in isolation, and I'm expecting that, eventually, the write path will be abstracted away more fully. As of these commits, there are still default-engine-specific things that live outside the abstracted-away path: - there is an implicit assumption that secondary indexes are supported and work in a certain way. In particular TableWriteHandler having UpdateTransaction as an argument - MV related code living outside of either handler With that in mind, again, I'm +1 on the patch, as a proto-abstraction, with the understanding that it's only the beginning. > Refactor write path > --- > > Key: CASSANDRA-14118 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14118 > Project: Cassandra > Issue Type: Sub-task > Components: Core >Reporter: Dikang Gu >Assignee: Blake Eggleston >Priority: Major > Fix For: 4.0 > > > As part of the pluggable storage engine effort, we'd like to modularize the > write path related code, make it to be independent from existing storage > engine implementation details. > For now, refer to > https://docs.google.com/document/d/1suZlvhzgB6NIyBNpM9nxoHxz_Ri7qAm-UEO8v8AIFsc > for high level designs. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-14118) Refactor write path
[ https://issues.apache.org/jira/browse/CASSANDRA-14118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko updated CASSANDRA-14118: -- Status: Ready to Commit (was: Patch Available)
[jira] [Updated] (CASSANDRA-14118) Refactor write path
[ https://issues.apache.org/jira/browse/CASSANDRA-14118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko updated CASSANDRA-14118: -- Fix Version/s: 4.0
[jira] [Updated] (CASSANDRA-13065) Skip building views during base table streams on range movements
[ https://issues.apache.org/jira/browse/CASSANDRA-13065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Hanna updated CASSANDRA-13065: - Component/s: Materialized Views > Skip building views during base table streams on range movements > > > Key: CASSANDRA-13065 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13065 > Project: Cassandra > Issue Type: Improvement > Components: Materialized Views >Reporter: Benjamin Roth >Assignee: Benjamin Roth >Priority: Critical > Fix For: 4.0 > > > Booting or decommissioning nodes with MVs is unbearably slow, as all streams go > through the regular write path. This causes read-before-writes for every > mutation, and during bootstrap it causes them to be sent to the batchlog. > This makes it virtually impossible to boot a new node in an acceptable amount > of time. > Using the regular streaming behaviour for consistent range movements works > much better in this case and does not break the MV local consistency contract. > Already tested on our own cluster. > The bootstrap case is easy to handle; the decommission case requires > CASSANDRA-13064
[jira] [Updated] (CASSANDRA-14260) Refactor pair to avoid boxing longs/ints
[ https://issues.apache.org/jira/browse/CASSANDRA-14260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Jirsa updated CASSANDRA-14260: --- Resolution: Fixed Reviewer: Dinesh Joshi Fix Version/s: (was: 4.x) 4.0 Status: Resolved (was: Ready to Commit) Ran a few dtest runs on the refactored branches [here|https://circleci.com/gh/jeffjirsa/cassandra/tree/pair-refactor]; they show CASSANDRA-14371 but nothing else concerning. Committed as [a831b99f9123d1c2bdfd70761aca3a05446c9a4c|https://github.com/apache/cassandra/commit/a831b99f9123d1c2bdfd70761aca3a05446c9a4c] > Refactor pair to avoid boxing longs/ints > > > Key: CASSANDRA-14260 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14260 > Project: Cassandra > Issue Type: Improvement >Reporter: Jeff Jirsa >Assignee: Jeff Jirsa >Priority: Minor > Fix For: 4.0 > > > We use Pair all over the place, and in many cases either or both of X and > Y are primitives (ints, longs), and we end up boxing them into Integers and > Longs. We should have specialized versions that take primitives.
cassandra git commit: Refactor Pair usage to avoid boxing ints/longs
Repository: cassandra Updated Branches: refs/heads/trunk 95a52a8bf -> a831b99f9 Refactor Pair usage to avoid boxing ints/longs Patch by Jeff Jirsa; Reviewed by Dinesh Joshi for CASSANDRA-14260 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a831b99f Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a831b99f Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a831b99f Branch: refs/heads/trunk Commit: a831b99f9123d1c2bdfd70761aca3a05446c9a4c Parents: 95a52a8 Author: Jeff Jirsa Authored: Wed Apr 11 08:24:40 2018 -0700 Committer: Jeff Jirsa Committed: Wed Apr 11 08:24:40 2018 -0700 -- CHANGES.txt | 4 +- .../apache/cassandra/db/ColumnFamilyStore.java | 8 +- .../org/apache/cassandra/db/Directories.java| 38 ++-- .../db/SnapshotDetailsTabularData.java | 6 +- .../db/repair/CassandraValidationIterator.java | 4 +- .../db/streaming/CassandraOutgoingFile.java | 5 +- .../db/streaming/CassandraStreamHeader.java | 30 +++ .../db/streaming/CassandraStreamManager.java| 2 +- .../db/streaming/CassandraStreamReader.java | 8 +- .../db/streaming/CassandraStreamWriter.java | 17 ++-- .../CompressedCassandraStreamReader.java| 8 +- .../CompressedCassandraStreamWriter.java| 25 +++--- .../cassandra/db/streaming/CompressionInfo.java | 4 +- .../io/compress/CompressionMetadata.java| 22 ++--- .../cassandra/io/sstable/SSTableLoader.java | 2 +- .../io/sstable/format/SSTableReader.java| 83 +++--- .../io/sstable/format/big/BigTableScanner.java | 3 +- .../org/apache/cassandra/net/MessageOut.java| 44 -- .../apache/cassandra/service/StorageProxy.java | 91 .../cassandra/service/StorageService.java | 2 +- .../cassandra/db/ColumnFamilyStoreTest.java | 3 +- .../apache/cassandra/db/DirectoriesTest.java| 9 +- .../cassandra/io/sstable/SSTableReaderTest.java | 17 ++-- .../io/sstable/SSTableRewriterTest.java | 11 ++- .../compression/CompressedInputStreamTest.java | 12 +-- 25 files changed, 310 insertions(+), 148 
deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/a831b99f/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 707ea6b..2dc2021 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,7 +1,7 @@ 4.0 + * Refactor Pair usage to avoid boxing ints/longs (CASSANDRA-14260) * Add options to nodetool tablestats to sort and limit output (CASSANDRA-13889) - * Rename internals to reflect CQL vocabulary - (CASSANDRA-14354) + * Rename internals to reflect CQL vocabulary (CASSANDRA-14354) * Add support for hybrid MIN(), MAX() speculative retry policies (CASSANDRA-14293, CASSANDRA-14338, CASSANDRA-14352) * Fix some regressions caused by 14058 (CASSANDRA-14353) http://git-wip-us.apache.org/repos/asf/cassandra/blob/a831b99f/src/java/org/apache/cassandra/db/ColumnFamilyStore.java -- diff --git a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java index 4c546dd..bfab6ea 100644 --- a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java +++ b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java @@ -1451,9 +1451,9 @@ public class ColumnFamilyStore implements ColumnFamilyStoreMBean Collection> ranges = StorageService.instance.getLocalRanges(keyspace.getName()); for (SSTableReader sstable : sstables) { -List> positions = sstable.getPositionsForRanges(ranges); -for (Pair position : positions) -expectedFileSize += position.right - position.left; +List positions = sstable.getPositionsForRanges(ranges); +for (SSTableReader.PartitionPositionBounds position : positions) +expectedFileSize += position.upperPosition - position.lowerPosition; } double compressionRatio = metric.compressionRatio.getValue(); @@ -1965,7 +1965,7 @@ public class ColumnFamilyStore implements ColumnFamilyStoreMBean * @return Return a map of all snapshots to space being used * The pair for a snapshot has true size and size on disk. 
*/ -public Map> getSnapshotDetails() +public Map getSnapshotDetails() { return getDirectories().getSnapshotDetails(); } http://git-wip-us.apache.org/repos/asf/cassandra/blob/a831b99f/src/java/org/apache/cassandra/db/Directories.java -- diff --git a/src/java/org/apache/cassandra/db/Directories.java b/src/java/org/apache/cassandra/db/Direc
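The diff above replaces Pair-of-boxed-Longs with dedicated primitive holders such as SSTableReader.PartitionPositionBounds. A condensed, hypothetical sketch of the pattern (field names mirror the diff; the size() helper is illustrative, not part of the actual class):

```java
// Instead of Pair<Long, Long>, which boxes both positions into Long objects,
// hold the positions as primitive longs. Mirrors the shape of
// SSTableReader.PartitionPositionBounds introduced by CASSANDRA-14260;
// size() is an illustrative helper added here.
public final class PartitionPositionBounds
{
    public final long lowerPosition;
    public final long upperPosition;

    public PartitionPositionBounds(long lowerPosition, long upperPosition)
    {
        this.lowerPosition = lowerPosition;
        this.upperPosition = upperPosition;
    }

    // Bytes covered by these bounds; pure primitive arithmetic, no unboxing.
    public long size()
    {
        return upperPosition - lowerPosition;
    }

    public static void main(String[] args)
    {
        PartitionPositionBounds bounds = new PartitionPositionBounds(1024L, 4096L);
        System.out.println(bounds.size()); // 3072
    }
}
```

Beyond avoiding two Long allocations per element on hot paths like getPositionsForRanges, the named fields also read better than Pair's generic left/right.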
[jira] [Comment Edited] (CASSANDRA-14239) OutOfMemoryError when bootstrapping with less than 100GB RAM
[ https://issues.apache.org/jira/browse/CASSANDRA-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16434042#comment-16434042 ] Sergey Kirillov edited comment on CASSANDRA-14239 at 4/11/18 3:11 PM: -- Ok. Now it is stuck at 5134 pending MemtableFlushWriter jobs. The number is not decreasing anymore. *UPD* While everything was blocked, the node had high CPU usage and was reading a lot from disk (which seems related to CASSANDRA-13065). After a while the number of pending memtable jobs decreased and mutations unblocked, but within a minute the node died again with OOM. was (Author: rushman): Ok. Now it stuck on 5134 pending MemtableFlushWriter jobs. Number is not decreasing anymore. > OutOfMemoryError when bootstrapping with less than 100GB RAM > > > Key: CASSANDRA-14239 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14239 > Project: Cassandra > Issue Type: Bug > Environment: Details of the bootstrapping Node > * ProLiant BL460c G7 > * 56GB RAM > * 2x 146GB 10K HDD (One dedicated for Commitlog, one for Data, Hints and > saved_caches) > * CentOS 7.4 on SD-Card > * /tmp and /var/log on tmpfs > * Oracle JDK 1.8.0_151 > * Cassandra 3.11.1 > Cluster > * 10 existing Nodes (Up and Normal) >Reporter: Jürgen Albersdorfer >Priority: Major > Labels: materializedviews > Attachments: Objects-by-class.csv, > Objects-with-biggest-retained-size.csv, Selection_420.png, Selection_421.png, > cassandra-env.sh, cassandra.yaml, dstat.png, gc.log.0.201804111524.zip, > gc.log.0.current.zip, gc.log.20180441.zip, jvm.options, jvm_opts.txt, > stack-traces.txt > > > Hi, I face an issue when bootstrapping a node having less than 100GB RAM on > our 10-node C* 3.11.1 cluster. > During bootstrap, when I watch the cassandra.log, I observe growth in the JVM > Heap Old Gen which is not significantly freed up any more. > I know that the JVM collects Old Gen only when really needed. I can see > collections, but there is always a remainder which seems to grow forever > without ever getting freed. > After the node successfully joined the cluster, I can remove the extra RAM I > gave it for bootstrapping without any further effect. > It feels like Cassandra does not forget a single byte streamed over > the network during bootstrapping, which would be a memory leak > and a major problem. > I was able to produce a HeapDumpOnOutOfMemoryError from a 56GB node (40 GB > assigned JVM Heap). YourKit Profiler shows a huge amount of memory allocated > for org.apache.cassandra.db.Memtable (22 GB), > org.apache.cassandra.db.rows.BufferCell (19 GB) and java.nio.HeapByteBuffer > (11 GB)
[jira] [Commented] (CASSANDRA-14239) OutOfMemoryError when bootstrapping with less than 100GB RAM
[ https://issues.apache.org/jira/browse/CASSANDRA-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16434047#comment-16434047 ] Paulo Motta commented on CASSANDRA-14239: - bq. Number of pending MemtableFlushWriter jobs is slowly decreasing, I'll try to wait till it decrease to zero, maybe this will unblock mutations. Perhaps you could try setting the system property {{-Dcassandra.repair.mutation_repair_rows_per_batch=1000}} (from the default 100) and see if this makes the pending queue decrease faster while keeping the GC sane. bq. Ok. Now it stuck on 5134 pending MemtableFlushWriter jobs. Number is not decreasing anymore. Can you attach a thread dump? You can generate it via jstack.
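For context, a {{-D}} property like the one suggested above is typically read through {{Integer.getInteger}}. This is an illustrative sketch only - the property name and default (100) come from the comment, but this is not the actual Cassandra code path:

```java
// Hypothetical sketch of consuming -Dcassandra.repair.mutation_repair_rows_per_batch.
// Integer.getInteger returns the parsed system property, or the supplied
// default when the property is unset.
public final class RepairBatchSize
{
    static final int DEFAULT_ROWS_PER_BATCH = 100;

    static int rowsPerBatch()
    {
        return Integer.getInteger("cassandra.repair.mutation_repair_rows_per_batch",
                                  DEFAULT_ROWS_PER_BATCH);
    }

    public static void main(String[] args)
    {
        System.out.println(rowsPerBatch()); // 100 unless -D... is passed on the command line

        // Equivalent of launching the JVM with the flag Paulo suggests:
        System.setProperty("cassandra.repair.mutation_repair_rows_per_batch", "1000");
        System.out.println(rowsPerBatch()); // 1000
    }
}
```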
[jira] [Updated] (CASSANDRA-13065) Skip building views during base table streams on range movements
[ https://issues.apache.org/jira/browse/CASSANDRA-13065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Kirillov updated CASSANDRA-13065: Attachment: Selection_423.png
[jira] [Updated] (CASSANDRA-14239) OutOfMemoryError when bootstrapping with less than 100GB RAM
[ https://issues.apache.org/jira/browse/CASSANDRA-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Kirillov updated CASSANDRA-14239: Attachment: dstat.png
[jira] [Updated] (CASSANDRA-13065) Skip building views during base table streams on range movements
[ https://issues.apache.org/jira/browse/CASSANDRA-13065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Kirillov updated CASSANDRA-13065: Attachment: (was: Selection_423.png)
[jira] [Updated] (CASSANDRA-14239) OutOfMemoryError when bootstrapping with less than 100GB RAM
[ https://issues.apache.org/jira/browse/CASSANDRA-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Kirillov updated CASSANDRA-14239: Attachment: Selection_421.png
[jira] [Commented] (CASSANDRA-14239) OutOfMemoryError when bootstrapping with less than 100GB RAM
[ https://issues.apache.org/jira/browse/CASSANDRA-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16434042#comment-16434042 ] Sergey Kirillov commented on CASSANDRA-14239: - Ok. Now it is stuck at 5134 pending MemtableFlushWriter jobs. The number is not decreasing anymore.
[jira] [Updated] (CASSANDRA-14239) OutOfMemoryError when bootstrapping with less than 100GB RAM
[ https://issues.apache.org/jira/browse/CASSANDRA-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Kirillov updated CASSANDRA-14239: Attachment: Selection_420.png
[jira] [Commented] (CASSANDRA-14239) OutOfMemoryError when bootstrapping with less than 100GB RAM
[ https://issues.apache.org/jira/browse/CASSANDRA-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16434030#comment-16434030 ] Sergey Kirillov commented on CASSANDRA-14239: - [~pauloricardomg] I've done a quick-and-dirty backport of CASSANDRA-13299 to 3.10 (which I'm using right now); so far there is no OOM, but the node is still stuck in MutationStage. The number of pending MemtableFlushWriter jobs is slowly decreasing; I'll try to wait until it decreases to zero - maybe this will unblock mutations.
[jira] [Updated] (CASSANDRA-14239) OutOfMemoryError when bootstrapping with less than 100GB RAM
[ https://issues.apache.org/jira/browse/CASSANDRA-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paulo Motta updated CASSANDRA-14239: Labels: materializedviews (was: )
[jira] [Commented] (CASSANDRA-14239) OutOfMemoryError when bootstrapping with less than 100GB RAM
[ https://issues.apache.org/jira/browse/CASSANDRA-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433974#comment-16433974 ]

Jürgen Albersdorfer commented on CASSANDRA-14239:
-------------------------------------------------
Thanks for your confirmation [~pauloricardomg], I thought I was hunting ghosts here.
[jira] [Commented] (CASSANDRA-14239) OutOfMemoryError when bootstrapping with less than 100GB RAM
[ https://issues.apache.org/jira/browse/CASSANDRA-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433933#comment-16433933 ]

Paulo Motta commented on CASSANDRA-14239:
-----------------------------------------
{quote}Removing MVs is not easy in my case, but now at least I know that it is worth the effort.{quote}

We've made a few improvements to MV bootstrap performance in CASSANDRA-13299 and CASSANDRA-13065, but unfortunately these are only available in 4.0. The OOMs during bootstrap reported here could probably benefit from CASSANDRA-13299, so I think it's worth backporting it to 3.11, which shouldn't be very hard. If you (or anyone else) feel adventurous and are willing to try a backport, [~rushman], I'm happy to review it.
[jira] [Commented] (CASSANDRA-14239) OutOfMemoryError when bootstrapping with less than 100GB RAM
[ https://issues.apache.org/jira/browse/CASSANDRA-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433921#comment-16433921 ]

Sergey Kirillov commented on CASSANDRA-14239:
---------------------------------------------
[~jalbersdorfer] so I was right, it is related to MV updates. It is really helpful to know this. Removing MVs is not easy in my case, but now at least I know that it is worth the effort.
[jira] [Commented] (CASSANDRA-14239) OutOfMemoryError when bootstrapping with less than 100GB RAM
[ https://issues.apache.org/jira/browse/CASSANDRA-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433914#comment-16433914 ]

Jürgen Albersdorfer commented on CASSANDRA-14239:
-------------------------------------------------
Heap management looked great this time, too. See [^gc.log.0.201804111524.zip] at [http://gceasy.io] - the last big reclaim was triggered manually via JMX after the join had completed successfully.
[jira] [Updated] (CASSANDRA-14239) OutOfMemoryError when bootstrapping with less than 100GB RAM
[ https://issues.apache.org/jira/browse/CASSANDRA-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jürgen Albersdorfer updated CASSANDRA-14239:
--------------------------------------------
    Attachment: gc.log.0.201804111524.zip
[jira] [Commented] (CASSANDRA-14239) OutOfMemoryError when bootstrapping with less than 100GB RAM
[ https://issues.apache.org/jira/browse/CASSANDRA-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433908#comment-16433908 ]

Jürgen Albersdorfer commented on CASSANDRA-14239:
-------------------------------------------------
[~rushman]: I dropped my MATERIALIZED VIEW, did the join again, worked perfectly!
[jira] [Commented] (CASSANDRA-13971) Automatic certificate management using Vault
[ https://issues.apache.org/jira/browse/CASSANDRA-13971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433902#comment-16433902 ]

Stefan Podkowinski commented on CASSANDRA-13971:
------------------------------------------------
Any update on the review status, [~jasobrown], now that the 4.0 window is about to close?

> Automatic certificate management using Vault
> --------------------------------------------
>
>                 Key: CASSANDRA-13971
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13971
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Streaming and Messaging
>            Reporter: Stefan Podkowinski
>            Assignee: Stefan Podkowinski
>            Priority: Major
>              Labels: security
>             Fix For: 4.x
>
> We've been adding security features over the last years to enable users to secure their clusters, if they are willing to use them and do so correctly. Some features are powerful and easy to work with, such as role-based authorization. Other features that require managing a local keystore are rather painful to deal with - think about setting up SSL.
> To be fair, keystore-related issues and certificate handling weren't invented by us; we're just following Java standards there. But that doesn't mean we absolutely have to, if there are better options. I'd like to give it a shot and find out whether we can automate certificate/key handling (PKI) by using external APIs. In this case, the implementation will be based on [Vault|https://vaultproject.io]. But certificate management services offered by cloud providers may also be able to handle the use case, and I intend to create a generic, pluggable API for that.
[jira] [Created] (CASSANDRA-14377) Returning invalid JSON for NaN and Infinity float values
Piotr Sarna created CASSANDRA-14377:
---------------------------------------

             Summary: Returning invalid JSON for NaN and Infinity float values
                 Key: CASSANDRA-14377
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14377
             Project: Cassandra
          Issue Type: Bug
          Components: CQL
            Reporter: Piotr Sarna

After inserting special float values like NaN and Infinity into a table:

{{CREATE TABLE testme (t1 bigint, t2 float, t3 float, PRIMARY KEY (t1));}}
{{INSERT INTO testme (t1, t2, t3) VALUES (7, NaN, Infinity);}}

and returning them as JSON...

{{cqlsh:demodb> select json * from testme;}}
{{ [json]}}
{{--------------------------------------}}
{{ {"t1": 7, "t2": NaN, "t3": Infinity} }}

... the result will not validate (e.g. with [https://jsonlint.com/]), because neither NaN nor Infinity is a valid JSON value. The consensus seems to be to return JSON's {{null}} in these cases, based on [https://stackoverflow.com/questions/1423081/json-left-out-infinity-and-nan-json-status-in-ecmascript] and other similar articles.
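As a minimal sketch of the behavior the ticket proposes (mapping non-finite floats to JSON null before serialization), here is a standalone Python illustration. The helper name {{to_json_value}} is hypothetical and not part of the Cassandra codebase:

```python
import json
import math

def to_json_value(x):
    """Map NaN/Infinity to None (JSON null), per the fix the ticket proposes.

    Hypothetical helper for illustration only.
    """
    if isinstance(x, float) and not math.isfinite(x):
        return None
    return x

# The row from the ticket's example: t2 = NaN, t3 = Infinity
row = {"t1": 7, "t2": float("nan"), "t3": float("inf")}
sanitized = {k: to_json_value(v) for k, v in row.items()}

# allow_nan=False makes json.dumps reject NaN/Infinity outright,
# so sanitizing first is required to emit strictly valid JSON.
text = json.dumps(sanitized, allow_nan=False)
print(text)  # {"t1": 7, "t2": null, "t3": null}
```

Note that Python's default {{json.dumps}} (with {{allow_nan=True}}) would emit the same invalid {{NaN}}/{{Infinity}} tokens the ticket complains about, which is why the sanitizing step comes first.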
[jira] [Updated] (CASSANDRA-14310) Don't allow nodetool refresh before cfs is opened
[ https://issues.apache.org/jira/browse/CASSANDRA-14310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcus Eriksson updated CASSANDRA-14310:
----------------------------------------
       Resolution: Fixed
    Fix Version/s: (was: 3.11.x)
                   (was: 4.x)
                   (was: 3.0.x)
                   3.11.3
                   3.0.17
                   4.0
           Status: Resolved  (was: Patch Available)

Committed as {{22bb413ba29aa6a95034b7dac833a8273983fa42}} and merged up, thanks!

> Don't allow nodetool refresh before cfs is opened
> -------------------------------------------------
>
>                 Key: CASSANDRA-14310
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14310
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Marcus Eriksson
>            Assignee: Marcus Eriksson
>            Priority: Major
>             Fix For: 4.0, 3.0.17, 3.11.3
>
> There is a potential deadlock during startup if nodetool refresh is called while sstables are being opened. We should not allow refresh to be called before everything is initialized.
cassandra-dtest git commit: Make sure we don't deadlock on nodetool refresh
Repository: cassandra-dtest
Updated Branches:
  refs/heads/master 9c2eb35a8 -> 95735a4d0

Make sure we don't deadlock on nodetool refresh

Patch by marcuse; reviewed by Jordan West for CASSANDRA-14310

Project: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/commit/95735a4d
Tree: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/tree/95735a4d
Diff: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/diff/95735a4d

Branch: refs/heads/master
Commit: 95735a4d0049249acc7de23465d89e07792c3de6
Parents: 9c2eb35
Author: Marcus Eriksson
Authored: Mon Apr 9 13:56:28 2018 +0200
Committer: Marcus Eriksson
Committed: Wed Apr 11 15:03:35 2018 +0200
----------------------------------------------------------------------
 byteman/sstable_open_delay.btm | 11 +++++++++++
 refresh_test.py                | 38 ++++++++++++++++++++++++++++++++++
 2 files changed, 49 insertions(+)
----------------------------------------------------------------------

diff --git a/byteman/sstable_open_delay.btm b/byteman/sstable_open_delay.btm
new file mode 100644
index 000..d31c2d0
--- /dev/null
+++ b/byteman/sstable_open_delay.btm
@@ -0,0 +1,11 @@
+#
+# Make sstable opening on startup slower
+#
+RULE slow startup sstable opening
+CLASS org.apache.cassandra.io.sstable.format.big.BigFormat$ReaderFactory
+METHOD open
+AT ENTRY
+IF TRUE
+DO
+Thread.sleep(1);
+ENDRULE

diff --git a/refresh_test.py b/refresh_test.py
new file mode 100644
index 000..4177ec8
--- /dev/null
+++ b/refresh_test.py
@@ -0,0 +1,38 @@
+import time
+
+from dtest import Tester
+from ccmlib.node import ToolError
+import pytest
+
+since = pytest.mark.since
+
+
+@since('3.0')
+class TestRefresh(Tester):
+    def test_refresh_deadlock_startup(self):
+        """ Test refresh deadlock during startup (CASSANDRA-14310) """
+        self.cluster.populate(1)
+        node = self.cluster.nodelist()[0]
+        node.byteman_port = '8100'
+        node.import_config_files()
+        self.cluster.start(wait_other_notice=True)
+        session = self.patient_cql_connection(node)
+        session.execute("CREATE KEYSPACE ks WITH replication = {'class':'SimpleStrategy', 'replication_factor':1}")
+        session.execute("CREATE TABLE ks.a (id int primary key, d text)")
+        session.execute("CREATE TABLE ks.b (id int primary key, d text)")
+        node.nodetool("disableautocompaction")  # make sure we have more than 1 sstable
+        for x in range(0, 10):
+            session.execute("INSERT INTO ks.a (id, d) VALUES (%d, '%d %d')" % (x, x, x))
+            session.execute("INSERT INTO ks.b (id, d) VALUES (%d, '%d %d')" % (x, x, x))
+        node.flush()
+        node.stop()
+        node.update_startup_byteman_script('byteman/sstable_open_delay.btm')
+        node.start()
+        node.watch_log_for("opening keyspace ks", filename="debug.log")
+        time.sleep(5)
+        for x in range(0, 20):
+            try:
+                node.nodetool("refresh ks a")
+                node.nodetool("refresh ks b")
+            except ToolError:
+                pass  # this is OK post-14310 - we just don't want to hang forever
+            time.sleep(1)
[3/6] cassandra git commit: Avoid deadlock when running nodetool refresh before node is fully up
Avoid deadlock when running nodetool refresh before node is fully up

Patch by marcuse; reviewed by Jordan West for CASSANDRA-14310

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/22bb413b
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/22bb413b
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/22bb413b

Branch: refs/heads/trunk
Commit: 22bb413ba29aa6a95034b7dac833a8273983fa42
Parents: edcb90f
Author: Marcus Eriksson
Authored: Tue Mar 13 08:45:30 2018 +0100
Committer: Marcus Eriksson
Committed: Wed Apr 11 14:47:05 2018 +0200
----------------------------------------------------------------------
 CHANGES.txt                                               | 1 +
 src/java/org/apache/cassandra/db/ColumnFamilyStore.java   | 2 +-
 src/java/org/apache/cassandra/service/StorageService.java | 2 ++
 3 files changed, 4 insertions(+), 1 deletion(-)
----------------------------------------------------------------------

diff --git a/CHANGES.txt b/CHANGES.txt
index 94b2276..9012f8c 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.0.17
+ * Avoid deadlock when running nodetool refresh before node is fully up (CASSANDRA-14310)
  * Handle all exceptions when opening sstables (CASSANDRA-14202)
  * Handle incompletely written hint descriptors during startup (CASSANDRA-14080)
  * Handle repeat open bound from SRP in read repair (CASSANDRA-14330)

diff --git a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
index 14e06b0..4c7bc46 100644
--- a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
+++ b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
@@ -653,7 +653,7 @@ public class ColumnFamilyStore implements ColumnFamilyStoreMBean
      * @param ksName The keyspace name
      * @param cfName The columnFamily name
      */
-    public static synchronized void loadNewSSTables(String ksName, String cfName)
+    public static void loadNewSSTables(String ksName, String cfName)
     {
         /** ks/cf existence checks will be done by open and getCFS methods for us */
         Keyspace keyspace = Keyspace.open(ksName);

diff --git a/src/java/org/apache/cassandra/service/StorageService.java b/src/java/org/apache/cassandra/service/StorageService.java
index cf8e257..77fcb81 100644
--- a/src/java/org/apache/cassandra/service/StorageService.java
+++ b/src/java/org/apache/cassandra/service/StorageService.java
@@ -4597,6 +4597,8 @@ public class StorageService extends NotificationBroadcasterSupport implements IE
      */
     public void loadNewSSTables(String ksName, String cfName)
     {
+        if (!isInitialized())
+            throw new RuntimeException("Not yet initialized, can't load new sstables");
         ColumnFamilyStore.loadNewSSTables(ksName, cfName);
     }
[2/6] cassandra git commit: Avoid deadlock when running nodetool refresh before node is fully up
Avoid deadlock when running nodetool refresh before node is fully up

Patch by marcuse; reviewed by Jordan West for CASSANDRA-14310

Branch: refs/heads/cassandra-3.11
Commit: 22bb413ba29aa6a95034b7dac833a8273983fa42
Parents: edcb90f
Author: Marcus Eriksson
Authored: Tue Mar 13 08:45:30 2018 +0100
Committer: Marcus Eriksson
Committed: Wed Apr 11 14:47:05 2018 +0200
[6/6] cassandra git commit: Merge branch 'cassandra-3.11' into trunk
Merge branch 'cassandra-3.11' into trunk

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/95a52a8b
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/95a52a8b
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/95a52a8b

Branch: refs/heads/trunk
Commit: 95a52a8bfbabb4acb3518ee7f5e6d256110d7bf0
Parents: 42827e6 75a9320
Author: Marcus Eriksson
Authored: Wed Apr 11 14:55:10 2018 +0200
Committer: Marcus Eriksson
Committed: Wed Apr 11 14:55:10 2018 +0200
----------------------------------------------------------------------
 CHANGES.txt                                               | 1 +
 src/java/org/apache/cassandra/db/ColumnFamilyStore.java   | 2 +-
 src/java/org/apache/cassandra/service/StorageService.java | 2 ++
 3 files changed, 4 insertions(+), 1 deletion(-)
----------------------------------------------------------------------
[1/6] cassandra git commit: Avoid deadlock when running nodetool refresh before node is fully up
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-3.0 edcb90f08 -> 22bb413ba
  refs/heads/cassandra-3.11 19e329eb5 -> 75a932087
  refs/heads/trunk 42827e6a6 -> 95a52a8bf


Avoid deadlock when running nodetool refresh before node is fully up

Patch by marcuse; reviewed by Jordan West for CASSANDRA-14310


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/22bb413b
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/22bb413b
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/22bb413b

Branch: refs/heads/cassandra-3.0
Commit: 22bb413ba29aa6a95034b7dac833a8273983fa42
Parents: edcb90f
Author: Marcus Eriksson
Authored: Tue Mar 13 08:45:30 2018 +0100
Committer: Marcus Eriksson
Committed: Wed Apr 11 14:47:05 2018 +0200

----------------------------------------------------------------------
 CHANGES.txt                                               | 1 +
 src/java/org/apache/cassandra/db/ColumnFamilyStore.java   | 2 +-
 src/java/org/apache/cassandra/service/StorageService.java | 2 ++
 3 files changed, 4 insertions(+), 1 deletion(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/22bb413b/CHANGES.txt
----------------------------------------------------------------------
diff --git a/CHANGES.txt b/CHANGES.txt
index 94b2276..9012f8c 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.0.17
+ * Avoid deadlock when running nodetool refresh before node is fully up (CASSANDRA-14310)
  * Handle all exceptions when opening sstables (CASSANDRA-14202)
  * Handle incompletely written hint descriptors during startup (CASSANDRA-14080)
  * Handle repeat open bound from SRP in read repair (CASSANDRA-14330)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/22bb413b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
index 14e06b0..4c7bc46 100644
--- a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
+++ b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
@@ -653,7 +653,7 @@ public class ColumnFamilyStore implements ColumnFamilyStoreMBean
      * @param ksName The keyspace name
      * @param cfName The columnFamily name
      */
-    public static synchronized void loadNewSSTables(String ksName, String cfName)
+    public static void loadNewSSTables(String ksName, String cfName)
     {
         /** ks/cf existence checks will be done by open and getCFS methods for us */
         Keyspace keyspace = Keyspace.open(ksName);

http://git-wip-us.apache.org/repos/asf/cassandra/blob/22bb413b/src/java/org/apache/cassandra/service/StorageService.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/service/StorageService.java b/src/java/org/apache/cassandra/service/StorageService.java
index cf8e257..77fcb81 100644
--- a/src/java/org/apache/cassandra/service/StorageService.java
+++ b/src/java/org/apache/cassandra/service/StorageService.java
@@ -4597,6 +4597,8 @@ public class StorageService extends NotificationBroadcasterSupport implements IE
      */
     public void loadNewSSTables(String ksName, String cfName)
     {
+        if (!isInitialized())
+            throw new RuntimeException("Not yet initialized, can't load new sstables");
         ColumnFamilyStore.loadNewSSTables(ksName, cfName);
     }

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
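The StorageService change above is a plain fail-fast guard: loads requested over JMX before the node has finished initializing are rejected up front rather than left to block against startup. A minimal, self-contained sketch of the same pattern follows; the class, the string result, and the method bodies are illustrative stand-ins, not Cassandra's actual implementation (in Cassandra the check delegates to {{StorageService.isInitialized()}}).

```java
// Sketch of the fail-fast pattern used by the patch: operations that can
// arrive "early" (e.g. via JMX) check an initialized flag and throw instead
// of blocking on state that startup has not yet set up.
public class StartupGuardDemo
{
    // volatile so the flag set by the startup thread is visible to JMX threads
    private volatile boolean initialized = false;

    public void finishStartup()
    {
        initialized = true;
    }

    public String loadNewSSTables(String ksName, String cfName)
    {
        if (!initialized)
            throw new RuntimeException("Not yet initialized, can't load new sstables");
        // ... real work would open the keyspace and scan for new sstables ...
        return "loaded " + ksName + "." + cfName;
    }

    public static void main(String[] args)
    {
        StartupGuardDemo svc = new StartupGuardDemo();
        try
        {
            svc.loadNewSSTables("ks", "cf");
        }
        catch (RuntimeException e)
        {
            System.out.println("rejected: " + e.getMessage());
        }
        svc.finishStartup();
        System.out.println(svc.loadNewSSTables("ks", "cf"));
    }
}
```

The point of throwing instead of waiting is that the JMX caller gets an immediate, explanatory error, and no lock ordering between the management thread and the startup sequence can deadlock.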
[4/6] cassandra git commit: Merge branch 'cassandra-3.0' into cassandra-3.11
Merge branch 'cassandra-3.0' into cassandra-3.11


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/75a93208
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/75a93208
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/75a93208

Branch: refs/heads/trunk
Commit: 75a932087d027a569f37b1e3c1047aaff107549e
Parents: 19e329e 22bb413
Author: Marcus Eriksson
Authored: Wed Apr 11 14:47:47 2018 +0200
Committer: Marcus Eriksson
Committed: Wed Apr 11 14:47:47 2018 +0200

----------------------------------------------------------------------
 CHANGES.txt                                               | 1 +
 src/java/org/apache/cassandra/db/ColumnFamilyStore.java   | 2 +-
 src/java/org/apache/cassandra/service/StorageService.java | 2 ++
 3 files changed, 4 insertions(+), 1 deletion(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/75a93208/CHANGES.txt
----------------------------------------------------------------------
diff --cc CHANGES.txt
index e0145d4,9012f8c..e55ae28
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,13 -1,5 +1,14 @@@
-3.0.17
+3.11.3
+ * Downgrade log level to trace for CommitLogSegmentManager (CASSANDRA-14370)
+ * CQL fromJson(null) throws NullPointerException (CASSANDRA-13891)
+ * Serialize empty buffer as empty string for json output format (CASSANDRA-14245)
+ * Allow logging implementation to be interchanged for embedded testing (CASSANDRA-13396)
+ * SASI tokenizer for simple delimiter based entries (CASSANDRA-14247)
+ * Fix Loss of digits when doing CAST from varint/bigint to decimal (CASSANDRA-14170)
+ * RateBasedBackPressure unnecessarily invokes a lock on the Guava RateLimiter (CASSANDRA-14163)
+ * Fix wildcard GROUP BY queries (CASSANDRA-14209)
+Merged from 3.0:
+ * Avoid deadlock when running nodetool refresh before node is fully up (CASSANDRA-14310)
  * Handle all exceptions when opening sstables (CASSANDRA-14202)
  * Handle incompletely written hint descriptors during startup (CASSANDRA-14080)
  * Handle repeat open bound from SRP in read repair (CASSANDRA-14330)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/75a93208/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/75a93208/src/java/org/apache/cassandra/service/StorageService.java
----------------------------------------------------------------------

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-6719) redesign loadnewsstables
[ https://issues.apache.org/jira/browse/CASSANDRA-6719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433839#comment-16433839 ]

Marcus Eriksson edited comment on CASSANDRA-6719 at 4/11/18 12:33 PM:
----------------------------------------------------------------------

pushed a commit with the comments addressed [here|https://github.com/krummas/cassandra/commits/marcuse/6719]

bq. FSUtils.handleCorruptSSTable/handleFSError are no longer called

this is on purpose and we should probably fix this in 3.0+ as well - I don't think we want to trigger the disk failure policy if import fails - instead we should abort the import. If someone has configured {{disk_failure_policy: stop_paranoid}}, trying to load a corrupt file would actually stop the node

bq. Row cache invalidation was not previously performed — this is a good thing regardless, so maybe skip an option for this one.

added an option to explicitly skip the row cache invalidation - also added a {{--quick}} option which makes it behave more like the old version

bq. If using nodetool refresh with JBOD, the counting keys per boundary work is done just to throw it away.

added a check if there is only a single data directory (and row cache is not enabled)

bq. Minor/naming nit: consider renaming CFS#loadSSTables’s dirPath -> srcPath and findBestDiskAndInvalidCache’s path -> srcPath

fixed

bq. Minor/usability nit: I couldn’t find many cases where @Option(required=true) is used. WDYT about moving the path to a positional argument since its required and this command does not take a variable number of positional args?

makes sense, made it {{nodetool import }}

bq. Minor/usability nit: Instead of noVerify=true,noVerifyTokens=false being an invalid state, make noVerify=true imply noVerifyTokens=true.

yup, makes sense

bq. The JavaDoc for CFS.loadNewSSTables should be updated to point to the new StorageService.loadSSTables.
bq. The comment on CFS#L861 is useful but out of place.
bq. Minor/naming nit: The naming of the “allKeys” variable in ImportTest#testImportInvalidateCache is misleading.
bq. Instead of hardcoding token values what about using e.g. t.compareTo(mock.getDiskBoundaries().positions.get(0).getToken()) <= 0?
bq. Are you intentionally leaving the Random seed hardcoded?

fixed


was (Author: krummas):

pushed a commit with the comments addressed [here|https://github.com/krummas/cassandra/commits/marcuse/6719]

bq. FSUtils.handleCorruptSSTable/handleFSError are no longer called

this is on purpose and we should probably fix this in 3.0+ as well - I don't think we want to trigger the disk failure policy if import fails - instead we should abort the import. If someone has configured {{disk_failure_policy: die_paranoid}} trying to load a corrupt file would actually stop the node

bq. Row cache invalidation was not previously performed — this is a good thing regardless, so maybe skip an option for this one.

added an option to explicitly skip the row cache invalidation - also added a {{--quick}} option which makes it behave more like the old version

bq. If using nodetool refresh with JBOD, the counting keys per boundary work is done just to throw it away.

added a check if there is only a single data directory (and row cache is not enabled)

bq. Minor/naming nit: consider renaming CFS#loadSSTables’s dirPath -> srcPath and findBestDiskAndInvalidCache’s path -> srcPath

fixed

bq. Minor/usability nit: I couldn’t find many cases where @Option(required=true) is used. WDYT about moving the path to a positional argument since its required and this command does not take a variable number of positional args?

makes sense, made it {{nodetool import }}

bq. Minor/usability nit: Instead of noVerify=true,noVerifyTokens=false being an invalid state, make noVerify=true imply noVerifyTokens=true.

yup, makes sense

bq. The JavaDoc for CFS.loadNewSSTables should be updated to point to the new StorageService.loadSSTables.
bq. The comment on CFS#L861 is useful but out of place.
bq. Minor/naming nit: The naming of the “allKeys” variable in ImportTest#testImportInvalidateCache is misleading.
bq. Instead of hardcoding token values what about using e.g. t.compareTo(mock.getDiskBoundaries().positions.get(0).getToken()) <= 0?
bq. Are you intentionally leaving the Random seed hardcoded?

fixed

> redesign loadnewsstables
> ------------------------
>
>                 Key: CASSANDRA-6719
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6719
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Tools
>            Reporter: Jonathan Ellis
>            Assignee: Marcus Eriksson
>            Priority: Minor
>              Labels: lhf
>             Fix For: 4.x
>
>         Attachments: 6719.patch
>
>
> CFSMBean.loadNewSSTables scans data directories for new sstables dropped
> there by an external agent. This is dangerous because of possible filename
> conflicts with existing or newly generated sstables.
> Instead, we should support leaving the new sstables in a separate directory
> (specified by a parameter, or configured as a new location in yaml) and take
> care of renaming as necessary automagically.
[jira] [Commented] (CASSANDRA-6719) redesign loadnewsstables
[ https://issues.apache.org/jira/browse/CASSANDRA-6719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433839#comment-16433839 ]

Marcus Eriksson commented on CASSANDRA-6719:
--------------------------------------------

pushed a commit with the comments addressed [here|https://github.com/krummas/cassandra/commits/marcuse/6719]

bq. FSUtils.handleCorruptSSTable/handleFSError are no longer called

this is on purpose and we should probably fix this in 3.0+ as well - I don't think we want to trigger the disk failure policy if import fails - instead we should abort the import. If someone has configured {{disk_failure_policy: die_paranoid}} trying to load a corrupt file would actually stop the node

bq. Row cache invalidation was not previously performed — this is a good thing regardless, so maybe skip an option for this one.

added an option to explicitly skip the row cache invalidation - also added a {{--quick}} option which makes it behave more like the old version

bq. If using nodetool refresh with JBOD, the counting keys per boundary work is done just to throw it away.

added a check if there is only a single data directory (and row cache is not enabled)

bq. Minor/naming nit: consider renaming CFS#loadSSTables’s dirPath -> srcPath and findBestDiskAndInvalidCache’s path -> srcPath

fixed

bq. Minor/usability nit: I couldn’t find many cases where @Option(required=true) is used. WDYT about moving the path to a positional argument since its required and this command does not take a variable number of positional args?

makes sense, made it {{nodetool import }}

bq. Minor/usability nit: Instead of noVerify=true,noVerifyTokens=false being an invalid state, make noVerify=true imply noVerifyTokens=true.

yup, makes sense

bq. The JavaDoc for CFS.loadNewSSTables should be updated to point to the new StorageService.loadSSTables.
bq. The comment on CFS#L861 is useful but out of place.
bq. Minor/naming nit: The naming of the “allKeys” variable in ImportTest#testImportInvalidateCache is misleading.
bq. Instead of hardcoding token values what about using e.g. t.compareTo(mock.getDiskBoundaries().positions.get(0).getToken()) <= 0?
bq. Are you intentionally leaving the Random seed hardcoded?

fixed

> redesign loadnewsstables
> ------------------------
>
>                 Key: CASSANDRA-6719
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6719
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Tools
>            Reporter: Jonathan Ellis
>            Assignee: Marcus Eriksson
>            Priority: Minor
>              Labels: lhf
>             Fix For: 4.x
>
>         Attachments: 6719.patch
>
>
> CFSMBean.loadNewSSTables scans data directories for new sstables dropped
> there by an external agent. This is dangerous because of possible filename
> conflicts with existing or newly generated sstables.
> Instead, we should support leaving the new sstables in a separate directory
> (specified by a parameter, or configured as a new location in yaml) and take
> care of renaming as necessary automagically.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
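One of the review points above — making {{noVerify=true}} imply {{noVerifyTokens=true}} instead of rejecting the combination as invalid — is a small option-normalization rule. A hypothetical sketch of that rule follows; the class and field names are invented for illustration and are not the actual nodetool import implementation:

```java
// Hypothetical sketch of normalizing dependent boolean flags so that
// disabling verification as a whole also disables the token sub-check,
// instead of treating noVerify=true,noVerifyTokens=false as an error.
public class ImportOptions
{
    final boolean verifySSTables;
    final boolean verifyTokens;

    ImportOptions(boolean noVerify, boolean noVerifyTokens)
    {
        // Verify files unless the user disabled verification entirely.
        this.verifySSTables = !noVerify;
        // noVerify=true implies noVerifyTokens=true: token ownership is only
        // checked as part of verification, so it is skipped along with it.
        this.verifyTokens = !noVerify && !noVerifyTokens;
    }

    public static void main(String[] args)
    {
        // The previously "invalid" combination now simply normalizes:
        ImportOptions opts = new ImportOptions(true, false);
        System.out.println("verifySSTables=" + opts.verifySSTables
                           + " verifyTokens=" + opts.verifyTokens);
        // → verifySSTables=false verifyTokens=false
    }
}
```

Normalizing in the constructor keeps the rest of the import path free of special cases: downstream code only ever consults the two derived booleans.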
[jira] [Commented] (CASSANDRA-13459) Diag. Events: Native transport integration
[ https://issues.apache.org/jira/browse/CASSANDRA-13459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433832#comment-16433832 ]

Stefan Podkowinski commented on CASSANDRA-13459:
------------------------------------------------

{quote}So I was just thinking that forward looking restricting this mechanism to diagnostic events might not make sense. I was thinking a more generic subscription mechanism where diagnostic are events is a subset of what clients can conditionally subscribe to means we don't end up with naming issues in the future.{quote}

We could specify a subscription mechanism for native transport that is not specific to diag events. But what would the subject look like to subscribe to? If we want to support more powerful publish/subscribe semantics, we'd have to allow clients to specify event matchers in a more generic way, e.g. by using some kind of query language. Examples:

All auditing events for "ks" keyspace updates: {{SUBSCRIBE diag_events "event=AuditEvent#UPDATE(keyspace=ks)"}}

Full query replication for ks.table1: {{SUBSCRIBE full_query_log "keyspace=ks,table=table1"}}

Subscribe to any row updates matching a query: {{SUBSCRIBE cdc "select * from order_status where order_id = 1"}}

Not a big fan of language-in-language, but I'm open to discuss any options, if that's something we should add.

{quote}For V1 of this functionality my only sticking point is that even with 1-2 clients consuming diagnostic events we have to handle backpressure somehow. AFAIK we hold onto messages pending to a client for a while (indefinitely?). I am not actually sure what kind of timeouts or health checks we do for clients.{quote}

The current implementation will simply write and flush any event message to the netty stack. Netty should use its own event loop, but I'm not sure how buffering is handled in detail. Maybe we could also use the {{Message.Dispatcher}} instead and add event messages to the (unbounded) queue of items to flush. But we don't do that either for "classic" schema/topo/status change messages.

> Diag. Events: Native transport integration
> ------------------------------------------
>
>                 Key: CASSANDRA-13459
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13459
>             Project: Cassandra
>          Issue Type: Sub-task
>          Components: CQL
>            Reporter: Stefan Podkowinski
>            Assignee: Stefan Podkowinski
>            Priority: Major
>              Labels: client-impacting
>
> Events should be consumable by clients that would received subscribed events
> from the connected node. This functionality is designed to work on top of
> native transport with minor modifications to the protocol standard (see
> [original proposal|https://docs.google.com/document/d/1uEk7KYgxjNA0ybC9fOuegHTcK3Yi0hCQN5nTp5cNFyQ/edit?usp=sharing]
> for further considered options). First we have to add another value for
> existing event types. Also, we have to extend the protocol a bit to be able
> to specify a sub-class and sub-type value. E.g.
> {{DIAGNOSTIC_EVENT(GossiperEvent, MAJOR_STATE_CHANGE_HANDLED)}}. This still
> has to be worked out and I'd appreciate any feedback.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
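The SUBSCRIBE examples above all reduce to the same shape: a client registers a filter and the server delivers only the events that match it. A toy sketch of that conditional-subscription idea follows; every type and name here is invented for illustration, and none of it is the proposed native-protocol API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Toy publish/subscribe hub: each subscriber registers a predicate over
// events and only receives the events that match it, mirroring the idea of
// clients subscribing to a filtered subset of diagnostic events.
public class DiagEventHub
{
    public static class Event
    {
        final String type;     // e.g. "AuditEvent"
        final String keyspace; // e.g. "ks"
        Event(String type, String keyspace) { this.type = type; this.keyspace = keyspace; }
    }

    public static class Subscriber
    {
        final Predicate<Event> filter;
        final List<Event> received = new ArrayList<>();
        Subscriber(Predicate<Event> filter) { this.filter = filter; }
    }

    private final List<Subscriber> subscribers = new ArrayList<>();

    public Subscriber subscribe(Predicate<Event> filter)
    {
        Subscriber s = new Subscriber(filter);
        subscribers.add(s);
        return s;
    }

    public void publish(Event e)
    {
        for (Subscriber s : subscribers)
            if (s.filter.test(e))
                s.received.add(e);
    }

    public static void main(String[] args)
    {
        DiagEventHub hub = new DiagEventHub();
        // Roughly: SUBSCRIBE diag_events "event=AuditEvent(keyspace=ks)"
        Subscriber audit = hub.subscribe(e -> e.type.equals("AuditEvent") && e.keyspace.equals("ks"));
        hub.publish(new Event("AuditEvent", "ks"));
        hub.publish(new Event("GossiperEvent", "ks"));
        hub.publish(new Event("AuditEvent", "other"));
        System.out.println("matched: " + audit.received.size()); // prints "matched: 1"
    }
}
```

A query-language subject would essentially be a serialized form of such a predicate; the backpressure question in the comment is about what happens when a subscriber's received list (here unbounded) cannot drain as fast as events are published.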
[jira] [Updated] (CASSANDRA-14023) add_dc_after_mv_network_replication_test - materialized_views_test.TestMaterializedViews fails due to invalid datacenter
[ https://issues.apache.org/jira/browse/CASSANDRA-14023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcus Eriksson updated CASSANDRA-14023: Resolution: Fixed Status: Resolved (was: Ready to Commit) > add_dc_after_mv_network_replication_test - > materialized_views_test.TestMaterializedViews fails due to invalid datacenter > > > Key: CASSANDRA-14023 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14023 > Project: Cassandra > Issue Type: Bug > Components: Testing >Reporter: Michael Kjellman >Assignee: Marcus Eriksson >Priority: Major > > add_dc_after_mv_network_replication_test - > materialized_views_test.TestMaterializedViews always fails due to: > message="Unrecognized strategy option {dc2} passed to NetworkTopologyStrategy > for keyspace ks"> -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
cassandra-dtest git commit: Accept ConfigurationException and IRE when dropping non-existent ks in secondary_indexes_test.py
Repository: cassandra-dtest
Updated Branches:
  refs/heads/master 4f2996b46 -> 9c2eb35a8


Accept ConfigurationException and IRE when dropping non-existent ks in secondary_indexes_test.py


Project: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/commit/9c2eb35a
Tree: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/tree/9c2eb35a
Diff: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/diff/9c2eb35a

Branch: refs/heads/master
Commit: 9c2eb35a8c1d9fde1499fdfc7b02e7db36d321e0
Parents: 4f2996b
Author: Aleksey Yeschenko
Authored: Wed Apr 11 13:14:57 2018 +0100
Committer: Aleksey Yeschenko
Committed: Wed Apr 11 13:14:57 2018 +0100

----------------------------------------------------------------------
 secondary_indexes_test.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra-dtest/blob/9c2eb35a/secondary_indexes_test.py
----------------------------------------------------------------------
diff --git a/secondary_indexes_test.py b/secondary_indexes_test.py
index 9b0f326..55b240e 100644
--- a/secondary_indexes_test.py
+++ b/secondary_indexes_test.py
@@ -161,7 +161,7 @@ class TestSecondaryIndexes(Tester):
             logger.debug("round %s" % i)
             try:
                 session.execute("DROP KEYSPACE ks")
-            except ConfigurationException:
+            except (ConfigurationException, InvalidRequest):
                 pass
             create_ks(session, 'ks', 1)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14239) OutOfMemoryError when bootstrapping with less than 100GB RAM
[ https://issues.apache.org/jira/browse/CASSANDRA-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433688#comment-16433688 ] Sergey Kirillov commented on CASSANDRA-14239: - [~jalbersdorfer] I'm trying to do it as well. > OutOfMemoryError when bootstrapping with less than 100GB RAM > > > Key: CASSANDRA-14239 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14239 > Project: Cassandra > Issue Type: Bug > Environment: Details of the bootstrapping Node > * ProLiant BL460c G7 > * 56GB RAM > * 2x 146GB 10K HDD (One dedicated for Commitlog, one for Data, Hints and > saved_caches) > * CentOS 7.4 on SD-Card > * /tmp and /var/log on tmpfs > * Oracle JDK 1.8.0_151 > * Cassandra 3.11.1 > Cluster > * 10 existing Nodes (Up and Normal) >Reporter: Jürgen Albersdorfer >Priority: Major > Attachments: Objects-by-class.csv, > Objects-with-biggest-retained-size.csv, cassandra-env.sh, cassandra.yaml, > gc.log.0.current.zip, gc.log.20180441.zip, jvm.options, jvm_opts.txt, > stack-traces.txt > > > Hi, I face an issue when bootstrapping a Node having less than 100GB RAM on > our 10 Node C* 3.11.1 Cluster. > During bootstrap, when I watch the cassandra.log I observe a growth in JVM > Heap Old Gen which gets not significantly freed up any more. > I know that JVM collects on Old Gen only when really needed. I can see > collections, but there is always a remainder which seems to grow forever > without ever getting freed. > After the Node successfully Joined the Cluster, I can remove the extra RAM I > have given it for bootstrapping without any further effect. > It feels like Cassandra will not forget about every single byte streamed over > the Network over time during bootstrapping, - which would be a memory leak > and a major problem, too. > I was able to produce a HeapDumpOnOutOfMemoryError from a 56GB Node (40 GB > assigned JVM Heap). 
YourKit Profiler shows huge amount of Memory allocated > for org.apache.cassandra.db.Memtable (22 GB) > org.apache.cassandra.db.rows.BufferCell (19 GB) and java.nio.HeapByteBuffer > (11 GB) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14239) OutOfMemoryError when bootstrapping with less than 100GB RAM
[ https://issues.apache.org/jira/browse/CASSANDRA-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433684#comment-16433684 ]

Jürgen Albersdorfer commented on CASSANDRA-14239:
-------------------------------------------------

[~rushman]: Yes, I have a materialized view on one of the tables. I could eventually afford losing it. Maybe I will drop it and retry with the same settings.

> OutOfMemoryError when bootstrapping with less than 100GB RAM
> ------------------------------------------------------------
>
>                 Key: CASSANDRA-14239
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14239
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: Details of the bootstrapping Node
> * ProLiant BL460c G7
> * 56GB RAM
> * 2x 146GB 10K HDD (One dedicated for Commitlog, one for Data, Hints and
> saved_caches)
> * CentOS 7.4 on SD-Card
> * /tmp and /var/log on tmpfs
> * Oracle JDK 1.8.0_151
> * Cassandra 3.11.1
> Cluster
> * 10 existing Nodes (Up and Normal)
>            Reporter: Jürgen Albersdorfer
>            Priority: Major
>         Attachments: Objects-by-class.csv,
> Objects-with-biggest-retained-size.csv, cassandra-env.sh, cassandra.yaml,
> gc.log.0.current.zip, gc.log.20180441.zip, jvm.options, jvm_opts.txt,
> stack-traces.txt
>
>
> Hi, I face an issue when bootstrapping a Node having less than 100GB RAM on
> our 10 Node C* 3.11.1 Cluster.
> During bootstrap, when I watch the cassandra.log I observe a growth in JVM
> Heap Old Gen which gets not significantly freed up any more.
> I know that JVM collects on Old Gen only when really needed. I can see
> collections, but there is always a remainder which seems to grow forever
> without ever getting freed.
> After the Node successfully Joined the Cluster, I can remove the extra RAM I
> have given it for bootstrapping without any further effect.
> It feels like Cassandra will not forget about every single byte streamed over
> the Network over time during bootstrapping, - which would be a memory leak
> and a major problem, too.
> I was able to produce a HeapDumpOnOutOfMemoryError from a 56GB Node (40 GB
> assigned JVM Heap). YourKit Profiler shows huge amount of Memory allocated
> for org.apache.cassandra.db.Memtable (22 GB)
> org.apache.cassandra.db.rows.BufferCell (19 GB) and java.nio.HeapByteBuffer
> (11 GB)

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13910) Remove read_repair_chance/dclocal_read_repair_chance
[ https://issues.apache.org/jira/browse/CASSANDRA-13910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433676#comment-16433676 ] Sylvain Lebresne commented on CASSANDRA-13910: -- bq. Do you guys think we should add the deprecation warning to 3.11.latest? I do. I'd add it to 3.0.latest as well since I believe we'll support 3.0 -> 4.0 upgrades. > Remove read_repair_chance/dclocal_read_repair_chance > > > Key: CASSANDRA-13910 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13910 > Project: Cassandra > Issue Type: Improvement >Reporter: Sylvain Lebresne >Assignee: Aleksey Yeschenko >Priority: Minor > Fix For: 4.0 > > > First, let me clarify so this is not misunderstood that I'm not *at all* > suggesting to remove the read-repair mechanism of detecting and repairing > inconsistencies between read responses: that mechanism is imo fine and > useful. But the {{read_repair_chance}} and {{dclocal_read_repair_chance}} > have never been about _enabling_ that mechanism, they are about querying all > replicas (even when this is not required by the consistency level) for the > sole purpose of maybe read-repairing some of the replica that wouldn't have > been queried otherwise. Which btw, bring me to reason 1 for considering their > removal: their naming/behavior is super confusing. Over the years, I've seen > countless users (and not only newbies) misunderstanding what those options > do, and as a consequence misunderstand when read-repair itself was happening. > But my 2nd reason for suggesting this is that I suspect > {{read_repair_chance}}/{{dclocal_read_repair_chance}} are, especially > nowadays, more harmful than anything else when enabled. When those option > kick in, what you trade-off is additional resources consumption (all nodes > have to execute the read) for a _fairly remote chance_ of having some > inconsistencies repaired on _some_ replica _a bit faster_ than they would > otherwise be. 
To justify that last part, let's recall that: > # most inconsistencies are actually fixed by hints in practice; and in the > case where a node stay dead for a long time so that hints ends up timing-out, > you really should repair the node when it comes back (if not simply > re-bootstrapping it). Read-repair probably don't fix _that_ much stuff in > the first place. > # again, read-repair do happen without those options kicking in. If you do > reads at {{QUORUM}}, inconsistencies will eventually get read-repaired all > the same. Just a tiny bit less quickly. > # I suspect almost everyone use a low "chance" for those options at best > (because the extra resources consumption is real), so at the end of the day, > it's up to chance how much faster this fixes inconsistencies. > Overall, I'm having a hard time imagining real cases where that trade-off > really make sense. Don't get me wrong, those options had their places a long > time ago when hints weren't working all that well, but I think they bring > more confusion than benefits now. > And I think it's sane to reconsider stuffs every once in a while, and to > clean up anything that may not make all that much sense anymore, which I > think is the case here. > Tl;dr, I feel the benefits brought by those options are very slim at best and > well overshadowed by the confusion they bring, and not worth maintaining the > code that supports them (which, to be fair, isn't huge, but getting rid of > {{ReadCallback.AsyncRepairRunner}} wouldn't hurt for instance). > Lastly, if the consensus here ends up being that they can have their use in > weird case and that we fill supporting those cases is worth confusing > everyone else and maintaining that code, I would still suggest disabling them > totally by default. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
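For context on the options being debated: per read, {{read_repair_chance}}-style settings amount to a random draw that decides whether to query all replicas (so any stale ones can be repaired as a side effect) or only as many as the consistency level requires. The following is a simplified sketch of that decision, not Cassandra's actual read path; the method and class names are made up for illustration:

```java
import java.util.Random;

// Simplified sketch of how a read_repair_chance-style option decides, per
// read, whether to contact every replica (so stale ones get repaired) or
// only the minimum number the consistency level needs.
public class ReadRepairChanceDemo
{
    static int replicasToQuery(double readRepairChance, int required, int total, Random rng)
    {
        // With probability readRepairChance, query all replicas; otherwise
        // query only what the consistency level requires.
        return rng.nextDouble() < readRepairChance ? total : required;
    }

    public static void main(String[] args)
    {
        Random rng = new Random(42); // fixed seed for a reproducible demo
        int trials = 10000;
        int allReplicaReads = 0;
        // QUORUM on RF=3 needs 2 replicas; chance-based repair reads all 3.
        for (int i = 0; i < trials; i++)
            if (replicasToQuery(0.1, 2, 3, rng) == 3)
                allReplicaReads++;
        // Roughly 10% of reads pay the extra-replica cost.
        System.out.println("all-replica reads: " + allReplicaReads + "/" + trials);
    }
}
```

The sketch makes the trade-off in the comment concrete: the cluster pays the all-replica read cost on a random ~10% of requests, while whether any given inconsistency gets repaired sooner is left entirely to chance.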
[jira] [Comment Edited] (CASSANDRA-13910) Remove read_repair_chance/dclocal_read_repair_chance
[ https://issues.apache.org/jira/browse/CASSANDRA-13910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433670#comment-16433670 ] Aleksey Yeschenko edited comment on CASSANDRA-13910 at 4/11/18 10:02 AM: - [~slebresne] [~bdeggleston] Alright then, thrown an exception it is. I'll see how much DDL/metadata code I can clean up (actually dropping columns from {{system_schema}} tables is not something we've ever done). Do you guys think we should add the deprecation warning to 3.11.latest? was (Author: iamaleksey): [~slebresne] [~bdeggleston] Alright then, thrown an exception it is. I'll see how much DDL/metadata code I can clean up (actually dropping columns from {{system_schema}} tables is not something we've ever done. Do you guys think we should add the deprecation warning to 3.11.latest? > Remove read_repair_chance/dclocal_read_repair_chance > > > Key: CASSANDRA-13910 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13910 > Project: Cassandra > Issue Type: Improvement >Reporter: Sylvain Lebresne >Assignee: Aleksey Yeschenko >Priority: Minor > Fix For: 4.0 > > > First, let me clarify so this is not misunderstood that I'm not *at all* > suggesting to remove the read-repair mechanism of detecting and repairing > inconsistencies between read responses: that mechanism is imo fine and > useful. But the {{read_repair_chance}} and {{dclocal_read_repair_chance}} > have never been about _enabling_ that mechanism, they are about querying all > replicas (even when this is not required by the consistency level) for the > sole purpose of maybe read-repairing some of the replica that wouldn't have > been queried otherwise. Which btw, bring me to reason 1 for considering their > removal: their naming/behavior is super confusing. Over the years, I've seen > countless users (and not only newbies) misunderstanding what those options > do, and as a consequence misunderstand when read-repair itself was happening. 
> But my 2nd reason for suggesting this is that I suspect > {{read_repair_chance}}/{{dclocal_read_repair_chance}} are, especially > nowadays, more harmful than anything else when enabled. When those option > kick in, what you trade-off is additional resources consumption (all nodes > have to execute the read) for a _fairly remote chance_ of having some > inconsistencies repaired on _some_ replica _a bit faster_ than they would > otherwise be. To justify that last part, let's recall that: > # most inconsistencies are actually fixed by hints in practice; and in the > case where a node stay dead for a long time so that hints ends up timing-out, > you really should repair the node when it comes back (if not simply > re-bootstrapping it). Read-repair probably don't fix _that_ much stuff in > the first place. > # again, read-repair do happen without those options kicking in. If you do > reads at {{QUORUM}}, inconsistencies will eventually get read-repaired all > the same. Just a tiny bit less quickly. > # I suspect almost everyone use a low "chance" for those options at best > (because the extra resources consumption is real), so at the end of the day, > it's up to chance how much faster this fixes inconsistencies. > Overall, I'm having a hard time imagining real cases where that trade-off > really make sense. Don't get me wrong, those options had their places a long > time ago when hints weren't working all that well, but I think they bring > more confusion than benefits now. > And I think it's sane to reconsider stuffs every once in a while, and to > clean up anything that may not make all that much sense anymore, which I > think is the case here. > Tl;dr, I feel the benefits brought by those options are very slim at best and > well overshadowed by the confusion they bring, and not worth maintaining the > code that supports them (which, to be fair, isn't huge, but getting rid of > {{ReadCallback.AsyncRepairRunner}} wouldn't hurt for instance). 
> Lastly, if the consensus here ends up being that they can have their use in
> weird cases, and that we feel supporting those cases is worth confusing
> everyone else and maintaining that code, I would still suggest disabling them
> totally by default.
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
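For readers skimming the digest, the probabilistic trigger Sylvain criticizes above can be sketched roughly as follows. This is an illustrative Python sketch, not Cassandra's actual Java implementation; the function name and the precedence between the two options are assumptions made for illustration:

```python
import random

def read_repair_decision(dclocal_chance, global_chance, rng=random):
    """Illustrative sketch: a per-read random roll against the two
    'chance' table options decides whether an ordinary read is turned
    into an all-replica read purely to enable background read-repair.
    Returns which extra replicas would be queried."""
    roll = rng.random()  # uniform in [0, 1)
    if roll < global_chance:
        return "GLOBAL"     # query every replica in all datacenters
    if roll < global_chance + dclocal_chance:
        return "DC_LOCAL"   # query every replica in the local datacenter
    return "NONE"           # only the replicas the consistency level requires
```

Note how, with a typical low "chance", the extra read (and hence any faster repair) happens only on a small random fraction of requests, which is exactly the trade-off the ticket argues against.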
[jira] [Updated] (CASSANDRA-13910) Remove read_repair_chance/dclocal_read_repair_chance
[ https://issues.apache.org/jira/browse/CASSANDRA-13910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko updated CASSANDRA-13910: -- Status: Open (was: Patch Available)
[jira] [Commented] (CASSANDRA-13910) Remove read_repair_chance/dclocal_read_repair_chance
[ https://issues.apache.org/jira/browse/CASSANDRA-13910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433670#comment-16433670 ] Aleksey Yeschenko commented on CASSANDRA-13910: --- [~slebresne] [~bdeggleston] Alright then, thrown an exception it is. I'll see how much DDL/metadata code I can clean up (actually dropping columns from {{system_schema}} tables is not something we've ever done. Do you guys think we should add the deprecation warning to 3.11.latest?
[jira] [Commented] (CASSANDRA-14239) OutOfMemoryError when bootstrapping with less than 100GB RAM
[ https://issues.apache.org/jira/browse/CASSANDRA-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433661#comment-16433661 ] Sergey Kirillov commented on CASSANDRA-14239: - [~jalbersdorfer] do you use materialized views in your DB? I was able to localize this behavior to one table which has a few materialized views defined, so I suspect that this may be related to the MV update.
> OutOfMemoryError when bootstrapping with less than 100GB RAM
>
> Key: CASSANDRA-14239
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14239
> Project: Cassandra
> Issue Type: Bug
> Environment: Details of the bootstrapping Node
> * ProLiant BL460c G7
> * 56GB RAM
> * 2x 146GB 10K HDD (One dedicated for Commitlog, one for Data, Hints and saved_caches)
> * CentOS 7.4 on SD-Card
> * /tmp and /var/log on tmpfs
> * Oracle JDK 1.8.0_151
> * Cassandra 3.11.1
> Cluster
> * 10 existing Nodes (Up and Normal)
> Reporter: Jürgen Albersdorfer
> Priority: Major
> Attachments: Objects-by-class.csv, Objects-with-biggest-retained-size.csv, cassandra-env.sh, cassandra.yaml, gc.log.0.current.zip, gc.log.20180441.zip, jvm.options, jvm_opts.txt, stack-traces.txt
>
> Hi, I face an issue when bootstrapping a Node having less than 100GB RAM on
> our 10-Node C* 3.11.1 Cluster.
> During bootstrap, when I watch the cassandra.log, I observe growth in the JVM
> Heap Old Gen which no longer gets significantly freed up.
> I know that the JVM collects the Old Gen only when really needed. I can see
> collections, but there is always a remainder which seems to grow forever
> without ever getting freed.
> After the Node successfully joined the Cluster, I can remove the extra RAM I
> gave it for bootstrapping without any further effect.
> It feels like Cassandra will not forget a single byte streamed over
> the network during bootstrapping - which would be a memory leak
> and a major problem, too.
> I was able to produce a HeapDumpOnOutOfMemoryError from a 56GB Node (40 GB
> assigned JVM Heap). YourKit Profiler shows a huge amount of memory allocated
> for org.apache.cassandra.db.Memtable (22 GB),
> org.apache.cassandra.db.rows.BufferCell (19 GB) and java.nio.HeapByteBuffer
> (11 GB).
[jira] [Commented] (CASSANDRA-14239) OutOfMemoryError when bootstrapping with less than 100GB RAM
[ https://issues.apache.org/jira/browse/CASSANDRA-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433655#comment-16433655 ] Sergey Kirillov commented on CASSANDRA-14239: - [~jalbersdorfer] I was trying to debug it, and it looks like a deadlock in the memtable flush path. This leads to memtables which are never released to the pool, and eventually you get an OOM. However, I still don't understand why those flush/mutation threads are freezing and how to resolve this. It would be nice if someone from the devs could take a look.
[jira] [Commented] (CASSANDRA-14239) OutOfMemoryError when bootstrapping with less than 100GB RAM
[ https://issues.apache.org/jira/browse/CASSANDRA-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433637#comment-16433637 ] Jürgen Albersdorfer commented on CASSANDRA-14239: - I changed

{code}
disk_optimization_strategy: ssd
memtable_heap_space_in_mb: 2048
memtable_offheap_space_in_mb: 2048
{code}

Streaming was much faster and produced less CPU pressure than before:

{code}
-dsk/total- ---system-- total-cpu-usage --io/total- -net/total-
 read  writ| int   csw |usr sys idl wai hiq siq| read  writ| recv  send
9830B   31M| 48k  7751 | 67   2  31   0   0   1|0.20  85.8 | 30M  380k
    0   28M| 51k  7838 | 65   2  32   0   0   1|   0  80.9 | 33M  511k
  32k   35M| 54k  9024 | 66   2  31   0   0   1|0.60   102 | 37M  540k
    0   28M| 41k  7072 | 62   2  36   0   0   1|   0  78.1 | 26M  265k
1638B   25M| 41k  6606 | 62   1  36   0   0   0|0.10  67.6 | 25M  110k
1638B   26M| 41k  7251 | 57   1  41   0   0   0|0.10  69.9 | 27M  138k
 819B   24M| 40k  6129 | 56   1  42   0   0   1|0.20  61.5 | 25M  127k
    0   25M| 38k  7273 | 56   1  42   0   0   0|   0  66.9 | 26M  162k
1024k   24M| 35k  6501 | 56   1  42   0   0   0|25.2  62.8 | 25M  128k
    0   24M| 37k  7238 | 56   1  42   0   0   0|   0  62.6 | 26M  164k
    0   24M| 35k  6349 | 56   1  42   0   0   0|   0  63.5 | 25M  145k
 410B   26M| 40k  6979 | 56   2  42   0   0   0|0.10  73.1 | 28M  341k
    0   28M| 41k  7042 | 56   1  42   0   0   0|   0  70.8 | 30M  350k
2048B   31M| 44k  7334 | 56   2  42   0   0   0|0.20  85.4 | 32M  347k
    0   31M| 46k  6515 | 56   1  42   0   0   1|   0  86.0 | 33M  383k
    0   30M| 47k  7572 | 56   1  42   0   0   1|   0  82.3 | 33M  466k
7373B   31M| 41k  5742 | 56   1  42   0   0   0|0.20  84.3 | 30M  319k
    0   30M| 43k  7146 | 56   2  42   0   0   1|   0  87.4 | 28M  423k
{code}

When `Received complete` was reported for all Nodes, bootstrap didn't finish, and I can observe:
* a stalled number of `Completed` MutationStage tasks,
* while the `Pending` MutationStage count seems to skyrocket.
* The rest of it looks fine to me :(

{code}
nodetool tpstats
Pool Name                    Active Pending Completed Blocked All time blocked
ReadStage                         0       0         0       0                0
MiscStage                         0       0         0       0                0
CompactionExecutor                2       7        53       0                0
MutationStage                   128 5722021 593964000       0                0
MemtableReclaimMemory             0       0      2194       0                0
PendingRangeCalculator            0       0        19       0                0
GossipStage                       0       0     25736       0                0
SecondaryIndexManagement          0       0         0       0                0
HintsDispatcher                   0       0         0       0                0
RequestResponseStage              0       0    167108       0                0
ReadRepairStage                   0       0         0       0                0
CounterMutationStage              0       0         0       0                0
MigrationStage                    0       0        40       0                0
MemtablePostFlush                 1      11      2344       0                0
PerDiskMemtableFlushWriter_0      0       0      2194       0                0
ValidationExecutor                0       0         0       0                0
Sampler                           0       0         0       0                0
MemtableFlushWriter               2      11      2194       0                0
InternalResponseStage             0       0        31       0                0
ViewMutationStage                 0       0         0       0                0
AntiEntropyStage                  0       0         0       0                0
CacheCleanupExecutor              0       0         0       0                0

Message type     Dropped
READ                   0
RANGE_SLICE            0
_TRACE                 0
HINT                   0
MUTATION               0
COUNTER_MUTATION       0
BATCH_STORE            0
BATCH_REMOVE           0
REQUEST_RESPONSE       0
PAGED_RANGE            0
READ_REPAIR            0
{code}

*Why does `MutationStage` now `(busy) hang`? - While*
* the SlabPoolCleaner Thread uses a single logical CPU at 100% permanently
* G1 Old Gen increases linearly over time and goes far beyond 50GB
* See attached [^gc.log.20180441.zip]
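The stall described above shows up in `nodetool tpstats` as a Pending count that dwarfs both Active and any forward progress in Completed. A hypothetical helper for spotting such pools (illustrative Python, not part of Cassandra or its tooling; the threshold is an arbitrary assumption):

```python
def backed_up_stages(tpstats_rows, pending_threshold=1000):
    """Given (pool_name, active, pending) tuples taken from
    `nodetool tpstats` output, return the names of thread pools whose
    backlog suggests a stall, like the MutationStage above
    (128 active, ~5.7M pending)."""
    return [pool for pool, active, pending in tpstats_rows
            if pending > pending_threshold]
```

Running it over the rows above would flag only MutationStage, matching the observation that mutations are being accepted far faster than the (apparently blocked) flush path can drain them.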
[jira] [Updated] (CASSANDRA-14239) OutOfMemoryError when bootstrapping with less than 100GB RAM
[ https://issues.apache.org/jira/browse/CASSANDRA-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jürgen Albersdorfer updated CASSANDRA-14239: Attachment: gc.log.20180441.zip
[jira] [Commented] (CASSANDRA-13910) Remove read_repair_chance/dclocal_read_repair_chance
[ https://issues.apache.org/jira/browse/CASSANDRA-13910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433556#comment-16433556 ] Sylvain Lebresne commented on CASSANDRA-13910: -- bq. Turning on [~slebresne] signal. I actually personally prefer being clear and throwing an exception. As a user, what I would find rude is being too easily misled into believing one of my actions has worked when it has in fact done nothing, and just having a warning makes that more likely. If something has been removed, I don't find it rude to get an exception; I find it honest and helpful. It's a preference at the end of the day, though, so I'm just giving my 2 cents, not pushing more than that. bq. The WARN on using old things is how we have done this in the past. I'm sure your memory is better than mine, but didn't we mostly use warnings when we deprecated something? That is, we had a release where the old settings were still working but warning, and when they stopped working we removed them?
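The two approaches debated here (a deprecation warning in a 3.11 release vs. a hard failure once the options are removed in 4.0) can be contrasted with a small sketch. This is illustrative Python, not Cassandra's actual schema-validation code; all names here are hypothetical:

```python
import warnings

# The table options this ticket proposes to remove.
REMOVED_OPTIONS = {"read_repair_chance", "dclocal_read_repair_chance"}

def validate_table_options(options, deprecation_only=False):
    """Sketch of the two strategies discussed on the ticket:
    - deprecation_only=True: accept the removed options but emit a
      warning that they have no effect (the 3.11 deprecation path);
    - deprecation_only=False: reject them outright with an exception
      (the behavior Aleksey and Sylvain settled on for 4.0)."""
    for opt in options:
        if opt in REMOVED_OPTIONS:
            if deprecation_only:
                warnings.warn(f"table option {opt} is deprecated and has no effect")
            else:
                raise ValueError(f"unknown table option: {opt}")
```

The exception path makes the removal impossible to miss, which matches Sylvain's point that silently ignoring a setting is what users would actually find rude.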