[jira] [Commented] (CASSANDRA-14103) Fix potential race during compaction strategy reload

2018-04-11 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16434981#comment-16434981
 ] 

Marcus Eriksson commented on CASSANDRA-14103:
-

(sorry for the delay on this)

LGTM, just a few minor comments:
* Could we make {{CompactionStrategyManager}} take the initial sstables as a 
constructor parameter instead of calling {{cfs.getSSTables(...CANONICAL..)}} 
there? It feels like this makes it clearer that the tracker has to be 
populated before we can create the CSM.
* Make {{maybeReloadDiskBoundaries}} return {{void}}; the only user of the 
return value is the test case, which could probably be refactored to check 
that the boundaries changed instead.
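The first suggestion can be sketched roughly as below. This is a hypothetical, heavily simplified illustration of constructor injection; none of the names or types mirror Cassandra's real API.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Simplified sketch of the suggested refactor: the manager receives its
// initial sstables explicitly instead of calling back into the column
// family store, making the required initialization order visible at the
// call site. Names here are invented stand-ins.
final class CompactionStrategyManagerSketch
{
    private final List<String> managedSSTables;

    // The caller must already have a populated tracker before it can
    // construct the manager; the dependency is now explicit.
    CompactionStrategyManagerSketch(Collection<String> initialSSTables)
    {
        this.managedSSTables = new ArrayList<>(initialSSTables);
    }

    int managedCount()
    {
        return managedSSTables.size();
    }
}
```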

> Fix potential race during compaction strategy reload
> 
>
> Key: CASSANDRA-14103
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14103
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Paulo Motta
>Assignee: Paulo Motta
>Priority: Minor
> Attachments: 3.11-14103-dtest.png, 3.11-14103-testall.png, 
> trunk-14103-dtest.png, trunk-14103-testall.png
>
>
> When the compaction strategies are reloaded after disk boundary changes 
> (CASSANDRA-13948), it's possible that a recently finished SSTable is added 
> twice to the compaction strategy: once when the compaction strategies are 
> reloaded due to the disk boundary change ({{maybeReloadDiskBoundaries}}), and 
> again when the {{CompactionStrategyManager}} is processing the 
> {{SSTableAddedNotification}}.
> This should be quite unlikely, because a compaction must finish right as the 
> disk boundary changes, and even if it happens, most compaction strategies 
> would not be affected since they deduplicate sstables internally, but we 
> should still protect against such a scenario. 
> For more context see [this 
> comment|https://issues.apache.org/jira/browse/CASSANDRA-13948?focusedCommentId=16280448&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16280448]
>  from Marcus.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14160) maxPurgeableTimestamp should traverse tables in order of minTimestamp

2018-04-11 Thread Jeff Jirsa (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-14160:
---
Reviewer: Jeff Jirsa

> maxPurgeableTimestamp should traverse tables in order of minTimestamp
> -
>
> Key: CASSANDRA-14160
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14160
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Josh Snyder
>Assignee: Josh Snyder
>Priority: Major
>  Labels: performance
> Fix For: 4.x
>
>
> In maxPurgeableTimestamp, we iterate over the bloom filters of each 
> overlapping SSTable. Of the bloom filter hits, we take the SSTable with the 
> lowest minTimestamp. If we kept the SSTables in sorted order of minTimestamp, 
> then we could short-circuit the operation at the first bloom filter hit, 
> reducing cache pressure (or worse, I/O) and CPU time.
> I've written (but not yet benchmarked) [some 
> code|https://github.com/hashbrowncipher/cassandra/commit/29859a4a2e617f6775be49448858bc59fdafab44]
>  to demonstrate this possibility.
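The short-circuit described above can be illustrated in isolation. In the sketch below, {{SSTable}} is a stand-in record and {{mightContain()}} stands in for a bloom filter check; neither mirrors Cassandra's real API.

```java
import java.util.Comparator;
import java.util.List;
import java.util.OptionalLong;

// Illustrative sketch only: with overlapping sstables ordered by
// minTimestamp ascending, the first bloom-filter hit already has the
// lowest minTimestamp, so the scan can stop there instead of consulting
// every remaining filter.
final class PurgeSketch
{
    record SSTable(long minTimestamp, java.util.Set<String> keys)
    {
        boolean mightContain(String key) { return keys.contains(key); }
    }

    static OptionalLong maxPurgeableTimestamp(List<SSTable> overlapping, String key)
    {
        return overlapping.stream()
                .sorted(Comparator.comparingLong(SSTable::minTimestamp))
                .filter(t -> t.mightContain(key))
                .mapToLong(SSTable::minTimestamp)
                .findFirst(); // short-circuits: later bloom filters are never checked
    }
}
```

In real code the sort would of course be maintained incrementally rather than redone per lookup; the point is only the early exit on the first hit.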






[jira] [Commented] (CASSANDRA-14160) maxPurgeableTimestamp should traverse tables in order of minTimestamp

2018-04-11 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16434939#comment-16434939
 ] 

Jeff Jirsa commented on CASSANDRA-14160:


Re-pushed [here|https://github.com/jeffjirsa/cassandra/commits/14160] , tests 
running [here|https://circleci.com/gh/jeffjirsa/cassandra/tree/14160] (unit 
tests + dtests) 







[jira] [Commented] (CASSANDRA-14367) prefer Collections.singletonList to Arrays.asList(one_element)

2018-04-11 Thread Dave Brosius (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16434870#comment-16434870
 ] 

Dave Brosius commented on CASSANDRA-14367:
--

OK, thanks. Although I would expect the exact same situation before the 
change. Not arguing for the ticket, just learning.

> prefer Collections.singletonList to Arrays.asList(one_element)
> --
>
> Key: CASSANDRA-14367
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14367
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Dave Brosius
>Assignee: Dave Brosius
>Priority: Trivial
> Fix For: 4.x
>
> Attachments: 14367.txt
>
>
> A small improvement: Arrays.asList first creates an array, then wraps it 
> with a collections instance, whereas Collections.singletonList just creates 
> one small (one-field) instance, so it slightly cuts down on the garbage 
> generated.
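The allocation difference is easy to show side by side. This is a minimal sketch of the two standard-library calls; the counterpoint raised later in the thread is that mixing List implementations can make downstream call sites megamorphic.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Both calls produce an equal, fixed-size, one-element list, but they
// allocate differently: Arrays.asList(x) builds an Object[] varargs array
// and then wraps it, while Collections.singletonList(x) allocates a single
// small object holding one field.
final class SingletonVsAsList
{
    static List<String> viaAsList(String element)
    {
        return Arrays.asList(element);              // Object[1] + wrapper instance
    }

    static List<String> viaSingleton(String element)
    {
        return Collections.singletonList(element);  // one small instance
    }
}
```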






[jira] [Commented] (CASSANDRA-13851) Allow existing nodes to use all peers in shadow round

2018-04-11 Thread Kurt Greaves (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16434866#comment-16434866
 ] 

Kurt Greaves commented on CASSANDRA-13851:
--

ping [~beobal]. keen on getting this in before something breaks it again. :)

> Allow existing nodes to use all peers in shadow round
> -
>
> Key: CASSANDRA-13851
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13851
> Project: Cassandra
>  Issue Type: Bug
>  Components: Lifecycle
>Reporter: Kurt Greaves
>Assignee: Kurt Greaves
>Priority: Major
> Fix For: 3.11.x, 4.x
>
>
> In CASSANDRA-10134 we made collision checks necessary on every startup. A 
> side-effect was introduced that then requires a node's seeds to be contacted 
> on every startup. Prior to this change an existing node could start up 
> regardless of whether it could contact a seed node (because 
> checkForEndpointCollision() was only called for bootstrapping nodes). 
> Now if a node's seeds are removed/deleted/fail, it will no longer be able to 
> start up until live seeds are configured (or it is itself made a seed), even 
> though it already knows about the rest of the ring. This is inconvenient for 
> operators and has the potential to cause some nasty surprises and increase 
> downtime.
> One solution would be to use all of a node's existing peers as seeds in the 
> shadow round. I'm not a Gossip guru, though, so I'm not sure of the implications.






[jira] [Commented] (CASSANDRA-14367) prefer Collections.singletonList to Arrays.asList(one_element)

2018-04-11 Thread Jeremiah Jordan (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16434808#comment-16434808
 ] 

Jeremiah Jordan commented on CASSANDRA-14367:
-

The static method call is not the non-monomorphic part; the later uses of the 
List will be, since there will now be multiple List implementations in use.







[jira] [Commented] (CASSANDRA-14367) prefer Collections.singletonList to Arrays.asList(one_element)

2018-04-11 Thread Dave Brosius (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16434789#comment-16434789
 ] 

Dave Brosius commented on CASSANDRA-14367:
--

Perfectly fine with not accepting this, no biggie. But I am curious as to what 
makes a static method call non-monomorphic?







[jira] [Updated] (CASSANDRA-14367) prefer Collections.singletonList to Arrays.asList(one_element)

2018-04-11 Thread Dave Brosius (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Brosius updated CASSANDRA-14367:
-
Resolution: Won't Do
Status: Resolved  (was: Patch Available)







[jira] [Created] (CASSANDRA-14379) Better handling of missing partition columns in system_schema.columns during startup

2018-04-11 Thread Jay Zhuang (JIRA)
Jay Zhuang created CASSANDRA-14379:
--

 Summary: Better handling of missing partition columns in 
system_schema.columns during startup
 Key: CASSANDRA-14379
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14379
 Project: Cassandra
  Issue Type: Improvement
  Components: Distributed Metadata
Reporter: Jay Zhuang
Assignee: Jay Zhuang


Follow-up to CASSANDRA-13180: during table deletion/creation, we saw one table 
with partially deleted columns (no partition column, only a regular column), 
which blocks the node from starting up:
{noformat}
java.lang.AssertionError: null
at 
org.apache.cassandra.db.marshal.CompositeType.getInstance(CompositeType.java:103)
 ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
at org.apache.cassandra.config.CFMetaData.rebuild(CFMetaData.java:308) 
~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
at org.apache.cassandra.config.CFMetaData.(CFMetaData.java:288) 
~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
at org.apache.cassandra.config.CFMetaData.create(CFMetaData.java:363) 
~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
at 
org.apache.cassandra.schema.SchemaKeyspace.fetchTable(SchemaKeyspace.java:1028) 
~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
at 
org.apache.cassandra.schema.SchemaKeyspace.fetchTables(SchemaKeyspace.java:987) 
~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
at 
org.apache.cassandra.schema.SchemaKeyspace.fetchKeyspace(SchemaKeyspace.java:945)
 ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
at 
org.apache.cassandra.schema.SchemaKeyspace.fetchKeyspacesWithout(SchemaKeyspace.java:922)
 ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
at 
org.apache.cassandra.schema.SchemaKeyspace.fetchNonSystemKeyspaces(SchemaKeyspace.java:910)
 ~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
at org.apache.cassandra.config.Schema.loadFromDisk(Schema.java:138) 
~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
at org.apache.cassandra.config.Schema.loadFromDisk(Schema.java:128) 
~[apache-cassandra-3.0.14.x.jar:3.0.14.x]
at 
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:241) 
[apache-cassandra-3.0.14.x.jar:3.0.14.x]
at 
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:567) 
[apache-cassandra-3.0.14.x.jar:3.0.14.x]
at 
org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:695) 
[apache-cassandra-3.0.14.x.jar:3.0.14.x]
{noformat}

As the partition column is mandatory, it should throw 
[{{MissingColumns}}|https://github.com/apache/cassandra/blob/60563f4e8910fb59af141fd24f1fc1f98f34f705/src/java/org/apache/cassandra/schema/SchemaKeyspace.java#L1351],
 the same as CASSANDRA-13180, so that the user is able to clean up the schema.
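The proposed behavior amounts to an up-front validation. This is a hedged sketch only: {{MissingColumns}} and the column model below are simplified stand-ins, not Cassandra's real types.

```java
import java.util.List;

// Instead of failing later with a bare AssertionError deep inside
// CFMetaData, validate up front and raise a descriptive error the
// operator can act on. All names here are simplified stand-ins.
final class SchemaColumnCheck
{
    static final class MissingColumns extends RuntimeException
    {
        MissingColumns(String message) { super(message); }
    }

    record Column(String name, String kind) {} // kind: "partition_key", "regular", ...

    static void validatePartitionColumns(String table, List<Column> columns)
    {
        boolean hasPartitionKey = columns.stream()
                .anyMatch(c -> c.kind().equals("partition_key"));
        if (!hasPartitionKey)
            throw new MissingColumns("No partition key columns found in schema table for " + table);
    }
}
```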






[jira] [Commented] (CASSANDRA-13889) cfstats should take sorting and limit parameters

2018-04-11 Thread Patrick Bannister (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16434756#comment-16434756
 ] 

Patrick Bannister commented on CASSANDRA-13889:
---

Thanks for the review!

> cfstats should take sorting and limit parameters
> 
>
> Key: CASSANDRA-13889
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13889
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Jon Haddad
>Assignee: Patrick Bannister
>Priority: Major
> Fix For: 4.0
>
> Attachments: 13889-trunk.txt, sample_output_normal.txt, 
> sample_output_sorted.txt, sample_output_sorted_top3.txt
>
>
> When looking at a problematic node I'm not familiar with, one of the first 
> things I do is check cfstats to identify the tables with the most reads, 
> writes, and data.  This is fine as long as there aren't a lot of tables, but 
> once it goes above a dozen it's quite difficult.  cfstats should allow me to 
> sort the results and limit them to the top K tables.






[jira] [Commented] (CASSANDRA-7622) Implement virtual tables

2018-04-11 Thread Dinesh Joshi (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16434732#comment-16434732
 ] 

Dinesh Joshi commented on CASSANDRA-7622:
-

Hey [~cnlwsu],

Here's my feedback on the code so far -
 * {{Schema#getVirtualTable}} - the {{containsKey()}} check followed by a 
{{get()}} is unnecessary; calling {{get()}} alone has the same effect.
 * {{VirtualSchema}} - the initializers for {{key}} and {{clustering}} fields 
are unnecessary as you're overwriting them in the constructor. It would be a 
good idea to make the fields final.
 * {{VirtualTable#classFromName}}, {{TableMetaData#virtualClass}} - shouldn't 
the error read differently? VirtualTable strategy class instead of Compaction 
strategy class? Also there is no {{AbstractVirtualColumnFamilyStore}}. Am I 
missing something here?
 * {{InMemoryVirtualTable$SimpleVirtualCommand}} - Use finals for fields
 * {{InMemoryVirtualTable$ResultReadState}} - Use finals for fields
 * {{InMemoryVirtualTable$ResultReadState}} - line 258 - isn't the if check 
redundant?

Nits -
 * {{InMemoryVirtualTable}} - get rid of unused import 
{{org.apache.cassandra.db.marshal.AbstractType;}}
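The first review point can be illustrated generically. This is a minimal sketch (map and method names invented), valid whenever the map cannot contain null values.

```java
import java.util.Map;

// When the map never stores null values, a containsKey() check before
// get() does two hash lookups where one suffices: get() already returns
// null for an absent key.
final class LookupSketch
{
    static String doubleLookup(Map<String, String> tables, String name)
    {
        if (tables.containsKey(name))   // lookup #1
            return tables.get(name);    // lookup #2
        return null;
    }

    static String singleLookup(Map<String, String> tables, String name)
    {
        return tables.get(name);        // same result, one lookup
    }
}
```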

 

> Implement virtual tables
> 
>
> Key: CASSANDRA-7622
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7622
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: Tupshin Harper
>Assignee: Chris Lohfink
>Priority: Major
> Fix For: 4.x
>
>
> There are a variety of reasons to want virtual tables, which would be any 
> table backed by an API, rather than by data explicitly managed and stored 
> as sstables.
> One possible use case would be to expose JMX data through CQL as a 
> resurrection of CASSANDRA-3527.
> Another is a more general framework to implement the ability to expose yaml 
> configuration information. So it would be an alternate approach to 
> CASSANDRA-7370.
> A possible implementation would be in terms of CASSANDRA-7443, but I am not 
> presupposing.






[jira] [Commented] (CASSANDRA-13853) nodetool describecluster should be more informative

2018-04-11 Thread Preetika Tyagi (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16434503#comment-16434503
 ] 

Preetika Tyagi commented on CASSANDRA-13853:


[~aweisberg] Thank you for taking care of that. Here is my email id:  
[preetika.ty...@intel.com|mailto:preetika.ty...@intel.com]

 

> nodetool describecluster should be more informative
> ---
>
> Key: CASSANDRA-13853
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13853
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability, Tools
>Reporter: Jon Haddad
>Assignee: Preetika Tyagi
>Priority: Minor
>  Labels: lhf
> Fix For: 4.x
>
> Attachments: cassandra-13853-v6.patch, jira_13853_dtest_v2.patch
>
>
> Additional information we should be displaying:
> * Total node count
> * List of datacenters, RF, with number of nodes per dc, how many are down, 
> * Version(s)






[jira] [Commented] (CASSANDRA-13853) nodetool describecluster should be more informative

2018-04-11 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16434482#comment-16434482
 ] 

Ariel Weisberg commented on CASSANDRA-13853:


Test failures look unrelated: 
https://circleci.com/gh/aweisberg/cassandra/tree/cassandra-13853-trunk

I had to clean up the imports a bit, but other than that I didn't change 
anything.









[jira] [Updated] (CASSANDRA-13853) nodetool describecluster should be more informative

2018-04-11 Thread Ariel Weisberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ariel Weisberg updated CASSANDRA-13853:
---
Status: Ready to Commit  (was: Patch Available)







[jira] [Updated] (CASSANDRA-13853) nodetool describecluster should be more informative

2018-04-11 Thread Ariel Weisberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ariel Weisberg updated CASSANDRA-13853:
---
Reviewer: Ariel Weisberg







[jira] [Commented] (CASSANDRA-13853) nodetool describecluster should be more informative

2018-04-11 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16434437#comment-16434437
 ] 

Ariel Weisberg commented on CASSANDRA-13853:


[~pree] what email address do you use for your github account? I want to set 
the author tag for the commits correctly.







[jira] [Commented] (CASSANDRA-13459) Diag. Events: Native transport integration

2018-04-11 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16434431#comment-16434431
 ] 

Ariel Weisberg commented on CASSANDRA-13459:


One of the differences between other events and diagnostic events is that other 
events are relatively infrequent, so we might have been fine until now without 
implementing proper backpressure. I checked quickly and didn't see anything 
that looked like backpressure for clients. Looking at some of the events, like 
gossip or hints, it seems like it's going to be a steady stream all the time.

Most messages sent to clients are responses to requests, and if the client 
can't communicate with the server it won't be sending new requests, so it's a 
self-limiting problem except in the less common cases where communication fails 
in one direction. I think that may be why we have gotten away without proper 
backpressure until now. It's also possible that there is a hidden bit of code 
somewhere that disables reads from a client if we can't write to it.
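One common way to implement that kind of backpressure is a bounded outbound queue with a high watermark. The toy model below involves no networking; every class, method, and threshold name is invented for illustration and has no relation to Cassandra's transport code.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of the backpressure concern: a steady event stream feeds a
// bounded per-client queue; once the client stops draining, the
// subscription pauses (events are shed) instead of buffering forever.
final class EventBackpressureSketch
{
    private final Deque<String> outbound = new ArrayDeque<>();
    private final int highWatermark;
    private boolean subscriptionPaused = false;

    EventBackpressureSketch(int highWatermark) { this.highWatermark = highWatermark; }

    // Called for every event; unlike request/response traffic, nothing
    // naturally throttles this stream, so the server must.
    void onEvent(String event)
    {
        if (subscriptionPaused)
            return;                        // shed load while the client lags
        outbound.addLast(event);
        if (outbound.size() >= highWatermark)
            subscriptionPaused = true;     // analogous to disabling reads/writes
    }

    // Client drained one item; resume once below the watermark.
    void onDrain()
    {
        outbound.pollFirst();
        if (outbound.size() < highWatermark)
            subscriptionPaused = false;
    }

    int queued() { return outbound.size(); }
    boolean paused() { return subscriptionPaused; }
}
```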

bq. We could specify a subscription mechanism for native transport that is not 
specific to diag events. But what would the subject look like to subscribe to?

Looking at what you have now, there is no query language involved, correct? You 
subscribe to these events via the wire protocol, not the query language?

Playing devil's advocate, we could have a flat namespace of events to subscribe 
to right now (which is how it seems to work?). I am just saying that, for the 
wire protocol and internal implementation, we should differentiate between 
subscription and debug/diagnostics. Those are two different concerns.

> Diag. Events: Native transport integration
> --
>
> Key: CASSANDRA-13459
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13459
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: CQL
>Reporter: Stefan Podkowinski
>Assignee: Stefan Podkowinski
>Priority: Major
>  Labels: client-impacting
>
> Events should be consumable by clients, which would receive subscribed 
> events from the connected node.
> native transport with minor modifications to the protocol standard (see 
> [original 
> proposal|https://docs.google.com/document/d/1uEk7KYgxjNA0ybC9fOuegHTcK3Yi0hCQN5nTp5cNFyQ/edit?usp=sharing]
>  for further considered options). First we have to add another value for 
> existing event types. Also, we have to extend the protocol a bit to be able 
> to specify a sub-class and sub-type value. E.g. 
> {{DIAGNOSTIC_EVENT(GossiperEvent, MAJOR_STATE_CHANGE_HANDLED)}}. This still 
> has to be worked out and I'd appreciate any feedback.






[jira] [Updated] (CASSANDRA-14378) Simplify TableParams defaults

2018-04-11 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-14378:
--
Fix Version/s: (was: 4.x)
   4.0
   Status: Patch Available  (was: Open)

> Simplify TableParams defaults
> -
>
> Key: CASSANDRA-14378
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14378
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Aleksey Yeschenko
>Assignee: Aleksey Yeschenko
>Priority: Trivial
> Fix For: 4.0
>
>
> There is a block of unnecessary constants - only used once - that only 
> introduce indirection and make the code harder to read. And almost introduce 
> a static initialization order issue. We can get rid of that.
> A trivial change.






[jira] [Commented] (CASSANDRA-14378) Simplify TableParams defaults

2018-04-11 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16434350#comment-16434350
 ] 

Aleksey Yeschenko commented on CASSANDRA-14378:
---

Code [here|https://github.com/iamaleksey/cassandra/commits/14378-4.0].

> Simplify TableParams defaults
> -
>
> Key: CASSANDRA-14378
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14378
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Aleksey Yeschenko
>Assignee: Aleksey Yeschenko
>Priority: Trivial
> Fix For: 4.x
>
>
> There is a block of unnecessary constants - only used once - that only 
> introduce indirection and make the code harder to read. And almost introduce 
> a static initialization order issue. We can get rid of that.
> A trivial change.






[jira] [Created] (CASSANDRA-14378) Simplify TableParams defaults

2018-04-11 Thread Aleksey Yeschenko (JIRA)
Aleksey Yeschenko created CASSANDRA-14378:
-

 Summary: Simplify TableParams defaults
 Key: CASSANDRA-14378
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14378
 Project: Cassandra
  Issue Type: Improvement
Reporter: Aleksey Yeschenko
Assignee: Aleksey Yeschenko
 Fix For: 4.x


There is a block of unnecessary constants - each only used once - that only 
introduces indirection and makes the code harder to read, and almost introduces 
a static initialization order issue. We can get rid of that.

A trivial change.
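The "static initialization order issue" hazard is the one the later trunk commit reorders around. The self-contained demonstration below uses stand-in names (a simplified {{Params}}/{{Policy}} pair, not the real TableParams): static fields initialize strictly in source order, so a default instance constructed before the constant it reads captures that constant's uninitialized null value.

```java
// Static fields initialize in source order: a default instance constructed
// before the constant it reads captures null. Names are stand-ins for the
// hypothetical speculativeRetry scenario.
final class InitOrderDemo
{
    static final class Policy
    {
        final String name;
        Policy(String name) { this.name = name; }
    }

    static final class Params
    {
        final Policy speculativeRetry;
        Params() { this.speculativeRetry = DEFAULT_SPECULATIVE_RETRY; }
    }

    // Declared BEFORE the policy constant: the constructor sees null.
    static final Params EARLY_DEFAULT = new Params();

    static final Policy DEFAULT_SPECULATIVE_RETRY = new Policy("99p");

    // Declared AFTER: the constructor sees the initialized policy.
    static final Params LATE_DEFAULT = new Params();
}
```

Note the field must not be a compile-time constant (e.g. a literal String), or the compiler would inline it and hide the ordering problem.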






cassandra git commit: Ninja reorder static constants in TableParams to avoid uninitialized speculativeRetry (hypothetical)

2018-04-11 Thread aleksey
Repository: cassandra
Updated Branches:
  refs/heads/trunk a831b99f9 -> 60563f4e8


Ninja reorder static constants in TableParams to avoid uninitialized 
speculativeRetry (hypothetical)


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/60563f4e
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/60563f4e
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/60563f4e

Branch: refs/heads/trunk
Commit: 60563f4e8910fb59af141fd24f1fc1f98f34f705
Parents: a831b99
Author: Aleksey Yeshchenko 
Authored: Wed Apr 11 18:51:18 2018 +0100
Committer: Aleksey Yeshchenko 
Committed: Wed Apr 11 18:51:18 2018 +0100

--
 src/java/org/apache/cassandra/schema/TableParams.java | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/60563f4e/src/java/org/apache/cassandra/schema/TableParams.java
--
diff --git a/src/java/org/apache/cassandra/schema/TableParams.java 
b/src/java/org/apache/cassandra/schema/TableParams.java
index 895e3a7..1489c81 100644
--- a/src/java/org/apache/cassandra/schema/TableParams.java
+++ b/src/java/org/apache/cassandra/schema/TableParams.java
@@ -34,8 +34,6 @@ import static java.lang.String.format;
 
 public final class TableParams
 {
-    public static final TableParams DEFAULT = TableParams.builder().build();
-
     public enum Option
     {
         BLOOM_FILTER_FP_CHANCE,
@@ -73,6 +71,8 @@ public final class TableParams
     public static final double DEFAULT_CRC_CHECK_CHANCE = 1.0;
     public static final SpeculativeRetryPolicy DEFAULT_SPECULATIVE_RETRY = new PercentileSpeculativeRetryPolicy(99.0);
 
+    public static final TableParams DEFAULT = TableParams.builder().build();
+
     public final String comment;
     public final double readRepairChance;
     public final double dcLocalReadRepairChance;





[jira] [Commented] (CASSANDRA-14118) Refactor write path

2018-04-11 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16434232#comment-16434232
 ] 

Aleksey Yeschenko commented on CASSANDRA-14118:
---

+1, this looks correct.

I was expecting a bigger patch, with more things abstracted - which is partly 
why I procrastinated on this review a little. FWIW, I think this changeset 
makes sense in isolation, and I expect that, eventually, the write path will 
be abstracted away more fully. As of these commits, there are still 
default-engine-specific things that live outside the abstracted path:
- an implicit assumption that secondary indexes are supported and work in a 
certain way - in particular, TableWriteHandler taking an UpdateTransaction as 
an argument
- MV-related code living outside of either handler

With that in mind, again, I'm +1 on the patch, as a proto-abstraction with 
the understanding that it's only the beginning.
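The secondary-index coupling in the first bullet can be sketched roughly as follows (hypothetical, heavily simplified interfaces - not the signatures from the patch):

```java
// Hedged sketch, not the actual patch API: a write handler that takes an
// index transaction bakes the default engine's secondary-index model into
// the storage-engine abstraction, since every engine must accept it.
interface UpdateTransaction { void onInserted(String row); }

interface TableWriteHandler {
    void write(String mutation, UpdateTransaction indexTxn);
}

public class WritePathSketch {
    static String demo() {
        StringBuilder indexed = new StringBuilder();
        // A toy "engine" that simply notifies the index transaction.
        TableWriteHandler handler = (mutation, txn) -> txn.onInserted(mutation);
        handler.write("row-1", indexed::append);
        return indexed.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints row-1
    }
}
```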

> Refactor write path
> ---
>
> Key: CASSANDRA-14118
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14118
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Dikang Gu
>Assignee: Blake Eggleston
>Priority: Major
> Fix For: 4.0
>
>
> As part of the pluggable storage engine effort, we'd like to modularize the 
> write-path-related code and make it independent of existing storage engine 
> implementation details.
> For now, refer to 
> https://docs.google.com/document/d/1suZlvhzgB6NIyBNpM9nxoHxz_Ri7qAm-UEO8v8AIFsc
>  for high level designs.






[jira] [Updated] (CASSANDRA-14118) Refactor write path

2018-04-11 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-14118:
--
Status: Ready to Commit  (was: Patch Available)

> Refactor write path
> ---
>
> Key: CASSANDRA-14118
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14118
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Dikang Gu
>Assignee: Blake Eggleston
>Priority: Major
> Fix For: 4.0
>
>
> As part of the pluggable storage engine effort, we'd like to modularize the 
> write-path-related code and make it independent of existing storage engine 
> implementation details.
> For now, refer to 
> https://docs.google.com/document/d/1suZlvhzgB6NIyBNpM9nxoHxz_Ri7qAm-UEO8v8AIFsc
>  for high level designs.






[jira] [Updated] (CASSANDRA-14118) Refactor write path

2018-04-11 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-14118:
--
Fix Version/s: 4.0

> Refactor write path
> ---
>
> Key: CASSANDRA-14118
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14118
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Dikang Gu
>Assignee: Blake Eggleston
>Priority: Major
> Fix For: 4.0
>
>
> As part of the pluggable storage engine effort, we'd like to modularize the 
> write-path-related code and make it independent of existing storage engine 
> implementation details.
> For now, refer to 
> https://docs.google.com/document/d/1suZlvhzgB6NIyBNpM9nxoHxz_Ri7qAm-UEO8v8AIFsc
>  for high level designs.






[jira] [Updated] (CASSANDRA-13065) Skip building views during base table streams on range movements

2018-04-11 Thread Jeremy Hanna (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeremy Hanna updated CASSANDRA-13065:
-
Component/s: Materialized Views

> Skip building views during base table streams on range movements
> 
>
> Key: CASSANDRA-13065
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13065
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Materialized Views
>Reporter: Benjamin Roth
>Assignee: Benjamin Roth
>Priority: Critical
> Fix For: 4.0
>
>
> Booting or decommissioning nodes with MVs is unbearably slow as all streams go 
> through the regular write path. This causes read-before-writes for every 
> mutation, and during bootstrap it causes them to be sent to the batchlog.
> This makes it virtually impossible to boot a new node in an acceptable amount 
> of time.
> Using the regular streaming behaviour for consistent range movements works 
> much better in this case and does not break the MV local consistency contract.
> Already tested on our own cluster.
> The bootstrap case is super easy to handle; the decommission case requires 
> CASSANDRA-13064






[jira] [Updated] (CASSANDRA-14260) Refactor pair to avoid boxing longs/ints

2018-04-11 Thread Jeff Jirsa (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-14260:
---
   Resolution: Fixed
 Reviewer: Dinesh Joshi
Fix Version/s: (was: 4.x)
   4.0
   Status: Resolved  (was: Ready to Commit)

Ran a few dtest runs on the refactored branches 
[here|https://circleci.com/gh/jeffjirsa/cassandra/tree/pair-refactor]; they 
show CASSANDRA-14371 but nothing else concerning. Committed as 
[a831b99f9123d1c2bdfd70761aca3a05446c9a4c|https://github.com/apache/cassandra/commit/a831b99f9123d1c2bdfd70761aca3a05446c9a4c]
 

> Refactor pair to avoid boxing longs/ints
> 
>
> Key: CASSANDRA-14260
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14260
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Jeff Jirsa
>Assignee: Jeff Jirsa
>Priority: Minor
> Fix For: 4.0
>
>
> We use Pair<X, Y> all over the place, and in many cases either or both of X 
> and Y are primitives (ints, longs), and we end up boxing them into Integers 
> and Longs. We should have specialized versions that take primitives. 
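The kind of specialization proposed here can be sketched like so (LongPair is a hypothetical name; the committed classes, e.g. PartitionPositionBounds, are domain-specific rather than a generic primitive pair):

```java
// Sketch of a primitive-specialized pair (hypothetical name): two bare
// longs instead of the two Long boxes a Pair<Long, Long> would allocate.
final class LongPair {
    public final long left;
    public final long right;

    LongPair(long left, long right) { this.left = left; this.right = right; }
}

public class PairDemo {
    public static void main(String[] args) {
        LongPair bounds = new LongPair(0L, 4096L);
        long size = bounds.right - bounds.left; // plain long math, no unboxing
        System.out.println(size); // prints 4096
    }
}
```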






cassandra git commit: Refactor Pair usage to avoid boxing ints/longs

2018-04-11 Thread jjirsa
Repository: cassandra
Updated Branches:
  refs/heads/trunk 95a52a8bf -> a831b99f9


Refactor Pair usage to avoid boxing ints/longs

Patch by Jeff Jirsa; Reviewed by Dinesh Joshi for CASSANDRA-14260


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a831b99f
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a831b99f
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a831b99f

Branch: refs/heads/trunk
Commit: a831b99f9123d1c2bdfd70761aca3a05446c9a4c
Parents: 95a52a8
Author: Jeff Jirsa 
Authored: Wed Apr 11 08:24:40 2018 -0700
Committer: Jeff Jirsa 
Committed: Wed Apr 11 08:24:40 2018 -0700

--
 CHANGES.txt |  4 +-
 .../apache/cassandra/db/ColumnFamilyStore.java  |  8 +-
 .../org/apache/cassandra/db/Directories.java| 38 ++--
 .../db/SnapshotDetailsTabularData.java  |  6 +-
 .../db/repair/CassandraValidationIterator.java  |  4 +-
 .../db/streaming/CassandraOutgoingFile.java |  5 +-
 .../db/streaming/CassandraStreamHeader.java | 30 +++
 .../db/streaming/CassandraStreamManager.java|  2 +-
 .../db/streaming/CassandraStreamReader.java |  8 +-
 .../db/streaming/CassandraStreamWriter.java | 17 ++--
 .../CompressedCassandraStreamReader.java|  8 +-
 .../CompressedCassandraStreamWriter.java| 25 +++---
 .../cassandra/db/streaming/CompressionInfo.java |  4 +-
 .../io/compress/CompressionMetadata.java| 22 ++---
 .../cassandra/io/sstable/SSTableLoader.java |  2 +-
 .../io/sstable/format/SSTableReader.java| 83 +++---
 .../io/sstable/format/big/BigTableScanner.java  |  3 +-
 .../org/apache/cassandra/net/MessageOut.java| 44 --
 .../apache/cassandra/service/StorageProxy.java  | 91 
 .../cassandra/service/StorageService.java   |  2 +-
 .../cassandra/db/ColumnFamilyStoreTest.java |  3 +-
 .../apache/cassandra/db/DirectoriesTest.java|  9 +-
 .../cassandra/io/sstable/SSTableReaderTest.java | 17 ++--
 .../io/sstable/SSTableRewriterTest.java | 11 ++-
 .../compression/CompressedInputStreamTest.java  | 12 +--
 25 files changed, 310 insertions(+), 148 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/a831b99f/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 707ea6b..2dc2021 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,7 +1,7 @@
 4.0
+ * Refactor Pair usage to avoid boxing ints/longs (CASSANDRA-14260)
  * Add options to nodetool tablestats to sort and limit output 
(CASSANDRA-13889)
- * Rename internals to reflect CQL vocabulary
-   (CASSANDRA-14354)
+ * Rename internals to reflect CQL vocabulary (CASSANDRA-14354)
  * Add support for hybrid MIN(), MAX() speculative retry policies
(CASSANDRA-14293, CASSANDRA-14338, CASSANDRA-14352)
  * Fix some regressions caused by 14058 (CASSANDRA-14353)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/a831b99f/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
--
diff --git a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java 
b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
index 4c546dd..bfab6ea 100644
--- a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
+++ b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
@@ -1451,9 +1451,9 @@ public class ColumnFamilyStore implements ColumnFamilyStoreMBean
         Collection<Range<Token>> ranges = StorageService.instance.getLocalRanges(keyspace.getName());
         for (SSTableReader sstable : sstables)
         {
-            List<Pair<Long, Long>> positions = sstable.getPositionsForRanges(ranges);
-            for (Pair<Long, Long> position : positions)
-                expectedFileSize += position.right - position.left;
+            List<SSTableReader.PartitionPositionBounds> positions = sstable.getPositionsForRanges(ranges);
+            for (SSTableReader.PartitionPositionBounds position : positions)
+                expectedFileSize += position.upperPosition - position.lowerPosition;
         }
 
         double compressionRatio = metric.compressionRatio.getValue();
@@ -1965,7 +1965,7 @@ public class ColumnFamilyStore implements ColumnFamilyStoreMBean
      * @return  Return a map of all snapshots to space being used
      * The pair for a snapshot has true size and size on disk.
      */
-    public Map<String, Pair<Long, Long>> getSnapshotDetails()
+    public Map<String, Directories.SnapshotSizeDetails> getSnapshotDetails()
     {
         return getDirectories().getSnapshotDetails();
     }

http://git-wip-us.apache.org/repos/asf/cassandra/blob/a831b99f/src/java/org/apache/cassandra/db/Directories.java
--
diff --git a/src/java/org/apache/cassandra/db/Directories.java 
b/src/java/org/apache/cassandra/db/Direc

[jira] [Comment Edited] (CASSANDRA-14239) OutOfMemoryError when bootstrapping with less than 100GB RAM

2018-04-11 Thread Sergey Kirillov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16434042#comment-16434042
 ] 

Sergey Kirillov edited comment on CASSANDRA-14239 at 4/11/18 3:11 PM:
--

Ok. Now it is stuck at 5134 pending MemtableFlushWriter jobs. The number is not 
decreasing anymore. 

*UPD* While everything was blocked, the node had high CPU usage and was reading 
a lot from disk (which seems related to CASSANDRA-13065).
After a while the number of pending memtable jobs decreased and mutations 
unblocked, but within a minute the node died again with an OOM.


was (Author: rushman):
Ok. Now it stuck on 5134 pending MemtableFlushWriter jobs. Number is not 
decreasing anymore. 

> OutOfMemoryError when bootstrapping with less than 100GB RAM
> 
>
> Key: CASSANDRA-14239
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14239
> Project: Cassandra
>  Issue Type: Bug
> Environment: Details of the bootstrapping Node
>  * ProLiant BL460c G7
>  * 56GB RAM
>  * 2x 146GB 10K HDD (One dedicated for Commitlog, one for Data, Hints and 
> saved_caches)
>  * CentOS 7.4 on SD-Card
>  * /tmp and /var/log on tmpfs
>  * Oracle JDK 1.8.0_151
>  * Cassandra 3.11.1
> Cluster
>  * 10 existing Nodes (Up and Normal)
>Reporter: Jürgen Albersdorfer
>Priority: Major
>  Labels: materializedviews
> Attachments: Objects-by-class.csv, 
> Objects-with-biggest-retained-size.csv, Selection_420.png, Selection_421.png, 
> cassandra-env.sh, cassandra.yaml, dstat.png, gc.log.0.201804111524.zip, 
> gc.log.0.current.zip, gc.log.20180441.zip, jvm.options, jvm_opts.txt, 
> stack-traces.txt
>
>
> Hi, I face an issue when bootstrapping a node with less than 100GB RAM on 
> our 10-node C* 3.11.1 cluster.
> During bootstrap, when I watch cassandra.log I observe growth in the JVM 
> heap's Old Gen, which does not get significantly freed up any more.
> I know that the JVM collects Old Gen only when really needed. I can see 
> collections, but there is always a remainder which seems to grow forever 
> without ever getting freed.
> After the node has successfully joined the cluster, I can remove the extra 
> RAM I gave it for bootstrapping without any further effect.
> It feels like Cassandra will not forget a single byte streamed over the 
> network during bootstrapping - which would be a memory leak and a major 
> problem, too.
> I was able to produce a HeapDumpOnOutOfMemoryError from a 56GB node (40GB 
> assigned JVM heap). YourKit Profiler shows huge amounts of memory allocated 
> for org.apache.cassandra.db.Memtable (22GB), 
> org.apache.cassandra.db.rows.BufferCell (19GB), and java.nio.HeapByteBuffer 
> (11GB)






[jira] [Commented] (CASSANDRA-14239) OutOfMemoryError when bootstrapping with less than 100GB RAM

2018-04-11 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16434047#comment-16434047
 ] 

Paulo Motta commented on CASSANDRA-14239:
-

bq. Number of pending MemtableFlushWriter jobs is slowly decreasing,  I'll 
try to wait till it decrease to zero, maybe this will unblock mutations.

Perhaps you could try setting the system property 
{{-Dcassandra.repair.mutation_repair_rows_per_batch=1000}} (up from the default 
of 100) and see if this makes the pending queue decrease faster while keeping 
the GC sane.

bq. Ok. Now it stuck on 5134 pending MemtableFlushWriter jobs. Number is not 
decreasing anymore.  

Can you attach a thread dump? You can generate it via jstack 
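A sketch of how the two suggestions above might be applied on the stuck node (the property name and default come from the comment; the file paths and the pgrep pattern are assumptions):

```shell
# Raise the mutation-repair batch size (default 100); this flag would be
# appended to the node's JVM options (e.g. via cassandra-env.sh / jvm.options).
JVM_EXTRA_OPTS="-Dcassandra.repair.mutation_repair_rows_per_batch=1000"
echo "$JVM_EXTRA_OPTS"

# Once the node is stuck again, capture a thread dump for the ticket
# (uncomment on the node; assumes jstack from the same JDK is on PATH):
# jstack "$(pgrep -f CassandraDaemon)" > threads.txt
```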

> OutOfMemoryError when bootstrapping with less than 100GB RAM
> 
>
> Key: CASSANDRA-14239
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14239
> Project: Cassandra
>  Issue Type: Bug
> Environment: Details of the bootstrapping Node
>  * ProLiant BL460c G7
>  * 56GB RAM
>  * 2x 146GB 10K HDD (One dedicated for Commitlog, one for Data, Hints and 
> saved_caches)
>  * CentOS 7.4 on SD-Card
>  * /tmp and /var/log on tmpfs
>  * Oracle JDK 1.8.0_151
>  * Cassandra 3.11.1
> Cluster
>  * 10 existing Nodes (Up and Normal)
>Reporter: Jürgen Albersdorfer
>Priority: Major
>  Labels: materializedviews
> Attachments: Objects-by-class.csv, 
> Objects-with-biggest-retained-size.csv, Selection_420.png, Selection_421.png, 
> cassandra-env.sh, cassandra.yaml, dstat.png, gc.log.0.201804111524.zip, 
> gc.log.0.current.zip, gc.log.20180441.zip, jvm.options, jvm_opts.txt, 
> stack-traces.txt
>
>
> Hi, I face an issue when bootstrapping a node with less than 100GB RAM on 
> our 10-node C* 3.11.1 cluster.
> During bootstrap, when I watch cassandra.log I observe growth in the JVM 
> heap's Old Gen, which does not get significantly freed up any more.
> I know that the JVM collects Old Gen only when really needed. I can see 
> collections, but there is always a remainder which seems to grow forever 
> without ever getting freed.
> After the node has successfully joined the cluster, I can remove the extra 
> RAM I gave it for bootstrapping without any further effect.
> It feels like Cassandra will not forget a single byte streamed over the 
> network during bootstrapping - which would be a memory leak and a major 
> problem, too.
> I was able to produce a HeapDumpOnOutOfMemoryError from a 56GB node (40GB 
> assigned JVM heap). YourKit Profiler shows huge amounts of memory allocated 
> for org.apache.cassandra.db.Memtable (22GB), 
> org.apache.cassandra.db.rows.BufferCell (19GB), and java.nio.HeapByteBuffer 
> (11GB)






[jira] [Updated] (CASSANDRA-13065) Skip building views during base table streams on range movements

2018-04-11 Thread Sergey Kirillov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Kirillov updated CASSANDRA-13065:

Attachment: Selection_423.png

> Skip building views during base table streams on range movements
> 
>
> Key: CASSANDRA-13065
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13065
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Benjamin Roth
>Assignee: Benjamin Roth
>Priority: Critical
> Fix For: 4.0
>
>
> Booting or decommissioning nodes with MVs is unbearably slow as all streams go 
> through the regular write path. This causes read-before-writes for every 
> mutation, and during bootstrap it causes them to be sent to the batchlog.
> This makes it virtually impossible to boot a new node in an acceptable amount 
> of time.
> Using the regular streaming behaviour for consistent range movements works 
> much better in this case and does not break the MV local consistency contract.
> Already tested on our own cluster.
> The bootstrap case is super easy to handle; the decommission case requires 
> CASSANDRA-13064






[jira] [Updated] (CASSANDRA-14239) OutOfMemoryError when bootstrapping with less than 100GB RAM

2018-04-11 Thread Sergey Kirillov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Kirillov updated CASSANDRA-14239:

Attachment: dstat.png

> OutOfMemoryError when bootstrapping with less than 100GB RAM
> 
>
> Key: CASSANDRA-14239
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14239
> Project: Cassandra
>  Issue Type: Bug
> Environment: Details of the bootstrapping Node
>  * ProLiant BL460c G7
>  * 56GB RAM
>  * 2x 146GB 10K HDD (One dedicated for Commitlog, one for Data, Hints and 
> saved_caches)
>  * CentOS 7.4 on SD-Card
>  * /tmp and /var/log on tmpfs
>  * Oracle JDK 1.8.0_151
>  * Cassandra 3.11.1
> Cluster
>  * 10 existing Nodes (Up and Normal)
>Reporter: Jürgen Albersdorfer
>Priority: Major
>  Labels: materializedviews
> Attachments: Objects-by-class.csv, 
> Objects-with-biggest-retained-size.csv, Selection_420.png, Selection_421.png, 
> cassandra-env.sh, cassandra.yaml, dstat.png, gc.log.0.201804111524.zip, 
> gc.log.0.current.zip, gc.log.20180441.zip, jvm.options, jvm_opts.txt, 
> stack-traces.txt
>
>
> Hi, I face an issue when bootstrapping a node with less than 100GB RAM on 
> our 10-node C* 3.11.1 cluster.
> During bootstrap, when I watch cassandra.log I observe growth in the JVM 
> heap's Old Gen, which does not get significantly freed up any more.
> I know that the JVM collects Old Gen only when really needed. I can see 
> collections, but there is always a remainder which seems to grow forever 
> without ever getting freed.
> After the node has successfully joined the cluster, I can remove the extra 
> RAM I gave it for bootstrapping without any further effect.
> It feels like Cassandra will not forget a single byte streamed over the 
> network during bootstrapping - which would be a memory leak and a major 
> problem, too.
> I was able to produce a HeapDumpOnOutOfMemoryError from a 56GB node (40GB 
> assigned JVM heap). YourKit Profiler shows huge amounts of memory allocated 
> for org.apache.cassandra.db.Memtable (22GB), 
> org.apache.cassandra.db.rows.BufferCell (19GB), and java.nio.HeapByteBuffer 
> (11GB)






[jira] [Updated] (CASSANDRA-13065) Skip building views during base table streams on range movements

2018-04-11 Thread Sergey Kirillov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Kirillov updated CASSANDRA-13065:

Attachment: (was: Selection_423.png)

> Skip building views during base table streams on range movements
> 
>
> Key: CASSANDRA-13065
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13065
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Benjamin Roth
>Assignee: Benjamin Roth
>Priority: Critical
> Fix For: 4.0
>
>
> Booting or decommissioning nodes with MVs is unbearably slow as all streams go 
> through the regular write path. This causes read-before-writes for every 
> mutation, and during bootstrap it causes them to be sent to the batchlog.
> This makes it virtually impossible to boot a new node in an acceptable amount 
> of time.
> Using the regular streaming behaviour for consistent range movements works 
> much better in this case and does not break the MV local consistency contract.
> Already tested on our own cluster.
> The bootstrap case is super easy to handle; the decommission case requires 
> CASSANDRA-13064






[jira] [Updated] (CASSANDRA-14239) OutOfMemoryError when bootstrapping with less than 100GB RAM

2018-04-11 Thread Sergey Kirillov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Kirillov updated CASSANDRA-14239:

Attachment: Selection_421.png

> OutOfMemoryError when bootstrapping with less than 100GB RAM
> 
>
> Key: CASSANDRA-14239
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14239
> Project: Cassandra
>  Issue Type: Bug
> Environment: Details of the bootstrapping Node
>  * ProLiant BL460c G7
>  * 56GB RAM
>  * 2x 146GB 10K HDD (One dedicated for Commitlog, one for Data, Hints and 
> saved_caches)
>  * CentOS 7.4 on SD-Card
>  * /tmp and /var/log on tmpfs
>  * Oracle JDK 1.8.0_151
>  * Cassandra 3.11.1
> Cluster
>  * 10 existing Nodes (Up and Normal)
>Reporter: Jürgen Albersdorfer
>Priority: Major
>  Labels: materializedviews
> Attachments: Objects-by-class.csv, 
> Objects-with-biggest-retained-size.csv, Selection_420.png, Selection_421.png, 
> cassandra-env.sh, cassandra.yaml, gc.log.0.201804111524.zip, 
> gc.log.0.current.zip, gc.log.20180441.zip, jvm.options, jvm_opts.txt, 
> stack-traces.txt
>
>
> Hi, I face an issue when bootstrapping a node with less than 100GB RAM on 
> our 10-node C* 3.11.1 cluster.
> During bootstrap, when I watch cassandra.log I observe growth in the JVM 
> heap's Old Gen, which does not get significantly freed up any more.
> I know that the JVM collects Old Gen only when really needed. I can see 
> collections, but there is always a remainder which seems to grow forever 
> without ever getting freed.
> After the node has successfully joined the cluster, I can remove the extra 
> RAM I gave it for bootstrapping without any further effect.
> It feels like Cassandra will not forget a single byte streamed over the 
> network during bootstrapping - which would be a memory leak and a major 
> problem, too.
> I was able to produce a HeapDumpOnOutOfMemoryError from a 56GB node (40GB 
> assigned JVM heap). YourKit Profiler shows huge amounts of memory allocated 
> for org.apache.cassandra.db.Memtable (22GB), 
> org.apache.cassandra.db.rows.BufferCell (19GB), and java.nio.HeapByteBuffer 
> (11GB)






[jira] [Commented] (CASSANDRA-14239) OutOfMemoryError when bootstrapping with less than 100GB RAM

2018-04-11 Thread Sergey Kirillov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16434042#comment-16434042
 ] 

Sergey Kirillov commented on CASSANDRA-14239:
-

Ok. Now it is stuck at 5134 pending MemtableFlushWriter jobs. The number is not 
decreasing anymore. 

> OutOfMemoryError when bootstrapping with less than 100GB RAM
> 
>
> Key: CASSANDRA-14239
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14239
> Project: Cassandra
>  Issue Type: Bug
> Environment: Details of the bootstrapping Node
>  * ProLiant BL460c G7
>  * 56GB RAM
>  * 2x 146GB 10K HDD (One dedicated for Commitlog, one for Data, Hints and 
> saved_caches)
>  * CentOS 7.4 on SD-Card
>  * /tmp and /var/log on tmpfs
>  * Oracle JDK 1.8.0_151
>  * Cassandra 3.11.1
> Cluster
>  * 10 existing Nodes (Up and Normal)
>Reporter: Jürgen Albersdorfer
>Priority: Major
>  Labels: materializedviews
> Attachments: Objects-by-class.csv, 
> Objects-with-biggest-retained-size.csv, Selection_420.png, cassandra-env.sh, 
> cassandra.yaml, gc.log.0.201804111524.zip, gc.log.0.current.zip, 
> gc.log.20180441.zip, jvm.options, jvm_opts.txt, stack-traces.txt
>
>
> Hi, I face an issue when bootstrapping a node with less than 100GB RAM on 
> our 10-node C* 3.11.1 cluster.
> During bootstrap, when I watch cassandra.log I observe growth in the JVM 
> heap's Old Gen, which does not get significantly freed up any more.
> I know that the JVM collects Old Gen only when really needed. I can see 
> collections, but there is always a remainder which seems to grow forever 
> without ever getting freed.
> After the node has successfully joined the cluster, I can remove the extra 
> RAM I gave it for bootstrapping without any further effect.
> It feels like Cassandra will not forget a single byte streamed over the 
> network during bootstrapping - which would be a memory leak and a major 
> problem, too.
> I was able to produce a HeapDumpOnOutOfMemoryError from a 56GB node (40GB 
> assigned JVM heap). YourKit Profiler shows huge amounts of memory allocated 
> for org.apache.cassandra.db.Memtable (22GB), 
> org.apache.cassandra.db.rows.BufferCell (19GB), and java.nio.HeapByteBuffer 
> (11GB)






[jira] [Updated] (CASSANDRA-14239) OutOfMemoryError when bootstrapping with less than 100GB RAM

2018-04-11 Thread Sergey Kirillov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Kirillov updated CASSANDRA-14239:

Attachment: Selection_420.png

> OutOfMemoryError when bootstrapping with less than 100GB RAM
> 
>
> Key: CASSANDRA-14239
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14239
> Project: Cassandra
>  Issue Type: Bug
> Environment: Details of the bootstrapping Node
>  * ProLiant BL460c G7
>  * 56GB RAM
>  * 2x 146GB 10K HDD (One dedicated for Commitlog, one for Data, Hints and 
> saved_caches)
>  * CentOS 7.4 on SD-Card
>  * /tmp and /var/log on tmpfs
>  * Oracle JDK 1.8.0_151
>  * Cassandra 3.11.1
> Cluster
>  * 10 existing Nodes (Up and Normal)
>Reporter: Jürgen Albersdorfer
>Priority: Major
>  Labels: materializedviews
> Attachments: Objects-by-class.csv, 
> Objects-with-biggest-retained-size.csv, Selection_420.png, cassandra-env.sh, 
> cassandra.yaml, gc.log.0.201804111524.zip, gc.log.0.current.zip, 
> gc.log.20180441.zip, jvm.options, jvm_opts.txt, stack-traces.txt
>
>
> Hi, I face an issue when bootstrapping a node with less than 100GB RAM on 
> our 10-node C* 3.11.1 cluster.
> During bootstrap, when I watch cassandra.log I observe growth in the JVM 
> heap's Old Gen, which does not get significantly freed up any more.
> I know that the JVM collects Old Gen only when really needed. I can see 
> collections, but there is always a remainder which seems to grow forever 
> without ever getting freed.
> After the node has successfully joined the cluster, I can remove the extra 
> RAM I gave it for bootstrapping without any further effect.
> It feels like Cassandra will not forget a single byte streamed over the 
> network during bootstrapping - which would be a memory leak and a major 
> problem, too.
> I was able to produce a HeapDumpOnOutOfMemoryError from a 56GB node (40GB 
> assigned JVM heap). YourKit Profiler shows huge amounts of memory allocated 
> for org.apache.cassandra.db.Memtable (22GB), 
> org.apache.cassandra.db.rows.BufferCell (19GB), and java.nio.HeapByteBuffer 
> (11GB)






[jira] [Commented] (CASSANDRA-14239) OutOfMemoryError when bootstrapping with less than 100GB RAM

2018-04-11 Thread Sergey Kirillov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16434030#comment-16434030
 ] 

Sergey Kirillov commented on CASSANDRA-14239:
-

[~pauloricardomg] I've done a quick and dirty backport of CASSANDRA-13299 to 
3.10 (which I'm using right now); so far there is no OOM, but the node is 
still stuck in MutationStage. The number of pending MemtableFlushWriter jobs 
is slowly decreasing; I'll try to wait till it decreases to zero, maybe this 
will unblock mutations.

> OutOfMemoryError when bootstrapping with less than 100GB RAM
> 
>
> Key: CASSANDRA-14239
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14239
> Project: Cassandra
>  Issue Type: Bug
> Environment: Details of the bootstrapping Node
>  * ProLiant BL460c G7
>  * 56GB RAM
>  * 2x 146GB 10K HDD (One dedicated for Commitlog, one for Data, Hints and 
> saved_caches)
>  * CentOS 7.4 on SD-Card
>  * /tmp and /var/log on tmpfs
>  * Oracle JDK 1.8.0_151
>  * Cassandra 3.11.1
> Cluster
>  * 10 existing Nodes (Up and Normal)
>Reporter: Jürgen Albersdorfer
>Priority: Major
>  Labels: materializedviews
> Attachments: Objects-by-class.csv, 
> Objects-with-biggest-retained-size.csv, cassandra-env.sh, cassandra.yaml, 
> gc.log.0.201804111524.zip, gc.log.0.current.zip, gc.log.20180441.zip, 
> jvm.options, jvm_opts.txt, stack-traces.txt
>
>
> Hi, I face an issue when bootstrapping a Node with less than 100GB RAM on 
> our 10-Node C* 3.11.1 Cluster.
> During bootstrap, when I watch cassandra.log I observe growth in the JVM 
> Heap Old Gen that is no longer significantly freed.
> I know the JVM collects the Old Gen only when really needed. I can see 
> collections, but there is always a remainder that seems to grow forever 
> without ever being freed.
> After the Node has successfully joined the Cluster, I can remove the extra RAM 
> I gave it for bootstrapping without any further effect.
> It feels like Cassandra never forgets a single byte streamed over the 
> Network during bootstrapping, which would be a memory leak and a major 
> problem, too.
> I was able to produce a HeapDumpOnOutOfMemoryError from a 56GB Node (40 GB 
> assigned JVM Heap). YourKit Profiler shows huge amounts of Memory allocated 
> for org.apache.cassandra.db.Memtable (22 GB), 
> org.apache.cassandra.db.rows.BufferCell (19 GB) and java.nio.HeapByteBuffer 
> (11 GB).






[jira] [Updated] (CASSANDRA-14239) OutOfMemoryError when bootstrapping with less than 100GB RAM

2018-04-11 Thread Paulo Motta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paulo Motta updated CASSANDRA-14239:

Labels: materializedviews  (was: )







[jira] [Commented] (CASSANDRA-14239) OutOfMemoryError when bootstrapping with less than 100GB RAM

2018-04-11 Thread JIRA

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433974#comment-16433974
 ] 

Jürgen Albersdorfer commented on CASSANDRA-14239:
-

Thanks for your confirmation [~pauloricardomg], I thought I was hunting ghosts 
here.







[jira] [Commented] (CASSANDRA-14239) OutOfMemoryError when bootstrapping with less than 100GB RAM

2018-04-11 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433933#comment-16433933
 ] 

Paulo Motta commented on CASSANDRA-14239:
-

{quote}Removing MVs is not easy in my case, but now at least I know that it is 
worth the effort.
{quote}
We've made a few improvements to MV bootstrap performance in CASSANDRA-13299 
and CASSANDRA-13065, but unfortunately these are only available in 4.0.

The OOMs during bootstrap reported here could probably benefit from 
CASSANDRA-13299, so I think it's worth backporting it to 3.11, which 
shouldn't be very hard. If you (or anyone else) feel adventurous and are 
willing to try a backport, [~rushman], I'm happy to review it.







[jira] [Commented] (CASSANDRA-14239) OutOfMemoryError when bootstrapping with less than 100GB RAM

2018-04-11 Thread Sergey Kirillov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433921#comment-16433921
 ] 

Sergey Kirillov commented on CASSANDRA-14239:
-

[~jalbersdorfer] so, I was right, it is related to MV updates. It is really 
helpful to know this.

Removing MVs is not easy in my case, but now at least I know that it is worth 
the effort.







[jira] [Commented] (CASSANDRA-14239) OutOfMemoryError when bootstrapping with less than 100GB RAM

2018-04-11 Thread JIRA

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433914#comment-16433914
 ] 

Jürgen Albersdorfer commented on CASSANDRA-14239:
-

Heap Management looked great this time, too. See [^gc.log.0.201804111524.zip] 
at [http://gceasy.io|http://gceasy.io/] - the last big reclaim was triggered 
manually via JMX after the join completed successfully.







[jira] [Updated] (CASSANDRA-14239) OutOfMemoryError when bootstrapping with less than 100GB RAM

2018-04-11 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jürgen Albersdorfer updated CASSANDRA-14239:

Attachment: gc.log.0.201804111524.zip







[jira] [Commented] (CASSANDRA-14239) OutOfMemoryError when bootstrapping with less than 100GB RAM

2018-04-11 Thread JIRA

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433908#comment-16433908
 ] 

Jürgen Albersdorfer commented on CASSANDRA-14239:
-

[~rushman]: I dropped my MATERIALIZED VIEW and did the join again; it worked 
perfectly!







[jira] [Commented] (CASSANDRA-13971) Automatic certificate management using Vault

2018-04-11 Thread Stefan Podkowinski (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433902#comment-16433902
 ] 

Stefan Podkowinski commented on CASSANDRA-13971:


Any update on the review status [~jasobrown], now that the 4.0 window is about 
to close?

> Automatic certificate management using Vault
> 
>
> Key: CASSANDRA-13971
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13971
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Streaming and Messaging
>Reporter: Stefan Podkowinski
>Assignee: Stefan Podkowinski
>Priority: Major
>  Labels: security
> Fix For: 4.x
>
>
> We've been adding security features over the last few years to enable users to 
> secure their clusters, if they are willing to use them and do so correctly. 
> Some features are powerful and easy to work with, such as role-based 
> authorization. Other features that require managing a local keystore are 
> rather painful to deal with. Think about setting up SSL...
> To be fair, keystore-related issues and certificate handling weren't 
> invented by us. We're just following Java standards there. But that doesn't 
> mean that we absolutely have to, if there are better options. I'd like to 
> give it a shot and find out whether we can automate certificate/key handling 
> (PKI) using external APIs. In this case, the implementation will be based 
> on [Vault|https://vaultproject.io]. But certificate management services 
> offered by cloud providers may also be able to handle the use case, and I 
> intend to create a generic, pluggable API for that.






[jira] [Created] (CASSANDRA-14377) Returning invalid JSON for NaN and Infinity float values

2018-04-11 Thread Piotr Sarna (JIRA)
Piotr Sarna created CASSANDRA-14377:
---

 Summary: Returning invalid JSON for NaN and Infinity float values
 Key: CASSANDRA-14377
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14377
 Project: Cassandra
  Issue Type: Bug
  Components: CQL
Reporter: Piotr Sarna


After inserting special float values like NaN and Infinity into a table:

{{CREATE TABLE testme (t1 bigint, t2 float, t3 float, PRIMARY KEY (t1));}}
{{INSERT INTO testme (t1, t2, t3) VALUES (7, NaN, Infinity);}}

and returning them as JSON...

{{cqlsh:demodb> select json * from testme;}}
{{ [json]}}
{{--}}
{{ \{"t1": 7, "t2": NaN, "t3": Infinity}}}

 

... the result does not validate (e.g. with 
[https://jsonlint.com/|https://jsonlint.com/]) because neither NaN nor 
Infinity is a valid JSON value. The consensus seems to be to return JSON's 
`null` in these cases, based on this article 
[https://stackoverflow.com/questions/1423081/json-left-out-infinity-and-nan-json-status-in-ecmascript]
 and other similar ones.
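A minimal Python sketch of the behavior described above (Python's {{json}} module is used for illustration only; this is not Cassandra code). By default it emits the same non-standard NaN/Infinity tokens, and the proposed fix amounts to mapping non-finite floats to null before encoding. The helper name {{finite_or_none}} is made up for this example:

```python
import json
import math

def finite_or_none(x):
    """Map NaN/Infinity to None (JSON null), as proposed in this ticket."""
    if isinstance(x, float) and not math.isfinite(x):
        return None
    return x

row = {"t1": 7, "t2": float("nan"), "t3": float("inf")}

# Python's json module mirrors the bug: by default it emits the
# non-standard tokens NaN/Infinity, which strict parsers reject.
print(json.dumps(row))  # {"t1": 7, "t2": NaN, "t3": Infinity}

# With allow_nan=False the encoder refuses to produce invalid JSON...
try:
    json.dumps(row, allow_nan=False)
except ValueError as e:
    print("rejected:", e)

# ...so the fix proposed here substitutes null before encoding.
cleaned = {k: finite_or_none(v) for k, v in row.items()}
print(json.dumps(cleaned, allow_nan=False))  # {"t1": 7, "t2": null, "t3": null}
```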






[jira] [Updated] (CASSANDRA-14310) Don't allow nodetool refresh before cfs is opened

2018-04-11 Thread Marcus Eriksson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-14310:

   Resolution: Fixed
Fix Version/s: (was: 3.11.x)
   (was: 4.x)
   (was: 3.0.x)
   3.11.3
   3.0.17
   4.0
   Status: Resolved  (was: Patch Available)

committed as {{22bb413ba29aa6a95034b7dac833a8273983fa42}} and merged up, thanks!

> Don't allow nodetool refresh before cfs is opened
> -
>
> Key: CASSANDRA-14310
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14310
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Major
> Fix For: 4.0, 3.0.17, 3.11.3
>
>
> There is a potential deadlock during startup if nodetool refresh is called 
> while sstables are being opened. We should not allow refresh to be called 
> before everything is initialized.
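A toy Python sketch of the guard pattern this fix uses (not Cassandra code; the class and exception names are invented for illustration): reject the management operation with a fast failure until startup initialization has completed, instead of letting it block on locks held during startup.

```python
import threading

class RefreshNotReady(RuntimeError):
    """Raised when refresh is requested before startup has finished."""

class Service:
    """Illustrative guard: management ops fail fast until startup completes."""

    def __init__(self):
        self._initialized = threading.Event()

    def finish_startup(self):
        # Called once all sstables have been opened.
        self._initialized.set()

    def load_new_sstables(self):
        # Fail fast instead of deadlocking against startup.
        if not self._initialized.is_set():
            raise RefreshNotReady("Not yet initialized, can't load new sstables")
        return "refresh ran"

svc = Service()
try:
    svc.load_new_sstables()  # too early: rejected rather than hanging
except RefreshNotReady as e:
    print("rejected:", e)

svc.finish_startup()
print(svc.load_new_sstables())  # refresh ran
```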






cassandra-dtest git commit: Make sure we don't deadlock on nodetool refresh

2018-04-11 Thread marcuse
Repository: cassandra-dtest
Updated Branches:
  refs/heads/master 9c2eb35a8 -> 95735a4d0


Make sure we don't deadlock on nodetool refresh

Patch by marcuse; reviewed by Jordan West for CASSANDRA-14310


Project: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/commit/95735a4d
Tree: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/tree/95735a4d
Diff: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/diff/95735a4d

Branch: refs/heads/master
Commit: 95735a4d0049249acc7de23465d89e07792c3de6
Parents: 9c2eb35
Author: Marcus Eriksson 
Authored: Mon Apr 9 13:56:28 2018 +0200
Committer: Marcus Eriksson 
Committed: Wed Apr 11 15:03:35 2018 +0200

--
 byteman/sstable_open_delay.btm | 11 +++
 refresh_test.py| 38 +
 2 files changed, 49 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/cassandra-dtest/blob/95735a4d/byteman/sstable_open_delay.btm
--
diff --git a/byteman/sstable_open_delay.btm b/byteman/sstable_open_delay.btm
new file mode 100644
index 000..d31c2d0
--- /dev/null
+++ b/byteman/sstable_open_delay.btm
@@ -0,0 +1,11 @@
+#
+# Make sstable opening on startup slower
+#
+RULE slow startup sstable opening
+CLASS org.apache.cassandra.io.sstable.format.big.BigFormat$ReaderFactory
+METHOD open
+AT ENTRY
+IF TRUE
+DO
+Thread.sleep(1);
+ENDRULE

http://git-wip-us.apache.org/repos/asf/cassandra-dtest/blob/95735a4d/refresh_test.py
--
diff --git a/refresh_test.py b/refresh_test.py
new file mode 100644
index 000..4177ec8
--- /dev/null
+++ b/refresh_test.py
@@ -0,0 +1,38 @@
+import time
+
+from dtest import Tester
+from ccmlib.node import ToolError
+import pytest
+
+since = pytest.mark.since
+
+@since('3.0')
+class TestRefresh(Tester):
+    def test_refresh_deadlock_startup(self):
+        """ Test refresh deadlock during startup (CASSANDRA-14310) """
+        self.cluster.populate(1)
+        node = self.cluster.nodelist()[0]
+        node.byteman_port = '8100'
+        node.import_config_files()
+        self.cluster.start(wait_other_notice=True)
+        session = self.patient_cql_connection(node)
+        session.execute("CREATE KEYSPACE ks WITH replication = {'class':'SimpleStrategy', 'replication_factor':1}")
+        session.execute("CREATE TABLE ks.a (id int primary key, d text)")
+        session.execute("CREATE TABLE ks.b (id int primary key, d text)")
+        node.nodetool("disableautocompaction")  # make sure we have more than 1 sstable
+        for x in range(0, 10):
+            session.execute("INSERT INTO ks.a (id, d) VALUES (%d, '%d %d')" % (x, x, x))
+            session.execute("INSERT INTO ks.b (id, d) VALUES (%d, '%d %d')" % (x, x, x))
+            node.flush()
+        node.stop()
+        node.update_startup_byteman_script('byteman/sstable_open_delay.btm')
+        node.start()
+        node.watch_log_for("opening keyspace ks", filename="debug.log")
+        time.sleep(5)
+        for x in range(0, 20):
+            try:
+                node.nodetool("refresh ks a")
+                node.nodetool("refresh ks b")
+            except ToolError:
+                pass  # this is OK post-14310 - we just don't want to hang forever
+            time.sleep(1)





[3/6] cassandra git commit: Avoid deadlock when running nodetool refresh before node is fully up

2018-04-11 Thread marcuse
Avoid deadlock when running nodetool refresh before node is fully up

Patch by marcuse; reviewed by Jordan West for CASSANDRA-14310


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/22bb413b
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/22bb413b
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/22bb413b

Branch: refs/heads/trunk
Commit: 22bb413ba29aa6a95034b7dac833a8273983fa42
Parents: edcb90f
Author: Marcus Eriksson 
Authored: Tue Mar 13 08:45:30 2018 +0100
Committer: Marcus Eriksson 
Committed: Wed Apr 11 14:47:05 2018 +0200

--
 CHANGES.txt   | 1 +
 src/java/org/apache/cassandra/db/ColumnFamilyStore.java   | 2 +-
 src/java/org/apache/cassandra/service/StorageService.java | 2 ++
 3 files changed, 4 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/22bb413b/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 94b2276..9012f8c 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.0.17
+ * Avoid deadlock when running nodetool refresh before node is fully up (CASSANDRA-14310)
  * Handle all exceptions when opening sstables (CASSANDRA-14202)
  * Handle incompletely written hint descriptors during startup (CASSANDRA-14080)
  * Handle repeat open bound from SRP in read repair (CASSANDRA-14330)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/22bb413b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
--
diff --git a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java 
b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
index 14e06b0..4c7bc46 100644
--- a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
+++ b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
@@ -653,7 +653,7 @@ public class ColumnFamilyStore implements ColumnFamilyStoreMBean
      * @param ksName The keyspace name
      * @param cfName The columnFamily name
      */
-    public static synchronized void loadNewSSTables(String ksName, String cfName)
+    public static void loadNewSSTables(String ksName, String cfName)
     {
         /** ks/cf existence checks will be done by open and getCFS methods for us */
         Keyspace keyspace = Keyspace.open(ksName);

http://git-wip-us.apache.org/repos/asf/cassandra/blob/22bb413b/src/java/org/apache/cassandra/service/StorageService.java
--
diff --git a/src/java/org/apache/cassandra/service/StorageService.java 
b/src/java/org/apache/cassandra/service/StorageService.java
index cf8e257..77fcb81 100644
--- a/src/java/org/apache/cassandra/service/StorageService.java
+++ b/src/java/org/apache/cassandra/service/StorageService.java
@@ -4597,6 +4597,8 @@ public class StorageService extends NotificationBroadcasterSupport implements IE
      */
     public void loadNewSSTables(String ksName, String cfName)
     {
+        if (!isInitialized())
+            throw new RuntimeException("Not yet initialized, can't load new sstables");
         ColumnFamilyStore.loadNewSSTables(ksName, cfName);
     }
 





[2/6] cassandra git commit: Avoid deadlock when running nodetool refresh before node is fully up

2018-04-11 Thread marcuse
Avoid deadlock when running nodetool refresh before node is fully up

Patch by marcuse; reviewed by Jordan West for CASSANDRA-14310


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/22bb413b
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/22bb413b
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/22bb413b

Branch: refs/heads/cassandra-3.11
Commit: 22bb413ba29aa6a95034b7dac833a8273983fa42
Parents: edcb90f
Author: Marcus Eriksson 
Authored: Tue Mar 13 08:45:30 2018 +0100
Committer: Marcus Eriksson 
Committed: Wed Apr 11 14:47:05 2018 +0200

--
 CHANGES.txt   | 1 +
 src/java/org/apache/cassandra/db/ColumnFamilyStore.java   | 2 +-
 src/java/org/apache/cassandra/service/StorageService.java | 2 ++
 3 files changed, 4 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/22bb413b/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 94b2276..9012f8c 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.0.17
+ * Avoid deadlock when running nodetool refresh before node is fully up (CASSANDRA-14310)
  * Handle all exceptions when opening sstables (CASSANDRA-14202)
  * Handle incompletely written hint descriptors during startup (CASSANDRA-14080)
  * Handle repeat open bound from SRP in read repair (CASSANDRA-14330)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/22bb413b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
--
diff --git a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
index 14e06b0..4c7bc46 100644
--- a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
+++ b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
@@ -653,7 +653,7 @@ public class ColumnFamilyStore implements ColumnFamilyStoreMBean
      * @param ksName The keyspace name
      * @param cfName The columnFamily name
      */
-    public static synchronized void loadNewSSTables(String ksName, String cfName)
+    public static void loadNewSSTables(String ksName, String cfName)
    {
        /** ks/cf existence checks will be done by open and getCFS methods for us */
        Keyspace keyspace = Keyspace.open(ksName);

http://git-wip-us.apache.org/repos/asf/cassandra/blob/22bb413b/src/java/org/apache/cassandra/service/StorageService.java
--
diff --git a/src/java/org/apache/cassandra/service/StorageService.java b/src/java/org/apache/cassandra/service/StorageService.java
index cf8e257..77fcb81 100644
--- a/src/java/org/apache/cassandra/service/StorageService.java
+++ b/src/java/org/apache/cassandra/service/StorageService.java
@@ -4597,6 +4597,8 @@ public class StorageService extends NotificationBroadcasterSupport implements IE
      */
    public void loadNewSSTables(String ksName, String cfName)
    {
+        if (!isInitialized())
+            throw new RuntimeException("Not yet initialized, can't load new sstables");
        ColumnFamilyStore.loadNewSSTables(ksName, cfName);
    }
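The guard in the patch can be illustrated in isolation. A minimal sketch of the fail-fast pattern (class and field names here are hypothetical; the real check delegates to StorageService.isInitialized()): rejecting the call outright avoids blocking on a synchronized path before the node is fully started.

```java
// Hypothetical standalone sketch of the CASSANDRA-14310 guard: refuse the
// operation until the service has been marked initialized, instead of
// letting the caller block and risk a startup deadlock.
final class RefreshGuard {
    private volatile boolean initialized = false;

    void markInitialized() { initialized = true; }

    void loadNewSSTables(String ksName, String cfName) {
        if (!initialized)
            throw new RuntimeException("Not yet initialized, can't load new sstables");
        // ... delegate to the actual sstable-loading logic ...
    }
}
```

Combined with dropping `synchronized` from CFS.loadNewSSTables, the early throw means `nodetool refresh` issued during startup fails cleanly rather than hanging.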
 





[6/6] cassandra git commit: Merge branch 'cassandra-3.11' into trunk

2018-04-11 Thread marcuse
Merge branch 'cassandra-3.11' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/95a52a8b
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/95a52a8b
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/95a52a8b

Branch: refs/heads/trunk
Commit: 95a52a8bfbabb4acb3518ee7f5e6d256110d7bf0
Parents: 42827e6 75a9320
Author: Marcus Eriksson 
Authored: Wed Apr 11 14:55:10 2018 +0200
Committer: Marcus Eriksson 
Committed: Wed Apr 11 14:55:10 2018 +0200

--
 CHANGES.txt   | 1 +
 src/java/org/apache/cassandra/db/ColumnFamilyStore.java   | 2 +-
 src/java/org/apache/cassandra/service/StorageService.java | 2 ++
 3 files changed, 4 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/95a52a8b/CHANGES.txt
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/95a52a8b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/95a52a8b/src/java/org/apache/cassandra/service/StorageService.java
--





[1/6] cassandra git commit: Avoid deadlock when running nodetool refresh before node is fully up

2018-04-11 Thread marcuse
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-3.0 edcb90f08 -> 22bb413ba
  refs/heads/cassandra-3.11 19e329eb5 -> 75a932087
  refs/heads/trunk 42827e6a6 -> 95a52a8bf


Avoid deadlock when running nodetool refresh before node is fully up

Patch by marcuse; reviewed by Jordan West for CASSANDRA-14310


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/22bb413b
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/22bb413b
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/22bb413b

Branch: refs/heads/cassandra-3.0
Commit: 22bb413ba29aa6a95034b7dac833a8273983fa42
Parents: edcb90f
Author: Marcus Eriksson 
Authored: Tue Mar 13 08:45:30 2018 +0100
Committer: Marcus Eriksson 
Committed: Wed Apr 11 14:47:05 2018 +0200

--
 CHANGES.txt   | 1 +
 src/java/org/apache/cassandra/db/ColumnFamilyStore.java   | 2 +-
 src/java/org/apache/cassandra/service/StorageService.java | 2 ++
 3 files changed, 4 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/22bb413b/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 94b2276..9012f8c 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.0.17
+ * Avoid deadlock when running nodetool refresh before node is fully up (CASSANDRA-14310)
  * Handle all exceptions when opening sstables (CASSANDRA-14202)
  * Handle incompletely written hint descriptors during startup (CASSANDRA-14080)
  * Handle repeat open bound from SRP in read repair (CASSANDRA-14330)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/22bb413b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
--
diff --git a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
index 14e06b0..4c7bc46 100644
--- a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
+++ b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
@@ -653,7 +653,7 @@ public class ColumnFamilyStore implements ColumnFamilyStoreMBean
      * @param ksName The keyspace name
      * @param cfName The columnFamily name
      */
-    public static synchronized void loadNewSSTables(String ksName, String cfName)
+    public static void loadNewSSTables(String ksName, String cfName)
    {
        /** ks/cf existence checks will be done by open and getCFS methods for us */
        Keyspace keyspace = Keyspace.open(ksName);

http://git-wip-us.apache.org/repos/asf/cassandra/blob/22bb413b/src/java/org/apache/cassandra/service/StorageService.java
--
diff --git a/src/java/org/apache/cassandra/service/StorageService.java b/src/java/org/apache/cassandra/service/StorageService.java
index cf8e257..77fcb81 100644
--- a/src/java/org/apache/cassandra/service/StorageService.java
+++ b/src/java/org/apache/cassandra/service/StorageService.java
@@ -4597,6 +4597,8 @@ public class StorageService extends NotificationBroadcasterSupport implements IE
      */
    public void loadNewSSTables(String ksName, String cfName)
    {
+        if (!isInitialized())
+            throw new RuntimeException("Not yet initialized, can't load new sstables");
        ColumnFamilyStore.loadNewSSTables(ksName, cfName);
    }
 





[4/6] cassandra git commit: Merge branch 'cassandra-3.0' into cassandra-3.11

2018-04-11 Thread marcuse
Merge branch 'cassandra-3.0' into cassandra-3.11


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/75a93208
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/75a93208
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/75a93208

Branch: refs/heads/trunk
Commit: 75a932087d027a569f37b1e3c1047aaff107549e
Parents: 19e329e 22bb413
Author: Marcus Eriksson 
Authored: Wed Apr 11 14:47:47 2018 +0200
Committer: Marcus Eriksson 
Committed: Wed Apr 11 14:47:47 2018 +0200

--
 CHANGES.txt   | 1 +
 src/java/org/apache/cassandra/db/ColumnFamilyStore.java   | 2 +-
 src/java/org/apache/cassandra/service/StorageService.java | 2 ++
 3 files changed, 4 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/75a93208/CHANGES.txt
--
diff --cc CHANGES.txt
index e0145d4,9012f8c..e55ae28
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,13 -1,5 +1,14 @@@
 -3.0.17
 +3.11.3
 + * Downgrade log level to trace for CommitLogSegmentManager (CASSANDRA-14370)
 + * CQL fromJson(null) throws NullPointerException (CASSANDRA-13891)
 + * Serialize empty buffer as empty string for json output format (CASSANDRA-14245)
 + * Allow logging implementation to be interchanged for embedded testing (CASSANDRA-13396)
 + * SASI tokenizer for simple delimiter based entries (CASSANDRA-14247)
 + * Fix Loss of digits when doing CAST from varint/bigint to decimal (CASSANDRA-14170)
 + * RateBasedBackPressure unnecessarily invokes a lock on the Guava RateLimiter (CASSANDRA-14163)
 + * Fix wildcard GROUP BY queries (CASSANDRA-14209)
 +Merged from 3.0:
+  * Avoid deadlock when running nodetool refresh before node is fully up (CASSANDRA-14310)
   * Handle all exceptions when opening sstables (CASSANDRA-14202)
   * Handle incompletely written hint descriptors during startup (CASSANDRA-14080)
   * Handle repeat open bound from SRP in read repair (CASSANDRA-14330)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/75a93208/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/75a93208/src/java/org/apache/cassandra/service/StorageService.java
--





[5/6] cassandra git commit: Merge branch 'cassandra-3.0' into cassandra-3.11

2018-04-11 Thread marcuse
Merge branch 'cassandra-3.0' into cassandra-3.11


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/75a93208
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/75a93208
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/75a93208

Branch: refs/heads/cassandra-3.11
Commit: 75a932087d027a569f37b1e3c1047aaff107549e
Parents: 19e329e 22bb413
Author: Marcus Eriksson 
Authored: Wed Apr 11 14:47:47 2018 +0200
Committer: Marcus Eriksson 
Committed: Wed Apr 11 14:47:47 2018 +0200

--
 CHANGES.txt   | 1 +
 src/java/org/apache/cassandra/db/ColumnFamilyStore.java   | 2 +-
 src/java/org/apache/cassandra/service/StorageService.java | 2 ++
 3 files changed, 4 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/75a93208/CHANGES.txt
--
diff --cc CHANGES.txt
index e0145d4,9012f8c..e55ae28
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,13 -1,5 +1,14 @@@
 -3.0.17
 +3.11.3
 + * Downgrade log level to trace for CommitLogSegmentManager (CASSANDRA-14370)
 + * CQL fromJson(null) throws NullPointerException (CASSANDRA-13891)
 + * Serialize empty buffer as empty string for json output format (CASSANDRA-14245)
 + * Allow logging implementation to be interchanged for embedded testing (CASSANDRA-13396)
 + * SASI tokenizer for simple delimiter based entries (CASSANDRA-14247)
 + * Fix Loss of digits when doing CAST from varint/bigint to decimal (CASSANDRA-14170)
 + * RateBasedBackPressure unnecessarily invokes a lock on the Guava RateLimiter (CASSANDRA-14163)
 + * Fix wildcard GROUP BY queries (CASSANDRA-14209)
 +Merged from 3.0:
+  * Avoid deadlock when running nodetool refresh before node is fully up (CASSANDRA-14310)
   * Handle all exceptions when opening sstables (CASSANDRA-14202)
   * Handle incompletely written hint descriptors during startup (CASSANDRA-14080)
   * Handle repeat open bound from SRP in read repair (CASSANDRA-14330)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/75a93208/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/75a93208/src/java/org/apache/cassandra/service/StorageService.java
--





[jira] [Comment Edited] (CASSANDRA-6719) redesign loadnewsstables

2018-04-11 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433839#comment-16433839
 ] 

Marcus Eriksson edited comment on CASSANDRA-6719 at 4/11/18 12:33 PM:
--

pushed a commit with the comments addressed 
[here|https://github.com/krummas/cassandra/commits/marcuse/6719]

bq. FSUtils.handleCorruptSSTable/handleFSError are no longer called
this is on purpose and we should probably fix this in 3.0+ as well - I don't 
think we want to trigger the disk failure policy if import fails - instead we 
should abort the import. If someone has configured {{disk_failure_policy: 
stop_paranoid}} trying to load a corrupt file would actually stop the node
bq. Row cache invalidation was not previously performed — this is a good thing 
regardless, so maybe skip an option for this one. 
added an option to explicitly skip the row cache invalidation - also added a 
{{--quick}} option which makes it behave more like the old version
bq. If using nodetool refresh with JBOD, the counting keys per boundary work is 
done just to throw it away.
added a check if there is only a single data directory (and row cache is not 
enabled)
bq. Minor/naming nit: consider renaming CFS#loadSSTables’s dirPath -> srcPath 
and findBestDiskAndInvalidCache’s path -> srcPath
fixed
bq. Minor/usability nit: I couldn’t find many cases where 
@Option(required=true) is used. WDYT about moving the path to a positional 
argument since its required and this command does not take a variable number of 
positional args?
makes sense, made it {{nodetool import   }}
bq. Minor/usability nit: Instead of noVerify=true,noVerifyTokens=false being an 
invalid state, make noVerify=true imply noVerifyTokens=true. 
yup, makes sense
bq. The JavaDoc for CFS.loadNewSSTables should be updated to point to the new 
StorageService.loadSSTables. 
bq. The comment on CFS#L861 is useful but out of place. 
bq. Minor/naming nit: The naming of the “allKeys” variable in 
ImportTest#testImportInvalidateCache is misleading. 
bq. Instead of hardcoding token values what about using e.g. 
t.compareTo(mock.getDiskBoundaries().positions.get(0).getToken()) <= 0?
bq. Are you intentionally leaving the Random seed hardcoded?
fixed


was (Author: krummas):
pushed a commit with the comments addressed 
[here|https://github.com/krummas/cassandra/commits/marcuse/6719]

bq. FSUtils.handleCorruptSSTable/handleFSError are no longer called
this is on purpose and we should probably fix this in 3.0+ as well - I don't 
think we want to trigger the disk failure policy if import fails - instead we 
should abort the import. If someone has configured {{disk_failure_policy: 
die_paranoid}} trying to load a corrupt file would actually stop the node
bq. Row cache invalidation was not previously performed — this is a good thing 
regardless, so maybe skip an option for this one. 
added an option to explicitly skip the row cache invalidation - also added a 
{{--quick}} option which makes it behave more like the old version
bq. If using nodetool refresh with JBOD, the counting keys per boundary work is 
done just to throw it away.
added a check if there is only a single data directory (and row cache is not 
enabled)
bq. Minor/naming nit: consider renaming CFS#loadSSTables’s dirPath -> srcPath 
and findBestDiskAndInvalidCache’s path -> srcPath
fixed
bq. Minor/usability nit: I couldn’t find many cases where 
@Option(required=true) is used. WDYT about moving the path to a positional 
argument since its required and this command does not take a variable number of 
positional args?
makes sense, made it {{nodetool import   }}
bq. Minor/usability nit: Instead of noVerify=true,noVerifyTokens=false being an 
invalid state, make noVerify=true imply noVerifyTokens=true. 
yup, makes sense
bq. The JavaDoc for CFS.loadNewSSTables should be updated to point to the new 
StorageService.loadSSTables. 
bq. The comment on CFS#L861 is useful but out of place. 
bq. Minor/naming nit: The naming of the “allKeys” variable in 
ImportTest#testImportInvalidateCache is misleading. 
bq. Instead of hardcoding token values what about using e.g. 
t.compareTo(mock.getDiskBoundaries().positions.get(0).getToken()) <= 0?
bq. Are you intentionally leaving the Random seed hardcoded?
fixed

> redesign loadnewsstables
> 
>
> Key: CASSANDRA-6719
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6719
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tools
>Reporter: Jonathan Ellis
>Assignee: Marcus Eriksson
>Priority: Minor
>  Labels: lhf
> Fix For: 4.x
>
> Attachments: 6719.patch
>
>
> CFSMBean.loadNewSSTables scans data directories for new sstables dropped 
> there by an external agent.  This is dangerous because of possible filename 
> conflicts with existing or newly generated sstables.

[jira] [Commented] (CASSANDRA-6719) redesign loadnewsstables

2018-04-11 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433839#comment-16433839
 ] 

Marcus Eriksson commented on CASSANDRA-6719:


pushed a commit with the comments addressed 
[here|https://github.com/krummas/cassandra/commits/marcuse/6719]

bq. FSUtils.handleCorruptSSTable/handleFSError are no longer called
this is on purpose and we should probably fix this in 3.0+ as well - I don't 
think we want to trigger the disk failure policy if import fails - instead we 
should abort the import. If someone has configured {{disk_failure_policy: 
die_paranoid}} trying to load a corrupt file would actually stop the node
bq. Row cache invalidation was not previously performed — this is a good thing 
regardless, so maybe skip an option for this one. 
added an option to explicitly skip the row cache invalidation - also added a 
{{--quick}} option which makes it behave more like the old version
bq. If using nodetool refresh with JBOD, the counting keys per boundary work is 
done just to throw it away.
added a check if there is only a single data directory (and row cache is not 
enabled)
bq. Minor/naming nit: consider renaming CFS#loadSSTables’s dirPath -> srcPath 
and findBestDiskAndInvalidCache’s path -> srcPath
fixed
bq. Minor/usability nit: I couldn’t find many cases where 
@Option(required=true) is used. WDYT about moving the path to a positional 
argument since its required and this command does not take a variable number of 
positional args?
makes sense, made it {{nodetool import   }}
bq. Minor/usability nit: Instead of noVerify=true,noVerifyTokens=false being an 
invalid state, make noVerify=true imply noVerifyTokens=true. 
yup, makes sense
bq. The JavaDoc for CFS.loadNewSSTables should be updated to point to the new 
StorageService.loadSSTables. 
bq. The comment on CFS#L861 is useful but out of place. 
bq. Minor/naming nit: The naming of the “allKeys” variable in 
ImportTest#testImportInvalidateCache is misleading. 
bq. Instead of hardcoding token values what about using e.g. 
t.compareTo(mock.getDiskBoundaries().positions.get(0).getToken()) <= 0?
bq. Are you intentionally leaving the Random seed hardcoded?
fixed
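The flag implication agreed on above can be sketched like so (option names taken from the review thread; the surrounding nodetool command class is hypothetical): rather than treating noVerify=true, noVerifyTokens=false as an invalid combination, the broader flag simply implies the narrower one.

```java
// Hypothetical sketch of the --no-verify / --no-verify-tokens implication:
// skipping all verification necessarily also skips token verification.
final class ImportOptions {
    final boolean noVerify;
    final boolean noVerifyTokens;

    ImportOptions(boolean noVerify, boolean noVerifyTokens) {
        this.noVerify = noVerify;
        // noVerify=true implies noVerifyTokens=true, so the pair can
        // never be in a contradictory state.
        this.noVerifyTokens = noVerify || noVerifyTokens;
    }
}
```

Normalizing at construction time keeps downstream import code free of combination checks.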

> redesign loadnewsstables
> 
>
> Key: CASSANDRA-6719
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6719
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tools
>Reporter: Jonathan Ellis
>Assignee: Marcus Eriksson
>Priority: Minor
>  Labels: lhf
> Fix For: 4.x
>
> Attachments: 6719.patch
>
>
> CFSMBean.loadNewSSTables scans data directories for new sstables dropped 
> there by an external agent.  This is dangerous because of possible filename 
> conflicts with existing or newly generated sstables.
> Instead, we should support leaving the new sstables in a separate directory 
> (specified by a parameter, or configured as a new location in yaml) and take 
> care of renaming as necessary automagically.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)




[jira] [Commented] (CASSANDRA-13459) Diag. Events: Native transport integration

2018-04-11 Thread Stefan Podkowinski (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433832#comment-16433832
 ] 

Stefan Podkowinski commented on CASSANDRA-13459:


{quote}So I was just thinking that forward looking restricting this mechanism 
to diagnostic events might not make sense. I was thinking a more generic 
subscription mechanism where diagnostic events are a subset of what clients 
can conditionally subscribe to means we don't end up with naming issues in the 
future.
{quote}
We could specify a subscription mechanism for native transport that is not 
specific to diag events. But what would the subject look like to subscribe to? 
If we want to support more powerful publish/subscribe semantics, we'd have to 
allow clients to specify event matchers in a more generic way, e.g. by using 
some kind of query language.

Examples

All auditing events for "ks" keyspace updates:
 {{SUBSCRIBE diag_events "event=AuditEvent#UPDATE(keyspace=ks)"}}

Full query replication for ks.table1:
 {{SUBSCRIBE full_query_log "keyspace=ks,table=table1"}}

Subscribe to any row updates matching a query:
 {{SUBSCRIBE cdc "select * from order_status where order_id = 1"}}

Not a big fan of language-in-language, but I'm open to discuss any options, if 
that's something we should add.
{quote}For V1 of this functionality my only sticking point is that even with 
1-2 clients consuming diagnostic events we have to handle backpressure somehow. 
AFAIK we hold onto messages pending to a client for a while (indefinitely?). I 
am not actually sure what kind of timeouts or health checks we do for clients.
{quote}
The current implementation will simply write and flush any event message to the 
netty stack. Netty should use its own event loop, but I'm not sure how 
buffering is handled in detail. Maybe we could also use the 
{{Message.Dispatcher}} instead and add event messages to the (unbounded) queue 
of items to flush. But we don't do that either for "classic" schema/topo/status 
change messages.
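On the backpressure question, one possible direction (purely illustrative, not how the current patch behaves) is a small bounded per-client buffer that sheds the oldest events instead of queueing indefinitely for a slow consumer:

```java
import java.util.ArrayDeque;

// Hypothetical bounded per-client event buffer: when a slow client falls
// behind, the oldest event is dropped rather than growing the queue without
// bound, and the drop count can be surfaced as a diagnostic itself.
final class BoundedEventQueue<E> {
    private final ArrayDeque<E> events = new ArrayDeque<>();
    private final int capacity;
    private long dropped = 0;

    BoundedEventQueue(int capacity) { this.capacity = capacity; }

    synchronized void offer(E event) {
        if (events.size() == capacity) {
            events.pollFirst();   // shed the oldest event
            dropped++;
        }
        events.addLast(event);
    }

    synchronized E poll() { return events.pollFirst(); }

    synchronized long droppedCount() { return dropped; }
}
```

A design like this trades completeness for bounded memory, which may be acceptable for diagnostic streams where losing old events is preferable to destabilizing the node.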

 

 

> Diag. Events: Native transport integration
> --
>
> Key: CASSANDRA-13459
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13459
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: CQL
>Reporter: Stefan Podkowinski
>Assignee: Stefan Podkowinski
>Priority: Major
>  Labels: client-impacting
>
> Events should be consumable by clients that would received subscribed events 
> from the connected node. This functionality is designed to work on top of 
> native transport with minor modifications to the protocol standard (see 
> [original 
> proposal|https://docs.google.com/document/d/1uEk7KYgxjNA0ybC9fOuegHTcK3Yi0hCQN5nTp5cNFyQ/edit?usp=sharing]
>  for further considered options). First we have to add another value for 
> existing event types. Also, we have to extend the protocol a bit to be able 
> to specify a sub-class and sub-type value. E.g. 
> {{DIAGNOSTIC_EVENT(GossiperEvent, MAJOR_STATE_CHANGE_HANDLED)}}. This still 
> has to be worked out and I'd appreciate any feedback.






[jira] [Updated] (CASSANDRA-14023) add_dc_after_mv_network_replication_test - materialized_views_test.TestMaterializedViews fails due to invalid datacenter

2018-04-11 Thread Marcus Eriksson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-14023:

Resolution: Fixed
Status: Resolved  (was: Ready to Commit)

> add_dc_after_mv_network_replication_test - 
> materialized_views_test.TestMaterializedViews fails due to invalid datacenter
> 
>
> Key: CASSANDRA-14023
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14023
> Project: Cassandra
>  Issue Type: Bug
>  Components: Testing
>Reporter: Michael Kjellman
>Assignee: Marcus Eriksson
>Priority: Major
>
> add_dc_after_mv_network_replication_test - 
> materialized_views_test.TestMaterializedViews always fails due to:
>  message="Unrecognized strategy option {dc2} passed to NetworkTopologyStrategy 
> for keyspace ks">






cassandra-dtest git commit: Accept ConfigurationException and IRE when dropping non-existent ks in secondary_indexes_test.py

2018-04-11 Thread aleksey
Repository: cassandra-dtest
Updated Branches:
  refs/heads/master 4f2996b46 -> 9c2eb35a8


Accept ConfigurationException and IRE when dropping non-existent ks in 
secondary_indexes_test.py


Project: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/commit/9c2eb35a
Tree: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/tree/9c2eb35a
Diff: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/diff/9c2eb35a

Branch: refs/heads/master
Commit: 9c2eb35a8c1d9fde1499fdfc7b02e7db36d321e0
Parents: 4f2996b
Author: Aleksey Yeschenko 
Authored: Wed Apr 11 13:14:57 2018 +0100
Committer: Aleksey Yeschenko 
Committed: Wed Apr 11 13:14:57 2018 +0100

--
 secondary_indexes_test.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra-dtest/blob/9c2eb35a/secondary_indexes_test.py
--
diff --git a/secondary_indexes_test.py b/secondary_indexes_test.py
index 9b0f326..55b240e 100644
--- a/secondary_indexes_test.py
+++ b/secondary_indexes_test.py
@@ -161,7 +161,7 @@ class TestSecondaryIndexes(Tester):
             logger.debug("round %s" % i)
             try:
                 session.execute("DROP KEYSPACE ks")
-            except ConfigurationException:
+            except (ConfigurationException, InvalidRequest):
                 pass
 
             create_ks(session, 'ks', 1)





[jira] [Commented] (CASSANDRA-14239) OutOfMemoryError when bootstrapping with less than 100GB RAM

2018-04-11 Thread Sergey Kirillov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433688#comment-16433688
 ] 

Sergey Kirillov commented on CASSANDRA-14239:
-

[~jalbersdorfer] I'm trying to do it as well.

> OutOfMemoryError when bootstrapping with less than 100GB RAM
> 
>
> Key: CASSANDRA-14239
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14239
> Project: Cassandra
>  Issue Type: Bug
> Environment: Details of the bootstrapping Node
>  * ProLiant BL460c G7
>  * 56GB RAM
>  * 2x 146GB 10K HDD (One dedicated for Commitlog, one for Data, Hints and 
> saved_caches)
>  * CentOS 7.4 on SD-Card
>  * /tmp and /var/log on tmpfs
>  * Oracle JDK 1.8.0_151
>  * Cassandra 3.11.1
> Cluster
>  * 10 existing Nodes (Up and Normal)
>Reporter: Jürgen Albersdorfer
>Priority: Major
> Attachments: Objects-by-class.csv, 
> Objects-with-biggest-retained-size.csv, cassandra-env.sh, cassandra.yaml, 
> gc.log.0.current.zip, gc.log.20180441.zip, jvm.options, jvm_opts.txt, 
> stack-traces.txt
>
>
> Hi, I face an issue when bootstrapping a Node having less than 100GB RAM on 
> our 10 Node C* 3.11.1 Cluster.
> During bootstrap, when I watch the cassandra.log I observe a growth in JVM 
> Heap Old Gen which does not get significantly freed up any more.
> I know that JVM collects on Old Gen only when really needed. I can see 
> collections, but there is always a remainder which seems to grow forever 
> without ever getting freed.
> After the Node successfully Joined the Cluster, I can remove the extra RAM I 
> have given it for bootstrapping without any further effect.
> It feels like Cassandra will not forget about every single byte streamed over 
> the Network over time during bootstrapping, - which would be a memory leak 
> and a major problem, too.
> I was able to produce a HeapDumpOnOutOfMemoryError from a 56GB Node (40 GB 
> assigned JVM Heap). YourKit Profiler shows huge amount of Memory allocated 
> for org.apache.cassandra.db.Memtable (22 GB) 
> org.apache.cassandra.db.rows.BufferCell (19 GB) and java.nio.HeapByteBuffer 
> (11 GB)






[jira] [Commented] (CASSANDRA-14239) OutOfMemoryError when bootstrapping with less than 100GB RAM

2018-04-11 Thread JIRA

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433684#comment-16433684
 ] 

Jürgen Albersdorfer commented on CASSANDRA-14239:
-

[~rushman]: Yes, I have a materialized view on one of the tables. I could 
probably afford losing it. Maybe I will drop it and retry with the same 
settings.

> OutOfMemoryError when bootstrapping with less than 100GB RAM
> 
>
> Key: CASSANDRA-14239
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14239
> Project: Cassandra
>  Issue Type: Bug
> Environment: Details of the bootstrapping Node
>  * ProLiant BL460c G7
>  * 56GB RAM
>  * 2x 146GB 10K HDD (One dedicated for Commitlog, one for Data, Hints and 
> saved_caches)
>  * CentOS 7.4 on SD-Card
>  * /tmp and /var/log on tmpfs
>  * Oracle JDK 1.8.0_151
>  * Cassandra 3.11.1
> Cluster
>  * 10 existing Nodes (Up and Normal)
>Reporter: Jürgen Albersdorfer
>Priority: Major
> Attachments: Objects-by-class.csv, 
> Objects-with-biggest-retained-size.csv, cassandra-env.sh, cassandra.yaml, 
> gc.log.0.current.zip, gc.log.20180441.zip, jvm.options, jvm_opts.txt, 
> stack-traces.txt
>
>
> Hi, I face an issue when bootstrapping a Node having less than 100GB RAM on 
> our 10 Node C* 3.11.1 Cluster.
> During bootstrap, when I watch the cassandra.log I observe a growth in JVM 
> Heap Old Gen which does not get significantly freed up any more.
> I know that JVM collects on Old Gen only when really needed. I can see 
> collections, but there is always a remainder which seems to grow forever 
> without ever getting freed.
> After the Node successfully Joined the Cluster, I can remove the extra RAM I 
> have given it for bootstrapping without any further effect.
> It feels like Cassandra will not forget about every single byte streamed over 
> the Network over time during bootstrapping, - which would be a memory leak 
> and a major problem, too.
> I was able to produce a HeapDumpOnOutOfMemoryError from a 56GB Node (40 GB 
> assigned JVM Heap). YourKit Profiler shows huge amount of Memory allocated 
> for org.apache.cassandra.db.Memtable (22 GB) 
> org.apache.cassandra.db.rows.BufferCell (19 GB) and java.nio.HeapByteBuffer 
> (11 GB)






[jira] [Commented] (CASSANDRA-13910) Remove read_repair_chance/dclocal_read_repair_chance

2018-04-11 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433676#comment-16433676
 ] 

Sylvain Lebresne commented on CASSANDRA-13910:
--

bq. Do you guys think we should add the deprecation warning to 3.11.latest?

I do. I'd add it to 3.0.latest as well since I believe we'll support 3.0 -> 4.0 
upgrades.

> Remove read_repair_chance/dclocal_read_repair_chance
> 
>
> Key: CASSANDRA-13910
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13910
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Sylvain Lebresne
>Assignee: Aleksey Yeschenko
>Priority: Minor
> Fix For: 4.0
>
>
> First, let me clarify, so this is not misunderstood, that I'm not *at all* 
> suggesting we remove the read-repair mechanism of detecting and repairing 
> inconsistencies between read responses: that mechanism is imo fine and 
> useful.  But {{read_repair_chance}} and {{dclocal_read_repair_chance}} 
> have never been about _enabling_ that mechanism; they are about querying all 
> replicas (even when this is not required by the consistency level) for the 
> sole purpose of maybe read-repairing some of the replicas that wouldn't have 
> been queried otherwise. Which, btw, brings me to reason 1 for considering their 
> removal: their naming/behavior is super confusing. Over the years, I've seen 
> countless users (and not only newbies) misunderstand what those options 
> do, and as a consequence misunderstand when read-repair itself was happening.
> My 2nd reason for suggesting this is that I suspect 
> {{read_repair_chance}}/{{dclocal_read_repair_chance}} are, especially 
> nowadays, more harmful than anything else when enabled. When those options 
> kick in, what you trade off is additional resource consumption (all nodes 
> have to execute the read) for a _fairly remote chance_ of having some 
> inconsistencies repaired on _some_ replica _a bit faster_ than they would 
> otherwise be. To justify that last part, let's recall that:
> # most inconsistencies are actually fixed by hints in practice; and in the 
> case where a node stays dead for so long that hints end up timing out, 
> you really should repair the node when it comes back (if not simply 
> re-bootstrap it).  Read-repair probably doesn't fix _that_ much stuff in 
> the first place.
> # again, read-repair does happen without those options kicking in. If you do 
> reads at {{QUORUM}}, inconsistencies will eventually get read-repaired all 
> the same, just a tiny bit less quickly.
> # I suspect almost everyone uses a low "chance" for those options at best 
> (because the extra resource consumption is real), so at the end of the day, 
> it's up to chance how much faster this fixes inconsistencies.
> Overall, I'm having a hard time imagining real cases where that trade-off 
> really makes sense. Don't get me wrong, those options had their place a long 
> time ago when hints weren't working all that well, but I think they bring 
> more confusion than benefits now. And I think it's sane to reconsider things 
> every once in a while, and to clean up anything that may not make all that 
> much sense anymore, which I think is the case here.
> Tl;dr, I feel the benefits brought by those options are very slim at best and 
> well overshadowed by the confusion they bring, and not worth maintaining the 
> code that supports them (which, to be fair, isn't huge, but getting rid of 
> {{ReadCallback.AsyncRepairRunner}} wouldn't hurt, for instance).
> Lastly, if the consensus ends up being that they can have their use in 
> weird cases, and that we feel supporting those cases is worth confusing 
> everyone else and maintaining that code, I would still suggest disabling them 
> completely by default.
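The third point in the quoted list above can be made concrete with a small back-of-the-envelope simulation. This is a toy model, not Cassandra code: it just shows that with a typical low setting such as 0.1, only about one read in ten pays the extra all-replica cost, so whether any given inconsistency gets repaired sooner really is up to chance.

```python
import random

def extra_read_fraction(chance: float, reads: int, seed: int = 42) -> float:
    """Model read_repair_chance: each read independently queries all replicas
    (and so may read-repair them) with probability `chance`."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(reads) if rng.random() < chance)
    return hits / reads

# With a typical low setting such as 0.1, only ~10% of reads pay the extra
# cost, so whether any given inconsistency is repaired sooner is pure chance.
print(round(extra_read_fraction(0.1, 100_000), 2))
```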






[jira] [Comment Edited] (CASSANDRA-13910) Remove read_repair_chance/dclocal_read_repair_chance

2018-04-11 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433670#comment-16433670
 ] 

Aleksey Yeschenko edited comment on CASSANDRA-13910 at 4/11/18 10:02 AM:
-

[~slebresne] [~bdeggleston] Alright then, thrown an exception it is. I'll see 
how much DDL/metadata code I can clean up (actually dropping columns from 
{{system_schema}} tables is not something we've ever done).

Do you guys think we should add the deprecation warning to 3.11.latest?


was (Author: iamaleksey):
[~slebresne] [~bdeggleston] Alright then, thrown an exception it is. I'll see 
how much DDL/metadata code I can clean up (actually dropping columns from 
{{system_schema}} tables is not something we've ever done.

Do you guys think we should add the deprecation warning to 3.11.latest?







[jira] [Updated] (CASSANDRA-13910) Remove read_repair_chance/dclocal_read_repair_chance

2018-04-11 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-13910:
--
Status: Open  (was: Patch Available)







[jira] [Commented] (CASSANDRA-13910) Remove read_repair_chance/dclocal_read_repair_chance

2018-04-11 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433670#comment-16433670
 ] 

Aleksey Yeschenko commented on CASSANDRA-13910:
---

[~slebresne] [~bdeggleston] Alright then, thrown an exception it is. I'll see 
how much DDL/metadata code I can clean up (actually dropping columns from 
{{system_schema}} tables is not something we've ever done).

Do you guys think we should add the deprecation warning to 3.11.latest?







[jira] [Commented] (CASSANDRA-14239) OutOfMemoryError when bootstrapping with less than 100GB RAM

2018-04-11 Thread Sergey Kirillov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433661#comment-16433661
 ] 

Sergey Kirillov commented on CASSANDRA-14239:
-

[~jalbersdorfer] do you use materialized views in your DB? I was able to 
localize this behavior to one table which has a few materialized views defined, 
so I suspect this may be related to MV updates.

> OutOfMemoryError when bootstrapping with less than 100GB RAM
> 
>
> Key: CASSANDRA-14239
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14239
> Project: Cassandra
>  Issue Type: Bug
> Environment: Details of the bootstrapping Node
>  * ProLiant BL460c G7
>  * 56GB RAM
>  * 2x 146GB 10K HDD (One dedicated for Commitlog, one for Data, Hints and 
> saved_caches)
>  * CentOS 7.4 on SD-Card
>  * /tmp and /var/log on tmpfs
>  * Oracle JDK 1.8.0_151
>  * Cassandra 3.11.1
> Cluster
>  * 10 existing Nodes (Up and Normal)
>Reporter: Jürgen Albersdorfer
>Priority: Major
> Attachments: Objects-by-class.csv, 
> Objects-with-biggest-retained-size.csv, cassandra-env.sh, cassandra.yaml, 
> gc.log.0.current.zip, gc.log.20180441.zip, jvm.options, jvm_opts.txt, 
> stack-traces.txt
>
>
> Hi, I'm facing an issue when bootstrapping a node with less than 100GB RAM on 
> our 10-node C* 3.11.1 cluster.
> During bootstrap, watching cassandra.log, I observe growth in the JVM 
> Heap Old Gen that no longer gets significantly freed up.
> I know that the JVM collects the Old Gen only when really needed. I can see 
> collections, but there is always a remainder which seems to grow forever 
> without ever getting freed.
> After the node successfully joined the cluster, I could remove the extra RAM I 
> had given it for bootstrapping without any further effect.
> It feels like Cassandra never forgets a single byte streamed over 
> the network during bootstrapping - which would be a memory leak 
> and a major problem, too.
> I was able to produce a HeapDumpOnOutOfMemoryError from a 56GB node (40 GB 
> assigned JVM heap). YourKit Profiler shows a huge amount of memory allocated 
> for org.apache.cassandra.db.Memtable (22 GB), 
> org.apache.cassandra.db.rows.BufferCell (19 GB), and java.nio.HeapByteBuffer 
> (11 GB).






[jira] [Commented] (CASSANDRA-14239) OutOfMemoryError when bootstrapping with less than 100GB RAM

2018-04-11 Thread Sergey Kirillov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433655#comment-16433655
 ] 

Sergey Kirillov commented on CASSANDRA-14239:
-

[~jalbersdorfer] I was trying to debug it, and it looks like a deadlock in the 
memtable flush path. This leads to memtables which are never released to the 
pool, and eventually you get an OOM.

However, I still don't understand why those flush/mutation threads are freezing 
or how to resolve this. It would be nice if someone from the devs could take a 
look.
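The failure mode described above - a stalled flush path that stops returning memtable memory to the pool while writes keep arriving - can be sketched with a toy producer/consumer model. This is only an illustration of the general pattern, not Cassandra's actual flush code:

```python
from collections import deque

def simulate(writes: int, flush_every: int, flush_stalled: bool) -> int:
    """Toy model: writes accumulate in 'memtables'; a healthy flush path
    drains them, a stalled one lets them pile up until memory is exhausted.
    Returns the number of write units still held on the heap at the end."""
    memtables: deque = deque()
    pending = 0
    for _ in range(writes):
        pending += 1
        if pending >= flush_every:           # memtable is "full"
            memtables.append(pending)        # hand it to the flush path
            pending = 0
        if not flush_stalled and memtables:  # healthy path releases memory
            memtables.popleft()
    return sum(memtables)

healthy = simulate(writes=100_000, flush_every=100, flush_stalled=False)
stalled = simulate(writes=100_000, flush_every=100, flush_stalled=True)
print(healthy, stalled)  # the stalled path retains the entire write volume
```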







[jira] [Comment Edited] (CASSANDRA-14239) OutOfMemoryError when bootstrapping with less than 100GB RAM

2018-04-11 Thread JIRA

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433637#comment-16433637
 ] 

Jürgen Albersdorfer edited comment on CASSANDRA-14239 at 4/11/18 9:45 AM:
--

I changed
{code}
disk_optimization_strategy: ssd
memtable_heap_space_in_mb: 2048
memtable_offheap_space_in_mb: 2048
{code}
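For rough intuition on what a memtable space setting like the one above implies: the 3.11 cassandra.yaml documents `memtable_cleanup_threshold` as defaulting to `1 / (memtable_flush_writers + 1)`, and flushing of the largest memtable starts once pool usage crosses that fraction. A hedged sketch of the arithmetic (the 2 flush writers are an assumption, and this is an illustration, not tuning advice):

```python
def flush_trigger_mb(space_mb: int, flush_writers: int = 2) -> float:
    """Usage level (MB) in a memtable pool at which a flush is triggered,
    assuming the documented default memtable_cleanup_threshold of
    1 / (memtable_flush_writers + 1)."""
    return space_mb / (flush_writers + 1)

# With memtable_heap_space_in_mb: 2048 and 2 flush writers, a flush of the
# largest memtable is triggered once ~683 MB of the on-heap pool is in use.
print(round(flush_trigger_mb(2048)))
```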
Streaming was much faster and produced less CPU pressure than before:
{code}
-dsk/total- ---system-- total-cpu-usage --io/total- -net/total-
 read  writ| int   csw |usr sys idl wai hiq siq| read  writ| recv  send
9830B   31M|  48k 7751 | 67   2  31   0   0   1|0.20  85.8 |  30M  380k
   0    28M|  51k 7838 | 65   2  32   0   0   1|   0  80.9 |  33M  511k
  32k   35M|  54k 9024 | 66   2  31   0   0   1|0.60   102 |  37M  540k
   0    28M|  41k 7072 | 62   2  36   0   0   1|   0  78.1 |  26M  265k
1638B   25M|  41k 6606 | 62   1  36   0   0   0|0.10  67.6 |  25M  110k
1638B   26M|  41k 7251 | 57   1  41   0   0   0|0.10  69.9 |  27M  138k
 819B   24M|  40k 6129 | 56   1  42   0   0   1|0.20  61.5 |  25M  127k
   0    25M|  38k 7273 | 56   1  42   0   0   0|   0  66.9 |  26M  162k
1024k   24M|  35k 6501 | 56   1  42   0   0   0|25.2  62.8 |  25M  128k
   0    24M|  37k 7238 | 56   1  42   0   0   0|   0  62.6 |  26M  164k
   0    24M|  35k 6349 | 56   1  42   0   0   0|   0  63.5 |  25M  145k
 410B   26M|  40k 6979 | 56   2  42   0   0   0|0.10  73.1 |  28M  341k
   0    28M|  41k 7042 | 56   1  42   0   0   0|   0  70.8 |  30M  350k
2048B   31M|  44k 7334 | 56   2  42   0   0   0|0.20  85.4 |  32M  347k
   0    31M|  46k 6515 | 56   1  42   0   0   1|   0  86.0 |  33M  383k
   0    30M|  47k 7572 | 56   1  42   0   0   1|   0  82.3 |  33M  466k
7373B   31M|  41k 5742 | 56   1  42   0   0   0|0.20  84.3 |  30M  319k
   0    30M|  43k 7146 | 56   2  42   0   0   1|   0  87.4 |  28M  423k
{code}
When `Received complete` was reported for all nodes, bootstrap didn't finish, 
and I can observe:
 * a stalled `Completed` count for MutationStage,
 * while the `Pending` count for MutationStage seems to skyrocket.
 * The rest of it looks fine to me :(

{code}
nodetool tpstats
Pool Name                      Active   Pending   Completed   Blocked   All time blocked
ReadStage                           0         0           0         0                  0
MiscStage                           0         0           0         0                  0
CompactionExecutor                  2         7          53         0                  0
MutationStage                     128   5722021   593964000         0                  0
MemtableReclaimMemory               0         0        2194         0                  0
PendingRangeCalculator              0         0          19         0                  0
GossipStage                         0         0       25736         0                  0
SecondaryIndexManagement            0         0           0         0                  0
HintsDispatcher                     0         0           0         0                  0
RequestResponseStage                0         0      167108         0                  0
ReadRepairStage                     0         0           0         0                  0
CounterMutationStage                0         0           0         0                  0
MigrationStage                      0         0          40         0                  0
MemtablePostFlush                   1        11        2344         0                  0
PerDiskMemtableFlushWriter_0        0         0        2194         0                  0
ValidationExecutor                  0         0           0         0                  0
Sampler                             0         0           0         0                  0
MemtableFlushWriter                 2        11        2194         0                  0
InternalResponseStage               0         0          31         0                  0
ViewMutationStage                   0         0           0         0                  0
AntiEntropyStage                    0         0           0         0                  0
CacheCleanupExecutor                0         0           0         0                  0

Message type   Dropped
READ                 0
RANGE_SLICE          0
_TRACE               0
HINT                 0
MUTATION             0
COUNTER_MUTATION     0
BATCH_STORE          0
BATCH_REMOVE         0
REQUEST_RESPONSE     0
PAGED_RANGE          0
READ_REPAIR          0
{code}
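A backlog like the one above can be spotted programmatically; a minimal sketch that scans `nodetool tpstats`-style output for pools with a large Pending column (the sample text and the threshold are arbitrary, chosen to mirror the numbers above):

```python
import re

TPSTATS = """\
Pool Name             Active  Pending  Completed  Blocked  All time blocked
MutationStage            128  5722021  593964000        0                 0
MemtableFlushWriter        2       11       2194        0                 0
ReadStage                  0        0          0        0                 0
"""

def pending_backlog(tpstats: str, threshold: int = 1000) -> dict:
    """Return {pool_name: pending} for every pool whose Pending count
    exceeds `threshold` in nodetool tpstats-style output."""
    backlog = {}
    for line in tpstats.splitlines()[1:]:        # skip the header row
        m = re.match(r"(\S+)\s+(\d+)\s+(\d+)", line)
        if m:
            pool, pending = m.group(1), int(m.group(3))
            if pending > threshold:
                backlog[pool] = pending
    return backlog

print(pending_backlog(TPSTATS))  # {'MutationStage': 5722021}
```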
   

*Why does `MutationStage` now (busy) hang, while*
 * the SlabPoolCleaner thread uses a single logical CPU at 100% permanently, and
 * G1 Old Gen increases linearly over time and goes far beyond 50GB?

[jira] [Commented] (CASSANDRA-14239) OutOfMemoryError when bootstrapping with less than 100GB RAM

2018-04-11 Thread JIRA

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433637#comment-16433637
 ] 

Jürgen Albersdorfer commented on CASSANDRA-14239:
-

 

I changed
{code}
disk_optimization_strategy: ssd
memtable_heap_space_in_mb: 2048
memtable_offheap_space_in_mb: 2048
{code}
Streaming was much faster and produced less CPU pressure than before:
{code}
-dsk/total- ---system-- total-cpu-usage --io/total- -net/total-
 read  writ| int   csw |usr sys idl wai hiq siq| read  writ| recv  send
9830B   31M|  48k 7751 | 67   2  31   0   0   1|0.20  85.8 |  30M  380k
   0    28M|  51k 7838 | 65   2  32   0   0   1|   0  80.9 |  33M  511k
  32k   35M|  54k 9024 | 66   2  31   0   0   1|0.60   102 |  37M  540k
   0    28M|  41k 7072 | 62   2  36   0   0   1|   0  78.1 |  26M  265k
1638B   25M|  41k 6606 | 62   1  36   0   0   0|0.10  67.6 |  25M  110k
1638B   26M|  41k 7251 | 57   1  41   0   0   0|0.10  69.9 |  27M  138k
 819B   24M|  40k 6129 | 56   1  42   0   0   1|0.20  61.5 |  25M  127k
   0    25M|  38k 7273 | 56   1  42   0   0   0|   0  66.9 |  26M  162k
1024k   24M|  35k 6501 | 56   1  42   0   0   0|25.2  62.8 |  25M  128k
   0    24M|  37k 7238 | 56   1  42   0   0   0|   0  62.6 |  26M  164k
   0    24M|  35k 6349 | 56   1  42   0   0   0|   0  63.5 |  25M  145k
 410B   26M|  40k 6979 | 56   2  42   0   0   0|0.10  73.1 |  28M  341k
   0    28M|  41k 7042 | 56   1  42   0   0   0|   0  70.8 |  30M  350k
2048B   31M|  44k 7334 | 56   2  42   0   0   0|0.20  85.4 |  32M  347k
   0    31M|  46k 6515 | 56   1  42   0   0   1|   0  86.0 |  33M  383k
   0    30M|  47k 7572 | 56   1  42   0   0   1|   0  82.3 |  33M  466k
7373B   31M|  41k 5742 | 56   1  42   0   0   0|0.20  84.3 |  30M  319k
   0    30M|  43k 7146 | 56   2  42   0   0   1|   0  87.4 |  28M  423k
{code}
When `Received complete` was reported for all nodes, bootstrap didn't finish, 
and I can observe:
 * a stalled `Completed` count for MutationStage,
 * while the `Pending` count for MutationStage seems to skyrocket.
 * The rest of it looks fine to me :(

{code}
nodetool tpstats
Pool Name                      Active   Pending   Completed   Blocked   All time blocked
ReadStage                           0         0           0         0                  0
MiscStage                           0         0           0         0                  0
CompactionExecutor                  2         7          53         0                  0
MutationStage                     128   5722021   593964000         0                  0
MemtableReclaimMemory               0         0        2194         0                  0
PendingRangeCalculator              0         0          19         0                  0
GossipStage                         0         0       25736         0                  0
SecondaryIndexManagement            0         0           0         0                  0
HintsDispatcher                     0         0           0         0                  0
RequestResponseStage                0         0      167108         0                  0
ReadRepairStage                     0         0           0         0                  0
CounterMutationStage                0         0           0         0                  0
MigrationStage                      0         0          40         0                  0
MemtablePostFlush                   1        11        2344         0                  0
PerDiskMemtableFlushWriter_0        0         0        2194         0                  0
ValidationExecutor                  0         0           0         0                  0
Sampler                             0         0           0         0                  0
MemtableFlushWriter                 2        11        2194         0                  0
InternalResponseStage               0         0          31         0                  0
ViewMutationStage                   0         0           0         0                  0
AntiEntropyStage                    0         0           0         0                  0
CacheCleanupExecutor                0         0           0         0                  0

Message type   Dropped
READ                 0
RANGE_SLICE          0
_TRACE               0
HINT                 0
MUTATION             0
COUNTER_MUTATION     0
BATCH_STORE          0
BATCH_REMOVE         0
REQUEST_RESPONSE     0
PAGED_RANGE          0
READ_REPAIR          0
{code}
 

 

 

*Why does `MutationStage` now (busy) hang, while*
 * the SlabPoolCleaner thread uses a single logical CPU at 100% permanently, and
 * G1 Old Gen increases linearly over time and goes far beyond 50GB?

See the attached [^gc.log.20180441.zip]

[jira] [Updated] (CASSANDRA-14239) OutOfMemoryError when bootstrapping with less than 100GB RAM

2018-04-11 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jürgen Albersdorfer updated CASSANDRA-14239:

Attachment: gc.log.20180441.zip







[jira] [Commented] (CASSANDRA-13910) Remove read_repair_chance/dclocal_read_repair_chance

2018-04-11 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433556#comment-16433556
 ] 

Sylvain Lebresne commented on CASSANDRA-13910:
--

bq. Turning on [~slebresne] signal.

I actually personally prefer being clear and throwing an exception. As a user, 
what I would find rude is being too easily misled into believing one of my 
actions has worked when it has in fact had no effect, and just having a warning 
makes that more likely. If something has been removed, I don't find it rude to 
get an exception; I find it honest and helpful. It's a preference at the end of 
the day, though, so I'm just giving my 2 cents, not pushing more than that.

bq. The WARN on using old things is how we have done this in the past.

I'm sure your memory is better than mine, but didn't we mostly use warnings when 
we deprecated something? That is, we had a release where the old settings still 
worked but produced a warning, and once they stopped working we removed them?



> Remove read_repair_chance/dclocal_read_repair_chance
> 
>
> Key: CASSANDRA-13910
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13910
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Sylvain Lebresne
>Assignee: Aleksey Yeschenko
>Priority: Minor
> Fix For: 4.0
>
>
> First, let me clarify so this is not misunderstood that I'm not *at all* 
> suggesting to remove the read-repair mechanism of detecting and repairing 
> inconsistencies between read responses: that mechanism is imo fine and 
> useful.  But {{read_repair_chance}} and {{dclocal_read_repair_chance}} 
> have never been about _enabling_ that mechanism; they are about querying all 
> replicas (even when this is not required by the consistency level) for the 
> sole purpose of maybe read-repairing some of the replicas that wouldn't have 
> been queried otherwise. Which, btw, brings me to reason 1 for considering 
> their removal: their naming/behavior is super confusing. Over the years, I've 
> seen countless users (and not only newbies) misunderstand what those options 
> do, and as a consequence misunderstand when read-repair itself was happening.
> But my 2nd reason for suggesting this is that I suspect 
> {{read_repair_chance}}/{{dclocal_read_repair_chance}} are, especially 
> nowadays, more harmful than anything else when enabled. When those options 
> kick in, you trade additional resource consumption (all nodes 
> have to execute the read) for a _fairly remote chance_ of having some 
> inconsistencies repaired on _some_ replicas _a bit faster_ than they would 
> otherwise be. To justify that last part, let's recall that:
> # most inconsistencies are actually fixed by hints in practice; and in the 
> case where a node stays dead for so long that hints end up timing out, 
> you really should repair the node when it comes back (if not simply 
> re-bootstrapping it).  Read-repair probably doesn't fix _that_ much stuff in 
> the first place.
> # again, read-repair does happen without those options kicking in. If you do 
> reads at {{QUORUM}}, inconsistencies will eventually get read-repaired all 
> the same, just a tiny bit less quickly.
> # I suspect almost everyone uses a low "chance" for those options at best 
> (because the extra resource consumption is real), so at the end of the day, 
> it's up to chance how much faster this fixes inconsistencies.
> Overall, I'm having a hard time imagining real cases where that trade-off 
> really makes sense. Don't get me wrong, those options had their place a long 
> time ago when hints weren't working all that well, but I think they bring 
> more confusion than benefits now.
> And I think it's sane to reconsider things every once in a while, and to 
> clean up anything that may not make all that much sense anymore, which I 
> think is the case here.
> Tl;dr, I feel the benefits brought by those options are very slim at best, 
> well overshadowed by the confusion they bring, and not worth maintaining the 
> code that supports them (which, to be fair, isn't huge, but getting rid of 
> {{ReadCallback.AsyncRepairRunner}} wouldn't hurt for instance).
> Lastly, if the consensus here ends up being that they can have their use in 
> weird cases and that we feel supporting those cases is worth confusing 
> everyone else and maintaining that code, I would still suggest disabling them 
> entirely by default.
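
For readers unfamiliar with how these chance options act on the read path, the 
behavior described in the issue can be sketched roughly as follows. This is a 
hypothetical simplification (the class, enum, and method names here are 
invented for illustration, not Cassandra's actual code): a single random draw 
per read is compared against the two configured chances to decide whether 
extra replicas are contacted beyond what the consistency level requires.

```java
// Hypothetical sketch of a chance-based read-repair decision.
// Not Cassandra's real implementation; names are invented.
public class ReadRepairDecision {
    enum Decision { NONE, DC_LOCAL, GLOBAL }

    // One random draw in [0, 1) is compared against both thresholds.
    // A global repair (query all replicas everywhere) takes precedence
    // over a dc-local one (query all replicas in the local DC only).
    static Decision decide(double readRepairChance,
                           double dcLocalReadRepairChance,
                           double draw) {
        if (draw < readRepairChance)
            return Decision.GLOBAL;
        if (draw < dcLocalReadRepairChance)
            return Decision.DC_LOCAL;
        return Decision.NONE;
    }

    public static void main(String[] args) {
        // chances 0.0 / 0.1: a draw of 0.05 triggers only dc-local extra reads
        System.out.println(decide(0.0, 0.1, 0.05)); // DC_LOCAL
        // chances 0.1 / 0.1: the same draw now triggers a global extra read
        System.out.println(decide(0.1, 0.1, 0.05)); // GLOBAL
        // both chances 0.0 (the proposed default): no extra replicas queried
        System.out.println(decide(0.0, 0.0, 0.05)); // NONE
    }
}
```

The sketch illustrates the point made in the description: with a low "chance" 
configured, most reads fall through to {{NONE}}, so whether an inconsistency 
gets repaired faster is literally up to chance.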


