[jira] [Commented] (CASSANDRA-13997) Upgrade guava to 23.3
[ https://issues.apache.org/jira/browse/CASSANDRA-13997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243295#comment-16243295 ] Thibault Kruse commented on CASSANDRA-13997:

For us it would be nice if Cassandra 3.x could be made API-compatible with Guava > 19.0, for example by replacing Iterators.emptyIterator() with Collections.emptyIterator() as done in https://github.com/krummas/cassandra/commits/marcuse/guava23. This would not require changing the Guava version, just removing the Guava usages that have since been deprecated.

> Upgrade guava to 23.3
> -
>
> Key: CASSANDRA-13997
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13997
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Marcus Eriksson
> Assignee: Marcus Eriksson
> Fix For: 4.x
>
>
> For 4.0 we should upgrade guava to the latest version; patch here: https://github.com/krummas/cassandra/commits/marcuse/guava23
> A bunch of quite commonly used methods have been deprecated since guava 18, which we use now ({{Throwables.propagate}}, for example); this patch mostly updates uses where compilation fails. {{Futures.transform(ListenableFuture ..., AsyncFunction ...)}} was deprecated in Guava 19 and removed in 20, for example; we should probably open new tickets to remove calls to all deprecated guava methods.
> Also had to add a dependency on {{com.google.j2objc.j2objc-annotations}} to avoid some build-time warnings (maybe due to https://github.com/google/guava/commit/fffd2b1f67d158c7b4052123c5032b0ba54a910d ?)

-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
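As a sketch of the kind of substitution being discussed (the class and helper names below are illustrative, not from the linked patch), the JDK now covers several of the deprecated Guava calls directly:

```java
import java.util.Collections;
import java.util.Iterator;

// Illustrative replacements for Guava methods deprecated after Guava 18/19.
public class GuavaMigration {
    // Before (deprecated): com.google.common.collect.Iterators.emptyIterator()
    // After: the JDK equivalent, available since Java 7.
    static <T> Iterator<T> emptyIterator() {
        return Collections.emptyIterator();
    }

    // Before: throw Throwables.propagate(t);
    // After: one common hand-rolled replacement, used as `throw propagate(t)`.
    // Unchecked throwables are returned/thrown as-is; checked ones are wrapped.
    static RuntimeException propagate(Throwable t) {
        if (t instanceof RuntimeException)
            return (RuntimeException) t;
        if (t instanceof Error)
            throw (Error) t;
        return new RuntimeException(t);
    }

    public static void main(String[] args) {
        System.out.println(emptyIterator().hasNext()); // false
        System.out.println(propagate(new Exception("x")).getCause().getMessage()); // x
    }
}
```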
[jira] [Commented] (CASSANDRA-13985) Support restricting reads and writes to specific datacenters on a per user basis
[ https://issues.apache.org/jira/browse/CASSANDRA-13985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243207#comment-16243207 ] Blake Eggleston commented on CASSANDRA-13985:

Here’s an initial implementation to optionally add specific datacenters when granting permissions: https://github.com/bdeggleston/cassandra/tree/13985

> Support restricting reads and writes to specific datacenters on a per user basis
>
> Key: CASSANDRA-13985
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13985
> Project: Cassandra
> Issue Type: Bug
> Reporter: Blake Eggleston
> Assignee: Blake Eggleston
> Priority: Minor
>
> There are a few use cases where it makes sense to restrict the operations a given user can perform in specific datacenters. The obvious use case is the production/analytics datacenter configuration. You don’t want the production user to be reading from or writing to the analytics datacenter, and you don’t want the analytics user to be reading from the production datacenter.
> Although we expect users to get this right at the application level, we should also be able to enforce this at the database level. The first approach that comes to mind would be to support an optional DC parameter when granting select and modify permissions to roles. Something like {{GRANT SELECT ON some_keyspace TO that_user IN DC dc1}}; statements that omit the DC would implicitly grant permission to all DCs. However, I’m not married to this approach.
[jira] [Updated] (CASSANDRA-13991) NullPointerException when querying a table with a previous state
[ https://issues.apache.org/jira/browse/CASSANDRA-13991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ZhaoYang updated CASSANDRA-13991:

Reproduced In: 3.11.1, 3.0.15 (was: 3.0.15, 3.11.1)
Labels: lhf (was: )

bq. https://github.com/gocql/gocql/issues/1017

Update: bug is fixed in gocql

> NullPointerException when querying a table with a previous state
>
> Key: CASSANDRA-13991
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13991
> Project: Cassandra
> Issue Type: Bug
> Components: CQL
> Reporter: Chris mildebrandt
> Labels: lhf
> Attachments: CASSANDRA-13991.log
>
>
> Performing the following steps (using the gocql library) results in an NPE:
> * With a table of 12 entries, read all rows.
> * Set the page size to 1 and read the first row. Save the query state.
> * Read all the rows again.
> * Set the page size to 5 and the page state to the previous state. (This is where the NPE occurs.)
> This can be reproduced with the following project: https://github.com/eyeofthefrog/CASSANDRA-13991
[jira] [Created] (CASSANDRA-14001) Gossip after node restart can take a long time to converge about "down" nodes in large clusters
Joseph Lynch created CASSANDRA-14001:

Summary: Gossip after node restart can take a long time to converge about "down" nodes in large clusters
Key: CASSANDRA-14001
URL: https://issues.apache.org/jira/browse/CASSANDRA-14001
Project: Cassandra
Issue Type: Improvement
Components: Lifecycle
Reporter: Joseph Lynch
Priority: Minor

When nodes restart in a large cluster, they mark all nodes as "alive", which first calls {{markDead}} and then creates an {{EchoMessage}}, and in the callback to that marks the node as alive. This works great, except when that initial echo fails for whatever reason and that node is marked as dead, in which case it will remain dead for a long while. We mostly see this on 100+ node clusters, and almost always when nodes are in different datacenters that have unreliable network connections (e.g. cross-region in AWS), and I think that it comes down to a combination of:

1. Only a node itself can mark another node as "UP"
2. Nodes only gossip with dead nodes with probability {{#dead / (#live + 1)}}

In particular, the algorithm in #2 leads to long convergence times because the number of dead nodes is typically very small compared to the cluster size. My back-of-the-envelope model of this algorithm indicates that for a 100 node cluster this would take an average of ~50 seconds with a stdev of 50 seconds, which means we might be waiting _minutes_ for the nodes to gossip with each other. I'm modeling this as the minimum of two [geometric distributions|https://en.wikipedia.org/wiki/Geometric_distribution] with parameter {{p=1/#nodes}}, yielding a geometric distribution with parameter {{p=1-(1-1/#nodes)^2}}.
So for a 100 node cluster:

{noformat}
100 node cluster =>
X = Pr(node1 gossips with node2) = geom(0.01)
Y = Pr(node2 gossips with node1) = geom(0.01)
Z = min(X, Y) = geom(1 - (1 - 0.01)^2) = geom(0.02)
E[Z] = 1/0.02 = 50
V[Z] = (1-0.02)/(0.02)^2 = 2450

1000 node cluster =>
Z = geom(1 - (1 - 0.001)^2) = geom(0.002)
E[Z] = 500
V[Z] = (1-0.002)/(0.002)^2 = 249500
{noformat}

Since we gossip every second, that means that in expectation in a 100 node cluster these nodes would see each other after about a minute, and in a thousand node cluster after ~8 minutes. For 100 node clusters the variance is astounding, and means that in particular edge cases we might be waiting hours before these nodes gossip with each other.

I'm thinking of writing a patch which either:
# Makes gossip order a shuffled list that includes dead nodes, a la [swim gossip|https://www.cs.cornell.edu/~asdas/research/dsn02-swim.pdf]. This would make it so that we waste some rounds on dead nodes but guarantee linear bounding of gossip.
# Adds an endpoint that re-triggers gossip with all nodes. Operators could call this after a restart a few times if they detect a gossip inconsistency.
# Bounds the probability we gossip with a dead node at some reasonable number like 1/10. This might cause a lot of gossip load when a node is actually down for large clusters, but would also act to bound the variance.
# Something else?

I've got a WIP [branch|https://github.com/apache/cassandra/compare/cassandra-3.11...jolynch:force_gossip] on 3.11 which implements options #1 and #2, but I can reduce/change/modify as needed if people think there is a better way. The patch doesn't pass tests yet, but I'm not going to change/add the tests unless we think moving to time-bounded gossip for down nodes is a good idea.
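The back-of-the-envelope numbers above can be checked with a short calculation. The class below is my own sketch of the same geometric-distribution model (names are mine, not Cassandra's):

```java
// Expected rounds until two nodes gossip with each other, modeling each
// direction as a geometric distribution with p = 1/n and taking the minimum
// of the two, which is itself geometric with parameter 1 - (1 - 1/n)^2.
public class GossipConvergence {
    static double minGeomParam(int nodes) {
        double p = 1.0 / nodes;
        return 1.0 - Math.pow(1.0 - p, 2);
    }

    public static void main(String[] args) {
        for (int n : new int[] { 100, 1000 }) {
            double q = minGeomParam(n);
            double mean = 1.0 / q;             // expected rounds; one gossip round per second
            double var = (1.0 - q) / (q * q);  // variance of a geometric distribution
            System.out.printf("%4d nodes: p=%.6f  E[Z]=%.1f s  V[Z]=%.1f%n", n, q, mean, var);
        }
    }
}
```

For n=100 this reproduces the ~50 second expectation with a standard deviation of the same order, which is where the "waiting minutes, possibly hours" concern comes from.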
[jira] [Comment Edited] (CASSANDRA-13985) Support restricting reads and writes to specific datacenters on a per user basis
[ https://issues.apache.org/jira/browse/CASSANDRA-13985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243207#comment-16243207 ] Blake Eggleston edited comment on CASSANDRA-13985 at 11/8/17 1:09 AM:

Here’s an initial implementation to optionally add specific datacenters when granting permissions: https://github.com/bdeggleston/cassandra/tree/13985

was (Author: bdeggleston): Here’s an initial implementation optionally add specific datacenters when granting permissions: https://github.com/bdeggleston/cassandra/tree/13985
[jira] [Comment Edited] (CASSANDRA-14001) Gossip after node restart can take a long time to converge about "down" nodes in large clusters
[ https://issues.apache.org/jira/browse/CASSANDRA-14001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243223#comment-16243223 ] Joseph Lynch edited comment on CASSANDRA-14001 at 11/8/17 1:26 AM:

I think CASSANDRA-13993 might help with this, but I _think_ it's solving a slightly different problem.

was (Author: jolynch): I think CASSANDRA-13993 might help with this, but I _thin_ it's solving a slightly different problem.
[jira] [Commented] (CASSANDRA-14001) Gossip after node restart can take a long time to converge about "down" nodes in large clusters
[ https://issues.apache.org/jira/browse/CASSANDRA-14001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243223#comment-16243223 ] Joseph Lynch commented on CASSANDRA-14001:

I think CASSANDRA-13993 might help with this, but I _think_ it's solving a slightly different problem.
[jira] [Commented] (CASSANDRA-13964) Tracing interferes with digest requests when using RandomPartitioner
[ https://issues.apache.org/jira/browse/CASSANDRA-13964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243074#comment-16243074 ] ASF GitHub Bot commented on CASSANDRA-13964:

Github user beobal closed the pull request at: https://github.com/apache/cassandra-dtest/pull/10

> Tracing interferes with digest requests when using RandomPartitioner
>
> Key: CASSANDRA-13964
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13964
> Project: Cassandra
> Issue Type: Bug
> Components: Local Write-Read Paths, Observability
> Reporter: Sam Tunnicliffe
> Assignee: Sam Tunnicliffe
> Fix For: 3.0.16, 3.11.2, 4.0
>
>
> A {{ThreadLocal}} is used to generate the MD5 digest when a replica serves a read command and the {{isDigestQuery}} flag is set. The same threadlocal is also used by {{RandomPartitioner}} to decorate partition keys. So in a cluster with RP, if tracing is enabled, the data digest is corrupted by the partitioner making tokens for the tracing mutations. This causes a digest mismatch on the coordinator, triggering a full data read on every read where CL > 1 (or speculative execution/read repair kick in).
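The failure mode described here can be shown with a toy model (this is my own illustration, not Cassandra's actual read path or partitioner code): two components share one thread-local MessageDigest, and the second user's update corrupts the first user's digest.

```java
import java.security.MessageDigest;
import java.util.Arrays;

// Toy model of sharing one thread-local MessageDigest between two users.
public class SharedDigestHazard {
    static final ThreadLocal<MessageDigest> SHARED = ThreadLocal.withInitial(() -> {
        try {
            return MessageDigest.getInstance("MD5");
        } catch (Exception e) {
            throw new AssertionError(e);
        }
    });

    // Returns true iff the digest computed through the shared instance matches
    // a clean digest of the same data.
    static boolean digestsMatch() throws Exception {
        byte[] data = "partition contents".getBytes();
        byte[] clean = MessageDigest.getInstance("MD5").digest(data);

        MessageDigest md = SHARED.get();
        md.update(data);                     // the "read path" starts hashing the row data...
        md.update("tracing key".getBytes()); // ...then another user of the shared instance
                                             // (think: partitioner hashing a tracing-mutation
                                             // key) mixes in its own bytes
        byte[] corrupted = md.digest();
        return Arrays.equals(clean, corrupted);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(digestsMatch()); // false: the coordinator would see a digest mismatch
    }
}
```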
[jira] [Commented] (CASSANDRA-13987) Multithreaded commitlog subtly changed durability
[ https://issues.apache.org/jira/browse/CASSANDRA-13987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242915#comment-16242915 ] Jason Brown commented on CASSANDRA-13987:

[~benedict] thanks for the comments, and for (indirectly) confirming that my understanding of the multithreaded commit log is more-or-less correct.

> Multithreaded commitlog subtly changed durability
>
> Key: CASSANDRA-13987
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13987
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Jason Brown
> Assignee: Jason Brown
> Fix For: 4.x
>
>
> When multithreaded commitlog was introduced in CASSANDRA-3578, we subtly changed the way that commitlog durability worked. Everything still gets written to an mmap file. However, not everything is replayable from the mmaped file after a process crash, in periodic mode.
> In brief, the reason this changed is the chained markers that are required for the multithreaded commit log. At each msync, we wait for outstanding mutations to serialize into the commitlog, and update a marker before and after the commits that have accumulated since the last sync. With those markers, we can safely replay that section of the commitlog. Without the markers, we have no guarantee that the commits in that section were successfully written, thus we abandon those commits on replay.
> If you have correlated process failures of multiple nodes at "nearly" the same time (see ["There Is No Now"|http://queue.acm.org/detail.cfm?id=2745385]), it is possible to have data loss if none of the nodes msync the commitlog. For example, with RF=3, if a quorum write succeeds on two nodes (and we acknowledge the write back to the client), and then the process on both nodes OOMs (say, due to reading the index for a 100GB partition), the write will be lost if neither process msync'ed the commitlog. More exactly, the commitlog cannot be fully replayed.
> The reason why this data is silently lost is the chained markers that were introduced with CASSANDRA-3578.
> The problem we are addressing with this ticket is incrementally improving 'durability' due to process crash, not host crash. (Note: operators should use batch mode to ensure greater durability, but batch mode in its current implementation is a) borked, and b) will burn through, *very* rapidly, SSDs that don't have a non-volatile write cache sitting in front.)
> The current default for {{commitlog_sync_period_in_ms}} is 10 seconds, which means that a node could lose up to ten seconds of data due to process crash. The unfortunate thing is that the data is still available, in the mmap file, but we can't replay it due to incomplete chained markers.
> ftr, I don't believe we've ever had a stated policy about commitlog durability wrt process crash. Pre-2.0 we naturally piggy-backed off the memory-mapped file and the fact that every mutation acquired a lock and wrote into the mmap buffer, and the ability to replay everything out of it came for free. With CASSANDRA-3578, that was subtly changed.
> Something [~jjirsa] pointed out to me is that [MySQL provides a way to adjust the durability guarantees|https://dev.mysql.com/doc/refman/5.6/en/innodb-parameters.html#sysvar_innodb_flush_log_at_trx_commit] of each commit in innodb via {{innodb_flush_log_at_trx_commit}}. I'm using that idea as a loose springboard for what to do here.
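The chained-marker idea described in the ticket can be sketched with a toy model (my own simplification, not Cassandra's actual commitlog code): each synced section of the log begins with a marker recording where the next section starts, and replay stops at the first marker that was never completed.

```java
import java.nio.ByteBuffer;

// Toy model of chained sync markers in a zero-filled log segment.
public class ChainedMarkers {
    static final int MARKER_SIZE = 4;

    // Write a section: reserve the marker slot, append the payload (the commits
    // accumulated since the last sync), then fill in the marker with the offset
    // of the next section. If the process crashes before the marker is written,
    // replay stops at the previous marker and the payload is abandoned.
    static int writeSection(ByteBuffer log, byte[] payload) {
        int markerPos = log.position();
        log.position(markerPos + MARKER_SIZE); // reserve marker slot
        log.put(payload);
        int next = log.position();
        log.putInt(markerPos, next);           // marker completed ("msync'ed")
        return next;
    }

    // Replay: follow the chain of markers; a zero marker was never completed,
    // so everything after it is untrusted even if the bytes are present.
    static int replayableBytes(ByteBuffer log) {
        int pos = 0;
        while (true) {
            int next = log.getInt(pos);
            if (next == 0)
                return pos;
            pos = next;
        }
    }

    public static void main(String[] args) {
        ByteBuffer log = ByteBuffer.allocate(1024); // zero-filled, like a fresh segment
        int firstEnd = ChainedMarkers.writeSection(log, "mutation1".getBytes());
        ChainedMarkers.writeSection(log, "mutation2".getBytes());
        // Simulate a crash before the second section's marker was synced:
        log.putInt(firstEnd, 0);
        // Only the first section is replayable, even though the second
        // section's bytes are sitting right there in the buffer.
        System.out.println(ChainedMarkers.replayableBytes(log));
    }
}
```

This is exactly the "data is still available in the mmap file, but we can't replay it" situation: the payload bytes survive the crash, but without a completed marker the replayer cannot trust them.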
[jira] [Updated] (CASSANDRA-13872) document speculative_retry on DDL page
[ https://issues.apache.org/jira/browse/CASSANDRA-13872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jon Haddad updated CASSANDRA-13872:

Resolution: Fixed
Status: Resolved (was: Patch Available)

Awesome, thanks for the update. Merged to trunk as {{976f48fb06}}. Closing this out.

> document speculative_retry on DDL page
>
> Key: CASSANDRA-13872
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13872
> Project: Cassandra
> Issue Type: Improvement
> Components: Documentation and Website
> Reporter: Jon Haddad
> Assignee: Jordan Vaughan
> Labels: documentation, lhf
> Fix For: 4.0
>
>
> There's no mention of speculative_retry or how it works on https://cassandra.apache.org/doc/latest/cql/ddl.html
[jira] [Assigned] (CASSANDRA-13874) nodetool setcachecapacity behaves oddly when cache size = 0
[ https://issues.apache.org/jira/browse/CASSANDRA-13874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jon Haddad reassigned CASSANDRA-13874:

Assignee: Michal Szczepanski

> nodetool setcachecapacity behaves oddly when cache size = 0
>
> Key: CASSANDRA-13874
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13874
> Project: Cassandra
> Issue Type: Bug
> Reporter: Jon Haddad
> Assignee: Michal Szczepanski
> Labels: lhf, user-experience
> Attachments: 13874-trunk.txt
>
>
> If a node has row cache disabled, trying to turn it on via setcachecapacity doesn't issue an error and doesn't turn it on; it just silently doesn't work.
cassandra git commit: Document speculative_retry case-insensitivity and new "P" suffix on DDL page
Repository: cassandra
Updated Branches: refs/heads/trunk 9e7a401b9 -> 976f48fb0

Document speculative_retry case-insensitivity and new "P" suffix on DDL page

Patch by Jordan Vaughan for CASSANDRA-13872; Reviewed by Jon Haddad

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/976f48fb
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/976f48fb
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/976f48fb

Branch: refs/heads/trunk
Commit: 976f48fb0664b15e1741a435447e593ad80edc4a
Parents: 9e7a401
Author: Jordan Vaughan
Authored: Thu Nov 2 00:12:09 2017 -0700
Committer: Jon Haddad
Committed: Tue Nov 7 12:19:17 2017 -0800

doc/source/cql/ddl.rst | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/doc/source/cql/ddl.rst b/doc/source/cql/ddl.rst
index a09265b..780a412 100644
--- a/doc/source/cql/ddl.rst
+++ b/doc/source/cql/ddl.rst
@@ -493,7 +493,7 @@ Speculative retry options
 By default, Cassandra read coordinators only query as many replicas as necessary to satisfy
 consistency levels: one for consistency level ``ONE``, a quorum for ``QUORUM``, and so on.
 ``speculative_retry`` determines when coordinators may query additional replicas, which is useful
-when replicas are slow or unresponsive. The following are legal values:
+when replicas are slow or unresponsive. The following are legal values (case-insensitive):

 =============== ============== ============================================================
 Format          Example        Description
@@ -502,6 +502,7 @@ when replicas are slow or unresponsive. The following are legal values:
 ``XPERCENTILE`` 90.5PERCENTILE If a replica takes longer than ``X`` percent of this table's
                                average response time, the coordinator queries an additional
                                replica. ``X`` must be between 0 and 100.
+``XP``          90.5P          Synonym for ``XPERCENTILE``
 ``Yms``         25ms           If a replica takes more than ``Y`` milliseconds to respond,
                                the coordinator queries an additional replica.
 ``ALWAYS``                     Coordinators always query all replicas.
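The value grammar documented in this patch can be sketched as a small parser. This is a hypothetical illustration of the documented formats, not Cassandra's actual option-parsing class:

```java
import java.util.Locale;

// Hypothetical parser for the speculative_retry value formats documented above:
// "XPERCENTILE" (with its new "XP" synonym), "Yms", and "ALWAYS", all
// matched case-insensitively.
public class SpeculativeRetryOption {
    enum Kind { PERCENTILE, FIXED_MS, ALWAYS }

    static Kind parse(String value) {
        String v = value.trim().toUpperCase(Locale.ROOT); // values are case-insensitive
        if (v.equals("ALWAYS"))
            return Kind.ALWAYS;
        if (v.endsWith("PERCENTILE") || v.endsWith("P"))
            return Kind.PERCENTILE; // "90.5P" is a synonym for "90.5PERCENTILE"
        if (v.endsWith("MS"))
            return Kind.FIXED_MS;
        throw new IllegalArgumentException("unrecognized speculative_retry value: " + value);
    }

    public static void main(String[] args) {
        System.out.println(parse("90.5PERCENTILE")); // PERCENTILE
        System.out.println(parse("90.5p"));          // PERCENTILE (case-insensitive synonym)
        System.out.println(parse("25ms"));           // FIXED_MS
        System.out.println(parse("always"));         // ALWAYS
    }
}
```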
cassandra-dtest git commit: Add test for digest requests with RandomPartitioner and tracing enabled
Repository: cassandra-dtest Updated Branches: refs/heads/master 7cc06a086 -> 01df7c498 Add test for digest requests with RandomPartitioner and tracing enabled Patch by Sam Tunnicliffe; reviewed by Jason Brown and Philip Thompson for CASSANDRA-13964 Project: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/commit/01df7c49 Tree: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/tree/01df7c49 Diff: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/diff/01df7c49 Branch: refs/heads/master Commit: 01df7c49864ed5fa66db2181599a463a33b1f877 Parents: 7cc06a0 Author: Sam Tunnicliffe Authored: Tue Oct 17 14:50:25 2017 +0100 Committer: Sam Tunnicliffe Committed: Tue Nov 7 16:20:55 2017 + -- cql_tracing_test.py | 39 +-- tools/jmxutils.py | 12 2 files changed, 49 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra-dtest/blob/01df7c49/cql_tracing_test.py -- diff --git a/cql_tracing_test.py b/cql_tracing_test.py index aaf55aa..549e4d0 100644 --- a/cql_tracing_test.py +++ b/cql_tracing_test.py @@ -3,6 +3,7 @@ from distutils.version import LooseVersion from dtest import Tester, debug, create_ks from tools.decorators import since +from tools.jmxutils import make_mbean, JolokiaAgent, remove_perf_disable_shared_mem class TestCqlTracing(Tester): @@ -15,16 +16,23 @@ class TestCqlTracing(Tester): # instantiated when specified as a custom tracing implementation. 
""" -def prepare(self, create_keyspace=True, nodes=3, rf=3, protocol_version=3, jvm_args=None, **kwargs): +def prepare(self, create_keyspace=True, nodes=3, rf=3, protocol_version=3, jvm_args=None, random_partitioner=False, **kwargs): if jvm_args is None: jvm_args = [] jvm_args.append('-Dcassandra.wait_for_tracing_events_timeout_secs=15') cluster = self.cluster -cluster.populate(nodes).start(wait_for_binary_proto=True, jvm_args=jvm_args) +if random_partitioner: + cluster.set_partitioner("org.apache.cassandra.dht.RandomPartitioner") +else: + cluster.set_partitioner("org.apache.cassandra.dht.Murmur3Partitioner") + +cluster.populate(nodes) node1 = cluster.nodelist()[0] +remove_perf_disable_shared_mem(node1) # necessary for jmx +cluster.start(wait_for_binary_proto=True, jvm_args=jvm_args) session = self.patient_cql_connection(node1, protocol_version=protocol_version) if create_keyspace: @@ -176,3 +184,30 @@ class TestCqlTracing(Tester): self.assertIn("Default constructor for Tracing class " "'org.apache.cassandra.tracing.TracingImpl' is inaccessible.", check_for_errs_in) + +@since('3.0') +def test_tracing_does_not_interfere_with_digest_calculation(self): +""" +Test that enabling tracing doesn't interfere with digest responses when using RandomPartitioner. +The use of a threadlocal MessageDigest for generating both DigestResponse messages and for +calculating tokens meant that the DigestResponse was always incorrect when both RP and tracing +were enabled, leading to unnecessary data reads. + +@jira_ticket CASSANDRA-13964 +""" + +session = self.prepare(random_partitioner=True) +self.trace(session) + +node1 = self.cluster.nodelist()[0] + +rr_count = make_mbean('metrics', type='ReadRepair', name='RepairedBlocking') +with JolokiaAgent(node1) as jmx: +# the MBean may not have been initialized, in which case Jolokia agent will return +# a HTTP 404 response. 
+            # If we receive such, we know that no digest mismatch was reported.
+            # If we are able to read the MBean attribute, assert that the count is 0.
+            if jmx.has_mbean(rr_count):
+                # expect 0 digest mismatches
+                self.assertEqual(0, jmx.read_attribute(rr_count, 'Count'))
+            else:
+                pass

http://git-wip-us.apache.org/repos/asf/cassandra-dtest/blob/01df7c49/tools/jmxutils.py
--
diff --git a/tools/jmxutils.py b/tools/jmxutils.py
index 8c20eb8..7468226 100644
--- a/tools/jmxutils.py
+++ b/tools/jmxutils.py
@@ -243,6 +243,18 @@ class JolokiaAgent(object):
             raise Exception("Jolokia agent returned non-200 status: %s" % (response,))
         return response

+    def has_mbean(self, mbean, verbose=True):
+        """
+        Check for the existence of an MBean
+
+        `mbean` should be the full name of an mbean. See the mbean() utility
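The test above relies on the Jolokia agent signalling a missing MBean with a 404 status, which the test treats as "the ReadRepair metric was never initialised, so no digest mismatch was ever recorded". As a rough illustration of that contract (the `mbean_exists` helper and the sample payloads below are hypothetical, not part of the dtest patch -- Jolokia reports its status in the JSON body of the response):

```python
import json

def mbean_exists(jolokia_response):
    """Interpret a decoded Jolokia JSON response for an MBean read.

    A 'status' of 404 means the MBean is not registered -- e.g. a lazily
    initialised metric such as ReadRepair.RepairedBlocking that was never
    touched, which is exactly the "no digest mismatch" case in the test.
    """
    status = jolokia_response.get("status")
    if status == 200:
        return True
    if status == 404:
        return False
    raise RuntimeError("unexpected Jolokia status: %s" % status)

# Sample payloads in the general shape Jolokia returns for a read request.
absent = json.loads('{"status": 404, "error_type": "javax.management.InstanceNotFoundException"}')
present = json.loads('{"status": 200, "value": 0}')
print(mbean_exists(absent), mbean_exists(present))  # False True
```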
[jira] [Commented] (CASSANDRA-13983) Support a means of logging all queries as they were invoked
[ https://issues.apache.org/jira/browse/CASSANDRA-13983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242533#comment-16242533 ] Blake Eggleston commented on CASSANDRA-13983:
-
First round of comments:

bin/fqltool
* looks like you unnecessarily copied a nodetool compatibility check at the top of the file

FullQueryLogger
* configure
** first preconditions check uses {{|}}, not {{||}}
** Personally, I try to avoid combining assignments/evaluations like this {{RollCycles.valueOf(rollCycle = rollCycle.toUpperCase())}}. They're harder to follow than they need to be. Could we just parse the value at the top of the method and check that it's not null here?

WeightedQueue
* It doesn't look like we use special Weigher implementations anywhere. I think this could be slightly simplified if we made the type param {{}}?

BinLog
* onReleased
** {{bytesInStoreFiles}} is incremented, but doesn't seem to be decremented after a file is deleted, so once we've recorded more than maxLogSize, we'll always delete all files
** I think we should be a bit safer with how we access {{bytesInStoreFiles}}. The calling method in chronicle is synchronized, so this should be safe as is, but it doesn't look like there are any documented guarantees that this won't silently change at some point in the future. Maybe we could synchronize the method, or use an atomic?

Misc: method comment style is a bit inconsistent. Can you use the {{/** */}} style, not the {{//}} style?
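The {{bytesInStoreFiles}} comment describes a classic accounting bug: if the running total is only ever incremented, then once it crosses maxLogSize every subsequent release deletes all segments. A minimal sketch of the corrected bookkeeping (hypothetical Python names purely for illustration; the real code is Java inside BinLog.onReleased):

```python
import threading

class SegmentReaper:
    """Bound the total on-disk size of log segments, decrementing the
    running total as segments are deleted (the missing step noted in the
    review). Guarded by its own lock rather than relying on the caller's
    undocumented synchronization."""

    def __init__(self, max_log_size):
        self.max_log_size = max_log_size
        self.total_bytes = 0
        self.segments = []          # (path, size) pairs, oldest first
        self.lock = threading.Lock()

    def on_released(self, path, size):
        """Called when a segment is rolled; deletes oldest segments until
        the total fits under max_log_size again."""
        with self.lock:
            self.segments.append((path, size))
            self.total_bytes += size
            while self.total_bytes > self.max_log_size and self.segments:
                old_path, old_size = self.segments.pop(0)
                self.total_bytes -= old_size   # the decrement the review asks for
                # a real implementation would os.remove(old_path) here
```

Without the decrement, `total_bytes` would stay above the limit forever and the loop would reap every segment on every release.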
> Support a means of logging all queries as they were invoked > --- > > Key: CASSANDRA-13983 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13983 > Project: Cassandra > Issue Type: New Feature > Components: CQL, Observability, Testing, Tools >Reporter: Ariel Weisberg >Assignee: Ariel Weisberg > Fix For: 4.0 > > > For correctness testing it's useful to be able to capture production traffic > so that it can be replayed against both the old and new versions of Cassandra > while comparing the results. > Implementing this functionality once inside the database is high performance > and presents less operational complexity. > In [this patch|https://github.com/apache/cassandra/pull/169] there is an > implementation of a full query log that uses chronicle-queue (Apache > licensed; the Maven artifacts are labeled incorrectly in some cases; its > dependencies are also Apache licensed) to implement a rotating log of queries. > * Single thread asynchronously writes log entries to disk to reduce impact on > query latency > * Heap memory usage bounded by a weighted queue with configurable maximum > weight sitting in front of logging thread > * If the weighted queue is full producers can be blocked or samples can be > dropped > * Disk utilization is bounded by deleting old log segments once a > configurable size is reached > * The on disk serialization uses a flexible schema binary format > (chronicle-wire) making it easy to skip unrecognized fields, add new ones, > and omit old ones. 
> * Can be enabled and configured via JMX, disabled, and reset (delete on disk > data), logging path is configurable via both JMX and YAML > * Introduce new {{fqltool}} in /bin that currently implements {{Dump}} which > can dump in a human readable format full query logs as well as follow active > full query logs > Follow up work: > * Introduce new {{fqltool}} command Replay which can replay N full query logs > to two different clusters and compare the result and check for > inconsistencies. <- Actively working on getting this done > * Log not just queries but their results to facilitate a comparison between > the original query result and the replayed result. <- Really just don't have > specific use case at the moment > * "Consistent" query logging allowing replay to fully replicate the original > order of execution and completion even in the face of races (including CAS). > <- This is more speculative -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
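The heap-bounding behaviour described in the ticket (a weighted queue sitting in front of the logging thread, with block-or-drop on overflow) can be sketched as follows. This is an illustrative Python model, not the Java WeightedQueue from the patch; it reports rejection and leaves the block-vs-drop choice to the caller:

```python
from collections import deque

class WeightedQueue:
    """Queue whose admission is bounded by the sum of item weights rather
    than the item count, keeping heap usage bounded even when individual
    log entries vary wildly in size."""

    def __init__(self, max_weight):
        self.max_weight = max_weight
        self.current_weight = 0
        self.items = deque()

    def offer(self, item, weight):
        """Return False when admitting the item would exceed the bound,
        so the producer can either block and retry or drop the sample."""
        if self.current_weight + weight > self.max_weight:
            return False
        self.items.append((item, weight))
        self.current_weight += weight
        return True

    def poll(self):
        """Remove and return the oldest item, releasing its weight."""
        item, weight = self.items.popleft()
        self.current_weight -= weight
        return item
```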
[jira] [Commented] (CASSANDRA-13958) [CQL] Inconsistent handling double dollar sign for strings
[ https://issues.apache.org/jira/browse/CASSANDRA-13958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242424#comment-16242424 ] Robert Stupp commented on CASSANDRA-13958: -- Not a complete review, and I haven't tried the patch at all, but some thoughts on it: * The unit test method names should continue to start with {{test}} (not {{should}}). Nice work on the separation of the test methods though. * I see that the test checks for three {{$}} signs. I would love to see checks with four or more dollar signs (leading, middle and trailing) in various combinations and spanning multiple lines, both for the unit and cqlsh tests. An algorithmic approach to generating those combinations might be beneficial over coding all combinations manually. * The last point (many combinations) should also work for multiple parameters to a single statement - especially to verify that statements like {{INSERT INTO tab (x,y,z) VALUES (foo$$, $$poiewf$ewfi$, ewfpioj$$)}} work. > [CQL] Inconsistent handling double dollar sign for strings > -- > > Key: CASSANDRA-13958 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13958 > Project: Cassandra > Issue Type: Bug > Components: CQL >Reporter: Hugo Picado >Assignee: Michał Szczygieł >Priority: Minor > > Double dollar signs are a [built-in method for escaping string values that may > contain single quotes in their > content](https://docs.datastax.com/en/cql/3.3/cql/cql_reference/escape_char_r.html). > The way this is handled, however, is not consistent, in the sense that it > allows a $ to appear in the middle of the string but not as the last character. 
> *Examples:* > Valid: insert into users(id, name) values(1, $$john$$) > Inserts the string *john* > Valid: insert into users(id, name) values(1, $$jo$hn$$) > Inserts the string *jo$hn* > Valid: insert into users(id, name) values(1, $$$john$$) > Inserts the string *$john* > Invalid: insert into users(id, name) values(1, $$john$$$) > Fails with: > {code} > Invalid syntax at line 1, char 48 > insert into users(id, name) values(1, $$john$$$); > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
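The valid/invalid examples above are consistent with a lexer that closes a {{$$}} literal at the first subsequent {{$$}}, leaving any extra trailing {{$}} behind as stray input. A simplified model of that rule (an illustration only, not the actual CQL ANTLR grammar) reproduces all four cases:

```python
def parse_dollar_quoted(stmt, start):
    """Scan a $$-quoted literal beginning at stmt[start:] (which must be '$$').

    The literal closes at the first subsequent '$$'; a lone '$' stays in
    the body. Returns (body, index_after_closing_delimiter).
    """
    assert stmt[start:start + 2] == '$$'
    i = start + 2
    while i < len(stmt) - 1:
        if stmt[i] == '$' and stmt[i + 1] == '$':
            return stmt[start + 2:i], i + 2
        i += 1
    raise SyntaxError('unterminated $$ literal')

print(parse_dollar_quoted('$$jo$hn$$', 0)[0])   # jo$hn -- a mid-string $ is fine
print(parse_dollar_quoted('$$$john$$', 0)[0])   # $john -- a leading extra $ joins the body
body, end = parse_dollar_quoted('$$john$$$', 0)
print(body, repr('$$john$$$'[end:]))            # john '$' -- the stray trailing $ is what trips the parser
```

This also suggests why the inconsistency is hard to resolve in the lexer: at {{$$john$$$}} the first {{$$}} after the body is a perfectly valid closing delimiter, so the final {{$}} can only be rejected one token later.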
[jira] [Updated] (CASSANDRA-13403) nodetool repair breaks SASI index
[ https://issues.apache.org/jira/browse/CASSANDRA-13403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ludovic Boutros updated CASSANDRA-13403: Attachment: testSASIRepair.patch > nodetool repair breaks SASI index > - > > Key: CASSANDRA-13403 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13403 > Project: Cassandra > Issue Type: Bug > Components: sasi > Environment: 3.10 >Reporter: Igor Novgorodov >Assignee: Alex Petrov > Attachments: 3_nodes_compaction.log, 4_nodes_compaction.log, > testSASIRepair.patch > > > I've got table: > {code} > CREATE TABLE cservice.bulks_recipients ( > recipient text, > bulk_id uuid, > datetime_final timestamp, > datetime_sent timestamp, > request_id uuid, > status int, > PRIMARY KEY (recipient, bulk_id) > ) WITH CLUSTERING ORDER BY (bulk_id ASC) > AND bloom_filter_fp_chance = 0.01 > AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'} > AND comment = '' > AND compaction = {'class': > 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', > 'max_threshold': '32', 'min_threshold': '4'} > AND compression = {'chunk_length_in_kb': '64', 'class': > 'org.apache.cassandra.io.compress.LZ4Compressor'} > AND crc_check_chance = 1.0 > AND dclocal_read_repair_chance = 0.1 > AND default_time_to_live = 0 > AND gc_grace_seconds = 864000 > AND max_index_interval = 2048 > AND memtable_flush_period_in_ms = 0 > AND min_index_interval = 128 > AND read_repair_chance = 0.0 > AND speculative_retry = '99PERCENTILE'; > CREATE CUSTOM INDEX bulk_recipients_bulk_id ON cservice.bulks_recipients > (bulk_id) USING 'org.apache.cassandra.index.sasi.SASIIndex'; > {code} > There are 11 rows in it: > {code} > > select * from bulks_recipients; > ... > (11 rows) > {code} > Let's query by index (all rows have the same *bulk_id*): > {code} > > select * from bulks_recipients where bulk_id = > > baa94815-e276-4ca4-adda-5b9734e6c4a5; > > > ... > (11 rows) > {code} > Ok, everything is fine. 
> Now I'm doing *nodetool repair --partitioner-range --job-threads 4 --full* on > each node in the cluster sequentially. > After it finished: > {code} > > select * from bulks_recipients where bulk_id = > > baa94815-e276-4ca4-adda-5b9734e6c4a5; > ... > (2 rows) > {code} > Only two rows. > While the rows are actually there: > {code} > > select * from bulks_recipients; > ... > (11 rows) > {code} > If I issue an incremental repair on a random node, I can get like 7 rows > after an index query. > Dropping the index and recreating it fixes the issue. Is it a bug or am I doing > the repair the wrong way? -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-3858) expose "propagation delay" metric in JMX
[ https://issues.apache.org/jira/browse/CASSANDRA-3858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242307#comment-16242307 ] George commented on CASSANDRA-3858: --- I'm surprised there's not greater interest in such a feature. Propagation delays must be a valid concern. What am I missing? > expose "propagation delay" metric in JMX > > > Key: CASSANDRA-3858 > URL: https://issues.apache.org/jira/browse/CASSANDRA-3858 > Project: Cassandra > Issue Type: Improvement >Reporter: Peter Schuller >Priority: Minor > > My idea is to augment the gossip protocol to contain timestamps. We wouldn't > use the timestamps for anything "important", but we could use them to allow > each node to expose a number which is the age, in milliseconds (or seconds), > of the oldest information it holds about any node that is still alive. > When nodes go down you'd see spikes, but for most cases where nodes live, > this information should give you a pretty good idea of how fast gossip > information is propagating through the cluster, assuming you keep your clocks > in sync. > It should be a good thing to have graphed, and to have alerts on. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
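The proposed metric above reduces to a simple computation: over all peers currently marked alive, the age of the stalest gossip state. A hedged sketch of that reduction (hypothetical data shapes; as the description notes, it assumes clocks are kept in sync):

```python
def propagation_delay_ms(gossip_states, now_ms):
    """gossip_states maps peer -> (last_update_ms, is_alive).

    Returns the age in ms of the oldest information about any live peer:
    small and steady when gossip is propagating well, spiking when a
    peer's state stops being refreshed (or a node goes down).
    """
    ages = [now_ms - last_update_ms
            for last_update_ms, alive in gossip_states.values()
            if alive]
    return max(ages, default=0)

# n3 is down, so its stale timestamp does not inflate the metric.
states = {'n1': (900, True), 'n2': (500, True), 'n3': (100, False)}
print(propagation_delay_ms(states, 1000))  # 500
```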
[jira] [Commented] (CASSANDRA-13403) nodetool repair breaks SASI index
[ https://issues.apache.org/jira/browse/CASSANDRA-13403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242309#comment-16242309 ] Ludovic Boutros commented on CASSANDRA-13403: - [~ifesdjeen], I think the issue is here in the [CompactionManager|https://github.com/apache/cassandra/blob/6d429cd0315d3509c904d0e83f91f7d12ba12085/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L1570]. The two SSTableWriters share the same LifeCycleTransaction instance. Therefore, the second commit call is not applied and the SASI indexes are not committed. I have made a small unit test to reproduce the issue. I will attach it as a small patch for reference. > nodetool repair breaks SASI index > - > > Key: CASSANDRA-13403 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13403 > Project: Cassandra > Issue Type: Bug > Components: sasi > Environment: 3.10 >Reporter: Igor Novgorodov >Assignee: Alex Petrov > Attachments: 3_nodes_compaction.log, 4_nodes_compaction.log > > > I've got table: > {code} > CREATE TABLE cservice.bulks_recipients ( > recipient text, > bulk_id uuid, > datetime_final timestamp, > datetime_sent timestamp, > request_id uuid, > status int, > PRIMARY KEY (recipient, bulk_id) > ) WITH CLUSTERING ORDER BY (bulk_id ASC) > AND bloom_filter_fp_chance = 0.01 > AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'} > AND comment = '' > AND compaction = {'class': > 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', > 'max_threshold': '32', 'min_threshold': '4'} > AND compression = {'chunk_length_in_kb': '64', 'class': > 'org.apache.cassandra.io.compress.LZ4Compressor'} > AND crc_check_chance = 1.0 > AND dclocal_read_repair_chance = 0.1 > AND default_time_to_live = 0 > AND gc_grace_seconds = 864000 > AND max_index_interval = 2048 > AND memtable_flush_period_in_ms = 0 > AND min_index_interval = 128 > AND read_repair_chance = 0.0 > AND speculative_retry = '99PERCENTILE'; > CREATE CUSTOM INDEX 
bulk_recipients_bulk_id ON cservice.bulks_recipients > (bulk_id) USING 'org.apache.cassandra.index.sasi.SASIIndex'; > {code} > There are 11 rows in it: > {code} > > select * from bulks_recipients; > ... > (11 rows) > {code} > Let's query by index (all rows have the same *bulk_id*): > {code} > > select * from bulks_recipients where bulk_id = > > baa94815-e276-4ca4-adda-5b9734e6c4a5; > > > ... > (11 rows) > {code} > Ok, everything is fine. > Now i'm doing *nodetool repair --partitioner-range --job-threads 4 --full* on > each node in cluster sequentially. > After it finished: > {code} > > select * from bulks_recipients where bulk_id = > > baa94815-e276-4ca4-adda-5b9734e6c4a5; > ... > (2 rows) > {code} > Only two rows. > While the rows are actually there: > {code} > > select * from bulks_recipients; > ... > (11 rows) > {code} > If i issue an incremental repair on a random node, i can get like 7 rows > after index query. > Dropping index and recreating it fixes the issue. Is it a bug or am i doing > the repair the wrong way? -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
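The failure mode described in the comment above (two SSTableWriters sharing one LifeCycleTransaction, so the second commit is effectively a no-op and its SASI index never gets finalised) can be modelled in a few lines. These are hypothetical toy classes, purely to illustrate the interaction, not Cassandra's actual lifecycle machinery:

```python
class OneShotTransaction:
    """A transaction whose commit runs exactly once; later commit calls
    are silently ignored, like an already-completed lifecycle transaction."""

    def __init__(self):
        self.committed = False

    def commit(self, on_commit):
        if self.committed:
            return              # second writer's commit-time work never runs
        self.committed = True
        on_commit()

class IndexedWriter:
    """Writer that finalises its (SASI-like) index via the transaction's
    commit hook."""

    def __init__(self, txn):
        self.txn = txn
        self.index_committed = False

    def finish(self):
        self.txn.commit(lambda: setattr(self, 'index_committed', True))

shared = OneShotTransaction()
w1, w2 = IndexedWriter(shared), IndexedWriter(shared)
w1.finish()
w2.finish()
print(w1.index_committed, w2.index_committed)  # True False -- w2's index is silently lost
```

With separate transactions (or one commit covering both writers' hooks) both indexes would be finalised, which matches the observed fix of dropping and recreating the index: rebuilding simply re-runs the missed finalisation.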
[jira] [Updated] (CASSANDRA-13964) Tracing interferes with digest requests when using RandomPartitioner
[ https://issues.apache.org/jira/browse/CASSANDRA-13964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sam Tunnicliffe updated CASSANDRA-13964: Resolution: Fixed Fix Version/s: 4.0 3.11.2 3.0.16 Reproduced In: 3.11.1, 3.0.15, 4.0 (was: 3.0.15, 3.11.1, 4.0) Status: Resolved (was: Patch Available) The CI was generally good barring a couple of flaky-ish tests which I've checked are passing locally, so I've committed to 3.0 in {{58daf1376456289f97f0ef0b0daf9e0d03ba6b81}} and merged to 3.11 and trunk. > Tracing interferes with digest requests when using RandomPartitioner > > > Key: CASSANDRA-13964 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13964 > Project: Cassandra > Issue Type: Bug > Components: Local Write-Read Paths, Observability >Reporter: Sam Tunnicliffe >Assignee: Sam Tunnicliffe > Fix For: 3.0.16, 3.11.2, 4.0 > > > A {{ThreadLocal}} is used to generate the MD5 digest when a > replica serves a read command and the {{isDigestQuery}} flag is set. The same > threadlocal is also used by {{RandomPartitioner}} to decorate partition keys. > So in a cluster with RP, if tracing is enabled the data digest is corrupted > by the partitioner making tokens for the tracing mutations. This causes a > digest mismatch on the coordinator, triggering a full data read on every read > where CL > 1 (or speculative execution/read repair kick in). -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
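The root cause in the ticket above, one thread-local MD5 instance shared between the read path's digest response and the partitioner's token hashing, is easy to demonstrate with any stateful digest object. This is a simplified Python model (the real code uses Java's MessageDigest; here the "tracing" updates are simply interleaved into the same instance):

```python
import hashlib

def digest_response(data, tracing_keys):
    """Compute the replica's read digest while tracing 'shares' the same
    stateful digest object, as in the pre-fix code path."""
    shared = hashlib.md5()     # stands in for the shared thread-local MessageDigest
    shared.update(data)        # replica accumulates the data digest...
    for key in tracing_keys:
        shared.update(key)     # ...but tracing hashes its mutation keys into it too
    return shared.digest()

clean = hashlib.md5(b'partition-data').digest()
with_tracing = digest_response(b'partition-data', [b'trace-session-key'])
print(with_tracing != clean)  # True: coordinator sees a mismatch, triggering a full data read
```

The committed fix gives RandomPartitioner its own thread-local digest that is reset on every get(), so token hashing for trace mutations can no longer perturb an in-flight digest response.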
[4/6] cassandra git commit: Merge branch 'cassandra-3.0' into cassandra-3.11
Merge branch 'cassandra-3.0' into cassandra-3.11 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/ab6201c6 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/ab6201c6 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/ab6201c6 Branch: refs/heads/trunk Commit: ab6201c65b193c2df4c2f25f779a591c917b1df8 Parents: 6d429cd 58daf13 Author: Sam Tunnicliffe Authored: Tue Nov 7 13:59:20 2017 + Committer: Sam Tunnicliffe Committed: Tue Nov 7 13:59:20 2017 + -- CHANGES.txt | 1 + .../apache/cassandra/dht/RandomPartitioner.java | 43 ++-- .../org/apache/cassandra/utils/FBUtilities.java | 19 - 3 files changed, 40 insertions(+), 23 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/ab6201c6/CHANGES.txt -- diff --cc CHANGES.txt index 275294f,3f4f3f2..1269dcf --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,8 -1,5 +1,9 @@@ -3.0.16 +3.11.2 + * Round buffer size to powers of 2 for the chunk cache (CASSANDRA-13897) + * Update jackson JSON jars (CASSANDRA-13949) + * Avoid locks when checking LCS fanout and if we should defrag (CASSANDRA-13930) +Merged from 3.0: + * Tracing interferes with digest requests when using RandomPartitioner (CASSANDRA-13964) * Add flag to disable materialized views, and warnings on creation (CASSANDRA-13959) * Don't let user drop or generally break tables in system_distributed (CASSANDRA-13813) * Provide a JMX call to sync schema with local storage (CASSANDRA-13954) http://git-wip-us.apache.org/repos/asf/cassandra/blob/ab6201c6/src/java/org/apache/cassandra/dht/RandomPartitioner.java -- diff --cc src/java/org/apache/cassandra/dht/RandomPartitioner.java index 82c2493,c7837c9..bdf8b85 --- a/src/java/org/apache/cassandra/dht/RandomPartitioner.java +++ b/src/java/org/apache/cassandra/dht/RandomPartitioner.java @@@ -117,20 -103,7 +141,20 @@@ public class RandomPartitioner implemen return new BigIntegerToken(token); } -private final Token.TokenFactory 
tokenFactory = new Token.TokenFactory() { +public BigIntegerToken getRandomToken(Random random) +{ - BigInteger token = FBUtilities.hashToBigInteger(GuidGenerator.guidAsBytes(random, "host/127.0.0.1", 0)); ++BigInteger token = hashToBigInteger(GuidGenerator.guidAsBytes(random, "host/127.0.0.1", 0)); +if ( token.signum() == -1 ) +token = token.multiply(BigInteger.valueOf(-1L)); +return new BigIntegerToken(token); +} + +private boolean isValidToken(BigInteger token) { +return token.compareTo(ZERO) >= 0 && token.compareTo(MAXIMUM) <= 0; +} + +private final Token.TokenFactory tokenFactory = new Token.TokenFactory() +{ public ByteBuffer toByteArray(Token token) { BigIntegerToken bigIntegerToken = (BigIntegerToken) token; @@@ -275,9 -230,14 +300,19 @@@ return partitionOrdering; } +public Optional splitter() +{ +return Optional.of(splitter); +} + + private static BigInteger hashToBigInteger(ByteBuffer data) + { + MessageDigest messageDigest = localMD5Digest.get(); + if (data.hasArray()) + messageDigest.update(data.array(), data.arrayOffset() + data.position(), data.remaining()); + else + messageDigest.update(data.duplicate()); + + return new BigInteger(messageDigest.digest()).abs(); + } } http://git-wip-us.apache.org/repos/asf/cassandra/blob/ab6201c6/src/java/org/apache/cassandra/utils/FBUtilities.java -- - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[6/6] cassandra git commit: Merge branch 'cassandra-3.11' into trunk
Merge branch 'cassandra-3.11' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/9e7a401b Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/9e7a401b Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/9e7a401b Branch: refs/heads/trunk Commit: 9e7a401b9c1e9244ebb7654f5e4bafa1d633d2ca Parents: 07fbd8e ab6201c Author: Sam Tunnicliffe Authored: Tue Nov 7 14:03:45 2017 + Committer: Sam Tunnicliffe Committed: Tue Nov 7 14:03:45 2017 + -- CHANGES.txt | 1 + .../apache/cassandra/dht/RandomPartitioner.java | 22 ++-- 2 files changed, 12 insertions(+), 11 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/9e7a401b/CHANGES.txt -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/9e7a401b/src/java/org/apache/cassandra/dht/RandomPartitioner.java -- - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[3/6] cassandra git commit: RandomPartitioner has separate MessageDigest for token generation
RandomPartitioner has separate MessageDigest for token generation Patch by Sam Tunnicliffe; reviewed by Jason Brown for CASSANDRA-13964 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/58daf137 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/58daf137 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/58daf137 Branch: refs/heads/trunk Commit: 58daf1376456289f97f0ef0b0daf9e0d03ba6b81 Parents: 6c29ee8 Author: Sam Tunnicliffe Authored: Tue Oct 17 14:51:43 2017 +0100 Committer: Sam Tunnicliffe Committed: Tue Nov 7 13:38:25 2017 + -- CHANGES.txt | 1 + .../apache/cassandra/dht/RandomPartitioner.java | 43 ++-- .../org/apache/cassandra/utils/FBUtilities.java | 19 - 3 files changed, 41 insertions(+), 22 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/58daf137/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 935931c..3f4f3f2 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 3.0.16 + * Tracing interferes with digest requests when using RandomPartitioner (CASSANDRA-13964) * Add flag to disable materialized views, and warnings on creation (CASSANDRA-13959) * Don't let user drop or generally break tables in system_distributed (CASSANDRA-13813) * Provide a JMX call to sync schema with local storage (CASSANDRA-13954) http://git-wip-us.apache.org/repos/asf/cassandra/blob/58daf137/src/java/org/apache/cassandra/dht/RandomPartitioner.java -- diff --git a/src/java/org/apache/cassandra/dht/RandomPartitioner.java b/src/java/org/apache/cassandra/dht/RandomPartitioner.java index b0dea01..c7837c9 100644 --- a/src/java/org/apache/cassandra/dht/RandomPartitioner.java +++ b/src/java/org/apache/cassandra/dht/RandomPartitioner.java @@ -20,6 +20,7 @@ package org.apache.cassandra.dht; import java.math.BigDecimal; import java.math.BigInteger; import java.nio.ByteBuffer; +import java.security.MessageDigest; import java.util.*; import 
com.google.common.annotations.VisibleForTesting;
@@ -45,11 +46,35 @@ public class RandomPartitioner implements IPartitioner
     public static final BigIntegerToken MINIMUM = new BigIntegerToken("-1");
     public static final BigInteger MAXIMUM = new BigInteger("2").pow(127);

-    private static final int HEAP_SIZE = (int) ObjectSizes.measureDeep(new BigIntegerToken(FBUtilities.hashToBigInteger(ByteBuffer.allocate(1;
+    /**
+     * Maintain a separate threadlocal message digest, exclusively for token hashing. This is necessary because
+     * when Tracing is enabled and using the default tracing implementation, creating the mutations for the trace
+     * events involves tokenizing the partition keys. This happens multiple times whilst servicing a ReadCommand,
+     * and so can interfere with the stateful digest calculation if the node is a replica producing a digest response.
+     */
+    private static final ThreadLocal<MessageDigest> localMD5Digest = new ThreadLocal<MessageDigest>()
+    {
+        @Override
+        protected MessageDigest initialValue()
+        {
+            return FBUtilities.newMessageDigest("MD5");
+        }
+
+        @Override
+        public MessageDigest get()
+        {
+            MessageDigest digest = super.get();
+            digest.reset();
+            return digest;
+        }
+    };
+
+    private static final int HEAP_SIZE = (int) ObjectSizes.measureDeep(new BigIntegerToken(hashToBigInteger(ByteBuffer.allocate(1;

     public static final RandomPartitioner instance = new RandomPartitioner();
     public static final AbstractType partitionOrdering = new PartitionerDefinedOrder(instance);
+
     public DecoratedKey decorateKey(ByteBuffer key)
     {
         return new CachedHashDecoratedKey(getToken(key), key);
@@ -72,7 +97,7 @@ public class RandomPartitioner implements IPartitioner
     public BigIntegerToken getRandomToken()
     {
-        BigInteger token = FBUtilities.hashToBigInteger(GuidGenerator.guidAsBytes());
+        BigInteger token = hashToBigInteger(GuidGenerator.guidAsBytes());
         if ( token.signum() == -1 )
             token = token.multiply(BigInteger.valueOf(-1L));
         return new BigIntegerToken(token);
@@ -160,7 +185,8 @@ public class RandomPartitioner implements IPartitioner
     {
         if (key.remaining() == 0)
             return MINIMUM;
-        return new BigIntegerToken(FBUtilities.hashToBigInteger(key));
+
+        return new BigIntegerToken(hashToBigInteger(key));
     }

     public Map describeOwnership(List sortedTokens)
@@ -203,4 +229,15 @@ public class RandomPartitioner implements IPartitioner
     {
         return parti
[5/6] cassandra git commit: Merge branch 'cassandra-3.0' into cassandra-3.11
Merge branch 'cassandra-3.0' into cassandra-3.11 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/ab6201c6 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/ab6201c6 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/ab6201c6 Branch: refs/heads/cassandra-3.11 Commit: ab6201c65b193c2df4c2f25f779a591c917b1df8 Parents: 6d429cd 58daf13 Author: Sam Tunnicliffe Authored: Tue Nov 7 13:59:20 2017 + Committer: Sam Tunnicliffe Committed: Tue Nov 7 13:59:20 2017 + -- CHANGES.txt | 1 + .../apache/cassandra/dht/RandomPartitioner.java | 43 ++-- .../org/apache/cassandra/utils/FBUtilities.java | 19 - 3 files changed, 40 insertions(+), 23 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/ab6201c6/CHANGES.txt -- diff --cc CHANGES.txt index 275294f,3f4f3f2..1269dcf --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,8 -1,5 +1,9 @@@ -3.0.16 +3.11.2 + * Round buffer size to powers of 2 for the chunk cache (CASSANDRA-13897) + * Update jackson JSON jars (CASSANDRA-13949) + * Avoid locks when checking LCS fanout and if we should defrag (CASSANDRA-13930) +Merged from 3.0: + * Tracing interferes with digest requests when using RandomPartitioner (CASSANDRA-13964) * Add flag to disable materialized views, and warnings on creation (CASSANDRA-13959) * Don't let user drop or generally break tables in system_distributed (CASSANDRA-13813) * Provide a JMX call to sync schema with local storage (CASSANDRA-13954) http://git-wip-us.apache.org/repos/asf/cassandra/blob/ab6201c6/src/java/org/apache/cassandra/dht/RandomPartitioner.java -- diff --cc src/java/org/apache/cassandra/dht/RandomPartitioner.java index 82c2493,c7837c9..bdf8b85 --- a/src/java/org/apache/cassandra/dht/RandomPartitioner.java +++ b/src/java/org/apache/cassandra/dht/RandomPartitioner.java @@@ -117,20 -103,7 +141,20 @@@ public class RandomPartitioner implemen return new BigIntegerToken(token); } -private final Token.TokenFactory 
tokenFactory = new Token.TokenFactory() { +public BigIntegerToken getRandomToken(Random random) +{ - BigInteger token = FBUtilities.hashToBigInteger(GuidGenerator.guidAsBytes(random, "host/127.0.0.1", 0)); ++BigInteger token = hashToBigInteger(GuidGenerator.guidAsBytes(random, "host/127.0.0.1", 0)); +if ( token.signum() == -1 ) +token = token.multiply(BigInteger.valueOf(-1L)); +return new BigIntegerToken(token); +} + +private boolean isValidToken(BigInteger token) { +return token.compareTo(ZERO) >= 0 && token.compareTo(MAXIMUM) <= 0; +} + +private final Token.TokenFactory tokenFactory = new Token.TokenFactory() +{ public ByteBuffer toByteArray(Token token) { BigIntegerToken bigIntegerToken = (BigIntegerToken) token; @@@ -275,9 -230,14 +300,19 @@@ return partitionOrdering; } +public Optional splitter() +{ +return Optional.of(splitter); +} + + private static BigInteger hashToBigInteger(ByteBuffer data) + { + MessageDigest messageDigest = localMD5Digest.get(); + if (data.hasArray()) + messageDigest.update(data.array(), data.arrayOffset() + data.position(), data.remaining()); + else + messageDigest.update(data.duplicate()); + + return new BigInteger(messageDigest.digest()).abs(); + } } http://git-wip-us.apache.org/repos/asf/cassandra/blob/ab6201c6/src/java/org/apache/cassandra/utils/FBUtilities.java -- - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[1/6] cassandra git commit: RandomPartitioner has separate MessageDigest for token generation
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-3.0 6c29ee84a -> 58daf1376
  refs/heads/cassandra-3.11 6d429cd03 -> ab6201c65
  refs/heads/trunk 07fbd8ee6 -> 9e7a401b9

RandomPartitioner has separate MessageDigest for token generation

Patch by Sam Tunnicliffe; reviewed by Jason Brown for CASSANDRA-13964

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/58daf137
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/58daf137
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/58daf137

Branch: refs/heads/cassandra-3.0
Commit: 58daf1376456289f97f0ef0b0daf9e0d03ba6b81
Parents: 6c29ee8
Author: Sam Tunnicliffe
Authored: Tue Oct 17 14:51:43 2017 +0100
Committer: Sam Tunnicliffe
Committed: Tue Nov 7 13:38:25 2017 +

----
 CHANGES.txt                                     |  1 +
 .../apache/cassandra/dht/RandomPartitioner.java | 43 ++--
 .../org/apache/cassandra/utils/FBUtilities.java | 19 -
 3 files changed, 41 insertions(+), 22 deletions(-)
----

http://git-wip-us.apache.org/repos/asf/cassandra/blob/58daf137/CHANGES.txt
----
diff --git a/CHANGES.txt b/CHANGES.txt
index 935931c..3f4f3f2 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.0.16
+ * Tracing interferes with digest requests when using RandomPartitioner (CASSANDRA-13964)
  * Add flag to disable materialized views, and warnings on creation (CASSANDRA-13959)
  * Don't let user drop or generally break tables in system_distributed (CASSANDRA-13813)
  * Provide a JMX call to sync schema with local storage (CASSANDRA-13954)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/58daf137/src/java/org/apache/cassandra/dht/RandomPartitioner.java
----
diff --git a/src/java/org/apache/cassandra/dht/RandomPartitioner.java b/src/java/org/apache/cassandra/dht/RandomPartitioner.java
index b0dea01..c7837c9 100644
--- a/src/java/org/apache/cassandra/dht/RandomPartitioner.java
+++ b/src/java/org/apache/cassandra/dht/RandomPartitioner.java
@@ -20,6 +20,7 @@ package org.apache.cassandra.dht;

 import java.math.BigDecimal;
 import java.math.BigInteger;
 import java.nio.ByteBuffer;
+import java.security.MessageDigest;
 import java.util.*;

 import com.google.common.annotations.VisibleForTesting;
@@ -45,11 +46,35 @@ public class RandomPartitioner implements IPartitioner
     public static final BigIntegerToken MINIMUM = new BigIntegerToken("-1");
     public static final BigInteger MAXIMUM = new BigInteger("2").pow(127);

-    private static final int HEAP_SIZE = (int) ObjectSizes.measureDeep(new BigIntegerToken(FBUtilities.hashToBigInteger(ByteBuffer.allocate(1))));
+    /**
+     * Maintain a separate threadlocal message digest, exclusively for token hashing. This is necessary because
+     * when Tracing is enabled and using the default tracing implementation, creating the mutations for the trace
+     * events involves tokenizing the partition keys. This happens multiple times whilst servicing a ReadCommand,
+     * and so can interfere with the stateful digest calculation if the node is a replica producing a digest response.
+     */
+    private static final ThreadLocal<MessageDigest> localMD5Digest = new ThreadLocal<MessageDigest>()
+    {
+        @Override
+        protected MessageDigest initialValue()
+        {
+            return FBUtilities.newMessageDigest("MD5");
+        }
+
+        @Override
+        public MessageDigest get()
+        {
+            MessageDigest digest = super.get();
+            digest.reset();
+            return digest;
+        }
+    };
+
+    private static final int HEAP_SIZE = (int) ObjectSizes.measureDeep(new BigIntegerToken(hashToBigInteger(ByteBuffer.allocate(1))));

     public static final RandomPartitioner instance = new RandomPartitioner();
     public static final AbstractType<?> partitionOrdering = new PartitionerDefinedOrder(instance);
+
     public DecoratedKey decorateKey(ByteBuffer key)
     {
         return new CachedHashDecoratedKey(getToken(key), key);
@@ -72,7 +97,7 @@ public class RandomPartitioner implements IPartitioner
     public BigIntegerToken getRandomToken()
     {
-        BigInteger token = FBUtilities.hashToBigInteger(GuidGenerator.guidAsBytes());
+        BigInteger token = hashToBigInteger(GuidGenerator.guidAsBytes());
         if ( token.signum() == -1 )
             token = token.multiply(BigInteger.valueOf(-1L));
         return new BigIntegerToken(token);
@@ -160,7 +185,8 @@ public class RandomPartitioner implements IPartitioner
     {
         if (key.remaining() == 0)
             return MINIMUM;
-        return new BigIntegerToken(FBUtilities.hashToBigInteger(key));
+
+        return new BigIntegerToken(hashToBigInteger(key));
     }

     public Map<Token, Float> describeOwnership(List<Token> sortedTokens)
@@ -203,4 +229,15 @@ public class RandomPartitioner implements IPartitioner
     {
         ret
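The idiom in the patch above — a per-thread `MessageDigest` whose `get()` resets the instance, so that trace-mutation tokenization can never corrupt an in-flight digest on the same thread — can be sketched in isolation like this (the `TokenHashing` class and `hash` method are illustrative names, not part of the patch):

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Sketch of the ThreadLocal-digest idiom: each thread owns one MessageDigest,
// and every get() resets it, so a caller never observes leftover state from an
// earlier, possibly interrupted, hashing operation on the same thread.
final class TokenHashing
{
    private static final ThreadLocal<MessageDigest> localMD5Digest = new ThreadLocal<MessageDigest>()
    {
        @Override
        protected MessageDigest initialValue()
        {
            try
            {
                return MessageDigest.getInstance("MD5");
            }
            catch (NoSuchAlgorithmException e)
            {
                // MD5 is a mandatory JCA algorithm, so this cannot happen
                throw new AssertionError(e);
            }
        }

        @Override
        public MessageDigest get()
        {
            MessageDigest digest = super.get();
            digest.reset(); // discard any partial state from a previous use
            return digest;
        }
    };

    static byte[] hash(byte[] input)
    {
        return localMD5Digest.get().digest(input);
    }
}
```

Because `get()` always resets, interleaved users on one thread (e.g. token hashing triggered mid-way through computing a digest response) each start from a clean digest, while separate threads never share an instance at all.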
[2/6] cassandra git commit: RandomPartitioner has separate MessageDigest for token generation
RandomPartitioner has separate MessageDigest for token generation

Patch by Sam Tunnicliffe; reviewed by Jason Brown for CASSANDRA-13964

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/58daf137
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/58daf137
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/58daf137

Branch: refs/heads/cassandra-3.11
Commit: 58daf1376456289f97f0ef0b0daf9e0d03ba6b81
Parents: 6c29ee8
Author: Sam Tunnicliffe
Authored: Tue Oct 17 14:51:43 2017 +0100
Committer: Sam Tunnicliffe
Committed: Tue Nov 7 13:38:25 2017 +

 CHANGES.txt                                     |  1 +
 .../apache/cassandra/dht/RandomPartitioner.java | 43 ++--
 .../org/apache/cassandra/utils/FBUtilities.java | 19 -
 3 files changed, 41 insertions(+), 22 deletions(-)
[jira] [Commented] (CASSANDRA-13403) nodetool repair breaks SASI index
[ https://issues.apache.org/jira/browse/CASSANDRA-13403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242120#comment-16242120 ] Ludovic Boutros commented on CASSANDRA-13403:

And if we rebuild the index:
{code}
INFO [RMI TCP Connection(7)-10.53.0.15] 2017-11-07 15:44:34,456 ColumnFamilyStore.java:806 - User Requested secondary index re-build for lubo_test/t_doc indexes: i_doc
DEBUG [RMI TCP Connection(7)-10.53.0.15] 2017-11-07 15:44:34,458 ColumnFamilyStore.java:899 - Enqueuing flush of IndexInfo: 0,385KiB (0%) on-heap, 0,000KiB (0%) off-heap
DEBUG [PerDiskMemtableFlushWriter_0:5] 2017-11-07 15:44:34,514 Memtable.java:461 - Writing Memtable-IndexInfo@1020363412(0,049KiB serialized bytes, 1 ops, 0%/0% of on/off-heap limit), flushed range = (min(-9223372036854775808), max(9223372036854775807)]
DEBUG [PerDiskMemtableFlushWriter_0:5] 2017-11-07 15:44:34,515 Memtable.java:490 - Completed flushing /data/cassandra/data/system/IndexInfo-9f5c6374d48532299a0a5094af9ad1e3/mc-18-big-Data.db (0,036KiB) for commitlog position CommitLogPosition(segmentId=1510062526702, position=2214781)
DEBUG [MemtableFlushWriter:5] 2017-11-07 15:44:34,644 ColumnFamilyStore.java:1197 - Flushed to [BigTableReader(path='/data/cassandra/data/system/IndexInfo-9f5c6374d48532299a0a5094af9ad1e3/mc-18-big-Data.db')] (1 sstables, 4,854KiB), biggest 4,854KiB, smallest 4,854KiB
INFO [RMI TCP Connection(7)-10.53.0.15] 2017-11-07 15:44:34,644 SecondaryIndexManager.java:365 - Submitting index build of i_doc for data in BigTableReader(path='/data/cassandra/data/lubo_test/t_doc-64343790c31611e7a46403e2ed27ae86/mc-23-big-Data.db'),BigTableReader(path='/data/cassandra/data/lubo_test/t_doc-64343790c31611e7a46403e2ed27ae86/mc-22-big-Data.db')
INFO [CompactionExecutor:10] 2017-11-07 15:44:34,646 PerSSTableIndexWriter.java:279 - Scheduling index flush to /data/cassandra/data/lubo_test/t_doc-64343790c31611e7a46403e2ed27ae86/mc-22-big-SI_i_doc.db
INFO [SASI-General:3] 2017-11-07 15:44:34,675 PerSSTableIndexWriter.java:330 - Index flush to /data/cassandra/data/lubo_test/t_doc-64343790c31611e7a46403e2ed27ae86/mc-22-big-SI_i_doc.db took 28 ms.
{code}
{code}
INFO [CompactionExecutor:10] 2017-11-07 15:44:34,676 DataTracker.java:152 - SSTableIndex.open(column: r, minTerm: 0, maxTerm: 0, minKey: 1, maxKey: 7, sstable: BigTableReader(path='/data/cassandra/data/lubo_test/t_doc-64343790c31611e7a46403e2ed27ae86/mc-22-big-Data.db'))
{code}
{code}
INFO [CompactionExecutor:10] 2017-11-07 15:44:34,677 PerSSTableIndexWriter.java:279 - Scheduling index flush to /data/cassandra/data/lubo_test/t_doc-64343790c31611e7a46403e2ed27ae86/mc-23-big-SI_i_doc.db
INFO [SASI-General:3] 2017-11-07 15:44:34,683 PerSSTableIndexWriter.java:330 - Index flush to /data/cassandra/data/lubo_test/t_doc-64343790c31611e7a46403e2ed27ae86/mc-23-big-SI_i_doc.db took 5 ms.
{code}
{code}
INFO [CompactionExecutor:10] 2017-11-07 15:44:34,683 DataTracker.java:152 - SSTableIndex.open(column: r, minTerm: 0, maxTerm: 0, minKey: 11, maxKey: 9, sstable: BigTableReader(path='/data/cassandra/data/lubo_test/t_doc-64343790c31611e7a46403e2ed27ae86/mc-23-big-Data.db'))
{code}
{code}
INFO [RMI TCP Connection(7)-10.53.0.15] 2017-11-07 15:44:34,683 SecondaryIndexManager.java:385 - Index build of i_doc complete
{code}
We can see the two DataTracker log lines.
> nodetool repair breaks SASI index > - > > Key: CASSANDRA-13403 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13403 > Project: Cassandra > Issue Type: Bug > Components: sasi > Environment: 3.10 >Reporter: Igor Novgorodov >Assignee: Alex Petrov > Attachments: 3_nodes_compaction.log, 4_nodes_compaction.log > > > I've got table: > {code} > CREATE TABLE cservice.bulks_recipients ( > recipient text, > bulk_id uuid, > datetime_final timestamp, > datetime_sent timestamp, > request_id uuid, > status int, > PRIMARY KEY (recipient, bulk_id) > ) WITH CLUSTERING ORDER BY (bulk_id ASC) > AND bloom_filter_fp_chance = 0.01 > AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'} > AND comment = '' > AND compaction = {'class': > 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', > 'max_threshold': '32', 'min_threshold': '4'} > AND compression = {'chunk_length_in_kb': '64', 'class': > 'org.apache.cassandra.io.compress.LZ4Compressor'} > AND crc_check_chance = 1.0 > AND dclocal_read_repair_chance = 0.1 > AND default_time_to_live = 0 > AND gc_grace_seconds = 864000 > AND max_index_interval = 2048 > AND memtable_flush_period_in_ms = 0 > AND min_index_interval = 128 > AND read_repair_chance = 0.0 > AND speculative_retry = '99PERCENTILE'; > CREATE CUSTOM INDEX bulk_recipients_bulk_id ON cse
[jira] [Commented] (CASSANDRA-13997) Upgrade guava to 23.3
[ https://issues.apache.org/jira/browse/CASSANDRA-13997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242118#comment-16242118 ] Marcus Eriksson commented on CASSANDRA-13997: - pushed a commit that upgrades airline to 0.8 https://circleci.com/gh/krummas/cassandra/tree/marcuse%2Fguava23 https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/421/ > Upgrade guava to 23.3 > - > > Key: CASSANDRA-13997 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13997 > Project: Cassandra > Issue Type: Improvement >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson > Fix For: 4.x > > > For 4.0 we should upgrade guava to the latest version > patch here: https://github.com/krummas/cassandra/commits/marcuse/guava23 > A bunch of quite commonly used methods have been deprecated since guava 18 > which we use now ({{Throwables.propagate}} for example), this patch mostly > updates uses where compilation fails. {{Futures.transform(ListenableFuture > ..., AsyncFunction ...}} was deprecated in Guava 19 and removed in 20 for > example, we should probably open new tickets to remove calls to all > deprecated guava methods. > Also had to add a dependency on {{com.google.j2objc.j2objc-annotations}}, to > avoid some build-time warnings (maybe due to > https://github.com/google/guava/commit/fffd2b1f67d158c7b4052123c5032b0ba54a910d > ?) -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13997) Upgrade guava to 23.3
[ https://issues.apache.org/jira/browse/CASSANDRA-13997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242108#comment-16242108 ] Marcus Eriksson commented on CASSANDRA-13997: - dtest run yesterday timed out: https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/420/ circle shows that we probably need to upgrade airline as well: https://circleci.com/gh/krummas/cassandra/171 > Upgrade guava to 23.3 > - > > Key: CASSANDRA-13997 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13997 > Project: Cassandra > Issue Type: Improvement >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson > Fix For: 4.x > > > For 4.0 we should upgrade guava to the latest version > patch here: https://github.com/krummas/cassandra/commits/marcuse/guava23 > A bunch of quite commonly used methods have been deprecated since guava 18 > which we use now ({{Throwables.propagate}} for example), this patch mostly > updates uses where compilation fails. {{Futures.transform(ListenableFuture > ..., AsyncFunction ...}} was deprecated in Guava 19 and removed in 20 for > example, we should probably open new tickets to remove calls to all > deprecated guava methods. > Also had to add a dependency on {{com.google.j2objc.j2objc-annotations}}, to > avoid some build-time warnings (maybe due to > https://github.com/google/guava/commit/fffd2b1f67d158c7b4052123c5032b0ba54a910d > ?)
[jira] [Updated] (CASSANDRA-14000) Remove v5 as a beta version from 3.11
[ https://issues.apache.org/jira/browse/CASSANDRA-14000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Petrov updated CASSANDRA-14000: Description: Currently, V5 has only two features (if anyone knows other ones, please correct me): * [CASSANDRA-10786] * [CASSANDRA-12838] V5 "beta" mode was suggested in [CASSANDRA-12142], hoping that we can release more features quicker. However, we did not. I suggest we remove v5 protocol support from 3.11, as all the new features go into 4.0 anyways and protocol is on an early stage, so most likely there will be a couple more changes. UPDATE: [CASSANDRA-12838] adds a {{DURATION}} type, which can not be done any way other than bumping a protocol version. The problem is was: Currently, V5 has only two features (if anyone knows other ones, please correct me): * [CASSANDRA-10786] * [CASSANDRA-12838] V5 "beta" mode was suggested in [CASSANDRA-12142], hoping that we can release more features quicker. However, we did not. I suggest we remove v5 protocol support from 3.11, as all the new features go into 4.0 anyways and protocol is on an early stage, so most likely there will be a couple more changes. UPDATE: [CASSANDRA-12838] adds a {{DURATION}} type, which can not be done any way other than bumping a protocol version. > Remove v5 as a beta version from 3.11 > -- > > Key: CASSANDRA-14000 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14000 > Project: Cassandra > Issue Type: Bug >Reporter: Alex Petrov >Assignee: Alex Petrov >Priority: Blocker > > Currently, V5 has only two features (if anyone knows other ones, please > correct me): > * [CASSANDRA-10786] > * [CASSANDRA-12838] > V5 "beta" mode was suggested in [CASSANDRA-12142], hoping that we can release > more features quicker. However, we did not. > I suggest we remove v5 protocol support from 3.11, as all the new features go > into 4.0 anyways and protocol is on an early stage, so most likely there will > be a couple more changes. 
> UPDATE: [CASSANDRA-12838] adds a {{DURATION}} type, which can not be done any > way other than bumping a protocol version. The problem is
[jira] [Updated] (CASSANDRA-14000) Remove v5 as a beta version from 3.11
[ https://issues.apache.org/jira/browse/CASSANDRA-14000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Petrov updated CASSANDRA-14000: Description: Currently, V5 has only two features (if anyone knows other ones, please correct me): * [CASSANDRA-10786] * [CASSANDRA-12838] V5 "beta" mode was suggested in [CASSANDRA-12142], hoping that we can release more features quicker. However, we did not. I suggest we remove v5 protocol support from 3.11, as all the new features go into 4.0 anyways and protocol is on an early stage, so most likely there will be a couple more changes. UPDATE: [CASSANDRA-12838] adds a {{DURATION}} type, which can not be done any way other than bumping a protocol version. was: Currently, V5 has only two features (if anyone knows other ones, please correct me): * https://issues.apache.org/jira/browse/CASSANDRA-10145 * https://issues.apache.org/jira/browse/CASSANDRA-10786 * https://issues.apache.org/jira/browse/CASSANDRA-12838 V5 "beta" mode was suggested in [CASSANDRA-12142], hoping that we can release more features quicker. However, we did not. I suggest we remove v5 protocol support from 3.11, as all the new features go into 4.0 anyways and protocol is on an early stage, so most likely there will be a couple more changes. > Remove v5 as a beta version from 3.11 > -- > > Key: CASSANDRA-14000 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14000 > Project: Cassandra > Issue Type: Bug >Reporter: Alex Petrov >Assignee: Alex Petrov >Priority: Blocker > > Currently, V5 has only two features (if anyone knows other ones, please > correct me): > * [CASSANDRA-10786] > * [CASSANDRA-12838] > V5 "beta" mode was suggested in [CASSANDRA-12142], hoping that we can release > more features quicker. However, we did not. > I suggest we remove v5 protocol support from 3.11, as all the new features go > into 4.0 anyways and protocol is on an early stage, so most likely there will > be a couple more changes. 
> UPDATE: [CASSANDRA-12838] adds a {{DURATION}} type, which can not be done any > way other than bumping a protocol version.
[jira] [Updated] (CASSANDRA-14000) Remove v5 as a beta version from 3.11
[ https://issues.apache.org/jira/browse/CASSANDRA-14000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Stupp updated CASSANDRA-14000: - Description: Currently, V5 has only two features (if anyone knows other ones, please correct me): * https://issues.apache.org/jira/browse/CASSANDRA-10145 * https://issues.apache.org/jira/browse/CASSANDRA-10786 * https://issues.apache.org/jira/browse/CASSANDRA-12838 V5 "beta" mode was suggested in [CASSANDRA-12142], hoping that we can release more features quicker. However, we did not. I suggest we remove v5 protocol support from 3.11, as all the new features go into 4.0 anyways and protocol is on an early stage, so most likely there will be a couple more changes. was: Currently, V5 has only two features (if anyone knows other ones, please correct me): * https://issues.apache.org/jira/browse/CASSANDRA-10786 * https://issues.apache.org/jira/browse/CASSANDRA-12838 V5 "beta" mode was suggested in [CASSANDRA-12142], hoping that we can release more features quicker. However, we did not. I suggest we remove v5 protocol support from 3.11, as all the new features go into 4.0 anyways and protocol is on an early stage, so most likely there will be a couple more changes. > Remove v5 as a beta version from 3.11 > -- > > Key: CASSANDRA-14000 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14000 > Project: Cassandra > Issue Type: Bug >Reporter: Alex Petrov >Assignee: Alex Petrov >Priority: Blocker > > Currently, V5 has only two features (if anyone knows other ones, please > correct me): > * https://issues.apache.org/jira/browse/CASSANDRA-10145 > * https://issues.apache.org/jira/browse/CASSANDRA-10786 > * https://issues.apache.org/jira/browse/CASSANDRA-12838 > V5 "beta" mode was suggested in [CASSANDRA-12142], hoping that we can release > more features quicker. However, we did not. 
> I suggest we remove v5 protocol support from 3.11, as all the new features go > into 4.0 anyways and protocol is on an early stage, so most likely there will > be a couple more changes.
[jira] [Created] (CASSANDRA-14000) Remove v5 as a beta version from 3.11
Alex Petrov created CASSANDRA-14000: --- Summary: Remove v5 as a beta version from 3.11 Key: CASSANDRA-14000 URL: https://issues.apache.org/jira/browse/CASSANDRA-14000 Project: Cassandra Issue Type: Bug Reporter: Alex Petrov Assignee: Alex Petrov Priority: Blocker Currently, V5 has only two features (if anyone knows other ones, please correct me): * https://issues.apache.org/jira/browse/CASSANDRA-10786 * https://issues.apache.org/jira/browse/CASSANDRA-12838 V5 "beta" mode was suggested in [CASSANDRA-12142], hoping that we can release more features quicker. However, we did not. I suggest we remove v5 protocol support from 3.11, as all the new features go into 4.0 anyways and protocol is on an early stage, so most likely there will be a couple more changes.
[jira] [Updated] (CASSANDRA-13997) Upgrade guava to 23.3
[ https://issues.apache.org/jira/browse/CASSANDRA-13997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Podkowinski updated CASSANDRA-13997: --- Reviewer: Stefan Podkowinski > Upgrade guava to 23.3 > - > > Key: CASSANDRA-13997 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13997 > Project: Cassandra > Issue Type: Improvement >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson > Fix For: 4.x > > > For 4.0 we should upgrade guava to the latest version > patch here: https://github.com/krummas/cassandra/commits/marcuse/guava23 > A bunch of quite commonly used methods have been deprecated since guava 18 > which we use now ({{Throwables.propagate}} for example), this patch mostly > updates uses where compilation fails. {{Futures.transform(ListenableFuture > ..., AsyncFunction ...}} was deprecated in Guava 19 and removed in 20 for > example, we should probably open new tickets to remove calls to all > deprecated guava methods. > Also had to add a dependency on {{com.google.j2objc.j2objc-annotations}}, to > avoid some build-time warnings (maybe due to > https://github.com/google/guava/commit/fffd2b1f67d158c7b4052123c5032b0ba54a910d > ?)
[jira] [Commented] (CASSANDRA-13403) nodetool repair breaks SASI index
[ https://issues.apache.org/jira/browse/CASSANDRA-13403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242078#comment-16242078 ] Ludovic Boutros commented on CASSANDRA-13403:

Another thing in the logs:
{code}
INFO [CompactionExecutor:5] 2017-11-07 14:52:50,945 CompactionManager.java:1472 - Performing anticompaction on 2 sstables
INFO [CompactionExecutor:5] 2017-11-07 14:52:50,956 CompactionManager.java:1509 - Anticompacting [BigTableReader(path='/data/cassandra/data/lubo_test/t_doc-64343790c31611e7a46403e2ed27ae86/mc-21-big-Data.db'), BigTableReader(path='/data/cassandra/data/lubo_test/t_doc-64343790c31611e7a46403e2ed27ae86/mc-20-big-Data.db')]
INFO [CompactionExecutor:5] 2017-11-07 14:52:51,308 PerSSTableIndexWriter.java:279 - Scheduling index flush to /data/cassandra/data/lubo_test/t_doc-64343790c31611e7a46403e2ed27ae86/mc-22-big-SI_i_doc.db
INFO [SASI-General:2] 2017-11-07 14:52:51,343 PerSSTableIndexWriter.java:330 - Index flush to /data/cassandra/data/lubo_test/t_doc-64343790c31611e7a46403e2ed27ae86/mc-22-big-SI_i_doc.db took 34 ms.
{code}
{code}
INFO [CompactionExecutor:5] 2017-11-07 14:52:51,380 DataTracker.java:152 - SSTableIndex.open(column: r, minTerm: 0, maxTerm: 0, minKey: 1, maxKey: 7, sstable: BigTableReader(path='/data/cassandra/data/lubo_test/t_doc-64343790c31611e7a46403e2ed27ae86/mc-22-big-Data.db'))
{code}
{code}
INFO [CompactionExecutor:5] 2017-11-07 14:52:51,381 PerSSTableIndexWriter.java:279 - Scheduling index flush to /data/cassandra/data/lubo_test/t_doc-64343790c31611e7a46403e2ed27ae86/mc-23-big-SI_i_doc.db
INFO [SASI-General:2] 2017-11-07 14:52:51,412 PerSSTableIndexWriter.java:330 - Index flush to /data/cassandra/data/lubo_test/t_doc-64343790c31611e7a46403e2ed27ae86/mc-23-big-SI_i_doc.db took 31 ms.
INFO [CompactionExecutor:5] 2017-11-07 14:52:51,413 CompactionManager.java:1488 - Anticompaction completed successfully, anticompacted from 0 to 2 sstable(s).
INFO [CompactionExecutor:5] 2017-11-07 14:52:51,413 CompactionManager.java:694 - [repair #f1539d30-c3c2-11e7-8fe4-090a7aa7154d] Completed anticompaction successfully
INFO [InternalResponseStage:14] 2017-11-07 14:52:51,782 RepairRunnable.java:340 - Repair command #1 finished in 1 second
{code}
The second index, on the second SSTable, does not seem to be opened/finished. And the only known keys are between 1 and 7, which matches the query result:
{code:SQL}
cassandra@cqlsh> SELECT * from lubo_test.t_doc where r = 0;

 id | r | cid
----+---+--------------------------------------
  6 | 0 | 66f68be0-c316-11e7-a464-03e2ed27ae86
  7 | 0 | 66f74f30-c316-11e7-a464-03e2ed27ae86
 10 | 0 | 66faaa90-c316-11e7-a464-03e2ed27ae86
  4 | 0 | 66f46900-c316-11e7-a464-03e2ed27ae86
  3 | 0 | 66f37ea0-c316-11e7-a464-03e2ed27ae86
  5 | 0 | 66f5a180-c316-11e7-a464-03e2ed27ae86
  2 | 0 | 66f29440-c316-11e7-a464-03e2ed27ae86
  1 | 0 | 66ea56e0-c316-11e7-a464-03e2ed27ae86

(8 rows)
{code}
> nodetool repair breaks SASI index
> -
>
> Key: CASSANDRA-13403
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13403
> Project: Cassandra
> Issue Type: Bug
> Components: sasi
> Environment: 3.10
> Reporter: Igor Novgorodov
> Assignee: Alex Petrov
> Attachments: 3_nodes_compaction.log, 4_nodes_compaction.log
>
> I've got table:
> {code}
> CREATE TABLE cservice.bulks_recipients (
>     recipient text,
>     bulk_id uuid,
>     datetime_final timestamp,
>     datetime_sent timestamp,
>     request_id uuid,
>     status int,
>     PRIMARY KEY (recipient, bulk_id)
> ) WITH CLUSTERING ORDER BY (bulk_id ASC)
>     AND bloom_filter_fp_chance = 0.01
>     AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
>     AND comment = ''
>     AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
>     AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
>     AND crc_check_chance = 1.0
>     AND dclocal_read_repair_chance = 0.1
>     AND default_time_to_live = 0
>     AND gc_grace_seconds = 864000
>     AND max_index_interval = 2048
>     AND memtable_flush_period_in_ms = 0
>     AND min_index_interval = 128
>     AND read_repair_chance = 0.0
>     AND speculative_retry = '99PERCENTILE';
> CREATE CUSTOM INDEX bulk_recipients_bulk_id ON cservice.bulks_recipients (bulk_id) USING 'org.apache.cassandra.index.sasi.SASIIndex';
> {code}
> There are 11 rows in it:
> {code}
> > select * from bulks_recipients;
> ...
> (11 rows)
> {code}
> Let's query by index (all rows have the same *bulk_id*):
> {code}
> > select * from bulks_recipients where bulk_id = baa94815-e276-4ca4-adda-5b9734e6c4a5;
> ...
> (11 rows)
[jira] [Updated] (CASSANDRA-12838) Extend native protocol flags and add supported versions to the SUPPORTED response
[ https://issues.apache.org/jira/browse/CASSANDRA-12838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Petrov updated CASSANDRA-12838: Labels: client-impacting protocolv5 (was: client-impacting) > Extend native protocol flags and add supported versions to the SUPPORTED > response > - > > Key: CASSANDRA-12838 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12838 > Project: Cassandra > Issue Type: Sub-task > Components: CQL >Reporter: Stefania >Assignee: Stefania > Labels: client-impacting, protocolv5 > Fix For: 3.10 > > > We already use 7 bits for the flags of the QUERY message, and since they are > encoded with a fixed size byte, we may be forced to change the structure of > the message soon, and I'd like to do this in version 5 but without wasting > bytes on the wire. Therefore, I propose to convert fixed flag's bytes to > unsigned vints, as defined in CASSANDRA-9499. The only exception would be the > flags in the frame, which should stay as fixed size. > Up to 7 bits, vints are encoded the same as bytes are, so no immediate change > would be required in the drivers, although they should plan to support vint > flags if supporting version 5. Moving forward, when a new flag is required > for the QUERY message, and eventually when other flags reach 8 bits in other > messages too, the flag's bitmaps would be automatically encoded with a size > that is big enough to accommodate all flags, but no bigger than required. We > can currently support up to 8 bytes with unsigned vints. > The downside is that drivers need to implement unsigned vint encoding for > version 5, but this is already required by CASSANDRA-11873, and will most > likely be required by CASSANDRA-11622 as well. > I would also like to add the list of versions to the SUPPORTED message, in > order to simplify the handshake for drivers that prefer to send an OPTION > message, rather than rely on receiving an error for an unsupported version in > the STARTUP message. 
Said error should also contain the full list of > supported versions, not just the min and max, for clarity, and because the > latest version is now a beta version. > Finally, we currently store versions as integer constants in {{Server.java}}, > and we still have a fair bit of hard-coded numbers in the code, especially in > tests. I plan to clean this up by introducing a {{ProtocolVersion}} enum.
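The description above relies on the property that unsigned vints encode values of up to 7 bits exactly as a plain byte. A sketch of such a codec, in the style described by CASSANDRA-9499 (the count of leading 1-bits in the first byte gives the number of extra bytes), illustrates why: this is an illustration only, not Cassandra's actual {{VIntCoding}} implementation, whose details may differ.

```java
// Unsigned variable-length integer codec: the first byte carries N leading
// 1-bits when N extra bytes follow, so values <= 127 are a single byte that
// is bit-for-bit identical to the raw byte encoding.
final class UnsignedVInt
{
    // Encoded size in bytes, derived from the value's magnitude (1..9).
    static int size(long value)
    {
        int magnitude = Long.numberOfLeadingZeros(value | 1L);
        return (639 - magnitude * 9) >> 6;
    }

    static byte[] encode(long value)
    {
        int size = size(value);
        byte[] out = new byte[size];
        for (int i = size - 1; i >= 0; i--)
        {
            out[i] = (byte) value;
            value >>>= 8;
        }
        // OR in 'size - 1' leading 1-bits as the length marker
        out[0] |= (byte) ~(0xff >>> (size - 1));
        return out;
    }

    static long decode(byte[] bytes)
    {
        int first = bytes[0] & 0xff;
        // count of leading 1-bits in the first byte = number of extra bytes
        int extraBytes = Integer.numberOfLeadingZeros(~first & 0xff) - 24;
        long value = first & (0xff >>> extraBytes); // value bits of first byte
        for (int i = 1; i <= extraBytes; i++)
            value = (value << 8) | (bytes[i] & 0xff);
        return value;
    }
}
```

With this scheme a driver that already reads single-byte flags keeps working unchanged for bitmaps of up to 7 flags, and the encoding only grows once an 8th flag is introduced.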
[jira] [Commented] (CASSANDRA-13971) Automatic certificate management using Vault
[ https://issues.apache.org/jira/browse/CASSANDRA-13971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242008#comment-16242008 ] Jeff Mitchell commented on CASSANDRA-13971: --- Woah, I totally forgot I had an Apache JIRA account. You're welcome! > Automatic certificate management using Vault > > > Key: CASSANDRA-13971 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13971 > Project: Cassandra > Issue Type: Improvement > Components: Streaming and Messaging >Reporter: Stefan Podkowinski >Assignee: Stefan Podkowinski > Fix For: 4.x > > > We've been adding security features during the last years to enable users to > secure their clusters, if they are willing to use them and do so correctly. > Some features are powerful and easy to work with, such as role based > authorization. Other features that require to manage a local keystore are > rather painful to deal with. Think about setting up SSL.. > To be fair, keystore related issues and certificate handling hasn't been > invented by us. We're just following Java standards there. But that doesn't > mean that we absolutely have to, if there are better options. I'd like to > give it a shoot and find out if we can automate certificate/key handling > (PKI) by using external APIs. In this case, the implementation will be based > on [Vault|https://vaultproject.io]. But certificate management services > offered by cloud providers may also be able to handle the use-case and I > intend to create a generic, pluggable API for that. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13996) Close DataInputBuffer in MetadataSerializer
[ https://issues.apache.org/jira/browse/CASSANDRA-13996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242007#comment-16242007 ] Aleksey Yeschenko commented on CASSANDRA-13996: --- Yeah. I actually saw that and made an explicit call to leave it be, as that close is an obvious no-op, and not having it made the code cleaner. Go ahead if you feel like it, but on principle I'm no fan of making code a bit worse to please tooling or unit tests. Is there an annotation to suppress the warnings instead, maybe? > Close DataInputBuffer in MetadataSerializer > --- > > Key: CASSANDRA-13996 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13996 > Project: Cassandra > Issue Type: Improvement >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson > Fix For: 4.x > > > eclipse-warnings complains about this, either introduced by CASSANDRA-13321 > or CASSANDRA-13953 > Patch here: https://github.com/krummas/cassandra/commits/marcuse/closeDIB > https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/416/ > https://circleci.com/gh/krummas/cassandra/170
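For reference, the annotation asked about does exist: Eclipse's resource-leak analysis can be silenced per declaration with {{@SuppressWarnings("resource")}}. A minimal sketch, using the JDK's {{DataInputStream}} over a byte array as a stand-in for Cassandra's {{DataInputBuffer}} (the class and method names here are illustrative):

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

class MetadataReadSketch
{
    // DataInputStream over an in-memory byte array stands in for Cassandra's
    // DataInputBuffer; close() on it is a no-op, so the leak warning is cosmetic.
    @SuppressWarnings("resource") // silences Eclipse's resource-leak analysis
    static int readVersion(byte[] bytes) throws IOException
    {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
        return in.readInt(); // reads 4 bytes, big-endian
    }
}
```

The annotation keeps the code free of a functionally redundant close() while still documenting that the warning was considered and dismissed.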
[jira] [Commented] (CASSANDRA-13971) Automatic certificate management using Vault
[ https://issues.apache.org/jira/browse/CASSANDRA-13971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242001#comment-16242001 ] Stefan Podkowinski commented on CASSANDRA-13971: Work here is progressing nicely. I'm now done with a first implementation that allows me to authenticate against Vault and retrieve certificates. There's also a dtest that downloads Vault (a static Go executable), spins up an instance and bootstraps a Cassandra cluster with SSL Vault support enabled. There are still some aspects that need more test coverage, such as certificate renewal for running Cassandra instances. But I don't see any major blockers on the way so far. As for Vault, I've found that Java/JCA is a bit limited when it comes to supported RSA private key encodings, and Vault's PKCS#1 encoded keys could not be read using the Java standard classes. But a [PR|https://github.com/hashicorp/vault/pull/3518] has been merged recently that will enable PKCS#8 support in one of the upcoming Vault releases, which is going to solve this issue (thanks [~jeffm]!). > Automatic certificate management using Vault > > > Key: CASSANDRA-13971 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13971 > Project: Cassandra > Issue Type: Improvement > Components: Streaming and Messaging >Reporter: Stefan Podkowinski >Assignee: Stefan Podkowinski > Fix For: 4.x > > > We've been adding security features during the last years to enable users to > secure their clusters, if they are willing to use them and do so correctly. > Some features are powerful and easy to work with, such as role based > authorization. Other features that require managing a local keystore are > rather painful to deal with. Think about setting up SSL. > To be fair, keystore related issues and certificate handling weren't > invented by us. We're just following Java standards there. But that doesn't > mean that we absolutely have to, if there are better options.
I'd like to > give it a shot and find out if we can automate certificate/key handling > (PKI) by using external APIs. In this case, the implementation will be based > on [Vault|https://vaultproject.io]. But certificate management services > offered by cloud providers may also be able to handle the use-case and I > intend to create a generic, pluggable API for that.
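To illustrate the encoding limitation described above: JCA's {{KeyFactory}} accepts PKCS#8 material via {{PKCS8EncodedKeySpec}}, but ships no KeySpec for PKCS#1 ("BEGIN RSA PRIVATE KEY"), which is why those keys need third-party parsing (e.g. BouncyCastle). A sketch of loading a PKCS#8 PEM with standard classes only (the helper name is illustrative, not from the patch):

```java
import java.security.KeyFactory;
import java.security.PrivateKey;
import java.security.spec.PKCS8EncodedKeySpec;
import java.util.Base64;

class KeyLoaderSketch
{
    // Loads an RSA key from an unencrypted PKCS#8 PEM ("BEGIN PRIVATE KEY").
    // A PKCS#1 PEM ("BEGIN RSA PRIVATE KEY") has no built-in JCA KeySpec and
    // cannot be loaded this way.
    static PrivateKey loadPkcs8Rsa(String pem) throws Exception
    {
        String base64 = pem.replace("-----BEGIN PRIVATE KEY-----", "")
                           .replace("-----END PRIVATE KEY-----", "")
                           .replaceAll("\\s", "");
        byte[] der = Base64.getDecoder().decode(base64);
        return KeyFactory.getInstance("RSA").generatePrivate(new PKCS8EncodedKeySpec(der));
    }
}
```

Once Vault emits PKCS#8, a loader like this needs no extra dependencies.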
[jira] [Commented] (CASSANDRA-13964) Tracing interferes with digest requests when using RandomPartitioner
[ https://issues.apache.org/jira/browse/CASSANDRA-13964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16241998#comment-16241998 ] ASF GitHub Bot commented on CASSANDRA-13964: GitHub user beobal opened a pull request: https://github.com/apache/cassandra-dtest/pull/10 Add test for digest requests with RandomPartitioner and tracing enabled Patch by Sam Tunnicliffe; reviewed by Jason Brown for CASSANDRA-13964 @ptnapoleon: Jason already gave this the once over, but if you have chance I'd appreciate your +1 You can merge this pull request into a Git repository by running: $ git pull https://github.com/beobal/cassandra-dtest 13964 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/cassandra-dtest/pull/10.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #10 commit edc48bc965e842628413cfd50a7a21071d7b098a Author: Sam Tunnicliffe Date: 2017-10-17T13:50:25Z Add test for digest requests with RandomPartitioner and tracing enabled Patch by Sam Tunnicliffe; reviewed by Jason Brown for CASSANDRA-13964 > Tracing interferes with digest requests when using RandomPartitioner > > > Key: CASSANDRA-13964 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13964 > Project: Cassandra > Issue Type: Bug > Components: Local Write-Read Paths, Observability >Reporter: Sam Tunnicliffe >Assignee: Sam Tunnicliffe > > A {{ThreadLocal}} is used to generate the MD5 digest when a > replica serves a read command and the {{isDigestQuery}} flag is set. The same > threadlocal is also used by {{RandomPartitioner}} to decorate partition keys. > So in a cluster with RP, if tracing is enabled the data digest is corrupted > by the partitioner making tokens for the tracing mutations. 
This causes a > digest mismatch on the coordinator, triggering a full data read on every read > where CL > 1 (or when speculative execution/read repair kicks in).
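The failure mode generalizes to any thread-local {{MessageDigest}} shared by unrelated code paths: an interleaved update from the second path pollutes the first path's in-progress digest. The sketch below demonstrates the hazard only; the names and the interleaving hook are illustrative, not Cassandra's actual code:

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class SharedDigestSketch
{
    // One MD5 instance per thread, shared by unrelated code paths.
    static final ThreadLocal<MessageDigest> MD5 = ThreadLocal.withInitial(() -> {
        try { return MessageDigest.getInstance("MD5"); }
        catch (NoSuchAlgorithmException e) { throw new AssertionError(e); }
    });

    // Computes a "data digest"; if another user of the thread-local (here, a
    // stand-in for the partitioner hashing a tracing token) runs mid-stream,
    // its bytes leak into the result.
    static byte[] dataDigest(byte[] partition, boolean traceInterleaved)
    {
        MessageDigest md = MD5.get();
        md.reset();
        md.update(partition);
        if (traceInterleaved)
            MD5.get().update("tracing-token".getBytes()); // same instance: pollutes the digest
        return md.digest();
    }
}
```

With interleaving enabled the digest differs from the clean one, which is exactly the replica-side corruption that makes the coordinator see a mismatch.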
[jira] [Updated] (CASSANDRA-13999) Segfault during memtable flush
[ https://issues.apache.org/jira/browse/CASSANDRA-13999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ricardo Bartolome updated CASSANDRA-13999: -- Description: We are getting segfaults on a production Cassandra cluster, apparently caused by Memtable flushes to disk. {code} Current thread (0x0cd77920): JavaThread "PerDiskMemtableFlushWriter_0:140" daemon [_thread_in_Java, id=28952, stack(0x7f8b7aa53000,0x7f8b7aa94000)] {code} Stack {code} Stack: [0x7f8b7aa53000,0x7f8b7aa94000], sp=0x7f8b7aa924a0, free space=253k Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) J 21889 C2 org.apache.cassandra.io.sstable.format.big.BigTableWriter.append(Lorg/apache/cassandra/db/rows/UnfilteredRowIterator;)Lorg/apache/cassandra/db/RowIndexEntry; (361 bytes) @ 0x7f8e9fcf75ac [0x7f8e9fcf42c0+0x32ec] J 22464 C2 org.apache.cassandra.db.Memtable$FlushRunnable.writeSortedContents()V (383 bytes) @ 0x7f8e9f17b988 [0x7f8e9f17b5c0+0x3c8] j org.apache.cassandra.db.Memtable$FlushRunnable.call()Lorg/apache/cassandra/io/sstable/SSTableMultiWriter;+1 j org.apache.cassandra.db.Memtable$FlushRunnable.call()Ljava/lang/Object;+1 J 18865 C2 java.util.concurrent.FutureTask.run()V (126 bytes) @ 0x7f8e9d3c9540 [0x7f8e9d3c93a0+0x1a0] J 21832 C2 java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V (225 bytes) @ 0x7f8e9f16856c [0x7f8e9f168400+0x16c] J 6720 C1 java.util.concurrent.ThreadPoolExecutor$Worker.run()V (9 bytes) @ 0x7f8e9def73c4 [0x7f8e9def72c0+0x104] J 22079 C2 java.lang.Thread.run()V (17 bytes) @ 0x7f8e9e67c4ac [0x7f8e9e67c460+0x4c] v ~StubRoutines::call_stub V [libjvm.so+0x691d16] JavaCalls::call_helper(JavaValue*, methodHandle*, JavaCallArguments*, Thread*)+0x1056 V [libjvm.so+0x692221] JavaCalls::call_virtual(JavaValue*, KlassHandle, Symbol*, Symbol*, JavaCallArguments*, Thread*)+0x321 V [libjvm.so+0x6926c7] JavaCalls::call_virtual(JavaValue*, Handle, KlassHandle, Symbol*, Symbol*, Thread*)+0x47 
V [libjvm.so+0x72da50] thread_entry(JavaThread*, Thread*)+0xa0 V [libjvm.so+0xa76833] JavaThread::thread_main_inner()+0x103 V [libjvm.so+0xa7697c] JavaThread::run()+0x11c V [libjvm.so+0x927568] java_start(Thread*)+0x108 C [libpthread.so.0+0x7de5] start_thread+0xc5 {code} For further details, we attached: * JVM error file with all details * cassandra config file (we are using offheap_buffers as memtable_allocation_method) * some lines printed in debug.log when the JVM error file was created and process died h5. Reproducing the issue So far we have been unable to reproduce it. It happens once/twice a week on single nodes. It happens either during high load or low load times. We have seen that when we replace EC2 instances and bootstrap new ones, due to compactions happening on source nodes before stream starts, sometimes more than a single node was affected by this, leaving us with 2 out of 3 replicas down and UnavailableExceptions in the cluster. This issue might be related to CASSANDRA-12590 (Segfault reading secondary index), even though this is the write path. Can someone confirm if both issues could be related? h5. Specifics of our scenario: * Cassandra 3.9 on Amazon Linux (prior to this, we were running Cassandra 2.0.9 and there are no records of this happening, though I was not working on Cassandra at the time) * 12 x i3.2xlarge EC2 instances (8 core, 64GB RAM) * a total of 176 keyspaces (there is a per-customer pattern) ** Some keyspaces have a single table, while others have 2 or 5 tables ** There is a table that uses standard Secondary Indexes ("emailindex" on "user_info" table) * It happens on both Oracle JDK 1.8.0_112 and 1.8.0_131 * It happens on both kernel 4.9.43-17.38.amzn1.x86_64 and 3.14.35-28.38.amzn1.x86_64 h5. Possible workarounds/solutions that we have in mind (to be validated yet) * switching to heap_buffers (in case offheap_buffers triggers the bug), though we still need to measure performance degradation under that scenario. 
* removing secondary indexes in favour of Materialized Views for this specific case, though we are also concerned that MVs introduce new issues that may be present in our current Cassandra 3.9 * Upgrading to 3.11.1 is an option, but we are trying to keep it as a last resort given that the cost of migrating is high and we don't have any guarantee that new bugs affecting node availability are not introduced. was: We are getting segfaults on a production Cassandra cluster, apparently caused by Memtable flushes to disk. {code} Current thread (0x0cd77920): JavaThread "PerDiskMemtableFlushWriter_0:140" daemon [_thread_in_Java, id=28952, stack(0x7f8b7aa53000,0x7f8b7aa94000)] {code} Stack {code} Stack: [0x7f8b7aa53000,0x7f8b7aa94000], sp=0x7f8b7aa924a0, free space=25
[jira] [Updated] (CASSANDRA-13999) Segfault during memtable flush
[ https://issues.apache.org/jira/browse/CASSANDRA-13999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ricardo Bartolome updated CASSANDRA-13999: -- Environment: * Cassandra 3.9 * Oracle JDK 1.8.0_112 and 1.8.0_131 * Kernel 4.9.43-17.38.amzn1.x86_64 and 3.14.35-28.38.amzn1.x86_64 > Segfault during memtable flush > -- > > Key: CASSANDRA-13999 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13999 > Project: Cassandra > Issue Type: Bug > Components: Local Write-Read Paths > Environment: * Cassandra 3.9 > * Oracle JDK 1.8.0_112 and 1.8.0_131 > * Kernel 4.9.43-17.38.amzn1.x86_64 and 3.14.35-28.38.amzn1.x86_64 >Reporter: Ricardo Bartolome >Priority: Critical > Attachments: > cassandra-jvm-file-error-1509698372-pid16151.log.obfuscated, > cassandra_config.yaml, node_crashing_debug.log > > > We are getting segfaults on a production Cassandra cluster, apparently caused > by Memtable flushes to disk. > {code} > Current thread (0x0cd77920): JavaThread > "PerDiskMemtableFlushWriter_0:140" daemon [_thread_in_Java, id=28952, > stack(0x7f8b7aa53000,0x7f8b7aa94000)] > {code} > Stack > {code} > Stack: [0x7f8b7aa53000,0x7f8b7aa94000], sp=0x7f8b7aa924a0, free > space=253k > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native > code) > J 21889 C2 > org.apache.cassandra.io.sstable.format.big.BigTableWriter.append(Lorg/apache/cassandra/db/rows/UnfilteredRowIterator;)Lorg/apache/cassandra/db/RowIndexEntry; > (361 bytes) @ 0x7f8e9fcf75ac [0x7f8e9fcf42c0+0x32ec] > J 22464 C2 > org.apache.cassandra.db.Memtable$FlushRunnable.writeSortedContents()V (383 > bytes) @ 0x7f8e9f17b988 [0x7f8e9f17b5c0+0x3c8] > j > org.apache.cassandra.db.Memtable$FlushRunnable.call()Lorg/apache/cassandra/io/sstable/SSTableMultiWriter;+1 > j org.apache.cassandra.db.Memtable$FlushRunnable.call()Ljava/lang/Object;+1 > J 18865 C2 java.util.concurrent.FutureTask.run()V (126 bytes) @ > 0x7f8e9d3c9540 [0x7f8e9d3c93a0+0x1a0] > J 21832 C2 > 
java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V > (225 bytes) @ 0x7f8e9f16856c [0x7f8e9f168400+0x16c] > J 6720 C1 java.util.concurrent.ThreadPoolExecutor$Worker.run()V (9 bytes) @ > 0x7f8e9def73c4 [0x7f8e9def72c0+0x104] > J 22079 C2 java.lang.Thread.run()V (17 bytes) @ 0x7f8e9e67c4ac > [0x7f8e9e67c460+0x4c] > v ~StubRoutines::call_stub > V [libjvm.so+0x691d16] JavaCalls::call_helper(JavaValue*, methodHandle*, > JavaCallArguments*, Thread*)+0x1056 > V [libjvm.so+0x692221] JavaCalls::call_virtual(JavaValue*, KlassHandle, > Symbol*, Symbol*, JavaCallArguments*, Thread*)+0x321 > V [libjvm.so+0x6926c7] JavaCalls::call_virtual(JavaValue*, Handle, > KlassHandle, Symbol*, Symbol*, Thread*)+0x47 > V [libjvm.so+0x72da50] thread_entry(JavaThread*, Thread*)+0xa0 > V [libjvm.so+0xa76833] JavaThread::thread_main_inner()+0x103 > V [libjvm.so+0xa7697c] JavaThread::run()+0x11c > V [libjvm.so+0x927568] java_start(Thread*)+0x108 > C [libpthread.so.0+0x7de5] start_thread+0xc5 > {code} > For further details, we attached: > * JVM error file with all details > * cassandra config file (we are using offheap_buffers as > memtable_allocation_method) > * some lines printed in debug.log when the JVM error file was created and > process died > h5. Reproducing the issue > So far we have been unable to reproduce it. It happens once/twice a week on > single nodes. It happens either during high load or low load times. We have > seen that when we replace EC2 instances and bootstrap new ones, due to > compactions happening on source nodes before stream starts, sometimes more > than a single node was affected by this, letting us with 2 out of 3 replicas > out and UnavailableExceptions in the cluster. > This issue might have relation with CASSANDRA-12590 (Segfault reading > secondary index) even this is the write path. Can someone confirm if both > issues could be related? > h5. 
Specifics of our scenario: > * Cassandra 3.9 on Amazon Linux (previous to this, we were running Cassandra > 2.0.9 and there are no records of this also happening, even I was not working > on Cassandra) > * 12 x i3.2xlarge EC2 instances (8 core, 64GB RAM) > * a total of 176 keyspaces (there is a per-customer pattern) > ** Some keyspaces have a single table, while others have 2 or 5 tables > ** There is a table that uses standard Secondary Indexes ("emailindex" on > "user_info" table) > * It happens on both Oracle JDK 1.8.0_112 and 1.8.0_131 > * It happens in both kernel 4.9.43-17.38.amzn1.x86_64 and > 3.14.35-28.38.amzn1.x86_64 > h5. Possible workarounds/solutions (to be validated yet) > * switching to heap_buffers (in case offheap_buffers triggers the bug), even > we are still pen
[jira] [Commented] (CASSANDRA-13999) Segfault during memtable flush
[ https://issues.apache.org/jira/browse/CASSANDRA-13999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16241970#comment-16241970 ] Ricardo Bartolome commented on CASSANDRA-13999: --- The schema of the tables that contains the index to query from user email is the following: {code} CREATE TABLE customer_user.user_info ( user_id text PRIMARY KEY, user_accept_email boolean, user_email text, user_last_modified timestamp, user_locale text, user_metadata text, user_name text, user_profile_picture text, user_site text, user_timezone text ) WITH bloom_filter_fp_chance = 0.01 AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'} AND comment = '' AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'} AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'} AND crc_check_chance = 1.0 AND dclocal_read_repair_chance = 0.1 AND default_time_to_live = 0 AND gc_grace_seconds = 864000 AND max_index_interval = 2048 AND memtable_flush_period_in_ms = 0 AND min_index_interval = 128 AND read_repair_chance = 0.0 AND speculative_retry = '99PERCENTILE'; CREATE INDEX emailindex ON customer_user.user_info (user_email); {code} > Segfault during memtable flush > -- > > Key: CASSANDRA-13999 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13999 > Project: Cassandra > Issue Type: Bug > Components: Local Write-Read Paths >Reporter: Ricardo Bartolome >Priority: Critical > Attachments: > cassandra-jvm-file-error-1509698372-pid16151.log.obfuscated, > cassandra_config.yaml, node_crashing_debug.log > > > We are getting segfaults on a production Cassandra cluster, apparently caused > by Memtable flushes to disk. 
> {code} > Current thread (0x0cd77920): JavaThread > "PerDiskMemtableFlushWriter_0:140" daemon [_thread_in_Java, id=28952, > stack(0x7f8b7aa53000,0x7f8b7aa94000)] > {code} > Stack > {code} > Stack: [0x7f8b7aa53000,0x7f8b7aa94000], sp=0x7f8b7aa924a0, free > space=253k > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native > code) > J 21889 C2 > org.apache.cassandra.io.sstable.format.big.BigTableWriter.append(Lorg/apache/cassandra/db/rows/UnfilteredRowIterator;)Lorg/apache/cassandra/db/RowIndexEntry; > (361 bytes) @ 0x7f8e9fcf75ac [0x7f8e9fcf42c0+0x32ec] > J 22464 C2 > org.apache.cassandra.db.Memtable$FlushRunnable.writeSortedContents()V (383 > bytes) @ 0x7f8e9f17b988 [0x7f8e9f17b5c0+0x3c8] > j > org.apache.cassandra.db.Memtable$FlushRunnable.call()Lorg/apache/cassandra/io/sstable/SSTableMultiWriter;+1 > j org.apache.cassandra.db.Memtable$FlushRunnable.call()Ljava/lang/Object;+1 > J 18865 C2 java.util.concurrent.FutureTask.run()V (126 bytes) @ > 0x7f8e9d3c9540 [0x7f8e9d3c93a0+0x1a0] > J 21832 C2 > java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V > (225 bytes) @ 0x7f8e9f16856c [0x7f8e9f168400+0x16c] > J 6720 C1 java.util.concurrent.ThreadPoolExecutor$Worker.run()V (9 bytes) @ > 0x7f8e9def73c4 [0x7f8e9def72c0+0x104] > J 22079 C2 java.lang.Thread.run()V (17 bytes) @ 0x7f8e9e67c4ac > [0x7f8e9e67c460+0x4c] > v ~StubRoutines::call_stub > V [libjvm.so+0x691d16] JavaCalls::call_helper(JavaValue*, methodHandle*, > JavaCallArguments*, Thread*)+0x1056 > V [libjvm.so+0x692221] JavaCalls::call_virtual(JavaValue*, KlassHandle, > Symbol*, Symbol*, JavaCallArguments*, Thread*)+0x321 > V [libjvm.so+0x6926c7] JavaCalls::call_virtual(JavaValue*, Handle, > KlassHandle, Symbol*, Symbol*, Thread*)+0x47 > V [libjvm.so+0x72da50] thread_entry(JavaThread*, Thread*)+0xa0 > V [libjvm.so+0xa76833] JavaThread::thread_main_inner()+0x103 > V [libjvm.so+0xa7697c] JavaThread::run()+0x11c > V [libjvm.so+0x927568] 
java_start(Thread*)+0x108 > C [libpthread.so.0+0x7de5] start_thread+0xc5 > {code} > For further details, we attached: > * JVM error file with all details > * cassandra config file (we are using offheap_buffers as > memtable_allocation_method) > * some lines printed in debug.log when the JVM error file was created and > process died > h5. Reproducing the issue > So far we have been unable to reproduce it. It happens once/twice a week on > single nodes. It happens either during high load or low load times. We have > seen that when we replace EC2 instances and bootstrap new ones, due to > compactions happening on source nodes before stream starts, sometimes more > than a single node was affected by this, letting us with 2 out of 3 replicas > out and UnavailableExceptions in the cluster. > This issue might have relation with
[jira] [Issue Comment Deleted] (CASSANDRA-10857) Allow dropping COMPACT STORAGE flag from tables in 3.X
[ https://issues.apache.org/jira/browse/CASSANDRA-10857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Petrov updated CASSANDRA-10857: Comment: was deleted (was: Github user ptnapoleon commented on a diff in the pull request: https://github.com/apache/cassandra-dtest/pull/9#discussion_r149296143 --- Diff: cql_tests.py --- @@ -698,6 +719,54 @@ def many_columns_test(self): ",".join(map(lambda i: "c_{}".format(i), range(width))) + " FROM very_wide_table", [[i for i in range(width)]]) +@since("3.11", max_version="3.X") +def drop_compact_storage_flag_test(self): +""" +Test for CASSANDRA-10857, verifying the schema change +distribution across the other nodes. + +""" + +cluster = self.cluster + +cluster.populate(3).start() +node1 = cluster.nodelist()[0] +node2 = cluster.nodelist()[1] +node3 = cluster.nodelist()[2] +time.sleep(0.2) --- End diff -- There's no need for this sleep. ) > Allow dropping COMPACT STORAGE flag from tables in 3.X > -- > > Key: CASSANDRA-10857 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10857 > Project: Cassandra > Issue Type: Improvement > Components: CQL, Distributed Metadata >Reporter: Aleksey Yeschenko >Assignee: Alex Petrov >Priority: Blocker > Labels: client-impacting > Fix For: 4.0, 3.0.x, 3.11.x > > > Thrift allows users to define flexible mixed column families - where certain > columns would have explicitly pre-defined names, potentially non-default > validation types, and be indexed. > Example: > {code} > create column family foo > and default_validation_class = UTF8Type > and column_metadata = [ > {column_name: bar, validation_class: Int32Type, index_type: KEYS}, > {column_name: baz, validation_class: UUIDType, index_type: KEYS} > ]; > {code} > Columns named {{bar}} and {{baz}} will be validated as {{Int32Type}} and > {{UUIDType}}, respectively, and be indexed. Columns with any other name will > be validated by {{UTF8Type}} and will not be indexed. 
> With CASSANDRA-8099, {{bar}} and {{baz}} would be mapped to static columns > internally. However, being {{WITH COMPACT STORAGE}}, the table will only > expose {{bar}} and {{baz}} columns. Accessing any dynamic columns (any column > not named {{bar}} and {{baz}}) right now requires going through Thrift. > This is blocking Thrift -> CQL migration for users who have mixed > dynamic/static column families. That said, it *shouldn't* be hard to allow > users to drop the {{compact}} flag to expose the table as it is internally > now, and be able to access all columns.
[jira] [Issue Comment Deleted] (CASSANDRA-10857) Allow dropping COMPACT STORAGE flag from tables in 3.X
[ https://issues.apache.org/jira/browse/CASSANDRA-10857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Petrov updated CASSANDRA-10857: Comment: was deleted (was: Github user ptnapoleon commented on a diff in the pull request: https://github.com/apache/cassandra-dtest/pull/9#discussion_r149296195 --- Diff: cql_tests.py --- @@ -698,6 +719,54 @@ def many_columns_test(self): ",".join(map(lambda i: "c_{}".format(i), range(width))) + " FROM very_wide_table", [[i for i in range(width)]]) +@since("3.11", max_version="3.X") +def drop_compact_storage_flag_test(self): +""" +Test for CASSANDRA-10857, verifying the schema change +distribution across the other nodes. + +""" + +cluster = self.cluster + +cluster.populate(3).start() +node1 = cluster.nodelist()[0] --- End diff -- Its much more concise to just write `node1, node2, node3 = cluster.nodelist()` ) > Allow dropping COMPACT STORAGE flag from tables in 3.X > -- > > Key: CASSANDRA-10857 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10857 > Project: Cassandra > Issue Type: Improvement > Components: CQL, Distributed Metadata >Reporter: Aleksey Yeschenko >Assignee: Alex Petrov >Priority: Blocker > Labels: client-impacting > Fix For: 4.0, 3.0.x, 3.11.x > > > Thrift allows users to define flexible mixed column families - where certain > columns would have explicitly pre-defined names, potentially non-default > validation types, and be indexed. > Example: > {code} > create column family foo > and default_validation_class = UTF8Type > and column_metadata = [ > {column_name: bar, validation_class: Int32Type, index_type: KEYS}, > {column_name: baz, validation_class: UUIDType, index_type: KEYS} > ]; > {code} > Columns named {{bar}} and {{baz}} will be validated as {{Int32Type}} and > {{UUIDType}}, respectively, and be indexed. Columns with any other name will > be validated by {{UTF8Type}} and will not be indexed. 
> With CASSANDRA-8099, {{bar}} and {{baz}} would be mapped to static columns > internally. However, being {{WITH COMPACT STORAGE}}, the table will only > expose {{bar}} and {{baz}} columns. Accessing any dynamic columns (any column > not named {{bar}} and {{baz}}) right now requires going through Thrift. > This is blocking Thrift -> CQL migration for users who have mixed > dynamic/static column families. That said, it *shouldn't* be hard to allow > users to drop the {{compact}} flag to expose the table as it is internally > now, and be able to access all columns.
[jira] [Issue Comment Deleted] (CASSANDRA-10857) Allow dropping COMPACT STORAGE flag from tables in 3.X
[ https://issues.apache.org/jira/browse/CASSANDRA-10857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Petrov updated CASSANDRA-10857: Comment: was deleted (was: Github user ptnapoleon commented on a diff in the pull request: https://github.com/apache/cassandra-dtest/pull/9#discussion_r149296242 --- Diff: cql_tests.py --- @@ -698,6 +719,54 @@ def many_columns_test(self): ",".join(map(lambda i: "c_{}".format(i), range(width))) + " FROM very_wide_table", [[i for i in range(width)]]) +@since("3.11", max_version="3.X") +def drop_compact_storage_flag_test(self): +""" +Test for CASSANDRA-10857, verifying the schema change +distribution across the other nodes. + +""" + +cluster = self.cluster + +cluster.populate(3).start() +node1 = cluster.nodelist()[0] +node2 = cluster.nodelist()[1] +node3 = cluster.nodelist()[2] +time.sleep(0.2) + +session1 = self.patient_cql_connection(node1) +session2 = self.patient_cql_connection(node2) +session3 = self.patient_cql_connection(node3) +self.create_ks(session1, 'ks', 3) +sessions = [session1, session2, session3] + +for session in sessions: +session.set_keyspace('ks') + +session1.execute(""" +CREATE TABLE test_drop_compact_storage (k int PRIMARY KEY, s1 int) WITH COMPACT STORAGE; +""") +time.sleep(1) --- End diff -- No need for this sleep. ) > Allow dropping COMPACT STORAGE flag from tables in 3.X > -- > > Key: CASSANDRA-10857 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10857 > Project: Cassandra > Issue Type: Improvement > Components: CQL, Distributed Metadata >Reporter: Aleksey Yeschenko >Assignee: Alex Petrov >Priority: Blocker > Labels: client-impacting > Fix For: 4.0, 3.0.x, 3.11.x > > > Thrift allows users to define flexible mixed column families - where certain > columns would have explicitly pre-defined names, potentially non-default > validation types, and be indexed. 
> Example: > {code} > create column family foo > and default_validation_class = UTF8Type > and column_metadata = [ > {column_name: bar, validation_class: Int32Type, index_type: KEYS}, > {column_name: baz, validation_class: UUIDType, index_type: KEYS} > ]; > {code} > Columns named {{bar}} and {{baz}} will be validated as {{Int32Type}} and > {{UUIDType}}, respectively, and be indexed. Columns with any other name will > be validated by {{UTF8Type}} and will not be indexed. > With CASSANDRA-8099, {{bar}} and {{baz}} would be mapped to static columns > internally. However, being {{WITH COMPACT STORAGE}}, the table will only > expose {{bar}} and {{baz}} columns. Accessing any dynamic columns (any column > not named {{bar}} and {{baz}}) right now requires going through Thrift. > This is blocking Thrift -> CQL migration for users who have mixed > dynamic/static column families. That said, it *shouldn't* be hard to allow > users to drop the {{compact}} flag to expose the table as it is internally > now, and be able to access all columns.
[jira] [Updated] (CASSANDRA-13999) Segfault during memtable flush
[ https://issues.apache.org/jira/browse/CASSANDRA-13999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ricardo Bartolome updated CASSANDRA-13999: -- Description: We are getting segfaults on a production Cassandra cluster, apparently caused by Memtable flushes to disk. {code} Current thread (0x0cd77920): JavaThread "PerDiskMemtableFlushWriter_0:140" daemon [_thread_in_Java, id=28952, stack(0x7f8b7aa53000,0x7f8b7aa94000)] {code} Stack {code} Stack: [0x7f8b7aa53000,0x7f8b7aa94000], sp=0x7f8b7aa924a0, free space=253k Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) J 21889 C2 org.apache.cassandra.io.sstable.format.big.BigTableWriter.append(Lorg/apache/cassandra/db/rows/UnfilteredRowIterator;)Lorg/apache/cassandra/db/RowIndexEntry; (361 bytes) @ 0x7f8e9fcf75ac [0x7f8e9fcf42c0+0x32ec] J 22464 C2 org.apache.cassandra.db.Memtable$FlushRunnable.writeSortedContents()V (383 bytes) @ 0x7f8e9f17b988 [0x7f8e9f17b5c0+0x3c8] j org.apache.cassandra.db.Memtable$FlushRunnable.call()Lorg/apache/cassandra/io/sstable/SSTableMultiWriter;+1 j org.apache.cassandra.db.Memtable$FlushRunnable.call()Ljava/lang/Object;+1 J 18865 C2 java.util.concurrent.FutureTask.run()V (126 bytes) @ 0x7f8e9d3c9540 [0x7f8e9d3c93a0+0x1a0] J 21832 C2 java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V (225 bytes) @ 0x7f8e9f16856c [0x7f8e9f168400+0x16c] J 6720 C1 java.util.concurrent.ThreadPoolExecutor$Worker.run()V (9 bytes) @ 0x7f8e9def73c4 [0x7f8e9def72c0+0x104] J 22079 C2 java.lang.Thread.run()V (17 bytes) @ 0x7f8e9e67c4ac [0x7f8e9e67c460+0x4c] v ~StubRoutines::call_stub V [libjvm.so+0x691d16] JavaCalls::call_helper(JavaValue*, methodHandle*, JavaCallArguments*, Thread*)+0x1056 V [libjvm.so+0x692221] JavaCalls::call_virtual(JavaValue*, KlassHandle, Symbol*, Symbol*, JavaCallArguments*, Thread*)+0x321 V [libjvm.so+0x6926c7] JavaCalls::call_virtual(JavaValue*, Handle, KlassHandle, Symbol*, Symbol*, Thread*)+0x47 
V [libjvm.so+0x72da50] thread_entry(JavaThread*, Thread*)+0xa0 V [libjvm.so+0xa76833] JavaThread::thread_main_inner()+0x103 V [libjvm.so+0xa7697c] JavaThread::run()+0x11c V [libjvm.so+0x927568] java_start(Thread*)+0x108 C [libpthread.so.0+0x7de5] start_thread+0xc5 {code} For further details, we attached: * JVM error file with all details * cassandra config file (we are using offheap_buffers as memtable_allocation_method) * some lines printed in debug.log when the JVM error file was created and process died h5. Reproducing the issue So far we have been unable to reproduce it. It happens once/twice a week on single nodes. It happens either during high load or low load times. We have seen that when we replace EC2 instances and bootstrap new ones, due to compactions happening on source nodes before stream starts, sometimes more than a single node was affected by this, letting us with 2 out of 3 replicas out and UnavailableExceptions in the cluster. This issue might have relation with CASSANDRA-12590 (Segfault reading secondary index) even this is the write path. Can someone confirm if both issues could be related? h5. Specifics of our scenario: * Cassandra 3.9 on Amazon Linux (previous to this, we were running Cassandra 2.0.9 and there are no records of this also happening, even I was not working on Cassandra) * 12 x i3.2xlarge EC2 instances (8 core, 64GB RAM) * a total of 176 keyspaces (there is a per-customer pattern) ** Some keyspaces have a single table, while others have 2 or 5 tables ** There is a table that uses standard Secondary Indexes ("emailindex" on "user_info" table) * It happens on both Oracle JDK 1.8.0_112 and 1.8.0_131 * It happens in both kernel 4.9.43-17.38.amzn1.x86_64 and 3.14.35-28.38.amzn1.x86_64 h5. Possible workarounds/solutions (to be validated yet) * switching to heap_buffers (in case offheap_buffers triggers the bug), even we are still pending to measure performance degradation under that scenario. 
* removing secondary indexes in favour of Materialized Views for this specific case, though we are also concerned that MVs may introduce new issues in our current Cassandra 3.9 * Upgrading to 3.11.1 is an option, but we are keeping it as a last resort, given that the cost of migrating is big and we have no guarantee that new bugs affecting node availability would not be introduced.
[jira] [Updated] (CASSANDRA-13999) Segfault during memtable flush
[ https://issues.apache.org/jira/browse/CASSANDRA-13999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ricardo Bartolome updated CASSANDRA-13999: -- Description: We are getting segfaults on a production Cassandra cluster, apparently caused by Memtable flushes to disk. {code} Current thread (0x0cd77920): JavaThread "PerDiskMemtableFlushWriter_0:140" daemon [_thread_in_Java, id=28952, stack(0x7f8b7aa53000,0x7f8b7aa94000)] {code} Stack {code} Stack: [0x7f8b7aa53000,0x7f8b7aa94000], sp=0x7f8b7aa924a0, free space=253k Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) J 21889 C2 org.apache.cassandra.io.sstable.format.big.BigTableWriter.append(Lorg/apache/cassandra/db/rows/UnfilteredRowIterator;)Lorg/apache/cassandra/db/RowIndexEntry; (361 bytes) @ 0x7f8e9fcf75ac [0x7f8e9fcf42c0+0x32ec] J 22464 C2 org.apache.cassandra.db.Memtable$FlushRunnable.writeSortedContents()V (383 bytes) @ 0x7f8e9f17b988 [0x7f8e9f17b5c0+0x3c8] j org.apache.cassandra.db.Memtable$FlushRunnable.call()Lorg/apache/cassandra/io/sstable/SSTableMultiWriter;+1 j org.apache.cassandra.db.Memtable$FlushRunnable.call()Ljava/lang/Object;+1 J 18865 C2 java.util.concurrent.FutureTask.run()V (126 bytes) @ 0x7f8e9d3c9540 [0x7f8e9d3c93a0+0x1a0] J 21832 C2 java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V (225 bytes) @ 0x7f8e9f16856c [0x7f8e9f168400+0x16c] J 6720 C1 java.util.concurrent.ThreadPoolExecutor$Worker.run()V (9 bytes) @ 0x7f8e9def73c4 [0x7f8e9def72c0+0x104] J 22079 C2 java.lang.Thread.run()V (17 bytes) @ 0x7f8e9e67c4ac [0x7f8e9e67c460+0x4c] v ~StubRoutines::call_stub V [libjvm.so+0x691d16] JavaCalls::call_helper(JavaValue*, methodHandle*, JavaCallArguments*, Thread*)+0x1056 V [libjvm.so+0x692221] JavaCalls::call_virtual(JavaValue*, KlassHandle, Symbol*, Symbol*, JavaCallArguments*, Thread*)+0x321 V [libjvm.so+0x6926c7] JavaCalls::call_virtual(JavaValue*, Handle, KlassHandle, Symbol*, Symbol*, Thread*)+0x47 
V [libjvm.so+0x72da50] thread_entry(JavaThread*, Thread*)+0xa0 V [libjvm.so+0xa76833] JavaThread::thread_main_inner()+0x103 V [libjvm.so+0xa7697c] JavaThread::run()+0x11c V [libjvm.so+0x927568] java_start(Thread*)+0x108 C [libpthread.so.0+0x7de5] start_thread+0xc5 {code} For further details, we attached: * JVM error file with all details * cassandra config file (we are using offheap_buffers as memtable_allocation_method) * some lines printed in debug.log when the JVM error file was created and the process died h5. Reproducing the issue So far we have been unable to reproduce it. It happens once or twice a week on single nodes, during both high-load and low-load times. We have seen that when we replace EC2 instances and bootstrap new ones, due to compactions happening on source nodes before the stream starts, sometimes more than a single node was affected by this, leaving us with 2 out of 3 replicas down and UnavailableExceptions in the cluster. This issue might be related to CASSANDRA-12590 (Segfault reading secondary index), even though this is the write path. Can someone confirm whether both issues could be related? h5. Specifics of our scenario: * Cassandra 3.9 on Amazon Linux (before this we ran Cassandra 2.0.9 and there is no record of this happening then, though I was not working on Cassandra at the time) * 12 x i3.2xlarge EC2 instances (8 cores, 64GB RAM) * a total of 176 keyspaces (there is a per-customer pattern) ** Some keyspaces have a single table, while others have 2 or 5 tables ** There is a table that uses standard Secondary Indexes ("emailindex" on the "user_info" table) * It happens on both Oracle JDK 1.8.0_112 and 1.8.0_131 * It happens on both kernel 4.9.43-17.38.amzn1.x86_64 and 3.14.35-28.38.amzn1.x86_64 h5. Possible workarounds/solutions (still to be validated) * switching to heap_buffers (in case offheap_buffers triggers the bug), though we still need to measure the performance degradation under that scenario. 
* removing secondary indexes in favour of Materialized Views for this specific case, though we are also concerned that MVs may introduce new issues in our current Cassandra 3.9 * Upgrading to 3.11.1 is an option, but we are keeping it as a last resort, given that the cost of migrating is big and we have no guarantee that new bugs affecting node availability would not be introduced.
[jira] [Updated] (CASSANDRA-13999) Segfault during memtable flush
[ https://issues.apache.org/jira/browse/CASSANDRA-13999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ricardo Bartolome updated CASSANDRA-13999: -- Description: We are getting segfaults on a production Cassandra cluster, apparently caused by Memtable flushes to disk. {code} Current thread (0x0cd77920): JavaThread "PerDiskMemtableFlushWriter_0:140" daemon [_thread_in_Java, id=28952, stack(0x7f8b7aa53000,0x7f8b7aa94000)] {code} Stack {code} Stack: [0x7f8b7aa53000,0x7f8b7aa94000], sp=0x7f8b7aa924a0, free space=253k Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) J 21889 C2 org.apache.cassandra.io.sstable.format.big.BigTableWriter.append(Lorg/apache/cassandra/db/rows/UnfilteredRowIterator;)Lorg/apache/cassandra/db/RowIndexEntry; (361 bytes) @ 0x7f8e9fcf75ac [0x7f8e9fcf42c0+0x32ec] J 22464 C2 org.apache.cassandra.db.Memtable$FlushRunnable.writeSortedContents()V (383 bytes) @ 0x7f8e9f17b988 [0x7f8e9f17b5c0+0x3c8] j org.apache.cassandra.db.Memtable$FlushRunnable.call()Lorg/apache/cassandra/io/sstable/SSTableMultiWriter;+1 j org.apache.cassandra.db.Memtable$FlushRunnable.call()Ljava/lang/Object;+1 J 18865 C2 java.util.concurrent.FutureTask.run()V (126 bytes) @ 0x7f8e9d3c9540 [0x7f8e9d3c93a0+0x1a0] J 21832 C2 java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V (225 bytes) @ 0x7f8e9f16856c [0x7f8e9f168400+0x16c] J 6720 C1 java.util.concurrent.ThreadPoolExecutor$Worker.run()V (9 bytes) @ 0x7f8e9def73c4 [0x7f8e9def72c0+0x104] J 22079 C2 java.lang.Thread.run()V (17 bytes) @ 0x7f8e9e67c4ac [0x7f8e9e67c460+0x4c] v ~StubRoutines::call_stub V [libjvm.so+0x691d16] JavaCalls::call_helper(JavaValue*, methodHandle*, JavaCallArguments*, Thread*)+0x1056 V [libjvm.so+0x692221] JavaCalls::call_virtual(JavaValue*, KlassHandle, Symbol*, Symbol*, JavaCallArguments*, Thread*)+0x321 V [libjvm.so+0x6926c7] JavaCalls::call_virtual(JavaValue*, Handle, KlassHandle, Symbol*, Symbol*, Thread*)+0x47 
V [libjvm.so+0x72da50] thread_entry(JavaThread*, Thread*)+0xa0 V [libjvm.so+0xa76833] JavaThread::thread_main_inner()+0x103 V [libjvm.so+0xa7697c] JavaThread::run()+0x11c V [libjvm.so+0x927568] java_start(Thread*)+0x108 C [libpthread.so.0+0x7de5] start_thread+0xc5 {code} h5. Reproducing the issue So far we have been unable to reproduce it. It happens once or twice a week on single nodes, during both high-load and low-load times. We have seen that when we replace EC2 instances and bootstrap new ones, due to compactions happening on source nodes before the stream starts, sometimes more than a single node was affected by this, leaving us with 2 out of 3 replicas down and UnavailableExceptions in the cluster. For further details, we attached: * JVM error file with all details * cassandra config file (we are using offheap_buffers as memtable_allocation_method) * some lines printed in debug.log when the JVM error file was created and the process died Specifics of our scenario: * Cassandra 3.9 on Amazon Linux (before this we ran Cassandra 2.0.9 and there is no record of this happening then, though I was not working on Cassandra at the time) * 12 x i3.2xlarge EC2 instances (8 cores, 64GB RAM) * a total of 176 keyspaces (there is a per-customer pattern) ** Some keyspaces have a single table, while others have 2 or 5 tables ** There is a table that uses standard Secondary Indexes ("emailindex" on the "user_info" table) * It happens on both Oracle JDK 1.8.0_112 and 1.8.0_131 * It happens on both kernel 4.9.43-17.38.amzn1.x86_64 and 3.14.35-28.38.amzn1.x86_64 This issue might be related to CASSANDRA-12590 (Segfault reading secondary index), even though this is the write path. Can someone confirm whether both issues could be related? On our side we are considering: * switching to heap_buffers (in case offheap_buffers triggers the bug), though we still need to measure the performance degradation under that scenario. 
* removing secondary indexes in favour of Materialized Views for this specific case, though we are also concerned that MVs may introduce new issues in our current Cassandra 3.9 * Upgrading to 3.11.1 is an option, but we are keeping it as a last resort, given that the cost of migrating is big and we have no guarantee that new bugs affecting node availability would not be introduced.
[jira] [Created] (CASSANDRA-13999) Segfault during memtable flush
Ricardo Bartolome created CASSANDRA-13999: - Summary: Segfault during memtable flush Key: CASSANDRA-13999 URL: https://issues.apache.org/jira/browse/CASSANDRA-13999 Project: Cassandra Issue Type: Bug Components: Local Write-Read Paths Reporter: Ricardo Bartolome Priority: Critical Attachments: cassandra-jvm-file-error-1509698372-pid16151.log.obfuscated, cassandra_config.yaml, node_crashing_debug.log We are getting segfaults on a production Cassandra cluster, apparently caused by Memtable flushes to disk. {code} Current thread (0x0cd77920): JavaThread "PerDiskMemtableFlushWriter_0:140" daemon [_thread_in_Java, id=28952, stack(0x7f8b7aa53000,0x7f8b7aa94000)] {code} Stack {code} Stack: [0x7f8b7aa53000,0x7f8b7aa94000], sp=0x7f8b7aa924a0, free space=253k Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) J 21889 C2 org.apache.cassandra.io.sstable.format.big.BigTableWriter.append(Lorg/apache/cassandra/db/rows/UnfilteredRowIterator;)Lorg/apache/cassandra/db/RowIndexEntry; (361 bytes) @ 0x7f8e9fcf75ac [0x7f8e9fcf42c0+0x32ec] J 22464 C2 org.apache.cassandra.db.Memtable$FlushRunnable.writeSortedContents()V (383 bytes) @ 0x7f8e9f17b988 [0x7f8e9f17b5c0+0x3c8] j org.apache.cassandra.db.Memtable$FlushRunnable.call()Lorg/apache/cassandra/io/sstable/SSTableMultiWriter;+1 j org.apache.cassandra.db.Memtable$FlushRunnable.call()Ljava/lang/Object;+1 J 18865 C2 java.util.concurrent.FutureTask.run()V (126 bytes) @ 0x7f8e9d3c9540 [0x7f8e9d3c93a0+0x1a0] J 21832 C2 java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V (225 bytes) @ 0x7f8e9f16856c [0x7f8e9f168400+0x16c] J 6720 C1 java.util.concurrent.ThreadPoolExecutor$Worker.run()V (9 bytes) @ 0x7f8e9def73c4 [0x7f8e9def72c0+0x104] J 22079 C2 java.lang.Thread.run()V (17 bytes) @ 0x7f8e9e67c4ac [0x7f8e9e67c460+0x4c] v ~StubRoutines::call_stub V [libjvm.so+0x691d16] JavaCalls::call_helper(JavaValue*, methodHandle*, JavaCallArguments*, Thread*)+0x1056 V 
[libjvm.so+0x692221] JavaCalls::call_virtual(JavaValue*, KlassHandle, Symbol*, Symbol*, JavaCallArguments*, Thread*)+0x321 V [libjvm.so+0x6926c7] JavaCalls::call_virtual(JavaValue*, Handle, KlassHandle, Symbol*, Symbol*, Thread*)+0x47 V [libjvm.so+0x72da50] thread_entry(JavaThread*, Thread*)+0xa0 V [libjvm.so+0xa76833] JavaThread::thread_main_inner()+0x103 V [libjvm.so+0xa7697c] JavaThread::run()+0x11c V [libjvm.so+0x927568] java_start(Thread*)+0x108 C [libpthread.so.0+0x7de5] start_thread+0xc5 {code} For further details, we attached: * JVM error file with all details * cassandra config file (we are using offheap_buffers as memtable_allocation_method) * some lines printed in debug.log when the JVM error file was created and the process died Specifics of our scenario: * Cassandra 3.9 on Amazon Linux (before this we ran Cassandra 2.0.9 and there is no record of this happening then, though I was not working on Cassandra at the time) * 12 x i3.2xlarge EC2 instances (8 cores, 64GB RAM) * a total of 176 keyspaces (there is a per-customer pattern) ** Some keyspaces have a single table, while others have 2 or 5 tables ** There is a table that uses standard Secondary Indexes ("emailindex" on the "user_info" table) * It happens on both Oracle JDK 1.8.0_112 and 1.8.0_131 * It happens on both kernel 4.9.43-17.38.amzn1.x86_64 and 3.14.35-28.38.amzn1.x86_64 * It happens once or twice a week on single nodes, during both high-load and low-load times. We have seen that when we replace EC2 instances and bootstrap new ones, due to compactions happening on source nodes before the stream starts, sometimes more than a single node was affected by this, leaving us with 2 out of 3 replicas down and UnavailableExceptions in the cluster. This issue might be related to CASSANDRA-12590 (Segfault reading secondary index), even though this is the write path. Can someone confirm whether both issues could be related? 
On our side we are considering: * switching to heap_buffers (in case offheap_buffers triggers the bug), though we still need to measure the performance degradation under that scenario. * removing secondary indexes in favour of Materialized Views for this specific case, though we are also concerned that MVs may introduce new issues in our current Cassandra 3.9 * Upgrading to 3.11.1 is an option, but we are keeping it as a last resort, given that the cost of migrating is big and we have no guarantee that new bugs affecting node availability would not be introduced. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apach
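For reference, the first candidate workaround above is a single cassandra.yaml change. A sketch only, to be validated against the performance concern the reporter notes:

```yaml
# cassandra.yaml (sketch of the proposed workaround, not a recommendation):
# switch memtable allocation off the off-heap path to test whether
# offheap_buffers is what triggers the crash.
# memtable_allocation_method: offheap_buffers   # current, suspected setting
memtable_allocation_method: heap_buffers
```

Performance under heap_buffers should be measured before rolling this out cluster-wide, as the reporter says.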
[jira] [Updated] (CASSANDRA-13992) Don't send new_metadata_id for conditional updates
[ https://issues.apache.org/jira/browse/CASSANDRA-13992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Greaves updated CASSANDRA-13992: - Status: Awaiting Feedback (was: Open) > Don't send new_metadata_id for conditional updates > -- > > Key: CASSANDRA-13992 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13992 > Project: Cassandra > Issue Type: Bug >Reporter: Olivier Michallat >Priority: Minor > > This is a follow-up to CASSANDRA-10786. > Given the table > {code} > CREATE TABLE foo (k int PRIMARY KEY) > {code} > And the prepared statement > {code} > INSERT INTO foo (k) VALUES (?) IF NOT EXISTS > {code} > The result set metadata changes depending on the outcome of the update: > * if the row didn't exist, there is only a single column \[applied] = true > * if it did, the result contains \[applied] = false, plus the current value > of column k. > The way this was handled so far is that the PREPARED response contains no > result set metadata, and therefore all EXECUTE messages have SKIP_METADATA = > false, and the responses always include the full (and correct) metadata. > CASSANDRA-10786 still sends the PREPARED response with no metadata, *but the > response to EXECUTE now contains a {{new_metadata_id}}*. The driver thinks it > is because of a schema change, and updates its local copy of the prepared > statement's result metadata. > The next EXECUTE is sent with SKIP_METADATA = true, but the server appears to > ignore that, and still sends the metadata in the response. So each response > includes the correct metadata, the driver uses it, and there is no visible > issue for client code. > The only drawback is that the driver updates its local copy of the metadata > unnecessarily, every time. We can work around that by only updating if we had > metadata before, at the cost of an extra volatile read. But I think the best > thing to do would be to never send a {{new_metadata_id}} in for a conditional > update. 
[jira] [Commented] (CASSANDRA-13992) Don't send new_metadata_id for conditional updates
[ https://issues.apache.org/jira/browse/CASSANDRA-13992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16241847#comment-16241847 ] Kurt Greaves commented on CASSANDRA-13992: -- bq. The next EXECUTE is sent with SKIP_METADATA = true, but the server appears to ignore that I believe this is because METADATA_CHANGED will take precedence. If C* thinks the metadata changed, it will set the METADATA_CHANGED flag and the driver is expected to update its metadata. TBH this isn't super clear from the spec, but it appears to be what the code achieves [here|https://github.com/apache/cassandra/blob/922dbdb658b1693973926026b213153d05b4077c/src/java/org/apache/cassandra/transport/messages/ExecuteMessage.java#L174]. I may have no idea what I'm talking about, but I think the simplest solution to bq. never send a new_metadata_id in for a conditional update. would be to simply always use the same digest for any LWT. I think the following patch achieves this without breaking anything, but I haven't confirmed whether it actually fixes the driver issue yet. It would be great if someone with more understanding of the protocol could have a glance and let me know whether this makes sense, or point me in the right direction. [trunk|https://github.com/apache/cassandra/compare/trunk...kgreav:13992-trunk] > Don't send new_metadata_id for conditional updates > -- > > Key: CASSANDRA-13992 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13992 > Project: Cassandra > Issue Type: Bug >Reporter: Olivier Michallat >Priority: Minor > > This is a follow-up to CASSANDRA-10786. > Given the table > {code} > CREATE TABLE foo (k int PRIMARY KEY) > {code} > And the prepared statement > {code} > INSERT INTO foo (k) VALUES (?) 
IF NOT EXISTS > {code} > The result set metadata changes depending on the outcome of the update: > * if the row didn't exist, there is only a single column \[applied] = true > * if it did, the result contains \[applied] = false, plus the current value > of column k. > The way this was handled so far is that the PREPARED response contains no > result set metadata, and therefore all EXECUTE messages have SKIP_METADATA = > false, and the responses always include the full (and correct) metadata. > CASSANDRA-10786 still sends the PREPARED response with no metadata, *but the > response to EXECUTE now contains a {{new_metadata_id}}*. The driver thinks it > is because of a schema change, and updates its local copy of the prepared > statement's result metadata. > The next EXECUTE is sent with SKIP_METADATA = true, but the server appears to > ignore that, and still sends the metadata in the response. So each response > includes the correct metadata, the driver uses it, and there is no visible > issue for client code. > The only drawback is that the driver updates its local copy of the metadata > unnecessarily, every time. We can work around that by only updating if we had > metadata before, at the cost of an extra volatile read. But I think the best > thing to do would be to never send a {{new_metadata_id}} in for a conditional > update.
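Kurt's "same digest for any LWT" proposal can be illustrated with a toy model. This is not Cassandra's actual result-metadata-id computation; the digest function, column names, and sentinel below are hypothetical simplifications of the idea:

```python
import hashlib

def metadata_id(column_specs):
    # Toy stand-in for the server-side result-metadata digest
    # (Cassandra hashes the serialized metadata; simplified here).
    return hashlib.md5("|".join(column_specs).encode()).hexdigest()

# A conditional INSERT ... IF NOT EXISTS returns a different result shape
# depending on the outcome, so a naive per-response digest flips values:
applied_cols = ["[applied]"]             # row did not exist
not_applied_cols = ["[applied]", "k"]    # row existed: current values returned

assert metadata_id(applied_cols) != metadata_id(not_applied_cols)
# every outcome flip therefore looks like a schema change to the driver

# Sketch of the fix: pin a single id for any LWT result set, so EXECUTE
# responses never advertise a new_metadata_id for conditional updates.
LWT_METADATA_ID = metadata_id(["<lwt>"])  # hypothetical sentinel

def response_metadata_id(is_conditional, column_specs):
    # One stable id for LWTs; the normal per-metadata digest otherwise.
    return LWT_METADATA_ID if is_conditional else metadata_id(column_specs)

assert (response_metadata_id(True, applied_cols)
        == response_metadata_id(True, not_applied_cols))
```

With a stable id, the driver's cached id always matches and METADATA_CHANGED is never set for LWTs, which is the behavior the ticket asks for.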
[jira] [Assigned] (CASSANDRA-13992) Don't send new_metadata_id for conditional updates
[ https://issues.apache.org/jira/browse/CASSANDRA-13992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kurt Greaves reassigned CASSANDRA-13992: Assignee: Kurt Greaves > Don't send new_metadata_id for conditional updates > -- > > Key: CASSANDRA-13992 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13992 > Project: Cassandra > Issue Type: Bug >Reporter: Olivier Michallat >Assignee: Kurt Greaves >Priority: Minor > > This is a follow-up to CASSANDRA-10786. > Given the table > {code} > CREATE TABLE foo (k int PRIMARY KEY) > {code} > And the prepared statement > {code} > INSERT INTO foo (k) VALUES (?) IF NOT EXISTS > {code} > The result set metadata changes depending on the outcome of the update: > * if the row didn't exist, there is only a single column \[applied] = true > * if it did, the result contains \[applied] = false, plus the current value > of column k. > The way this was handled so far is that the PREPARED response contains no > result set metadata, and therefore all EXECUTE messages have SKIP_METADATA = > false, and the responses always include the full (and correct) metadata. > CASSANDRA-10786 still sends the PREPARED response with no metadata, *but the > response to EXECUTE now contains a {{new_metadata_id}}*. The driver thinks it > is because of a schema change, and updates its local copy of the prepared > statement's result metadata. > The next EXECUTE is sent with SKIP_METADATA = true, but the server appears to > ignore that, and still sends the metadata in the response. So each response > includes the correct metadata, the driver uses it, and there is no visible > issue for client code. > The only drawback is that the driver updates its local copy of the metadata > unnecessarily, every time. We can work around that by only updating if we had > metadata before, at the cost of an extra volatile read. But I think the best > thing to do would be to never send a {{new_metadata_id}} in for a conditional > update. 
[jira] [Updated] (CASSANDRA-13403) nodetool repair breaks SASI index
[ https://issues.apache.org/jira/browse/CASSANDRA-13403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ludovic Boutros updated CASSANDRA-13403: Attachment: 3_nodes_compaction.log 4_nodes_compaction.log > nodetool repair breaks SASI index > - > > Key: CASSANDRA-13403 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13403 > Project: Cassandra > Issue Type: Bug > Components: sasi > Environment: 3.10 >Reporter: Igor Novgorodov >Assignee: Alex Petrov > Attachments: 3_nodes_compaction.log, 4_nodes_compaction.log > > > I've got table: > {code} > CREATE TABLE cservice.bulks_recipients ( > recipient text, > bulk_id uuid, > datetime_final timestamp, > datetime_sent timestamp, > request_id uuid, > status int, > PRIMARY KEY (recipient, bulk_id) > ) WITH CLUSTERING ORDER BY (bulk_id ASC) > AND bloom_filter_fp_chance = 0.01 > AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'} > AND comment = '' > AND compaction = {'class': > 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', > 'max_threshold': '32', 'min_threshold': '4'} > AND compression = {'chunk_length_in_kb': '64', 'class': > 'org.apache.cassandra.io.compress.LZ4Compressor'} > AND crc_check_chance = 1.0 > AND dclocal_read_repair_chance = 0.1 > AND default_time_to_live = 0 > AND gc_grace_seconds = 864000 > AND max_index_interval = 2048 > AND memtable_flush_period_in_ms = 0 > AND min_index_interval = 128 > AND read_repair_chance = 0.0 > AND speculative_retry = '99PERCENTILE'; > CREATE CUSTOM INDEX bulk_recipients_bulk_id ON cservice.bulks_recipients > (bulk_id) USING 'org.apache.cassandra.index.sasi.SASIIndex'; > {code} > There are 11 rows in it: > {code} > > select * from bulks_recipients; > ... > (11 rows) > {code} > Let's query by index (all rows have the same *bulk_id*): > {code} > > select * from bulks_recipients where bulk_id = > > baa94815-e276-4ca4-adda-5b9734e6c4a5; > > > ... > (11 rows) > {code} > Ok, everything is fine. 
> Now I'm doing *nodetool repair --partitioner-range --job-threads 4 --full* on > each node in cluster sequentially. > After it finished: > {code} > > select * from bulks_recipients where bulk_id = > > baa94815-e276-4ca4-adda-5b9734e6c4a5; > ... > (2 rows) > {code} > Only two rows. > While the rows are actually there: > {code} > > select * from bulks_recipients; > ... > (11 rows) > {code} > If I issue an incremental repair on a random node, I can get like 7 rows > after index query. > Dropping index and recreating it fixes the issue. Is it a bug or am I doing > the repair the wrong way?
[jira] [Commented] (CASSANDRA-13403) nodetool repair breaks SASI index
[ https://issues.apache.org/jira/browse/CASSANDRA-13403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16241829#comment-16241829 ] Ludovic Boutros commented on CASSANDRA-13403: - [~ifesdjeen], in order to reproduce, I'm using a small C* 3.10 cluster with at least 4 nodes and 256 tokens. Here is my keyspace declaration: {code:sql} CREATE KEYSPACE lubo_test WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': '3'} AND durable_writes = true; CREATE TABLE lubo_test.t_doc ( id text, r int, cid timeuuid, PRIMARY KEY (id, r, cid) ) WITH CLUSTERING ORDER BY (r DESC, cid DESC) AND bloom_filter_fp_chance = 0.01 AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'} AND comment = '' AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'} AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'} AND crc_check_chance = 1.0 AND dclocal_read_repair_chance = 0.1 AND default_time_to_live = 0 AND gc_grace_seconds = 864000 AND max_index_interval = 2048 AND memtable_flush_period_in_ms = 0 AND min_index_interval = 128 AND read_repair_chance = 0.0 AND speculative_retry = '99PERCENTILE'; CREATE CUSTOM INDEX i_doc ON lubo_test.t_doc (r) USING 'org.apache.cassandra.index.sasi.SASIIndex'; {code} I've added some docs: {code:sql} INSERT INTO lubo_test.t_doc ( id , r , cid ) VALUES ( '1', 0, now()); INSERT INTO lubo_test.t_doc ( id , r , cid ) VALUES ( '2', 0, now()); INSERT INTO lubo_test.t_doc ( id , r , cid ) VALUES ( '3', 0, now()); INSERT INTO lubo_test.t_doc ( id , r , cid ) VALUES ( '4', 0, now()); INSERT INTO lubo_test.t_doc ( id , r , cid ) VALUES ( '5', 0, now()); INSERT INTO lubo_test.t_doc ( id , r , cid ) VALUES ( '6', 0, now()); INSERT INTO lubo_test.t_doc ( id , r , cid ) VALUES ( '7', 0, now()); INSERT INTO lubo_test.t_doc ( id , r , cid ) VALUES ( '8', 0, now()); INSERT INTO lubo_test.t_doc 
( id , r , cid ) VALUES ( '9', 0, now()); INSERT INTO lubo_test.t_doc ( id , r , cid ) VALUES ( '10', 0, now()); INSERT INTO lubo_test.t_doc ( id , r , cid ) VALUES ( '11', 0, now()); {code} Then this query without repair : {code:sql} cassandra@cqlsh> SELECT * from lubo_test.t_doc where r = 0; id | r | cid +---+-- 6 | 0 | 66f68be0-c316-11e7-a464-03e2ed27ae86 7 | 0 | 66f74f30-c316-11e7-a464-03e2ed27ae86 9 | 0 | 66f97210-c316-11e7-a464-03e2ed27ae86 10 | 0 | 66faaa90-c316-11e7-a464-03e2ed27ae86 4 | 0 | 66f46900-c316-11e7-a464-03e2ed27ae86 3 | 0 | 66f37ea0-c316-11e7-a464-03e2ed27ae86 5 | 0 | 66f5a180-c316-11e7-a464-03e2ed27ae86 8 | 0 | 66f83990-c316-11e7-a464-03e2ed27ae86 2 | 0 | 66f29440-c316-11e7-a464-03e2ed27ae86 11 | 0 | 66fb6de0-c316-11e7-a464-03e2ed27ae86 1 | 0 | 66ea56e0-c316-11e7-a464-03e2ed27ae86 (11 rows) {code} If I fire a "nodetool repair --full", then I have : {code:sql} cassandra@cqlsh> SELECT * from lubo_test.t_doc where r = 0; id | r | cid +---+-- 6 | 0 | 66f68be0-c316-11e7-a464-03e2ed27ae86 7 | 0 | 66f74f30-c316-11e7-a464-03e2ed27ae86 10 | 0 | 66faaa90-c316-11e7-a464-03e2ed27ae86 4 | 0 | 66f46900-c316-11e7-a464-03e2ed27ae86 3 | 0 | 66f37ea0-c316-11e7-a464-03e2ed27ae86 5 | 0 | 66f5a180-c316-11e7-a464-03e2ed27ae86 2 | 0 | 66f29440-c316-11e7-a464-03e2ed27ae86 1 | 0 | 66ea56e0-c316-11e7-a464-03e2ed27ae86 (8 rows) {code} I can fire a "rebuild_index" on each node to fix the problem. I've checked the debug log differences between 3 and 4 nodes. It seems that with 3 nodes, the Anticompaction process is not done. You can see in the log "mutating repairedAt instead of anticompacting". Currently, I would bet that when anticompacting, the SASI Index is not rebuilt correctly, but that's just a bet. I've attached the log extractions, if you need more, just ask. 
> nodetool repair breaks SASI index > - > > Key: CASSANDRA-13403 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13403 > Project: Cassandra > Issue Type: Bug > Components: sasi > Environment: 3.10 >Reporter: Igor Novgorodov >Assignee: Alex Petrov > > I've got table: > {code} > CREATE TABLE cservice.bulks_recipients ( > recipient text, > bulk_id uuid, > datetime_final timestamp, > datetime_sent timestamp, > request_id uuid, > status int, > PRIMARY KEY (recipient, bulk_id) > ) WITH CLUSTERING ORDER BY (bulk_id ASC) > AND bloom_filter_fp_chance = 0.01 > AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'} > AND comment = '' > AND compaction = {'class': > 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', > 'max_threshold': '32',
[jira] [Assigned] (CASSANDRA-13986) Fix native protocol v5 spec for new_metadata_id position in Rows response
[ https://issues.apache.org/jira/browse/CASSANDRA-13986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Petrov reassigned CASSANDRA-13986: --- Assignee: Alex Petrov > Fix native protocol v5 spec for new_metadata_id position in Rows response > - > > Key: CASSANDRA-13986 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13986 > Project: Cassandra > Issue Type: Bug >Reporter: Olivier Michallat >Assignee: Alex Petrov >Priority: Trivial > > There's a mistake in the protocol specification for CASSANDRA-10786. In > `native_protocol_v5.spec`, section 4.2.5.2: > {code} > 4.2.5.2. Rows > Indicates a set of rows. The rest of the body of a Rows result is: > <metadata><rows_count><rows_content> > where: > - <metadata> is composed of: > <flags><columns_count>[<new_metadata_id>][<paging_state>][<global_table_spec>?<col_spec_1>...<col_spec_n>] > {code} > The last line should be: > {code} > <flags><columns_count>[<paging_state>][<new_metadata_id>][<global_table_spec>?<col_spec_1>...<col_spec_n>] > {code} > That is, if there is both a paging state and a new metadata id, the paging > state comes *first*, not second.
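The corrected ordering can be made concrete with a small encoder sketch. The flag values below follow the native protocol's Rows metadata flags, but treat them (and the helper itself) as illustrative assumptions rather than a reference implementation:

```python
import struct

HAS_MORE_PAGES   = 0x0002  # Rows metadata flag values (assumed from the spec)
METADATA_CHANGED = 0x0008  # v5-only flag

def encode_rows_metadata(flags, columns_count, paging_state=b"", new_metadata_id=b""):
    """Illustrative encoder for <flags><columns_count>[<paging_state>][<new_metadata_id>].

    Per the corrected spec text, the paging state ([bytes], int-length-prefixed)
    is serialized first, then the new metadata id ([short bytes])."""
    out = struct.pack(">ii", flags, columns_count)
    if flags & HAS_MORE_PAGES:
        out += struct.pack(">i", len(paging_state)) + paging_state
    if flags & METADATA_CHANGED:
        out += struct.pack(">h", len(new_metadata_id)) + new_metadata_id
    return out

buf = encode_rows_metadata(HAS_MORE_PAGES | METADATA_CHANGED, 3,
                           paging_state=b"PAGE", new_metadata_id=b"MDID")
assert buf.index(b"PAGE") < buf.index(b"MDID")  # paging state comes first
```

A decoder written against the uncorrected spec would read the two optional fields in the wrong order and misparse exactly the case where both flags are set, which is why the fix matters despite being "trivial".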
[jira] [Commented] (CASSANDRA-13948) Reload compaction strategies when JBOD disk boundary changes
[ https://issues.apache.org/jira/browse/CASSANDRA-13948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16241711#comment-16241711 ] Paulo Motta commented on CASSANDRA-13948: - bq. Ok I was able to reproduce it on one node. I've attached the trace log. It's unfiltered since I didn't manage to filter only to org.apache.cassandra.db.compaction I wasn't able to track down the root cause of this condition from the logs, but a similar issue was reported on CASSANDRA-12743, so I think this is some kind of race condition showing up due to the amount of concurrent compactions happening and is not a consequence of this fix, so I prefer to investigate this separately. If you still see this issue please feel free to reopen CASSANDRA-12743 with details. bq. However I'm still facing issues with compactions. These are big nodes with a big CF, holding many SSTables and pending compactions. According to the thread dump it seems to be stuck around getNextBackgroundTask. Compactions are still being processed for the other keyspace. Besides that the node is running normally. Some nodetool commands, like compactionstats, take time to complete. Debug log doesn't show any error. After having a look at the thread dump, it turns out that my previous patch generated lock contention between compaction and cleanup: each SSTable removed by cleanup generated an {{SSTableDeletingNotification}}, and my previous patch submitted a new compaction task after each received notification, which competed with the next {{SSTableDeletingNotification}} for the {{writeLock}}, making things slow overall. I updated the patch to only submit a new compaction after receiving a flush notification, as it was before, so this should be fixed now. [~llambiel] would you mind trying the latest version now? 
[~krummas] this should be ready for review now. The latest version already got a clean CI run, but I resubmitted a new internal CI run after doing the minor fix above and will update here when it's ready.

Summary of changes:
1) Reload compaction strategies when JBOD disk boundary changes ([commit|https://github.com/pauloricardomg/cassandra/commit/6cab7e0a31a638cc4a957c4ecfa592035d874058])
2) Ensure compaction strategies do not loop indefinitely when unable to acquire the Tracker lock ([commit|https://github.com/pauloricardomg/cassandra/commit/3ef833d1e56c25f67bc8a3b49acf97b2efdf401d])
3) Only enable compaction strategies after gossip settles, to prevent unnecessary relocation work ([commit|https://github.com/pauloricardomg/cassandra/commit/eaf63dc3d52566ce0c4f91bbfec478305597f014])
4) Do not reload compaction strategies when receiving notifications, and log a warning when an SSTable is added multiple times to LCS ([commit|https://github.com/pauloricardomg/cassandra/commit/3e61df70025e704ee0c9d6ee8754ccdd38f5ab6d])

Patches
* [3.11|https://github.com/pauloricardomg/cassandra/tree/3.11-13948]
* [trunk|https://github.com/pauloricardomg/cassandra/tree/trunk-13948]

I wonder if, now that CSM caches the disk boundaries, we can make the handling of notifications use the readLock instead of the writeLock, to reduce contention when there is a high number of concurrent compactors. Do you see any potential problems with this? Even if the notification handling races with {{getNextBackgroundTask}}, as long as the individual compaction strategies are synchronized, {{getNextBackgroundTask}} should get a consistent view of the strategy sstables when there is a concurrent notification from the tracker.
> Reload compaction strategies when JBOD disk boundary changes
>
>
> Key: CASSANDRA-13948
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13948
> Project: Cassandra
> Issue Type: Bug
> Components: Compaction
> Reporter: Paulo Motta
> Assignee: Paulo Motta
> Fix For: 3.11.x, 4.x
>
> Attachments: debug.log, threaddump.txt, trace.log
>
> The thread dump below shows a race between an sstable replacement by the
> {{IndexSummaryRedistribution}} and
> {{AbstractCompactionTask.getNextBackgroundTask}}:
> {noformat}
> Thread 94580: (state = BLOCKED)
>  - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may be imprecise)
>  - java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, line=175 (Compiled frame)
>  - java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt() @bci=1, line=836 (Compiled frame)
>  - java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(java.util.concurrent.locks.AbstractQueuedSynchronizer$Node, int) @bci=67, line=870 (Compiled frame)
>  - java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(int) @bci=17, line=1199 (Compiled frame)
>  - java.util.concurrent.locks.Reentrant
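The contention fix described in the comment above — submitting a background compaction check once per flush notification rather than once per deleted SSTable — can be modeled with a toy sketch. All names here are illustrative, not Cassandra's actual classes; the point is only the change in when the lock-guarded submission happens.

```python
import threading

class StrategyManager:
    """Toy model of a compaction strategy manager guarded by a lock
    (standing in for the writeLock discussed above)."""

    def __init__(self):
        self.lock = threading.Lock()
        self.submitted = 0

    def maybe_submit_background_task(self):
        # Every submission must take the lock, so the frequency of calls
        # here directly controls contention with other lock holders.
        with self.lock:
            self.submitted += 1

    def handle_notification(self, notification):
        # Old behaviour: every per-SSTable deletion notification from cleanup
        # triggered a submission, each competing for the lock.
        # Fixed behaviour: only a flush notification triggers a submission.
        if notification == "flush":
            self.maybe_submit_background_task()

mgr = StrategyManager()
for _ in range(1000):                     # cleanup removing 1000 sstables
    mgr.handle_notification("sstable_deleting")   # no lock traffic
mgr.handle_notification("flush")          # a single guarded submission
```

Under the old behaviour the loop above would have taken the lock a thousand times; with the fix it is taken once, which is why the nodes stopped appearing stuck around {{getNextBackgroundTask}}.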
[jira] [Commented] (CASSANDRA-10857) Allow dropping COMPACT STORAGE flag from tables in 3.X
[ https://issues.apache.org/jira/browse/CASSANDRA-10857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16241663#comment-16241663 ] ASF GitHub Bot commented on CASSANDRA-10857:

Github user ptnapoleon commented on a diff in the pull request:

https://github.com/apache/cassandra-dtest/pull/9#discussion_r149296242

--- Diff: cql_tests.py ---
@@ -698,6 +719,54 @@ def many_columns_test(self):
                         ",".join(map(lambda i: "c_{}".format(i), range(width))) +
                         " FROM very_wide_table",
                         [[i for i in range(width)]])
+
+    @since("3.11", max_version="3.X")
+    def drop_compact_storage_flag_test(self):
+        """
+        Test for CASSANDRA-10857, verifying the schema change
+        distribution across the other nodes.
+        """
+        cluster = self.cluster
+
+        cluster.populate(3).start()
+        node1 = cluster.nodelist()[0]
+        node2 = cluster.nodelist()[1]
+        node3 = cluster.nodelist()[2]
+        time.sleep(0.2)
+
+        session1 = self.patient_cql_connection(node1)
+        session2 = self.patient_cql_connection(node2)
+        session3 = self.patient_cql_connection(node3)
+        self.create_ks(session1, 'ks', 3)
+        sessions = [session1, session2, session3]
+
+        for session in sessions:
+            session.set_keyspace('ks')
+
+        session1.execute("""
+            CREATE TABLE test_drop_compact_storage (k int PRIMARY KEY, s1 int) WITH COMPACT STORAGE;
+        """)
+        time.sleep(1)
--- End diff --

No need for this sleep.

> Allow dropping COMPACT STORAGE flag from tables in 3.X
> --
>
> Key: CASSANDRA-10857
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10857
> Project: Cassandra
> Issue Type: Improvement
> Components: CQL, Distributed Metadata
> Reporter: Aleksey Yeschenko
> Assignee: Alex Petrov
> Priority: Blocker
> Labels: client-impacting
> Fix For: 4.0, 3.0.x, 3.11.x
>
> Thrift allows users to define flexible mixed column families - where certain
> columns would have explicitly pre-defined names, potentially non-default
> validation types, and be indexed.
> Example:
> {code}
> create column family foo
>   and default_validation_class = UTF8Type
>   and column_metadata = [
>     {column_name: bar, validation_class: Int32Type, index_type: KEYS},
>     {column_name: baz, validation_class: UUIDType, index_type: KEYS}
>   ];
> {code}
> Columns named {{bar}} and {{baz}} will be validated as {{Int32Type}} and
> {{UUIDType}}, respectively, and be indexed. Columns with any other name will
> be validated by {{UTF8Type}} and will not be indexed.
> With CASSANDRA-8099, {{bar}} and {{baz}} would be mapped to static columns
> internally. However, being {{WITH COMPACT STORAGE}}, the table will only
> expose {{bar}} and {{baz}} columns. Accessing any dynamic columns (any column
> not named {{bar}} and {{baz}}) right now requires going through Thrift.
> This is blocking Thrift -> CQL migration for users who have mixed
> dynamic/static column families. That said, it *shouldn't* be hard to allow
> users to drop the {{compact}} flag to expose the table as it is internally
> now, and be able to access all columns.
[jira] [Commented] (CASSANDRA-10857) Allow dropping COMPACT STORAGE flag from tables in 3.X
[ https://issues.apache.org/jira/browse/CASSANDRA-10857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16241661#comment-16241661 ] ASF GitHub Bot commented on CASSANDRA-10857:

Github user ptnapoleon commented on a diff in the pull request:

https://github.com/apache/cassandra-dtest/pull/9#discussion_r149296195

--- Diff: cql_tests.py ---
@@ -698,6 +719,54 @@ def many_columns_test(self):
                         ",".join(map(lambda i: "c_{}".format(i), range(width))) +
                         " FROM very_wide_table",
                         [[i for i in range(width)]])
+
+    @since("3.11", max_version="3.X")
+    def drop_compact_storage_flag_test(self):
+        """
+        Test for CASSANDRA-10857, verifying the schema change
+        distribution across the other nodes.
+        """
+        cluster = self.cluster
+
+        cluster.populate(3).start()
+        node1 = cluster.nodelist()[0]
--- End diff --

Its much more concise to just write `node1, node2, node3 = cluster.nodelist()`
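The reviewer's suggestion above relies on Python's iterable unpacking. A minimal standalone illustration, where `FakeCluster` is a hypothetical stand-in for the ccm cluster object used by the dtests:

```python
class FakeCluster:
    # Hypothetical stand-in for ccm's cluster object; nodelist() returns
    # the cluster's nodes in order, as the real dtest harness does.
    def nodelist(self):
        return ["node1", "node2", "node3"]

cluster = FakeCluster()

# Instead of three separate indexed lookups...
node1 = cluster.nodelist()[0]
node2 = cluster.nodelist()[1]
node3 = cluster.nodelist()[2]

# ...unpack the list in one statement, as the review suggests:
node1, node2, node3 = cluster.nodelist()
```

Besides being shorter, the unpacking form calls `nodelist()` once and fails loudly if the cluster size ever differs from three.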
[jira] [Commented] (CASSANDRA-10857) Allow dropping COMPACT STORAGE flag from tables in 3.X
[ https://issues.apache.org/jira/browse/CASSANDRA-10857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16241662#comment-16241662 ] ASF GitHub Bot commented on CASSANDRA-10857:

Github user ptnapoleon commented on a diff in the pull request:

https://github.com/apache/cassandra-dtest/pull/9#discussion_r149296143

--- Diff: cql_tests.py ---
@@ -698,6 +719,54 @@ def many_columns_test(self):
                         ",".join(map(lambda i: "c_{}".format(i), range(width))) +
                         " FROM very_wide_table",
                         [[i for i in range(width)]])
+
+    @since("3.11", max_version="3.X")
+    def drop_compact_storage_flag_test(self):
+        """
+        Test for CASSANDRA-10857, verifying the schema change
+        distribution across the other nodes.
+        """
+        cluster = self.cluster
+
+        cluster.populate(3).start()
+        node1 = cluster.nodelist()[0]
+        node2 = cluster.nodelist()[1]
+        node3 = cluster.nodelist()[2]
+        time.sleep(0.2)
--- End diff --

There's no need for this sleep.