[jira] [Commented] (CASSANDRA-4914) Aggregation functions in CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-4914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323716#comment-14323716 ]

Anton Slutsky commented on CASSANDRA-4914:
------------------------------------------

I think it may not be all that complicated, at least in some cases. If we consider the avg function, for example, any record in the result set of interest has a non-zero probability of being exactly the average value, kind of by definition :-), and nothing prevents us from grabbing the very first record and looking at it from that point of view. The key here is, of course, to figure out what that non-zero probability is, but that can also be approximated with some accuracy by sampling a little bit beyond the first record. If we are smart about how we sample, and if we have an idea of how big the actual result set is, a reasonably close approximation of the average value can be achieved, and the probability of it being the true average can be computed with common techniques. Along the same lines, "sum" can be thought of as an integral over the shape approximated by the avg, which can also be approximated with some probability of being correct.

Of course, there are many problems with the above from the statistical point of view. For one, result sets are often ordered in some way, so sampling cannot be assumed to be random, which is not good.

Anyway, I don't know if this is the right use case, but I really need aggregate functions for what I'm trying to do, and right now I have to fire up a Hadoop cluster to get simple aggregates computed, which is a major pain and takes forever. I'll give it a shot in my own code and see if I can come up with a reasonable approach. Perhaps others will see this discussion and suggest some ideas.

Thanks,
Anton

> Aggregation functions in CQL
> ----------------------------
>
>                 Key: CASSANDRA-4914
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4914
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Vijay
>            Assignee: Benjamin Lerer
>              Labels: cql, docs
>             Fix For: 3.0
>
>         Attachments: CASSANDRA-4914-V2.txt, CASSANDRA-4914-V3.txt, CASSANDRA-4914-V4.txt, CASSANDRA-4914-V5.txt, CASSANDRA-4914.txt
>
>
> The requirement is to do aggregation of data in Cassandra (wide row of column
> values of int, double, float etc).
> With some basic aggregate functions like AVG, SUM, Mean, Min, Max, etc (for
> the columns within a row).
> Example:
> SELECT * FROM emp WHERE empID IN (130) ORDER BY deptID DESC;
>
>  empid | deptid | first_name | last_name | salary
> -------+--------+------------+-----------+--------
>    130 |      3 | joe        | doe       |   10.1
>    130 |      2 | joe        | doe       |    100
>    130 |      1 | joe        | doe       |  1e+03
>
> SELECT sum(salary), empid FROM emp WHERE empID IN (130);
>
>  sum(salary) | empid
> -------------+-------
>       1110.1 |   130

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
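[Editorial note] As a reading aid for the estimator sketched in the comment above (this is not anything posted on the ticket): a minimal Java illustration of a sampled average plus a confidence figure, assuming a normal approximation of the sampling error. All class and method names here are invented for illustration.

{code}
import java.util.List;

// Illustration only: estimate the average of a large result set from a sample, and
// report the probability (under a normal approximation) that the true average lies
// within +/- 5% of the estimate. Assumes n > 1 and a rough count of total rows.
public final class SampledAverage
{
    public static double[] estimate(List<Double> sample, long totalRows)
    {
        int n = sample.size();
        double sum = 0, sumSq = 0;
        for (double v : sample)
        {
            sum += v;
            sumSq += v * v;
        }
        double mean = sum / n;
        double variance = (sumSq - n * mean * mean) / (n - 1);
        // standard error of the sample mean, with a finite-population correction
        double stdErr = Math.sqrt(variance / n) * Math.sqrt(1.0 - (double) n / totalRows);
        double halfWidth = Math.abs(mean) * 0.05;
        double prob = erf(halfWidth / (stdErr * Math.sqrt(2)));
        return new double[] { mean, prob };
    }

    // Abramowitz-Stegun approximation of erf (x >= 0); adequate for an illustration.
    private static double erf(double x)
    {
        double t = 1 / (1 + 0.3275911 * x);
        double poly = ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t - 0.284496736) * t + 0.254829592) * t;
        return 1 - poly * Math.exp(-x * x);
    }
}
{code}

Anything heavier (stratified sampling, correcting for clustering order) would sit on top of this; the point is only that a sample plus a stated error bound is cheap compared to a full scan.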
[jira] [Commented] (CASSANDRA-8813) MapType.compose throws java.lang.IllegalArgumentException: null when either of the key or value pair in map type object is of type int (Int32Type)
[ https://issues.apache.org/jira/browse/CASSANDRA-8813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323693#comment-14323693 ]

Chhavi Gangwal commented on CASSANDRA-8813:
-------------------------------------------

The map is the "comments" column, and I am storing as well as retrieving int values. The map is stored by executing a query directly via the CQL client. The issue occurs when I try to retrieve a map containing an int value. My code looks something like this:

{code}
ByteBuffer valueByteBuffer = ByteBuffer.wrap((byte[]) columnValue);
MapType mapType = MapType.getInstance((AbstractType) keyClassInstance, (AbstractType) valueClassInstance);
Map rawMap = new HashMap();
rawMap.putAll((Map) mapType.compose(valueByteBuffer));
{code}

> MapType.compose throws java.lang.IllegalArgumentException: null when either
> of the key or value pair in map type object is of type int (Int32Type)
> -----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-8813
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8813
>             Project: Cassandra
>          Issue Type: Bug
>          Components: API, Drivers (now out of tree)
>            Reporter: Chhavi Gangwal
>             Fix For: 2.1.4
>
>
> {code}java.lang.IllegalArgumentException: null
>     at java.nio.Buffer.limit(Buffer.java:267) ~[na:1.7.0]
>     at org.apache.cassandra.utils.ByteBufferUtil.readBytes(ByteBufferUtil.java:543) ~[cassandra-all-2.1.2.jar:2.1.2]
>     at org.apache.cassandra.serializers.CollectionSerializer.readValue(CollectionSerializer.java:122) ~[cassandra-all-2.1.2.jar:2.1.2]
>     at org.apache.cassandra.serializers.MapSerializer.deserializeForNativeProtocol(MapSerializer.java:99) ~[cassandra-all-2.1.2.jar:2.1.2]
>     at org.apache.cassandra.serializers.MapSerializer.deserializeForNativeProtocol(MapSerializer.java:28) ~[cassandra-all-2.1.2.jar:2.1.2]
>     at org.apache.cassandra.serializers.CollectionSerializer.deserialize(CollectionSerializer.java:48) ~[cassandra-all-2.1.2.jar:2.1.2]
>     at org.apache.cassandra.db.marshal.AbstractType.compose(AbstractType.java:66) ~[cassandra-all-2.1.2.jar:2.1.2]
> {code}
> The issue mainly occurs due to the forced readBytes call in
> CollectionSerializer, which uses version 3 for all collection types as well as UDTs

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
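[Editorial note] If the hard-coded serialization version really is the culprit (as the issue description suggests), one thing worth trying is to bypass AbstractType.compose() and call the map serializer with the protocol version the value was actually written in. This is only a sketch against the 2.1.2 classes named in the stack trace; the exact method signatures may differ between 2.1.x versions, and the version number below is an assumption, not a confirmed fix.

{code}
import java.nio.ByteBuffer;
import java.util.Map;

import org.apache.cassandra.db.marshal.AbstractType;
import org.apache.cassandra.db.marshal.MapType;
import org.apache.cassandra.serializers.MapSerializer;

public final class MapDecodeSketch
{
    // Sketch only: decode with an explicit protocol version instead of compose(),
    // which (per the stack trace above) pins version 3 inside CollectionSerializer.
    @SuppressWarnings({ "rawtypes", "unchecked" })
    public static Map decode(byte[] columnValue, AbstractType keyType, AbstractType valueType)
    {
        ByteBuffer value = ByteBuffer.wrap(columnValue);
        MapType mapType = MapType.getInstance(keyType, valueType);   // mirrors the call in the comment above
        MapSerializer serializer = (MapSerializer) mapType.getSerializer();

        int assumedVersion = 2; // assumption: the value was encoded in the pre-v3 collection format
        return serializer.deserializeForNativeProtocol(value.duplicate(), assumedVersion);
    }
}
{code}

If this produces a sane map while compose() throws, it narrows the problem to version handling rather than to the int (Int32Type) marshalling itself.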
[jira] [Commented] (CASSANDRA-4914) Aggregation functions in CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-4914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323676#comment-14323676 ] Jonathan Ellis commented on CASSANDRA-4914: --- That's really a different use case. It could be an interesting one, but I'm not sure the infrastructure is there yet to support it. > Aggregation functions in CQL > > > Key: CASSANDRA-4914 > URL: https://issues.apache.org/jira/browse/CASSANDRA-4914 > Project: Cassandra > Issue Type: New Feature >Reporter: Vijay >Assignee: Benjamin Lerer > Labels: cql, docs > Fix For: 3.0 > > Attachments: CASSANDRA-4914-V2.txt, CASSANDRA-4914-V3.txt, > CASSANDRA-4914-V4.txt, CASSANDRA-4914-V5.txt, CASSANDRA-4914.txt > > > The requirement is to do aggregation of data in Cassandra (Wide row of column > values of int, double, float etc). > With some basic agree gate functions like AVG, SUM, Mean, Min, Max, etc (for > the columns within a row). > Example: > SELECT * FROM emp WHERE empID IN (130) ORDER BY deptID DESC; > > empid | deptid | first_name | last_name | salary > ---+++---+ >130 | 3 | joe| doe | 10.1 >130 | 2 | joe| doe |100 >130 | 1 | joe| doe | 1e+03 > > SELECT sum(salary), empid FROM emp WHERE empID IN (130); > > sum(salary) | empid > -+ >1110.1| 130 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-4914) Aggregation functions in CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-4914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323669#comment-14323669 ]

Anton Slutsky commented on CASSANDRA-4914:
------------------------------------------

Hello all,

I noticed that some of the aggregate functions discussed on this thread made it into the trunk. I'm a little concerned with the implementation. It looks like aggregates such as sum, avg, etc. are implemented by basically looping through the result set pages and computing the desired aggregates in code. I'm worried that, since Cassandra is meant for large volumes of data, this is not at all a feasible implementation for real-world cases. I tried using avg on a more or less sizable dataset and observed two things: first, my select statement would time out even with a bumped-up read timeout setting, and second, the CPU running the average computation was quite busy.

Obviously, there's only so much that can be done in terms of computing these aggregates without resorting to some sort of distributed computation framework, but I'd like to suggest a slightly different approach. I wonder if we can just rethink how we think about aggregate functions in the context of large data. Perhaps what we could do is consider probabilistic aggregates instead of raw computed ones? That is, instead of striving to compute an aggregate on an entire result set, maybe we can compute the aggregate with a stated probability of that aggregate being true. For example:

select probabilistic_avg(my_col) from my_table;

would return something like a map:

{"avg":101.1, "prob":0.78}

where "avg" is our probabilistic avg and "prob" is the probability of it being what we say it is. Of course, that won't be as good as the real thing, but it still has value in many cases, I think. And it can be implemented in a scalable way with some scratch system tables. I'm happy to give it a stab if this is of interest to anyone.

> Aggregation functions in CQL
> ----------------------------
>
>                 Key: CASSANDRA-4914
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4914
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Vijay
>            Assignee: Benjamin Lerer
>              Labels: cql, docs
>             Fix For: 3.0
>
>         Attachments: CASSANDRA-4914-V2.txt, CASSANDRA-4914-V3.txt, CASSANDRA-4914-V4.txt, CASSANDRA-4914-V5.txt, CASSANDRA-4914.txt
>
>
> The requirement is to do aggregation of data in Cassandra (wide row of column
> values of int, double, float etc).
> With some basic aggregate functions like AVG, SUM, Mean, Min, Max, etc (for
> the columns within a row).
> Example:
> SELECT * FROM emp WHERE empID IN (130) ORDER BY deptID DESC;
>
>  empid | deptid | first_name | last_name | salary
> -------+--------+------------+-----------+--------
>    130 |      3 | joe        | doe       |   10.1
>    130 |      2 | joe        | doe       |    100
>    130 |      1 | joe        | doe       |  1e+03
>
> SELECT sum(salary), empid FROM emp WHERE empID IN (130);
>
>  sum(salary) | empid
> -------------+-------
>       1110.1 |   130

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
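[Editorial note] For readers following the concern above: the built-in aggregates behave, in effect, like the fold below, with every page of the result pulled to the coordinator and reduced there. This is an illustrative reduction of the behaviour being criticised, not the actual trunk code.

{code}
import java.util.Iterator;

public final class CoordinatorAverageSketch
{
    // Illustration only: O(rows) work and network transfer, all funnelled through
    // one node, which is why a large partition times out and keeps a single CPU busy.
    public static double average(Iterator<Double> allRowsAcrossAllPages)
    {
        double sum = 0;
        long count = 0;
        while (allRowsAcrossAllPages.hasNext())
        {
            sum += allRowsAcrossAllPages.next();
            count++;
        }
        return count == 0 ? Double.NaN : sum / count;
    }
}
{code}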
[jira] [Updated] (CASSANDRA-8488) Filter by UDF
[ https://issues.apache.org/jira/browse/CASSANDRA-8488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-8488: -- Description: Allow user-defined functions in WHERE clause with ALLOW FILTERING. (was: Allow UDF in WHERE clause with ALLOW FILTERING.) > Filter by UDF > - > > Key: CASSANDRA-8488 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8488 > Project: Cassandra > Issue Type: New Feature >Reporter: Jonathan Ellis > Labels: client-impacting, cql, udf > > Allow user-defined functions in WHERE clause with ALLOW FILTERING. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Issue Comment Deleted] (CASSANDRA-6434) Repair-aware gc grace period
[ https://issues.apache.org/jira/browse/CASSANDRA-6434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-6434: -- Comment: was deleted (was: Marking patch available; [~blambov] to review) > Repair-aware gc grace period > - > > Key: CASSANDRA-6434 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6434 > Project: Cassandra > Issue Type: New Feature > Components: Core >Reporter: sankalp kohli >Assignee: Marcus Eriksson > Fix For: 3.0 > > > Since the reason for gcgs is to ensure that we don't purge tombstones until > every replica has been notified, it's redundant in a world where we're > tracking repair times per sstable (and repairing frequentily), i.e., a world > where we default to incremental repair a la CASSANDRA-5351. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8810) incorrect indexing of list collection
[ https://issues.apache.org/jira/browse/CASSANDRA-8810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-8810: --- Description: in a table with one indexed field type list ,data retrieval is not working properly: I have a simple table with an indexed list field, but it shows unexpected behavior when I query the list. {code} create table test (whole text PRIMARY KEY, parts list); create index on test (parts); insert into test (whole,parts) values('a', ['a']); insert into test (whole,parts) values('b', ['b']); insert into test (whole,parts) values('c', ['c']); insert into test (whole,parts) values('a.a', ['a','a']); insert into test (whole,parts) values('a.b', ['a','b']); insert into test (whole,parts) values('a.c', ['a','c']); insert into test (whole,parts) values('b.a', ['b','a']); insert into test (whole,parts) values('b.b', ['b','b']); insert into test (whole,parts) values('b.c', ['b','c']); insert into test (whole,parts) values('c.c', ['c','c']); insert into test (whole,parts) values('c.b', ['c','b']); insert into test (whole,parts) values('c.a', ['c','a']);{code} This is expected behavior: -- select * from test where parts contains 'a' ALLOW FILTERING; whole | parts ---+ a | ['a'] b.a | ['b', 'a'] a.c | ['a', 'c'] a.b | ['a', 'b'] a.a | ['a', 'a'] c.a | ['c', 'a'] >From the following query I expect a subset of the previous query result, but >it returns no data --- select * from test where parts contains 'a' and parts contains 'b' ALLOW FILTERING; whole | parts ---+--- was: in a table with one indexed field type list ,data retrieval is not working properly: I have a simple table with an indexed list field, but it shows unexpected behavior when I query the list. create table test (whole text PRIMARY KEY, parts list); create index on test (parts); insert into test (whole,parts) values('a', ['a']); insert into test (whole,parts) values('b', ['b']); insert into test (whole,parts) values('c', ['c']); insert into test (whole,parts) values('a.a', ['a','a']); insert into test (whole,parts) values('a.b', ['a','b']); insert into test (whole,parts) values('a.c', ['a','c']); insert into test (whole,parts) values('b.a', ['b','a']); insert into test (whole,parts) values('b.b', ['b','b']); insert into test (whole,parts) values('b.c', ['b','c']); insert into test (whole,parts) values('c.c', ['c','c']); insert into test (whole,parts) values('c.b', ['c','b']); insert into test (whole,parts) values('c.a', ['c','a']); This is expected behavior: -- select * from test where parts contains 'a' ALLOW FILTERING; whole | parts ---+ a | ['a'] b.a | ['b', 'a'] a.c | ['a', 'c'] a.b | ['a', 'b'] a.a | ['a', 'a'] c.a | ['c', 'a'] >From the following query I expect a subset of the previous query result, but >it returns no data --- select * from test where parts contains 'a' and parts contains 'b' ALLOW FILTERING; whole | parts ---+--- > incorrect indexing of list collection > - > > Key: CASSANDRA-8810 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8810 > Project: Cassandra > Issue Type: Bug > Components: Core > Environment: windows 8 >Reporter: 007reader > Fix For: 2.1.4 > > > in a table with one indexed field type list ,data retrieval is not > working properly: > I have a simple table with an indexed list field, but it shows > unexpected behavior when I query the list. 
> {code} > create table test (whole text PRIMARY KEY, parts list); > create index on test (parts); > insert into test (whole,parts) values('a', ['a']); > insert into test (whole,parts) values('b', ['b']); > insert into test (whole,parts) values('c', ['c']); > insert into test (whole,parts) values('a.a', ['a','a']); > insert into test (whole,parts) values('a.b', ['a','b']); > insert into test (whole,parts) values('a.c', ['a','c']); > insert into test (whole,parts) values('b.a', ['b','a']); > insert into test (whole,parts) values('b.b', ['b','b']); > insert into test (whole,parts) values('b.c', ['b','c']); > insert into test (whole,parts) values('c.c', ['c','c']); > insert into test (whole,parts) values('c.b', ['c','b']); > insert into test (whole,parts) values('c.a', ['c','a']);{code} > This is expected behavior: > -- > select * from test where parts contains 'a' ALLOW FILTERING; > whole | parts > ---+ > a | ['a'] >b.a | ['b', 'a'] >a.c | ['a', 'c'] >a.b | ['a', 'b'] >a.a | ['a', 'a'] >c
[jira] [Updated] (CASSANDRA-8810) incorrect indexing of list collection
[ https://issues.apache.org/jira/browse/CASSANDRA-8810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-8810: --- Fix Version/s: (was: 2.1.2) 2.1.4 > incorrect indexing of list collection > - > > Key: CASSANDRA-8810 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8810 > Project: Cassandra > Issue Type: Bug > Components: Core > Environment: windows 8 >Reporter: 007reader > Fix For: 2.1.4 > > > in a table with one indexed field type list ,data retrieval is not > working properly: > I have a simple table with an indexed list field, but it shows > unexpected behavior when I query the list. > > create table test (whole text PRIMARY KEY, parts list); > create index on test (parts); > insert into test (whole,parts) values('a', ['a']); > insert into test (whole,parts) values('b', ['b']); > insert into test (whole,parts) values('c', ['c']); > insert into test (whole,parts) values('a.a', ['a','a']); > insert into test (whole,parts) values('a.b', ['a','b']); > insert into test (whole,parts) values('a.c', ['a','c']); > insert into test (whole,parts) values('b.a', ['b','a']); > insert into test (whole,parts) values('b.b', ['b','b']); > insert into test (whole,parts) values('b.c', ['b','c']); > insert into test (whole,parts) values('c.c', ['c','c']); > insert into test (whole,parts) values('c.b', ['c','b']); > insert into test (whole,parts) values('c.a', ['c','a']); > This is expected behavior: > -- > select * from test where parts contains 'a' ALLOW FILTERING; > whole | parts > ---+ > a | ['a'] >b.a | ['b', 'a'] >a.c | ['a', 'c'] >a.b | ['a', 'b'] >a.a | ['a', 'a'] >c.a | ['c', 'a'] > From the following query I expect a subset of the previous query result, but > it returns no data > --- > select * from test where parts contains 'a' and parts contains 'b' ALLOW > FILTERING; > whole | parts > ---+--- -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8813) MapType.compose throws java.lang.IllegalArgumentException: null when either of the key or value pair in map type object is of type int (Int32Type)
[ https://issues.apache.org/jira/browse/CASSANDRA-8813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323547#comment-14323547 ] Philip Thompson commented on CASSANDRA-8813: What is the schema for the map column? It's unclear if it is defined as int, or if you are passing in an int. What driver are you using? This is happening when reading, not writing? Could you attach reproduction steps if you have them? > MapType.compose throws java.lang.IllegalArgumentException: null when either > of the key or value pair in map type object is of type int (Int32Type) > -- > > Key: CASSANDRA-8813 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8813 > Project: Cassandra > Issue Type: Bug > Components: API, Drivers (now out of tree) >Reporter: Chhavi Gangwal > Fix For: 2.1.4 > > > {code}java.lang.IllegalArgumentException: null at > java.nio.Buffer.limit(Buffer.java:267) ~[na:1.7.0] at > org.apache.cassandra.utils.ByteBufferUtil.readBytes(ByteBufferUtil.java:543) > ~[cassandra-all-2.1.2.jar:2.1.2]at > org.apache.cassandra.serializers.CollectionSerializer.readValue(CollectionSerializer.java:122) > ~[cassandra-all-2.1.2.jar:2.1.2] at > org.apache.cassandra.serializers.MapSerializer.deserializeForNativeProtocol(MapSerializer.java:99) > ~[cassandra-all-2.1.2.jar:2.1.2] at > org.apache.cassandra.serializers.MapSerializer.deserializeForNativeProtocol(MapSerializer.java:28) > ~[cassandra-all-2.1.2.jar:2.1.2] at > org.apache.cassandra.serializers.CollectionSerializer.deserialize(CollectionSerializer.java:48) > ~[cassandra-all-2.1.2.jar:2.1.2] at > org.apache.cassandra.db.marshal.AbstractType.compose(AbstractType.java:66) > ~[cassandra-all-2.1.2.jar:2.1.2] > {code} > The issue mainly occurs due to forced readBytes function in > CollectionSerializer with version 3 for all collection types as well as UDT -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8812) JVM Crashes on Windows x86
[ https://issues.apache.org/jira/browse/CASSANDRA-8812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-8812: --- Assignee: Joshua McKenzie > JVM Crashes on Windows x86 > -- > > Key: CASSANDRA-8812 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8812 > Project: Cassandra > Issue Type: Bug > Environment: Windows 7 running x86(32-bit) Oracle JDK 1.8.0_u31 >Reporter: Amichai Rothman >Assignee: Joshua McKenzie > Attachments: crashtest.tgz > > > Under Windows (32 or 64 bit) with the 32-bit Oracle JDK, the JVM may crash > due to EXCEPTION_ACCESS_VIOLATION. This happens inconsistently. The attached > test project can recreate the crash - sometimes it works successfully, > sometimes there's a Java exception in the log, and sometimes the hotspot JVM > crash shows up (regardless of whether the JUnit test results in success - you > can ignore that). Run it a bunch of times to see the various outcomes. It > also contains a sample hotspot error log. > Note that both when the Java exception is thrown and when the JVM crashes, > the stack trace is almost the same - they both eventually occur when the > PERIODIC-COMMIT-LOG-SYNCER thread calls CommitLogSegment.sync and accesses > the buffer (MappedByteBuffer): if it happens to be in buffer.force(), then > the Java exception is thrown, and if it's in one of the buffer.put() calls > before it, then the JVM crashes. This possibly exposes a JVM bug as well in > this case. So it basically looks like a race condition which results in the > buffer sometimes being used after it is no longer valid. > I recreated this on a PC with Windows 7 64-bit running the 32-bit Oracle JDK, > as well as on a modern.ie virtualbox image of Windows 7 32-bit running the > JDK, and it happens both with JDK 7 and JDK 8. Also defining an explicit > dependency on cassandra 2.1.2 (as opposed to the cassandra-unit dependency on > 2.1.0) doesn't make a difference. At some point in my testing I've also seen > a Java-level exception on Linux, but I can't recreate it at the moment with > this test project, so I can't guarantee it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8813) MapType.compose throws java.lang.IllegalArgumentException: null when either of the key or value pair in map type object is of type int (Int32Type)
[ https://issues.apache.org/jira/browse/CASSANDRA-8813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-8813: --- Fix Version/s: 2.1.4 > MapType.compose throws java.lang.IllegalArgumentException: null when either > of the key or value pair in map type object is of type int (Int32Type) > -- > > Key: CASSANDRA-8813 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8813 > Project: Cassandra > Issue Type: Bug > Components: API, Drivers (now out of tree) >Reporter: Chhavi Gangwal > Fix For: 2.1.4 > > > {code}java.lang.IllegalArgumentException: null at > java.nio.Buffer.limit(Buffer.java:267) ~[na:1.7.0] at > org.apache.cassandra.utils.ByteBufferUtil.readBytes(ByteBufferUtil.java:543) > ~[cassandra-all-2.1.2.jar:2.1.2]at > org.apache.cassandra.serializers.CollectionSerializer.readValue(CollectionSerializer.java:122) > ~[cassandra-all-2.1.2.jar:2.1.2] at > org.apache.cassandra.serializers.MapSerializer.deserializeForNativeProtocol(MapSerializer.java:99) > ~[cassandra-all-2.1.2.jar:2.1.2] at > org.apache.cassandra.serializers.MapSerializer.deserializeForNativeProtocol(MapSerializer.java:28) > ~[cassandra-all-2.1.2.jar:2.1.2] at > org.apache.cassandra.serializers.CollectionSerializer.deserialize(CollectionSerializer.java:48) > ~[cassandra-all-2.1.2.jar:2.1.2] at > org.apache.cassandra.db.marshal.AbstractType.compose(AbstractType.java:66) > ~[cassandra-all-2.1.2.jar:2.1.2] > {code} > The issue mainly occurs due to forced readBytes function in > CollectionSerializer with version 3 for all collection types as well as UDT -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8813) MapType.compose throws java.lang.IllegalArgumentException: null when either of the key or value pair in map type object is of type int (Int32Type)
[ https://issues.apache.org/jira/browse/CASSANDRA-8813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-8813: --- Description: {code}java.lang.IllegalArgumentException: null at java.nio.Buffer.limit(Buffer.java:267) ~[na:1.7.0]at org.apache.cassandra.utils.ByteBufferUtil.readBytes(ByteBufferUtil.java:543) ~[cassandra-all-2.1.2.jar:2.1.2]at org.apache.cassandra.serializers.CollectionSerializer.readValue(CollectionSerializer.java:122) ~[cassandra-all-2.1.2.jar:2.1.2] at org.apache.cassandra.serializers.MapSerializer.deserializeForNativeProtocol(MapSerializer.java:99) ~[cassandra-all-2.1.2.jar:2.1.2] at org.apache.cassandra.serializers.MapSerializer.deserializeForNativeProtocol(MapSerializer.java:28) ~[cassandra-all-2.1.2.jar:2.1.2] at org.apache.cassandra.serializers.CollectionSerializer.deserialize(CollectionSerializer.java:48) ~[cassandra-all-2.1.2.jar:2.1.2] at org.apache.cassandra.db.marshal.AbstractType.compose(AbstractType.java:66) ~[cassandra-all-2.1.2.jar:2.1.2] {code} The issue mainly occurs due to forced readBytes function in CollectionSerializer with version 3 for all collection types as well as UDT was: java.lang.IllegalArgumentException: null at java.nio.Buffer.limit(Buffer.java:267) ~[na:1.7.0] at org.apache.cassandra.utils.ByteBufferUtil.readBytes(ByteBufferUtil.java:543) ~[cassandra-all-2.1.2.jar:2.1.2]at org.apache.cassandra.serializers.CollectionSerializer.readValue(CollectionSerializer.java:122) ~[cassandra-all-2.1.2.jar:2.1.2] at org.apache.cassandra.serializers.MapSerializer.deserializeForNativeProtocol(MapSerializer.java:99) ~[cassandra-all-2.1.2.jar:2.1.2] at org.apache.cassandra.serializers.MapSerializer.deserializeForNativeProtocol(MapSerializer.java:28) ~[cassandra-all-2.1.2.jar:2.1.2] at org.apache.cassandra.serializers.CollectionSerializer.deserialize(CollectionSerializer.java:48) ~[cassandra-all-2.1.2.jar:2.1.2] at org.apache.cassandra.db.marshal.AbstractType.compose(AbstractType.java:66) ~[cassandra-all-2.1.2.jar:2.1.2] The issue mainly occurs due to forced readBytes function in CollectionSerializer with version 3 for all collection types as well as UDT > MapType.compose throws java.lang.IllegalArgumentException: null when either > of the key or value pair in map type object is of type int (Int32Type) > -- > > Key: CASSANDRA-8813 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8813 > Project: Cassandra > Issue Type: Bug > Components: API, Drivers (now out of tree) >Reporter: Chhavi Gangwal > Fix For: 2.1.4 > > > {code}java.lang.IllegalArgumentException: null at > java.nio.Buffer.limit(Buffer.java:267) ~[na:1.7.0] at > org.apache.cassandra.utils.ByteBufferUtil.readBytes(ByteBufferUtil.java:543) > ~[cassandra-all-2.1.2.jar:2.1.2]at > org.apache.cassandra.serializers.CollectionSerializer.readValue(CollectionSerializer.java:122) > ~[cassandra-all-2.1.2.jar:2.1.2] at > org.apache.cassandra.serializers.MapSerializer.deserializeForNativeProtocol(MapSerializer.java:99) > ~[cassandra-all-2.1.2.jar:2.1.2] at > org.apache.cassandra.serializers.MapSerializer.deserializeForNativeProtocol(MapSerializer.java:28) > ~[cassandra-all-2.1.2.jar:2.1.2] at > org.apache.cassandra.serializers.CollectionSerializer.deserialize(CollectionSerializer.java:48) > ~[cassandra-all-2.1.2.jar:2.1.2] at > org.apache.cassandra.db.marshal.AbstractType.compose(AbstractType.java:66) > ~[cassandra-all-2.1.2.jar:2.1.2] > {code} > The issue mainly occurs due to forced readBytes function in > CollectionSerializer with version 3 for all collection 
types as well as UDT -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8815) Race in sstable ref counting during streaming failures
[ https://issues.apache.org/jira/browse/CASSANDRA-8815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323420#comment-14323420 ]

sankalp kohli commented on CASSANDRA-8815:
------------------------------------------

This can also be fixed by adding "files.clear()" as the last line of STT.abort(), or by adding "if (aborted) return;" at the start of the complete method.

> Race in sstable ref counting during streaming failures
> -------------------------------------------------------
>
>                 Key: CASSANDRA-8815
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8815
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: sankalp kohli
>            Assignee: Benedict
>
> We have seen a machine in production where all read threads are blocked (spinning)
> trying to acquire the reference lock on sstables. There are also some
> stream sessions which are doing the same.
> On looking at the heap dump, we could see that a live sstable which is part
> of the View has a ref count = 0. This sstable is also not compacting, nor is it
> part of any failed compaction.
> On looking through the code, we could see that if the ref count goes to zero while the
> sstable is still part of the View, all reader threads will spin forever.
> On further looking through the code of streaming, we could see that if
> StreamTransferTask.complete is called after closeSession has been called due
> to an error in OutgoingMessageHandler, it will double decrement the ref count of
> an sstable.
> This race can happen, and we can see from an exception in the logs that closeSession
> was triggered by OutgoingMessageHandler.
> The fix for this is very simple, I think. In StreamTransferTask.abort, we can
> remove a file from "files" before decrementing the ref count. This will avoid
> this race.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
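[Editorial note] To make the two alternatives concrete, here is a sketch over a deliberately simplified skeleton; the field and method shapes are illustrative and not the real StreamTransferTask.

{code}
import java.util.HashMap;
import java.util.Map;

// Simplified skeleton, for illustration only.
class StreamTransferTaskSketch
{
    interface Ref { void release(); }

    private final Map<Integer, Ref> files = new HashMap<>();
    private boolean aborted = false;

    public synchronized void abort()
    {
        aborted = true;
        for (Ref ref : files.values())
            ref.release();
        files.clear();      // option 1: forget the files so a late complete() cannot release them again
    }

    public synchronized void complete(int sequenceNumber)
    {
        if (aborted)
            return;         // option 2: completion racing with an aborted session becomes a no-op
        Ref ref = files.remove(sequenceNumber);
        if (ref != null)
            ref.release();  // each reference is released exactly once
    }
}
{code}

Either guard (or removing the file from the map before releasing it, as the description proposes) enforces the same invariant: a given sstable reference is released at most once per transfer task.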
[jira] [Assigned] (CASSANDRA-8815) Race in sstable ref counting during streaming failures
[ https://issues.apache.org/jira/browse/CASSANDRA-8815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams reassigned CASSANDRA-8815: --- Assignee: Benedict (was: sankalp kohli) > Race in sstable ref counting during streaming failures > > > Key: CASSANDRA-8815 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8815 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: sankalp kohli >Assignee: Benedict > > We have a seen a machine in Prod whose all read threads are blocked(spinning) > on trying to acquire the reference lock on stables. There are also some > stream sessions which are doing the same. > On looking at the heap dump, we could see that a live sstable which is part > of the View has a ref count = 0. This sstable is also not compacting or is > part of any failed compaction. > On looking through the code, we could see that if ref goes to zero and the > stable is part of the View, all reader threads will spin forever. > On further looking through the code of streaming, we could see that if > StreamTransferTask.complete is called after closeSession has been called due > to error in OutgoingMessageHandler, it will double decrement the ref count of > an sstable. > This race can happen and we see through exception in logs that closeSession > was triggered by OutgoingMessageHandler. > The fix for this is very simple i think. In StreamTransferTask.abort, we can > remove a file from "files” before decrementing the ref count. This will avoid > this race. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8815) Race in sstable ref counting during streaming failures
[ https://issues.apache.org/jira/browse/CASSANDRA-8815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323402#comment-14323402 ] sankalp kohli commented on CASSANDRA-8815: -- cc [~benedict] > Race in sstable ref counting during streaming failures > > > Key: CASSANDRA-8815 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8815 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: sankalp kohli >Assignee: sankalp kohli > > We have a seen a machine in Prod whose all read threads are blocked(spinning) > on trying to acquire the reference lock on stables. There are also some > stream sessions which are doing the same. > On looking at the heap dump, we could see that a live sstable which is part > of the View has a ref count = 0. This sstable is also not compacting or is > part of any failed compaction. > On looking through the code, we could see that if ref goes to zero and the > stable is part of the View, all reader threads will spin forever. > On further looking through the code of streaming, we could see that if > StreamTransferTask.complete is called after closeSession has been called due > to error in OutgoingMessageHandler, it will double decrement the ref count of > an sstable. > This race can happen and we see through exception in logs that closeSession > was triggered by OutgoingMessageHandler. > The fix for this is very simple i think. In StreamTransferTask.abort, we can > remove a file from "files” before decrementing the ref count. This will avoid > this race. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (CASSANDRA-8815) Race in sstable ref counting during streaming failures
[ https://issues.apache.org/jira/browse/CASSANDRA-8815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sankalp kohli reassigned CASSANDRA-8815: Assignee: sankalp kohli > Race in sstable ref counting during streaming failures > > > Key: CASSANDRA-8815 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8815 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: sankalp kohli >Assignee: sankalp kohli > > We have a seen a machine in Prod whose all read threads are blocked(spinning) > on trying to acquire the reference lock on stables. There are also some > stream sessions which are doing the same. > On looking at the heap dump, we could see that a live sstable which is part > of the View has a ref count = 0. This sstable is also not compacting or is > part of any failed compaction. > On looking through the code, we could see that if ref goes to zero and the > stable is part of the View, all reader threads will spin forever. > On further looking through the code of streaming, we could see that if > StreamTransferTask.complete is called after closeSession has been called due > to error in OutgoingMessageHandler, it will double decrement the ref count of > an sstable. > This race can happen and we see through exception in logs that closeSession > was triggered by OutgoingMessageHandler. > The fix for this is very simple i think. In StreamTransferTask.abort, we can > remove a file from "files” before decrementing the ref count. This will avoid > this race. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8815) Race in sstable ref counting during streaming failures
[ https://issues.apache.org/jira/browse/CASSANDRA-8815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323401#comment-14323401 ] sankalp kohli commented on CASSANDRA-8815: -- This is similar to CASSANDRA-7704 > Race in sstable ref counting during streaming failures > > > Key: CASSANDRA-8815 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8815 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: sankalp kohli > > We have a seen a machine in Prod whose all read threads are blocked(spinning) > on trying to acquire the reference lock on stables. There are also some > stream sessions which are doing the same. > On looking at the heap dump, we could see that a live sstable which is part > of the View has a ref count = 0. This sstable is also not compacting or is > part of any failed compaction. > On looking through the code, we could see that if ref goes to zero and the > stable is part of the View, all reader threads will spin forever. > On further looking through the code of streaming, we could see that if > StreamTransferTask.complete is called after closeSession has been called due > to error in OutgoingMessageHandler, it will double decrement the ref count of > an sstable. > This race can happen and we see through exception in logs that closeSession > was triggered by OutgoingMessageHandler. > The fix for this is very simple i think. In StreamTransferTask.abort, we can > remove a file from "files” before decrementing the ref count. This will avoid > this race. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-8815) Race in sstable ref counting during streaming failures
sankalp kohli created CASSANDRA-8815:
-----------------------------------------

             Summary: Race in sstable ref counting during streaming failures
                 Key: CASSANDRA-8815
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8815
             Project: Cassandra
          Issue Type: Bug
          Components: Core
            Reporter: sankalp kohli

We have seen a machine in production where all read threads are blocked (spinning) trying to acquire the reference lock on sstables. There are also some stream sessions which are doing the same.

On looking at the heap dump, we could see that a live sstable which is part of the View has a ref count = 0. This sstable is also not compacting, nor is it part of any failed compaction.

On looking through the code, we could see that if the ref count goes to zero while the sstable is still part of the View, all reader threads will spin forever.

On further looking through the code of streaming, we could see that if StreamTransferTask.complete is called after closeSession has been called due to an error in OutgoingMessageHandler, it will double decrement the ref count of an sstable. This race can happen, and we can see from an exception in the logs that closeSession was triggered by OutgoingMessageHandler.

The fix for this is very simple, I think. In StreamTransferTask.abort, we can remove a file from "files" before decrementing the ref count. This will avoid this race.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
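[Editorial note] A minimal illustration (not Cassandra code) of why readers spin once the count is wrong: a typical try-acquire loop over a shared reference count returns false forever after the count has been driven to zero, so callers that retry never make progress.

{code}
import java.util.concurrent.atomic.AtomicInteger;

// Illustration only: the usual compare-and-swap acquire loop over a ref count.
final class RefCountedSketch
{
    private final AtomicInteger refs = new AtomicInteger(1);

    public boolean tryRef()
    {
        while (true)
        {
            int n = refs.get();
            if (n <= 0)
                return false;                  // already fully released; callers that retry will spin
            if (refs.compareAndSet(n, n + 1))
                return true;
        }
    }

    public void release()
    {
        refs.decrementAndGet();                // a double release is what drives the count to 0 too early
    }
}
{code}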
[Cassandra Wiki] Trivial Update of "HowToContribute" by MichaelShuler
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for change notification.

The "HowToContribute" page has been changed by MichaelShuler:
https://wiki.apache.org/cassandra/HowToContribute?action=diff&rev1=60&rev2=61

  Setting up and running system tests:
  === Running the Unit Tests ===
- Simply run `ant` to run all unit tests. To run a specific test class, run `ant -Dtest.name=`. To run a specific test method, run `ant testsome -Dtest.name= -Dtest.methods=`.
+ Simply run `ant test` to run all unit tests. To run a specific test class, run `ant -Dtest.name=`. To run a specific test method, run `ant testsome -Dtest.name= -Dtest.methods=`. You can also run tests in parallel: `ant test -Dtest.runners=4`.
[jira] [Commented] (CASSANDRA-8812) JVM Crashes on Windows x86
[ https://issues.apache.org/jira/browse/CASSANDRA-8812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323297#comment-14323297 ] Amichai Rothman commented on CASSANDRA-8812: I don't know if it's related or not, but it's suspicious that the segment's sync method in some cases closes itself without it being removed from its associated segment manager... > JVM Crashes on Windows x86 > -- > > Key: CASSANDRA-8812 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8812 > Project: Cassandra > Issue Type: Bug > Environment: Windows 7 running x86(32-bit) Oracle JDK 1.8.0_u31 >Reporter: Amichai Rothman > Attachments: crashtest.tgz > > > Under Windows (32 or 64 bit) with the 32-bit Oracle JDK, the JVM may crash > due to EXCEPTION_ACCESS_VIOLATION. This happens inconsistently. The attached > test project can recreate the crash - sometimes it works successfully, > sometimes there's a Java exception in the log, and sometimes the hotspot JVM > crash shows up (regardless of whether the JUnit test results in success - you > can ignore that). Run it a bunch of times to see the various outcomes. It > also contains a sample hotspot error log. > Note that both when the Java exception is thrown and when the JVM crashes, > the stack trace is almost the same - they both eventually occur when the > PERIODIC-COMMIT-LOG-SYNCER thread calls CommitLogSegment.sync and accesses > the buffer (MappedByteBuffer): if it happens to be in buffer.force(), then > the Java exception is thrown, and if it's in one of the buffer.put() calls > before it, then the JVM crashes. This possibly exposes a JVM bug as well in > this case. So it basically looks like a race condition which results in the > buffer sometimes being used after it is no longer valid. > I recreated this on a PC with Windows 7 64-bit running the 32-bit Oracle JDK, > as well as on a modern.ie virtualbox image of Windows 7 32-bit running the > JDK, and it happens both with JDK 7 and JDK 8. Also defining an explicit > dependency on cassandra 2.1.2 (as opposed to the cassandra-unit dependency on > 2.1.0) doesn't make a difference. At some point in my testing I've also seen > a Java-level exception on Linux, but I can't recreate it at the moment with > this test project, so I can't guarantee it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
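[Editorial note] Assuming the race is what the description suggests (the syncer touching the mapped buffer after the segment has been closed and its mapping dropped), one way to probe it is to make close and sync mutually exclusive and have sync bail out once the segment is retired. This is a hand-written sketch, not the real CommitLogSegment; the field and method names are invented for illustration.

{code}
import java.nio.MappedByteBuffer;

// Sketch of the guard being discussed: sync() must never see the buffer after close().
final class CommitLogSegmentSketch
{
    private final Object syncLock = new Object();
    private MappedByteBuffer buffer;     // assumed to be mapped elsewhere
    private boolean closed;

    public void sync()
    {
        synchronized (syncLock)
        {
            if (closed)
                return;                  // segment already retired; touching the buffer here is the crash
            // ... buffer.put(...) of the sync marker would happen here ...
            buffer.force();
        }
    }

    public void close()
    {
        synchronized (syncLock)
        {
            closed = true;
            buffer = null;               // real code would also unmap/clean the mapping here
        }
    }
}
{code}

If adding that kind of exclusion makes the crash disappear, it would support the "closed but not removed from the segment manager" theory in the comment above.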
[jira] [Commented] (CASSANDRA-7724) Native-Transport threads get stuck in StorageProxy.preparePaxos with no one making progress
[ https://issues.apache.org/jira/browse/CASSANDRA-7724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323283#comment-14323283 ] Anton Lebedevich commented on CASSANDRA-7724: - it doesn't reproduce on 2.1.1 > Native-Transport threads get stuck in StorageProxy.preparePaxos with no one > making progress > --- > > Key: CASSANDRA-7724 > URL: https://issues.apache.org/jira/browse/CASSANDRA-7724 > Project: Cassandra > Issue Type: Bug > Components: Core > Environment: Linux 3.13.11-4 #4 SMP PREEMPT x86_64 Intel(R) Core(TM) > i7 CPU 950 @ 3.07GHz GenuineIntel > java version "1.8.0_05" > Java(TM) SE Runtime Environment (build 1.8.0_05-b13) > Java HotSpot(TM) 64-Bit Server VM (build 25.5-b02, mixed mode) > cassandra 2.0.9 >Reporter: Anton Lebedevich > Attachments: aggregateddump.txt, cassandra.threads2 > > > We've got a lot of write timeouts (cas) when running > "INSERT INTO cas_demo(pri_id, sec_id, flag, something) VALUES(?, ?, ?, ?) IF > NOT EXISTS" > from 16 connections in parallel using the same pri_id and different sec_id. > Doing the same from 4 connections in parallel works ok. > All configuration values are at their default values. > CREATE TABLE cas_demo ( > pri_id varchar, > sec_id varchar, > flag boolean, > something set, > PRIMARY KEY (pri_id, sec_id) > ); > CREATE INDEX cas_demo_flag ON cas_demo(flag); > Full thread dump is attached. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8814) Formatting of code blocks in CQL doc in github is a little messed up
[ https://issues.apache.org/jira/browse/CASSANDRA-8814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jack Krupansky updated CASSANDRA-8814: -- Description: Although the html version of the CQL doc on the website looks fine, the textile conversion of the source files in github looks a little messed up. In particular, the "p." paragraph directives that terminate "bc.." block code directives are not properly recognized and then the following text gets subsumed into the code block. The directives look fine, as per my read of the textile doc, but it appears that the textile converter used by github requires that there be a blank line before the "p." directive to end the code block. It also requires a space after the dot for "p. ". If you go to the github pages for the CQL doc for trunk, 2.1, and 2.0, you will see stray "p." directives as well as "\_\_Sample\_\_" text in the code blocks, but only where the syntax code block was multiple lines. This is not a problem where the "bc." directive is used with a single dot for a single line, as opposed to the "bc.." directive used with a double dot for a block of lines. Or in the case of the CREATE KEYSPACE section you see all of the notes crammed into what should be the "Sample" box. See: https://github.com/apache/cassandra/blob/trunk/doc/cql3/CQL.textile https://github.com/apache/cassandra/blob/cassandra-2.1.2/doc/cql3/CQL.textile https://github.com/apache/cassandra/blob/cassandra-2.0.11/doc/cql3/CQL.textile This problem ("p." not recognized to terminate a code block unless followed by a space and preceded by a blank line) actually occurs for the interactive textile formatter as well: http://txstyle.org/doc/4/block-code was: Although the html version of the CQL doc on the website looks fine, the textile conversion of the source files in github looks a little messed up. In particular, the "p." paragraph directives that terminate "bc.." block code directives are not properly recognized and then the following text gets subsumed into the code block. The directives look fine, as per my read of the textile doc, but it appears that the textile converter used by github requires that there be a blank line before the "p." directive to end the code block. It also requires a space after the dot for "p. ". If you go to the github pages for the CQL doc for trunk, 2.1, and 2.0, you will see stray "p." directives as well as "\_\_Sample\_\_" text in the code blocks, but only where the syntax code block was multiple lines. This is not a problem where the "bc." directive is used with a single dot for a single line, as opposed to the "bc.." directive used with a double dot for a block of lines. Or in the case of the CREATE KEYSPACE section you see all of the notes crammed into what should be the "Sample" box. See: https://github.com/apache/cassandra/blob/trunk/doc/cql3/CQL.textile https://github.com/apache/cassandra/blob/cassandra-2.1.2/doc/cql3/CQL.textile https://github.com/apache/cassandra/blob/cassandra-2.0.11/doc/cql3/CQL.textile This problem ("p." 
not recognized to termined a code block unless followed by a space and preceded by a blank line) actually occurs for the interactive textile formatter as well: http://txstyle.org/doc/4/block-code > Formatting of code blocks in CQL doc in github is a little messed up > > > Key: CASSANDRA-8814 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8814 > Project: Cassandra > Issue Type: Task > Components: Documentation & website >Reporter: Jack Krupansky >Priority: Minor > > Although the html version of the CQL doc on the website looks fine, the > textile conversion of the source files in github looks a little messed up. In > particular, the "p." paragraph directives that terminate "bc.." block code > directives are not properly recognized and then the following text gets > subsumed into the code block. The directives look fine, as per my read of the > textile doc, but it appears that the textile converter used by github > requires that there be a blank line before the "p." directive to end the code > block. It also requires a space after the dot for "p. ". > If you go to the github pages for the CQL doc for trunk, 2.1, and 2.0, you > will see stray "p." directives as well as "\_\_Sample\_\_" text in the code > blocks, but only where the syntax code block was multiple lines. This is not > a problem where the "bc." directive is used with a single dot for a single > line, as opposed to the "bc.." directive used with a double dot for a block > of lines. Or in the case of the CREATE KEYSPACE section you see all of the > notes crammed into what should be the "Sample" box. > See: > https://github.com/apache/cassandra/blob/trunk/doc/cql3/CQL.textile > https://github.co
[jira] [Created] (CASSANDRA-8814) Formatting of code blocks in CQL doc in github is a little messed up
Jack Krupansky created CASSANDRA-8814:
------------------------------------------

             Summary: Formatting of code blocks in CQL doc in github is a little messed up
                 Key: CASSANDRA-8814
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8814
             Project: Cassandra
          Issue Type: Task
          Components: Documentation & website
            Reporter: Jack Krupansky
            Priority: Minor

Although the html version of the CQL doc on the website looks fine, the textile conversion of the source files in github looks a little messed up. In particular, the "p." paragraph directives that terminate "bc.." block code directives are not properly recognized and then the following text gets subsumed into the code block. The directives look fine, as per my read of the textile doc, but it appears that the textile converter used by github requires that there be a blank line before the "p." directive to end the code block. It also requires a space after the dot for "p. ".

If you go to the github pages for the CQL doc for trunk, 2.1, and 2.0, you will see stray "p." directives as well as "\_\_Sample\_\_" text in the code blocks, but only where the syntax code block was multiple lines. This is not a problem where the "bc." directive is used with a single dot for a single line, as opposed to the "bc.." directive used with a double dot for a block of lines. Or in the case of the CREATE KEYSPACE section you see all of the notes crammed into what should be the "Sample" box.

See:
https://github.com/apache/cassandra/blob/trunk/doc/cql3/CQL.textile
https://github.com/apache/cassandra/blob/cassandra-2.1.2/doc/cql3/CQL.textile
https://github.com/apache/cassandra/blob/cassandra-2.0.11/doc/cql3/CQL.textile

This problem ("p." not recognized to terminate a code block unless followed by a space and preceded by a blank line) actually occurs for the interactive textile formatter as well: http://txstyle.org/doc/4/block-code

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8067) NullPointerException in KeyCacheSerializer
[ https://issues.apache.org/jira/browse/CASSANDRA-8067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko updated CASSANDRA-8067: - Assignee: Aleksey Yeschenko > NullPointerException in KeyCacheSerializer > -- > > Key: CASSANDRA-8067 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8067 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Eric Leleu >Assignee: Aleksey Yeschenko > Fix For: 2.1.1 > > > Hi, > I have this stack trace in the logs of Cassandra server (v2.1) > {code} > ERROR [CompactionExecutor:14] 2014-10-06 23:32:02,098 > CassandraDaemon.java:166 - Exception in thread > Thread[CompactionExecutor:14,1,main] > java.lang.NullPointerException: null > at > org.apache.cassandra.service.CacheService$KeyCacheSerializer.serialize(CacheService.java:475) > ~[apache-cassandra-2.1.0.jar:2.1.0] > at > org.apache.cassandra.service.CacheService$KeyCacheSerializer.serialize(CacheService.java:463) > ~[apache-cassandra-2.1.0.jar:2.1.0] > at > org.apache.cassandra.cache.AutoSavingCache$Writer.saveCache(AutoSavingCache.java:225) > ~[apache-cassandra-2.1.0.jar:2.1.0] > at > org.apache.cassandra.db.compaction.CompactionManager$11.run(CompactionManager.java:1061) > ~[apache-cassandra-2.1.0.jar:2.1.0] > at java.util.concurrent.Executors$RunnableAdapter.call(Unknown > Source) ~[na:1.7.0] > at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source) > ~[na:1.7.0] > at java.util.concurrent.FutureTask.run(Unknown Source) ~[na:1.7.0] > at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) > [na:1.7.0] > at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) > [na:1.7.0] > at java.lang.Thread.run(Unknown Source) [na:1.7.0] > {code} > It may not be critical because this error occured in the AutoSavingCache. > However the line 475 is about the CFMetaData so it may hide bigger issue... > {code} > 474 CFMetaData cfm = > Schema.instance.getCFMetaData(key.desc.ksname, key.desc.cfname); > 475 cfm.comparator.rowIndexEntrySerializer().serialize(entry, > out); > {code} > Regards, > Eric -- This message was sent by Atlassian JIRA (v6.3.4#6332)
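[Editorial note] Whatever the root cause turns out to be, the quoted NPE itself comes from dereferencing a CFMetaData that no longer exists (for example, a table dropped after its keys were cached). A defensive sketch around the two quoted lines, purely to show where a guard would sit and not the committed fix:

{code}
// Sketch only, extending the lines quoted in the issue (CacheService.java:474-475):
CFMetaData cfm = Schema.instance.getCFMetaData(key.desc.ksname, key.desc.cfname);
if (cfm == null)
    return; // the table was dropped after this key was cached; skip serializing the entry
cfm.comparator.rowIndexEntrySerializer().serialize(entry, out);
{code}

A real fix may need to handle the skipped entry more carefully (for example if an entry count was written ahead of the loop), which is presumably part of what the assignee will look at.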
[jira] [Created] (CASSANDRA-8813) MapType.compose throws java.lang.IllegalArgumentException: null when either of the key or value pair in map type object is of type int (Int32Type)
Chhavi Gangwal created CASSANDRA-8813: - Summary: MapType.compose throws java.lang.IllegalArgumentException: null when either of the key or value pair in map type object is of type int (Int32Type) Key: CASSANDRA-8813 URL: https://issues.apache.org/jira/browse/CASSANDRA-8813 Project: Cassandra Issue Type: Bug Components: API, Drivers (now out of tree) Reporter: Chhavi Gangwal java.lang.IllegalArgumentException: null at java.nio.Buffer.limit(Buffer.java:267) ~[na:1.7.0] at org.apache.cassandra.utils.ByteBufferUtil.readBytes(ByteBufferUtil.java:543) ~[cassandra-all-2.1.2.jar:2.1.2]at org.apache.cassandra.serializers.CollectionSerializer.readValue(CollectionSerializer.java:122) ~[cassandra-all-2.1.2.jar:2.1.2] at org.apache.cassandra.serializers.MapSerializer.deserializeForNativeProtocol(MapSerializer.java:99) ~[cassandra-all-2.1.2.jar:2.1.2] at org.apache.cassandra.serializers.MapSerializer.deserializeForNativeProtocol(MapSerializer.java:28) ~[cassandra-all-2.1.2.jar:2.1.2] at org.apache.cassandra.serializers.CollectionSerializer.deserialize(CollectionSerializer.java:48) ~[cassandra-all-2.1.2.jar:2.1.2] at org.apache.cassandra.db.marshal.AbstractType.compose(AbstractType.java:66) ~[cassandra-all-2.1.2.jar:2.1.2] The issue mainly occurs due to forced readBytes function in CollectionSerializer with version 3 for all collection types as well as UDT -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7282) Faster Memtable map
[ https://issues.apache.org/jira/browse/CASSANDRA-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323080#comment-14323080 ] Benedict commented on CASSANDRA-7282: - bq. This benchmark uses Longs as keys Thanks for spotting that bq. However, if even the author of the code can make this mistake so easily It's worth noting that I was recovering from brain surgery performed a couple of weeks prior to posting this benchmark, so it's probably not _quite_ as indicative of the problem as it might appear. I agree it would be preferable to use a different method, though. The problem is doing so neatly and without penalty. The best option is probably to make all Token implement a HashComparable interface, and throw UnsupportedOperationException if they don't really implement it, but that is also pretty ugly. I'm pretty agnostic to the solution to this particular problem, so I'll defer to strong opinions. A potentially more damning criticism of that benchmark is that it assumes a uniform address space for the hashes. As the number of nodes in a cluster grows, the entropy in the top bits decreases, and so the performance of this map could degrade. Besides the aforementioned improvement to bound worst case behaviour at O(lg(N)) we could also normalise the top order bits across the owned ranges. Possibly there are some other strategies, but I need to think on that. > Faster Memtable map > --- > > Key: CASSANDRA-7282 > URL: https://issues.apache.org/jira/browse/CASSANDRA-7282 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Benedict >Assignee: Benedict > Labels: performance > Fix For: 3.0 > > Attachments: jasobrown-sample-run.txt, profile.yaml, reads.svg, > run1.svg, writes.svg > > > Currently we maintain a ConcurrentSkipLastMap of DecoratedKey -> Partition in > our memtables. Maintaining this is an O(lg(n)) operation; since the vast > majority of users use a hash partitioner, it occurs to me we could maintain a > hybrid ordered list / hash map. The list would impose the normal order on the > collection, but a hash index would live alongside as part of the same data > structure, simply mapping into the list and permitting O(1) lookups and > inserts. > I've chosen to implement this initial version as a linked-list node per item, > but we can optimise this in future by storing fatter nodes that permit a > cache-line's worth of hashes to be checked at once, further reducing the > constant factor costs for lookups. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
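[Editorial note] The "make all Token implement a HashComparable interface" option mentioned above might look roughly like the following; the name and shape are purely illustrative, taken from the comment rather than from any existing Cassandra interface.

{code}
/**
 * Illustrative only: an opt-in contract for keys/tokens whose hash can serve as
 * the primary ordering key of a hash-ordered memtable map.
 */
public interface HashComparable<T extends Comparable<T>>
{
    /**
     * A hash whose unsigned order is consistent with compareTo(), so the map can
     * order entries by hash first and fall back to a full comparison on ties.
     * Partitioners/tokens that cannot provide this throw
     * UnsupportedOperationException, and the memtable falls back to the plain
     * comparison-ordered map (e.g. the existing skip list).
     */
    long orderPreservingHash();
}
{code}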
[jira] [Resolved] (CASSANDRA-8811) nodetool rebuild raises EOFException
[ https://issues.apache.org/jira/browse/CASSANDRA-8811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Stupp resolved CASSANDRA-8811.
-------------------------------------
    Resolution: Invalid

Then I'd recommend throttling streaming traffic (via nodetool) beforehand. Resolving this one as 'invalid' - feel free to reopen if there's anything else missing.

> nodetool rebuild raises EOFException
> ------------------------------------
>
>                 Key: CASSANDRA-8811
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8811
>             Project: Cassandra
>          Issue Type: Bug
>       Environment: Debian 7 Wheezy
>            Reporter: Rafał Furmański
>            Labels: nodetool
>
> {noformat}
> root@db1:~# nodetool rebuild -- Amsterdam
> error: null
> -- StackTrace --
> java.io.EOFException
>     at java.io.DataInputStream.readByte(DataInputStream.java:267)
>     at sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:214)
>     at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:161)
>     at com.sun.jmx.remote.internal.PRef.invoke(Unknown Source)
>     at javax.management.remote.rmi.RMIConnectionImpl_Stub.invoke(Unknown Source)
>     at javax.management.remote.rmi.RMIConnector$RemoteMBeanServerConnection.invoke(RMIConnector.java:1022)
>     at javax.management.MBeanServerInvocationHandler.invoke(MBeanServerInvocationHandler.java:292)
>     at com.sun.proxy.$Proxy7.rebuild(Unknown Source)
>     at org.apache.cassandra.tools.NodeProbe.rebuild(NodeProbe.java:929)
>     at org.apache.cassandra.tools.NodeTool$Rebuild.execute(NodeTool.java:1595)
>     at org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:249)
>     at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:163)
> {noformat}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7282) Faster Memtable map
[ https://issues.apache.org/jira/browse/CASSANDRA-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14322942#comment-14322942 ] Branimir Lambov commented on CASSANDRA-7282: bq. To add some more data to this discussion, I decided quickly to isolate just the CSLM and NBHOM for comparison. This benchmark uses Longs as keys, whose hashCode() does not satisfy the requirements of the NBHOM. The results aren't materially different when this is fixed (NBHOM is still dramatically faster). However, if even the author of the code can make this mistake so easily I don't think reusing {{hashCode()}} for the NBHOM ordering key is acceptable. > Faster Memtable map > --- > > Key: CASSANDRA-7282 > URL: https://issues.apache.org/jira/browse/CASSANDRA-7282 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Benedict >Assignee: Benedict > Labels: performance > Fix For: 3.0 > > Attachments: jasobrown-sample-run.txt, profile.yaml, reads.svg, > run1.svg, writes.svg > > > Currently we maintain a ConcurrentSkipLastMap of DecoratedKey -> Partition in > our memtables. Maintaining this is an O(lg(n)) operation; since the vast > majority of users use a hash partitioner, it occurs to me we could maintain a > hybrid ordered list / hash map. The list would impose the normal order on the > collection, but a hash index would live alongside as part of the same data > structure, simply mapping into the list and permitting O(1) lookups and > inserts. > I've chosen to implement this initial version as a linked-list node per item, > but we can optimise this in future by storing fatter nodes that permit a > cache-line's worth of hashes to be checked at once, further reducing the > constant factor costs for lookups. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
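For anyone following along, the reason {{Long.hashCode()}} is a poor fit here can be shown in a few lines (illustrative only; the mixer below is a generic example, not the fix applied to the benchmark):
{code}
// Long.hashCode() is (int)(value ^ (value >>> 32)), so consecutive long keys produce
// consecutive, low-entropy hash codes - poor input for a hash-ordered map.
public class LongHashCodeDemo
{
    public static void main(String[] args)
    {
        for (long k = 0; k < 4; k++)
        {
            int plain = (int) (k ^ (k >>> 32)); // what Long.hashCode() returns
            System.out.printf("key=%d hashCode=%d mixed=%d%n", k, plain, mix(k));
        }
    }

    // A generic multiplicative mixer (golden-ratio constant) that spreads
    // consecutive keys across the whole int range.
    static int mix(long k)
    {
        long h = k * 0x9E3779B97F4A7C15L;
        return (int) (h ^ (h >>> 32));
    }
}
{code}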
[jira] [Commented] (CASSANDRA-8752) invalid counter shard detected in Version 2.1.2
[ https://issues.apache.org/jira/browse/CASSANDRA-8752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14322814#comment-14322814 ] Aleksey Yeschenko commented on CASSANDRA-8752: -- Hey, Stefan. Soon, within a few days. > invalid counter shard detected in Version 2.1.2 > --- > > Key: CASSANDRA-8752 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8752 > Project: Cassandra > Issue Type: Bug > Components: Core > Environment: SUSE11 SP1, Cassandra 2.1.2, java version "1.7.0_55". > 4 node cluster, vnode = 1, replication = 2 >Reporter: Kevin Ye >Assignee: Aleksey Yeschenko > > I was doing counter test (first +100 several times, then -33) on a 4 nodes > cluster while below log appear at 2 nodes.There is no concurrent access to > same counter. > WARN [CompactionExecutor:757] 2015-02-02 13:02:33,375 > CounterContext.java:431 - invalid global counter shard detected; > (9cca9262-934a-4275-963b-66802471b0c2, 1, -33) and > (9cca9262-934a-4275-963b-66802471b0c2, 1, 100) differ only in count; will > pick highest to self-heal on compaction > Anyone has encounter this problem? I thought Cassandra 2.1.2 had solved this > counter problem, but it appeared. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8812) JVM Crashes on Windows x86
[ https://issues.apache.org/jira/browse/CASSANDRA-8812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amichai Rothman updated CASSANDRA-8812: --- Description: Under Windows (32 or 64 bit) with the 32-bit Oracle JDK, the JVM may crash due to EXCEPTION_ACCESS_VIOLATION. This happens inconsistently. The attached test project can recreate the crash - sometimes it works successfully, sometimes there's a Java exception in the log, and sometimes the hotspot JVM crash shows up (regardless of whether the JUnit test results in success - you can ignore that). Run it a bunch of times to see the various outcomes. It also contains a sample hotspot error log. Note that both when the Java exception is thrown and when the JVM crashes, the stack trace is almost the same - they both eventually occur when the PERIODIC-COMMIT-LOG-SYNCER thread calls CommitLogSegment.sync and accesses the buffer (MappedByteBuffer): if it happens to be in buffer.force(), then the Java exception is thrown, and if it's in one of the buffer.put() calls before it, then the JVM crashes. This possibly exposes a JVM bug as well in this case. So it basically looks like a race condition which results in the buffer sometimes being used after it is no longer valid. I recreated this on a PC with Windows 7 64-bit running the 32-bit Oracle JDK, as well as on a modern.ie virtualbox image of Windows 7 32-bit running the JDK, and it happens both with JDK 7 and JDK 8. Also defining an explicit dependency on cassandra 2.1.2 (as opposed to the cassandra-unit dependency on 2.1.0) doesn't make a difference. At some point in my testing I've also seen a Java-level exception on Linux, but I can't recreate it at the moment with this test project, so I can't guarantee it. was: Under Windows (32 or 64 bit) with the 32-bit Oracle JDK, the JVM may crash due to EXCEPTION_ACCESS_VIOLATION. This happens inconsistently. The attached test project can recreate the crash - sometimes it works successfully, sometimes there's a Java exception in the log, and sometimes the hotspot JVM crash shows up (regardless of whether the JUnit test results in success - you can ignore that). Run it a bunch of times to see the various outcomes. Note that both when the Java exception is thrown and when the JVM crashes, the stack trace is almost the same - they both eventually occur in CommitLogSegment.sync when accessing the buffer (MappedByteBuffer): if it happens to be in buffer.force(), then the Java exception is thrown, and if it's in one of the buffer.put() calls before it, then the JVM crashes. This possibly exposes a JVM bug as well in this case. So it basically looks like a race condition which results in the buffer sometimes being used after it is no longer valid. I recreated this on a PC with Windows 7 64-bit running the 32-bit Oracle JDK, as well as on a modern.ie virtualbox image of Windows 7 32-bit running the JDK, and it happens both with JDK 7 and JDK 8. Also defining an explicit dependency on cassandra 2.1.2 (as opposed to the cassandra-unit dependency on 2.1.0) doesn't make a difference. At some point in my testing I've also seen a Java-level exception on Linux, but I can't recreate it at the moment with this test project, so I can't guarantee it. 
> JVM Crashes on Windows x86 > -- > > Key: CASSANDRA-8812 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8812 > Project: Cassandra > Issue Type: Bug > Environment: Windows 7 running x86(32-bit) Oracle JDK 1.8.0_u31 >Reporter: Amichai Rothman > Attachments: crashtest.tgz > > > Under Windows (32 or 64 bit) with the 32-bit Oracle JDK, the JVM may crash > due to EXCEPTION_ACCESS_VIOLATION. This happens inconsistently. The attached > test project can recreate the crash - sometimes it works successfully, > sometimes there's a Java exception in the log, and sometimes the hotspot JVM > crash shows up (regardless of whether the JUnit test results in success - you > can ignore that). Run it a bunch of times to see the various outcomes. It > also contains a sample hotspot error log. > Note that both when the Java exception is thrown and when the JVM crashes, > the stack trace is almost the same - they both eventually occur when the > PERIODIC-COMMIT-LOG-SYNCER thread calls CommitLogSegment.sync and accesses > the buffer (MappedByteBuffer): if it happens to be in buffer.force(), then > the Java exception is thrown, and if it's in one of the buffer.put() calls > before it, then the JVM crashes. This possibly exposes a JVM bug as well in > this case. So it basically looks like a race condition which results in the > buffer sometimes being used after it is no longer valid. > I recreated this on a PC with Windows 7 64-bit running the 32-bit Oracle JDK, > as well as on a modern.ie virtualbox image of Windows 7 32-bi
[jira] [Created] (CASSANDRA-8812) JVM Crashes on Windows x86
Amichai Rothman created CASSANDRA-8812: -- Summary: JVM Crashes on Windows x86 Key: CASSANDRA-8812 URL: https://issues.apache.org/jira/browse/CASSANDRA-8812 Project: Cassandra Issue Type: Bug Environment: Windows 7 running x86(32-bit) Oracle JDK 1.8.0_u31 Reporter: Amichai Rothman Attachments: crashtest.tgz Under Windows (32 or 64 bit) with the 32-bit Oracle JDK, the JVM may crash due to EXCEPTION_ACCESS_VIOLATION. This happens inconsistently. The attached test project can recreate the crash - sometimes it works successfully, sometimes there's a Java exception in the log, and sometimes the hotspot JVM crash shows up (regardless of whether the JUnit test results in success - you can ignore that). Run it a bunch of times to see the various outcomes. Note that both when the Java exception is thrown and when the JVM crashes, the stack trace is almost the same - they both eventually occur in CommitLogSegment.sync when accessing the buffer (MappedByteBuffer): if it happens to be in buffer.force(), then the Java exception is thrown, and if it's in one of the buffer.put() calls before it, then the JVM crashes. This possibly exposes a JVM bug as well in this case. So it basically looks like a race condition which results in the buffer sometimes being used after it is no longer valid. I recreated this on a PC with Windows 7 64-bit running the 32-bit Oracle JDK, as well as on a modern.ie virtualbox image of Windows 7 32-bit running the JDK, and it happens both with JDK 7 and JDK 8. Also defining an explicit dependency on cassandra 2.1.2 (as opposed to the cassandra-unit dependency on 2.1.0) doesn't make a difference. At some point in my testing I've also seen a Java-level exception on Linux, but I can't recreate it at the moment with this test project, so I can't guarantee it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
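To illustrate the race being described (class and field names here are hypothetical, not Cassandra's actual CommitLogSegment), the hazard is a periodic syncer thread touching a MappedByteBuffer that another thread may already have invalidated; one simple way to make the sketch safe is to guard both paths with the same lock:
{code}
import java.nio.MappedByteBuffer;

// Minimal sketch of the use-after-invalidation race pattern described above.
class Segment
{
    private MappedByteBuffer buffer; // would come from FileChannel.map(...) in real code
    private boolean closed;

    synchronized void sync()
    {
        if (closed)
            return;                   // without this guard, put()/force() can hit a dead mapping
        buffer.putInt(0, 0xCAFEBABE); // e.g. write a sync marker
        buffer.force();               // flush dirty pages to the file
    }

    synchronized void close()
    {
        closed = true;
        buffer = null;                // real code would also unmap/clean the buffer here
    }
}
{code}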
[jira] [Commented] (CASSANDRA-8752) invalid counter shard detected in Version 2.1.2
[ https://issues.apache.org/jira/browse/CASSANDRA-8752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14322737#comment-14322737 ] Stefan Matei commented on CASSANDRA-8752: - Hi Aleksey, When is 2.1.3 scheduled for release? Do you have a specific date? Best Regards, Stefan > invalid counter shard detected in Version 2.1.2 > --- > > Key: CASSANDRA-8752 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8752 > Project: Cassandra > Issue Type: Bug > Components: Core > Environment: SUSE11 SP1, Cassandra 2.1.2, java version "1.7.0_55". > 4 node cluster, vnode = 1, replication = 2 >Reporter: Kevin Ye >Assignee: Aleksey Yeschenko > > I was doing counter test (first +100 several times, then -33) on a 4 nodes > cluster while below log appear at 2 nodes.There is no concurrent access to > same counter. > WARN [CompactionExecutor:757] 2015-02-02 13:02:33,375 > CounterContext.java:431 - invalid global counter shard detected; > (9cca9262-934a-4275-963b-66802471b0c2, 1, -33) and > (9cca9262-934a-4275-963b-66802471b0c2, 1, 100) differ only in count; will > pick highest to self-heal on compaction > Anyone has encounter this problem? I thought Cassandra 2.1.2 had solved this > counter problem, but it appeared. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8811) nodetool rebuild raises EOFException
[ https://issues.apache.org/jira/browse/CASSANDRA-8811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14322725#comment-14322725 ] Rafał Furmański commented on CASSANDRA-8811: Looks like the node ran out of memory during this operation, and that's why I got this error. > nodetool rebuild raises EOFException > > > Key: CASSANDRA-8811 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8811 > Project: Cassandra > Issue Type: Bug > Environment: Debian 7 Wheezy >Reporter: Rafał Furmański > Labels: nodetool > > {noformat} > root@db1:~# nodetool rebuild -- Amsterdam > error: null > -- StackTrace -- > java.io.EOFException > at java.io.DataInputStream.readByte(DataInputStream.java:267) > at > sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:214) > at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:161) > at com.sun.jmx.remote.internal.PRef.invoke(Unknown Source) > at javax.management.remote.rmi.RMIConnectionImpl_Stub.invoke(Unknown > Source) > at > javax.management.remote.rmi.RMIConnector$RemoteMBeanServerConnection.invoke(RMIConnector.java:1022) > at > javax.management.MBeanServerInvocationHandler.invoke(MBeanServerInvocationHandler.java:292) > at com.sun.proxy.$Proxy7.rebuild(Unknown Source) > at org.apache.cassandra.tools.NodeProbe.rebuild(NodeProbe.java:929) > at > org.apache.cassandra.tools.NodeTool$Rebuild.execute(NodeTool.java:1595) > at > org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:249) > at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:163) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8751) C* should always listen to both ssl/non-ssl ports
[ https://issues.apache.org/jira/browse/CASSANDRA-8751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14322714#comment-14322714 ] Mike Adamson commented on CASSANDRA-8751: - Why not have a single socket supporting TLS? The socket would then support both encrypted and unencrypted connections. Whether unencrypted connections are allowed could be controlled by configuration. > C* should always listen to both ssl/non-ssl ports > - > > Key: CASSANDRA-8751 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8751 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Minh Do >Assignee: Minh Do >Priority: Critical > > Since there is always one thread dedicated on server socket listener and it > does not use much resource, we should always have these two listeners up no > matter what users set for internode_encryption. > The reason behind this is that we need to switch back and forth between > different internode_encryption modes and we need C* servers to keep running > in transient state or during mode switching. Currently this is not possible. > For example, we have a internode_encryption=dc cluster in a multi-region AWS > environment and want to set internode_encryption=all by rolling restart C* > nodes. However, the node with internode_encryption=all does not open to > listen to non-ssl port. As a result, we have a splitted brain cluster here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Issue Comment Deleted] (CASSANDRA-8803) Implement transitional mode in C* that will accept both encrypted and non-encrypted client traffic
[ https://issues.apache.org/jira/browse/CASSANDRA-8803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Adamson updated CASSANDRA-8803: Comment: was deleted (was: Why not have a single socket supporting TLS. The socket could / would then support encrypted and unencrypted connections.This could be controlled by configuration as to whether unencrypted connections are allowed. ) > Implement transitional mode in C* that will accept both encrypted and > non-encrypted client traffic > -- > > Key: CASSANDRA-8803 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8803 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Vishy Kasar > > We have some non-secure clusters taking live traffic in production from > active clients. We want to enable client to node encryption on these > clusters. Once we set the client_encryption_options enabled to true in yaml > and bounce a cassandra node in the ring, the existing clients that do not do > SSL will fail to connect to that node. > There does not seem to be a good way to roll this change with out taking an > outage. Can we implement a transitional mode in C* that will accept both > encrypted and non-encrypted client traffic? We would enable this during > transition and turn it off after both server and client start talking SSL. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8803) Implement transitional mode in C* that will accept both encrypted and non-encrypted client traffic
[ https://issues.apache.org/jira/browse/CASSANDRA-8803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14322712#comment-14322712 ] Mike Adamson commented on CASSANDRA-8803: - Why not have a single socket supporting TLS? The socket would then support both encrypted and unencrypted connections. Whether unencrypted connections are allowed could be controlled by configuration. > Implement transitional mode in C* that will accept both encrypted and > non-encrypted client traffic > -- > > Key: CASSANDRA-8803 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8803 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Vishy Kasar > > We have some non-secure clusters taking live traffic in production from > active clients. We want to enable client to node encryption on these > clusters. Once we set the client_encryption_options enabled to true in yaml > and bounce a cassandra node in the ring, the existing clients that do not do > SSL will fail to connect to that node. > There does not seem to be a good way to roll this change with out taking an > outage. Can we implement a transitional mode in C* that will accept both > encrypted and non-encrypted client traffic? We would enable this during > transition and turn it off after both server and client start talking SSL. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
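As a sketch of the "single socket supporting TLS" idea (hypothetical class, not a proposed patch), a server can peek at the first byte of a new connection - 0x16 is the TLS handshake record type - and only then decide whether to wrap the socket in SSL or treat it as plaintext, with a configuration flag deciding whether plaintext is accepted at all:
{code}
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.net.Socket;
import javax.net.ssl.SSLSocketFactory;

// Illustration only: detect TLS vs plaintext traffic on a single port.
class OptionalTlsAcceptor
{
    private final SSLSocketFactory sslFactory = (SSLSocketFactory) SSLSocketFactory.getDefault();
    private final boolean allowUnencrypted;

    OptionalTlsAcceptor(boolean allowUnencrypted)
    {
        this.allowUnencrypted = allowUnencrypted;
    }

    Socket accept(Socket raw) throws IOException
    {
        PushbackInputStream in = new PushbackInputStream(raw.getInputStream(), 1);
        int first = in.read();
        if (first == -1)
        {
            raw.close();
            throw new IOException("connection closed before any data");
        }
        if (first == 0x16)
        {
            // TLS handshake record type; JDK 8+ overload hands the consumed byte back to the SSL layer.
            return sslFactory.createSocket(raw, new ByteArrayInputStream(new byte[]{ (byte) first }), true);
        }
        if (!allowUnencrypted)
        {
            raw.close();
            throw new IOException("unencrypted connection rejected by configuration");
        }
        in.unread(first); // plaintext: push the byte back for the protocol handler
        return raw;       // (a real server would keep using 'in' for this connection)
    }
}
{code}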
[jira] [Resolved] (CASSANDRA-8508) Manage duplicate SSTableReader lifetimes with RefCount instead of replacement chain
[ https://issues.apache.org/jira/browse/CASSANDRA-8508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict resolved CASSANDRA-8508. - Resolution: Duplicate > Manage duplicate SSTableReader lifetimes with RefCount instead of replacement > chain > --- > > Key: CASSANDRA-8508 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8508 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Benedict >Priority: Minor > Fix For: 3.0 > > > Also possible to benefit from CASSANDRA-7705; should make things easier to > reason about, and easier to debug. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (CASSANDRA-6633) Dynamic Resize of Bloom Filters
[ https://issues.apache.org/jira/browse/CASSANDRA-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict reassigned CASSANDRA-6633: --- Assignee: Benedict > Dynamic Resize of Bloom Filters > --- > > Key: CASSANDRA-6633 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6633 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Benedict >Assignee: Benedict >Priority: Minor > Labels: performance > Fix For: 3.0 > > > Dynamic resizing would be useful. The simplest way to achieve this is to have > separate address spaces for each hash function, so that we may > increase/decrease accuracy by simply loading/unloading another function (we > could even do interesting stuff in future like alternating the functions we > select if we find we're getting more false positives than should be expected); > Faster loading/unloading would help this, and we could achieve this by > mmapping the bloom filter representation on systems that we can mlock. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8763) Trickle fsync should not be optional, but should have a large default interval
[ https://issues.apache.org/jira/browse/CASSANDRA-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict updated CASSANDRA-8763: Fix Version/s: (was: 2.1.4) 3.0 > Trickle fsync should not be optional, but should have a large default interval > -- > > Key: CASSANDRA-8763 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8763 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Benedict >Priority: Minor > Fix For: 3.0 > > > The reason to avoid "trickle fsync" is that it permits more efficient > flushing; however once we get above a few hundred MBs, it really doesn't make > a great deal of difference. Contrarily, it can cause runaway utilisation of > memory by dirty pages in the page cache, damaging performance for other > components and potentially invoking the kernel OOM killer. I suggest we pick > an amount of memory proportional to the size of the heap (or check the actual > amount of memory in the system), and divide this by the number of flush > threads we have, and use this as the default trickle fsync interval, and we > _always_ "trickle" fsync, ignoring the ability to disable it. We only permit > on override of the size of the interval (if you want to disable it, you can > set the interval absurdly large). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
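To make the proposal concrete, a rough sketch of how such a default might be computed (the quarter-of-heap budget, the 10 MiB floor, and the names are assumptions for illustration, not values from this ticket):
{code}
// Hypothetical: derive a default trickle-fsync interval from a memory budget
// divided across the flush writers, so dirty pages cannot grow without bound.
public final class TrickleFsyncDefaults
{
    static long defaultIntervalBytes(long heapBytes, int flushWriters)
    {
        long budget = heapBytes / 4;                    // assumed dirty-page budget
        long perWriter = budget / Math.max(1, flushWriters);
        return Math.max(10L * 1024 * 1024, perWriter);  // never below 10 MiB
    }

    public static void main(String[] args)
    {
        long heap = Runtime.getRuntime().maxMemory();
        System.out.println("trickle fsync every " +
                           defaultIntervalBytes(heap, 2) + " bytes (assuming 2 flush writers)");
    }
}
{code}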
[jira] [Commented] (CASSANDRA-8805) runWithCompactionsDisabled only cancels compactions, which is not the only source of markCompacted
[ https://issues.apache.org/jira/browse/CASSANDRA-8805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14322704#comment-14322704 ] Benedict commented on CASSANDRA-8805: - /cc [~thobbs] > runWithCompactionsDisabled only cancels compactions, which is not the only > source of markCompacted > -- > > Key: CASSANDRA-8805 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8805 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Benedict > Fix For: 2.1.4 > > > Operations like repair that may operate over all sstables cancel compactions > before beginning, and fail if there are any files marked compacting after > doing so. Redistribution of index summaries is not a compaction, so is not > cancelled by this action, but does mark sstables as compacting, so such an > action will fail to initiate if there is an index summary redistribution in > progress. It seems that IndexSummaryManager needs to register itself as > interruptible along with compactions (AFAICT no other actions that may > markCompacting are not themselves compactions). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8805) runWithCompactionsDisabled only cancels compactions, which is not the only source of markCompacted
[ https://issues.apache.org/jira/browse/CASSANDRA-8805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict updated CASSANDRA-8805: Description: Operations like repair that may operate over all sstables cancel compactions before beginning, and fail if there are any files marked compacting after doing so. Redistribution of index summaries is not a compaction, so is not cancelled by this action, but does mark sstables as compacting, so such an action will fail to initiate if there is an index summary redistribution in progress. It seems that IndexSummaryManager needs to register itself as interruptible along with compactions (AFAICT no other actions that may markCompacting are not themselves compactions). (was: Operations that require running without ongoing compactions cancel those compactions before beginning. Unfortunately some actions are not really compactions, and so will not be cancelled by this action. Redistribution of index summaries is one such action (there may be others). It seems that any operation that may markCompacting needs to register itself as interruptible.) > runWithCompactionsDisabled only cancels compactions, which is not the only > source of markCompacted > -- > > Key: CASSANDRA-8805 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8805 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Benedict > Fix For: 2.1.4 > > > Operations like repair that may operate over all sstables cancel compactions > before beginning, and fail if there are any files marked compacting after > doing so. Redistribution of index summaries is not a compaction, so is not > cancelled by this action, but does mark sstables as compacting, so such an > action will fail to initiate if there is an index summary redistribution in > progress. It seems that IndexSummaryManager needs to register itself as > interruptible along with compactions (AFAICT no other actions that may > markCompacting are not themselves compactions). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8811) nodetool rebuild raises EOFException
[ https://issues.apache.org/jira/browse/CASSANDRA-8811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14322671#comment-14322671 ] Rafał Furmański commented on CASSANDRA-8811: Sure. I've added a new, one-node Cassandra DC named 'Analytics' and ran this command to rebuild the data from my second DC (Amsterdam). > nodetool rebuild raises EOFException > > > Key: CASSANDRA-8811 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8811 > Project: Cassandra > Issue Type: Bug > Environment: Debian 7 Wheezy >Reporter: Rafał Furmański > Labels: nodetool > > {noformat} > root@db1:~# nodetool rebuild -- Amsterdam > error: null > -- StackTrace -- > java.io.EOFException > at java.io.DataInputStream.readByte(DataInputStream.java:267) > at > sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:214) > at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:161) > at com.sun.jmx.remote.internal.PRef.invoke(Unknown Source) > at javax.management.remote.rmi.RMIConnectionImpl_Stub.invoke(Unknown > Source) > at > javax.management.remote.rmi.RMIConnector$RemoteMBeanServerConnection.invoke(RMIConnector.java:1022) > at > javax.management.MBeanServerInvocationHandler.invoke(MBeanServerInvocationHandler.java:292) > at com.sun.proxy.$Proxy7.rebuild(Unknown Source) > at org.apache.cassandra.tools.NodeProbe.rebuild(NodeProbe.java:929) > at > org.apache.cassandra.tools.NodeTool$Rebuild.execute(NodeTool.java:1595) > at > org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:249) > at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:163) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8811) nodetool rebuild raises EOFException
[ https://issues.apache.org/jira/browse/CASSANDRA-8811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14322667#comment-14322667 ] Robert Stupp commented on CASSANDRA-8811: - [~rfurmanski] can you please provide more details on how to reproduce this? > nodetool rebuild raises EOFException > > > Key: CASSANDRA-8811 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8811 > Project: Cassandra > Issue Type: Bug > Environment: Debian 7 Wheezy >Reporter: Rafał Furmański > Labels: nodetool > > {noformat} > root@db1:~# nodetool rebuild -- Amsterdam > error: null > -- StackTrace -- > java.io.EOFException > at java.io.DataInputStream.readByte(DataInputStream.java:267) > at > sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:214) > at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:161) > at com.sun.jmx.remote.internal.PRef.invoke(Unknown Source) > at javax.management.remote.rmi.RMIConnectionImpl_Stub.invoke(Unknown > Source) > at > javax.management.remote.rmi.RMIConnector$RemoteMBeanServerConnection.invoke(RMIConnector.java:1022) > at > javax.management.MBeanServerInvocationHandler.invoke(MBeanServerInvocationHandler.java:292) > at com.sun.proxy.$Proxy7.rebuild(Unknown Source) > at org.apache.cassandra.tools.NodeProbe.rebuild(NodeProbe.java:929) > at > org.apache.cassandra.tools.NodeTool$Rebuild.execute(NodeTool.java:1595) > at > org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:249) > at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:163) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (CASSANDRA-8806) Some queries with Token restrictions require ALLOW FILTERING and should not
[ https://issues.apache.org/jira/browse/CASSANDRA-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne resolved CASSANDRA-8806. - Resolution: Invalid This is working as expected. The token restriction in this query really has no bearing on whether we require filtering or not, since for all we know, a token restriction can still select the whole ring. So this is the same as: {noformat} SELECT * FROM test WHERE c > 0; {noformat} And what this query does is iterate over all the partitions of the database but only return those rows that have {{c > 0}}. It doesn't really matter that each partition will be eliminated relatively quickly if it has nothing for {{c > 0}}; it is still looked at, and the result is that even if very few rows in the database match the query, that query may take a very, very long time, and that's exactly the kind of thing {{ALLOW FILTERING}} is here to warn about. > Some queries with Token restrictions require ALLOW FILTERING and should not > --- > > Key: CASSANDRA-8806 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8806 > Project: Cassandra > Issue Type: Bug >Reporter: Benjamin Lerer >Assignee: Benjamin Lerer > > Queries like {{SELECT * FROM test WHERE token(a, b) > token(0, 0) AND c > > 10}} require ALLOW FILTERING and should not. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
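For users hitting this, the practical step is simply to opt in to the scan explicitly; for example with the java-driver (the {{session}} and the table from the ticket are assumed for illustration):
{code}
// The token restriction alone does not bound the scan, so Cassandra requires an
// explicit opt-in; adding ALLOW FILTERING acknowledges the potential full-ring scan.
String query = "SELECT * FROM test WHERE token(a, b) > token(0, 0) AND c > 10 ALLOW FILTERING";
com.datastax.driver.core.ResultSet rs = session.execute(query);
for (com.datastax.driver.core.Row row : rs)
    System.out.println(row);
{code}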
[jira] [Commented] (CASSANDRA-7970) JSON support for CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14322549#comment-14322549 ] Sylvain Lebresne commented on CASSANDRA-7970: - Sorry for the lack of communication. I'll have a good look at this but that might only be next week if that's ok (got a long flight next Sunday so I'll review it then). That said, a couple of early remarks (that might be somewhat off since I haven't checked the patch) based on the comments on this ticket so far. bq. I've made the column name declaration optional with doing INSERT JSON I'd actually have a preference for not allowing the column name declaration at all as it doesn't buy us anything and imo having 2 forms is more confusing than anything. Even if we later want to allow both {{VALUES}} and {{JSON}} (which I'm actually kind of against but we can argue later since we've at least agreed on postponing that option), we can introduce the names declaration back later. bq. toJson() can only be used in the selection clause of a SELECT statement, because it can accept any type and the exact argument type must be known. Not 100% sure I see where the problem is on this one, at least in theory. Even if some of our literals can be of multiple types (typically numeric literals), they will always translate to the same thing in JSON anyway so that shouldn't be a problem. As for bind markers, we can do what we do for other functions when there is an ambiguity and require the user to provide a type-cast. Is it just that it's not convenient to do with the current code, or is there something more fundamental I'm missing? bq. fromJson() can only be used in INSERT/UPDATE/DELETE statements because the receiving type must be known in order to parse the JSON correctly. That one I understand, but I'm not sure a per-statement restriction is necessarily the most appropriate because I suppose there is a problem with functions too since we allow overloading (namely, we can have 2 {{foo}} methods, one taking a {{list}} as argument, and the other taking an {{int}}, so {{foo(fromJson(z))}} would be problematic). So the most logical way to handle this for me would be to generalize slightly the notion of "some type" that we already have due to bind markers. Typically, both a bind marker type and the {{fromJson}} return type would be "some type", and when the type checker encounters one and can't resolve it to a single type, it would reject it asking the user to type-cast explicitly. Similarly, the {{toJson()}} argument could be "some type". Again, we already do this for bind markers, it's just a bit ad hoc so it would just be a matter of generalizing it a bit. > JSON support for CQL > > > Key: CASSANDRA-7970 > URL: https://issues.apache.org/jira/browse/CASSANDRA-7970 > Project: Cassandra > Issue Type: New Feature > Components: API >Reporter: Jonathan Ellis >Assignee: Tyler Hobbs > Labels: client-impacting, cql3.3, docs-impacting > Fix For: 3.0 > > Attachments: 7970-trunk-v1.txt > > > JSON is popular enough that not supporting it is becoming a competitive > weakness. We can add JSON support in a way that is compatible with our > performance goals by *mapping* JSON to an existing schema: one JSON documents > maps to one CQL row. > Thus, it is NOT a goal to support schemaless documents, which is a misfeature > [1] [2] [3]. Rather, it is to allow a convenient way to easily turn a JSON > document from a service or a user into a CQL row, with all the validation > that entails. 
> Since we are not looking to support schemaless documents, we will not be > adding a JSON data type (CASSANDRA-6833) a la postgresql. Rather, we will > map the JSON to UDT, collections, and primitive CQL types. > Here's how this might look: > {code} > CREATE TYPE address ( > street text, > city text, > zip_code int, > phones set > ); > CREATE TABLE users ( > id uuid PRIMARY KEY, > name text, > addresses map > ); > INSERT INTO users JSON > {‘id’: 4b856557-7153, >‘name’: ‘jbellis’, >‘address’: {“home”: {“street”: “123 Cassandra Dr”, > “city”: “Austin”, > “zip_code”: 78747, > “phones”: [2101234567]}}}; > SELECT JSON id, address FROM users; > {code} > (We would also want to_json and from_json functions to allow mapping a single > column's worth of data. These would not require extra syntax.) > [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/ > [2] https://blog.compose.io/schema-less-is-usually-a-lie/ > [3] http://dl.acm.org/citation.cfm?id=2481247 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-8811) nodetool rebuild raises EOFException
Rafał Furmański created CASSANDRA-8811: -- Summary: nodetool rebuild raises EOFException Key: CASSANDRA-8811 URL: https://issues.apache.org/jira/browse/CASSANDRA-8811 Project: Cassandra Issue Type: Bug Environment: Debian 7 Wheezy Reporter: Rafał Furmański {noformat} root@db1:~# nodetool rebuild -- Amsterdam error: null -- StackTrace -- java.io.EOFException at java.io.DataInputStream.readByte(DataInputStream.java:267) at sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:214) at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:161) at com.sun.jmx.remote.internal.PRef.invoke(Unknown Source) at javax.management.remote.rmi.RMIConnectionImpl_Stub.invoke(Unknown Source) at javax.management.remote.rmi.RMIConnector$RemoteMBeanServerConnection.invoke(RMIConnector.java:1022) at javax.management.MBeanServerInvocationHandler.invoke(MBeanServerInvocationHandler.java:292) at com.sun.proxy.$Proxy7.rebuild(Unknown Source) at org.apache.cassandra.tools.NodeProbe.rebuild(NodeProbe.java:929) at org.apache.cassandra.tools.NodeTool$Rebuild.execute(NodeTool.java:1595) at org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:249) at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:163) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8779) Able to unintentionally nest tuples during insert
[ https://issues.apache.org/jira/browse/CASSANDRA-8779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14322522#comment-14322522 ] Sylvain Lebresne commented on CASSANDRA-8779: - I like it. bq. or we could be extra safe and require it for Execute messages I'd have a small preference for leaving it to queries with parameters (it feels wasteful in other cases if we make it mandatory, and it doesn't buy much safety (but adds complexity) if it's optional) and making it non-optional for v4 onwards. > Able to unintentionally nest tuples during insert > - > > Key: CASSANDRA-8779 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8779 > Project: Cassandra > Issue Type: Bug > Environment: Linux Mint 64-bit | ruby-driver 2.1 | java-driver 2.1 | > C* 2.1.2 >Reporter: Kishan Karunaratne >Assignee: Tyler Hobbs > > If I insert a tuple using an extra pair of ()'s, C* will let me do the > insert, but (incorrectly) creates a nested tuple as the first tuple value. > Upon doing a select statement, the result is jumbled and has weird binary in > it (which I wasn't able to copy into here). > Example using ruby-driver: > {noformat} > session.execute("CREATE TABLE mytable (a int PRIMARY KEY, b > frozen>)") > complete = Cassandra::Tuple.new('foo', 123, true) > session.execute("INSERT INTO mytable (a, b) VALUES (0, (?))", arguments: > [complete])# extra ()'s here > result = session.execute("SELECT b FROM mytable WHERE a=0").first > p result['b'] > {noformat} > Output: > {noformat} > # > {noformat} > Bug also confirmed using java-driver. > Example using java-driver: > {noformat} > session.execute("CREATE TABLE mytable (a int PRIMARY KEY, b > frozen>)"); > TupleType t = TupleType.of(DataType.ascii(), DataType.cint(), > DataType.cboolean()); > TupleValue complete = t.newValue("foo", 123, true); > session.execute("INSERT INTO mytable (a, b) VALUES (0, (?))", complete); // > extra ()'s here > TupleValue r = session.execute("SELECT b FROM mytable WHERE > a=0").one().getTupleValue("b"); > System.out.println(r); > {noformat} > Output: > {noformat} > ('foo{', null, null) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)