[jira] [Commented] (CASSANDRA-10303) streaming for 'nodetool rebuild' fails after adding a datacenter
[ https://issues.apache.org/jira/browse/CASSANDRA-10303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744870#comment-14744870 ]

zhaoyan commented on CASSANDRA-10303:
-------------------------------------

I am sorry, I have no environment to reproduce this with a fresh cluster.

> streaming for 'nodetool rebuild' fails after adding a datacenter
> ----------------------------------------------------------------
>
>                 Key: CASSANDRA-10303
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10303
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: jdk1.7
>                      cassandra 2.1.8
>            Reporter: zhaoyan
>
> We added another datacenter and ran "nodetool rebuild DC1".
> Streaming from some nodes of the old datacenter always hangs with this exception:
> {code}
> ERROR [Thread-1472] 2015-09-10 19:24:53,091 CassandraDaemon.java:223 - Exception in thread Thread[Thread-1472,5,RMI Runtime]
> java.lang.RuntimeException: java.io.IOException: Connection timed out
>     at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.jar:na]
>     at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32) ~[apache-cassandra-2.1.8.jar:2.1.8]
>     at java.lang.Thread.run(Thread.java:745) ~[na:1.7.0_60]
> Caused by: java.io.IOException: Connection timed out
>     at sun.nio.ch.FileDispatcherImpl.read0(Native Method) ~[na:1.7.0_60]
>     at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) ~[na:1.7.0_60]
>     at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) ~[na:1.7.0_60]
>     at sun.nio.ch.IOUtil.read(IOUtil.java:197) ~[na:1.7.0_60]
>     at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379) ~[na:1.7.0_60]
>     at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:59) ~[na:1.7.0_60]
>     at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:109) ~[na:1.7.0_60]
>     at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103) ~[na:1.7.0_60]
>     at org.apache.cassandra.streaming.compress.CompressedInputStream$Reader.runMayThrow(CompressedInputStream.java:172) ~[apache-cassandra-2.1.8.jar:2.1.8]
>     at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[apache-cassandra-2.1.8.jar:2.1.8]
>     ... 1 common frames omitted
> {code}
> I must restart the node to stop the current rebuild, then rebuild again and again until it succeeds.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
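For readers hitting similar hangs: the stack trace shows a blocking socket read that never returns once the peer goes silent. The sketch below is not Cassandra code — it is a minimal, self-contained Python illustration (the helper name is invented) of how a read timeout turns such an indefinite hang into a catchable error instead:

```python
import socket
import threading

def stalled_read_raises_timeout():
    """Return True if a read timeout fires against a silent peer."""
    ready = threading.Event()
    hold = threading.Event()   # never set: keeps the server side open
    port = []

    def silent_server():
        # Accepts one connection but never sends a byte, mimicking a
        # streaming peer whose connection has silently stalled.
        srv = socket.socket()
        srv.bind(("127.0.0.1", 0))
        srv.listen(1)
        port.append(srv.getsockname()[1])
        ready.set()
        conn, _ = srv.accept()
        hold.wait(5)           # keep the connection open long enough

    threading.Thread(target=silent_server, daemon=True).start()
    ready.wait()

    client = socket.socket()
    client.connect(("127.0.0.1", port[0]))
    client.settimeout(0.2)     # without this, recv() blocks forever
    try:
        client.recv(1024)
        return False           # would mean the peer actually sent data
    except socket.timeout:
        return True            # stall detected; caller can retry
    finally:
        client.close()
```

If memory serves, the analogous knob for streaming connections in this era of Cassandra was `streaming_socket_timeout_in_ms` in cassandra.yaml, which defaulted to 0 (no timeout) — consistent with rebuild streams hanging indefinitely as described above; treat that detail as something to verify against your version.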
[jira] [Commented] (CASSANDRA-10292) java.lang.AssertionError: attempted to delete non-existing file CommitLog...
[ https://issues.apache.org/jira/browse/CASSANDRA-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744868#comment-14744868 ]

Dawid Szejnfeld commented on CASSANDRA-10292:
---------------------------------------------

Hi David, actually I just upgraded from 2.1.8 to the next stable version, 2.1.9, and the problem disappeared. That's all I did.

> java.lang.AssertionError: attempted to delete non-existing file CommitLog...
> ----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-10292
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10292
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: CentOS Linux 7.1.1503, Cassandra 2.1.8 stable version, 6 nodes cluster
>            Reporter: Dawid Szejnfeld
>            Priority: Critical
>
> From time to time some nodes stop working due to errors in the logs like this:
> INFO [CompactionExecutor:2475] 2015-09-09 12:36:50,363 CompactionTask.java:274 - Compacted 4 sstables to [/mnt/cassandra--storage-machine/data/system/compactions_in_progress-55080ab05d9c388690a4acb25fe1f77b/system-compactions_in_progress-ka-126,]. 419 bytes to 42 (~10% of original) in 33ms = 0.001214MB/s. 4 total partitions merged to 1. Partition merge counts were {2:2, }
> INFO [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:34,166 ColumnFamilyStore.java:912 - Enqueuing flush of settings: 78364 (0%) on-heap, 0 (0%) off-heap
> INFO [MemtableFlushWriter:301] 2015-09-09 12:52:34,172 Memtable.java:347 - Writing Memtable-settings@1126939979(0.113KiB serialized bytes, 1850 ops, 0%/0% of on/off-heap limit)
> INFO [MemtableFlushWriter:301] 2015-09-09 12:52:34,174 Memtable.java:382 - Completed flushing /mnt/cassandra--storage-machine/data/OpsCenter/settings-464866c04b1311e590698d1a9fd4ba8b/OpsCenter-settings-tmp-ka-12-Data.db (0.000KiB) for commitlog position ReplayPosition(segmentId=1441362636571, position=33554415)
> ERROR [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:34,194 StorageService.java:453 - Stopping gossiper
> WARN [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:34,195 StorageService.java:359 - Stopping gossip by operator request
> INFO [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:34,195 Gossiper.java:1410 - Announcing shutdown
> ERROR [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:36,195 StorageService.java:458 - Stopping RPC server
> INFO [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:36,196 ThriftServer.java:142 - Stop listening to thrift clients
> ERROR [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:36,204 StorageService.java:463 - Stopping native transport
> INFO [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:36,422 Server.java:213 - Stop listening for CQL clients
> ERROR [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:36,423 CommitLog.java:397 - Failed managing commit log segments. Commit disk failure policy is stop; terminating thread
> java.lang.AssertionError: attempted to delete non-existing file CommitLog-4-1441362636316.log
>     at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:126) ~[apache-cassandra-2.1.8.jar:2.1.8]
>     at org.apache.cassandra.db.commitlog.CommitLogSegment.delete(CommitLogSegment.java:343) ~[apache-cassandra-2.1.8.jar:2.1.8]
>     at org.apache.cassandra.db.commitlog.CommitLogSegmentManager$5.call(CommitLogSegmentManager.java:418) ~[apache-cassandra-2.1.8.jar:2.1.8]
>     at org.apache.cassandra.db.commitlog.CommitLogSegmentManager$5.call(CommitLogSegmentManager.java:413) ~[apache-cassandra-2.1.8.jar:2.1.8]
>     at org.apache.cassandra.db.commitlog.CommitLogSegmentManager$1.runMayThrow(CommitLogSegmentManager.java:152) ~[apache-cassandra-2.1.8.jar:2.1.8]
>     at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) [apache-cassandra-2.1.8.jar:2.1.8]
>     at java.lang.Thread.run(Thread.java:745) [na:1.7.0_85]
> After I create the missing commit log file and restart the Cassandra service, everything is OK.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
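For context, the assertion in the trace comes from Cassandra's {{FileUtils.deleteWithConfirm}}, which treats deleting an already-missing file as a bug rather than a silent no-op (hence the "stop" disk failure policy kicking in). A rough Python sketch of that contract — illustrative only, the real code is Java:

```python
import os
import tempfile

def delete_with_confirm(path):
    # Mirror the contract seen in the stack trace: deleting a file that
    # is already gone is treated as an invariant violation, not ignored.
    if not os.path.exists(path):
        raise AssertionError("attempted to delete non-existing file " + path)
    os.remove(path)

# Deleting an existing segment succeeds; a second delete of the same
# path raises, just like the CommitLog error above when the segment
# file vanished out from under the segment manager.
fd, segment = tempfile.mkstemp(suffix=".log")
os.close(fd)
delete_with_confirm(segment)       # ok: file exists
try:
    delete_with_confirm(segment)   # already gone -> AssertionError
    raised = False
except AssertionError:
    raised = True
```

This also explains the reporter's workaround: recreating the missing file satisfies the existence check long enough for the segment manager to delete it cleanly.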
[jira] [Commented] (CASSANDRA-10303) streaming for 'nodetool rebuild' fails after adding a datacenter
[ https://issues.apache.org/jira/browse/CASSANDRA-10303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744867#comment-14744867 ]

zhaoyan commented on CASSANDRA-10303:
-------------------------------------

My data is very big: one node has 1 TB of data, and I have six nodes in the cluster.

> streaming for 'nodetool rebuild' fails after adding a datacenter

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10303) streaming for 'nodetool rebuild' fails after adding a datacenter
[ https://issues.apache.org/jira/browse/CASSANDRA-10303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744864#comment-14744864 ]

zhaoyan commented on CASSANDRA-10303:
-------------------------------------

I restarted and rebuilt again and again; one node succeeded after many attempts.

> streaming for 'nodetool rebuild' fails after adding a datacenter

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Created] (CASSANDRA-10336) SinglePartitionSliceCommandTest.staticColumnsAreReturned flappy (3.0)
Robert Stupp created CASSANDRA-10336: Summary: SinglePartitionSliceCommandTest.staticColumnsAreReturned flappy (3.0) Key: CASSANDRA-10336 URL: https://issues.apache.org/jira/browse/CASSANDRA-10336 Project: Cassandra Issue Type: Bug Reporter: Robert Stupp UTest {{SinglePartitionSliceCommandTest.staticColumnsAreReturned}} seems to flap. The test failed during build [#110|http://cassci.datastax.com/view/cassandra-3.0/job/cassandra-3.0_testall/110], 112, 115, 116, 118, 123, 125 and 128 (didn't check before build 95) It _might_ be related to the changes of CASSANDRA-10232 - but I don't get why it flaps. /cc [~slebresne] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10318) Update cqlsh COPY for new internal driver serialization interface
[ https://issues.apache.org/jira/browse/CASSANDRA-10318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744796#comment-14744796 ]

Stefania commented on CASSANDRA-10318:
--------------------------------------

There is another problem in addition to the serialization interface change: these two lines were also broken:

{code}
conn._callbacks[request_id] = partial(callback, current_record)
conn.deque.append(binary_message)
{code}

They must be replaced with:

{code}
conn._requests[request_id] = (partial(callback, current_record), ProtocolHandler.decode_message)
conn.push(binary_message)
{code}

However, I think we can just call {{conn.send_msg}}. [~aholmber], can you take a look at [this commit|https://github.com/stef1927/cassandra/commit/439332ae64290b87a7abd8474b580f49af1d7959] and confirm that {{conn.send_msg}} is equivalent to the code that is commented out? (The cluster is created with DEFAULT_PROTOCOL_VERSION and no compression.)

CI: http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-10318-3.0-dtest/

> Update cqlsh COPY for new internal driver serialization interface
> -----------------------------------------------------------------
>
>                 Key: CASSANDRA-10318
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10318
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Adam Holmberg
>            Assignee: Stefania
>             Fix For: 3.0.0 rc1
>
> A recent driver update changed some of the internal serialization interface. cqlsh relies on this for the COPY command and will need to be updated.
> Previously it used
> {code}
> cassandra.protocol.QueryMessage.to_binary
> {code}
> and now should use
> {code}
> cassandra.protocol.ProtocolHandler.encode_message(...)
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
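To make the shape of the fix concrete: the driver change means a request id must map to a (callback, decoder) pair rather than a bare callback. The toy model below uses an invented class, not the actual python-driver internals, to show why registering only the callback breaks once the connection expects to decode the response before invoking it:

```python
from functools import partial

class ToyConnection:
    """Minimal stand-in for a driver connection: each response is
    decoded first, then handed to the callback registered for its id."""
    def __init__(self):
        self._requests = {}

    def send_msg(self, request_id, callback, decoder):
        # The fixed COPY code registers the pair in one place, which is
        # why a single send_msg-style call is the cleaner option.
        self._requests[request_id] = (callback, decoder)

    def deliver(self, request_id, raw):
        # Unpacking fails if a bare callback was stored instead of a
        # (callback, decoder) tuple -- the breakage described above.
        callback, decoder = self._requests.pop(request_id)
        callback(decoder(raw))

results = []
def on_result(record, decoded):
    # partial() pre-binds the record being copied, as in the COPY code.
    results.append((record, decoded))

conn = ToyConnection()
conn.send_msg(42, partial(on_result, "row-1"), lambda raw: raw.upper())
conn.deliver(42, "ok")   # results now holds [("row-1", "OK")]
```

All names here ({{ToyConnection}}, {{deliver}}, {{on_result}}) are hypothetical; only the tuple-registration pattern is taken from the snippet in the comment above.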
[jira] [Commented] (CASSANDRA-10238) Consolidating racks violates the RF contract
[ https://issues.apache.org/jira/browse/CASSANDRA-10238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744732#comment-14744732 ]

Blake Eggleston commented on CASSANDRA-10238:
---------------------------------------------

Great, everything else LGTM.

> Consolidating racks violates the RF contract
> --------------------------------------------
>
>                 Key: CASSANDRA-10238
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10238
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Brandon Williams
>            Assignee: Stefania
>            Priority: Critical
>
> I have only tested this on 2.0 so far, but I suspect it will affect multiple versions.
> Repro:
> * create a datacenter with rf>1
> * create more than one rack in this datacenter
> * consolidate these racks into 1
> * getendpoints will reveal the RF in practice is 1, even though other tools will report the original RF that was set
> Restarting Cassandra will resolve this.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8072) Exception during startup: Unable to gossip with any seeds
[ https://issues.apache.org/jira/browse/CASSANDRA-8072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744717#comment-14744717 ]

Stefania commented on CASSANDRA-8072:
-------------------------------------

Without having looked at this in much detail yet, it seems very similar to CASSANDRA-10205. It's worth trying out that patch to see if it solves this as well.

Summary of 10205: after a node is decommissioned, its data wiped, and the node restarted, it never receives gossip replies from seeds during the shadow round and fails to start. This is because the node is not marked as dead and the socket is not closed properly. Marking the node as dead even when the status is LEFT closes the socket, and the test of 10205 passes. The patch is for 3.0 but the same problem occurs on 2.0+.

> Exception during startup: Unable to gossip with any seeds
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-8072
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8072
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Ryan Springer
>            Assignee: Stefania
>             Fix For: 2.1.x
>
>         Attachments: cas-dev-dt-01-uw1-cassandra-seed01_logs.tar.bz2, cas-dev-dt-01-uw1-cassandra-seed02_logs.tar.bz2, cas-dev-dt-01-uw1-cassandra02_logs.tar.bz2, casandra-system-log-with-assert-patch.log, screenshot-1.png, trace_logs.tar.bz2
>
> When OpsCenter 4.1.4 or 5.0.1 tries to provision a 2-node DSC 2.0.10 cluster in either ec2 or locally, an error sometimes occurs, with one of the nodes refusing to start C*.
> The error in /var/log/cassandra/system.log is:
> ERROR [main] 2014-10-06 15:54:52,292 CassandraDaemon.java (line 513) Exception encountered during startup
> java.lang.RuntimeException: Unable to gossip with any seeds
>     at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1200)
>     at org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:444)
>     at org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:655)
>     at org.apache.cassandra.service.StorageService.initServer(StorageService.java:609)
>     at org.apache.cassandra.service.StorageService.initServer(StorageService.java:502)
>     at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:378)
>     at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:496)
>     at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:585)
> INFO [StorageServiceShutdownHook] 2014-10-06 15:54:52,326 Gossiper.java (line 1279) Announcing shutdown
> INFO [StorageServiceShutdownHook] 2014-10-06 15:54:54,326 MessagingService.java (line 701) Waiting for messaging service to quiesce
> INFO [ACCEPT-localhost/127.0.0.1] 2014-10-06 15:54:54,327 MessagingService.java (line 941) MessagingService has terminated the accept() thread
> This error does not always occur when provisioning a 2-node cluster, but probably around half of the time on only one of the nodes. I haven't been able to reproduce this error with DSC 2.0.9, and there have been no code or definition file changes in OpsCenter.
> I can reproduce locally with the above steps. I'm happy to test any proposed fixes since I'm the only person able to reproduce reliably so far.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10264) Unable to use conditions on static columns for DELETE
[ https://issues.apache.org/jira/browse/CASSANDRA-10264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744708#comment-14744708 ]

Stefania commented on CASSANDRA-10264:
--------------------------------------

The error comes from code added by CASSANDRA-6237 in {{StatementRestrictions.java}}; I think [~blerer] should take a look first.

> Unable to use conditions on static columns for DELETE
> -----------------------------------------------------
>
>                 Key: CASSANDRA-10264
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10264
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: Cassandra 2.2.0
>            Reporter: DOAN DuyHai
>
> {noformat}
> cqlsh:test> create table static_table(id int, stat int static, ord int, val text, primary key(id,ord));
> cqlsh:test> insert into static_table (id,stat,ord,val) VALUES ( 1, 1, 1, '1');
> cqlsh:test> delete from static_table where id=1 and ord=1 if stat != 1;
> Invalid syntax at line 1, char 55
>   delete from static_table where id=1 and ord=1 if stat != 1;
>                                                        ^
> {noformat}
> The same error occurs with the =, <, <=, >= and > conditions.
> According to [~thobbs] the syntax should work. Plus, the error message is wrong.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (CASSANDRA-10264) Unable to use conditions on static columns for DELETE
[ https://issues.apache.org/jira/browse/CASSANDRA-10264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stefania updated CASSANDRA-10264:
---------------------------------
    Assignee: Benjamin Lerer

> Unable to use conditions on static columns for DELETE

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10238) Consolidating racks violates the RF contract
[ https://issues.apache.org/jira/browse/CASSANDRA-10238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744706#comment-14744706 ]

Stefania commented on CASSANDRA-10238:
--------------------------------------

Thanks, PFS is covered by our tests in 2.0; GPFS will only be tested in 2.1+.

> Consolidating racks violates the RF contract

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (CASSANDRA-10238) Consolidating racks violates the RF contract
[ https://issues.apache.org/jira/browse/CASSANDRA-10238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams updated CASSANDRA-10238:
-----------------------------------------
    Assignee: Stefania  (was: Brandon Williams)

> Consolidating racks violates the RF contract

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10238) Consolidating racks violates the RF contract
[ https://issues.apache.org/jira/browse/CASSANDRA-10238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744700#comment-14744700 ]

Brandon Williams commented on CASSANDRA-10238:
----------------------------------------------

PFS was used for the initial report.

> Consolidating racks violates the RF contract

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Assigned] (CASSANDRA-10238) Consolidating racks violates the RF contract
[ https://issues.apache.org/jira/browse/CASSANDRA-10238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams reassigned CASSANDRA-10238:
--------------------------------------------
    Assignee: Brandon Williams  (was: Stefania)

> Consolidating racks violates the RF contract

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10238) Consolidating racks violates the RF contract
[ https://issues.apache.org/jira/browse/CASSANDRA-10238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744697#comment-14744697 ]

Stefania commented on CASSANDRA-10238:
--------------------------------------

CI results will eventually appear here:

http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-10238-2.0-testall/
http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-10238-2.0-dtest/
http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-10238-2.1-testall/
http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-10238-2.1-dtest/
http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-10238-2.2-testall/
http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-10238-2.2-dtest/
http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-10238-3.0-testall/
http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-10238-3.0-dtest/

> Consolidating racks violates the RF contract

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10238) Consolidating racks violates the RF contract
[ https://issues.apache.org/jira/browse/CASSANDRA-10238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744695#comment-14744695 ]

Stefania commented on CASSANDRA-10238:
--------------------------------------

Thanks for the review. I was relying on Gossip to eventually propagate the changes from other hosts. I have now closed this window for {{YamlFileNetworkTopologySnitch}} and {{PropertyFileSnitch}} by updating all endpoints whose DC or rack has changed in the configuration file.

I've also prepared the 2.0, 2.2 and 3.0 patches, see attached. They are identical except:
* {{GossipingPropertyFileSnitch}} does not seem capable of updating its configuration in 2.0
* {{YamlFileNetworkTopologySnitch}} was deleted in 2.2

[~brandon.williams], what snitch was in use when the original problem was reported on 2.0?

> Consolidating racks violates the RF contract

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
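A toy placement function makes it easier to see how a stale rack view can shrink the effective replica set. The sketch below is hypothetical and deliberately simplified — it is not Cassandra's NetworkTopologyStrategy — but it keeps the same idea of preferring one replica per distinct rack, so two nodes holding inconsistent rack views can compute replica sets that overlap in only one node:

```python
def pick_replicas(ring, racks, rf):
    """Pick rf replicas walking the ring in order, preferring one
    replica per distinct rack, then filling from the remaining nodes.
    A simplified rack-aware placement, not Cassandra's actual code."""
    replicas, seen_racks = [], set()
    for node in ring:
        if racks[node] not in seen_racks:
            replicas.append(node)
            seen_racks.add(racks[node])
        if len(replicas) == rf:
            return replicas
    for node in ring:  # not enough distinct racks: fill with the rest
        if node not in replicas:
            replicas.append(node)
        if len(replicas) == rf:
            return replicas
    return replicas

ring = ["n1", "n3", "n2", "n4"]                     # hypothetical ring order
old_view = {"n1": "r1", "n3": "r1", "n2": "r2", "n4": "r2"}  # two racks
new_view = {n: "r1" for n in ring}                  # racks consolidated

before = pick_replicas(ring, old_view, 2)   # n1 (r1), skip n3, n2 (r2)
after = pick_replicas(ring, new_view, 2)    # n1, then fill with n3
overlap = set(before) & set(after)          # the sets agree on one node
```

Until every endpoint picks up the consolidated rack view (or the node restarts, as in the repro), requests routed by the stale view can land on nodes that the updated view never wrote to — consistent with getendpoints reporting an effective RF of 1.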
[jira] [Created] (CASSANDRA-10335) Collect flight recordings of canonical bulk read workload in CI
Ariel Weisberg created CASSANDRA-10335: -- Summary: Collect flight recordings of canonical bulk read workload in CI Key: CASSANDRA-10335 URL: https://issues.apache.org/jira/browse/CASSANDRA-10335 Project: Cassandra Issue Type: Sub-task Reporter: Ariel Weisberg Flight recorder to track GC, IO stalls, lock contention, idle threads. Don't need CPU profiling since that will be covered by flame graphs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-10334) Generate flame graphs from canonical bulk reading workload running in CI
Ariel Weisberg created CASSANDRA-10334: -- Summary: Generate flame graphs from canonical bulk reading workload running in CI Key: CASSANDRA-10334 URL: https://issues.apache.org/jira/browse/CASSANDRA-10334 Project: Cassandra Issue Type: Sub-task Reporter: Ariel Weisberg Flame graphs for CPU utilization. Bonus points if we can get source code annotated with cache misses or at least total counts of cache misses for the entire run. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-10333) Collect SAR metrics from CI jobs running canonical bulk reading workload
Ariel Weisberg created CASSANDRA-10333: -- Summary: Collect SAR metrics from CI jobs running canonical bulk reading workload Key: CASSANDRA-10333 URL: https://issues.apache.org/jira/browse/CASSANDRA-10333 Project: Cassandra Issue Type: Sub-task Reporter: Ariel Weisberg sar to track block device metrics, interrupts, context switches etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-10332) Run canonical bulk reading workload nightly in CI and provide a dashboard of the result
Ariel Weisberg created CASSANDRA-10332: -- Summary: Run canonical bulk reading workload nightly in CI and provide a dashboard of the result Key: CASSANDRA-10332 URL: https://issues.apache.org/jira/browse/CASSANDRA-10332 Project: Cassandra Issue Type: Sub-task Reporter: Ariel Weisberg Run on trunk and release branches. Make it possible to opt in other branches. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-10331) Establish and implement canonical bulk reading workload(s)
Ariel Weisberg created CASSANDRA-10331: -- Summary: Establish and implement canonical bulk reading workload(s) Key: CASSANDRA-10331 URL: https://issues.apache.org/jira/browse/CASSANDRA-10331 Project: Cassandra Issue Type: Sub-task Reporter: Ariel Weisberg Implement a client, use stress, or extend stress to a bulk reading workload that is indicative of the performance we are trying to improve. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-10330) Gossipinfo could return more useful information
[ https://issues.apache.org/jira/browse/CASSANDRA-10330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams updated CASSANDRA-10330: - Attachment: 10330.txt Patch to do this. > Gossipinfo could return more useful information > --- > > Key: CASSANDRA-10330 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10330 > Project: Cassandra > Issue Type: Improvement >Reporter: Brandon Williams >Assignee: Brandon Williams >Priority: Minor > Attachments: 10330.txt > > > For instance, the version for each state, which can be useful for diagnosing > the reason for any missing states. Also instead of just omitting the TOKENS > state, let's indicate whether the state was actually present or not. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-10107) Windows dtest 3.0: TestScrub and TestScrubIndexes failures
[ https://issues.apache.org/jira/browse/CASSANDRA-10107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Shuler updated CASSANDRA-10107: --- Component/s: (was: dows dtest 3.0: TestScrub / TestScrubIndexes failures) > Windows dtest 3.0: TestScrub and TestScrubIndexes failures > -- > > Key: CASSANDRA-10107 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10107 > Project: Cassandra > Issue Type: Sub-task >Reporter: Joshua McKenzie >Assignee: Joshua McKenzie > Labels: Windows > > scrub_test.py:TestScrub.test_standalone_scrub > scrub_test.py:TestScrub.test_standalone_scrub_essential_files_only > scrub_test.py:TestScrubIndexes.test_standalone_scrub > Somewhat different messages between CI and local, but consistent on env. > Locally, I see: > {noformat} > dtest: DEBUG: ERROR 20:41:20 This platform does not support atomic directory > streams (SecureDirectoryStream); race conditions when loading sstable files > could occurr > {noformat} > Consistently fails, both on CI and locally. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8661) We don't detect OOM when allocating a Memory object
[ https://issues.apache.org/jira/browse/CASSANDRA-8661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Shuler updated CASSANDRA-8661: -- Component/s: (was: e) > We don't detect OOM when allocating a Memory object > --- > > Key: CASSANDRA-8661 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8661 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Benedict >Assignee: Benedict > Fix For: 2.1.3, 2.0.13 > > Attachments: 8661 > > > This affects OffHeapBitSet, and is a likely explanation for the SIGSEGV that > was reported to IRC last night. > We also don't check this in NativeAllocator, so I've made a change there as > well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10067) Hadoop2 jobs throw java.lang.IncompatibleClassChangeError
[ https://issues.apache.org/jira/browse/CASSANDRA-10067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744514#comment-14744514 ] Jim Witschey commented on CASSANDRA-10067: -- Maybe a merge error? https://github.com/apache/cassandra/commit/304260d04e9a7f30aa11a13d40083448812c22a8#diff-5fd299acb7c9c1991848ae139ed2285fL58 This merge came shortly after the patch was committed; see the file's history here: https://github.com/apache/cassandra/commits/trunk/src/java/org/apache/cassandra/hadoop/cql3/CqlInputFormat.java > Hadoop2 jobs throw java.lang.IncompatibleClassChangeError > - > > Key: CASSANDRA-10067 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10067 > Project: Cassandra > Issue Type: Bug > Components: Hadoop >Reporter: Ashley Taylor > Fix For: 2.1.x, 2.2.x > > > CqlInputFormat throws a java.lang.IncompatibleClassChangeError when run with > Hadoop2. An earlier commit addressing this problem seems not to have been > merged. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10067) Hadoop2 jobs throw java.lang.IncompatibleClassChangeError
[ https://issues.apache.org/jira/browse/CASSANDRA-10067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744494#comment-14744494 ] Brandon Williams commented on CASSANDRA-10067: -- As I recall when I looked at this the contents were in the versions CASSANDRA-7229 was tagged for, and the commit id was still part of the current branches, but I couldn't figure out how or when the actual content was removed. > Hadoop2 jobs throw java.lang.IncompatibleClassChangeError > - > > Key: CASSANDRA-10067 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10067 > Project: Cassandra > Issue Type: Bug > Components: Hadoop >Reporter: Ashley Taylor > Fix For: 2.1.x, 2.2.x > > > CqlInputFormat throws a java.lang.IncompatibleClassChangeError when run with > Hadoop2. An earlier commit addressing this problem seems not to have been > merged. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-10330) Gossipinfo could return more useful information
Brandon Williams created CASSANDRA-10330: Summary: Gossipinfo could return more useful information Key: CASSANDRA-10330 URL: https://issues.apache.org/jira/browse/CASSANDRA-10330 Project: Cassandra Issue Type: Improvement Reporter: Brandon Williams Assignee: Brandon Williams Priority: Minor For instance, the version for each state, which can be useful for diagnosing the reason for any missing states. Also instead of just omitting the TOKENS state, let's indicate whether the state was actually present or not. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10067) Hadoop2 jobs throw java.lang.IncompatibleClassChangeError
[ https://issues.apache.org/jira/browse/CASSANDRA-10067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744490#comment-14744490 ] Jim Witschey commented on CASSANDRA-10067: -- I can confirm that the contents of the patch attached to CASSANDRA-7229 aren't in the codebase for 2.1, 2.2, 3.0, or trunk. [~brandon.williams] if you remember: was that intentional? > Hadoop2 jobs throw java.lang.IncompatibleClassChangeError > - > > Key: CASSANDRA-10067 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10067 > Project: Cassandra > Issue Type: Bug > Components: Hadoop >Reporter: Ashley Taylor > Fix For: 2.1.x, 2.2.x > > > CqlInputFormat throws a java.lang.IncompatibleClassChangeError when run with > Hadoop2. An earlier commit addressing this problem seems not to have been > merged. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-10329) Improve CAS testing during range movements
Brandon Williams created CASSANDRA-10329: Summary: Improve CAS testing during range movements Key: CASSANDRA-10329 URL: https://issues.apache.org/jira/browse/CASSANDRA-10329 Project: Cassandra Issue Type: Test Components: Tests Reporter: Brandon Williams Assignee: Ryan McGuire I've heard reports of increased timeouts with CAS specifically during topology changes. Let's beef up the CAS testing during these to see if there are any problems. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10264) Unable to use conditions on static columns for DELETE
[ https://issues.apache.org/jira/browse/CASSANDRA-10264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744466#comment-14744466 ] Jim Witschey commented on CASSANDRA-10264: -- Calling this via the Python driver on Cassandra 2.1 and up results in the error message {{Invalid restriction on clustering column ord since the DELETE statement modifies only static columns}}. I'll need a developer to have a look to see if this behavior is correct. I have reproduced this through cqlsh, though, so I think this is either an issue in the bundled driver or in cqlsh. [~Stefania] can you take this or recommend someone I can assign this to? I've reproduced this on a scratch branch: https://github.com/mambocab/cassandra-dtest/tree/CASSANDRA-10264 > Unable to use conditions on static columns for DELETE > - > > Key: CASSANDRA-10264 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10264 > Project: Cassandra > Issue Type: Bug > Components: Core > Environment: Cassandra 2.2.0 >Reporter: DOAN DuyHai > > {noformat} > cqlsh:test> create table static_table(id int, stat int static, ord int, val > text, primary key(id,ord)); > cqlsh:test> insert into static_table (id,stat,ord,val) VALUES ( 1, 1, 1, '1'); > cqlsh:test> delete from static_table where id=1 and ord=1 if stat != 1; > Invalid syntax at line 1, char 55 > delete from static_table where id=1 and ord=1 if stat != 1; > ^ > {noformat} > Same error if using =, <, <=, >= or > condition > According to [~thobbs] the syntax should work. Plus, the error message is > wrong -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10238) Consolidating racks violates the RF contract
[ https://issues.apache.org/jira/browse/CASSANDRA-10238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744428#comment-14744428 ] Blake Eggleston commented on CASSANDRA-10238: - I think there's still a window of time where the RF contract violation is possible. In {{YamlFileNetworkTopologySnitch}} and {{PropertyFileSnitch}}, {{StorageService.instance.updateTopology}} is only called for the local node, although any of the nodes could have been updated. Assuming topology file changes are deployed to all nodes simultaneously, this would mean that the snitch and TokenMetadata will be out of sync until the node which has moved picks up the changes in its topology file and its new application state propagates to the out-of-sync node. Calling {{StorageService.instance.updateTopology}} on any endpoint that changed should fix that. > Consolidating racks violates the RF contract > > > Key: CASSANDRA-10238 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10238 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Brandon Williams >Assignee: Stefania >Priority: Critical > > I have only tested this on 2.0 so far, but I suspect it will affect multiple > versions. > Repro: > * create a datacenter with rf>1 > * create more than one rack in this datacenter > * consolidate these racks into 1 > * getendpoints will reveal the RF in practice is 1, even though other tools > will report the original RF that was set > Restarting Cassandra will resolve this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
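The race described above can be sketched as a toy model: refreshing only the local endpoint on a topology-file reload leaves every other changed endpoint stale until gossip catches up, while refreshing all changed endpoints does not. This is an illustrative Python sketch with hypothetical names, not the actual {{PropertyFileSnitch}}/{{TokenMetadata}} code:

```python
# Toy model of the reload race: a snitch reload that calls updateTopology
# only for the local node leaves TokenMetadata stale for every other
# endpoint whose rack changed. Names are illustrative, not the real API.

class TokenMetadata:
    def __init__(self, racks):
        self.racks = dict(racks)          # endpoint -> rack, as last seen

    def update_topology(self, endpoint, snitch):
        self.racks[endpoint] = snitch.rack_of(endpoint)

class Snitch:
    def __init__(self, topology):
        self.topology = dict(topology)    # endpoint -> rack, from the file

    def rack_of(self, endpoint):
        return self.topology[endpoint]

    def reload(self, new_topology, local, token_metadata, all_changed):
        self.topology = dict(new_topology)
        endpoints = list(self.topology) if all_changed else [local]
        for ep in endpoints:
            token_metadata.update_topology(ep, self)

# Racks r1/r2 are consolidated into r1 in every node's topology file.
old = {"n1": "r1", "n2": "r2"}
new = {"n1": "r1", "n2": "r1"}

# Buggy behavior: only the local node ("n1") is refreshed.
tm = TokenMetadata(old)
Snitch(old).reload(new, "n1", tm, all_changed=False)
stale = tm.racks["n2"] != new["n2"]   # True: n2 is out of sync

# Proposed fix: refresh every endpoint present in the new file.
tm2 = TokenMetadata(old)
Snitch(old).reload(new, "n1", tm2, all_changed=True)
fixed = tm2.racks["n2"] == new["n2"]  # True
```

The window closes only when the moved node's own reload gossips its new state out, which is exactly the delay the fix avoids.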
[jira] [Commented] (CASSANDRA-10278) CQLSH version is not supported by Remote
[ https://issues.apache.org/jira/browse/CASSANDRA-10278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744419#comment-14744419 ] Jim Witschey commented on CASSANDRA-10278: -- [~puneet] Is this still a problem for you? I can't reproduce locally with ccm: {code} $ ccm create repro-10278 -v binary:2.1.9 -n 3 ; ccm start --wait-for-binary-proto ; ccm node1 cqlsh Current cluster is now: repro-10278 Connected to repro-10278 at 127.0.0.1:9042. [cqlsh 5.0.1 | Cassandra 2.1.9 | CQL spec 3.2.0 | Native protocol v3] Use HELP for help. cqlsh> $ {code} This indicates to me that Cassandra 2.1.9's binary distribution uses the correct CQL protocol version, so I don't think your explanation is valid. Carl pointed out that {{cqlsh}} connects to 9042 by default. What command did you use to start {{cqlsh}}? Unless you specified the port as 9046 when you started {{cqlsh}}, you likely connected to a 2.2 node on 9042. > CQLSH version is not supported by Remote > > > Key: CASSANDRA-10278 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10278 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Puneet Pant > Labels: cqlsh > Fix For: 2.1.x > > > Although the CASSANDRA 2.1.9 server runs successfully as: > INFO 14:27:31 Starting listening for CQL clients on > localhost/127.0.0.1:9046... > INFO 14:27:31 Binding thrift service to localhost/127.0.0.1:9160 > INFO 14:27:31 Listening for thrift clients... > But CQLSH cannot connect to it. This is because the shipped version of > CQLSH with cassandra is 3.2.0 whereas the version supported by the remote is 3.3.0, so the > default shipped version of CQL needs to be upgraded in cassandra, or it should > provide backward compatibility. > Here is the CQLSH error: > {code} > Connection error: ('Unable to connect to any servers', {'127.0.0.1': > ProtocolError("cql_version '3.2.0' is not supported by remote (w/ native > protocol).
Supported versions: [u'3.3.0']",)}) > {code} > Note:9046 is just changed to fix conflicts. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9446) Failure detector should ignore local pauses per endpoint
[ https://issues.apache.org/jira/browse/CASSANDRA-9446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744413#comment-14744413 ] sankalp kohli commented on CASSANDRA-9446: -- I think we can commit the patch. +1 > Failure detector should ignore local pauses per endpoint > > > Key: CASSANDRA-9446 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9446 > Project: Cassandra > Issue Type: Improvement >Reporter: sankalp kohli >Assignee: Stefania >Priority: Minor > Attachments: 9446.txt, 9644-v2.txt > > > In CASSANDRA-9183, we added a feature to ignore local pauses. But it will > only not mark 2 endpoints as down. > We should do this per endpoint as suggested by Brandon in CASSANDRA-9183. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-9664) Allow MV's select statements to be more complex
[ https://issues.apache.org/jira/browse/CASSANDRA-9664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744332#comment-14744332 ] Adam Holmberg edited comment on CASSANDRA-9664 at 9/14/15 9:44 PM: --- [~thobbs] we will need to rebase again. New changes [here|https://github.com/iamaleksey/cassandra/commits/9921-3.0] brought in changes from trunk including updates to the {{system_schema.columns}} table. Meanwhile, no dev is blocked -- I'm working rebased locally. was (Author: aholmber): [~thobbs] are you willing to rebase again? New changes [here|https://github.com/iamaleksey/cassandra/commits/9921-3.0] brought in changes from trunk including updates to the {{system_schema.columns}} table. > Allow MV's select statements to be more complex > --- > > Key: CASSANDRA-9664 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9664 > Project: Cassandra > Issue Type: New Feature >Reporter: Carl Yeksigian >Assignee: Tyler Hobbs > Labels: client-impacting, doc-impacting > Fix For: 3.0.0 rc1 > > > [Materialized Views|https://issues.apache.org/jira/browse/CASSANDRA-6477] add > support for a syntax which includes a {{SELECT}} statement, but only allows > selection of direct columns, and does not allow any filtering to take place. > We should add support to the MV {{SELECT}} statement to bring better parity > with the normal CQL {{SELECT}} statement, specifically simple functions in > the selected columns, as well as specifying a {{WHERE}} clause. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9664) Allow MV's select statements to be more complex
[ https://issues.apache.org/jira/browse/CASSANDRA-9664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744332#comment-14744332 ] Adam Holmberg commented on CASSANDRA-9664: -- [~thobbs] are you willing to rebase again? New changes [here|https://github.com/iamaleksey/cassandra/commits/9921-3.0] brought in changes from trunk including updates to the {{system_schema.columns}} table. > Allow MV's select statements to be more complex > --- > > Key: CASSANDRA-9664 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9664 > Project: Cassandra > Issue Type: New Feature >Reporter: Carl Yeksigian >Assignee: Tyler Hobbs > Labels: client-impacting, doc-impacting > Fix For: 3.0.0 rc1 > > > [Materialized Views|https://issues.apache.org/jira/browse/CASSANDRA-6477] add > support for a syntax which includes a {{SELECT}} statement, but only allows > selection of direct columns, and does not allow any filtering to take place. > We should add support to the MV {{SELECT}} statement to bring better parity > with the normal CQL {{SELECT}} statement, specifically simple functions in > the selected columns, as well as specifying a {{WHERE}} clause. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10298) Replaced dead node stayed in gossip forever
[ https://issues.apache.org/jira/browse/CASSANDRA-10298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744325#comment-14744325 ] Dikang Gu commented on CASSANDRA-10298: --- [~mambocab], no they are complaining different nodes, so it looks like different issues to me. > Replaced dead node stayed in gossip forever > --- > > Key: CASSANDRA-10298 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10298 > Project: Cassandra > Issue Type: Bug >Reporter: Dikang Gu > Attachments: CASSANDRA-10298.patch > > > The dead node stayed in the nodetool status, > DN 10.210.165.55379.76 GB 256 ? null > And in the log, it throws NPE when trying to remove it. > 2015-09-10_06:41:22.92453 ERROR 06:41:22 Exception in thread > Thread[GossipStage:1,5,main] > 2015-09-10_06:41:22.92454 java.lang.NullPointerException: null > 2015-09-10_06:41:22.92455 at > org.apache.cassandra.utils.UUIDGen.decompose(UUIDGen.java:100) > 2015-09-10_06:41:22.92455 at > org.apache.cassandra.db.HintedHandOffManager.deleteHintsForEndpoint(HintedHandOffManager.java:201) > > 2015-09-10_06:41:22.92455 at > org.apache.cassandra.service.StorageService.excise(StorageService.java:1886) > 2015-09-10_06:41:22.92455 at > org.apache.cassandra.service.StorageService.excise(StorageService.java:1902) > 2015-09-10_06:41:22.92456 at > org.apache.cassandra.service.StorageService.handleStateLeft(StorageService.java:1805) > 2015-09-10_06:41:22.92457 at > org.apache.cassandra.service.StorageService.onChange(StorageService.java:1473) > > 2015-09-10_06:41:22.92457 at > org.apache.cassandra.service.StorageService.onJoin(StorageService.java:2099) > 2015-09-10_06:41:22.92457 at > org.apache.cassandra.gms.Gossiper.handleMajorStateChange(Gossiper.java:1009) > 2015-09-10_06:41:22.92458 at > org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:1085) > 2015-09-10_06:41:22.92458 at > org.apache.cassandra.gms.GossipDigestAck2VerbHandler.doVerb(GossipDigestAck2VerbHandler.java:49) > > 
2015-09-10_06:41:22.92458 at > org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:62) > 2015-09-10_06:41:22.92459 at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > ~[na:1.7.0_45] > 2015-09-10_06:41:22.92460 at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > ~[na:1.7.0_45] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9870) Improve cassandra-stress graphing
[ https://issues.apache.org/jira/browse/CASSANDRA-9870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744309#comment-14744309 ] Benedict commented on CASSANDRA-9870: - How is this progressing? When do you think we'll have some example graphs to take a look at? > Improve cassandra-stress graphing > - > > Key: CASSANDRA-9870 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9870 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Benedict >Assignee: Shawn Kumar > Attachments: reads.svg > > > CASSANDRA-7918 introduces graph output from a stress run, but these graphs > are a little limited. Attached to the ticket is an example of some improved > graphs which can serve as the *basis* for some improvements, which I will > briefly describe. They should not be taken as the exact end goal, but we > should aim for at least their functionality. Preferably with some Javascript > advantages thrown in, such as the hiding of datasets/graphs for clarity. Any > ideas for improvements are *definitely* encouraged. > Some overarching design principles: > * Display _on *one* screen_ all of the information necessary to get a good > idea of how two or more branches compare to each other. Ideally we will > reintroduce this, painting multiple graphs onto one screen, stretched to fit. > * Axes must be truncated to only the interesting dimensions, to ensure there > is no wasted space. > * Each graph displaying multiple kinds of data should use colour _and shape_ > to help easily distinguish the different datasets. > * Each graph should be tailored to the data it is representing, and we should > have multiple views of each data. 
> The data can roughly be partitioned into three kinds: > * throughput > * latency > * gc > These can each be viewed in different ways: > * as a continuous plot of: > ** raw data > ** scaled/compared to a "base" branch, or other metric > ** cumulatively > * as box plots > ** ideally, these will plot median, outer quartiles, outer deciles and > absolute limits of the distribution, so the shape of the data can be best > understood > Each compresses the information differently, losing different information, so > that collectively they help to understand the data. > Some basic rules for presentation that work well: > * Latency information should be plotted to a logarithmic scale, to avoid high > latencies drowning out low ones > * GC information should be plotted cumulatively, to avoid differing > throughputs giving the impression of worse GC. It should also have a line > that is rescaled by the amount of work (number of operations) completed > * Throughput should be plotted as the actual numbers > To walk the graphs top-left to bottom-right, we have: > * Spot throughput comparison of branches to the baseline branch, as an > improvement ratio (which can of course be negative, but is not in this > example) > * Raw throughput of all branches (no baseline) > * Raw throughput as a box plot > * Latency percentiles, compared to baseline. The percentage improvement at > any point in time vs baseline is calculated, and then multiplied by the > overall median for the entire run. This simply permits the non-baseline > branches to scatter their wins/loss around a relatively clustered line for > each percentile. It's probably the most "dishonest" graph but comparing > something like latency where each data point can have very high variance is > difficult, and this gives you an idea of clustering of improvements/losses. 
> * Latency percentiles, raw, each with a different shape; lowest percentiles > plotted as a solid line as they vary least, with higher percentiles each > getting their own subtly different shape to scatter. > * Latency box plots > * GC time, plotted cumulatively and also scaled by work done > * GC Mb, plotted cumulatively and also scaled by work done > * GC time, raw > * GC time as a box plot > These do mostly introduce the concept of a "baseline" branch. It may be that, > ideally, this baseline be selected by a dropdown so the javascript can > transform the output dynamically. This would permit more interesting > comparisons to be made on the fly. > There are also some complexities, such as deciding which datapoints to > compare against baseline when times get out-of-whack (due to GC, etc, causing > a lack of output for a period). The version I uploaded does a merge of the > times, permitting a small degree of variance, and ignoring those datapoints > we cannot pair. One option here might be to change stress' behaviour to > always print to a strict schedule, instead of trying to get absolutely > accurate apportionment of timings. If this makes things much simpler, it can > be done. > As previously stated, but may be lost in the w
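The box-plot summary proposed above (median, outer quartiles, outer deciles, and absolute limits of the distribution) can be sketched with the standard library; the latency samples here are made up for illustration, and this is not the stress tool's actual output format:

```python
import statistics

# Compute the summary a latency box plot needs: absolute limits, outer
# deciles (p10/p90), outer quartiles (p25/p75), and the median.
def box_summary(latencies):
    deciles = statistics.quantiles(latencies, n=10, method="inclusive")
    quartiles = statistics.quantiles(latencies, n=4, method="inclusive")
    return {
        "min": min(latencies),
        "p10": deciles[0],
        "p25": quartiles[0],
        "median": statistics.median(latencies),
        "p75": quartiles[2],
        "p90": deciles[8],
        "max": max(latencies),
    }

# A heavy right tail (common for latency) is exactly why the prose above
# recommends a logarithmic scale for the latency axis.
summary = box_summary([1.2, 1.5, 1.7, 2.0, 2.4, 3.1, 4.8, 9.5, 20.0, 85.0])
```

Plotting these seven values per branch per interval gives the shape of the distribution without drowning low latencies under the outliers.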
[jira] [Commented] (CASSANDRA-9921) Combine MV schema definition with MV table definition
[ https://issues.apache.org/jira/browse/CASSANDRA-9921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744276#comment-14744276 ] Adam Holmberg commented on CASSANDRA-9921: -- Driver branch [here|https://github.com/datastax/python-driver/tree/380]@36b884d available if anyone is interested. It does not drop MVs from the model due to the event issue mentioned above. > Combine MV schema definition with MV table definition > - > > Key: CASSANDRA-9921 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9921 > Project: Cassandra > Issue Type: Improvement >Reporter: Carl Yeksigian >Assignee: Carl Yeksigian > Labels: client-impacting, materializedviews > Fix For: 3.0.0 rc1 > > Attachments: 9921-unit-test.txt > > > Prevent MV from reusing {{system_schema.tables}} and instead move those > properties into the {{system_schema.materializedviews}} table to keep them > separate entities. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9961) cqlsh should have DESCRIBE MATERIALIZED VIEW
[ https://issues.apache.org/jira/browse/CASSANDRA-9961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744268#comment-14744268 ] Adam Holmberg commented on CASSANDRA-9961: -- See driver branch [here|https://github.com/datastax/python-driver/tree/380]@36b884d for work in progress. We're still ironing some things out on CASSANDRA-9921 and CASSANDRA-10328, but the meta API and keywords required for this ticket are present there. > cqlsh should have DESCRIBE MATERIALIZED VIEW > > > Key: CASSANDRA-9961 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9961 > Project: Cassandra > Issue Type: Improvement >Reporter: Carl Yeksigian >Assignee: Stefania > Labels: client-impacting, materializedviews > Fix For: 3.0.0 rc1 > > > cqlsh doesn't currently produce describe output that can be used to recreate > a MV. Needs to add a new {{DESCRIBE MATERIALIZED VIEW}} command, and also add > to {{DESCRIBE KEYSPACE}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10298) Replaced dead node stayed in gossip forever
[ https://issues.apache.org/jira/browse/CASSANDRA-10298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744236#comment-14744236 ] Jim Witschey commented on CASSANDRA-10298: -- [~dikanggu] Is this related to CASSANDRA-10321, and if so, how? You mentioned [in this comment|https://issues.apache.org/jira/browse/CASSANDRA-10321?focusedCommentId=14744120&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14744120] that the node was running, but marked as having Thrift and Gossip unavailable. Is that the same node you tried to remove? and could that unavailable state be the result of the failed removal? > Replaced dead node stayed in gossip forever > --- > > Key: CASSANDRA-10298 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10298 > Project: Cassandra > Issue Type: Bug >Reporter: Dikang Gu > Attachments: CASSANDRA-10298.patch > > > The dead node stayed in the nodetool status, > DN 10.210.165.55379.76 GB 256 ? null > And in the log, it throws NPE when trying to remove it. 
> 2015-09-10_06:41:22.92453 ERROR 06:41:22 Exception in thread > Thread[GossipStage:1,5,main] > 2015-09-10_06:41:22.92454 java.lang.NullPointerException: null > 2015-09-10_06:41:22.92455 at > org.apache.cassandra.utils.UUIDGen.decompose(UUIDGen.java:100) > 2015-09-10_06:41:22.92455 at > org.apache.cassandra.db.HintedHandOffManager.deleteHintsForEndpoint(HintedHandOffManager.java:201) > > 2015-09-10_06:41:22.92455 at > org.apache.cassandra.service.StorageService.excise(StorageService.java:1886) > 2015-09-10_06:41:22.92455 at > org.apache.cassandra.service.StorageService.excise(StorageService.java:1902) > 2015-09-10_06:41:22.92456 at > org.apache.cassandra.service.StorageService.handleStateLeft(StorageService.java:1805) > 2015-09-10_06:41:22.92457 at > org.apache.cassandra.service.StorageService.onChange(StorageService.java:1473) > > 2015-09-10_06:41:22.92457 at > org.apache.cassandra.service.StorageService.onJoin(StorageService.java:2099) > 2015-09-10_06:41:22.92457 at > org.apache.cassandra.gms.Gossiper.handleMajorStateChange(Gossiper.java:1009) > 2015-09-10_06:41:22.92458 at > org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:1085) > 2015-09-10_06:41:22.92458 at > org.apache.cassandra.gms.GossipDigestAck2VerbHandler.doVerb(GossipDigestAck2VerbHandler.java:49) > > 2015-09-10_06:41:22.92458 at > org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:62) > 2015-09-10_06:41:22.92459 at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > ~[na:1.7.0_45] > 2015-09-10_06:41:22.92460 at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > ~[na:1.7.0_45] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10303) streaming for 'nodetool rebuild' fails after adding a datacenter
[ https://issues.apache.org/jira/browse/CASSANDRA-10303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744225#comment-14744225 ] Jim Witschey commented on CASSANDRA-10303: -- [~zhaoyan] I'm unable to reproduce locally with [ccm|https://github.com/pcmanus/ccm/]. Have you been able to reproduce this with a fresh cluster? In the failure you described in your comment, after changing configurations with {{sysctl}}, were you able to get the rebuild to succeed by running it again? > streaming for 'nodetool rebuild' fails after adding a datacenter > - > > Key: CASSANDRA-10303 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10303 > Project: Cassandra > Issue Type: Bug > Components: Core > Environment: jdk1.7 > cassandra 2.1.8 >Reporter: zhaoyan > > we add another datacenter. > use nodetool rebuild DC1 > stream from some node of old datacenter always hang up with these exception: > {code} > ERROR [Thread-1472] 2015-09-10 19:24:53,091 CassandraDaemon.java:223 - > Exception in thread Thread[Thread-1472,5,RMI Runtime] > java.lang.RuntimeException: java.io.IOException: Connection timed out > at com.google.common.base.Throwables.propagate(Throwables.java:160) > ~[guava-16.0.jar:na] > at > org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32) > ~[apache-cassandra-2.1.8.jar:2.1.8] > at java.lang.Thread.run(Thread.java:745) ~[na:1.7.0_60] > Caused by: java.io.IOException: Connection timed out > at sun.nio.ch.FileDispatcherImpl.read0(Native Method) ~[na:1.7.0_60] > at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) > ~[na:1.7.0_60] > at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) ~[na:1.7.0_60] > at sun.nio.ch.IOUtil.read(IOUtil.java:197) ~[na:1.7.0_60] > at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379) > ~[na:1.7.0_60] > at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:59) > ~[na:1.7.0_60] > at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:109) > 
~[na:1.7.0_60] > at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103) > ~[na:1.7.0_60] > at > org.apache.cassandra.streaming.compress.CompressedInputStream$Reader.runMayThrow(CompressedInputStream.java:172) > ~[apache-cassandra-2.1.8.jar:2.1.8] > at > org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) > ~[apache-cassandra-2.1.8.jar:2.1.8] > ... 1 common frames omitted > {code} > I must restart the node to stop the current rebuild, then rebuild again and again until it > succeeds -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9758) nodetool compactionhistory NPE
[ https://issues.apache.org/jira/browse/CASSANDRA-9758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuki Morishita updated CASSANDRA-9758: -- Fix Version/s: (was: 3.x) 2.2.x 2.1.x > nodetool compactionhistory NPE > -- > > Key: CASSANDRA-9758 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9758 > Project: Cassandra > Issue Type: Bug >Reporter: Pierre N. >Priority: Minor > Fix For: 2.1.x, 2.2.x > > Attachments: 0001-fix-npe-inline.patch, 9758.txt > > > nodetool compactionhistory may trigger NPE : > {code} > admin@localhost:~$ nodetool compactionhistory > Compaction History: > error: null > -- StackTrace -- > java.lang.NullPointerException > at com.google.common.base.Joiner$MapJoiner.join(Joiner.java:330) > at org.apache.cassandra.utils.FBUtilities.toString(FBUtilities.java:515) > at > org.apache.cassandra.db.compaction.CompactionHistoryTabularData.from(CompactionHistoryTabularData.java:78) > at > org.apache.cassandra.db.SystemKeyspace.getCompactionHistory(SystemKeyspace.java:422) > at > org.apache.cassandra.db.compaction.CompactionManager.getCompactionHistory(CompactionManager.java:1490) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279) > at > com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112) > at > 
com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46) > at > com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237) > at > com.sun.jmx.mbeanserver.PerInterface.getAttribute(PerInterface.java:83) > at > com.sun.jmx.mbeanserver.MBeanSupport.getAttribute(MBeanSupport.java:206) > at > com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:647) > at > com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:678) > at > javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1464) > at > javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97) > at > javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328) > at > javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1420) > at > javax.management.remote.rmi.RMIConnectionImpl.getAttribute(RMIConnectionImpl.java:657) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322) > at sun.rmi.transport.Transport$2.run(Transport.java:202) > at sun.rmi.transport.Transport$2.run(Transport.java:199) > at java.security.AccessController.doPrivileged(Native Method) > at sun.rmi.transport.Transport.serviceCall(Transport.java:198) > at > sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:567) > at > sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:828) > at > sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.access$400(TCPTransport.java:619) > at > sun.rmi.transport.tcp.TCPTransport$ConnectionHandler$1.run(TCPTransport.java:684) > at > 
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler$1.run(TCPTransport.java:681) > at java.security.AccessController.doPrivileged(Native Method) > at > sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:681) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > {code} > {code} > admin@localhost:~$ cqlsh -e "select * from syste
[jira] [Comment Edited] (CASSANDRA-10230) Remove coordinator batchlog from materialized views
[ https://issues.apache.org/jira/browse/CASSANDRA-10230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743721#comment-14743721 ] Joel Knighton edited comment on CASSANDRA-10230 at 9/14/15 8:35 PM: I'm just now finishing up some tests for this. The test methodology is as follows: 1. Tune the failure detector so that network partitions for the durations in the test will not cause nodes to realize the cluster is partitioned. 2. Partition the cluster so nodes 1 and 2 can communicate and nodes 3 and 4 can communicate. 3. Write to the cluster at CL.ONE. 4. Heal the partition for three seconds, completely partition the cluster, and perform a read with a client connected to each node as coordinator, so that we know reads only come from the coordinator. Currently, testing shows that both hints and batchlogs successfully propagate all writes to all replicas of the base tables without data loss. If both hints and batchlogs are disabled, writes never propagate (since read repair is impossible). Graphs showing a rough estimate of propagation time (time of the read on the final replica minus time of the read on the first replica) will be uploaded shortly, along with a link to the Jepsen test. EDIT: Because of how the tests work, the graphs showing the relative latency of these approaches don't provide much information, only that they converge in at most a couple hundred seconds each. The test is available [here|https://github.com/riptano/jepsen/blob/c07b54041223f9836fc9c359239a1622b64b3415/cassandra/test/cassandra/mv_test.clj#L61].
> Remove coordinator batchlog from materialized views > --- > > Key: CASSANDRA-10230 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10230 > Project: Cassandra > Issue Type: Improvement >Reporter: T Jake Luciani >Assignee: Joel Knighton > Fix For: 3.0.0 rc1 > > > We are considering removing or making optional the coordinator batchlog. > The batchlog primarily serves as a way to quickly reach consistency between > base and view since we don't have any kind of read repair between base and > view. But we do have repair, so as long as you don't lose nodes while writing > at CL.ONE you will be eventually consistent. > I've committed to the 3.0 branch a way to disable the coordinator with > {{-Dcassandra.mv_disable_coordinator_batchlog=true}} > The majority of the performance hit to throughput is currently the batchlog, > as shown by this chart: > http://cstar.datastax.com/graph?stats=f794245a-4d9d-11e5-9def-42010af0688f&metric=op_rate&operation=1_user&smoothing=1&show_aggregates=true&xmin=0&xmax=498.52&ymin=0&ymax=50142.4 > I'd like to have tests run with and without this flag to validate how quickly > we achieve quorum consistency without repair when writing at CL.ONE. Once we > can see there is little/no impact we can permanently remove the coordinator > batchlog. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
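The {{-Dcassandra.mv_disable_coordinator_batchlog}} switch from the ticket above is a plain JVM system property, so on builds that carry this change it can be wired in through the usual startup options hook; a minimal sketch, assuming the stock {{conf/cassandra-env.sh}} layout (the file path and JVM_OPTS convention are the standard mechanism, not something specified by the ticket):

```shell
# conf/cassandra-env.sh -- illustrative only; the flag is only meaningful
# on branches that include the CASSANDRA-10230 change.
# Skip the coordinator batchlog for materialized-view writes:
JVM_OPTS="$JVM_OPTS -Dcassandra.mv_disable_coordinator_batchlog=true"
```

This is a config fragment rather than a runnable program; the flag takes effect on the next node restart.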
[jira] [Commented] (CASSANDRA-9758) nodetool compactionhistory NPE
[ https://issues.apache.org/jira/browse/CASSANDRA-9758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744210#comment-14744210 ] Yuki Morishita commented on CASSANDRA-9758: --- Ok, a null value for {{rows_merged}} does happen when dropping an SSTable, so the attached patch makes sense as a fix. I pushed my patch to github: [2.1|https://github.com/yukim/cassandra/tree/9758-2.1] [2.2|https://github.com/yukim/cassandra/tree/9758-2.2] [3.0|https://github.com/yukim/cassandra/tree/9758-3.0] If tests are fine, I will commit to 2.1+. > nodetool compactionhistory NPE > -- > > Key: CASSANDRA-9758 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9758 > Project: Cassandra > Issue Type: Bug >Reporter: Pierre N. >Priority: Minor > Fix For: 3.x > > Attachments: 0001-fix-npe-inline.patch, 9758.txt > > > nodetool compactionhistory may trigger an NPE: > {code} > admin@localhost:~$ nodetool compactionhistory > Compaction History: > error: null > -- StackTrace -- > java.lang.NullPointerException > at com.google.common.base.Joiner$MapJoiner.join(Joiner.java:330) > at org.apache.cassandra.utils.FBUtilities.toString(FBUtilities.java:515) > at > org.apache.cassandra.db.compaction.CompactionHistoryTabularData.from(CompactionHistoryTabularData.java:78) > at > org.apache.cassandra.db.SystemKeyspace.getCompactionHistory(SystemKeyspace.java:422) > at > org.apache.cassandra.db.compaction.CompactionManager.getCompactionHistory(CompactionManager.java:1490) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279) > at > com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112) > at > com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46) > at > com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237) > at > com.sun.jmx.mbeanserver.PerInterface.getAttribute(PerInterface.java:83) > at > com.sun.jmx.mbeanserver.MBeanSupport.getAttribute(MBeanSupport.java:206) > at > com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:647) > at > com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:678) > at > javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1464) > at > javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97) > at > javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328) > at > javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1420) > at > javax.management.remote.rmi.RMIConnectionImpl.getAttribute(RMIConnectionImpl.java:657) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322) > at sun.rmi.transport.Transport$2.run(Transport.java:202) > at sun.rmi.transport.Transport$2.run(Transport.java:199) > at java.security.AccessController.doPrivileged(Native Method) > at sun.rmi.transport.Transport.serviceCall(Transport.java:198) > at > 
sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:567) > at > sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:828) > at > sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.access$400(TCPTransport.java:619) > at > sun.rmi.transport.tcp.TCPTransport$ConnectionHandler$1.run(TCPTransport.java:684) > at > sun.rmi.transport.tcp.TCPTransport$ConnectionHandler$1.run(TCPTransport.java:681) > at java.security.AccessController.doPrivileged(Native Method) > at > sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.j
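The trace above dies inside Guava's {{Joiner$MapJoiner}} because {{FBUtilities.toString}} hands it a null {{rows_merged}} map. The shape of the fix is simply a null guard before joining; a minimal plain-Java sketch of that idea (illustrative only — {{rowsMergedToString}} and the class name are made up for this example, not the actual patch):

```java
import java.util.Map;
import java.util.StringJoiner;

public class CompactionHistoryFix {
    // Guava's Joiner throws an NPE when the map itself is null, which is
    // what happens once a dropped SSTable leaves rows_merged unset.
    // Short-circuiting on null avoids the crash while keeping the same
    // "{key:value, ...}" rendering for real maps.
    static String rowsMergedToString(Map<Integer, Long> rowsMerged) {
        if (rowsMerged == null)
            return "";  // tolerate missing rows_merged instead of crashing
        StringJoiner sj = new StringJoiner(", ", "{", "}");
        rowsMerged.forEach((k, v) -> sj.add(k + ":" + v));
        return sj.toString();
    }

    public static void main(String[] args) {
        System.out.println(rowsMergedToString(null));           // empty string, no NPE
        System.out.println(rowsMergedToString(Map.of(4, 42L))); // {4:42}
    }
}
```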
[jira] [Updated] (CASSANDRA-10303) streaming for 'nodetool rebuild' fails after adding a datacenter
[ https://issues.apache.org/jira/browse/CASSANDRA-10303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jim Witschey updated CASSANDRA-10303: - Summary: streaming for 'nodetool rebuild' fails after adding a datacenter (was: stream always hang up when use rebuild) > streaming for 'nodetool rebuild' fails after adding a datacenter > - > > Key: CASSANDRA-10303 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10303 > Project: Cassandra > Issue Type: Bug > Components: Core > Environment: jdk1.7 > cassandra 2.1.8 >Reporter: zhaoyan > > We added another datacenter and ran nodetool rebuild DC1. > Streams from some nodes of the old datacenter always hang with this exception: > {code} > ERROR [Thread-1472] 2015-09-10 19:24:53,091 CassandraDaemon.java:223 - > Exception in thread Thread[Thread-1472,5,RMI Runtime] > java.lang.RuntimeException: java.io.IOException: Connection timed out > at com.google.common.base.Throwables.propagate(Throwables.java:160) > ~[guava-16.0.jar:na] > at > org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32) > ~[apache-cassandra-2.1.8.jar:2.1.8] > at java.lang.Thread.run(Thread.java:745) > ~[na:1.7.0_60] > Caused by: java.io.IOException: Connection timed out > at sun.nio.ch.FileDispatcherImpl.read0(Native Method) ~[na:1.7.0_60] > at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) > ~[na:1.7.0_60] > at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) ~[na:1.7.0_60] > at sun.nio.ch.IOUtil.read(IOUtil.java:197) ~[na:1.7.0_60] > at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379) > ~[na:1.7.0_60] > at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:59) > ~[na:1.7.0_60] > at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:109) > ~[na:1.7.0_60] > at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103) > ~[na:1.7.0_60] > at > org.apache.cassandra.streaming.compress.CompressedInputStream$Reader.runMayThrow(CompressedInputStream.java:172) > 
~[apache-cassandra-2.1.8.jar:2.1.8] > at > org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) > ~[apache-cassandra-2.1.8.jar:2.1.8] > ... 1 common frames omitted > {code} > I must restart the node to stop the current rebuild, then rebuild again and again until it > succeeds -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10267) Failing tests in upgrade_trests.paging_test
[ https://issues.apache.org/jira/browse/CASSANDRA-10267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744179#comment-14744179 ] Blake Eggleston commented on CASSANDRA-10267: - [~mambocab] let's just keep it in this ticket. Unfortunately, it looks like a bunch of TestCQL tests are now also failing consistently. > Failing tests in upgrade_trests.paging_test > --- > > Key: CASSANDRA-10267 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10267 > Project: Cassandra > Issue Type: Sub-task >Reporter: Sylvain Lebresne >Assignee: Sylvain Lebresne > Fix For: 3.0.0 rc1 > > > This is a continuation of CASSANDRA-9893 to deal with the failure of the > {{upgrade_trests.paging_test}} tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (CASSANDRA-10267) Failing tests in upgrade_trests.paging_test
[ https://issues.apache.org/jira/browse/CASSANDRA-10267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Blake Eggleston reassigned CASSANDRA-10267: --- Assignee: Blake Eggleston (was: Sylvain Lebresne) > Failing tests in upgrade_trests.paging_test > --- > > Key: CASSANDRA-10267 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10267 > Project: Cassandra > Issue Type: Sub-task >Reporter: Sylvain Lebresne >Assignee: Blake Eggleston > Fix For: 3.0.0 rc1 > > > This is a continuation of CASSANDRA-9893 to deal with the failure of the > {{upgrade_trests.paging_test}} tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[1/2] cassandra git commit: Update index file format
Repository: cassandra Updated Branches: refs/heads/trunk 8564f09c1 -> acb8fbb8f Update index file format patch by Robert Stupp; reviewed by Ariel Weisberg for CASSANDRA-10314 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/51b1a1c6 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/51b1a1c6 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/51b1a1c6 Branch: refs/heads/trunk Commit: 51b1a1c6d3faf2a2bee97fe10c9399119784675d Parents: 16497fd Author: Robert Stupp Authored: Mon Sep 14 22:06:37 2015 +0200 Committer: Robert Stupp Committed: Mon Sep 14 22:06:37 2015 +0200 -- CHANGES.txt | 1 + .../org/apache/cassandra/db/ColumnIndex.java| 1 + .../org/apache/cassandra/db/RowIndexEntry.java | 82 +++-- .../columniterator/AbstractSSTableIterator.java | 18 +- .../cassandra/io/sstable/IndexHelper.java | 29 +- .../io/sstable/format/big/BigTableScanner.java | 2 +- .../cassandra/io/util/DataOutputBuffer.java | 10 + .../cassandra/io/util/DataOutputPlus.java | 9 + .../cassandra/io/util/SequentialWriter.java | 5 + .../legacy_ma_clust/ma-1-big-CompressionInfo.db | Bin 83 -> 83 bytes .../legacy_ma_clust/ma-1-big-Data.db| Bin 5045 -> 5049 bytes .../legacy_ma_clust/ma-1-big-Digest.crc32 | 2 +- .../legacy_ma_clust/ma-1-big-Index.db | Bin 157123 -> 157553 bytes .../legacy_ma_clust/ma-1-big-Statistics.db | Bin 7045 -> 7045 bytes .../ma-1-big-CompressionInfo.db | Bin 75 -> 75 bytes .../legacy_ma_clust_counter/ma-1-big-Data.db| Bin 4428 -> 4393 bytes .../ma-1-big-Digest.crc32 | 2 +- .../legacy_ma_clust_counter/ma-1-big-Index.db | Bin 157123 -> 157553 bytes .../ma-1-big-Statistics.db | Bin 7054 -> 7054 bytes .../legacy_ma_simple/ma-1-big-Data.db | Bin 85 -> 85 bytes .../legacy_ma_simple/ma-1-big-Digest.crc32 | 2 +- .../legacy_ma_simple/ma-1-big-Statistics.db | Bin 4598 -> 4598 bytes .../legacy_ma_simple_counter/ma-1-big-Data.db | Bin 106 -> 106 bytes .../ma-1-big-Digest.crc32 | 2 +- 
.../ma-1-big-Statistics.db | Bin 4607 -> 4607 bytes .../apache/cassandra/cql3/KeyCacheCqlTest.java | 365 +++ .../apache/cassandra/db/RowIndexEntryTest.java | 142 +++- .../cassandra/io/sstable/IndexHelperTest.java | 8 +- 28 files changed, 621 insertions(+), 59 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/51b1a1c6/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index bacedaf..1a1ddeb 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 3.0.0-rc1 + * Update index file format (CASSANDRA-10314) * Add "shadowable" row tombstones to deal with mv timestamp issues (CASSANDRA-10261) * CFS.loadNewSSTables() broken for pre-3.0 sstables * Cache selected index in read command to reduce lookups (CASSANDRA-10215) http://git-wip-us.apache.org/repos/asf/cassandra/blob/51b1a1c6/src/java/org/apache/cassandra/db/ColumnIndex.java -- diff --git a/src/java/org/apache/cassandra/db/ColumnIndex.java b/src/java/org/apache/cassandra/db/ColumnIndex.java index b350f90..6b2ef59 100644 --- a/src/java/org/apache/cassandra/db/ColumnIndex.java +++ b/src/java/org/apache/cassandra/db/ColumnIndex.java @@ -122,6 +122,7 @@ public class ColumnIndex { IndexHelper.IndexInfo cIndexInfo = new IndexHelper.IndexInfo(firstClustering, lastClustering, + startPosition, currentPosition() - startPosition, openMarker); columnsIndex.add(cIndexInfo); http://git-wip-us.apache.org/repos/asf/cassandra/blob/51b1a1c6/src/java/org/apache/cassandra/db/RowIndexEntry.java -- diff --git a/src/java/org/apache/cassandra/db/RowIndexEntry.java b/src/java/org/apache/cassandra/db/RowIndexEntry.java index f63e893..7f361d9 100644 --- a/src/java/org/apache/cassandra/db/RowIndexEntry.java +++ b/src/java/org/apache/cassandra/db/RowIndexEntry.java @@ -17,7 +17,6 @@ */ package org.apache.cassandra.db; -import java.io.DataInput; import java.io.IOException; import java.util.ArrayList; import java.util.Arrays; @@ -28,7 +27,6 @@ import com.google.common.primitives.Ints; import 
org.apache.cassandra.config.CFMetaData;
[2/2] cassandra git commit: Merge branch 'cassandra-3.0' into trunk
Merge branch 'cassandra-3.0' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/acb8fbb8 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/acb8fbb8 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/acb8fbb8 Branch: refs/heads/trunk Commit: acb8fbb8f8582a8cf0c10f6c8a820600aafd290b Parents: 8564f09 51b1a1c Author: Robert Stupp Authored: Mon Sep 14 22:07:18 2015 +0200 Committer: Robert Stupp Committed: Mon Sep 14 22:07:18 2015 +0200 -- CHANGES.txt | 1 + .../org/apache/cassandra/db/ColumnIndex.java| 1 + .../org/apache/cassandra/db/RowIndexEntry.java | 82 +++-- .../columniterator/AbstractSSTableIterator.java | 18 +- .../cassandra/io/sstable/IndexHelper.java | 29 +- .../io/sstable/format/big/BigTableScanner.java | 2 +- .../cassandra/io/util/DataOutputBuffer.java | 10 + .../cassandra/io/util/DataOutputPlus.java | 9 + .../cassandra/io/util/SequentialWriter.java | 5 + .../legacy_ma_clust/ma-1-big-CompressionInfo.db | Bin 83 -> 83 bytes .../legacy_ma_clust/ma-1-big-Data.db| Bin 5045 -> 5049 bytes .../legacy_ma_clust/ma-1-big-Digest.crc32 | 2 +- .../legacy_ma_clust/ma-1-big-Index.db | Bin 157123 -> 157553 bytes .../legacy_ma_clust/ma-1-big-Statistics.db | Bin 7045 -> 7045 bytes .../legacy_ma_clust/ma-1-big-TOC.txt| 10 +- .../ma-1-big-CompressionInfo.db | Bin 75 -> 75 bytes .../legacy_ma_clust_counter/ma-1-big-Data.db| Bin 4447 -> 4393 bytes .../ma-1-big-Digest.crc32 | 2 +- .../legacy_ma_clust_counter/ma-1-big-Index.db | Bin 157123 -> 157553 bytes .../ma-1-big-Statistics.db | Bin 7054 -> 7054 bytes .../legacy_ma_clust_counter/ma-1-big-TOC.txt| 10 +- .../legacy_ma_simple/ma-1-big-Data.db | Bin 85 -> 85 bytes .../legacy_ma_simple/ma-1-big-Digest.crc32 | 2 +- .../legacy_ma_simple/ma-1-big-Statistics.db | Bin 4598 -> 4598 bytes .../legacy_ma_simple/ma-1-big-TOC.txt | 10 +- .../legacy_ma_simple_counter/ma-1-big-Data.db | Bin 106 -> 106 bytes .../ma-1-big-Digest.crc32 | 2 +- 
.../ma-1-big-Statistics.db | Bin 4607 -> 4607 bytes .../legacy_ma_simple_counter/ma-1-big-TOC.txt | 10 +- .../apache/cassandra/cql3/KeyCacheCqlTest.java | 365 +++ .../apache/cassandra/db/RowIndexEntryTest.java | 142 +++- .../cassandra/io/sstable/IndexHelperTest.java | 8 +- 32 files changed, 641 insertions(+), 79 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/acb8fbb8/CHANGES.txt -- diff --cc CHANGES.txt index b83d74b,1a1ddeb..6e04690 --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,8 -1,5 +1,9 @@@ +3.2 + * Add transparent data encryption core classes (CASSANDRA-9945) + + 3.0.0-rc1 + * Update index file format (CASSANDRA-10314) * Add "shadowable" row tombstones to deal with mv timestamp issues (CASSANDRA-10261) * CFS.loadNewSSTables() broken for pre-3.0 sstables * Cache selected index in read command to reduce lookups (CASSANDRA-10215) http://git-wip-us.apache.org/repos/asf/cassandra/blob/acb8fbb8/src/java/org/apache/cassandra/io/util/DataOutputBuffer.java --
cassandra git commit: Update index file format
Repository: cassandra Updated Branches: refs/heads/cassandra-3.0 16497fd93 -> 51b1a1c6d Update index file format patch by Robert Stupp; reviewed by Ariel Weisberg for CASSANDRA-10314 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/51b1a1c6 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/51b1a1c6 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/51b1a1c6 Branch: refs/heads/cassandra-3.0 Commit: 51b1a1c6d3faf2a2bee97fe10c9399119784675d Parents: 16497fd Author: Robert Stupp Authored: Mon Sep 14 22:06:37 2015 +0200 Committer: Robert Stupp Committed: Mon Sep 14 22:06:37 2015 +0200 -- CHANGES.txt | 1 + .../org/apache/cassandra/db/ColumnIndex.java| 1 + .../org/apache/cassandra/db/RowIndexEntry.java | 82 +++-- .../columniterator/AbstractSSTableIterator.java | 18 +- .../cassandra/io/sstable/IndexHelper.java | 29 +- .../io/sstable/format/big/BigTableScanner.java | 2 +- .../cassandra/io/util/DataOutputBuffer.java | 10 + .../cassandra/io/util/DataOutputPlus.java | 9 + .../cassandra/io/util/SequentialWriter.java | 5 + .../legacy_ma_clust/ma-1-big-CompressionInfo.db | Bin 83 -> 83 bytes .../legacy_ma_clust/ma-1-big-Data.db| Bin 5045 -> 5049 bytes .../legacy_ma_clust/ma-1-big-Digest.crc32 | 2 +- .../legacy_ma_clust/ma-1-big-Index.db | Bin 157123 -> 157553 bytes .../legacy_ma_clust/ma-1-big-Statistics.db | Bin 7045 -> 7045 bytes .../ma-1-big-CompressionInfo.db | Bin 75 -> 75 bytes .../legacy_ma_clust_counter/ma-1-big-Data.db| Bin 4428 -> 4393 bytes .../ma-1-big-Digest.crc32 | 2 +- .../legacy_ma_clust_counter/ma-1-big-Index.db | Bin 157123 -> 157553 bytes .../ma-1-big-Statistics.db | Bin 7054 -> 7054 bytes .../legacy_ma_simple/ma-1-big-Data.db | Bin 85 -> 85 bytes .../legacy_ma_simple/ma-1-big-Digest.crc32 | 2 +- .../legacy_ma_simple/ma-1-big-Statistics.db | Bin 4598 -> 4598 bytes .../legacy_ma_simple_counter/ma-1-big-Data.db | Bin 106 -> 106 bytes .../ma-1-big-Digest.crc32 | 2 +- 
.../ma-1-big-Statistics.db | Bin 4607 -> 4607 bytes .../apache/cassandra/cql3/KeyCacheCqlTest.java | 365 +++ .../apache/cassandra/db/RowIndexEntryTest.java | 142 +++- .../cassandra/io/sstable/IndexHelperTest.java | 8 +- 28 files changed, 621 insertions(+), 59 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/51b1a1c6/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index bacedaf..1a1ddeb 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 3.0.0-rc1 + * Update index file format (CASSANDRA-10314) * Add "shadowable" row tombstones to deal with mv timestamp issues (CASSANDRA-10261) * CFS.loadNewSSTables() broken for pre-3.0 sstables * Cache selected index in read command to reduce lookups (CASSANDRA-10215) http://git-wip-us.apache.org/repos/asf/cassandra/blob/51b1a1c6/src/java/org/apache/cassandra/db/ColumnIndex.java -- diff --git a/src/java/org/apache/cassandra/db/ColumnIndex.java b/src/java/org/apache/cassandra/db/ColumnIndex.java index b350f90..6b2ef59 100644 --- a/src/java/org/apache/cassandra/db/ColumnIndex.java +++ b/src/java/org/apache/cassandra/db/ColumnIndex.java @@ -122,6 +122,7 @@ public class ColumnIndex { IndexHelper.IndexInfo cIndexInfo = new IndexHelper.IndexInfo(firstClustering, lastClustering, + startPosition, currentPosition() - startPosition, openMarker); columnsIndex.add(cIndexInfo); http://git-wip-us.apache.org/repos/asf/cassandra/blob/51b1a1c6/src/java/org/apache/cassandra/db/RowIndexEntry.java -- diff --git a/src/java/org/apache/cassandra/db/RowIndexEntry.java b/src/java/org/apache/cassandra/db/RowIndexEntry.java index f63e893..7f361d9 100644 --- a/src/java/org/apache/cassandra/db/RowIndexEntry.java +++ b/src/java/org/apache/cassandra/db/RowIndexEntry.java @@ -17,7 +17,6 @@ */ package org.apache.cassandra.db; -import java.io.DataInput; import java.io.IOException; import java.util.ArrayList; import java.util.Arrays; @@ -28,7 +27,6 @@ import com.google.common.primitives.Ints; import org.apache.cassandra.conf
[jira] [Updated] (CASSANDRA-10328) Inconsistent Schema Change Events Between Table and View
[ https://issues.apache.org/jira/browse/CASSANDRA-10328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Holmberg updated CASSANDRA-10328: -- Description: I'm seeing inconsistent event delivery when it comes to dropping materialized views (when compared to tables or indexes). For example, create/drop/alter for a table: {code} cassandra@cqlsh:test> create TABLE t (k int PRIMARY KEY , v int); cassandra@cqlsh:test> alter TABLE t add v1 int; cassandra@cqlsh:test> drop TABLE t; {code} And for a view: {code} cassandra@cqlsh:test> create MATERIALIZED VIEW mv as select * from scores WHERE game IS NOT NULL AND score IS NOT NULL AND user IS NOT NULL AND year IS NOT NULL AND month IS NOT NULL AND day IS NOT NULL PRIMARY KEY (game, user, year, month, day, score) WITH CLUSTERING ORDER BY (score desc); cassandra@cqlsh:test> alter materialized view mv with min_index_interval = 100; cassandra@cqlsh:test> drop MATERIALIZED VIEW mv; {code} The latter sequence is missing a table update event, meaning clients cannot tell that a view was dropped. This is on a [branch in-progress|https://github.com/iamaleksey/cassandra/commits/9921-3.0] for CASSANDRA-9921. As a side note, I also believe the keyspace update events are unnecessary in both scenarios. To my knowledge, drivers only use these events to refresh meta on the keyspace definition itself, not the entities it contains. Please let me know if this is worthy of discussion or a distinct ticket. was: I'm seeing inconsistent event delivery when it comes to dropping materialized views (when compared to tables or indexes). 
For example, create/drop/alter for a table: {code} cassandra@cqlsh:test> create TABLE t (k int PRIMARY KEY , v int); cassandra@cqlsh:test> alter TABLE t add v1 int; cassandra@cqlsh:test> drop TABLE t; {code} And for a view: {code} cassandra@cqlsh:test> create MATERIALIZED VIEW mv as select * from scores WHERE game IS NOT NULL AND score IS NOT NULL AND user IS NOT NULL AND year IS NOT NULL AND month IS NOT NULL AND day IS NOT NULL PRIMARY KEY (game, user, year, month, day, score) WITH CLUSTERING ORDER BY (score desc); cassandra@cqlsh:test> alter materialized view all with min_index_interval = 100; cassandra@cqlsh:test> drop MATERIALIZED VIEW mv; {code} The latter sequence is missing a table update event, meaning clients cannot tell that a view was dropped. This is on a [branch in-progress|https://github.com/iamaleksey/cassandra/commits/9921-3.0] for CASSANDRA-9921. As a side note, I also believe the keyspace update events are unnecessary in both scenarios. To my knowledge, drivers only use these events to refresh meta on the keyspace definition itself, not the entities it contains. Please let me know if this is worthy of discussion or a distinct ticket. > Inconsistent Schema Change Events Between Table and View > > > Key: CASSANDRA-10328 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10328 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Adam Holmberg > > I'm seeing inconsistent event delivery when it comes to dropping materialized > views (when compared to tables or indexes). 
> For example, create/drop/alter for a table: > {code} > cassandra@cqlsh:test> create TABLE t (k int PRIMARY KEY , v int); > event_args={'keyspace': u'test', 'change_type': u'UPDATED', 'target_type': > u'KEYSPACE'}, stream_id=-1)> > event_args={'keyspace': u'test', 'change_type': u'CREATED', 'target_type': > u'TABLE', u'table': u't'}, stream_id=-1)> > cassandra@cqlsh:test> alter TABLE t add v1 int; > event_args={'keyspace': u'test', 'change_type': u'UPDATED', 'target_type': > u'KEYSPACE'}, stream_id=-1)> > event_args={'keyspace': u'test', 'change_type': u'UPDATED', 'target_type': > u'TABLE', u'table': u't'}, stream_id=-1)> > cassandra@cqlsh:test> drop TABLE t; > event_args={'keyspace': u'test', 'change_type': u'UPDATED', 'target_type': > u'KEYSPACE'}, stream_id=-1)> > event_args={'keyspace': u'test', 'change_type': u'DROPPED', 'target_type': > u'TABLE', u'table': u't'}, stream_id=-1)> > {code} > And for a view: > {code} > cassandra@cqlsh:test> create MATERIALIZED VIEW mv as select * from scores > WHERE game IS NOT NULL AND score IS NOT NULL AND user IS NOT NULL AND year IS > NOT NULL AND month IS NOT NULL AND day IS NOT NULL PRIMARY KEY (game, > user, year, month, day, score) WITH CLUSTERING ORDER BY (score desc); > event_args={'keyspace': u'test', 'change_type': u'UPDATED', 'target_type': > u'KEYSPACE'}, stream_id=-1)> > event_args={'keyspace': u'test', 'change_type': u'CREATED', 'target_type': > u'TABLE', u'table': u'mv'}, stream_id=-1)> > cassandra@cqlsh:test> alter materialized view mv with min_index_interval = > 100; > event_args={'keyspace'
[jira] [Commented] (CASSANDRA-9921) Combine MV schema definition with MV table definition
[ https://issues.apache.org/jira/browse/CASSANDRA-9921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744134#comment-14744134 ] Adam Holmberg commented on CASSANDRA-9921: -- We've discussed the above, and table change events are sufficient to get either tables or views in one phase, provided that the 'table' refers to view_name when applicable. Presently blocked on https://issues.apache.org/jira/browse/CASSANDRA-10328, which prevents clients from detecting dropped views. > Combine MV schema definition with MV table definition > - > > Key: CASSANDRA-9921 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9921 > Project: Cassandra > Issue Type: Improvement >Reporter: Carl Yeksigian >Assignee: Carl Yeksigian > Labels: client-impacting, materializedviews > Fix For: 3.0.0 rc1 > > Attachments: 9921-unit-test.txt > > > Prevent MV from reusing {{system_schema.tables}} and instead move those > properties into the {{system_schema.materializedviews}} table to keep them > separate entities. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10328) Inconsistent Schema Change Events Between Table and View
[ https://issues.apache.org/jira/browse/CASSANDRA-10328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744126#comment-14744126 ] Adam Holmberg commented on CASSANDRA-10328: --- It was suggested to /cc [~thobbs] and [~slebresne] to begin the discussion. > Inconsistent Schema Change Events Between Table and View > > > Key: CASSANDRA-10328 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10328 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Adam Holmberg > > I'm seeing inconsistent event delivery when it comes to dropping materialized > views (when compared to tables or indexes). > For example, create/drop/alter for a table: > {code} > cassandra@cqlsh:test> create TABLE t (k int PRIMARY KEY , v int); > event_args={'keyspace': u'test', 'change_type': u'UPDATED', 'target_type': > u'KEYSPACE'}, stream_id=-1)> > event_args={'keyspace': u'test', 'change_type': u'CREATED', 'target_type': > u'TABLE', u'table': u't'}, stream_id=-1)> > cassandra@cqlsh:test> alter TABLE t add v1 int; > event_args={'keyspace': u'test', 'change_type': u'UPDATED', 'target_type': > u'KEYSPACE'}, stream_id=-1)> > event_args={'keyspace': u'test', 'change_type': u'UPDATED', 'target_type': > u'TABLE', u'table': u't'}, stream_id=-1)> > cassandra@cqlsh:test> drop TABLE t; > event_args={'keyspace': u'test', 'change_type': u'UPDATED', 'target_type': > u'KEYSPACE'}, stream_id=-1)> > event_args={'keyspace': u'test', 'change_type': u'DROPPED', 'target_type': > u'TABLE', u'table': u't'}, stream_id=-1)> > {code} > And for a view: > {code} > cassandra@cqlsh:test> create MATERIALIZED VIEW mv as select * from scores > WHERE game IS NOT NULL AND score IS NOT NULL AND user IS NOT NULL AND year IS > NOT NULL AND month IS NOT NULL AND day IS NOT NULL PRIMARY KEY (game, > user, year, month, day, score) WITH CLUSTERING ORDER BY (score desc); > event_args={'keyspace': u'test', 'change_type': u'UPDATED', 'target_type': > u'KEYSPACE'}, stream_id=-1)> > 
event_args={'keyspace': u'test', 'change_type': u'CREATED', 'target_type': > u'TABLE', u'table': u'mv'}, stream_id=-1)> > cassandra@cqlsh:test> alter materialized view all with min_index_interval = > 100; > event_args={'keyspace': u'test', 'change_type': u'UPDATED', 'target_type': > u'KEYSPACE'}, stream_id=-1)> > event_args={'keyspace': u'test', 'change_type': u'UPDATED', 'target_type': > u'TABLE', u'table': u'all'}, stream_id=-1)> > cassandra@cqlsh:test> drop MATERIALIZED VIEW mv; > event_args={'keyspace': u'test', 'change_type': u'UPDATED', 'target_type': > u'KEYSPACE'}, stream_id=-1)> > {code} > The latter sequence is missing a table update event, meaning clients cannot > tell that a view was dropped. > This is on a [branch > in-progress|https://github.com/iamaleksey/cassandra/commits/9921-3.0] for > CASSANDRA-9921. > As a side note, I also believe the keyspace update events are unnecessary in > both scenarios. To my knowledge, drivers only use these events to refresh > meta on the keyspace definition itself, not the entities it contains. Please > let me know if this is worthy of discussion or a distinct ticket. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
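For driver authors following along, the practical impact of the missing DROPPED event can be shown with a toy client-side schema cache (plain Python, no driver dependency; the event dicts mirror the ones shown above, but the caching logic is illustrative only, not actual driver code):

```python
# Toy client-side schema cache applying the push events shown above.
# It shows that when DROP MATERIALIZED VIEW emits no TABLE-level DROPPED
# event, the cached entry for 'mv' is never evicted.

def apply_event(cache, event):
    """Maintain a per-keyspace set of table/view names from a schema event."""
    if event.get("target_type") != "TABLE":
        return  # KEYSPACE-level updates carry no table name
    tables = cache.setdefault(event["keyspace"], set())
    if event["change_type"] in ("CREATED", "UPDATED"):
        tables.add(event["table"])
    elif event["change_type"] == "DROPPED":
        tables.discard(event["table"])

cache = {}
apply_event(cache, {"keyspace": "test", "change_type": "CREATED",
                    "target_type": "TABLE", "table": "mv"})
# On DROP MATERIALIZED VIEW, only a keyspace-level UPDATED event arrives:
apply_event(cache, {"keyspace": "test", "change_type": "UPDATED",
                    "target_type": "KEYSPACE"})
print(cache)  # {'test': {'mv'}} -- the dropped view is still cached
```

A TABLE-level DROPPED event for the view, as emitted for regular tables, is exactly what would let the last call evict the stale entry.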
[jira] [Created] (CASSANDRA-10328) Inconsistent Schema Change Events Between Table and View
Adam Holmberg created CASSANDRA-10328: - Summary: Inconsistent Schema Change Events Between Table and View Key: CASSANDRA-10328 URL: https://issues.apache.org/jira/browse/CASSANDRA-10328 Project: Cassandra Issue Type: Bug Components: Core Reporter: Adam Holmberg I'm seeing inconsistent event delivery when it comes to dropping materialized views (when compared to tables or indexes). For example, create/drop/alter for a table: {code} cassandra@cqlsh:test> create TABLE t (k int PRIMARY KEY , v int); cassandra@cqlsh:test> alter TABLE t add v1 int; cassandra@cqlsh:test> drop TABLE t; {code} And for a view: {code} cassandra@cqlsh:test> create MATERIALIZED VIEW mv as select * from scores WHERE game IS NOT NULL AND score IS NOT NULL AND user IS NOT NULL AND year IS NOT NULL AND month IS NOT NULL AND day IS NOT NULL PRIMARY KEY (game, user, year, month, day, score) WITH CLUSTERING ORDER BY (score desc); cassandra@cqlsh:test> alter materialized view all with min_index_interval = 100; cassandra@cqlsh:test> drop MATERIALIZED VIEW mv; {code} The latter sequence is missing a table update event, meaning clients cannot tell that a view was dropped. This is on a [branch in-progress|https://github.com/iamaleksey/cassandra/commits/9921-3.0] for CASSANDRA-9921. As a side note, I also believe the keyspace update events are unnecessary in both scenarios. To my knowledge, drivers only use these events to refresh meta on the keyspace definition itself, not the entities it contains. Please let me know if this is worthy of discussion or a distinct ticket. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-10321) Gossip to dead nodes caused CPU usage to be 100%
[ https://issues.apache.org/jira/browse/CASSANDRA-10321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744120#comment-14744120 ] Dikang Gu edited comment on CASSANDRA-10321 at 9/14/15 7:50 PM: [~mambocab], the cluster is using NetworkTopologyStrategy, and is crossing 3 datacenters. You can see the dead node in the last line of the log, "2401:db00:2020:716b:face:0:21:0" is the dead node I referred to, and the cassandra process on it is still running, but somehow, the "Thrift active" and "Gossip active" are false in the nodetool info output. Yeah, we shouldn't expect it caused 100% cpu usage on some nodes, right? was (Author: dikanggu): [~mambocab], the cluster is using NetworkTopologyStrategy, and is crossing 3 datacenters. You can see the dead node in the last line of the log, "2401:db00:2020:716b:face:0:21:0" is the dead node I referred to, and the cassandra process on it is still running, but some now, the "Thrift active" and "Gossip active" are false in the nodetool info output. Yeah, we shouldn't expect it caused 100% cpu usage on some nodes, right? 
> Gossip to dead nodes caused CPU usage to be 100% > - > > Key: CASSANDRA-10321 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10321 > Project: Cassandra > Issue Type: Bug >Reporter: Dikang Gu > > For one node, the cpu usage jumped to 100%, and logs are full of: > 2015-09-14_16:34:45.56407 WARN 16:34:45 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:34:46.66616 WARN 16:34:46 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:34:47.76830 WARN 16:34:47 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:34:48.87043 WARN 16:34:48 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:34:49.97253 WARN 16:34:49 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:34:51.07462 WARN 16:34:51 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:34:52.17669 WARN 16:34:52 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:34:53.27880 WARN 16:34:53 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:34:54.38090 WARN 16:34:54 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:34:55.48301 WARN 16:34:55 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:34:56.58509 WARN 16:34:56 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:34:57.68721 WARN 16:34:57 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:34:58.78932 WARN 16:34:58 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 
2015-09-14_16:34:59.89142 WARN 16:34:59 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:35:00.99352 WARN 16:35:00 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:35:02.09563 WARN 16:35:02 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:35:03.19775 WARN 16:35:03 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:35:04.29982 WARN 16:35:04 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:35:05.40187 WARN 16:35:05 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:35:06.50369 WARN 16:35:06 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:35:07.60577 WARN 16:35:07 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:35:08.70779 WARN 16:35:08 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:35:09.80968 WARN 16:35:09 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:35:10.91157 WARN 16:35:10 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:35:12.01365 WARN 16:35:12 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:35:13.11569 WARN 16:35:13 Gossip stage has 32
[jira] [Commented] (CASSANDRA-10321) Gossip to dead nodes caused CPU usage to be 100%
[ https://issues.apache.org/jira/browse/CASSANDRA-10321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744120#comment-14744120 ] Dikang Gu commented on CASSANDRA-10321: --- [~mambocab], the cluster is using NetworkTopologyStrategy, and is crossing 3 datacenters. You can see the dead node in the last line of the log, "2401:db00:2020:716b:face:0:21:0" is the dead node I referred to, and the cassandra process on it is still running, but somehow, the "Thrift active" and "Gossip active" are false in the nodetool info output. Yeah, we shouldn't expect it caused 100% cpu usage on some nodes, right? > Gossip to dead nodes caused CPU usage to be 100% > - > > Key: CASSANDRA-10321 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10321 > Project: Cassandra > Issue Type: Bug >Reporter: Dikang Gu > > For one node, the cpu usage jumped to 100%, and logs are full of: > 2015-09-14_16:34:45.56407 WARN 16:34:45 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:34:46.66616 WARN 16:34:46 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:34:47.76830 WARN 16:34:47 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:34:48.87043 WARN 16:34:48 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:34:49.97253 WARN 16:34:49 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:34:51.07462 WARN 16:34:51 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:34:52.17669 WARN 16:34:52 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:34:53.27880 WARN 16:34:53 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be 
WARN 16:34:54 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:34:55.48301 WARN 16:34:55 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:34:56.58509 WARN 16:34:56 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:34:57.68721 WARN 16:34:57 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:34:58.78932 WARN 16:34:58 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:34:59.89142 WARN 16:34:59 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:35:00.99352 WARN 16:35:00 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:35:02.09563 WARN 16:35:02 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:35:03.19775 WARN 16:35:03 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:35:04.29982 WARN 16:35:04 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:35:05.40187 WARN 16:35:05 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:35:06.50369 WARN 16:35:06 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:35:07.60577 WARN 16:35:07 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:35:08.70779 WARN 16:35:08 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:35:09.80968 WARN 16:35:09 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:35:10.91157 WARN 16:35:10 
Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:35:12.01365 WARN 16:35:12 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:35:13.11569 WARN 16:35:13 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:35:14.21757 WARN 16:35:14 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:35:15.31942 WARN 16:35:15 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:35:16.42132 WARN 16:35:16 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:35:17.52332 WARN
[jira] [Commented] (CASSANDRA-10321) Gossip to dead nodes caused CPU usage to be 100%
[ https://issues.apache.org/jira/browse/CASSANDRA-10321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744104#comment-14744104 ] Jim Witschey commented on CASSANDRA-10321: -- [~dikanggu] What was the state and topology of the cluster? You mentioned in the title that the gossip is with a dead node or nodes, but didn't mention it in your description. Do you know of steps to reproduce? What behavior do you expect? I assume the unexpected part of this issue is the high CPU usage, but I just wanted to confirm. > Gossip to dead nodes caused CPU usage to be 100% > - > > Key: CASSANDRA-10321 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10321 > Project: Cassandra > Issue Type: Bug >Reporter: Dikang Gu > > For one node, the cpu usage jumped to 100%, and logs are full of: > 2015-09-14_16:34:45.56407 WARN 16:34:45 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:34:46.66616 WARN 16:34:46 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:34:47.76830 WARN 16:34:47 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:34:48.87043 WARN 16:34:48 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:34:49.97253 WARN 16:34:49 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:34:51.07462 WARN 16:34:51 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:34:52.17669 WARN 16:34:52 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:34:53.27880 WARN 16:34:53 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:34:54.38090 WARN 16:34:54 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be 
marked down) > 2015-09-14_16:34:55.48301 WARN 16:34:55 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:34:56.58509 WARN 16:34:56 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:34:57.68721 WARN 16:34:57 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:34:58.78932 WARN 16:34:58 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:34:59.89142 WARN 16:34:59 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:35:00.99352 WARN 16:35:00 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:35:02.09563 WARN 16:35:02 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:35:03.19775 WARN 16:35:03 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:35:04.29982 WARN 16:35:04 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:35:05.40187 WARN 16:35:05 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:35:06.50369 WARN 16:35:06 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:35:07.60577 WARN 16:35:07 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:35:08.70779 WARN 16:35:08 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:35:09.80968 WARN 16:35:09 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:35:10.91157 WARN 16:35:10 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 
2015-09-14_16:35:12.01365 WARN 16:35:12 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:35:13.11569 WARN 16:35:13 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:35:14.21757 WARN 16:35:14 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:35:15.31942 WARN 16:35:15 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:35:16.42132 WARN 16:35:16 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be marked down) > 2015-09-14_16:35:17.52332 WARN 16:35:17 Gossip stage has 32 pending tasks; > skipping status check (no nodes will be
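The repeated warning in the logs above comes from a periodic status check that bails out whenever the gossip stage has a backlog. A rough Python sketch of that guard (function names and structure are assumptions, not Cassandra's actual internals) shows why a permanently backed-up gossip stage silently disables failure detection:

```python
# Rough sketch of the guard that emits the warning shown in the logs.
# Names and the zero-backlog condition are illustrative assumptions.

def maybe_run_status_check(pending_gossip_tasks, mark_dead_nodes):
    """Run failure detection only when the gossip stage is keeping up."""
    if pending_gossip_tasks > 0:
        print(f"WARN Gossip stage has {pending_gossip_tasks} pending tasks; "
              "skipping status check (no nodes will be marked down)")
        return False
    mark_dead_nodes()  # would mark unresponsive endpoints down
    return True

# With a persistent backlog of 32 tasks, as in the log, the check never
# runs, so dead peers are never marked down and gossip keeps retrying them:
ran = maybe_run_status_check(32, mark_dead_nodes=lambda: None)
print(ran)  # False
```

This is what makes the symptom self-sustaining: gossiping to an unresponsive peer keeps the stage busy, the busy stage skips the check, and the peer is never marked down.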
[jira] [Commented] (CASSANDRA-10327) Performance regression in 2.2
[ https://issues.apache.org/jira/browse/CASSANDRA-10327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744082#comment-14744082 ] Benedict commented on CASSANDRA-10327: -- As [~tjake] points out [here|https://issues.apache.org/jira/browse/CASSANDRA-10326?focusedCommentId=14744064&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14744064], it's entirely possible this is caused by 2.1 failing part way through the write load. > Performance regression in 2.2 > - > > Key: CASSANDRA-10327 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10327 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Benedict > Fix For: 2.2.2 > > > Related to CASSANDRA-10326, one of the read-only workloads _appears_ to show > a regression in 2.2, however it is possible this is simply down to a > different compaction result (this shouldn't be very likely given the use of > LCS, though, and that we wait for compaction to quiesce, and while the > difference is not consistent across both runs, it is consistently worse). > The query is looking up the last item of a partition. > [run1|http://cstar.datastax.com/graph?stats=f0a17292-5a13-11e5-847a-42010af0688f&metric=op_rate&operation=3_user&smoothing=1&show_aggregates=true&xmin=0&xmax=155.43&ymin=0&ymax=13777.5] > [run2|http://cstar.datastax.com/graph?stats=e250-5a13-11e5-ae0d-42010af0688f&metric=op_rate&operation=3_user&smoothing=1&show_aggregates=true&xmin=0&xmax=74.36&ymin=0&ymax=34078] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-10326) Performance is worse in 3.0
[ https://issues.apache.org/jira/browse/CASSANDRA-10326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744070#comment-14744070 ] Benedict edited comment on CASSANDRA-10326 at 9/14/15 7:18 PM: --- 2.2 did not fail however, and is faster than both. 2.1 was faster until it failed (it's not clear why it failed, either, possibly just a too-lengthy pause. I haven't investigated) (note also the no-collections run, in which 2.1 did not fail and was faster) was (Author: benedict): 2.2 did not fail however, and is faster than both. 2.1 was faster until it failed (it's not clear why it failed, either, possibly just a too-lengthy pause. I haven't investigated) > Performance is worse in 3.0 > --- > > Key: CASSANDRA-10326 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10326 > Project: Cassandra > Issue Type: Bug >Reporter: Benedict >Priority: Critical > Fix For: 3.0.x > > > Performance is generally turning out to be worse after 8099, despite a number > of unrelated performance enhancements being delivered. This isn't entirely > unexpected, given a great deal of time was spent optimising the old code, > however things appear worse than we had hoped. > My expectation was that workloads making extensive use of CQL constructs > would be faster post-8099, however the latest tests performed with very large > CQL rows, including use of collections, still exhibit performance below that > of 2.1 and 2.2. > Eventually, as the dataset size grows large enough and the locality of access > is just right, the reduction in size of our dataset will yield a window > during which some users will perform better due simply to improved page cache > hit rates. We seem to see this in some of the tests. However we should be at > least as fast (and really faster) off the bat. > The following are some large partition benchmark results, with as many as 40K > rows per partition, running LCS. 
There are a number of parameters we can > modify to see how behaviour changes and under what scenarios we might still > be faster, but the picture painted isn't brilliant, and is consistent, so we > should really try and figure out what's up before GA. > [trades-with-flags (collections), > blade11b|http://cstar.datastax.com/graph?stats=f0a17292-5a13-11e5-847a-42010af0688f&metric=op_rate&operation=1_user&smoothing=1&show_aggregates=true&xmin=0&xmax=4387.02&ymin=0&ymax=122951.4] > [trades-with-flags (collections), > blade11|http://cstar.datastax.com/graph?stats=e250-5a13-11e5-ae0d-42010af0688f&metric=op_rate&operation=1_user&smoothing=1&show_aggregates=true&xmin=0&xmax=4424.75&ymin=0&ymax=130158.6] > [trades (no collections), > blade11|http://cstar.datastax.com/graph?stats=9b7da48e-570c-11e5-90fe-42010af0688f&metric=op_rate&operation=1_user&smoothing=1&show_aggregates=true&xmin=0&xmax=2682.46&ymin=0&ymax=142547.9] > [~slebresne]: will you have time to look into this before GA? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10326) Performance is worse in 3.0
[ https://issues.apache.org/jira/browse/CASSANDRA-10326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744070#comment-14744070 ] Benedict commented on CASSANDRA-10326: -- 2.2 did not fail however, and is faster than both. 2.1 was faster until it failed (it's not clear why it failed, either, possibly just a too-lengthy pause. I haven't investigated) > Performance is worse in 3.0 > --- > > Key: CASSANDRA-10326 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10326 > Project: Cassandra > Issue Type: Bug >Reporter: Benedict >Priority: Critical > Fix For: 3.0.x > > > Performance is generally turning out to be worse after 8099, despite a number > of unrelated performance enhancements being delivered. This isn't entirely > unexpected, given a great deal of time was spent optimising the old code, > however things appear worse than we had hoped. > My expectation was that workloads making extensive use of CQL constructs > would be faster post-8099, however the latest tests performed with very large > CQL rows, including use of collections, still exhibit performance below that > of 2.1 and 2.2. > Eventually, as the dataset size grows large enough and the locality of access > is just right, the reduction in size of our dataset will yield a window > during which some users will perform better due simply to improved page cache > hit rates. We seem to see this in some of the tests. However we should be at > least as fast (and really faster) off the bat. > The following are some large partition benchmark results, with as many as 40K > rows per partition, running LCS. There are a number of parameters we can > modify to see how behaviour changes and under what scenarios we might still > be faster, but the picture painted isn't brilliant, and is consistent, so we > should really try and figure out what's up before GA. 
> [trades-with-flags (collections), > blade11b|http://cstar.datastax.com/graph?stats=f0a17292-5a13-11e5-847a-42010af0688f&metric=op_rate&operation=1_user&smoothing=1&show_aggregates=true&xmin=0&xmax=4387.02&ymin=0&ymax=122951.4] > [trades-with-flags (collections), > blade11|http://cstar.datastax.com/graph?stats=e250-5a13-11e5-ae0d-42010af0688f&metric=op_rate&operation=1_user&smoothing=1&show_aggregates=true&xmin=0&xmax=4424.75&ymin=0&ymax=130158.6] > [trades (no collections), > blade11|http://cstar.datastax.com/graph?stats=9b7da48e-570c-11e5-90fe-42010af0688f&metric=op_rate&operation=1_user&smoothing=1&show_aggregates=true&xmin=0&xmax=2682.46&ymin=0&ymax=142547.9] > [~slebresne]: will you have time to look into this before GA? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10326) Performance is worse in 3.0
[ https://issues.apache.org/jira/browse/CASSANDRA-10326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744064#comment-14744064 ] T Jake Luciani commented on CASSANDRA-10326: bq. still exhibit performance below that of 2.1 and 2.2. Note: The 2.1.9 run seems to time out many requests (looking at the console logs) which in turn makes it look like it finishes the fastest. The pure read runs look more concerning. Although I assume 2.1.9 is 2x faster because the writes failed. > Performance is worse in 3.0 > --- > > Key: CASSANDRA-10326 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10326 > Project: Cassandra > Issue Type: Bug >Reporter: Benedict >Priority: Critical > Fix For: 3.0.x > > > Performance is generally turning out to be worse after 8099, despite a number > of unrelated performance enhancements being delivered. This isn't entirely > unexpected, given a great deal of time was spent optimising the old code, > however things appear worse than we had hoped. > My expectation was that workloads making extensive use of CQL constructs > would be faster post-8099, however the latest tests performed with very large > CQL rows, including use of collections, still exhibit performance below that > of 2.1 and 2.2. > Eventually, as the dataset size grows large enough and the locality of access > is just right, the reduction in size of our dataset will yield a window > during which some users will perform better due simply to improved page cache > hit rates. We seem to see this in some of the tests. However we should be at > least as fast (and really faster) off the bat. > The following are some large partition benchmark results, with as many as 40K > rows per partition, running LCS. 
There are a number of parameters we can > modify to see how behaviour changes and under what scenarios we might still > be faster, but the picture painted isn't brilliant, and is consistent, so we > should really try and figure out what's up before GA. > [trades-with-flags (collections), > blade11b|http://cstar.datastax.com/graph?stats=f0a17292-5a13-11e5-847a-42010af0688f&metric=op_rate&operation=1_user&smoothing=1&show_aggregates=true&xmin=0&xmax=4387.02&ymin=0&ymax=122951.4] > [trades-with-flags (collections), > blade11|http://cstar.datastax.com/graph?stats=e250-5a13-11e5-ae0d-42010af0688f&metric=op_rate&operation=1_user&smoothing=1&show_aggregates=true&xmin=0&xmax=4424.75&ymin=0&ymax=130158.6] > [trades (no collections), > blade11|http://cstar.datastax.com/graph?stats=9b7da48e-570c-11e5-90fe-42010af0688f&metric=op_rate&operation=1_user&smoothing=1&show_aggregates=true&xmin=0&xmax=2682.46&ymin=0&ymax=142547.9] > [~slebresne]: will you have time to look into this before GA? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10267) Failing tests in upgrade_trests.paging_test
[ https://issues.apache.org/jira/browse/CASSANDRA-10267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744050#comment-14744050 ] Jim Witschey commented on CASSANDRA-10267: -- These tests also fail, though intermittently: http://cassci.datastax.com/job/trunk_dtest-skipped-with-require/lastCompletedBuild/testReport/upgrade_tests.paging_test/TestPagingData/static_columns_paging_test/history/ http://cassci.datastax.com/job/trunk_dtest-skipped-with-require/lastCompletedBuild/testReport/upgrade_tests.paging_test/TestPagingWithDeletions/test_multiple_row_deletions/history/ [~bdeggleston] How did you want this managed in Jira? I can open a new ticket for these if you like, or did you want CASSANDRA-9761 reopened? I can also un-tag those tests so they run with normal dtest jobs if you like. > Failing tests in upgrade_trests.paging_test > --- > > Key: CASSANDRA-10267 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10267 > Project: Cassandra > Issue Type: Sub-task >Reporter: Sylvain Lebresne >Assignee: Sylvain Lebresne > Fix For: 3.0.0 rc1 > > > This is a continuation of CASSANDRA-9893 to deal with the failure of the > {{upgrade_trests.paging_test}} tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-10327) Performance regression in 2.2
Benedict created CASSANDRA-10327: Summary: Performance regression in 2.2 Key: CASSANDRA-10327 URL: https://issues.apache.org/jira/browse/CASSANDRA-10327 Project: Cassandra Issue Type: Bug Components: Core Reporter: Benedict Fix For: 2.2.2 Related to CASSANDRA-10326, one of the read-only workloads _appears_ to show a regression in 2.2, however it is possible this is simply down to a different compaction result (this shouldn't be very likely given the use of LCS, though, and that we wait for compaction to quiesce, and while the difference is not consistent across both runs, it is consistently worse). The query is looking up the last item of a partition. [run1|http://cstar.datastax.com/graph?stats=f0a17292-5a13-11e5-847a-42010af0688f&metric=op_rate&operation=3_user&smoothing=1&show_aggregates=true&xmin=0&xmax=155.43&ymin=0&ymax=13777.5] [run2|http://cstar.datastax.com/graph?stats=e250-5a13-11e5-ae0d-42010af0688f&metric=op_rate&operation=3_user&smoothing=1&show_aggregates=true&xmin=0&xmax=74.36&ymin=0&ymax=34078] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-10326) Performance is worse in 3.0
Benedict created CASSANDRA-10326: Summary: Performance is worse in 3.0 Key: CASSANDRA-10326 URL: https://issues.apache.org/jira/browse/CASSANDRA-10326 Project: Cassandra Issue Type: Bug Reporter: Benedict Priority: Critical Fix For: 3.0.x Performance is generally turning out to be worse after 8099, despite a number of unrelated performance enhancements being delivered. This isn't entirely unexpected, given a great deal of time was spent optimising the old code, however things appear worse than we had hoped. My expectation was that workloads making extensive use of CQL constructs would be faster post-8099, however the latest tests performed with very large CQL rows, including use of collections, still exhibit performance below that of 2.1 and 2.2. Eventually, as the dataset size grows large enough and the locality of access is just right, the reduction in size of our dataset will yield a window during which some users will perform better due simply to improved page cache hit rates. We seem to see this in some of the tests. However we should be at least as fast (and really faster) off the bat. The following are some large partition benchmark results, with as many as 40K rows per partition, running LCS. There are a number of parameters we can modify to see how behaviour changes and under what scenarios we might still be faster, but the picture painted isn't brilliant, and is consistent, so we should really try and figure out what's up before GA. 
[trades-with-flags (collections), blade11b|http://cstar.datastax.com/graph?stats=f0a17292-5a13-11e5-847a-42010af0688f&metric=op_rate&operation=1_user&smoothing=1&show_aggregates=true&xmin=0&xmax=4387.02&ymin=0&ymax=122951.4] [trades-with-flags (collections), blade11|http://cstar.datastax.com/graph?stats=e250-5a13-11e5-ae0d-42010af0688f&metric=op_rate&operation=1_user&smoothing=1&show_aggregates=true&xmin=0&xmax=4424.75&ymin=0&ymax=130158.6] [trades (no collections), blade11|http://cstar.datastax.com/graph?stats=9b7da48e-570c-11e5-90fe-42010af0688f&metric=op_rate&operation=1_user&smoothing=1&show_aggregates=true&xmin=0&xmax=2682.46&ymin=0&ymax=142547.9] [~slebresne]: will you have time to look into this before GA? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10198) 3.0 hints should be streamed on decomission
[ https://issues.apache.org/jira/browse/CASSANDRA-10198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744041#comment-14744041 ] Aleksey Yeschenko commented on CASSANDRA-10198: --- bq. Patch has an upper limit to how many times we retry before failing - do we want to retry forever? Not sure. But failure should be more obvious than just an error log line. We should probably fail the decom? > 3.0 hints should be streamed on decomission > --- > > Key: CASSANDRA-10198 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10198 > Project: Cassandra > Issue Type: Bug >Reporter: Aleksey Yeschenko >Assignee: Marcus Eriksson > Fix For: 3.0.0 rc1 > > > CASSANDRA-6230 added all the necessary pieces in the initial release, but > streaming itself didn't make it in time. > Now that hints are stored in flat files, we cannot just stream hints > sstables. Instead we need to handoff hints files. > Essentially we need to rewrite {{StorageService::streamHints}} to be > CASSANDRA-6230 aware. > {{HintMessage}} and {{HintVerbHandler}} can already handle hints targeted for > other nodes (see javadoc for both, it's documented reasonably). > {{HintsDispatcher}} also takes hostId as an argument, and can stream any > hints to any nodes. > The building blocks are all there - we just need > {{StorageService::streamHints}} to pick the optimal candidate for each file > and use {{HintsDispatcher}} to stream the files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
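The flow described above — have {{StorageService::streamHints}} pick a candidate per hints file and hand it to {{HintsDispatcher}} — can be sketched roughly as follows. All type names here are simplified stand-ins for illustration, not the real org.apache.cassandra.hints API:

```java
import java.util.List;
import java.util.Map;
import java.util.UUID;

public class StreamHintsSketch {
    // Hypothetical stand-in for a flat hints file targeting one host.
    static final class HintsFile {
        final UUID targetHostId;
        HintsFile(UUID targetHostId) { this.targetHostId = targetHostId; }
    }

    // Hypothetical stand-in for HintsDispatcher, which can already send
    // hints for any hostId to any endpoint.
    interface Dispatcher {
        boolean dispatch(HintsFile file, UUID hostId, String endpoint);
    }

    // Sketch of the proposed streamHints rewrite: for each hints file,
    // pick a candidate endpoint for its target host and dispatch to it.
    static int streamHints(List<HintsFile> files,
                           Map<UUID, List<String>> candidatesByHost,
                           Dispatcher dispatcher) {
        int dispatched = 0;
        for (HintsFile f : files) {
            List<String> candidates =
                candidatesByHost.getOrDefault(f.targetHostId, List.of());
            if (candidates.isEmpty())
                continue; // no live candidate: retry, or fail the decom loudly
            // "optimal candidate" selection is elided; take the first here
            if (dispatcher.dispatch(f, f.targetHostId, candidates.get(0)))
                dispatched++;
        }
        return dispatched;
    }

    public static void main(String[] args) {
        UUID host = UUID.randomUUID();
        List<HintsFile> files =
            List.of(new HintsFile(host), new HintsFile(UUID.randomUUID()));
        Map<UUID, List<String>> candidates = Map.of(host, List.of("10.0.0.2"));
        int n = streamHints(files, candidates, (f, h, e) -> true);
        if (n != 1) throw new AssertionError();
        System.out.println("dispatched " + n);
    }
}
```

The file with no known candidate is silently skipped in this sketch; per the comment above, the real implementation would likely need to surface that as a decommission failure rather than just an error log line.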
[jira] [Updated] (CASSANDRA-10324) Create randomized test of DataResolver
[ https://issues.apache.org/jira/browse/CASSANDRA-10324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Blake Eggleston updated CASSANDRA-10324: Description: As discussed in CASSANDRA-10266, testing DataResolver with random (and repeatable) sequences of node query combinations, messaging failures, memtable flushes, and compactions would have revealed the bug found there, and possibly others. (was: As discussed in CASSANDRA-10266, testing DataResolver with random (and repeatable) sequences of node query combinations, messaging failures, memtable flushes, and compactions would have revealed the bug found here) > Create randomized test of DataResolver > -- > > Key: CASSANDRA-10324 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10324 > Project: Cassandra > Issue Type: Test >Reporter: Blake Eggleston >Assignee: Blake Eggleston >Priority: Minor > Fix For: 3.x > > > As discussed in CASSANDRA-10266, testing DataResolver with random (and > repeatable) sequences of node query combinations, messaging failures, > memtable flushes, and compactions would have revealed the bug found there, > and possibly others. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-10324) Create randomized test of DataResolver
[ https://issues.apache.org/jira/browse/CASSANDRA-10324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Blake Eggleston updated CASSANDRA-10324: Description: As discussed in CASSANDRA-10266, testing DataResolver with random (and repeatable) sequences of node query combinations, messaging failures, memtable flushes, and compactions would have revealed the bug found here > Create randomized test of DataResolver > -- > > Key: CASSANDRA-10324 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10324 > Project: Cassandra > Issue Type: Test >Reporter: Blake Eggleston >Assignee: Blake Eggleston >Priority: Minor > Fix For: 3.x > > > As discussed in CASSANDRA-10266, testing DataResolver with random (and > repeatable) sequences of node query combinations, messaging failures, > memtable flushes, and compactions would have revealed the bug found here -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-10325) Create randomized test of Rows
Blake Eggleston created CASSANDRA-10325: --- Summary: Create randomized test of Rows Key: CASSANDRA-10325 URL: https://issues.apache.org/jira/browse/CASSANDRA-10325 Project: Cassandra Issue Type: Test Reporter: Blake Eggleston Assignee: Blake Eggleston Priority: Minor Fix For: 3.x -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-10324) Create randomized test of DataResolver
Blake Eggleston created CASSANDRA-10324: --- Summary: Create randomized test of DataResolver Key: CASSANDRA-10324 URL: https://issues.apache.org/jira/browse/CASSANDRA-10324 Project: Cassandra Issue Type: Test Reporter: Blake Eggleston Assignee: Blake Eggleston Priority: Minor Fix For: 3.x -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10266) Introduce direct unit test coverage for Rows
[ https://issues.apache.org/jira/browse/CASSANDRA-10266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744023#comment-14744023 ] Blake Eggleston commented on CASSANDRA-10266: - bq. Am I interpreting that sentence correctly as a suggestion we actually follow up with that? Yes, definitely > Introduce direct unit test coverage for Rows > > > Key: CASSANDRA-10266 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10266 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Benedict >Assignee: Blake Eggleston > Fix For: 3.0.0 rc1 > > > As with much of the codebase, we have no direct unit test coverage for > {{Rows}}, and we should remedy this given how central it is to behaviour. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10322) skipBytes is used extensively, but is slow
[ https://issues.apache.org/jira/browse/CASSANDRA-10322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744013#comment-14744013 ] Benedict commented on CASSANDRA-10322: -- Patch available [here|https://github.com/belliottsmith/cassandra/tree/10322] > skipBytes is used extensively, but is slow > -- > > Key: CASSANDRA-10322 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10322 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Benedict >Assignee: Benedict >Priority: Minor > Fix For: 3.0.x > > > We skip a great deal to avoid materializing data. Ironically, however, > skipping is just as (perhaps more) expensive, as it allocates a temporary > array of the size of the number of bytes we want to skip. > This trivial patch implements {{skipBytes}} more efficiently, and simplifies > {{FileUtils.skipBytesFully}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-10323) Add more MaterializedView metrics
T Jake Luciani created CASSANDRA-10323: -- Summary: Add more MaterializedView metrics Key: CASSANDRA-10323 URL: https://issues.apache.org/jira/browse/CASSANDRA-10323 Project: Cassandra Issue Type: Improvement Reporter: T Jake Luciani Assignee: T Jake Luciani Fix For: 3.0.0 rc1 We need to add more metrics to help understand where time is spent in materialized view writes. We currently track the ratio of async base -> view mutations that fail. We should also add * The amount of time spent waiting for the partition lock (contention) * The amount of time spent reading data Any others? [~carlyeks] [~jkni] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-10322) skipBytes is used extensively, but is slow
Benedict created CASSANDRA-10322: Summary: skipBytes is used extensively, but is slow Key: CASSANDRA-10322 URL: https://issues.apache.org/jira/browse/CASSANDRA-10322 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Benedict Priority: Minor Fix For: 3.0.x We skip a great deal to avoid materializing data. Ironically, however, skipping is just as (perhaps more) expensive, as it allocates a temporary array of the size of the number of bytes we want to skip. This trivial patch implements {{skipBytes}} more efficiently, and simplifies {{FileUtils.skipBytesFully}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
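The inefficiency described — skipping by materializing a throwaway array the size of the skip — can be avoided by looping on {{InputStream.skip()}}, which may skip fewer bytes than requested. The following is only an illustrative sketch of that idea, not the actual patch:

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

public class SkipDemo {
    // Illustrative helper: skip exactly n bytes without allocating a
    // temporary buffer. skip() is allowed to skip fewer bytes than asked
    // (or zero), so we loop, and fall back to a single-byte read to
    // distinguish a short skip from end-of-stream.
    static void skipBytesFully(DataInputStream in, long n) throws IOException {
        long remaining = n;
        while (remaining > 0) {
            long skipped = in.skip(remaining);
            if (skipped <= 0) {
                if (in.read() < 0)
                    throw new EOFException(
                        "EOF after " + (n - remaining) + " of " + n + " bytes");
                skipped = 1;
            }
            remaining -= skipped;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[1024];
        data[1000] = 42;
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        skipBytesFully(in, 1000);
        if (in.read() != 42) throw new AssertionError();
        System.out.println("ok");
    }
}
```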
[jira] [Commented] (CASSANDRA-10266) Introduce direct unit test coverage for Rows
[ https://issues.apache.org/jira/browse/CASSANDRA-10266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744005#comment-14744005 ] Benedict commented on CASSANDRA-10266: -- I guess I missed that crucial sentence where you sneakily suggested covering this _plus more_. Am I interpreting that sentence correctly as a suggestion we actually follow up with that? Either way, let's commit this one (especially to get the bug fix in) and target that as a follow up. > Introduce direct unit test coverage for Rows > > > Key: CASSANDRA-10266 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10266 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Benedict >Assignee: Blake Eggleston > Fix For: 3.0.0 rc1 > > > As with much of the codebase, we have no direct unit test coverage for > {{Rows}}, and we should remedy this given how central it is to behaviour. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10266) Introduce direct unit test coverage for Rows
[ https://issues.apache.org/jira/browse/CASSANDRA-10266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743982#comment-14743982 ] Blake Eggleston commented on CASSANDRA-10266: - I agree with your first point. To your second, I'm confident that I've explored every branch, but not every possible combination of inputs. As far as being future proof, I'm not sure one approach will be more future proof than another, since either approach will need to be updated as changes are made. In any case, I agree that a randomized test will be more thorough. You didn't address my second point though, which is that for a similar effort, we can exercise 10x the amount of code, even more edge cases, and more potential bugs, albeit at the cost of a fully comprehensive test of Rows (randomized testing of DataResolver would exercise merge and diff, but not copy and collectStats). I think both should be implemented, but I don't think they can both be implemented for rc1. WDYT? > Introduce direct unit test coverage for Rows > > > Key: CASSANDRA-10266 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10266 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Benedict >Assignee: Blake Eggleston > Fix For: 3.0.0 rc1 > > > As with much of the codebase, we have no direct unit test coverage for > {{Rows}}, and we should remedy this given how central it is to behaviour. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10314) Update index file format
[ https://issues.apache.org/jira/browse/CASSANDRA-10314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743977#comment-14743977 ] Ariel Weisberg commented on CASSANDRA-10314: The addition to DataOutputPlus seems reasonable. I can't come up with a strong objection to that approach over some of the others for having an optional bit of functionality like that. There are things besides files that can provide a pointer, but this is so trivial to refactor/maintain in the future I would just tackle it then. So +1 pending on Cassci. > Update index file format > > > Key: CASSANDRA-10314 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10314 > Project: Cassandra > Issue Type: Improvement >Reporter: Robert Stupp >Assignee: Robert Stupp > Fix For: 3.0.0 rc1 > > > As CASSANDRA-9738 may not make it into 3.0rc1, but having an off-heap > key-cache is still a goal, we should change the index file format to meet > off-heap requirements (so I've set fixver to 3.0rc1). > Off-heap (and mmap'd index files) need the offsets of the individual > IndexInfo objects and at least the offset field of IndexInfo structures. > The format I propose is as follows: > {noformat} > (long) position (vint since 3.0, 64bit before) > (int) serialized size of data that follows (vint since 3.0, 32bit before) > -- following for indexed entries only (so serialized size > 0) > (long) header-length (vint since 3.0) > (int) DeletionTime.localDeletionTime (32 bit int) > (long) DeletionTime.markedForDeletionAt (64 bit long) > (int) number of IndexInfo objects (vint since 3.0, 32bit before) > (*) serialized IndexInfo objects, see below > (*) offsets of serialized IndexInfo objects, since version "ma" (3.0) > Each IndexInfo object's offset is relative to the first IndexInfo > object. 
> {noformat} > {noformat} > (*) IndexInfo.firstName (ClusteringPrefix serializer, either > Clustering.serializer.serialize or Slice.Bound.serializer.serialize) > (*) IndexInfo.lastName (ClusteringPrefix serializer, either > Clustering.serializer.serialize or Slice.Bound.serializer.serialize) > (long) IndexInfo.offset (vint encoded since 3.0, 64bit int before) > (long) IndexInfo.width (vint encoded since 3.0, 64bit int before) > (bool) IndexInfo.endOpenMarker != null (if 3.0) > (int) IndexInfo.endOpenMarker.localDeletionTime(if 3.0 && > IndexInfo.endOpenMarker != null) > (long) IndexInfo.endOpenMarker.markedForDeletionAt (if 3.0 && > IndexInfo.endOpenMarker != null) > {noformat} > Regarding the {{IndexInfo.offset}} and {{.width}} fields there are two > options. > * Serialize both of them or > * Serialize only the offset field plus a _last byte offset_ to be able to > recalculate the width of the last IndexInfo > The first option is probably the simpler one, the second saves a few bytes > (those of the vint encoded width). > EDIT: update vint fields (as per CASSANDRA-10232) > EDIT2: add header-length fields (as per CASSANDRA-10232) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
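Several fields in the format above are vint-encoded since 3.0. As a rough illustration of why that saves space for small values, here is a LEB128-style unsigned varint codec; note this is a generic sketch, not Cassandra's actual VIntCoding scheme:

```java
import java.io.ByteArrayOutputStream;

public class VIntDemo {
    // LEB128-style unsigned varint: 7 payload bits per byte, high bit set
    // on every byte except the last. Small values cost one byte instead
    // of a fixed 4 or 8.
    static void writeUnsignedVInt(long value, ByteArrayOutputStream out) {
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value);
    }

    static long readUnsignedVInt(byte[] buf) {
        long value = 0;
        int shift = 0, i = 0;
        while (true) {
            byte b = buf[i++];
            value |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0)
                return value;
            shift += 7;
        }
    }

    public static void main(String[] args) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeUnsignedVInt(300, out);
        byte[] enc = out.toByteArray();
        // 300 needs 9 bits, so two varint bytes instead of a fixed 8
        if (enc.length != 2) throw new AssertionError();
        if (readUnsignedVInt(enc) != 300) throw new AssertionError();
        System.out.println("roundtrip ok, bytes=" + enc.length);
    }
}
```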
[jira] [Resolved] (CASSANDRA-10319) aggregate sum on Counter type of Cassandra
[ https://issues.apache.org/jira/browse/CASSANDRA-10319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Stupp resolved CASSANDRA-10319. -- Resolution: Duplicate Fix Version/s: (was: 2.2.x) (was: 3.x) This issue is already addressed in CASSANDRA-9977. > aggregate sum on Counter type of Cassandra > -- > > Key: CASSANDRA-10319 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10319 > Project: Cassandra > Issue Type: Wish > Components: Core >Reporter: Xin Ren > > I'm using Cassandra 2.2.0, and I've read "using Counter type". > Is there a way to aggregate on Counter type column for Cassandra, similar to > the following? > {code}SELECT sum(my_counter_column) FROM my_table ;{code} > The above query results in this error: > {code} > InvalidRequest: code=2200 [Invalid query] message="Invalid call to > function sum, none of its type signatures match (known type > signatures: system.sum : (tinyint) -> tinyint, system.sum : (smallint) > -> smallint, system.sum : (int) -> int, system.sum : (bigint) -> bigint, > system.sum : (float) -> float, system.sum : (double) -> > double, system.sum : (decimal) -> decimal, system.sum : (varint) -> > varint)" > {code} > I know I can fetch all data and then do aggregation in the client, but I'm > just wondering if it can be done within Cassandra. Thanks a lot. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-10321) Gossip to dead nodes caused CPU usage to be 100%
Dikang Gu created CASSANDRA-10321: - Summary: Gossip to dead nodes caused CPU usage to be 100% Key: CASSANDRA-10321 URL: https://issues.apache.org/jira/browse/CASSANDRA-10321 Project: Cassandra Issue Type: Bug Reporter: Dikang Gu For one node, the cpu usage jumped to 100%, and logs are full of: 2015-09-14_16:34:45.56407 WARN 16:34:45 Gossip stage has 32 pending tasks; skipping status check (no nodes will be marked down) 2015-09-14_16:34:46.66616 WARN 16:34:46 Gossip stage has 32 pending tasks; skipping status check (no nodes will be marked down) 2015-09-14_16:34:47.76830 WARN 16:34:47 Gossip stage has 32 pending tasks; skipping status check (no nodes will be marked down) 2015-09-14_16:34:48.87043 WARN 16:34:48 Gossip stage has 32 pending tasks; skipping status check (no nodes will be marked down) 2015-09-14_16:34:49.97253 WARN 16:34:49 Gossip stage has 32 pending tasks; skipping status check (no nodes will be marked down) 2015-09-14_16:34:51.07462 WARN 16:34:51 Gossip stage has 32 pending tasks; skipping status check (no nodes will be marked down) 2015-09-14_16:34:52.17669 WARN 16:34:52 Gossip stage has 32 pending tasks; skipping status check (no nodes will be marked down) 2015-09-14_16:34:53.27880 WARN 16:34:53 Gossip stage has 32 pending tasks; skipping status check (no nodes will be marked down) 2015-09-14_16:34:54.38090 WARN 16:34:54 Gossip stage has 32 pending tasks; skipping status check (no nodes will be marked down) 2015-09-14_16:34:55.48301 WARN 16:34:55 Gossip stage has 32 pending tasks; skipping status check (no nodes will be marked down) 2015-09-14_16:34:56.58509 WARN 16:34:56 Gossip stage has 32 pending tasks; skipping status check (no nodes will be marked down) 2015-09-14_16:34:57.68721 WARN 16:34:57 Gossip stage has 32 pending tasks; skipping status check (no nodes will be marked down) 2015-09-14_16:34:58.78932 WARN 16:34:58 Gossip stage has 32 pending tasks; skipping status check (no nodes will be marked down) 2015-09-14_16:34:59.89142 WARN 16:34:59 
Gossip stage has 32 pending tasks; skipping status check (no nodes will be marked down) 2015-09-14_16:35:00.99352 WARN 16:35:00 Gossip stage has 32 pending tasks; skipping status check (no nodes will be marked down) 2015-09-14_16:35:02.09563 WARN 16:35:02 Gossip stage has 32 pending tasks; skipping status check (no nodes will be marked down) 2015-09-14_16:35:03.19775 WARN 16:35:03 Gossip stage has 32 pending tasks; skipping status check (no nodes will be marked down) 2015-09-14_16:35:04.29982 WARN 16:35:04 Gossip stage has 32 pending tasks; skipping status check (no nodes will be marked down) 2015-09-14_16:35:05.40187 WARN 16:35:05 Gossip stage has 32 pending tasks; skipping status check (no nodes will be marked down) 2015-09-14_16:35:06.50369 WARN 16:35:06 Gossip stage has 32 pending tasks; skipping status check (no nodes will be marked down) 2015-09-14_16:35:07.60577 WARN 16:35:07 Gossip stage has 32 pending tasks; skipping status check (no nodes will be marked down) 2015-09-14_16:35:08.70779 WARN 16:35:08 Gossip stage has 32 pending tasks; skipping status check (no nodes will be marked down) 2015-09-14_16:35:09.80968 WARN 16:35:09 Gossip stage has 32 pending tasks; skipping status check (no nodes will be marked down) 2015-09-14_16:35:10.91157 WARN 16:35:10 Gossip stage has 32 pending tasks; skipping status check (no nodes will be marked down) 2015-09-14_16:35:12.01365 WARN 16:35:12 Gossip stage has 32 pending tasks; skipping status check (no nodes will be marked down) 2015-09-14_16:35:13.11569 WARN 16:35:13 Gossip stage has 32 pending tasks; skipping status check (no nodes will be marked down) 2015-09-14_16:35:14.21757 WARN 16:35:14 Gossip stage has 32 pending tasks; skipping status check (no nodes will be marked down) 2015-09-14_16:35:15.31942 WARN 16:35:15 Gossip stage has 32 pending tasks; skipping status check (no nodes will be marked down) 2015-09-14_16:35:16.42132 WARN 16:35:16 Gossip stage has 32 pending tasks; skipping status check (no nodes will be 
marked down) 2015-09-14_16:35:17.52332 WARN 16:35:17 Gossip stage has 32 pending tasks; skipping status check (no nodes will be marked down) 2015-09-14_16:35:18.62511 WARN 16:35:18 Gossip stage has 32 pending tasks; skipping status check (no nodes will be marked down) 2015-09-14_16:35:19.72697 WARN 16:35:19 Gossip stage has 32 pending tasks; skipping status check (no nodes will be marked down) 2015-09-14_16:35:20.82872 WARN 16:35:20 Gossip stage has 32 pending tasks; skipping status check (no nodes will be marked down) 2015-09-14_16:35:21.93074 WARN 16:35:21 Gossip stage has 32 pending tasks; skipping status check (no nodes will be marked down) 2015-09-14_16:35:23.03281 WARN 16:35:23 Gossip stage has 32 pending tasks; skipping status check (no nodes will be marked down) 2015-09-14_16:35:24.13478 WARN
[jira] [Commented] (CASSANDRA-10292) java.lang.AssertionError: attempted to delete non-existing file CommitLog...
[ https://issues.apache.org/jira/browse/CASSANDRA-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743964#comment-14743964 ] David Loegering commented on CASSANDRA-10292: - Hi Davwd, We are seeing a similar issue with what appears to be different root cause. What was fixed to resolve this issue? > java.lang.AssertionError: attempted to delete non-existing file CommitLog... > > > Key: CASSANDRA-10292 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10292 > Project: Cassandra > Issue Type: Bug > Environment: CentOS Linux 7.1.1503, Cassandra 2.1.8 stable version, 6 > nodes cluster >Reporter: Dawid Szejnfeld >Priority: Critical > > From time to time some nodes are stopping to work due to error in logs like > this: > INFO [CompactionExecutor:2475] 2015-09-09 12:36:50,363 > CompactionTask.java:274 - Compacted 4 sstables to > [/mnt/cassandra--storage-machine/data/system/compactions_in_progress-55080ab05d9c38 > 8690a4acb25fe1f77b/system-compactions_in_progress-ka-126,]. 419 bytes to 42 > (~10% of original) in 33ms = 0.001214MB/s. 4 total partitions merged to 1. 
> Partition merge counts were {2:2, } > INFO [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:34,166 > ColumnFamilyStore.java:912 - Enqueuing flush of settings: 78364 (0%) on-heap, > 0 (0%) off-heap > INFO [MemtableFlushWriter:301] 2015-09-09 12:52:34,172 Memtable.java:347 - > Writing Memtable-settings@1126939979(0.113KiB serialized bytes, 1850 ops, > 0%/0% of on/off-heap limit) > INFO [MemtableFlushWriter:301] 2015-09-09 12:52:34,174 Memtable.java:382 - > Completed flushing > /mnt/cassandra--storage-machine/data/OpsCenter/settings-464866c04b1311e590698d1a9fd4ba8b/OpsCe > nter-settings-tmp-ka-12-Data.db (0.000KiB) for commitlog position > ReplayPosition(segmentId=1441362636571, position=33554415) > ERROR [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:34,194 StorageService.java:453 > - Stopping gossiper > WARN [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:34,195 StorageService.java:359 > - Stopping gossip by operator request > INFO [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:34,195 Gossiper.java:1410 - > Announcing shutdown > ERROR [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:36,195 StorageService.java:458 > - Stopping RPC server > INFO [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:36,196 ThriftServer.java:142 - > Stop listening to thrift clients > ERROR [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:36,204 StorageService.java:463 > - Stopping native transport > INFO [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:36,422 Server.java:213 - Stop > listening for CQL clients > ERROR [COMMIT-LOG-ALLOCATOR] 2015-09-09 12:52:36,423 CommitLog.java:397 - > Failed managing commit log segments. 
Commit disk failure policy is stop; > terminating thread > java.lang.AssertionError: attempted to delete non-existing file > CommitLog-4-1441362636316.log > at > org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:126) > ~[apache-cassandra-2.1.8.jar:2.1.8] > at > org.apache.cassandra.db.commitlog.CommitLogSegment.delete(CommitLogSegment.java:343) > ~[apache-cassandra-2.1.8.jar:2.1.8] > at > org.apache.cassandra.db.commitlog.CommitLogSegmentManager$5.call(CommitLogSegmentManager.java:418) > ~[apache-cassandra-2.1.8.jar:2.1.8] > at > org.apache.cassandra.db.commitlog.CommitLogSegmentManager$5.call(CommitLogSegmentManager.java:413) > ~[apache-cassandra-2.1.8.jar:2.1.8] > at > org.apache.cassandra.db.commitlog.CommitLogSegmentManager$1.runMayThrow(CommitLogSegmentManager.java:152) > ~[apache-cassandra-2.1.8.jar:2.1.8] > at > org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) > [apache-cassandra-2.1.8.jar:2.1.8] > at java.lang.Thread.run(Thread.java:745) [na:1.7.0_85] > After I create missing commit log file and restart cassandra service > everything is OK then. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10315) Cassandra nodes shutting down on COMMIT-LOG-ALLOCATOR error
[ https://issues.apache.org/jira/browse/CASSANDRA-10315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743957#comment-14743957 ] David Loegering commented on CASSANDRA-10315: - How was this issue fixed and in what version? What is causing the problem? I looked at the duplicate and the errors do not match the errors we are seeing. > Cassandra nodes shutting down on COMMIT-LOG-ALLOCATOR error > --- > > Key: CASSANDRA-10315 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10315 > Project: Cassandra > Issue Type: Bug > Components: Core > Environment: 16 GB, Cassandra 2.0.15, >Reporter: David Loegering > > After migrating from 2.0.9 to 2.0.15 all nodes on multiple clusters > Cassandra nodes shutting themselves down every 24-48 hours. The error > reported is: > {code} > ERROR [COMMIT-LOG-ALLOCATOR] 2015-09-11 17:14:48,034 CommitLog.java (line > 420) Failed to allocate new commit log segments. Commit disk failure policy > is stop; terminating thread > java.lang.AssertionError: attempted to delete non-existing file > CommitLog-3-1441961724221.log > at > org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:113) > at > org.apache.cassandra.db.commitlog.CommitLogSegment.discard(CommitLogSegment.java:161) > at > org.apache.cassandra.db.commitlog.CommitLogAllocator$4.run(CommitLogAllocator.java:228) > at > org.apache.cassandra.db.commitlog.CommitLogAllocator$1.runMayThrow(CommitLogAllocator.java:99) > at > org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) > at java.lang.Thread.run(Unknown Source) > {code} > We’ve seen this error before, but now with DEBUG / TRACE on, we see that this > file has 18 entries starting at 5:14pm that look something like this: > {code} > DEBUG [COMMIT-LOG-WRITER] 2015-09-11 17:14:45,372 CommitLog.java (line 245) > Not safe to delete commit log segment > CommitLogSegment(/opt/osi/monarch/chronus/resources/cassandra/commitlog/CommitLog-3-1441961724221.log); > dirty 
is soe (a77b7765-1e3b-30eb-9f46-2cac8dfe1ac7), min_max_avg_hourly > (f86973a2-f5e6-36b6-9d7f-7fc4e109fb6e), > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10312) Enable custom 2i to opt out of post-streaming rebuilds
[ https://issues.apache.org/jira/browse/CASSANDRA-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743946#comment-14743946 ] Sergio Bossa commented on CASSANDRA-10312: -- [~beobal], I would make it more general (and readable) by renaming the boolean method to something like {{Index#shouldBuildIndexBlocking}}, and calling it from all {{SIM#buildIndexBlocking/buildAllIndexesBlocking/rebuildIndexesBlocking}} methods. But this is a minor point; other than that I'm +1. > Enable custom 2i to opt out of post-streaming rebuilds > -- > > Key: CASSANDRA-10312 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10312 > Project: Cassandra > Issue Type: Improvement >Reporter: Sam Tunnicliffe >Assignee: Sam Tunnicliffe > Fix For: 3.0.0 rc1 > > > Custom 2i implementations may not always want to participate in index rebuild > following streaming operations, as they may prefer to have more control over > when and how the received sstables are indexed. > We could add an {{includeInRebuild()}} method to {{Index}} which would be > used to filter the set of indexes built in > {{SecondaryIndexManager#buildAllIndexesBlocking}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
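The opt-out hook discussed above might look roughly like the following; the {{Index}} and manager shapes here are simplified stand-ins for illustration, not the real 3.0 secondary-index API:

```java
import java.util.List;
import java.util.stream.Collectors;

public class IndexRebuildDemo {
    // Simplified stand-in for the Index interface.
    interface Index {
        String name();
        // Proposed hook: custom 2i implementations return false to opt
        // out of the blocking rebuild after streaming.
        default boolean shouldBuildIndexBlocking() { return true; }
    }

    // Sketch of the filtering a buildAllIndexesBlocking-style method
    // would apply before kicking off rebuilds.
    static List<Index> indexesToBuild(List<Index> all) {
        return all.stream()
                  .filter(Index::shouldBuildIndexBlocking)
                  .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Index eager = new Index() {
            public String name() { return "eager"; }
        };
        Index lazy = new Index() {
            public String name() { return "lazy"; }
            public boolean shouldBuildIndexBlocking() { return false; }
        };
        List<Index> toBuild = indexesToBuild(List.of(eager, lazy));
        if (toBuild.size() != 1 || !toBuild.get(0).name().equals("eager"))
            throw new AssertionError();
        System.out.println("built: " + toBuild.get(0).name());
    }
}
```

A default method keeps existing implementations building as before, so only indexes that deliberately opt out change behaviour.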
[jira] [Commented] (CASSANDRA-8805) runWithCompactionsDisabled only cancels compactions, which is not the only source of markCompacted
[ https://issues.apache.org/jira/browse/CASSANDRA-8805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743949#comment-14743949 ] Benedict commented on CASSANDRA-8805: - OK, great. Could you post upstream branch merges? > runWithCompactionsDisabled only cancels compactions, which is not the only > source of markCompacted > -- > > Key: CASSANDRA-8805 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8805 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Benedict >Assignee: Carl Yeksigian > Fix For: 2.1.x > > Attachments: 8805-2.1.txt > > > Operations like repair that may operate over all sstables cancel compactions > before beginning, and fail if there are any files marked compacting after > doing so. Redistribution of index summaries is not a compaction, so is not > cancelled by this action, but does mark sstables as compacting, so such an > action will fail to initiate if there is an index summary redistribution in > progress. It seems that IndexSummaryManager needs to register itself as > interruptible along with compactions (AFAICT no other actions that may > markCompacting are not themselves compactions). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-10320) Build off-heap key-cache compatible API
Robert Stupp created CASSANDRA-10320: Summary: Build off-heap key-cache compatible API Key: CASSANDRA-10320 URL: https://issues.apache.org/jira/browse/CASSANDRA-10320 Project: Cassandra Issue Type: Improvement Reporter: Robert Stupp Assignee: Robert Stupp Fix For: 3.x (Follow-up to CASSANDRA-10314 and prerequisite for the final goal CASSANDRA-9738.) This ticket extracts the API code changes from CASSANDRA-9738 as a sub-task - so everything from CASSANDRA-9738 without the actual off-heap implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10314) Update index file format
[ https://issues.apache.org/jira/browse/CASSANDRA-10314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743926#comment-14743926 ] Robert Stupp commented on CASSANDRA-10314: -- Added another commit that leverages the file-pointer to calculate the {{IndexInfo}} offsets. It's a simple extension to {{DataOutputPlus}} and works with {{SequentialWriter}} and {{DataOutputBuffer}}. Old and new formats are tested in {{LegacySSTableTest}} ("normal" and counter tables - both with and without clustering keys - so 4 tables). > Update index file format > > > Key: CASSANDRA-10314 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10314 > Project: Cassandra > Issue Type: Improvement >Reporter: Robert Stupp >Assignee: Robert Stupp > Fix For: 3.0.0 rc1 > > > As CASSANDRA-9738 may not make it into 3.0rc1, but having an off-heap > key-cache is still a goal, we should change the index file format to meet > off-heap requirements (so I've set fixver to 3.0rc1). > Off-heap (and mmap'd index files) need the offsets of the individual > IndexInfo objects and at least the offset field of IndexInfo structures. > The format I propose is as follows: > {noformat} > (long) position (vint since 3.0, 64bit before) > (int) serialized size of data that follows (vint since 3.0, 32bit before) > -- following for indexed entries only (so serialized size > 0) > (long) header-length (vint since 3.0) > (int) DeletionTime.localDeletionTime (32 bit int) > (long) DeletionTime.markedForDeletionAt (64 bit long) > (int) number of IndexInfo objects (vint since 3.0, 32bit before) > (*) serialized IndexInfo objects, see below > (*) offsets of serialized IndexInfo objects, since version "ma" (3.0) > Each IndexInfo object's offset is relative to the first IndexInfo > object. 
> {noformat} > {noformat} > (*) IndexInfo.firstName (ClusteringPrefix serializer, either > Clustering.serializer.serialize or Slice.Bound.serializer.serialize) > (*) IndexInfo.lastName (ClusteringPrefix serializer, either > Clustering.serializer.serialize or Slice.Bound.serializer.serialize) > (long) IndexInfo.offset (vint encoded since 3.0, 64bit int before) > (long) IndexInfo.width (vint encoded since 3.0, 64bit int before) > (bool) IndexInfo.endOpenMarker != null (if 3.0) > (int) IndexInfo.endOpenMarker.localDeletionTime(if 3.0 && > IndexInfo.endOpenMarker != null) > (long) IndexInfo.endOpenMarker.markedForDeletionAt (if 3.0 && > IndexInfo.endOpenMarker != null) > {noformat} > Regarding the {{IndexInfo.offset}} and {{.width}} fields there are two > options. > * Serialize both of them or > * Serialize only the offset field plus a _last byte offset_ to be able to > recalculate the width of the last IndexInfo > The first option is probably the simpler one, the second saves a few bytes > (those of the vint encoded width). > EDIT: update vint fields (as per CASSANDRA-10232) > EDIT2: add header-length fields (as per CASSANDRA-10232) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10316) Improve ColumnDefinition comparison performance
[ https://issues.apache.org/jira/browse/CASSANDRA-10316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743917#comment-14743917 ] Benedict commented on CASSANDRA-10316: -- I forgot that we'd already half done this, by introducing a {{prefixComparison}} to {{ColumnIdentifier}}, however at the time I wasn't entirely clear what the {{position}} variable was for, and so left a huge amount of space for it. This patch shrinks that down to permit at most 4K clustering columns. This is far far in excess of anything necessary (and we can double it again if we need, or more). It then takes the first 6 bytes of the {{prefixComparison}} and appends it to the {{comparisonOrder}}, so that we do not need to go to the {{ColumnIdentifier}} at all most of the time. > Improve ColumnDefinition comparison performance > --- > > Key: CASSANDRA-10316 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10316 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Benedict >Assignee: Benedict > Fix For: 3.0.x > > > We have already improved the performance here for the basic comparison, > however since these happen exceedingly frequently we may as well go a little > (easy step) further. This is a really tiny patch, and we should aim to > include before GA, but not RC. > Right now we use all of an int for the metadata presorting, but in fact we > only need 2-3 bytes for this. We can upcast the int to a long, and use the > remaining bits to store the prefix of the _name_. This way, a majority of > comparisons become a single long comparison. Which will be considerably > cheaper. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
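The packing Benedict describes can be sketched as follows. The field widths here are illustrative only (12 bits for position, matching the "at most 4K clustering columns" bound, plus a 6-byte name prefix), not Cassandra's exact {{comparisonOrder}} layout:

```python
def comparison_order(position, name):
    """Pack the metadata position and the first 6 bytes of the column
    name into one 64-bit integer, so that a majority of comparisons
    become a single integer compare. Widths are hypothetical."""
    assert 0 <= position < (1 << 12)  # at most 4K clustering columns
    prefix = int.from_bytes(name[:6].ljust(6, b"\x00"), "big")
    return (position << 48) | prefix
```

Names that share the same 6-byte prefix still need a fallback to the full {{ColumnIdentifier}} comparison; the win is that most comparisons never reach it.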
[jira] [Commented] (CASSANDRA-10241) Keep a separate production debug log for troubleshooting
[ https://issues.apache.org/jira/browse/CASSANDRA-10241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743886#comment-14743886 ] Robert Coli commented on CASSANDRA-10241: - Thanks [~jbellis] for the ping. I am pretty strongly +1 on a default DEBUG level log that is always enabled. Users seeking support on #cassandra/(-user@) frequently don't find out about an issue until after it has happened. When they do find out about it, I'd love to be able to tell them what classes to enable at DEBUG level. Unfortunately, I have serious FUD about accidentally instructing them to enable DEBUG on some set of classes which will render the resulting log spammy and useless. If I were able to direct them to a log that was reasonably sized and contained the last two weeks or so of relevant DEBUG information, the quality and actionability of bug reports would increase significantly. As to an approach, knocking all current DEBUG to TRACE seems reasonable on its face. "We" should opt log messages in to the new-very-useful-DEBUG-log, not opt them out. I agree with jeffj's sentiments about a handle to disable such logging, but I find that in practice concerns about logging overhead are often over-estimated. For example, the CPU/io/etc. overhead of enabling the full query log on one of my high throughput production MySQL slaves (which is more analogous to the more heavyweight "read/write path DEBUG logging") is not significant versus other impacts on my system graphs. The meaningful problem in that case is that the log is writing to a disk of finite size.. 
:) > Keep a separate production debug log for troubleshooting > > > Key: CASSANDRA-10241 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10241 > Project: Cassandra > Issue Type: New Feature > Components: Config >Reporter: Jonathan Ellis >Assignee: Paulo Motta > Fix For: 2.1.x, 2.2.x, 3.0.x > > > [~aweisberg] had the suggestion to keep a separate debug log for aid in > troubleshooting, not intended for regular human consumption but where we can > log things that might help if something goes wrong. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
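The shape of the proposal maps onto any leveled logging framework: one logger, two always-on handlers, with the main log filtering at INFO while the troubleshooting log keeps DEBUG. A minimal Python sketch (Cassandra itself would do this with logback appenders; rolling and size caps, the "disk of finite size" concern, are omitted):

```python
import logging

def configure(main_path, debug_path):
    """Attach two file handlers to one logger: the main log stays
    readable at INFO, while debug.log captures everything at DEBUG."""
    log = logging.getLogger("demo")
    log.setLevel(logging.DEBUG)
    for h in list(log.handlers):
        log.removeHandler(h)
    main = logging.FileHandler(main_path, mode="w")
    main.setLevel(logging.INFO)
    debug = logging.FileHandler(debug_path, mode="w")
    debug.setLevel(logging.DEBUG)
    log.addHandler(main)
    log.addHandler(debug)
    return log
```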
[jira] [Commented] (CASSANDRA-8805) runWithCompactionsDisabled only cancels compactions, which is not the only source of markCompacted
[ https://issues.apache.org/jira/browse/CASSANDRA-8805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743887#comment-14743887 ] Carl Yeksigian commented on CASSANDRA-8805: --- Sorry, I missed that you had moved it up -- looks good. > runWithCompactionsDisabled only cancels compactions, which is not the only > source of markCompacted > -- > > Key: CASSANDRA-8805 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8805 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Benedict >Assignee: Carl Yeksigian > Fix For: 2.1.x > > Attachments: 8805-2.1.txt > > > Operations like repair that may operate over all sstables cancel compactions > before beginning, and fail if there are any files marked compacting after > doing so. Redistribution of index summaries is not a compaction, so is not > cancelled by this action, but does mark sstables as compacting, so such an > action will fail to initiate if there is an index summary redistribution in > progress. It seems that IndexSummaryManager needs to register itself as > interruptible along with compactions (AFAICT no other actions that may > markCompacting are not themselves compactions). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
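The fix direction in the ticket, IndexSummaryManager registering itself as interruptible alongside compactions, amounts to a registry that every sstable-marking operation joins, so that runWithCompactionsDisabled can cancel all of them rather than compactions only. A minimal sketch with hypothetical names:

```python
import threading

class InterruptibleRegistry:
    """Track every operation that marks sstables compacting, so a caller
    that needs compactions disabled can request all of them to stop,
    index summary redistribution included."""
    def __init__(self):
        self._ops = set()
        self._lock = threading.Lock()

    def register(self, op):
        with self._lock:
            self._ops.add(op)

    def unregister(self, op):
        with self._lock:
            self._ops.discard(op)

    def stop_all(self):
        with self._lock:
            for op in list(self._ops):
                op.stop_requested = True

class RedistributionOp:
    """Stand-in for a non-compaction operation that marks sstables."""
    def __init__(self):
        self.stop_requested = False
```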
[jira] [Commented] (CASSANDRA-8805) runWithCompactionsDisabled only cancels compactions, which is not the only source of markCompacted
[ https://issues.apache.org/jira/browse/CASSANDRA-8805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743881#comment-14743881 ] Benedict commented on CASSANDRA-8805: - There wasn't a finally clause before, and it's currently being executed in all normal executions (whereas under failure it is discarded anyway), so I must admit I'm not certain what your reasoning is. If you could elaborate I'd appreciate it. > runWithCompactionsDisabled only cancels compactions, which is not the only > source of markCompacted > -- > > Key: CASSANDRA-8805 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8805 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Benedict >Assignee: Carl Yeksigian > Fix For: 2.1.x > > Attachments: 8805-2.1.txt > > > Operations like repair that may operate over all sstables cancel compactions > before beginning, and fail if there are any files marked compacting after > doing so. Redistribution of index summaries is not a compaction, so is not > cancelled by this action, but does mark sstables as compacting, so such an > action will fail to initiate if there is an index summary redistribution in > progress. It seems that IndexSummaryManager needs to register itself as > interruptible along with compactions (AFAICT no other actions that may > markCompacting are not themselves compactions). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10266) Introduce direct unit test coverage for Rows
[ https://issues.apache.org/jira/browse/CASSANDRA-10266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743873#comment-14743873 ] Benedict commented on CASSANDRA-10266: -- bq. rewriting each of the methods in the test class to verify their output. For something so critical this wouldn't necessarily be a bad thing, so long as they are implemented orthogonally (and we can of course implement it as inefficiently as we like). However there's no need to go that far; the randomiser can work backwards, generating the expected output at the same time as the input. bq. not too difficult to exhaustively unit test. That's why I ask "Are you confident that all possible combinations are explored?" - this code is not so simple that I can tell if this is the case by inspection, and any future changes to this code only make that harder to guarantee. However if you're comfortable this is future proof and covers all combinations, I'll try to convince myself as well. > Introduce direct unit test coverage for Rows > > > Key: CASSANDRA-10266 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10266 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Benedict >Assignee: Blake Eggleston > Fix For: 3.0.0 rc1 > > > As with much of the codebase, we have no direct unit test coverage for > {{Rows}}, and we should remedy this given how central it is to behaviour. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
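Benedict's "work backwards" idea — generate the expected output first, then derive inputs that must merge back to it — can be sketched like this. A toy last-timestamp-wins reconciliation stands in for the real Rows merge; all names are hypothetical:

```python
import random

def generate_case(rng, columns):
    """Pick the expected merged row first, then split each
    (value, timestamp) cell across two input rows so that the cell
    with the higher timestamp is the one the merge must keep."""
    expected = {c: (rng.randint(0, 99), rng.randint(1, 9)) for c in columns}
    left, right = {}, {}
    for col, (val, ts) in expected.items():
        winner, loser = (left, right) if rng.random() < 0.5 else (right, left)
        winner[col] = (val, ts)
        if ts > 1 and rng.random() < 0.5:
            # Optionally give the other side a strictly older cell.
            loser[col] = (rng.randint(0, 99), rng.randint(0, ts - 1))
    return left, right, expected

def merge(a, b):
    """Reference reconciliation: per column, highest timestamp wins."""
    out = dict(a)
    for col, cell in b.items():
        if col not in out or cell[1] > out[col][1]:
            out[col] = cell
    return out
```

Because the generator constructs the expected output alongside the inputs, the oracle is free: no reimplementation of the method under test is required.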
[jira] [Commented] (CASSANDRA-8805) runWithCompactionsDisabled only cancels compactions, which is not the only source of markCompacted
[ https://issues.apache.org/jira/browse/CASSANDRA-8805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743872#comment-14743872 ] Carl Yeksigian commented on CASSANDRA-8805: --- I think you need to add the {{newSSTables.add}} back to the finally clause. Otherwise, changes look good. > runWithCompactionsDisabled only cancels compactions, which is not the only > source of markCompacted > -- > > Key: CASSANDRA-8805 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8805 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Benedict >Assignee: Carl Yeksigian > Fix For: 2.1.x > > Attachments: 8805-2.1.txt > > > Operations like repair that may operate over all sstables cancel compactions > before beginning, and fail if there are any files marked compacting after > doing so. Redistribution of index summaries is not a compaction, so is not > cancelled by this action, but does mark sstables as compacting, so such an > action will fail to initiate if there is an index summary redistribution in > progress. It seems that IndexSummaryManager needs to register itself as > interruptible along with compactions (AFAICT no other actions that may > markCompacting are not themselves compactions). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7392) Abort in-progress queries that time out
[ https://issues.apache.org/jira/browse/CASSANDRA-7392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743870#comment-14743870 ] Ariel Weisberg commented on CASSANDRA-7392: --- bq. I propose to log at WARN a generic message such as some CQL queries timed out using the no spam logger to avoid polluting the main log file. Then, we log the full details at DEBUG level and follow what Paulo Motta is doing for CASSANDRA-10241 by either adding a new appender writing to query.log or by using debug.log. Does this sound OK? +1. I think this dovetails nicely with the direction we are headed. bq. It's 4-5 seconds on my box and 8-9 seconds on Jenkins. So worth having the protection of a minimum. I have a vague calculus in my head for how much of my time is worth it to reduce test time. 8 seconds is worth maybe less than a half hour. bq. At this level in ReadCommand the query is already bound. So the only parameter we could add is the maximum query size. I am not sure we should truncate the statement if we are logging at DEBUG level in a separate log file. Well since this is just read queries and read queries tend to not be ginormous (unlike writes) I could be comfortable. To really do this right you don't want to truncate the query; you want to truncate the larger values in the query so that the shape of the query is still visible, as are the smaller values that usually serve as keys. So out of scope, something to think about for later. For writes, we are going to be retaining the queries past the timeout for the logger. If someone has a memory utilization issue they can't fix it by setting the timeout lower since the logger will still retain it and it runs on its own period. Even in a separate debug log file rolling is a concern. One bad log statement can wipe away all the other information in a failure scenario. bq. It's to give an indication on how timely the expiration thread is, as you suggested in the first round of the code review. 
Ah, right! > Abort in-progress queries that time out > --- > > Key: CASSANDRA-7392 > URL: https://issues.apache.org/jira/browse/CASSANDRA-7392 > Project: Cassandra > Issue Type: New Feature > Components: Core >Reporter: Jonathan Ellis >Assignee: Stefania >Priority: Critical > Fix For: 3.x > > > Currently we drop queries that time out before we get to them (because node > is overloaded) but not queries that time out while being processed. > (Particularly common for index queries on data that shouldn't be indexed.) > Adding the latter and logging when we have to interrupt one gets us a poor > man's "slow query log" for free. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
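The "no spam logger" mentioned in this thread suppresses repeats of the same message within a minimum interval so a hot timeout path cannot flood the main log. Cassandra has a NoSpamLogger utility for this; the Python sketch below only mimics the idea and is not its actual API:

```python
import time

class NoSpamLogger:
    """Emit a given message at most once per interval; repeats inside
    the window are silently dropped."""
    def __init__(self, emit, min_interval_s):
        self._emit = emit
        self._interval = min_interval_s
        self._last = {}  # message -> last emission time

    def warn(self, message):
        now = time.monotonic()
        last = self._last.get(message)
        if last is not None and now - last < self._interval:
            return False  # suppressed
        self._last[message] = now
        self._emit(message)
        return True
```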
[jira] [Commented] (CASSANDRA-9921) Combine MV schema definition with MV table definition
[ https://issues.apache.org/jira/browse/CASSANDRA-9921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743861#comment-14743861 ] Adam Holmberg commented on CASSANDRA-9921: -- Thanks Aleksey. I'm still struggling a bit with this data model. Right now we get both keyspace and table change events pushed when a view is added. I understand we can't have view events yet because of protocol limitations. With just the base table name, we're unable to selectively query views for the table. One suggestion would be to cluster by {{base_table_name}} now that it's available: {code} CREATE TABLE system_schema.views (... PRIMARY KEY (keyspace_name, base_table_name, view_name)) {code} Even with that, we're still faced with another query phase to get the {{system_schema.columns}} belonging to the views, once they're known. Are we just going to live with that until view events come along? > Combine MV schema definition with MV table definition > - > > Key: CASSANDRA-9921 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9921 > Project: Cassandra > Issue Type: Improvement >Reporter: Carl Yeksigian >Assignee: Carl Yeksigian > Labels: client-impacting, materializedviews > Fix For: 3.0.0 rc1 > > Attachments: 9921-unit-test.txt > > > Prevent MV from reusing {{system_schema.tables}} and instead move those > properties into the {{system_schema.materializedviews}} table to keep them > separate entities. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10314) Update index file format
[ https://issues.apache.org/jira/browse/CASSANDRA-10314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743838#comment-14743838 ] Ariel Weisberg commented on CASSANDRA-10314: This is more conservative than I was expecting, but that is good. The one thing we have to have for 3.0 is the format. I think the next step is to do everything that is in 9738 (as a new task), but without changing the key cache implementation. Then as another task bring in the OHC key cache as an option, and then as a final task finish up by mapping the index file. Is it possible to avoid calculating the serialized size to find the offset to write? If we are testing loading both the old and new format and Cassci is happy then +1. It looks like testall failed to clone the git repo. > Update index file format > > > Key: CASSANDRA-10314 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10314 > Project: Cassandra > Issue Type: Improvement >Reporter: Robert Stupp >Assignee: Robert Stupp > Fix For: 3.0.0 rc1 > > > As CASSANDRA-9738 may not make it into 3.0rc1, but having an off-heap > key-cache is still a goal, we should change the index file format to meet > off-heap requirements (so I've set fixver to 3.0rc1). > Off-heap (and mmap'd index files) need the offsets of the individual > IndexInfo objects and at least the offset field of IndexInfo structures. 
> The format I propose is as follows: > {noformat} > (long) position (vint since 3.0, 64bit before) > (int) serialized size of data that follows (vint since 3.0, 32bit before) > -- following for indexed entries only (so serialized size > 0) > (long) header-length (vint since 3.0) > (int) DeletionTime.localDeletionTime (32 bit int) > (long) DeletionTime.markedForDeletionAt (64 bit long) > (int) number of IndexInfo objects (vint since 3.0, 32bit before) > (*) serialized IndexInfo objects, see below > (*) offsets of serialized IndexInfo objects, since version "ma" (3.0) > Each IndexInfo object's offset is relative to the first IndexInfo > object. > {noformat} > {noformat} > (*) IndexInfo.firstName (ClusteringPrefix serializer, either > Clustering.serializer.serialize or Slice.Bound.serializer.serialize) > (*) IndexInfo.lastName (ClusteringPrefix serializer, either > Clustering.serializer.serialize or Slice.Bound.serializer.serialize) > (long) IndexInfo.offset (vint encoded since 3.0, 64bit int before) > (long) IndexInfo.width (vint encoded since 3.0, 64bit int before) > (bool) IndexInfo.endOpenMarker != null (if 3.0) > (int) IndexInfo.endOpenMarker.localDeletionTime(if 3.0 && > IndexInfo.endOpenMarker != null) > (long) IndexInfo.endOpenMarker.markedForDeletionAt (if 3.0 && > IndexInfo.endOpenMarker != null) > {noformat} > Regarding the {{IndexInfo.offset}} and {{.width}} fields there are two > options. > * Serialize both of them or > * Serialize only the offset field plus a _last byte offset_ to be able to > recalculate the width of the last IndexInfo > The first option is probably the simpler one, the second saves a few bytes > (those of the vint encoded width). > EDIT: update vint fields (as per CASSANDRA-10232) > EDIT2: add header-length fields (as per CASSANDRA-10232) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
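On Ariel's question about avoiding the serialized-size computation: with a fixed-width size field, a writer can reserve the slot, write the payload, and backpatch the size using the file pointer, which is essentially the file-pointer trick Robert's commit leverages for the IndexInfo offsets. A vint-encoded size field resists this, because the vint's own width depends on the value being written. A sketch of the backpatching approach (illustration only, not the actual SequentialWriter code):

```python
import io
import struct

def write_sized(out, write_payload):
    """Reserve a fixed 32-bit size slot, write the payload, then seek
    back and patch in the real size, so no separate sizing pass is
    needed. This only works because the slot's width is fixed."""
    size_pos = out.tell()
    out.write(struct.pack(">i", 0))  # placeholder
    start = out.tell()
    write_payload(out)
    end = out.tell()
    out.seek(size_pos)
    out.write(struct.pack(">i", end - start))
    out.seek(end)
```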
[jira] [Commented] (CASSANDRA-10266) Introduce direct unit test coverage for Rows
[ https://issues.apache.org/jira/browse/CASSANDRA-10266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743806#comment-14743806 ] Blake Eggleston commented on CASSANDRA-10266: - I think adding random testing is a great idea. In fact, I wrote a randomized failure simulator test for epaxos ([here|https://github.com/bdeggleston/cassandra/blob/CASSANDRA-6246-trunk/test/long/org/apache/cassandra/service/epaxos/EpaxosFuzzer.java#L48-48]) that was very useful. That said, I'm not sure that randomized testing of lower level classes like Rows is the best way to go. First, checking the result of random inputs is basically going to require rewriting each of the methods in the test class to verify their output. Second, although these methods are used all over the place, they're pretty narrow in scope, and not too difficult to exhaustively unit test. I think randomized testing would be most valuable when testing how multiple classes work together, ideally simulating the interaction between multiple nodes. Randomized integration tests basically. For instance, testing DataResolver with random (and repeatable) sequences of node query combinations, messaging failures, memtable flushes, and compactions would have revealed the bug found here, and it probably would have exposed bugs that existed in other classes as well. The code used to verify the results would be simpler as well. > Introduce direct unit test coverage for Rows > > > Key: CASSANDRA-10266 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10266 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Benedict >Assignee: Blake Eggleston > Fix For: 3.0.0 rc1 > > > As with much of the codebase, we have no direct unit test coverage for > {{Rows}}, and we should remedy this given how central it is to behaviour. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-10319) aggregate sum on Counter type of Cassandra
Xin Ren created CASSANDRA-10319: --- Summary: aggregate sum on Counter type of Cassandra Key: CASSANDRA-10319 URL: https://issues.apache.org/jira/browse/CASSANDRA-10319 Project: Cassandra Issue Type: Wish Components: Core Reporter: Xin Ren Fix For: 3.x, 2.2.x I'm using Cassandra 2.2.0, and I've read "using Counter type". Is there a way to aggregate on Counter type column for Cassandra, similar to the following? {code}SELECT sum(my_counter_column) FROM my_table ;{code} The above query results in this error: {code} InvalidRequest: code=2200 [Invalid query] message="Invalid call to function sum, none of its type signatures match (known type signatures: system.sum : (tinyint) -> tinyint, system.sum : (smallint) -> smallint, system.sum : (int) -> int, system.sum : (bigint) -> bigint, system.sum : (float) -> float, system.sum : (double) -> double, system.sum : (decimal) -> decimal, system.sum : (varint) -> varint)" {code} I know I can fetch all data and then do aggregation in the client, but I'm just wondering if it can be done within Cassandra. Thanks a lot. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
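The error text shows why the call is rejected before execution: {{sum}} is resolved against a fixed set of overloads, and {{counter}} is not among them. A toy resolver mirroring the message (illustration only, not Cassandra's function-resolution code):

```python
# The overloads enumerated in the InvalidRequest error from the report.
SUM_OVERLOADS = ("tinyint", "smallint", "int", "bigint",
                 "float", "double", "decimal", "varint")

def resolve_sum(arg_type):
    """Match an argument type against sum()'s known signatures;
    anything else fails validation, as counter does in the report."""
    if arg_type not in SUM_OVERLOADS:
        raise ValueError(
            "Invalid call to function sum, none of its type signatures "
            "match: " + arg_type)
    return arg_type
```

Until a counter overload exists, the workaround is the one the reporter already describes: read the counter values and add them up client-side.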
[jira] [Updated] (CASSANDRA-10303) stream always hang up when use rebuild
[ https://issues.apache.org/jira/browse/CASSANDRA-10303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jim Witschey updated CASSANDRA-10303: - Description: we add another datacenter. use nodetool rebuild DC1 stream from some node of old datacenter always hang up with these exception: {code} ERROR [Thread-1472] 2015-09-10 19:24:53,091 CassandraDaemon.java:223 - Exception in thread Thread[Thread-1472,5,RMI Runtime] java.lang.RuntimeException: java.io.IOException: Connection timed out at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.jar:na] at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32) ~[apache-cassandra-2.1.8.jar:2.1.8] at java.lang.Thread.run(Thread.java:745) ~[na:1.7.0_60] Caused by: java.io.IOException: Connection timed out at sun.nio.ch.FileDispatcherImpl.read0(Native Method) ~[na:1.7.0_60] at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) ~[na:1.7.0_60] at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) ~[na:1.7.0_60] at sun.nio.ch.IOUtil.read(IOUtil.java:197) ~[na:1.7.0_60] at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379) ~[na:1.7.0_60] at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:59) ~[na:1.7.0_60] at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:109) ~[na:1.7.0_60] at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103) ~[na:1.7.0_60] at org.apache.cassandra.streaming.compress.CompressedInputStream$Reader.runMayThrow(CompressedInputStream.java:172) ~[apache-cassandra-2.1.8.jar:2.1.8] at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[apache-cassandra-2.1.8.jar:2.1.8] ... 1 common frames omitted {code} i must restart node to stop current rebuild, and rebuild agagin and again to success was: we add another datacenter. 
use nodetool rebuild DC1 stream from some node of old datacenter always hang up with these exception: ERROR [Thread-1472] 2015-09-10 19:24:53,091 CassandraDaemon.java:223 - Exception in thread Thread[Thread-1472,5,RMI Runtime] java.lang.RuntimeException: java.io.IOException: Connection timed out at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.jar:na] at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32) ~[apache-cassandra-2.1.8.jar:2.1.8] at java.lang.Thread.run(Thread.java:745) ~[na:1.7.0_60] Caused by: java.io.IOException: Connection timed out at sun.nio.ch.FileDispatcherImpl.read0(Native Method) ~[na:1.7.0_60] at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) ~[na:1.7.0_60] at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) ~[na:1.7.0_60] at sun.nio.ch.IOUtil.read(IOUtil.java:197) ~[na:1.7.0_60] at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379) ~[na:1.7.0_60] at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:59) ~[na:1.7.0_60] at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:109) ~[na:1.7.0_60] at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103) ~[na:1.7.0_60] at org.apache.cassandra.streaming.compress.CompressedInputStream$Reader.runMayThrow(CompressedInputStream.java:172) ~[apache-cassandra-2.1.8.jar:2.1.8] at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[apache-cassandra-2.1.8.jar:2.1.8] ... 1 common frames omitted i must restart node to stop current rebuild, and rebuild agagin and again to success > stream always hang up when use rebuild > -- > > Key: CASSANDRA-10303 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10303 > Project: Cassandra > Issue Type: Bug > Components: Core > Environment: jdk1.7 > cassandra 2.1.8 >Reporter: zhaoyan > > we add another datacenter. 
> use nodetool rebuild DC1 > stream from some node of old datacenter always hang up with these exception: > {code} > ERROR [Thread-1472] 2015-09-10 19:24:53,091 CassandraDaemon.java:223 - > Exception in thread Thread[Thread-1472,5,RMI Runtime] > java.lang.RuntimeException: java.io.IOException: Connection timed out > at com.google.common.base.Throwables.propagate(Throwables.java:160) > ~[guava-16.0.jar:na] > at > org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32) > ~[apache-cassandra-2.1.8.jar:2.1.8] > at java.lang.Thread.run(Thread.java:745) ~[na:1.7.0_60] > Caused by: java.io.IOException: Connection timed out > at sun.nio.ch.FileDispatcherImpl.read0(Native Method) ~[na:1.7.0_60] > at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) > ~[na:1.7.0_60] > at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) ~[na:1.7.0_60] > at sun.nio.ch.IOUtil.read(IOUtil.java:197) ~[na:1.7.0_60] > at sun.nio.ch.
[jira] [Updated] (CASSANDRA-10318) Update cqlsh COPY for new internal driver serialization interface
[ https://issues.apache.org/jira/browse/CASSANDRA-10318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jim Witschey updated CASSANDRA-10318: - Assignee: Stefania > Update cqlsh COPY for new internal driver serialization interface > - > > Key: CASSANDRA-10318 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10318 > Project: Cassandra > Issue Type: Bug >Reporter: Adam Holmberg >Assignee: Stefania > Fix For: 3.0.0 rc1 > > > A recent driver update changed some of the internal serialization interface. > cqlsh relies on this for the copy command and will need to be updated. > Previously used > {code} > cassandra.protocol.QueryMessage.to_binary > {code} > now should use > {code} > cassandra.protocol.ProtocolHandler.encode_message(...) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10318) Update cqlsh COPY for new internal driver serialization interface
[ https://issues.apache.org/jira/browse/CASSANDRA-10318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743790#comment-14743790 ] Jim Witschey commented on CASSANDRA-10318: -- [~Stefania] I'm assigning you here because you've done some work on cqlsh recently, but please let me know if you'd like me to find someone else. > Update cqlsh COPY for new internal driver serialization interface > - > > Key: CASSANDRA-10318 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10318 > Project: Cassandra > Issue Type: Bug >Reporter: Adam Holmberg >Assignee: Stefania > Fix For: 3.0.0 rc1 > > > A recent driver update changed some of the internal serialization interface. > cqlsh relies on this for the copy command and will need to be updated. > Previously used > {code} > cassandra.protocol.QueryMessage.to_binary > {code} > now should use > {code} > cassandra.protocol.ProtocolHandler.encode_message(...) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10216) Remove target type from internal index metadata
[ https://issues.apache.org/jira/browse/CASSANDRA-10216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743792#comment-14743792 ] Sam Tunnicliffe commented on CASSANDRA-10216: - tl;dr the {{target}} stuff is changed in {{74a234d}}, (there's also a further update to the python driver [here|https://github.com/beobal/python-driver/commit/3e507497386f689440ebc0202b5cd95a11597dad]). The other nits are addressed in {{cb8d43c}}. bq.Hence why I want to match the user statement. I totally get that, I was just thinking about the fact that the syntax itself is somewhat inconsistent. The reason for that is completely understandable, I was just wary of blindly reproducing it in the metadata. Also, I wasn't really putting any weight on the current implementation of {{IndexTarget}}, just using it to make the point that the specific syntax for 'regular' indexes is missing some information which we have to infer. Anyway, I don't mean to labour the point, I'm cool with going with your approach b/c of being able to recreate the index without having to parse {{target}}. bq.Well, since you mention it, I would have a slight preference for actually using another "type" for that (REGULAR, NONE, SIMPLE, whatever). wfm, I've added {{IndexTarget.Type.SIMPLE}} bq.The fact that "values" is the default for collection is an historical accident...but my preference would be to add support for CREATE INDEX ON t(values(myCollection)) Sure, I'm aware that that's the history of the thing. Adding an explicit version to the syntax is my preference too. 
> Remove target type from internal index metadata > --- > > Key: CASSANDRA-10216 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10216 > Project: Cassandra > Issue Type: Improvement >Reporter: Sam Tunnicliffe >Assignee: Sam Tunnicliffe > Labels: client-impacting > Fix For: 3.0.0 rc1 > > > As part of CASSANDRA-6716 & in anticipation of CASSANDRA-10124, a distinction > was introduced between secondary indexes which target a fixed set of 1 or > more columns in the base data, and those which are agnostic to the structure > of the underlying rows. This distinction is manifested in > {{IndexMetadata.targetType}} and {{system_schema.indexes}}, in the > {{target_type}} column. It could be argued that this distinction complicates > the codebase without providing any tangible benefit, given that the target > type is not actually used anywhere. > It's only the impact on {{system_schema.indexes}} that puts this on the > critical path for 3.0, any code changes are just implementation details. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
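The inference Sam describes — a bare column name implying a simple target, while an explicit wrapper names which part of a collection is indexed — can be sketched as a tiny parser. The keys/values/entries/full forms are real CQL index-target syntax, but the function itself is hypothetical, not the actual IndexTarget parsing code:

```python
import re

_WRAPPED = re.compile(r"^(keys|values|entries|full)\((.+)\)$")

def parse_index_target(target):
    """Split a target string into (column, target_type). A plain column
    name maps to the SIMPLE type discussed above; a wrapped form names
    the collection component explicitly."""
    m = _WRAPPED.match(target)
    if m:
        return m.group(2), m.group(1).upper()
    return target, "SIMPLE"
```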
[jira] [Commented] (CASSANDRA-10301) Search for items past end of descending BTreeSearchIterator can fail
[ https://issues.apache.org/jira/browse/CASSANDRA-10301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743757#comment-14743757 ] Benedict commented on CASSANDRA-10301: -- Well, it looks like "common parlance" == "benedict parlance" as I can find only very few uses, none reputable. > Search for items past end of descending BTreeSearchIterator can fail > > > Key: CASSANDRA-10301 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10301 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Benedict >Assignee: Benedict >Priority: Blocker > Fix For: 3.0.0 rc1 > > > A very simple problem, but obvious and with simple fix once it is made > apparent. > The internal {{seekTo}} method uses {{binarySearch}} semantics for its return > value, however when searching backwards {{-1}} is a real value that should be > returned to the client, as it indicates "past the end" - so basing inexact > matches from -1 leads to a conflicting meaning, and so it gets > misinterpreted. Rebasing inexact results to -2 fixes the problem. > This was not caught because the randomized testing apparently did not test > for values outside the bounds of the btree. This has been fixed as well, and > the tests did easily exhibit the problem without the fix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
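The conflict the ticket describes can be made concrete: under binarySearch semantics an inexact match at insertion point 0 encodes to -1 - 0 = -1, which collides with the descending iterator's legitimate "past the end" value of -1. Rebasing inexact results to -2 keeps the two encodings disjoint. A sketch over a plain descending-sorted list, not the actual BTree code:

```python
def seek_descending(keys, target):
    """Binary search over keys sorted in descending order. Exact
    matches return the index (>= 0); inexact matches return
    -2 - insertion_point, leaving -1 free as a real 'past the end'
    value for the caller."""
    lo, hi = 0, len(keys) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if keys[mid] == target:
            return mid
        if keys[mid] > target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -2 - lo
```

With the old -1 rebase, seeking a value larger than every key would have returned -1 and been misread as "past the end" rather than "insert before index 0".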