[jira] [Commented] (CASSANDRA-6588) Add a 'NO EMPTY RESULTS' filter to SELECT
[ https://issues.apache.org/jira/browse/CASSANDRA-6588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15147545#comment-15147545 ] Gianluca Borello commented on CASSANDRA-6588:

Thanks for the comment. As you predicted, this makes a night and day difference. Running my benchmark script (the one I explained in the mailing list thread) on 3.3 gives:

Response time for querying a single column on a large table (column size 10 MB):
10 columns: 236 ms
20 columns: 684 ms
30 columns: 1096 ms
40 columns: 1219 ms
50 columns: 1809 ms
... (heap failure after this)

Running it on the latest trunk as of today:

Response time for querying a single column on a large table (column size 10 MB):
10 columns: 52 ms
20 columns: 59 ms
30 columns: 72 ms
40 columns: 100 ms
50 columns: 109 ms
60 columns: 134 ms
70 columns: 155 ms
80 columns: 165 ms
90 columns: 178 ms
100 columns: 199 ms

That's absolutely perfect. I just wish this were addressed in 2.1 or maybe even 2.2; moving my production environment to 3.4 is way too scary.

> Add a 'NO EMPTY RESULTS' filter to SELECT
> -
>
> Key: CASSANDRA-6588
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6588
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Sylvain Lebresne
> Priority: Minor
>
> It is the semantics of CQL that a (CQL) row exists as long as it has one
> non-null column (including the PK columns, which, given that no PK column
> can be null, means that it's enough to have the PK set for a row to exist).
> This does mean that the result of
> {noformat}
> CREATE TABLE test (k int PRIMARY KEY, v1 int, v2 int);
> INSERT INTO test(k, v1) VALUES (0, 4);
> SELECT v2 FROM test;
> {noformat}
> must be (and is)
> {noformat}
>  v2
> --
>  null
> {noformat}
> That fact does mean, however, that when we only select a few columns of a row,
> we still need to find rows that exist but have no values for the selected
> columns.
> Long story short, given how the storage engine works, this means we
> need to query full (CQL) rows even when only some of the columns are selected,
> because that's the only way to distinguish between "the row exists but has
> no value for the selected columns" and "the row doesn't exist". I'll note in
> particular that, due to CASSANDRA-5762, we unfortunately can't rely on the
> row marker to optimize that out.
> Now, when you select only a subset of the columns of a row, there are many
> cases where you don't care about rows that exist but have no value for the
> columns you requested, and are happy to filter those out. So, for those cases,
> we could provide a new SELECT filter. Beyond the potential convenience (not
> having to filter empty results client side), one interesting part is that
> when this filter is provided, we could optimize a bit by querying only the
> selected columns, since we wouldn't need to return rows that exist but have
> no values for them.
> For the exact syntax, there are probably a bunch of options. For instance:
> * {{SELECT NON EMPTY(v2, v3) FROM test}}: the vague rationale for putting it
> in the SELECT part is that such a filter is somewhat in the spirit of DISTINCT.
> Possibly a bit ugly outside of that.
> * {{SELECT v2, v3 FROM test NO EMPTY RESULTS}} or {{SELECT v2, v3 FROM test
> NO EMPTY ROWS}} or {{SELECT v2, v3 FROM test NO EMPTY}}: the last one is
> shorter but maybe a bit less explicit. As for {{RESULTS}} versus {{ROWS}},
> the only small objection to {{NO EMPTY ROWS}} could be that it might suggest it
> is filtering non-existing rows (the fact that we never ever return non-existing
> rows should hint that that's not what it does, but well...) while we're
> just filtering empty "resultSet rows".
> Of course, if there is a pre-existing SQL syntax for that, it's even better,
> though a very quick search didn't turn up anything. Other suggestions welcome
> too.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
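Until a filter like this exists server side, the workaround the ticket alludes to — dropping result rows whose selected columns are all null — has to happen on the client. A minimal sketch of that client-side filtering in plain Python, with result rows modeled as dicts (the function name and row shape are illustrative, not driver API):

```python
def drop_empty_rows(rows, selected):
    """Filter out result rows where every selected column is null (None).

    Mimics the proposed 'NO EMPTY RESULTS' semantics client side: a row
    survives only if at least one of the requested columns has a value.
    """
    return [r for r in rows if any(r.get(c) is not None for c in selected)]

# Rows as they would come back for: SELECT v2, v3 FROM test
rows = [
    {"k": 0, "v2": None, "v3": None},  # row exists, but is empty for v2/v3
    {"k": 1, "v2": 7, "v3": None},
]
print(drop_empty_rows(rows, ["v2", "v3"]))  # keeps only the k=1 row
```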
[jira] [Commented] (CASSANDRA-6588) Add a 'NO EMPTY RESULTS' filter to SELECT
[ https://issues.apache.org/jira/browse/CASSANDRA-6588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15146806#comment-15146806 ] Gianluca Borello commented on CASSANDRA-6588:

Just wanted to point out a relevant thread on the users mailing list that is happening right now: http://www.mail-archive.com/user@cassandra.apache.org/msg46162.html

I'm truly surprised I'm the only one showing interest in this issue after months. Because of our current data model (which, according to the CQL documentation and best practices, is completely legit), the behavior described in this issue is causing us a huge performance penalty.

> Add a 'NO EMPTY RESULTS' filter to SELECT
[jira] [Commented] (CASSANDRA-8716) "java.util.concurrent.ExecutionException: java.lang.AssertionError: Memory was freed" when running cleanup
[ https://issues.apache.org/jira/browse/CASSANDRA-8716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14341848#comment-14341848 ] Gianluca Borello commented on CASSANDRA-8716: - +1 for a workaround, I could really use a cleanup without downgrading production to 2.0.11 > "java.util.concurrent.ExecutionException: java.lang.AssertionError: Memory > was freed" when running cleanup > -- > > Key: CASSANDRA-8716 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8716 > Project: Cassandra > Issue Type: Bug > Components: Core > Environment: Centos 6.6, Cassandra 2.0.12, Oracle JDK 1.7.0_67 >Reporter: Imri Zvik >Assignee: Robert Stupp >Priority: Minor > Fix For: 2.0.13 > > Attachments: 8716.txt, system.log.gz > > > {code}Error occurred during cleanup > java.util.concurrent.ExecutionException: java.lang.AssertionError: Memory was > freed > at java.util.concurrent.FutureTask.report(FutureTask.java:122) > at java.util.concurrent.FutureTask.get(FutureTask.java:188) > at > org.apache.cassandra.db.compaction.CompactionManager.performAllSSTableOperation(CompactionManager.java:234) > at > org.apache.cassandra.db.compaction.CompactionManager.performCleanup(CompactionManager.java:272) > at > org.apache.cassandra.db.ColumnFamilyStore.forceCleanup(ColumnFamilyStore.java:1115) > at > org.apache.cassandra.service.StorageService.forceKeyspaceCleanup(StorageService.java:2177) > at sun.reflect.GeneratedMethodAccessor29.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75) > at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279) > at > 
com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112) > at > com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46) > at > com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237) > at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138) > at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252) > at > com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819) > at > com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801) > at > javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1487) > at > javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97) > at > javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328) > at > javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1420) > at > javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:848) > at sun.reflect.GeneratedMethodAccessor23.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322) > at sun.rmi.transport.Transport$1.run(Transport.java:177) > at sun.rmi.transport.Transport$1.run(Transport.java:174) > at java.security.AccessController.doPrivileged(Native Method) > at sun.rmi.transport.Transport.serviceCall(Transport.java:173) > at > sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:556) > at > sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:811) > at > sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:670) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.AssertionError: Memory was freed > at org.apache.cassandra.io.util.Memory.checkPosition(Memory.java:259) > at org.apache.cassandra.io.util.Memory.getInt(Memory.java:211) > at > org.apache.cassandra.io.sstable.IndexSummary.getIndex(IndexSummary.java:79)
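The assertion in the trace comes from a guard in org.apache.cassandra.io.util.Memory: every read checks that the off-heap region has not already been released, so a stale reference fails fast instead of reading freed memory. The pattern, sketched here in Python purely for illustration (this is not the actual Java code):

```python
class Memory:
    """Toy model of a guarded off-heap buffer: accesses after free() fail fast."""

    def __init__(self, size):
        self.buf = bytearray(size)

    def check_position(self, offset, length):
        # The guard that produces "Memory was freed" in the stack trace above.
        assert self.buf is not None, "Memory was freed"
        assert 0 <= offset and offset + length <= len(self.buf)

    def get_int(self, offset):
        self.check_position(offset, 4)
        return int.from_bytes(self.buf[offset:offset + 4], "big")

    def free(self):
        self.buf = None  # any later access trips the assertion above


m = Memory(8)
m.get_int(0)  # fine while the buffer is live
m.free()
try:
    m.get_int(0)
except AssertionError as e:
    print(e)  # -> Memory was freed
```

The cleanup bug is a lifecycle race (an sstable's index summary freed while another operation still holds it), which is exactly the situation this guard is designed to surface.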
[jira] [Commented] (CASSANDRA-8061) tmplink files are not removed
[ https://issues.apache.org/jira/browse/CASSANDRA-8061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14165362#comment-14165362 ] Gianluca Borello commented on CASSANDRA-8061:

I installed 2.1 on our staging environment: about 20 CFs, with a mixed workload depending on the kind of traffic we test at a specific moment, usually ranging from a total of 100 to 1 writes/s, and say no more than 50 reads/s. We heavily use TTLs; most of the rows completely expire in less than 24 hours. All the Cassandra metrics are always within reasonable values. The only thing to note is that when we push the number of writes towards the upper end, we can get a considerable number of pending compactions (around 100-200); in normal conditions, pending compactions are just a few.

I am observing the tmplink files on a number of different CFs, but for your reference, one that causes it is for example:

{noformat}
create table mounted_fs_by_agent1 (
    customer int,
    base bigint,
    ts bigint,
    agent int,
    weight bigint,
    value blob,
    primary key ((customer, base), agent, ts)
) WITH gc_grace_seconds = 3600
  AND compaction = {'class': 'LeveledCompactionStrategy'}
  AND compression = {'sstable_compression': 'LZ4Compressor'}
  AND CLUSTERING ORDER BY (agent DESC, ts DESC);
{noformat}

where value holds small blobs of ~1000-2000 bytes.

I rolled back to 2.0 because this bug is effectively a blocker for us: the disk space completely fills up in just a day or two because of it (unless I restart Cassandra). But if you are unable to replicate it, I can set it up again and try to answer more questions you might have. When I said fresh instance, I meant "fresh install, and sometimes after I actually start putting data".
> tmplink files are not removed
> -
>
> Key: CASSANDRA-8061
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8061
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Environment: Linux
> Reporter: Gianluca Borello
> Assignee: Ryan McGuire
[jira] [Created] (CASSANDRA-8061) tmplink files are not removed
Gianluca Borello created CASSANDRA-8061: --- Summary: tmplink files are not removed Key: CASSANDRA-8061 URL: https://issues.apache.org/jira/browse/CASSANDRA-8061 Project: Cassandra Issue Type: Bug Components: Core Environment: Linux Reporter: Gianluca Borello After installing 2.1.0, I'm experiencing a bunch of tmplink files that are filling my disk. I found https://issues.apache.org/jira/browse/CASSANDRA-7803 and that is very similar, and I confirm it happens both on 2.1.0 as well as from the latest commit on the cassandra-2.1 branch (https://github.com/apache/cassandra/commit/aca80da38c3d86a40cc63d9a122f7d45258e4685 from the cassandra-2.1) Even starting with a clean keyspace, after a few hours I get: $ sudo find /raid0 | grep tmplink | xargs du -hs 2.7G /raid0/cassandra/data/draios/protobuf1-ccc6dce04beb11e4abf997b38fbf920b/draios-protobuf1-tmplink-ka-4515-Data.db 13M /raid0/cassandra/data/draios/protobuf1-ccc6dce04beb11e4abf997b38fbf920b/draios-protobuf1-tmplink-ka-4515-Index.db 1.8G /raid0/cassandra/data/draios/protobuf_by_agent1-cd071a304beb11e4abf997b38fbf920b/draios-protobuf_by_agent1-tmplink-ka-1788-Data.db 12M /raid0/cassandra/data/draios/protobuf_by_agent1-cd071a304beb11e4abf997b38fbf920b/draios-protobuf_by_agent1-tmplink-ka-1788-Index.db 5.2M /raid0/cassandra/data/draios/protobuf_by_agent1-cd071a304beb11e4abf997b38fbf920b/draios-protobuf_by_agent1-tmplink-ka-2678-Index.db 822M /raid0/cassandra/data/draios/protobuf_by_agent1-cd071a304beb11e4abf997b38fbf920b/draios-protobuf_by_agent1-tmplink-ka-2678-Data.db 7.3M /raid0/cassandra/data/draios/protobuf_by_agent1-cd071a304beb11e4abf997b38fbf920b/draios-protobuf_by_agent1-tmplink-ka-3283-Index.db 1.2G /raid0/cassandra/data/draios/protobuf_by_agent1-cd071a304beb11e4abf997b38fbf920b/draios-protobuf_by_agent1-tmplink-ka-3283-Data.db 6.7M /raid0/cassandra/data/draios/protobuf_by_agent1-cd071a304beb11e4abf997b38fbf920b/draios-protobuf_by_agent1-tmplink-ka-3951-Index.db 1.1G 
/raid0/cassandra/data/draios/protobuf_by_agent1-cd071a304beb11e4abf997b38fbf920b/draios-protobuf_by_agent1-tmplink-ka-3951-Data.db 11M /raid0/cassandra/data/draios/protobuf_by_agent1-cd071a304beb11e4abf997b38fbf920b/draios-protobuf_by_agent1-tmplink-ka-4799-Index.db 1.7G /raid0/cassandra/data/draios/protobuf_by_agent1-cd071a304beb11e4abf997b38fbf920b/draios-protobuf_by_agent1-tmplink-ka-4799-Data.db 812K /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-234-Index.db 122M /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-208-Data.db 744K /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-739-Index.db 660K /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-193-Index.db 796K /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-230-Index.db 137M /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-230-Data.db 161M /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-269-Data.db 139M /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-234-Data.db 940K /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-786-Index.db 936K /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-269-Index.db 161M /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-786-Data.db 672K 
/raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-197-Index.db 113M /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-193-Data.db 116M /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-197-Data.db 712K /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-208-Index.db 127M /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-739-Data.db 776K /raid0/cassandra/data/draios/mounted_fs_by_agent1-d7bf3e304beb11e4abf997b38fbf920b/draios-mounted_fs_by_agent1-tmplink-ka-487-Index.d
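The `find /raid0 | grep tmplink | xargs du -hs` pipeline above is how the leftover files were spotted. The same check can be sketched in Python (path and function name are illustrative; this just walks a data directory and totals any file with "tmplink" in its name, largest first):

```python
import os


def tmplink_usage(root):
    """Collect sizes of leftover *tmplink* files under root, largest first.

    Rough Python equivalent of `find root | grep tmplink | xargs du`.
    Returns a list of (path, size_in_bytes) tuples.
    """
    sizes = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if "tmplink" in name:
                p = os.path.join(dirpath, name)
                sizes[p] = os.path.getsize(p)
    return sorted(sizes.items(), key=lambda kv: -kv[1])


# Hypothetical data directory, as in the listing above; os.walk simply
# yields nothing if the path doesn't exist on this machine.
for path, size in tmplink_usage("/raid0/cassandra/data"):
    print(f"{size / 2**20:8.1f}M  {path}")
```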
[jira] [Commented] (CASSANDRA-8028) Unable to compute when histogram overflowed
[ https://issues.apache.org/jira/browse/CASSANDRA-8028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14158457#comment-14158457 ] Gianluca Borello commented on CASSANDRA-8028:

I know it's absolutely OT, and I can perhaps post this to the mailing list instead, but I really have to ask: are we doing something wrong, then? Should we make partitions much smaller? I've read from various sources that having rows "up to a few megabytes" is totally acceptable, so that has become our rule of thumb when designing sharding keys for the partitions.

> Unable to compute when histogram overflowed
> -
>
> Key: CASSANDRA-8028
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8028
> Project: Cassandra
> Issue Type: Bug
> Components: Tools
> Environment: Linux
> Reporter: Gianluca Borello
> Assignee: Carl Yeksigian
> Fix For: 2.1.1
>
> Attachments: 8028-2.1.txt
>
> It seems like with 2.1.0 histograms can't be computed most of the time:
> {noformat}
> $ nodetool cfhistograms draios top_files_by_agent1
> nodetool: Unable to compute when histogram overflowed
> See 'nodetool help' or 'nodetool help '.
> {noformat}
> I can probably find a way to attach a .cql script to reproduce it, but I
> suspect it must be obvious to replicate, as it happens on more than 50% of
> my column families.
[jira] [Commented] (CASSANDRA-8028) Unable to compute when histogram overflowed
[ https://issues.apache.org/jira/browse/CASSANDRA-8028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14156737#comment-14156737 ] Gianluca Borello commented on CASSANDRA-8028:

How "large" are the partitions we're talking about here? The worrisome thing is that the command fails on a column family and then, a few seconds later, works again on that same column family.

First attempt:

{noformat}
$ nodetool cfhistograms draios protobuf1
nodetool: Unable to compute when histogram overflowed
See 'nodetool help' or 'nodetool help '.
{noformat}

Second attempt (after about 30 seconds):

{noformat}
$ nodetool cfhistograms draios protobuf1
draios/protobuf1 histograms
Percentile  SSTables  Write Latency  Read Latency  Partition Size  Cell Count
                      (micros)       (micros)      (bytes)
50%         0.00      18.60          159.55        1955666         1597
75%         1.00      21.77          364.55        4055269         3973
95%         1.00      33.11          10789.18      7007506         61214
98%         3.00      53.04          56822.90      8409007         61214
99%         4.00      155.01         77205.61      8409007         61214
Min         0.00      7.11           58.23         105779          87
Max         5.00      85449.58       189451.45     17436917        61214
{noformat}

There were no deletions in between, and the partitions don't seem that big to me; we try to keep them always under a few MBs.

> Unable to compute when histogram overflowed
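For context on what "overflowed" means here: Cassandra's EstimatedHistogram buckets samples into exponentially spaced offsets (each boundary roughly 20% larger than the previous), and a sample larger than the last offset lands in an overflow slot, after which percentiles can no longer be computed. A simplified sketch of that idea in Python (the ~1.2 growth factor mirrors the real class; everything else is an illustrative approximation, not the Java implementation):

```python
import bisect


def make_offsets(n=90):
    """Exponentially spaced bucket boundaries, each ~20% above the last."""
    offsets = [1]
    while len(offsets) < n:
        offsets.append(max(offsets[-1] + 1, round(offsets[-1] * 1.2)))
    return offsets


class Histogram:
    def __init__(self):
        self.offsets = make_offsets()
        self.buckets = [0] * (len(self.offsets) + 1)  # last slot = overflow

    def add(self, value):
        # Values beyond the last offset fall into the overflow slot.
        self.buckets[bisect.bisect_left(self.offsets, value)] += 1

    def percentile(self, p):
        if self.buckets[-1] > 0:
            raise ValueError("Unable to compute when histogram overflowed")
        target = p * sum(self.buckets)
        acc = 0
        for off, count in zip(self.offsets, self.buckets):
            acc += count
            if acc >= target:
                return off
        return 0


h = Histogram()
for v in (10, 50, 200):
    h.add(v)
print(h.percentile(0.50))  # -> 50 (the bucket boundary covering the median)
h.add(10**18)              # huge sample: recorded as overflow
try:
    h.percentile(0.50)
except ValueError as e:
    print(e)  # -> Unable to compute when histogram overflowed
```

This also suggests why the command can fail and then succeed moments later: the histograms are periodically reset, so the error appears only while an out-of-range sample sits in the current window.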
[jira] [Created] (CASSANDRA-8028) Unable to compute when histogram overflowed
Gianluca Borello created CASSANDRA-8028:
-
Summary: Unable to compute when histogram overflowed
Key: CASSANDRA-8028
URL: https://issues.apache.org/jira/browse/CASSANDRA-8028
Project: Cassandra
Issue Type: Bug
Components: Tools
Environment: Linux
Reporter: Gianluca Borello
Fix For: 2.1.0
[jira] [Commented] (CASSANDRA-6275) 2.0.x leaks file handles
[ https://issues.apache.org/jira/browse/CASSANDRA-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13819363#comment-13819363 ] Gianluca Borello commented on CASSANDRA-6275:

[~jbellis], FWIW 1.2.11 is working fine for us (I have one week of uptime so far since the downgrade).

> 2.0.x leaks file handles
> -
>
> Key: CASSANDRA-6275
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6275
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Environment: java version "1.7.0_25"
> Java(TM) SE Runtime Environment (build 1.7.0_25-b15)
> Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode)
> Linux cassandra-test1 2.6.32-279.el6.x86_64 #1 SMP Thu Jun 21 15:00:18 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux
> Reporter: Mikhail Mazursky
> Attachments: cassandra_jstack.txt, leak.log, slog.gz
>
> Looks like C* is leaking file descriptors when doing lots of CAS operations.
> {noformat}
> $ sudo cat /proc/15455/limits
> Limit                     Soft Limit  Hard Limit  Units
> Max cpu time              unlimited   unlimited   seconds
> Max file size             unlimited   unlimited   bytes
> Max data size             unlimited   unlimited   bytes
> Max stack size            10485760    unlimited   bytes
> Max core file size        0           0           bytes
> Max resident set          unlimited   unlimited   bytes
> Max processes             1024        unlimited   processes
> Max open files            4096        4096        files
> Max locked memory         unlimited   unlimited   bytes
> Max address space         unlimited   unlimited   bytes
> Max file locks            unlimited   unlimited   locks
> Max pending signals       14633       14633       signals
> Max msgqueue size         819200      819200      bytes
> Max nice priority         0           0
> Max realtime priority     0           0
> Max realtime timeout      unlimited   unlimited   us
> {noformat}
> Looks like the problem is not in limits.
> Before load test: > {noformat} > cassandra-test0 ~]$ lsof -n | grep java | wc -l > 166 > cassandra-test1 ~]$ lsof -n | grep java | wc -l > 164 > cassandra-test2 ~]$ lsof -n | grep java | wc -l > 180 > {noformat} > After load test: > {noformat} > cassandra-test0 ~]$ lsof -n | grep java | wc -l > 967 > cassandra-test1 ~]$ lsof -n | grep java | wc -l > 1766 > cassandra-test2 ~]$ lsof -n | grep java | wc -l > 2578 > {noformat} > Most opened files have names like: > {noformat} > java 16890 cassandra 1636r REG 202,17 88724987 > 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db > java 16890 cassandra 1637r REG 202,17 161158485 > 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db > java 16890 cassandra 1638r REG 202,17 88724987 > 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db > java 16890 cassandra 1639r REG 202,17 161158485 > 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db > java 16890 cassandra 1640r REG 202,17 88724987 > 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db > java 16890 cassandra 1641r REG 202,17 161158485 > 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db > java 16890 cassandra 1642r REG 202,17 88724987 > 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db > java 16890 cassandra 1643r REG 202,17 161158485 > 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db > java 16890 cassandra 1644r REG 202,17 88724987 > 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db > java 16890 cassandra 1645r REG 202,17 161158485 > 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db > java 16890 cassandra 1646r REG 202,17 88724987 > 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db > java 16890 cassandra 1647r REG 202,17 161158485 > 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db > java 16890 cassandra 1648r REG 
202,17 88724987 > 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db > java 16890
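A quick way to see whether a node is heading toward the `Max open files` ceiling shown above is to compare its descriptor count against its soft limit, rather than grepping lsof. A small sketch (Linux-specific: it counts entries under /proc/<pid>/fd; the function name is illustrative, and the limit is read for the current process only):

```python
import os
import resource


def fd_headroom(pid="self"):
    """Return (open_fds, soft_limit): how close a process is to EMFILE.

    Counts entries in /proc/<pid>/fd on Linux; open_fds is -1 where no
    procfs is available. The soft limit comes from RLIMIT_NOFILE.
    """
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    try:
        open_fds = len(os.listdir(f"/proc/{pid}/fd"))
    except FileNotFoundError:  # non-Linux: no procfs
        open_fds = -1
    return open_fds, soft


fds, limit = fd_headroom()
print(f"{fds} open fds against a soft limit of {limit}")
```

Watching that count climb steadily under a CAS-heavy load (as in the before/after lsof numbers above) is the leak signature; hitting the limit is what eventually kills the node.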
[jira] [Comment Edited] (CASSANDRA-6275) 2.0.x leaks file handles
[ https://issues.apache.org/jira/browse/CASSANDRA-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13819144#comment-13819144 ] Gianluca Borello edited comment on CASSANDRA-6275 at 11/11/13 5:35 PM:
-----------------------------------------------------------------------
We are experiencing a similar issue in 2.0.2. It started happening after we set a TTL for all our columns in a very limited datastore (just a few GBs). We can easily see the fd count rapidly increase to 10+, and the majority of the fds are (from lsof):
{noformat}
java 13168 cassandra 267r REG 9,0 273129 671089723 /raid0/cassandra/data/draios/process_counters_by_exe/draios-process_counters_by_exe-jb-231-Data.db (deleted)
java 13168 cassandra 268r REG 9,0 273129 671089723 /raid0/cassandra/data/draios/process_counters_by_exe/draios-process_counters_by_exe-jb-231-Data.db (deleted)
java 13168 cassandra 269r REG 9,0 273129 671089723 /raid0/cassandra/data/draios/process_counters_by_exe/draios-process_counters_by_exe-jb-231-Data.db (deleted)
java 13168 cassandra 270r REG 9,0 273129 671089723 /raid0/cassandra/data/draios/process_counters_by_exe/draios-process_counters_by_exe-jb-231-Data.db (deleted)
java 13168 cassandra 271r REG 9,0 273129 671089723 /raid0/cassandra/data/draios/process_counters_by_exe/draios-process_counters_by_exe-jb-231-Data.db (deleted)
java 13168 cassandra 272r REG 9,0 273129 671089723 /raid0/cassandra/data/draios/process_counters_by_exe/draios-process_counters_by_exe-jb-231-Data.db (deleted)
java 13168 cassandra 273r REG 9,0 273129 671089723 /raid0/cassandra/data/draios/process_counters_by_exe/draios-process_counters_by_exe-jb-231-Data.db (deleted)
{noformat}
I'm attaching the log of the exception (leak.log). You can see the exceptions, and then Cassandra eventually shuts down. We had to temporarily downgrade to 1.2.11.
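The `(deleted)` suffix in the lsof output above means the sstable was unlinked on disk but a descriptor to it is still held open, which is exactly what keeps the space and the fd from being reclaimed. A quick way to watch for this class of leak on Linux is to count such entries directly under `/proc` (a sketch only; the PID 13168 above is from the reporter's machine and will differ elsewhere):

```shell
#!/bin/sh
# Count file descriptors that still point at deleted files for a given PID.
# On Linux, /proc/PID/fd holds one symlink per open descriptor, and links
# to unlinked files have a target ending in " (deleted)".
count_deleted_fds() {
    ls -l "/proc/$1/fd" 2>/dev/null | grep -c '(deleted)'
}

# Example: check the current shell; substitute the Cassandra PID in practice.
count_deleted_fds $$
```

Running this in a loop alongside a repair or compaction makes a leak like the one reported here show up as a monotonically growing count.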
[jira] [Updated] (CASSANDRA-6275) 2.0.x leaks file handles
[ https://issues.apache.org/jira/browse/CASSANDRA-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gianluca Borello updated CASSANDRA-6275:
----------------------------------------
    Attachment: leak.log

> 2.0.x leaks file handles
> ------------------------
>
>                 Key: CASSANDRA-6275
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6275
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: java version "1.7.0_25"
>             Java(TM) SE Runtime Environment (build 1.7.0_25-b15)
>             Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode)
>             Linux cassandra-test1 2.6.32-279.el6.x86_64 #1 SMP Thu Jun 21 15:00:18 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux
>            Reporter: Mikhail Mazursky
>         Attachments: cassandra_jstack.txt, leak.log, slog.gz
>
> Looks like C* is leaking file descriptors when doing lots of CAS operations.
> {noformat}
> $ sudo cat /proc/15455/limits
> Limit                     Soft Limit           Hard Limit           Units
> Max cpu time              unlimited            unlimited            seconds
> Max file size             unlimited            unlimited            bytes
> Max data size             unlimited            unlimited            bytes
> Max stack size            10485760             unlimited            bytes
> Max core file size        0                    0                    bytes
> Max resident set          unlimited            unlimited            bytes
> Max processes             1024                 unlimited            processes
> Max open files            4096                 4096                 files
> Max locked memory         unlimited            unlimited            bytes
> Max address space         unlimited            unlimited            bytes
> Max file locks            unlimited            unlimited            locks
> Max pending signals       14633                14633                signals
> Max msgqueue size         819200               819200               bytes
> Max nice priority         0                    0
> Max realtime priority     0                    0
> Max realtime timeout      unlimited            unlimited            us
> {noformat}
> Looks like the problem is not in limits.
> Before load test:
> {noformat}
> cassandra-test0 ~]$ lsof -n | grep java | wc -l
> 166
> cassandra-test1 ~]$ lsof -n | grep java | wc -l
> 164
> cassandra-test2 ~]$ lsof -n | grep java | wc -l
> 180
> {noformat}
> After load test:
> {noformat}
> cassandra-test0 ~]$ lsof -n | grep java | wc -l
> 967
> cassandra-test1 ~]$ lsof -n | grep java | wc -l
> 1766
> cassandra-test2 ~]$ lsof -n | grep java | wc -l
> 2578
> {noformat}
> Most opened files have names like:
> {noformat}
> java 16890 cassandra 1636r REG 202,17 88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
> java 16890 cassandra 1637r REG 202,17 161158485 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-
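The before/after comparison above runs `lsof -n | grep java | wc -l` on every node by hand. On Linux the same signal can be read per process from `/proc`, together with the soft limit the count is creeping toward (4096 in the limits dump above). A minimal sketch, assuming the standard `/proc/PID/limits` layout where the soft limit is the fourth field of the "Max open files" line:

```shell
#!/bin/sh
# Print "open_fds soft_limit" for a PID, so a leak shows up as the first
# number steadily approaching the second between samples.
fd_usage() {
    # one symlink per open descriptor under /proc/PID/fd
    nfds=$(ls "/proc/$1/fd" 2>/dev/null | wc -l)
    # "Max open files   <soft>   <hard>   files" -> field 4 is the soft limit
    limit=$(awk '/Max open files/ {print $4}' "/proc/$1/limits")
    echo "$nfds $limit"
}

# Example against the current shell; on a live node use the Cassandra PID
# (e.g. 15455 in the dump above).
fd_usage $$
```

Sampling this every few minutes during a load test would have reproduced the 166 -> 967 style growth shown above without the cost of a full `lsof` scan.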