[ https://issues.apache.org/jira/browse/CASSANDRA-3861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13203431#comment-13203431 ]
Sylvain Lebresne commented on CASSANDRA-3861:
---------------------------------------------

bq. In your example above, the "right" thing to do from a client's perspective is to use a limit of 10000.

Agreed, but my argument is that if 99% of queries return < 10 rows, our code is uselessly inefficient for 99% of the queries. I'm really only talking about a performance issue.

bq. I guess I'd be okay with dropping that if we add a special check to return IRE for the MAX_VALUE antipattern.

I think that forbidding the MAX_VALUE anti-pattern is a different debate, but throwing an IRE on MAX_VALUE would be very Java-specific. For users of other languages, the same anti-pattern would likely be to pass some huge number, but likely not MAX_VALUE exactly. The right solution moving forward will be to do automatic paging with CQL, but in the meantime I don't see a good way to protect people against their own mistakes that does not incur inefficiency or limitations.

> get_indexed_slices throws OOM Error when is called with too big indexClause.count
> ---------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-3861
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3861
>             Project: Cassandra
>          Issue Type: Bug
>          Components: API, Core
>    Affects Versions: 1.0.7
>            Reporter: Vladimir Tsanev
>            Assignee: Sylvain Lebresne
>             Fix For: 1.0.8
>
>         Attachments: 3861.patch
>
>
> I tried to call get_indexed_slices with Integer.MAX_VALUE as IndexClause.count.
> Unfortunately the node died with OOM.
> In the log there is the following error:
> ERROR [Thrift:4] 2012-02-06 17:43:39,224 Cassandra.java (line 3252) Internal error processing get_indexed_slices
> java.lang.OutOfMemoryError: Java heap space
>     at java.util.ArrayList.<init>(ArrayList.java:112)
>     at org.apache.cassandra.service.StorageProxy.scan(StorageProxy.java:1067)
>     at org.apache.cassandra.thrift.CassandraServer.get_indexed_slices(CassandraServer.java:746)
>     at org.apache.cassandra.thrift.Cassandra$Processor$get_indexed_slices.process(Cassandra.java:3244)
>     at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2889)
>     at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>     at java.lang.Thread.run(Thread.java:662)
> Is it necessary to allocate all the memory in advance? I only have 3 KEYS that match my clause. I do not know the exact number, but in general I am sure that they will fit in memory.
> I can/will implement some calls with paging, but I wanted to test, and I am not happy that the node disconnected.
> I wonder why ArrayList is used here?
> I think the result is never accessed by index (only iterated), and subList for non-RandomAccess Lists (for example LinkedList) will do the same job if you are not using operations other than iteration.
> Is this related to the problem described in CASSANDRA-691?
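
The stack trace in the quoted report ends in ArrayList.<init> called from StorageProxy.scan, which is consistent with the result list being pre-sized from the client-supplied count. The following is a minimal sketch of one way to avoid that eager allocation while still honouring the limit; it is not the attached 3861.patch and not Cassandra code. The class name BoundedResultList, the constant MAX_INITIAL_CAPACITY, and the helper collect are all hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper, not part of Cassandra.
final class BoundedResultList {
    // Assumed upper bound on the pre-allocated capacity; the value is arbitrary.
    static final int MAX_INITIAL_CAPACITY = 1024;

    // Size the backing array from the requested count, but never above the cap,
    // so a count of Integer.MAX_VALUE does not allocate gigabytes up front.
    static int initialCapacity(int requestedCount) {
        if (requestedCount < 0) {
            throw new IllegalArgumentException("count must be non-negative");
        }
        return Math.min(requestedCount, MAX_INITIAL_CAPACITY);
    }

    // Collect at most requestedCount matches; the list grows on demand
    // if more rows than the initial capacity actually match.
    static <T> List<T> collect(Iterator<T> matches, int requestedCount) {
        List<T> rows = new ArrayList<>(initialCapacity(requestedCount));
        while (rows.size() < requestedCount && matches.hasNext()) {
            rows.add(matches.next());
        }
        return rows;
    }
}
```

With this shape, a query whose count is huge but which matches only a handful of keys pays for a small array, and the 99% case of small result sets is unaffected. It does not protect a node against a query that genuinely matches more rows than fit in the heap; that still needs paging.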