[ https://issues.apache.org/jira/browse/CASSANDRA-10707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15179733#comment-15179733 ]
Benjamin Lerer commented on CASSANDRA-10707:
--------------------------------------------

{quote}I noticed that CQLLimits.forShortReadRetry() does not provide any limit on the number of rows either.{quote}

It looks like I was really tired when I looked at the code :-(

{quote}Not sure about CQLGroupByLimits.forShortReadRetry(). I believe putting no limit on the number of rows (and only on the group) might lead to OOM. In fact, I need to think more carefully about this but I'm not 100% sure that the short read logic isn't thrown off by the fact that counted() returns a number of groups not rows.{quote}

I think my implementation for {{forShortReadRetry()}} is simply wrong. The goal of the short read retry is to fetch the rows that were missing for a given partition due to a short read. As the number of rows to request is computed in {{ShortReadRowProtection::moreContents}}, I believe that even in the case of {{GROUP BY}} we should use a {{CQLLimits}} to request the rows.

> Add support for Group By to Select statement
> --------------------------------------------
>
>                 Key: CASSANDRA-10707
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10707
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: CQL
>            Reporter: Benjamin Lerer
>            Assignee: Benjamin Lerer
>
> Now that Cassandra supports aggregate functions, it makes sense to support
> {{GROUP BY}} on the {{SELECT}} statements.
> It should be possible to group either at the partition level or at the
> clustering column level.
> {code}
> SELECT partitionKey, max(value) FROM myTable GROUP BY partitionKey;
> SELECT partitionKey, clustering0, clustering1, max(value) FROM myTable GROUP BY partitionKey, clustering0, clustering1;
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)