Benjamin Lerer created CASSANDRA-17183:
------------------------------------------

             Summary: Using the user specified page size for internal paging in GROUP BY queries can slow down the query and create high traffic between nodes
                 Key: CASSANDRA-17183
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17183
             Project: Cassandra
          Issue Type: Bug
            Reporter: Benjamin Lerer


When performing aggregation queries or GROUP BY queries, Cassandra computes the aggregates on the coordinator node to ensure consistency and requests the data in pages (numbers of rows).

Today, Cassandra uses the page size requested by the user (the number of rows that should be returned to the user) as its internal page size. Consequently, if the page size requested by the user is too small, the number of requests performed by the coordinator node will be much higher.

For 1,000,000 rows, a consistency level of LOCAL_QUORUM and a page size of 5,000, the coordinator will contact the replicas 200 times. For a page size of 100 (the CQLSH page size), the coordinator will contact the replicas 10,000 times.

To avoid this problem, we should have a minimum page size for the internal paging and give operators the possibility to change its value.
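
The arithmetic behind the figures in the description can be sketched as follows. This is only an illustration, not Cassandra code: the function names and the minimum value of 5,000 are hypothetical, and the model simply assumes one round of replica requests per internal page.

```python
def replica_round_trips(total_rows: int, page_size: int) -> int:
    """Rounds of replica requests needed to read total_rows in pages.

    Simplified model: one round per internal page, rounded up.
    """
    if total_rows < 0 or page_size <= 0:
        raise ValueError("total_rows must be >= 0 and page_size must be > 0")
    # Integer ceiling division avoids floating-point rounding.
    return (total_rows + page_size - 1) // page_size


def internal_page_size(user_page_size: int, min_internal_page_size: int) -> int:
    """Hypothetical fix: never page internally below a configured minimum."""
    return max(user_page_size, min_internal_page_size)


if __name__ == "__main__":
    rows = 1_000_000
    hypothetical_minimum = 5_000  # assumed value, not taken from the ticket

    for user_page_size in (5_000, 100):
        current = replica_round_trips(rows, user_page_size)
        bounded = replica_round_trips(
            rows, internal_page_size(user_page_size, hypothetical_minimum)
        )
        print(f"page size {user_page_size}: {current} rounds today, "
              f"{bounded} rounds with a minimum of {hypothetical_minimum}")
```

Under these assumptions, a user page size of 100 would no longer multiply the number of rounds by 50 compared to a page size of 5,000, because the internal page size is bounded from below independently of what the client requests.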