[ https://issues.apache.org/jira/browse/CASSANDRA-7280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171391#comment-14171391 ]
Alex Liu commented on CASSANDRA-7280:
-------------------------------------

cassandra.input.split.size is used to partition rows by partitioning key. It doesn't affect native paging.

Native internal paging has a page size which can be set by "cassandra.input.page.row.size"

> Hadoop support not respecting cassandra.input.split.size
> --------------------------------------------------------
>
>                 Key: CASSANDRA-7280
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7280
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>            Reporter: Jeremy Hanna
>
> Long ago (0.7), I tried to set the cassandra.input.split.size property and
> never really got it to respect that property. However the batch size was
> useful for what I needed to affect the timeouts.
> Now with the cql record reader and the native paging, users can specify
> queries potentially using allow filtering clauses. The input split size is
> more important because the server may have to scan through many many records
> to get matching records. If the user can effectively set the input split
> size, then that gives a hard limit on how many records it will traverse.
> Currently it appears to be overriding the property, perhaps in the
> client.describe_splits_ex method on the server side.
> It can be argued that users shouldn't be using allow filtering clauses in
> their cql in the first place. However it is still a bug that the input split
> size is not honored.
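
The distinction in Alex Liu's comment (split size partitions the input, page size only controls how many rows each fetch returns) can be illustrated with a rough sketch. Only the two property names come from the comment. The class, the helper methods, the sample values, and the ceiling-division model are hypothetical, and plain java.util.Properties stands in for the Hadoop job configuration. This is not how Cassandra computes splits internally.

```java
import java.util.Properties;

public class SplitVsPageSketch {
    // Property names as quoted in the comment.
    static final String SPLIT_SIZE = "cassandra.input.split.size";
    static final String PAGE_ROW_SIZE = "cassandra.input.page.row.size";

    // Ceiling division for positive values.
    static long ceilDiv(long a, long b) {
        return (a + b - 1) / b;
    }

    // Hypothetical model: the split size caps how many partitions go into one split,
    // so it determines roughly how many splits (map tasks) the job gets.
    static long estimatedSplits(long totalPartitions, Properties conf) {
        long splitSize = Long.parseLong(conf.getProperty(SPLIT_SIZE));
        return ceilDiv(totalPartitions, splitSize);
    }

    // Hypothetical model: the page size only changes how many fetches are needed
    // to read one split; it does not change where the split boundaries fall.
    static long estimatedPagesPerSplit(long rowsInSplit, Properties conf) {
        long pageSize = Long.parseLong(conf.getProperty(PAGE_ROW_SIZE));
        return ceilDiv(rowsInSplit, pageSize);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(SPLIT_SIZE, "65536");    // illustrative value
        conf.setProperty(PAGE_ROW_SIZE, "1000");  // illustrative value

        System.out.println(estimatedSplits(1_000_000L, conf));      // prints 16
        System.out.println(estimatedPagesPerSplit(65_536L, conf));  // prints 66
    }
}
```

Under this model, changing the page size changes the second number but never the first, which matches the comment's point that cassandra.input.split.size does not affect native paging.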