[ https://issues.apache.org/jira/browse/ACCUMULO-736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13440033#comment-13440033 ]
Pradeep Gollakota commented on ACCUMULO-736:
--------------------------------------------

I have very limited knowledge of the HBase API myself; I provided the link only as a way of including relevant discussions.

I'm requesting this feature for network optimization. Please correct me if my understanding of the Accumulo API is wrong. A Scanner returns the data as KV pairs via a Java Iterator. However, the data itself is sent from the server to the Scanner in batches (of size 1000 by default). So, if I'm looking for columns (n, n+k) from a row, the only way the client can select that range is by retrieving n+k KV pairs and discarding the first n. For large values of n, this can cause a lot of network overhead. If the data could be paged on the server side so that only the relevant KV pairs are returned over the network, this would be more efficient.

My initial attempt at this problem would probably be an Iterator/Filter. However, if this became part of the Scanner API, it would be more natural to work with.

> Add Column Pagination Filter
> ----------------------------
>
>                 Key: ACCUMULO-736
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-736
>             Project: Accumulo
>          Issue Type: Wish
>          Components: client
>            Reporter: Pradeep Gollakota
>            Assignee: Billie Rinaldi
>
> Client application may need to perform pagination of data depending on the
> number of columns returned. This would be more efficient if the database
> itself handled the pagination.
> Similar to https://issues.apache.org/jira/browse/HBASE-2438
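
To make the request in the comment concrete, here is a minimal sketch of the offset/limit selection over one row's columns. It is plain Java with no Accumulo dependency: `ColumnPager` and `page` are hypothetical names used only for illustration and are not part of the Accumulo or HBase APIs. The point is that this loop has to read `offset + limit` entries to emit `limit` of them, so where it runs (client or server) decides how many KV pairs cross the network.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical helper for illustration only; not an Accumulo class. */
final class ColumnPager {

    private ColumnPager() {}

    /**
     * Returns at most {@code limit} entries, starting at the zero-based
     * position {@code offset}, from an iterator over one row's columns.
     * The source is consumed only as far as needed to fill the page.
     */
    static <T> List<T> page(Iterator<T> columns, int offset, int limit) {
        if (offset < 0 || limit < 0) {
            throw new IllegalArgumentException("offset and limit must be non-negative");
        }
        List<T> page = new ArrayList<>();
        int index = 0;
        while (columns.hasNext() && page.size() < limit) {
            T entry = columns.next();
            // Entries before the offset are read but not kept.
            if (index >= offset) {
                page.add(entry);
            }
            index++;
        }
        return page;
    }
}
```

Applied on the client to a Scanner's Iterator, this logic still pulls the skipped entries over the network; the feature requested here amounts to running the same selection on the server so that only the page itself is returned.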