[
https://issues.apache.org/jira/browse/HADOOP-1439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508929
]
James Kennedy commented on HADOOP-1439:
---------------------------------------
One thing I forgot to add to the limitations above:
Column criteria can only apply to columns that are included in the results. You
cannot retrieve COL1, COL2 where COL3 = 'XYZ'.
This is because the filtering happens at the HScanner level. In that example,
the lower-level scanner for COL3 is never employed, so all of COL3's values
appear as null.
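To make the limitation concrete, here is a minimal self-contained sketch. It does not use the real HBase classes; the names ColumnFilterSketch, storedRow, project and matches are hypothetical stand-ins, and the model assumes only that a filter sees nothing but the columns the scan requested.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the limitation; this is not the HBase API.
class ColumnFilterSketch {

    // A stored row: column name -> value.
    static Map<String, String> storedRow() {
        Map<String, String> row = new HashMap<>();
        row.put("COL1", "a");
        row.put("COL2", "b");
        row.put("COL3", "XYZ");
        return row;
    }

    // Only the requested columns are read; any other column is simply absent.
    static Map<String, String> project(Map<String, String> stored, Set<String> requested) {
        Map<String, String> out = new HashMap<>();
        for (String column : requested) {
            if (stored.containsKey(column)) {
                out.put(column, stored.get(column));
            }
        }
        return out;
    }

    // The filter runs on the projected row, so an unrequested column reads as null.
    static boolean matches(Map<String, String> projected, String column, String expected) {
        return expected.equals(projected.get(column));
    }

    public static void main(String[] args) {
        Set<String> withoutCol3 = new HashSet<>(Arrays.asList("COL1", "COL2"));
        Set<String> withCol3 = new HashSet<>(Arrays.asList("COL1", "COL2", "COL3"));

        // COL3 was not requested, so the criterion COL3 = 'XYZ' can never match.
        System.out.println(matches(project(storedRow(), withoutCol3), "COL3", "XYZ"));
        // Including COL3 in the results lets the same criterion match.
        System.out.println(matches(project(storedRow(), withCol3), "COL3", "XYZ"));
    }
}
```

Under these assumptions, the workaround is to include COL3 in the requested columns and discard it on the client side.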
> Add endRow parameter to HClient#obtainScanner
> ---------------------------------------------
>
> Key: HADOOP-1439
> URL: https://issues.apache.org/jira/browse/HADOOP-1439
> Project: Hadoop
> Issue Type: Improvement
> Components: contrib/hbase
> Reporter: stack
> Assignee: stack
> Priority: Minor
>
> Currently the HClient#obtainScanner looks like this:
> {code}
> public synchronized HScannerInterface obtainScanner(Text[] columns,
>     Text startRow) throws IOException;
> {code}
> Add an overload that allows specification of endRow:
> {code}
> public synchronized HScannerInterface obtainScanner(Text[] columns,
>     Text startRow, Text endRow) throws IOException;
> {code}
> Use case: the table contains the whole web and the client just wants to scan
> google's pages. Currently, the client could cut off the scanner as soon as the
> row key leaves the google domain, but it would be cleaner if
> {{HScannerInterface#next()}} returned false
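
The use case in the issue can be sketched with a small in-memory model. This is not the HClient or HScannerInterface API: RangeScanner, scan and sampleTable are hypothetical names, the row keys are made up, and treating endRow as exclusive is an assumption, since the issue does not say whether the end row is included.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical sketch of a scanner bounded by an end row; this is not the HBase API.
class EndRowScanSketch {

    static final class RangeScanner {
        private final Iterator<Map.Entry<String, String>> rows;
        private final String endRow; // null means "scan to the end of the table"
        private String currentRow;
        private boolean done;

        RangeScanner(NavigableMap<String, String> table, String startRow, String endRow) {
            // Start at the first row key >= startRow.
            this.rows = table.tailMap(startRow, true).entrySet().iterator();
            this.endRow = endRow;
        }

        // Returns false once the table is exhausted or the row key reaches endRow
        // (assumed exclusive), so the caller needs no cutoff check of its own.
        boolean next() {
            if (done || !rows.hasNext()) {
                done = true;
                return false;
            }
            String key = rows.next().getKey();
            if (endRow != null && key.compareTo(endRow) >= 0) {
                done = true;
                return false;
            }
            currentRow = key;
            return true;
        }

        String row() {
            return currentRow;
        }
    }

    // Collects every row key the scanner yields in [startRow, endRow).
    static List<String> scan(NavigableMap<String, String> table, String startRow, String endRow) {
        RangeScanner scanner = new RangeScanner(table, startRow, endRow);
        List<String> result = new ArrayList<>();
        while (scanner.next()) {
            result.add(scanner.row());
        }
        return result;
    }

    // Made-up row keys, sorted lexicographically by the TreeMap.
    static NavigableMap<String, String> sampleTable() {
        NavigableMap<String, String> table = new TreeMap<>();
        table.put("com.example/index", "page");
        table.put("com.google/about", "page");
        table.put("com.google/search", "page");
        table.put("com.yahoo/index", "page");
        return table;
    }

    public static void main(String[] args) {
        // '0' is the character right after '/', so this range covers "com.google/*".
        System.out.println(scan(sampleTable(), "com.google/", "com.google0"));
    }
}
```

With the end row pushed into the scanner, the client loop reduces to a plain while (scanner.next()) with no domain check inside it.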