[ https://issues.apache.org/jira/browse/HBASE-521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bryan Duxbury updated HBASE-521:
--------------------------------
Attachment: 521.patch
This patch does the vast majority of the work to replace client-side scanners
with the Scanner interface.
As a side effect, I've refactored the way the mapreduce pieces work. TableMap
takes a Text, RowResult pair. TableOutputFormat takes a Text, BatchUpdate
pair (the Text is ignored). TableReduce takes a Text key and a list of
BatchUpdates.
TTI works fine, but TTMR is having some issues. For some reason the test times
out, and the log output doesn't indicate what the problem is. If someone else
would like to take a look and give me some input, I'd be much obliged.
As far as I can tell, all the rest of the tests pass.
> Improve client scanner interface
> --------------------------------
>
> Key: HBASE-521
> URL: https://issues.apache.org/jira/browse/HBASE-521
> Project: Hadoop HBase
> Issue Type: Improvement
> Components: client
> Reporter: Bryan Duxbury
> Assignee: Bryan Duxbury
> Priority: Minor
> Fix For: 0.2.0
>
> Attachments: 521.patch
>
>
> The current client scanner interface is pretty ugly. You need to instantiate
> an HStoreKey and SortedMap<Text, byte[]> externally and then pass them into
> next. This is pretty bad, because for starters, the client has to choose the
> implementation of the map when they create it, so it's extra brain cycles to
> figure that out. HStoreKey doesn't show up anywhere else in the entire client
> side API, but here it bubbles out of next as a way to get the row and
> presumably the timestamp of the columns.
> I propose that we supplant HScannerInterface with Scanner, an easier-to-use
> version for clients. Its next method would look something like:
> {code}
> public RowResult next() throws IOException;
> {code}
> This packs the data up much more cleanly, including using Cells as values
> instead of raw byte[], meaning you have much more granular timestamp
> information. You also don't need HStoreKey anymore.
> By breaking Scanner away from HScannerInterface, we can leave the internal
> scanning code completely alone (keep using HStoreKeys and such) but make the
> client cleaner.
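
To make the proposal above concrete, here is a minimal sketch of what a client loop might look like when next() returns a RowResult. The Cell, RowResult, and Scanner types below are simplified stand-ins written for this example, not the classes in the patch. String replaces Text, and the null-at-end-of-scan convention, the Closeable parent, ListScanner, dump, and sampleRows are all assumptions made for illustration.

```java
import java.io.Closeable;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

public class ScannerSketch {
    /** Stand-in for a cell: a value that carries its own timestamp. */
    static final class Cell {
        final byte[] value;
        final long timestamp;
        Cell(byte[] value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    /** Stand-in for a row result: the row key plus a column-to-Cell map. */
    static final class RowResult {
        final String row;
        final SortedMap<String, Cell> cells;
        RowResult(String row, SortedMap<String, Cell> cells) {
            this.row = row;
            this.cells = cells;
        }
    }

    /** Proposed shape. Returning null at end of scan is an assumption. */
    interface Scanner extends Closeable {
        RowResult next() throws IOException;
    }

    /** Hypothetical in-memory scanner, only here to drive the example. */
    static final class ListScanner implements Scanner {
        private final Iterator<RowResult> rows;
        ListScanner(List<RowResult> rows) { this.rows = rows.iterator(); }
        @Override public RowResult next() { return rows.hasNext() ? rows.next() : null; }
        @Override public void close() { }
    }

    /** Client loop: no HStoreKey and no caller-supplied map to set up. */
    static List<String> dump(Scanner scanner) throws IOException {
        List<String> out = new ArrayList<>();
        try {
            RowResult r;
            while ((r = scanner.next()) != null) {
                for (Map.Entry<String, Cell> e : r.cells.entrySet()) {
                    Cell c = e.getValue();
                    out.add(r.row + " " + e.getKey() + " @" + c.timestamp + " = "
                            + new String(c.value, StandardCharsets.UTF_8));
                }
            }
        } finally {
            scanner.close();
        }
        return out;
    }

    /** Two made-up rows with per-cell timestamps. */
    static List<RowResult> sampleRows() {
        SortedMap<String, Cell> first = new TreeMap<>();
        first.put("info:a", new Cell("x".getBytes(StandardCharsets.UTF_8), 100L));
        first.put("info:b", new Cell("y".getBytes(StandardCharsets.UTF_8), 250L));
        SortedMap<String, Cell> second = new TreeMap<>();
        second.put("info:a", new Cell("z".getBytes(StandardCharsets.UTF_8), 300L));
        List<RowResult> rows = new ArrayList<>();
        rows.add(new RowResult("row1", first));
        rows.add(new RowResult("row2", second));
        return rows;
    }

    public static void main(String[] args) throws IOException {
        for (String line : dump(new ListScanner(sampleRows()))) {
            System.out.println(line);
        }
    }
}
```

The point of the sketch is the loop in dump: the caller neither allocates the key nor picks a map implementation, and each column's timestamp comes from its own Cell rather than from a single key shared by the whole row.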