[
https://issues.apache.org/jira/browse/PIG-1205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dmitriy V. Ryaboy updated PIG-1205:
-----------------------------------
Attachment: PIG_1205_5.path
This patch (not really review-ready yet) introduces the Elephant-Bird
improvements.
You can use the -gt, -gte, -lt, and -lte flags to filter row ranges, specify
caching and per-region row limits, and specify the caster to use
(interpret Strings, as before, or use bytes directly for more efficient storage
and communication).
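As a rough illustration of the intended range semantics (not code from the
patch), the sketch below assumes the bounds are compared against row keys as
raw bytes in lexicographic order, which is how HBase sorts rows, and that all
bounds that are set must hold. The helper name key_in_range is hypothetical.

```python
from typing import Optional


def key_in_range(key: bytes,
                 gt: Optional[bytes] = None,
                 gte: Optional[bytes] = None,
                 lt: Optional[bytes] = None,
                 lte: Optional[bytes] = None) -> bool:
    """Return True if key satisfies every bound that is set.

    Hypothetical helper: bounds are compared as raw bytes in
    lexicographic order (an assumption about the patch's behavior).
    """
    if gt is not None and not key > gt:
        return False
    if gte is not None and not key >= gte:
        return False
    if lt is not None and not key < lt:
        return False
    if lte is not None and not key <= lte:
        return False
    return True


# Made-up row keys, for illustration only.
rows = [b"row01", b"row05", b"row10", b"row20"]

# Roughly what "-gte row05 -lt row20" would be expected to keep.
selected = [k for k in rows if key_in_range(k, gte=b"row05", lt=b"row20")]
print(selected)
```

With no bounds set, every key passes, which matches loading the whole table.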
The filtering is a bit off because it still spins up all the map tasks; the
ones whose keys are filtered out just finish extremely fast.
The progress reporting is a bit jittery, but better than nothing.
TODO: fix up filtering, add projection pushdown, add filter pushdown, and write
better tests.
> Enhance HBaseStorage-- Make it support loading row key and implement StoreFunc
> ------------------------------------------------------------------------------
>
> Key: PIG-1205
> URL: https://issues.apache.org/jira/browse/PIG-1205
> Project: Pig
> Issue Type: Sub-task
> Affects Versions: 0.7.0
> Reporter: Jeff Zhang
> Assignee: Dmitriy V. Ryaboy
> Fix For: 0.8.0
>
> Attachments: PIG_1205.patch, PIG_1205_2.patch, PIG_1205_3.patch,
> PIG_1205_4.patch, PIG_1205_5.path
>
>