Hi Dmitriy,
Sorry for the late reply, I was out of office.
Discarding the caster and caching options (i.e. using only the -loadkey
option) does not change anything, except that some
FIELD_DISCARDED_TYPE_CONVERSION_FAILED warnings are issued.
On Fri, Jan 21, 2011 at 1:42 AM, Dmitriy Ryaboy
Hi,
It seems to be more efficient if a store to HBase partitions by region
and orders by HBase key.
I see that Pig 0.8 (PIG-282) added a custom partitioner for GROUP, but I am
not sure whether ordering is enforced there.
Is there a way to run a single MR job that orders and partitions data as per above
Pushing this logic into the StoreFunc would force an MR boundary before the
store (unless the StoreFunc passed, I suppose), which can make things overly
complex.
I think for the purposes of bulk-loading into HBase, a better approach might
be to use the native map-reduce functionality and feed
Thanks.
So I take it there's no way in Pig to specify a custom partitioner and the
ordering in one MR step?
I don't think prebuilding HFiles is the best strategy in my case, because my
job is incremental (i.e. I am not replacing 100% of the data). However, it is
big enough that I don't want to create
Do you want to order the groups or just within the groups? If you
want to order within the groups you can do that in Pig in a single job.
Alan.
On Jan 24, 2011, at 1:20 PM, Dmitriy Lyubimov wrote:
Thanks.
So I take it there's no way in Pig to specify a custom partitioner and the
ordering in
Jacob,
Are you sure you don't have Pig 0.7 or earlier jars kicking around?
I mean..
public class HBaseStorage extends LoadFunc implements StoreFuncInterface,
LoadPushDown {
...
On Mon, Jan 24, 2011 at 4:20 PM, jacob jacob.a.perk...@gmail.com wrote:
I'm having problems getting HBaseStorage to
Thank you, Alan. Let me consider this for a moment.
-d
On Mon, Jan 24, 2011 at 2:26 PM, Alan Gates ga...@yahoo-inc.com wrote:
Since Pig uses the partitioner to provide a total order (by which I mean an
order across part files), we don't allow users to override the partitioner
in that case.
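
To illustrate what "an order across part files" requires of a partitioner,
here is a minimal sketch in plain Java. It is not Pig's or Hadoop's actual
code: the class name RangePartitionerSketch is hypothetical, keys are plain
Strings (HBase row keys are byte arrays and compare byte-wise), and the split
points merely stand in for something like HBase region start keys. The idea is
that a key is sent to the partition whose key range contains it, so every key
in part file i sorts before every key in part file i+1.

```java
import java.util.Arrays;

/**
 * Hypothetical range partitioner, for illustration only.
 * Assumption: split point i is the first key that belongs to partition i + 1.
 */
final class RangePartitionerSketch {
    private final String[] splitPoints;

    RangePartitionerSketch(String[] splitPoints) {
        // Copy and sort defensively so the binary search below is valid.
        this.splitPoints = splitPoints.clone();
        Arrays.sort(this.splitPoints);
    }

    /** N split points divide the key space into N + 1 ranges. */
    int numPartitions() {
        return splitPoints.length + 1;
    }

    /** Returns the index of the range that contains the key. */
    int getPartition(String key) {
        int pos = Arrays.binarySearch(splitPoints, key);
        if (pos >= 0) {
            // The key equals split point pos, which opens partition pos + 1.
            return pos + 1;
        }
        // Not found: binarySearch returns -(insertionPoint) - 1, and the
        // insertion point is the number of split points smaller than the key.
        return -(pos + 1);
    }
}
```

Because getPartition never decreases as the key increases, sorting within each
partition gives a total order across the output files. A user-supplied
partitioner has no such guarantee, which is why swapping it in would break the
ordering.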