[
https://issues.apache.org/jira/browse/CRUNCH-127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Micah Whitacre updated CRUNCH-127:
----------------------------------
Attachment: CRUNCH-127_itest.patch
I wrote up an itest that I thought would demonstrate writing to two tables
successfully, but I haven't gotten it to pass yet. (I do have your patch
applied locally.) The message indicates that the Job is failing, but I
haven't dug into why just yet.
Is that how you anticipated the consumers using the multiple outputs? Or did I
do something wrong?
It'd be nice if we could actually hide the multi-table support from consumers.
From a consumer API perspective, I would hope to do seemingly independent
writes anywhere I want along the pipeline, while implementation-wise they
would be aggregated to use the HBaseMultiTableTarget if necessary.
If there was a method on the ToHBase class like:
{code}
PCollection<Put> puts = ...;
ToHBase.write(pipeline, "tableName", puts);
{code}
This would essentially hide the conversion to PTable<ImmutableBytesWritable,
Put>, which seems to be the same code everywhere. The difficulty with the above
is that ToHBase might have to track internal state about the target and the
union of the collections.
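To make the state-tracking concern a bit more concrete, here is a rough sketch
of the bookkeeping such a helper might need. It is plain Java with stand-in
types, not Crunch code: HBaseWriteRegistry, its methods, and the generic P
(standing in for Put) are all hypothetical names, and a List stands in for a
PCollection. It only shows the idea of unioning writes per table and deciding
whether more than one table is involved.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the state a ToHBase.write(...) helper might track.
// P stands in for HBase's Put; a List stands in for a Crunch PCollection.
final class HBaseWriteRegistry<P> {
    // Writes collected so far, keyed by table name in first-seen order.
    private final Map<String, List<P>> putsByTable = new LinkedHashMap<>();

    // Record a write; repeated writes to the same table are unioned.
    void write(String tableName, List<P> puts) {
        putsByTable.computeIfAbsent(tableName, k -> new ArrayList<>()).addAll(puts);
    }

    // True once writes target more than one table, i.e. the case where a
    // multi-table target would be needed instead of a single-table one.
    boolean needsMultiTableTarget() {
        return putsByTable.size() > 1;
    }

    // Read-only view of the aggregated writes per table.
    Map<String, List<P>> plannedWrites() {
        return Collections.unmodifiableMap(putsByTable);
    }
}
```

With something like this, a caller could register writes for "tableA" and
"tableB" independently, and the choice between a single-table and a
multi-table target would only be made once all of the writes are known.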
Or if this could be hidden behind HBaseTarget itself, that would be nice. I'm
just throwing out ideas and will hopefully get some time to play with the
implementation.
Also, is the intention that a single pipeline would only ever use either the
HBaseMultiTableTarget or HBaseTarget? Or would it be acceptable to use them
together?
> Allow multiple HBaseTargets in a single pipeline
> ------------------------------------------------
>
> Key: CRUNCH-127
> URL: https://issues.apache.org/jira/browse/CRUNCH-127
> Project: Crunch
> Issue Type: Bug
> Reporter: Micah Whitacre
> Assignee: Josh Wills
> Attachments: CRUNCH-127_itest.patch, CRUNCH-127.patch
>
>
> Currently when a pipeline contains writes to multiple HBaseTargets, all puts
> are being sent to the first configured HBaseTarget ignoring the second one
> and causing issues if the columns are not the same.