[
https://issues.apache.org/jira/browse/PHOENIX-2216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961163#comment-14961163
]
James Taylor commented on PHOENIX-2216:
---------------------------------------
Good idea, [~gabriel.reid]. To force the table to span regions, you can
pre-split it by appending SPLIT ON (1, 2, 3) to your CREATE TABLE statement,
where 1, 2, 3 are the region boundaries (this assumes your leading PK column
is an INTEGER or BIGINT).
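For illustration, a pre-split table might look like the sketch below. The
table name, column names, and column types are hypothetical; only the
SPLIT ON clause comes from the suggestion above.

```sql
-- Hypothetical table; the leading PK column is a BIGINT so that the
-- integer split points below are valid row-key boundaries.
CREATE TABLE IF NOT EXISTS example_table (
    id  BIGINT NOT NULL,
    val VARCHAR,
    CONSTRAINT pk PRIMARY KEY (id)
) SPLIT ON (1, 2, 3);
-- Three split points pre-split the table into four regions.
```

Loading rows with ids on both sides of a split point should then exercise
the multi-region path.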
Also, one more minor nit: would you mind noting in the class-level javadoc
where the code was copied from, and that the plan is to put together an
HBase patch so this can be done without the copy/paste (ideally
referencing an HBase JIRA that you file)?
> Support single mapper pass to CSV bulk load table and indexes
> -------------------------------------------------------------
>
> Key: PHOENIX-2216
> URL: https://issues.apache.org/jira/browse/PHOENIX-2216
> Project: Phoenix
> Issue Type: Bug
> Reporter: James Taylor
> Assignee: maghamravikiran
> Attachments: phoenix-custom-hfileoutputformat-comments.patch,
> phoenix-custom-hfileoutputformat.patch, phoenix-multipleoutputs.patch
>
>
> Instead of running separate MR jobs for CSV bulk load (one for the table and
> then one for each secondary index), generate both the data table HFiles and
> the index table(s) HFiles in one mapper phase.
> Not sure if we need HBASE-3727 to be implemented for this or if we can do it
> with existing HBase APIs.