[jira] [Commented] (PHOENIX-2216) Support single mapper pass to CSV bulk load table and indexes

maghamravikiran (JIRA) Fri, 16 Oct 2015 18:05:42 -0700

    [ 
https://issues.apache.org/jira/browse/PHOENIX-2216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961612#comment-14961612
 ]


maghamravikiran commented on PHOENIX-2216:
------------------------------------------

The tests that involve local indexes fail when the pre-split option is 
specified. I have attached the test case.

[~jamestaylor]
    Currently, I have used a custom Writable class (CsvTableRowkeyPair). To 
get it onto HBase, I feel we should stick with ImmutableBytesWritable as the 
Reducer output key. Also, we would need the delimiter between the table name 
and the rowkey to be passed in as a configuration parameter. This way, we can 
parse the reducer output key into the table name and rowkey and construct the 
necessary output path. Let me know if this sounds reasonable.
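
To make the proposal concrete, here is a minimal sketch of the key layout I 
have in mind: table name, then the delimiter, then the rowkey, all in one byte 
array. Everything in it is an assumption for illustration only: the class name 
TableRowkeyKey, the method names, and the "|" delimiter are hypothetical and 
not taken from the attached patches. It uses plain byte[] so that it runs 
without HBase on the classpath; in the actual job the array would be wrapped 
in an ImmutableBytesWritable and the delimiter read from the job 
configuration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper: packs "tableName + delimiter + rowkey" into one byte[]
// and splits it again on the reducer side.
final class TableRowkeyKey {

    // Build the composite key: table name bytes, delimiter, then raw rowkey bytes.
    static byte[] encode(String tableName, byte[] rowKey, byte[] delimiter) {
        byte[] table = tableName.getBytes(StandardCharsets.UTF_8);
        byte[] out = new byte[table.length + delimiter.length + rowKey.length];
        System.arraycopy(table, 0, out, 0, table.length);
        System.arraycopy(delimiter, 0, out, table.length, delimiter.length);
        System.arraycopy(rowKey, 0, out, table.length + delimiter.length, rowKey.length);
        return out;
    }

    // Find the FIRST occurrence of the delimiter. The rowkey may contain the
    // delimiter bytes; the table name is assumed not to.
    static int indexOfDelimiter(byte[] key, byte[] delimiter) {
        for (int i = 0; i + delimiter.length <= key.length; i++) {
            boolean match = true;
            for (int j = 0; j < delimiter.length; j++) {
                if (key[i + j] != delimiter[j]) {
                    match = false;
                    break;
                }
            }
            if (match) {
                return i;
            }
        }
        throw new IllegalArgumentException("delimiter not found in key");
    }

    // Table name part, used to pick the output directory for the HFiles.
    static String tableName(byte[] key, byte[] delimiter) {
        int idx = indexOfDelimiter(key, delimiter);
        return new String(key, 0, idx, StandardCharsets.UTF_8);
    }

    // Rowkey part: everything after the first delimiter.
    static byte[] rowKey(byte[] key, byte[] delimiter) {
        int idx = indexOfDelimiter(key, delimiter);
        return Arrays.copyOfRange(key, idx + delimiter.length, key.length);
    }
}
```

Splitting on the first delimiter keeps the rowkey bytes opaque, so a rowkey 
that happens to contain the delimiter still round-trips. The cost is that the 
delimiter must never appear in a table name, which is why it would have to be 
configurable.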

> Support single mapper pass to CSV bulk load table and indexes
> -------------------------------------------------------------
>
>                 Key: PHOENIX-2216
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2216
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>            Assignee: maghamravikiran
>         Attachments: phoenix-custom-hfileoutputformat-comments.patch, 
> phoenix-custom-hfileoutputformat.patch, phoenix-multipleoutputs.patch
>
>
> Instead of running separate MR jobs for CSV bulk load: once for the table and 
> then once for each secondary index, generate both the data table HFiles and 
> the index table(s) HFiles in one mapper phase.
> Not sure if we need HBASE-3727 to be implemented for this or if we can do it 
> with existing HBase APIs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
