[ 
https://issues.apache.org/jira/browse/PHOENIX-11?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13901809#comment-13901809
 ] 

Prashant Kommireddi commented on PHOENIX-11:
--------------------------------------------

We need this so that MapReduce can read a Phoenix table in parallel. MR needs
split information in order to spawn multiple mappers simultaneously, which is
why MR must have a handle on it (btw, I feel an HBase split is different from
a MapReduce split. In MR a split is generally an HDFS block, but it can be
something else). What you refer to would be useful when writing to HBase, but
reading from a Phoenix table is a different problem. For example, if we use
the query
{code}
SELECT id, first_name, last_name FROM MyPhoenixTable
{code}
we need a way for a MapReduce job to determine the parallelism. The underlying
HBase split information is not visible to MR through the query.
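
To make the idea concrete, here is a minimal sketch of turning region
boundaries into one input split per mapper. It is plain Python, not Phoenix,
HBase, or Hadoop code: KeyRangeSplit, splits_from_regions, rows_for_split and
the hard-coded REGIONS and TABLE are all hypothetical names and data chosen
for illustration. A real loader would have to obtain the boundaries from
HBase, since the query alone does not expose them.

```python
from dataclasses import dataclass
from typing import List, Tuple

Row = Tuple[bytes, str]  # (row key, value); a stand-in for a table row


@dataclass(frozen=True)
class KeyRangeSplit:
    """Hypothetical input split: the key range a single mapper would scan."""
    start_key: bytes  # inclusive; b"" means "from the start of the table"
    end_key: bytes    # exclusive; b"" means "to the end of the table"


def splits_from_regions(region_boundaries: List[Tuple[bytes, bytes]]) -> List[KeyRangeSplit]:
    """Create one split per region, so the mapper count equals the region count."""
    return [KeyRangeSplit(start, end) for start, end in region_boundaries]


def rows_for_split(rows: List[Row], split: KeyRangeSplit) -> List[Row]:
    """Simulate the range scan one mapper would run for its split."""
    return [
        (key, value)
        for key, value in rows
        if key >= split.start_key and (split.end_key == b"" or key < split.end_key)
    ]


# Assumed region boundaries; in practice these would come from HBase.
REGIONS = [(b"", b"g"), (b"g", b"p"), (b"p", b"")]

# Assumed table contents, sorted by row key.
TABLE = [(b"alice", "A"), (b"gina", "G"), (b"omar", "O"), (b"zoe", "Z")]

if __name__ == "__main__":
    for split in splits_from_regions(REGIONS):
        print(split, rows_for_split(TABLE, split))
```

Each row falls into exactly one key range, so the mappers can read their
ranges independently and together cover the whole table.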

> Create Pig Loader
> -----------------
>
>                 Key: PHOENIX-11
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-11
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>            Assignee: maghamravikiran
>
> A Pig Storage function exists, so we can store to phoenix tables. What is 
> needed is a Loader to go with the Storer.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
