[ https://issues.apache.org/jira/browse/PHOENIX-11?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13901809#comment-13901809 ]
Prashant Kommireddi commented on PHOENIX-11:
--------------------------------------------
We need this so that MapReduce can read a Phoenix table in parallel.
MR needs split information up front in order to spawn multiple mappers
simultaneously, which is why MR must have a handle on the splits (btw, I think
an HBase split is different from a MapReduce split; in MR a split is generally
an HDFS block, but it could be something else). What you refer to would be
useful when writing to HBase; reading from a Phoenix table, however, is a
different problem. For example, if we use the query
{code}
Select id, first_name, last_name from MyPhoenixTable
{code}
we need a way for the MapReduce job to determine its parallelism. The
underlying HBase split information is not visible to MR through the query.
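To make the parallelism point concrete, here is a minimal sketch in Python of deriving one input split per region from a table's region boundaries. It is only an illustration: `KeyRange`, `splits_from_region_boundaries`, and the split points are hypothetical names and values, not part of Phoenix, HBase, or Hadoop, and it assumes the region boundaries can be obtained by some other means.

```python
from typing import List, NamedTuple


class KeyRange(NamedTuple):
    """Half-open row-key range [start, stop); b"" means unbounded (assumed convention)."""
    start: bytes
    stop: bytes


def splits_from_region_boundaries(split_points: List[bytes]) -> List[KeyRange]:
    """Turn region split points into one key range per region.

    N split points divide the key space into N + 1 regions, so a job
    built this way would get N + 1 input splits (one mapper each).
    """
    boundaries = [b""] + sorted(split_points) + [b""]
    return [KeyRange(lo, hi) for lo, hi in zip(boundaries, boundaries[1:])]


# Hypothetical split points for MyPhoenixTable (illustrative values only).
ranges = splits_from_region_boundaries([b"g", b"n", b"t"])
for r in ranges:
    print(r)
```

Each resulting range could back one mapper, which would run the same query restricted to its own slice of row keys. The query text alone carries none of this information, which is the gap described above.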
> Create Pig Loader
> -----------------
>
> Key: PHOENIX-11
> URL: https://issues.apache.org/jira/browse/PHOENIX-11
> Project: Phoenix
> Issue Type: Bug
> Reporter: James Taylor
> Assignee: maghamravikiran
>
> A Pig Storage function exists, so we can store to phoenix tables. What is
> needed is a Loader to go with the Storer.