[
https://issues.apache.org/jira/browse/PHOENIX-1561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14269517#comment-14269517
]
Gabriel Reid commented on PHOENIX-1561:
---------------------------------------
{quote}If it's possible to catch some of these conditions and alert the user
that's great, but documenting them is also sufficient because they are only
used when they are specified in the pig script.{quote}
FWIW, I'd definitely be in favor of catching these situations (e.g. by
throwing an exception in {{getSplitComparable()}} and/or
{{ensureAllKeyInstancesInSameSplit()}}) rather than just documenting them. The
semantics of HBase row keys (ordering within a region, and so on) are pretty
well known, whereas those of Phoenix row keys aren't necessarily. A Phoenix
row key contains all primary key values, each PK column can have its own
ordering rules, and there are potentially two "hidden" values within the row
key as well (tenant id and salt bucket).
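To make the suggestion concrete, here is a rough sketch of what such a guard could look like. It is only an illustration: {{SplitGuardSketch}}, {{TableInfo}}, {{SortOrder}} and {{unsafeReason()}} are hypothetical names, not Phoenix or Pig APIs, and the choice to reject salted tables, multi-tenant tables and descending PK columns outright is an assumption. Whether each of those actually breaks a given join depends on the join key, so a real check could be less conservative.

```java
import java.io.IOException;
import java.util.List;

/** Hypothetical sketch: none of these names come from Phoenix or Pig. */
final class SplitGuardSketch {

    /** Assumed per-column ordering of a primary key column. */
    enum SortOrder { ASC, DESC }

    /** Assumed summary of the table properties that shape the row key. */
    static final class TableInfo {
        final boolean salted;
        final boolean multiTenant;
        final List<SortOrder> pkSortOrders;

        TableInfo(boolean salted, boolean multiTenant, List<SortOrder> pkSortOrders) {
            this.salted = salted;
            this.multiTenant = multiTenant;
            this.pkSortOrders = pkSortOrders;
        }
    }

    /** Returns a reason the row key layout is considered unsafe, or null if none applies. */
    static String unsafeReason(TableInfo t) {
        if (t.salted) {
            return "table is salted, so a salt bucket byte leads the row key";
        }
        if (t.multiTenant) {
            return "table is multi-tenant, so a tenant id is part of the row key";
        }
        if (t.pkSortOrders.contains(SortOrder.DESC)) {
            return "a primary key column is stored in descending order";
        }
        return null;
    }

    /** Fails fast instead of silently acting as a no-op (conservative assumption). */
    static void ensureAllKeyInstancesInSameSplit(TableInfo t) throws IOException {
        String reason = unsafeReason(t);
        if (reason != null) {
            throw new IOException("Optimized join not supported: " + reason);
        }
    }
}
```

With something along these lines, a script that requests an optimized join against a salted table would fail with an explicit message at planning time rather than relying on the user having read the documentation.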
> Pig optimized joins
> -------------------
>
> Key: PHOENIX-1561
> URL: https://issues.apache.org/jira/browse/PHOENIX-1561
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.2
> Reporter: Brian Johnson
> Assignee: Brian Johnson
> Attachments: 0001-PHOENIX-1561-Optimizing-Joins.patch, patch
>
>
> PhoenixHBaseLoader should implement both OrderedLoadFunc and
> CollectableLoadFunc just like HBaseStorage. There is nothing special that
> needs to be done other than implementing a single method. As in HBaseStorage,
> it is up to the user to ensure that the required constraints are not
> violated.
> {code:java}
> public void ensureAllKeyInstancesInSameSplit() throws IOException {
>     // No-op because HBase keys are unique.
>     // This will also work with things like DelimitedKeyPrefixRegionSplitPolicy
>     // if you need a partial key match to be included in the split.
>     LOG.debug("ensureAllKeyInstancesInSameSplit");
> }
> {code}