[ 
https://issues.apache.org/jira/browse/PHOENIX-2088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14635815#comment-14635815
 ] 

Josh Mahonin commented on PHOENIX-2088:
---------------------------------------

[~jamestaylor] If [~tdsilva] is comfortable making the changes, I have no 
problem reviewing them. It's a busy time for me right now, so unfortunately I 
don't think I'll get a chance to dive into this until some time tomorrow. The 
Spark integration tests are a pretty good indicator of whether the changes are 
compatible.

The trouble with Spark is its attempt to serialize data to each worker, and it 
can do so in subtle and frustrating ways. If the 
'ColumnInfoToStringEncoderDecoder' class is back, then the code in trunk should 
be pretty close to working, I would think. However, if we need to derive the 
columns from the configuration object, then each partition needs to instantiate 
its own version (as per the previous patch) so that the object itself is never 
serialized.
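To illustrate the serialization point, here is a minimal sketch in plain Java. It does not use Spark or Phoenix; it relies only on standard Java serialization, which Spark uses by default when shipping closures to workers. The names `ColumnMetadata`, `EagerTask`, and `LazyTask` are hypothetical stand-ins, not classes from the patch. The sketch assumes the column information can be carried as a plain string and rebuilt on the worker.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class PerPartitionInstantiationDemo {

    // Hypothetical stand-in for an object derived from the configuration.
    // It deliberately does NOT implement Serializable.
    static class ColumnMetadata {
        final String columns;
        ColumnMetadata(String columns) { this.columns = columns; }
    }

    // Holds the non-serializable object directly, so shipping the task fails.
    static class EagerTask implements Serializable {
        final ColumnMetadata metadata;
        EagerTask(ColumnMetadata metadata) { this.metadata = metadata; }
    }

    // Ships only a string; each receiver builds its own ColumnMetadata on first use.
    static class LazyTask implements Serializable {
        final String encodedColumns;
        transient ColumnMetadata metadata; // skipped by Java serialization

        LazyTask(String encodedColumns) { this.encodedColumns = encodedColumns; }

        ColumnMetadata metadata() {
            if (metadata == null) {
                metadata = new ColumnMetadata(encodedColumns);
            }
            return metadata;
        }
    }

    // Returns true if the object survives standard Java serialization.
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(canSerialize(new EagerTask(new ColumnMetadata("ID,NAME"))));
        System.out.println(canSerialize(new LazyTask("ID,NAME")));
    }
}
```

The first task fails to serialize because it drags the metadata object along; the second succeeds because the metadata is rebuilt locally from the string. Whether the real code needs the string-encoding route or the per-partition route depends on the patch.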

> Prevent splitting and recombining select expressions for MR integration
> -----------------------------------------------------------------------
>
>                 Key: PHOENIX-2088
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2088
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>            Assignee: Thomas D'Silva
>         Attachments: PHOENIX-2088-4.4-HBase-0.98-v2.patch, 
> PHOENIX-2088-4.4-HBase-0.98.patch, PHOENIX-2088-pig.patch, 
> PHOENIX-2088-wip-v2.patch, PHOENIX-2088-wip-v3.patch, PHOENIX-2088-wip.patch
>
>
> We currently send in the select expressions for the MR integration as a 
> delimiter-separated string, split it on the delimiter, and then recombine 
> the pieces using a comma separator. This is problematic because the delimiter 
> character may appear in a select expression, thus breaking this logic. 
> Instead, we should use a comma as the delimiter and avoid splitting and 
> recombining, since neither is necessary in that case: the entire string 
> can be used as-is to form the select expressions.
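The failure mode described in the issue can be reproduced with a small standalone Java sketch. This is not the Phoenix code; the `:` delimiter, the method names, and the sample expressions are assumptions chosen only for illustration.

```java
import java.util.regex.Pattern;

public class SelectExpressionDelimiterDemo {

    // Hypothetical delimiter; the issue does not say which character is used.
    static final String DELIMITER = ":";

    // Splits on the delimiter, then rejoins the pieces with commas.
    static String splitAndRecombine(String encoded) {
        String[] parts = encoded.split(Pattern.quote(DELIMITER));
        return String.join(",", parts);
    }

    // The proposed alternative: the caller passes a comma-separated string
    // and it is used unchanged.
    static String useAsIs(String commaSeparated) {
        return commaSeparated;
    }

    public static void main(String[] args) {
        // Two select expressions; the second contains the delimiter in a literal.
        String first = "ID";
        String second = "NAME || ':' || CITY";

        String viaDelimiter = splitAndRecombine(first + DELIMITER + second);
        String direct = useAsIs(first + "," + second);

        System.out.println(viaDelimiter); // ID,NAME || ',' || CITY
        System.out.println(direct);       // ID,NAME || ':' || CITY
    }
}
```

In this sketch the split-and-recombine path silently rewrites the `':'` literal into `','`, while the as-is path preserves the original expressions.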



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
