[jira] [Commented] (PHOENIX-2088) Prevent splitting and recombining select expressions for MR integration

Josh Mahonin (JIRA) Wed, 01 Jul 2015 20:36:07 -0700

    [ https://issues.apache.org/jira/browse/PHOENIX-2088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14611410#comment-14611410 ]


Josh Mahonin commented on PHOENIX-2088:
---------------------------------------

On a first look, I'd vote for another way of serializing the names / column 
infos, if possible.

However, it's been a while since I was knee-deep in that code. I recall the 
main issue being that neither the 'Configuration' nor the 'ColumnInfo' objects 
are serializable, hence the call to the encode / decode functions.
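
To illustrate the kind of alternative I have in mind, here is a minimal sketch 
(not the actual Phoenix code). It assumes a hypothetical 'ColumnMeta' class 
standing in for 'ColumnInfo' and a plain Map standing in for 'Configuration'; 
the key names are made up. Each field is stored under its own key, so no 
delimiter is needed and nothing has to be split afterwards:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ColumnEncodingSketch {
    // Hypothetical stand-in for a non-serializable column descriptor.
    static final class ColumnMeta {
        final String name;
        final int sqlType;

        ColumnMeta(String name, int sqlType) {
            this.name = name;
            this.sqlType = sqlType;
        }
    }

    // Hypothetical configuration keys, chosen only for this sketch.
    static final String COUNT_KEY = "sketch.column.count";
    static final String KEY_PREFIX = "sketch.column.";

    // Write each column field under its own key instead of one delimited string.
    static void encode(Map<String, String> conf, List<ColumnMeta> columns) {
        conf.put(COUNT_KEY, Integer.toString(columns.size()));
        for (int i = 0; i < columns.size(); i++) {
            ColumnMeta c = columns.get(i);
            conf.put(KEY_PREFIX + i + ".name", c.name);
            conf.put(KEY_PREFIX + i + ".type", Integer.toString(c.sqlType));
        }
    }

    // Rebuild the column list from the per-field keys written by encode().
    static List<ColumnMeta> decode(Map<String, String> conf) {
        int count = Integer.parseInt(conf.get(COUNT_KEY));
        List<ColumnMeta> columns = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            String name = conf.get(KEY_PREFIX + i + ".name");
            int sqlType = Integer.parseInt(conf.get(KEY_PREFIX + i + ".type"));
            columns.add(new ColumnMeta(name, sqlType));
        }
        return columns;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        List<ColumnMeta> in = new ArrayList<>();
        in.add(new ColumnMeta("ID", java.sql.Types.BIGINT));
        // A name containing ':' and ',' survives because nothing is split.
        in.add(new ColumnMeta("CF:COL,X", java.sql.Types.VARCHAR));
        encode(conf, in);
        for (ColumnMeta c : decode(conf)) {
            System.out.println(c.name + " -> " + c.sqlType);
        }
    }
}
```

Whether something like this fits the existing mapreduce / pig code paths is 
exactly the assumption that still needs checking.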

I haven't had a chance to really dive into it yet, but I suspect that these 
same issues are shared with MapReduce / Pig as well, since that's where most of 
the Spark code came from to begin with. I imagine there's probably a general 
solution that should work for all of them, but if these changes are already 
compatible with MapReduce and Pig, it should be possible to adjust the Spark 
integration accordingly.

I'm willing and able to help where needed. I'll try to take a deeper dive 
tomorrow to confirm my assumptions.

> Prevent splitting and recombining select expressions for MR integration
> -----------------------------------------------------------------------
>
>                 Key: PHOENIX-2088
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2088
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>            Assignee: maghamravikiran
>         Attachments: PHOENIX-2088-wip.patch
>
>
> We currently send in the select expressions for the MR integration as a 
> delimiter-separated string, split it on the delimiter, and then recombine 
> it using a comma separator. This is problematic because the delimiter 
> character may appear in a select expression, thus breaking this logic. 
> Instead, we should use a comma as the delimiter and avoid splitting and 
> recombining, since neither is necessary in that case: the entire string 
> can be used as-is to form the select expressions.
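
The failure mode described in the issue can be reproduced with a small 
standalone sketch. This is not the Phoenix code: the ':' delimiter and the 
method names are assumptions made purely for illustration. It compares the 
split-and-recombine round trip with joining on commas once:

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class SelectExpressionDemo {
    // Hypothetical delimiter, chosen only for illustration.
    static final String DELIMITER = ":";

    // Approach described as problematic: join with the delimiter,
    // split on it again, then recombine with commas.
    static String splitAndRecombine(List<String> expressions) {
        String encoded = String.join(DELIMITER, expressions);
        String[] parts = encoded.split(Pattern.quote(DELIMITER));
        return String.join(",", parts);
    }

    // Proposed approach: join with commas once and use the string as-is.
    static String joinOnce(List<String> expressions) {
        return String.join(",", expressions);
    }

    public static void main(String[] args) {
        // The second expression contains the delimiter character.
        List<String> exprs = Arrays.asList("ID", "TO_CHAR(TS, 'HH:mm')");
        System.out.println(splitAndRecombine(exprs)); // ID,TO_CHAR(TS, 'HH,mm')
        System.out.println(joinOnce(exprs));          // ID,TO_CHAR(TS, 'HH:mm')
    }
}
```

In the first result the format string has been corrupted because the ':' 
inside it was treated as a separator; the second keeps every expression 
intact.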



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
