[jira] [Updated] (PHOENIX-2088) Prevent splitting and recombining select expressions for MR integration

Thomas D'Silva (JIRA) Tue, 21 Jul 2015 13:40:47 -0700

     [ 
https://issues.apache.org/jira/browse/PHOENIX-2088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Thomas D'Silva updated PHOENIX-2088:
------------------------------------
    Attachment: PHOENIX-2088-4.4-HBase-0.98-v2.patch

After talking offline with [~jamestaylor], we decided to bring back 
ColumnInfoToStringEncoderDecoder and change it to serialize each ColumnInfo 
as its own config value. I have uploaded a patch with my latest changes.
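
To illustrate the "one config value per ColumnInfo" idea, here is a minimal 
sketch. It is not the code in the patch: a plain Map stands in for the Hadoop 
Configuration, and the key names, the Column class and the "type:name" value 
format are all assumptions made for the example. The point is only that no 
list-level delimiter is needed, so column names may contain any character.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ColumnInfoConfigSketch {
    // Hypothetical key names, not Phoenix's actual configuration keys.
    static final String COUNT_KEY = "sketch.column.info.count";
    static final String KEY_PREFIX = "sketch.column.info.";

    /** Hypothetical stand-in for a column description: name plus SQL type name. */
    static final class Column {
        final String name;
        final String sqlType;

        Column(String name, String sqlType) {
            this.name = name;
            this.sqlType = sqlType;
        }
    }

    /** Stores the column count, then one entry per column under an indexed key. */
    static void encode(Map<String, String> conf, List<Column> columns) {
        conf.put(COUNT_KEY, Integer.toString(columns.size()));
        for (int i = 0; i < columns.size(); i++) {
            Column c = columns.get(i);
            // The type comes first so the name may itself contain ':'.
            conf.put(KEY_PREFIX + i, c.sqlType + ":" + c.name);
        }
    }

    /** Reads the columns back in their original order. */
    static List<Column> decode(Map<String, String> conf) {
        int count = Integer.parseInt(conf.getOrDefault(COUNT_KEY, "0"));
        List<Column> columns = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            String value = conf.get(KEY_PREFIX + i);
            int sep = value.indexOf(':');
            columns.add(new Column(value.substring(sep + 1), value.substring(0, sep)));
        }
        return columns;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        List<Column> columns = new ArrayList<>();
        columns.add(new Column("ID", "BIGINT"));
        columns.add(new Column("odd,name|x", "VARCHAR"));
        encode(conf, columns);
        for (Column c : decode(conf)) {
            System.out.println(c.sqlType + " " + c.name);
        }
    }
}
```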

[~jmahonin] I refactored the code, and there is now a new method, 
setPhysicalTableName, that IndexTool uses to create local indexes. For Spark, 
setPhysicalTableName can be called with the same input as setOutputTableName. 
The Spark tests are failing because of the way ColumnInfoToStringEncoderDecoder 
serializes the column info. Any pointers on how to fix this?

 Cause: java.io.NotSerializableException: org.apache.hadoop.conf.Configuration
Serialization stack:
        - object not serializable (class: org.apache.hadoop.conf.Configuration, 
value: Configuration: core-default.xml, core-site.xml, mapred-default.xml, 
mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, 
hdfs-site.xml, hbase-default.xml, hbase-site.xml)
        - field (class: org.apache.phoenix.spark.DataFrameFunctions$$anonfun$1, 
name: config$1, type: class org.apache.hadoop.conf.Configuration)
        - object (class org.apache.phoenix.spark.DataFrameFunctions$$anonfun$1, 
<function1>)
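
The trace above says that a closure captured a Configuration, which is not 
java.io.Serializable. The sketch below reproduces that class of failure with 
JDK types only and shows one common workaround: capture a serializable copy of 
the entries and rebuild the settings object where the task runs. 
FakeConfiguration, CapturingTask and CopyingTask are hypothetical stand-ins, 
not Phoenix or Spark classes, and this is not necessarily the fix the Spark 
module needs.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

class ClosureSerializationSketch {
    /** Hypothetical stand-in for a settings holder that is not Serializable. */
    static final class FakeConfiguration {
        final Map<String, String> entries = new HashMap<>();
    }

    /** Captures the settings object directly, so it cannot be serialized. */
    static final class CapturingTask implements Serializable {
        final FakeConfiguration config;

        CapturingTask(FakeConfiguration config) {
            this.config = config;
        }
    }

    /** Captures only a serializable copy of the entries. */
    static final class CopyingTask implements Serializable {
        final HashMap<String, String> entries;

        CopyingTask(FakeConfiguration config) {
            this.entries = new HashMap<>(config.entries);
        }

        /** Rebuilds a settings object on the receiving side. */
        FakeConfiguration rebuild() {
            FakeConfiguration config = new FakeConfiguration();
            config.entries.putAll(entries);
            return config;
        }
    }

    /** Returns true if Java serialization accepts the object graph. */
    static boolean isSerializable(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        FakeConfiguration config = new FakeConfiguration();
        config.entries.put("sketch.key", "value");
        System.out.println("capturing task: " + isSerializable(new CapturingTask(config)));
        System.out.println("copying task: " + isSerializable(new CopyingTask(config)));
    }
}
```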


[[email protected]] I have fixed the Pig test failure.

> Prevent splitting and recombining select expressions for MR integration
> -----------------------------------------------------------------------
>
>                 Key: PHOENIX-2088
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2088
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>            Assignee: Thomas D'Silva
>         Attachments: PHOENIX-2088-4.4-HBase-0.98-v2.patch, 
> PHOENIX-2088-4.4-HBase-0.98.patch, PHOENIX-2088-pig.patch, 
> PHOENIX-2088-wip-v2.patch, PHOENIX-2088-wip-v3.patch, PHOENIX-2088-wip.patch
>
>
> We currently pass the select expressions for the MR integration as a 
> delimiter-separated string, split it on the delimiter, and then recombine 
> the pieces using a comma separator. This is problematic because the delimiter 
> character may appear in a select expression, which breaks this logic. 
> Instead, we should use a comma as the delimiter and avoid splitting and 
> recombining, since neither is necessary in that case: the entire string 
> can be used as-is to form the select expressions.
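
The problem described in the issue can be reproduced with a small sketch. The 
'|' delimiter, the table name T and both method names are assumptions chosen 
for illustration, not the values or methods Phoenix uses. An expression such 
as FIRST || LAST contains the delimiter, so splitting and recombining corrupts 
it, while a comma-separated string passed through unchanged does not.

```java
import java.util.regex.Pattern;

class SelectExpressionSketch {
    /** Splits on the given delimiter and rejoins with commas (the fragile approach). */
    static String splitAndRecombine(String expressions, String delimiter) {
        String[] parts = expressions.split(Pattern.quote(delimiter));
        return String.join(",", parts);
    }

    /** Uses the comma-separated expressions as-is, without splitting them. */
    static String buildSelect(String commaSeparatedExpressions, String table) {
        return "SELECT " + commaSeparatedExpressions + " FROM " + table;
    }

    public static void main(String[] args) {
        // The concatenation operator collides with the hypothetical '|' delimiter.
        String recombined = splitAndRecombine("NAME|FIRST || LAST", "|");
        System.out.println(buildSelect(recombined, "T"));

        // Passing the expressions through untouched keeps them intact.
        System.out.println(buildSelect("NAME,FIRST || LAST", "T"));
    }
}
```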



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
