[jira] [Updated] (SPARK-6607) Aggregation attribute name including special chars '(' and ')' should be replaced before generating Parquet schema

2015-03-30 Thread Cheng Lian (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-6607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cheng Lian updated SPARK-6607:
--
Assignee: Liang-Chi Hsieh

 Aggregation attribute name including special chars '(' and ')' should be 
 replaced before generating Parquet schema
 --

 Key: SPARK-6607
 URL: https://issues.apache.org/jira/browse/SPARK-6607
 Project: Spark
  Issue Type: Bug
  Components: SQL
Reporter: Liang-Chi Hsieh
Assignee: Liang-Chi Hsieh

 '(' and ')' are special characters in the Parquet schema syntax, where they 
 are used for type annotations. An aggregation query produces attribute names 
 such as MAX(a).
 If the resulting DataFrame is saved directly as a Parquet file, reading it 
 back fails while parsing the stored schema string.
 There are several ways to fix this. This PR takes the simplest one: it 
 replaces the special characters in attribute names before the Parquet schema 
 is generated from those attributes.
 Another option would be to change all aggregation expression names from 
 func(column) to func[column].
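
 The two options described above can be sketched in a few lines of plain 
 Python. This is only an illustration, not the code from the PR: the function 
 names and the choice of '_' as the replacement character are assumptions made 
 for this example, and no Spark or Parquet API is involved.

```python
import re

# Characters treated as special in this sketch. The issue names only '(' and ')';
# a real fix may need to cover more of the schema syntax.
_SPECIAL_CHARS = re.compile(r"[()]")


def sanitize_attribute_name(name: str, replacement: str = "_") -> str:
    """Option 1: replace '(' and ')' before the name goes into a schema string.

    The replacement character '_' is a hypothetical choice for illustration.
    """
    return _SPECIAL_CHARS.sub(replacement, name)


def bracket_aggregate_name(name: str) -> str:
    """Option 2: rewrite func(column) as func[column]."""
    return name.replace("(", "[").replace(")", "]")


if __name__ == "__main__":
    print(sanitize_attribute_name("MAX(a)"))  # MAX_a_
    print(bracket_aggregate_name("MAX(a)"))   # MAX[a]
```

 One trade-off of the first option in this sketch is that it is not 
 reversible: MAX(a) and a column literally named MAX_a_ would map to the same 
 name, so a real implementation would likely need to guard against collisions.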



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-6607) Aggregation attribute name including special chars '(' and ')' should be replaced before generating Parquet schema

2015-03-30 Thread Cheng Lian (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-6607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cheng Lian updated SPARK-6607:
--
 Target Version/s: 1.4.0
Affects Version/s: 1.1.1, 1.2.1, 1.3.0

 Aggregation attribute name including special chars '(' and ')' should be 
 replaced before generating Parquet schema
 --

 Key: SPARK-6607
 URL: https://issues.apache.org/jira/browse/SPARK-6607
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.1.1, 1.2.1, 1.3.0
Reporter: Liang-Chi Hsieh
Assignee: Liang-Chi Hsieh



