[jira] [Updated] (SPARK-8669) Parquet 1.7 files that store binary enums crash when inferring schema

2015-06-27 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-8669: -- Assignee: Steven She Parquet 1.7 files that store binary enums crash when inferring schema

[jira] [Resolved] (SPARK-8379) LeaseExpiredException when using dynamic partition with speculative execution

2015-06-21 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian resolved SPARK-8379. --- Resolution: Fixed Fix Version/s: 1.4.1 1.5.0 Issue resolved by pull request

[jira] [Created] (SPARK-8508) Test case SQLQuerySuite.test script transform for stderr generates super long output

2015-06-21 Thread Cheng Lian (JIRA)
Cheng Lian created SPARK-8508: - Summary: Test case SQLQuerySuite.test script transform for stderr generates super long output Key: SPARK-8508 URL: https://issues.apache.org/jira/browse/SPARK-8508

[jira] [Updated] (SPARK-8461) ClassNotFoundException when code generation is enabled

2015-06-18 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-8461: -- Description: Build Spark with {{-Phive}}, then run the following Spark shell snippet: {code}

[jira] [Created] (SPARK-8461) ClassNotFoundException when code generation is enabled

2015-06-18 Thread Cheng Lian (JIRA)
Cheng Lian created SPARK-8461: - Summary: ClassNotFoundException when code generation is enabled Key: SPARK-8461 URL: https://issues.apache.org/jira/browse/SPARK-8461 Project: Spark Issue Type:

[jira] [Updated] (SPARK-8461) ClassNotFoundException when code generation is enabled

2015-06-18 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-8461: -- Description: Build Spark with {{-Phive}}, then run the following Spark shell snippet: {code}

[jira] [Updated] (SPARK-8461) ClassNotFoundException when code generation is enabled

2015-06-18 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-8461: -- Description: Build Spark without {{-Phive}} to make sure the isolated classloader for Hive support is

[jira] [Resolved] (SPARK-8139) Documents data sources and Parquet output committer related options

2015-06-23 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian resolved SPARK-8139. --- Resolution: Fixed Fix Version/s: 1.5.0 Issue resolved by pull request 6683

[jira] [Created] (SPARK-8580) Add Parquet files generated by different systems to test interoperability and compatibility

2015-06-23 Thread Cheng Lian (JIRA)
Cheng Lian created SPARK-8580: - Summary: Add Parquet files generated by different systems to test interoperability and compatibility Key: SPARK-8580 URL: https://issues.apache.org/jira/browse/SPARK-8580

[jira] [Created] (SPARK-8578) Should ignore user defined output committer when appending data

2015-06-23 Thread Cheng Lian (JIRA)
Cheng Lian created SPARK-8578: - Summary: Should ignore user defined output committer when appending data Key: SPARK-8578 URL: https://issues.apache.org/jira/browse/SPARK-8578 Project: Spark

[jira] [Updated] (SPARK-8578) Should ignore user defined output committer when appending data

2015-06-23 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-8578: -- Target Version/s: 1.4.1, 1.5.0 (was: 1.5.0) Should ignore user defined output committer when

[jira] [Commented] (SPARK-8311) saveAsTextFile with Hadoop1 could lead to errors

2015-06-12 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14583202#comment-14583202 ] Cheng Lian commented on SPARK-8311: --- Hi [~shivaram], what version of Hadoop 1 were you

[jira] [Resolved] (SPARK-6566) Update Spark to use the latest version of Parquet libraries

2015-06-11 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian resolved SPARK-6566. --- Resolution: Fixed Fix Version/s: 1.5.0 Issue resolved by pull request 5889

[jira] [Updated] (SPARK-8406) Race condition when writing Parquet files

2015-06-17 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-8406: -- Description: To support appending, the Parquet data source tries to find out the max part number of

[jira] [Updated] (SPARK-8406) Race condition when writing Parquet files

2015-06-17 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-8406: -- Description: To support appending, the Parquet data source tries to find out the max part number of

[jira] [Commented] (SPARK-8406) Race condition when writing Parquet files

2015-06-17 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14590494#comment-14590494 ] Cheng Lian commented on SPARK-8406: --- Yeah, just updated the JIRA description. ORC may

[jira] [Updated] (SPARK-8406) Race condition when writing Parquet files

2015-06-17 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-8406: -- Description: To support appending, the Parquet data source tries to find out the max part number of

[jira] [Updated] (SPARK-8406) Race condition when writing Parquet files

2015-06-17 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-8406: -- Description: To support appending, the Parquet data source tries to find out the max part number of

[jira] [Comment Edited] (SPARK-8406) Race condition when writing Parquet files

2015-06-17 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14589592#comment-14589592 ] Cheng Lian edited comment on SPARK-8406 at 6/17/15 6:07 PM: An

[jira] [Updated] (SPARK-8406) Race condition when writing Parquet files

2015-06-17 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-8406: -- Description: To support appending, the Parquet data source tries to find out the max part number of

[jira] [Updated] (SPARK-8406) Race condition when writing Parquet files

2015-06-17 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-8406: -- Description: To support appending, the Parquet data source tries to find out the max ID of part-files

[jira] [Created] (SPARK-8406) Race condition when writing Parquet files

2015-06-17 Thread Cheng Lian (JIRA)
Cheng Lian created SPARK-8406: - Summary: Race condition when writing Parquet files Key: SPARK-8406 URL: https://issues.apache.org/jira/browse/SPARK-8406 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-8406) Race condition when writing Parquet files

2015-06-17 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14589592#comment-14589592 ] Cheng Lian commented on SPARK-8406: --- An example task completion and scheduling order

[jira] [Created] (SPARK-8328) Add a CheckAnalysis rule to ensure that Union branches have the same schema

2015-06-12 Thread Cheng Lian (JIRA)
Cheng Lian created SPARK-8328: - Summary: Add a CheckAnalysis rule to ensure that Union branches have the same schema Key: SPARK-8328 URL: https://issues.apache.org/jira/browse/SPARK-8328 Project: Spark

[jira] [Updated] (SPARK-7939) Make URL partition recognition return String by default for all partition column types and values

2015-05-29 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-7939: -- Labels: 1.4.1 (was: ) Make URL partition recognition return String by default for all partition

[jira] [Commented] (SPARK-7939) Make URL partition recognition return String by default for all partition column types and values

2015-05-29 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14564330#comment-14564330 ] Cheng Lian commented on SPARK-7939: --- We can probably provide a data source option to

[jira] [Created] (SPARK-7950) HiveThriftServer2.startWithContext() doesn't set spark.sql.hive.version

2015-05-29 Thread Cheng Lian (JIRA)
Cheng Lian created SPARK-7950: - Summary: HiveThriftServer2.startWithContext() doesn't set spark.sql.hive.version Key: SPARK-7950 URL: https://issues.apache.org/jira/browse/SPARK-7950 Project: Spark

[jira] [Updated] (SPARK-7950) HiveThriftServer2.startWithContext() doesn't set spark.sql.hive.version

2015-05-29 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-7950: -- Description: While testing the newly released Simba Spark SQL ODBC driver 1.0.8.1006 against

[jira] [Updated] (SPARK-8014) DataFrame.write.mode(error).save(...) should not scan the output folder

2015-06-01 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-8014: -- Description: When saving a DataFrame with {{ErrorIfExists}} as save mode, we shouldn't do metadata

[jira] [Updated] (SPARK-8014) DataFrame.write.mode(error).save(...) should not scan the output folder

2015-06-01 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-8014: -- Priority: Major (was: Minor) DataFrame.write.mode(error).save(...) should not scan the output folder

[jira] [Assigned] (SPARK-8014) DataFrame.write.mode(error).save(...) should not scan the output folder

2015-06-01 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian reassigned SPARK-8014: - Assignee: Cheng Lian DataFrame.write.mode(error).save(...) should not scan the output folder

[jira] [Updated] (SPARK-8014) DataFrame.write.mode(error).save(...) should not scan the output folder

2015-06-01 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-8014: -- Description: When saving a DataFrame with {{ErrorIfExists}} as save mode, we shouldn't do metadata

[jira] [Updated] (SPARK-8014) DataFrame.write.mode(error).save(...) should not scan the output folder

2015-06-01 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-8014: -- Description: When saving a DataFrame with {{ErrorIfExists}} as save mode, we shouldn't do metadata

[jira] [Updated] (SPARK-7853) ClassNotFoundException for SparkSQL

2015-05-26 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-7853: -- Description: Reproduce steps: {code} bin/spark-sql --jars

[jira] [Commented] (SPARK-7853) ClassNotFoundException for SparkSQL

2015-05-26 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14560429#comment-14560429 ] Cheng Lian commented on SPARK-7853: --- OT: [~chenghao] Just edited the JIRA description.

[jira] [Updated] (SPARK-7819) Isolated Hive Client Loader appears to cause Native Library libMapRClient.4.0.2-mapr.so already loaded in another classloader error

2015-05-28 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-7819: -- Description: In reference to the pull request: https://github.com/apache/spark/pull/5876 I have been

[jira] [Commented] (SPARK-5327) HiveCompatibilitySuite fails when executed against Hive 0.12.0

2015-05-28 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562692#comment-14562692 ] Cheng Lian commented on SPARK-5327: --- Yeah, I think it's OK to resolve this now.

[jira] [Commented] (SPARK-4852) Hive query plan deserialization failure caused by shaded hive-exec jar file when generating golden answers

2015-05-28 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562713#comment-14562713 ] Cheng Lian commented on SPARK-4852: --- This issue should have been addressed by SPARK-6505

[jira] [Updated] (SPARK-7268) [Spark SQL] Throw 'Shutdown hooks cannot be modified during shutdown' on YARN

2015-05-28 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-7268: -- Description: {noformat} 15/04/30 08:26:32 INFO cluster.YarnClientSchedulerBackend: Interrupting monitor

[jira] [Resolved] (SPARK-7684) TestHive.reset complains Database does not exist: default

2015-05-28 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian resolved SPARK-7684. --- Resolution: Fixed Fix Version/s: 1.4.0 TestHive.reset complains Database does not exist:

[jira] [Updated] (SPARK-4176) Support decimals with precision 18 in Parquet

2015-05-28 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-4176: -- Target Version/s: 1.5.0 Affects Version/s: 1.4.0 1.3.0

[jira] [Commented] (SPARK-4176) Support decimals with precision 18 in Parquet

2015-05-28 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562690#comment-14562690 ] Cheng Lian commented on SPARK-4176: --- No, this is about the Parquet schema conversion

[jira] [Resolved] (SPARK-6432) Cannot load parquet data with partitions if not all partition columns match data columns

2015-05-28 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian resolved SPARK-6432. --- Resolution: Fixed Fix Version/s: 1.4.0 Cannot load parquet data with partitions if not all

[jira] [Updated] (SPARK-6432) Cannot load parquet data with partitions if not all partition columns match data columns

2015-05-28 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-6432: -- Target Version/s: 1.4.0 Assignee: Cheng Lian Cannot load parquet data with partitions if

[jira] [Commented] (SPARK-4852) Hive query plan deserialization failure caused by shaded hive-exec jar file when generating golden answers

2015-05-28 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562718#comment-14562718 ] Cheng Lian commented on SPARK-4852: --- I'm not sure how to fix this properly for golden

[jira] [Commented] (SPARK-2551) Cleanup FilteringParquetRowInputFormat

2015-05-28 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562708#comment-14562708 ] Cheng Lian commented on SPARK-2551: --- Yes, I'll try to address this issue while upgrading

[jira] [Assigned] (SPARK-7737) parquet schema discovery should not fail because of empty _temporary dir

2015-05-28 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian reassigned SPARK-7737: - Assignee: Cheng Lian (was: Yin Huai) parquet schema discovery should not fail because of empty

[jira] [Updated] (SPARK-7737) parquet schema discovery should not fail because of empty _temporary dir

2015-05-28 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-7737: -- Assignee: Yin Huai (was: Cheng Lian) parquet schema discovery should not fail because of empty

[jira] [Commented] (SPARK-6432) Cannot load parquet data with partitions if not all partition columns match data columns

2015-05-28 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562700#comment-14562700 ] Cheng Lian commented on SPARK-6432: --- This use case is covered by newly introduced

[jira] [Commented] (SPARK-7684) TestHive.reset complains Database does not exist: default

2015-05-28 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562702#comment-14562702 ] Cheng Lian commented on SPARK-7684: --- By workaround did you mean PR

[jira] [Updated] (SPARK-8014) DataFrame.write.mode(error).save(...) should not scan the output folder

2015-06-02 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-8014: -- Description: When saving a DataFrame with {{ErrorIfExists}} as save mode, we shouldn't do metadata

[jira] [Updated] (SPARK-8014) DataFrame.write.mode(error).save(...) should not scan the output folder

2015-06-02 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-8014: -- Description: When saving a DataFrame with {{ErrorIfExists}} as save mode, we shouldn't do metadata

[jira] [Resolved] (SPARK-8037) Ignores files whose name starts with . while enumerating files in HadoopFsRelation

2015-06-02 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian resolved SPARK-8037. --- Resolution: Fixed Fix Version/s: 1.4.0 Ignores files whose name starts with . while

[jira] [Updated] (SPARK-8014) DataFrame.write.mode(error).save(...) should not scan the output folder

2015-06-01 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-8014: -- Target Version/s: 1.4.1, 1.5.0 DataFrame.write.mode(error).save(...) should not scan the output folder

[jira] [Created] (SPARK-8031) Version number written to Hive metastore is 0.13.1aa instead of 0.13.1a

2015-06-02 Thread Cheng Lian (JIRA)
Cheng Lian created SPARK-8031: - Summary: Version number written to Hive metastore is 0.13.1aa instead of 0.13.1a Key: SPARK-8031 URL: https://issues.apache.org/jira/browse/SPARK-8031 Project: Spark

[jira] [Updated] (SPARK-8811) Read array struct data from parquet error

2015-07-03 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-8811: -- Shepherd: Cheng Lian Read array struct data from parquet error

[jira] [Commented] (SPARK-8811) Read array struct data from parquet error

2015-07-04 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14613594#comment-14613594 ] Cheng Lian commented on SPARK-8811: --- This is actually a known Parquet compatibility

[jira] [Created] (SPARK-8824) Support Parquet logical types TIMESTAMP_MILLIS and TIMESTAMP_MICROS

2015-07-04 Thread Cheng Lian (JIRA)
Cheng Lian created SPARK-8824: - Summary: Support Parquet logical types TIMESTAMP_MILLIS and TIMESTAMP_MICROS Key: SPARK-8824 URL: https://issues.apache.org/jira/browse/SPARK-8824 Project: Spark

[jira] [Reopened] (SPARK-7845) Bump Hadoop 1 tests to version 1.2.1

2015-06-27 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian reopened SPARK-7845: --- [PR #5694|https://github.com/apache/spark/pull/5694] reverted [PR

[jira] [Updated] (SPARK-8672) throws NPE when running spark sql thrift server with session state authenticator

2015-06-27 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-8672: -- Shepherd: Cheng Lian Environment: Hive 0.13.1 throws NPE when running spark sql thrift server

[jira] [Updated] (SPARK-6774) Implement Parquet complex types backwards-compatiblity rules

2015-07-04 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-6774: -- Shepherd: Cheng Lian Implement Parquet complex types backwards-compatiblity rules

[jira] [Commented] (SPARK-8501) ORC data source may give empty schema if an ORC file containing zero rows is picked for schema discovery

2015-07-02 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14612646#comment-14612646 ] Cheng Lian commented on SPARK-8501: --- Exactly. Please see my PR description here

[jira] [Resolved] (SPARK-8501) ORC data source may give empty schema if an ORC file containing zero rows is picked for schema discovery

2015-07-02 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian resolved SPARK-8501. --- Resolution: Fixed Fixed by https://github.com/apache/spark/pull/7199 Backported to 1.4.1 by

[jira] [Resolved] (SPARK-8690) Add a setting to disable SparkSQL parquet schema merge by using datasource API

2015-07-02 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian resolved SPARK-8690. --- Resolution: Fixed Fix Version/s: 1.5.0 Issue resolved by pull request 7070

[jira] [Resolved] (SPARK-5508) Arrays and Maps stored with Hive Parquet Serde may not be able to read by the Parquet support in the Data Souce API

2015-07-06 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian resolved SPARK-5508. --- Resolution: Duplicate Assignee: Cheng Lian This is a Parquet compatibility/interoperability

[jira] [Updated] (SPARK-6774) Implement Parquet complex types backwards-compatiblity rules

2015-07-06 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-6774: -- Description: [Parquet format PR #17|https://github.com/apache/incubator-parquet-format/pull/17]

[jira] [Resolved] (SPARK-8811) Read array struct data from parquet error

2015-07-06 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian resolved SPARK-8811. --- Resolution: Duplicate This is a Parquet compatibility/interoperability issue. PR

[jira] [Assigned] (SPARK-8811) Read array struct data from parquet error

2015-07-06 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian reassigned SPARK-8811: - Assignee: Cheng Lian Read array struct data from parquet error

[jira] [Created] (SPARK-8848) Write Parquet LISTs and MAPs conforming to Parquet format spec

2015-07-06 Thread Cheng Lian (JIRA)
Cheng Lian created SPARK-8848: - Summary: Write Parquet LISTs and MAPs conforming to Parquet format spec Key: SPARK-8848 URL: https://issues.apache.org/jira/browse/SPARK-8848 Project: Spark

[jira] [Updated] (SPARK-8690) Add a setting to disable SparkSQL parquet schema merge by using datasource API

2015-07-06 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-8690: -- Assignee: thegiive Add a setting to disable SparkSQL parquet schema merge by using datasource API

[jira] [Updated] (SPARK-8841) Fix partition pruning percentage log message

2015-07-06 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-8841: -- Assignee: Steve Lindemann Fix partition pruning percentage log message

[jira] [Updated] (SPARK-8669) Parquet 1.7 files that store binary enums crash when inferring schema

2015-06-29 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-8669: -- Target Version/s: 1.5.0 Parquet 1.7 files that store binary enums crash when inferring schema

[jira] [Updated] (SPARK-8692) re-order the case statements that handling catalyst data types

2015-06-28 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-8692: -- Shepherd: Cheng Lian re-order the case statements that handling catalyst data types

[jira] [Resolved] (SPARK-8692) re-order the case statements that handling catalyst data types

2015-06-29 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian resolved SPARK-8692. --- Resolution: Fixed Fix Version/s: 1.5.0 Issue resolved by pull request 7073

[jira] [Updated] (SPARK-8501) ORC data source may give empty schema if an ORC file containing zero rows is picked for schema discovery

2015-07-02 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-8501: -- Target Version/s: 1.4.1, 1.5.0 (was: 1.5.0, 1.4.2) ORC data source may give empty schema if an ORC

[jira] [Created] (SPARK-7847) Fix dynamic partition path escaping

2015-05-24 Thread Cheng Lian (JIRA)
Cheng Lian created SPARK-7847: - Summary: Fix dynamic partition path escaping Key: SPARK-7847 URL: https://issues.apache.org/jira/browse/SPARK-7847 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-7849) Update Spark SQL Hive support documentation for 1.4

2015-05-24 Thread Cheng Lian (JIRA)
Cheng Lian created SPARK-7849: - Summary: Update Spark SQL Hive support documentation for 1.4 Key: SPARK-7849 URL: https://issues.apache.org/jira/browse/SPARK-7849 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-7842) For InsertIntoHadoopFsRelation, if an exception is thrown while committing a task, the task is not aborted

2015-05-25 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian resolved SPARK-7842. --- Resolution: Fixed Fix Version/s: 1.4.0 Issue resolved by pull request 6378

[jira] [Resolved] (SPARK-7684) TestHive.reset complains Database does not exist: default

2015-05-25 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian resolved SPARK-7684. --- Resolution: Fixed Fix Version/s: 1.4.0 Issue resolved by pull request 6359

[jira] [Created] (SPARK-7868) Ignores _temporary directories while listing files in HadoopFsRelation

2015-05-26 Thread Cheng Lian (JIRA)
Cheng Lian created SPARK-7868: - Summary: Ignores _temporary directories while listing files in HadoopFsRelation Key: SPARK-7868 URL: https://issues.apache.org/jira/browse/SPARK-7868 Project: Spark

[jira] [Created] (SPARK-7842) For InsertIntoHadoopFsRelation, if an exception is thrown while committing a task, the task is not aborted

2015-05-23 Thread Cheng Lian (JIRA)
Cheng Lian created SPARK-7842: - Summary: For InsertIntoHadoopFsRelation, if an exception is thrown while committing a task, the task is not aborted Key: SPARK-7842 URL:

[jira] [Updated] (SPARK-7119) ScriptTransform doesn't consider the output data type

2015-05-23 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-7119: -- Description: {code:sql} from (from src select transform(key, value) using 'cat' as (thing1 int, thing2

[jira] [Updated] (SPARK-7616) Column order can be corrupted when saving DataFrame as a partitioned table

2015-05-21 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-7616: -- Description: When saved as a partitioned table, partition columns of a DataFrame are appended after

[jira] [Updated] (SPARK-7616) Column order can be corrupted when saving DataFrame as a partitioned table

2015-05-21 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-7616: -- Summary: Column order can be corrupted when saving DataFrame as a partitioned table (was: Overwriting

[jira] [Resolved] (SPARK-7737) parquet schema discovery should not fail because of empty _temporary dir

2015-05-21 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian resolved SPARK-7737. --- Resolution: Fixed Fix Version/s: 1.4.0 Issue resolved by pull request 6329

[jira] [Commented] (SPARK-7684) TestHive.reset complains Database does not exist: default

2015-05-22 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14556550#comment-14556550 ] Cheng Lian commented on SPARK-7684: --- A minimum test suite that reproduces this issue:

[jira] [Updated] (SPARK-8839) Thrift Sever will throw `java.util.NoSuchElementException: key not found` exception when many clients connect it

2015-07-07 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-8839: -- Shepherd: Yi Tian Thrift Server will throw `java.util.NoSuchElementException: key not found`

[jira] [Updated] (SPARK-9407) Parquet shouldn't fail when pushing down predicates over a column whose underlying Parquet type is an ENUM

2015-08-11 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-9407: -- Summary: Parquet shouldn't fail when pushing down predicates over a column whose underlying Parquet

[jira] [Commented] (SPARK-9600) DataFrameWriter.saveAsTable always writes data to /user/hive/warehouse

2015-08-08 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14662894#comment-14662894 ] Cheng Lian commented on SPARK-9600: --- [~sthotaibeam] Thanks for investigating this issue.

[jira] [Updated] (SPARK-9600) DataFrameWriter.saveAsTable always writes data to /user/hive/warehouse

2015-08-08 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-9600: -- Description: Get a clean Spark 1.4.1 build: {noformat} $ git checkout v1.4.1 $ ./build/sbt -Phive

[jira] [Commented] (SPARK-9701) allow not automatically using HiveContext with spark-shell when hive support built in

2015-08-08 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14662895#comment-14662895 ] Cheng Lian commented on SPARK-9701: --- I targeted it to 1.5.0. If we can't make it, we can

[jira] [Updated] (SPARK-9701) allow not automatically using HiveContext with spark-shell when hive support built in

2015-08-08 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-9701: -- Target Version/s: 1.5.0 allow not automatically using HiveContext with spark-shell when hive support

[jira] [Updated] (SPARK-9182) filter and groupBy on DataFrames are not passed through to jdbc source

2015-08-08 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-9182: -- Assignee: Yijie Shen (was: Cheng Lian) filter and groupBy on DataFrames are not passed through to

[jira] [Commented] (SPARK-9182) filter and groupBy on DataFrames are not passed through to jdbc source

2015-08-06 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660278#comment-14660278 ] Cheng Lian commented on SPARK-9182: --- Hey [~grahn], sorry for the late reply, I somehow

[jira] [Updated] (SPARK-6795) Avoid reading Parquet footers on driver side when an global arbitrative schema is available

2015-08-13 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-6795: -- Fix Version/s: 1.5.0 Avoid reading Parquet footers on driver side when an global arbitrative schema

[jira] [Updated] (SPARK-6795) Avoid reading Parquet footers on driver side when an global arbitrative schema is available

2015-08-13 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-6795: -- Target Version/s: 1.5.0 (was: 1.6.0) Avoid reading Parquet footers on driver side when an global

[jira] [Updated] (SPARK-9757) Can't create persistent data source tables with decimal

2015-08-13 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-9757: -- Description: {{ParquetHiveSerDe}} in Hive versions 1.2.0 doesn't support decimal. Persisting Parquet

[jira] [Resolved] (SPARK-9885) IsolatedClientLoader ignores shared prefixes and barrier prefixes when spark.sql.hive.metastore.jars is set to maven

2015-08-13 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian resolved SPARK-9885. --- Resolution: Fixed Fix Version/s: 1.5.0 Issue resolved by pull request 8158

[jira] [Assigned] (SPARK-9757) Can't create persistent data source tables with decimal

2015-08-13 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian reassigned SPARK-9757: - Assignee: Cheng Lian Can't create persistent data source tables with decimal

[jira] [Resolved] (SPARK-9757) Can't create persistent data source tables with decimal

2015-08-13 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian resolved SPARK-9757. --- Resolution: Fixed Fix Version/s: 1.5.0 Issue resolved by pull request 8130
