[jira] [Commented] (SPARK-6795) Avoid reading Parquet footers on driver side when an global arbitrative schema is available

2015-08-13 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14694899#comment-14694899 ] Cheng Lian commented on SPARK-6795: --- As explained on GitHub, usually we only backport

[jira] [Resolved] (SPARK-9927) Revert fix of 9182 since it's pushing the wrong filter down

2015-08-12 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian resolved SPARK-9927. --- Resolution: Fixed Fix Version/s: 1.5.0 Issue resolved by pull request 8157

[jira] [Updated] (SPARK-9182) filter and groupBy on DataFrames are not passed through to jdbc source

2015-08-12 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-9182: -- Priority: Blocker (was: Critical) filter and groupBy on DataFrames are not passed through to jdbc

[jira] [Commented] (SPARK-9927) Revert fix of 9182 since it's pushing the wrong filter down

2015-08-12 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14694684#comment-14694684 ] Cheng Lian commented on SPARK-9927: --- Could you provide a snippet that reproduces this

[jira] [Reopened] (SPARK-9182) filter and groupBy on DataFrames are not passed through to jdbc source

2015-08-12 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian reopened SPARK-9182: --- Found a regression in https://github.com/apache/spark/pull/8049 and reverted it via

[jira] [Commented] (SPARK-9182) filter and groupBy on DataFrames are not passed through to jdbc source

2015-08-12 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14694739#comment-14694739 ] Cheng Lian commented on SPARK-9182: --- [~grahn] Unfortunately we found a regression in the

[jira] [Updated] (SPARK-10035) Parquet filters does not process EqualNullSafe filter.

2015-08-17 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-10035: --- Assignee: Hyukjin Kwon Parquet filters does not process EqualNullSafe filter.

[jira] [Commented] (SPARK-10035) Parquet filters does not process EqualNullSafe filter.

2015-08-17 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699316#comment-14699316 ] Cheng Lian commented on SPARK-10035: Done, thanks for working on this! Parquet

[jira] [Commented] (SPARK-10030) Managed memory leak detected when cache table

2015-08-17 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699372#comment-14699372 ] Cheng Lian commented on SPARK-10030: [~joshrosen] Seems to be related to Tungsten?

[jira] [Updated] (SPARK-9973) Wrong initial size of in-memory columnar buffers

2015-08-16 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-9973: -- Summary: Wrong initial size of in-memory columnar buffers (was: wrong buffle size) Wrong initial

[jira] [Updated] (SPARK-9973) wrong buffle size

2015-08-16 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-9973: -- Assignee: xukun wrong buffle size - Key: SPARK-9973

[jira] [Updated] (SPARK-9973) Wrong initial size of in-memory columnar buffers

2015-08-16 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-9973: -- Shepherd: Cheng Lian Sprint: Spark 1.5 doc/QA sprint Affects Version/s:

[jira] [Commented] (SPARK-9973) Wrong initial size of in-memory columnar buffers

2015-08-16 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14698553#comment-14698553 ] Cheng Lian commented on SPARK-9973: --- I've updated the title and description. Wrong

[jira] [Updated] (SPARK-10005) Parquet reader doesn't handle schema merging properly for nested structs

2015-08-16 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-10005: --- Description: Spark shell snippet to reproduce this issue (note that both {{DataFrame}} written

[jira] [Resolved] (SPARK-9973) Wrong initial size of in-memory columnar buffers

2015-08-16 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian resolved SPARK-9973. --- Resolution: Fixed Resolved by https://github.com/apache/spark/pull/8189 Wrong initial size of

[jira] [Updated] (SPARK-9973) Wrong initial size of in-memory columnar buffers

2015-08-16 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-9973: -- Fix Version/s: 1.5.0 Wrong initial size of in-memory columnar buffers

[jira] [Commented] (SPARK-9627) SQL job failed if the dataframe with string columns is cached

2015-08-18 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14701167#comment-14701167 ] Cheng Lian commented on SPARK-9627: --- [~davies] I tried to reproduce this issue locally

[jira] [Resolved] (SPARK-8118) Turn off noisy log output produced by Parquet 1.7.0

2015-08-18 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian resolved SPARK-8118. --- Resolution: Fixed Issue resolved by pull request 8196 [https://github.com/apache/spark/pull/8196]

[jira] [Commented] (SPARK-7837) NPE when save as parquet in speculative tasks

2015-08-16 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14698767#comment-14698767 ] Cheng Lian commented on SPARK-7837: --- Just a note to people who want to reproduce this

[jira] [Resolved] (SPARK-7837) NPE when save as parquet in speculative tasks

2015-08-17 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian resolved SPARK-7837. --- Resolution: Fixed Fix Version/s: 1.5.0 Issue resolved by pull request 8236

[jira] [Updated] (SPARK-9899) JSON/Parquet writing on retry or speculation broken with direct output committer

2015-08-19 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-9899: -- Description: If the first task fails all subsequent tasks will. We probably need to set a different

[jira] [Updated] (SPARK-9899) JSON/Parquet writing on retry or speculation broken with direct output committer

2015-08-19 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-9899: -- Description: If the first task fails all subsequent tasks will. We probably need to set a different

[jira] [Updated] (SPARK-9899) JSON/Parquet writing on retry or speculation broken with direct output committer

2015-08-19 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-9899: -- Description: If the first task fails all subsequent tasks will. We probably need to set a different

[jira] [Updated] (SPARK-9899) JSON/Parquet writing on retry or speculation broken with direct output committer

2015-08-19 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-9899: -- Description: If the first task fails all subsequent tasks will. We probably need to set a different

[jira] [Resolved] (SPARK-9606) HiveThriftServer tests failing.

2015-08-18 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian resolved SPARK-9606. --- Resolution: Fixed Fix Version/s: 1.5.0 Fixed by SPARK-9939 HiveThriftServer tests failing.

[jira] [Resolved] (SPARK-9939) Resort to Java process API in test suites forking subprocesses

2015-08-18 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian resolved SPARK-9939. --- Resolution: Fixed Fix Version/s: 1.5.0 Issue resolved by pull request 8168

[jira] [Resolved] (SPARK-10035) Parquet filters does not process EqualNullSafe filter.

2015-08-19 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian resolved SPARK-10035. Resolution: Fixed Fix Version/s: 1.6.0 Issue resolved by pull request 8275

[jira] [Updated] (SPARK-10035) Parquet filters does not process EqualNullSafe filter.

2015-08-19 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-10035: --- Shepherd: Cheng Lian Affects Version/s: 1.5.0 1.4.1 Parquet

[jira] [Updated] (SPARK-9735) Auto infer partition schema of HadoopFsRelation should should respected the user specified one

2015-08-22 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-9735: -- Assignee: Cheng Hao (was: Cheng Lian) Auto infer partition schema of HadoopFsRelation should should

[jira] [Updated] (SPARK-9735) Auto infer partition schema of HadoopFsRelation should should respected the user specified one

2015-08-22 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-9735: -- Shepherd: Cheng Lian Auto infer partition schema of HadoopFsRelation should should respected the

[jira] [Updated] (SPARK-10136) Parquet support fail to decode Avro/Thrift arrays of primitive array (e.g. array<array<int>>)

2015-08-21 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-10136: --- Summary: Parquet support fail to decode Avro/Thrift arrays of primitive array (e.g. array<array<int>>)

[jira] [Updated] (SPARK-10177) Parquet support interprets timestamp values differently from Hive 0.14.0+

2015-08-24 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-10177: --- Description: Running the following SQL under Hive 0.14.0+ (tested against 0.14.0 and 1.2.1):

[jira] [Commented] (SPARK-10177) Parquet support interprets timestamp values differently from Hive 0.14.0+

2015-08-24 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14709430#comment-14709430 ] Cheng Lian commented on SPARK-10177: I'm not quite familiar with Julian date, but the

[jira] [Updated] (SPARK-10177) Parquet support interprets timestamp values differently from Hive 0.14.0+

2015-08-24 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-10177: --- Description: Running the following SQL under Hive 0.14.0+ (tested against 0.14.0 and 1.2.1):

[jira] [Resolved] (SPARK-10092) Multi-DB support follow up

2015-08-20 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian resolved SPARK-10092. Resolution: Fixed Fix Version/s: 1.5.0 Issue resolved by pull request 8336

[jira] [Updated] (SPARK-10136) Parquet support fail to decode Avro arrays of primitive array (e.g. array<array<int>>)

2015-08-20 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-10136: --- Description: The following Avro schema {noformat} record AvroNonNullableArrays { array<array<int>>

[jira] [Resolved] (SPARK-9600) DataFrameWriter.saveAsTable always writes data to /user/hive/warehouse

2015-08-19 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian resolved SPARK-9600. --- Resolution: Not A Problem DataFrameWriter.saveAsTable always writes data to /user/hive/warehouse

[jira] [Commented] (SPARK-9600) DataFrameWriter.saveAsTable always writes data to /user/hive/warehouse

2015-08-19 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14702747#comment-14702747 ] Cheng Lian commented on SPARK-9600: --- [~sthotaibeam] Sorry for my late reply. With the

[jira] [Commented] (SPARK-9627) SQL job failed if the dataframe with string columns is cached

2015-08-19 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14702661#comment-14702661 ] Cheng Lian commented on SPARK-9627: --- OK I finally reproduced this issue. The tricky part

[jira] [Commented] (SPARK-9627) SQL job failed if the dataframe with string columns is cached

2015-08-19 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14702671#comment-14702671 ] Cheng Lian commented on SPARK-9627: --- A quick Googling suggesting that it's probably

[jira] [Commented] (SPARK-7837) NPE when save as parquet in speculative tasks

2015-08-16 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699036#comment-14699036 ] Cheng Lian commented on SPARK-7837: --- Good job! NPE when save as parquet in speculative

[jira] [Created] (SPARK-10177) Parquet support interpret timestamp values differently from Hive 0.14.0+

2015-08-23 Thread Cheng Lian (JIRA)
Cheng Lian created SPARK-10177: -- Summary: Parquet support interpret timestamp values differently from Hive 0.14.0+ Key: SPARK-10177 URL: https://issues.apache.org/jira/browse/SPARK-10177 Project: Spark

[jira] [Commented] (SPARK-10177) Parquet support interprets timestamp values differently from Hive 0.14.0+

2015-08-24 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14708957#comment-14708957 ] Cheng Lian commented on SPARK-10177: [~davies] I'm not sure whether this is a

[jira] [Updated] (SPARK-10177) Parquet support interprets timestamp values differently from Hive 0.14.0+

2015-08-24 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-10177: --- Description: Running the following SQL under Hive 0.14.0+ (tested against 0.14.0 and 1.2.1):

[jira] [Updated] (SPARK-10177) Parquet support interprets timestamp values differently from Hive 0.14.0+

2015-08-24 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-10177: --- Description: Running the following SQL under Hive 0.14.0+ (tested against 0.14.0 and 1.2.1):

[jira] [Updated] (SPARK-10177) Parquet support interprets timestamp values differently from Hive 0.14.0+

2015-08-24 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-10177: --- Description: Running the following SQL under Hive 0.14.0+ (tested against 0.14.0 and 1.2.1):

[jira] [Updated] (SPARK-10177) Parquet support interprets timestamp values differently from Hive 0.14.0+

2015-08-24 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-10177: --- Attachment: 00_0 Attached the Parquet file generated by the Hive 0.14.0 SQL statement mentioned

[jira] [Updated] (SPARK-8580) Test Parquet interoperability and compatibility with other libraries/systems

2015-08-24 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-8580: -- Summary: Test Parquet interoperability and compatibility with other libraries/systems (was: Add

[jira] [Updated] (SPARK-10177) Parquet support interprets timestamp values differently from Hive 0.14.0+

2015-08-24 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-10177: --- Summary: Parquet support interprets timestamp values differently from Hive 0.14.0+ (was: Parquet

[jira] [Comment Edited] (SPARK-10177) Parquet support interprets timestamp values differently from Hive 0.14.0+

2015-08-24 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14708957#comment-14708957 ] Cheng Lian edited comment on SPARK-10177 at 8/24/15 8:41 AM: -

[jira] [Commented] (SPARK-10177) Parquet support interprets timestamp values differently from Hive 0.14.0+

2015-08-24 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14708968#comment-14708968 ] Cheng Lian commented on SPARK-10177: Ah, didn't realize that the

[jira] [Updated] (SPARK-10136) Parquet support fail to decode Avro arrays of primitive array (e.g. array<array<int>>)

2015-08-20 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-10136: --- Description: The following Avro schema {noformat} record AvroNonNullableArrays { array<array<int>>

[jira] [Created] (SPARK-10136) Parquet support fail to decode Avro arrays of primitive array (e.g. array<array<int>>)

2015-08-20 Thread Cheng Lian (JIRA)
Cheng Lian created SPARK-10136: -- Summary: Parquet support fail to decode Avro arrays of primitive array (e.g. array<array<int>>) Key: SPARK-10136 URL: https://issues.apache.org/jira/browse/SPARK-10136

[jira] [Updated] (SPARK-10136) Parquet support fail to decode Avro arrays of primitive array (e.g. array<array<int>>)

2015-08-20 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-10136: --- Description: The following Avro schema {noformat} record AvroNonNullableArrays { array<array<int>>

[jira] [Updated] (SPARK-10136) Parquet support fail to decode Avro arrays of primitive array (e.g. array<array<int>>)

2015-08-20 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-10136: --- Priority: Blocker (was: Major) Parquet support fail to decode Avro arrays of primitive array (e.g.

[jira] [Commented] (SPARK-10136) Parquet support fail to decode Avro arrays of primitive array (e.g. array<array<int>>)

2015-08-20 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14705189#comment-14705189 ] Cheng Lian commented on SPARK-10136: Marked this as BLOCKER since it's a regression

[jira] [Resolved] (SPARK-8615) sql programming guide recommends deprecated code

2015-06-30 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian resolved SPARK-8615. --- Resolution: Fixed Fix Version/s: 1.5.0 Issue resolved by pull request 7039

[jira] [Updated] (SPARK-8615) sql programming guide recommends deprecated code

2015-06-30 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-8615: -- Target Version/s: 1.5.0 sql programming guide recommends deprecated code

[jira] [Created] (SPARK-8720) PR #7036 breaks branch-1.4 because of a malformed comment

2015-06-29 Thread Cheng Lian (JIRA)
Cheng Lian created SPARK-8720: - Summary: PR #7036 breaks branch-1.4 because of a malformed comment Key: SPARK-8720 URL: https://issues.apache.org/jira/browse/SPARK-8720 Project: Spark Issue

[jira] [Updated] (SPARK-8690) Add a setting to disable SparkSQL parquet schema merge by using datasource API

2015-06-29 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-8690: -- Shepherd: Cheng Lian Add a setting to disable SparkSQL parquet schema merge by using datasource API

[jira] [Updated] (SPARK-8125) Accelerate ParquetRelation2 metadata discovery

2015-06-30 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-8125: -- Shepherd: Cheng Lian Accelerate ParquetRelation2 metadata discovery

[jira] [Created] (SPARK-9424) Document recent Parquet changes in Spark 1.5

2015-07-28 Thread Cheng Lian (JIRA)
Cheng Lian created SPARK-9424: - Summary: Document recent Parquet changes in Spark 1.5 Key: SPARK-9424 URL: https://issues.apache.org/jira/browse/SPARK-9424 Project: Spark Issue Type:

[jira] [Commented] (SPARK-6873) Some Hive-Catalyst comparison tests fail due to unimportant order of some printed elements

2015-08-01 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14650268#comment-14650268 ] Cheng Lian commented on SPARK-6873: --- It's not important. Internally, Hive just traverses

[jira] [Updated] (SPARK-8887) Explicitly define which data types can be used as dynamic partition columns

2015-08-03 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-8887: -- Target Version/s: 1.5.0 (was: 1.6.0) Explicitly define which data types can be used as dynamic

[jira] [Assigned] (SPARK-8887) Explicitly define which data types can be used as dynamic partition columns

2015-08-03 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian reassigned SPARK-8887: - Assignee: Cheng Lian Explicitly define which data types can be used as dynamic partition

[jira] [Updated] (SPARK-9359) Support IntervalType for Parquet

2015-08-04 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-9359: -- Assignee: Liang-Chi Hsieh Support IntervalType for Parquet

[jira] [Updated] (SPARK-9359) Support IntervalType for Parquet

2015-08-04 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-9359: -- Shepherd: Cheng Lian Support IntervalType for Parquet

[jira] [Created] (SPARK-9593) Hive ShimLoader loads wrong Hadoop shims when Spark is compiled against Hadoop 2.0.0-mr1-cdh4.1.1

2015-08-04 Thread Cheng Lian (JIRA)
Cheng Lian created SPARK-9593: - Summary: Hive ShimLoader loads wrong Hadoop shims when Spark is compiled against Hadoop 2.0.0-mr1-cdh4.1.1 Key: SPARK-9593 URL: https://issues.apache.org/jira/browse/SPARK-9593

[jira] [Created] (SPARK-9600) DataFrameWriter.saveAsTable always writes data to /user/hive/warehouse

2015-08-04 Thread Cheng Lian (JIRA)
Cheng Lian created SPARK-9600: - Summary: DataFrameWriter.saveAsTable always writes data to /user/hive/warehouse Key: SPARK-9600 URL: https://issues.apache.org/jira/browse/SPARK-9600 Project: Spark

[jira] [Created] (SPARK-9407) Parquet shouldn't push down predicates over a column whose underlying Parquet type is an ENUM

2015-07-28 Thread Cheng Lian (JIRA)
Cheng Lian created SPARK-9407: - Summary: Parquet shouldn't push down predicates over a column whose underlying Parquet type is an ENUM Key: SPARK-9407 URL: https://issues.apache.org/jira/browse/SPARK-9407

[jira] [Resolved] (SPARK-9593) Hive ShimLoader loads wrong Hadoop shims when Spark is compiled against Hadoop 2.0.0-mr1-cdh4.1.1

2015-08-05 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian resolved SPARK-9593. --- Resolution: Fixed Fix Version/s: 1.5.0 Issue resolved by pull request 7929

[jira] [Resolved] (SPARK-9618) SQLContext.read.schema().parquet() ignores the supplied schema

2015-08-05 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian resolved SPARK-9618. --- Resolution: Fixed Fix Version/s: 1.5.0 Issue resolved by pull request 7947

[jira] [Resolved] (SPARK-9381) Migrate JSON data source to the new partitioning data source

2015-08-05 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian resolved SPARK-9381. --- Resolution: Fixed Fix Version/s: 1.5.0 Issue resolved by pull request 7696

[jira] [Updated] (SPARK-9381) Migrate JSON data source to the new partitioning data source

2015-08-05 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-9381: -- Assignee: Cheng Hao Migrate JSON data source to the new partitioning data source

[jira] [Commented] (SPARK-9593) Hive ShimLoader loads wrong Hadoop shims when Spark is compiled against Hadoop 2.0.0-mr1-cdh4.1.1

2015-08-05 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658463#comment-14658463 ] Cheng Lian commented on SPARK-9593: --- Please feel free to edit the ticket description

[jira] [Created] (SPARK-9557) Refactor ParquetFilterSuite and remove old ParquetFilters code

2015-08-03 Thread Cheng Lian (JIRA)
Cheng Lian created SPARK-9557: - Summary: Refactor ParquetFilterSuite and remove old ParquetFilters code Key: SPARK-9557 URL: https://issues.apache.org/jira/browse/SPARK-9557 Project: Spark

[jira] [Updated] (SPARK-9550) Configuration renaming, defaults changes, and deprecation for 1.5.0 (master ticket)

2015-08-03 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-9550: -- Description: This ticket tracks configurations which need to be renamed, deprecated, or have their

[jira] [Created] (SPARK-9554) Turn on in-memory relation partition pruning by default

2015-08-03 Thread Cheng Lian (JIRA)
Cheng Lian created SPARK-9554: - Summary: Turn on in-memory relation partition pruning by default Key: SPARK-9554 URL: https://issues.apache.org/jira/browse/SPARK-9554 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-8838) Add config to enable/disable merging part-files when merging parquet schema

2015-07-30 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian resolved SPARK-8838. --- Resolution: Fixed Fix Version/s: 1.5.0 Issue resolved by pull request 7238

[jira] [Comment Edited] (SPARK-8824) Support Parquet logical types TIMESTAMP_MILLIS and TIMESTAMP_MICROS

2015-08-11 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681331#comment-14681331 ] Cheng Lian edited comment on SPARK-8824 at 8/11/15 6:55 AM: Oh

[jira] [Commented] (SPARK-8824) Support Parquet logical types TIMESTAMP_MILLIS and TIMESTAMP_MICROS

2015-08-11 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681331#comment-14681331 ] Cheng Lian commented on SPARK-8824: --- Oh sorry, I mistook your request for

[jira] [Resolved] (SPARK-9340) CatalystSchemaConverter and CatalystRowConverter don't handle unannotated repeated fields correctly

2015-08-10 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian resolved SPARK-9340. --- Resolution: Fixed Fix Version/s: 1.5.0 Issue resolved by pull request 8070

[jira] [Commented] (SPARK-9627) SQL job failed if the dataframe is cached

2015-08-06 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14661092#comment-14661092 ] Cheng Lian commented on SPARK-9627: --- Sure SQL job failed if the dataframe is cached

[jira] [Resolved] (SPARK-7550) Support setting the right schema serde when writing to Hive metastore

2015-08-05 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian resolved SPARK-7550. --- Resolution: Fixed Resolved by https://github.com/apache/spark/pull/7967 Support setting the right

[jira] [Updated] (SPARK-9689) Cache doesn't refresh for HadoopFsRelation based table

2015-08-07 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-9689: -- Assignee: Cheng Hao Affects Version/s: 1.5.0 1.4.1 Cache doesn't

[jira] [Updated] (SPARK-9689) Cache doesn't refresh for HadoopFsRelation based table

2015-08-07 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-9689: -- Shepherd: Cheng Lian Cache doesn't refresh for HadoopFsRelation based table

[jira] [Commented] (SPARK-9588) spark sql cache: partition level cache eviction

2015-08-04 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654044#comment-14654044 ] Cheng Lian commented on SPARK-9588: --- What we did for improving partitioning in 1.5 are

[jira] [Created] (SPARK-9743) Scanning a HadoopFsRelation shouldn't require refreshing

2015-08-07 Thread Cheng Lian (JIRA)
Cheng Lian created SPARK-9743: - Summary: Scanning a HadoopFsRelation shouldn't require refreshing Key: SPARK-9743 URL: https://issues.apache.org/jira/browse/SPARK-9743 Project: Spark Issue

[jira] [Commented] (SPARK-9725) spark sql query string field return empty/garbled string

2015-08-13 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14696235#comment-14696235 ] Cheng Lian commented on SPARK-9725: --- Seems that the result of {{show tables}} is

[jira] [Commented] (SPARK-9725) spark sql query string field return empty/garbled string

2015-08-13 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14696230#comment-14696230 ] Cheng Lian commented on SPARK-9725: --- I don't have so much memory to reproduce this issue

[jira] [Updated] (SPARK-9958) HiveThriftServer2Listener is not thread-safe

2015-08-14 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-9958: -- Assignee: Shixiong Zhu HiveThriftServer2Listener is not thread-safe

[jira] [Resolved] (SPARK-9958) HiveThriftServer2Listener is not thread-safe

2015-08-14 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian resolved SPARK-9958. --- Resolution: Fixed Fix Version/s: 1.5.0 Issue resolved by pull request 8185

[jira] [Commented] (SPARK-8118) Turn off noisy log output produced by Parquet 1.7.0

2015-08-14 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14697203#comment-14697203 ] Cheng Lian commented on SPARK-8118: --- Unfortunately no. Turn off noisy log output

[jira] [Updated] (SPARK-9783) Use SqlNewHadoopRDD in JSONRelation to eliminate extra refresh() call

2015-08-10 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-9783: -- Sprint: Spark 1.5 doc/QA sprint Environment: (was: PR #8035 made a quick fix for SPARK-9743

[jira] [Updated] (SPARK-9340) CatalystSchemaConverter and CatalystRowConverter don't handle unannotated repeated fields correctly

2015-08-10 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-9340: -- Sprint: Spark 1.5 doc/QA sprint Target Version/s: 1.5.0 CatalystSchemaConverter and

[jira] [Commented] (SPARK-9783) Use SqlNewHadoopRDD in JSONRelation to eliminate extra refresh() call

2015-08-10 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14680457#comment-14680457 ] Cheng Lian commented on SPARK-9783: --- cc [~yhuai] Use SqlNewHadoopRDD in JSONRelation

[jira] [Commented] (SPARK-9340) ParquetTypeConverter incorrectly handling of repeated types results in schema mismatch

2015-08-10 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14680418#comment-14680418 ] Cheng Lian commented on SPARK-9340: --- Thanks for the clarification. In [PR

[jira] [Updated] (SPARK-9340) CatalystSchemaConverter and CatalystRowConverter don't handle unannotated repeated fields correctly

2015-08-10 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-9340: -- Summary: CatalystSchemaConverter and CatalystRowConverter don't handle unannotated repeated fields

[jira] [Created] (SPARK-9783) Use SqlNewHadoopRDD in JSONRelation to eliminate extra refresh() call

2015-08-10 Thread Cheng Lian (JIRA)
Cheng Lian created SPARK-9783: - Summary: Use SqlNewHadoopRDD in JSONRelation to eliminate extra refresh() call Key: SPARK-9783 URL: https://issues.apache.org/jira/browse/SPARK-9783 Project: Spark

[jira] [Commented] (SPARK-9340) CatalystSchemaConverter and CatalystRowConverter don't handle unannotated repeated fields correctly

2015-08-10 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14680523#comment-14680523 ] Cheng Lian commented on SPARK-9340: --- Great, would you mind to leave a LGTM on the GitHub
