[jira] [Resolved] (SPARK-32589) NoSuchElementException: None.get for needsUnsafeRowConversion

2020-09-09 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh resolved SPARK-32589. - Resolution: Duplicate > NoSuchElementException: None.get for needsUnsafeRowConversion >

[jira] [Updated] (SPARK-32825) CTE support on MSSQL

2020-09-09 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh updated SPARK-32825: Component/s: (was: Spark Core) SQL > CTE support on MSSQL >

[jira] [Commented] (SPARK-32825) CTE support on MSSQL

2020-09-09 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193200#comment-17193200 ] L. C. Hsieh commented on SPARK-32825: - The user provided query should be a query that could be put

[jira] [Updated] (SPARK-32819) Spark SQL aggregate() fails on nested string arrays

2020-09-09 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh updated SPARK-32819: Priority: Major (was: Minor) > Spark SQL aggregate() fails on nested string arrays >

[jira] [Resolved] (SPARK-32796) Make withField API support nested struct in array

2020-09-07 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh resolved SPARK-32796. - Resolution: Won't Fix > Make withField API support nested struct in array >

[jira] [Updated] (SPARK-32813) Reading parquet rdd in non columnar mode fails in multithreaded environment

2020-09-07 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh updated SPARK-32813: Affects Version/s: 3.1.0 > Reading parquet rdd in non columnar mode fails in multithreaded

[jira] [Updated] (SPARK-32813) Reading parquet rdd in non columnar mode fails in multithreaded environment

2020-09-07 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh updated SPARK-32813: Priority: Major (was: Blocker) > Reading parquet rdd in non columnar mode fails in multithreaded

[jira] [Commented] (SPARK-32805) Literal integer seems to get confused as column reference

2020-09-06 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17191327#comment-17191327 ] L. C. Hsieh commented on SPARK-32805: - You can disable group by ordinal feature by disabling SQL

[jira] [Commented] (SPARK-32784) java.lang.NoClassDefFoundError: parquet/hadoop/ParquetOutputFormat

2020-09-06 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17191326#comment-17191326 ] L. C. Hsieh commented on SPARK-32784: - This looks more like related to environment setting or

[jira] [Updated] (SPARK-32780) Fill since fields for all the expressions

2020-09-05 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh updated SPARK-32780: Labels: starter (was: beginner) > Fill since fields for all the expressions >

[jira] [Updated] (SPARK-32798) Make unionByName optionally fill missing columns with nulls in PySpark

2020-09-05 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh updated SPARK-32798: Labels: starter (was: beginner) > Make unionByName optionally fill missing columns with nulls in

[jira] [Updated] (SPARK-32799) Make unionByName optionally fill missing columns with nulls in SparkR

2020-09-05 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh updated SPARK-32799: Labels: starter (was: beginner) > Make unionByName optionally fill missing columns with nulls in

[jira] [Commented] (SPARK-32758) Spark ignores limit(1) and starts tasks for all partition

2020-09-05 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17191175#comment-17191175 ] L. C. Hsieh commented on SPARK-32758: - I think it's known behavior in Spark SQL. This involves some

[jira] [Updated] (SPARK-32780) Fill since fields for all the expressions

2020-09-05 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh updated SPARK-32780: Labels: beginner (was: ) > Fill since fields for all the expressions >

[jira] [Comment Edited] (SPARK-32787) requirement failed: The columns of A don't match the number of elements of x. A: 14, x: 10

2020-09-05 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17191174#comment-17191174 ] L. C. Hsieh edited comment on SPARK-32787 at 9/6/20, 1:04 AM: -- Oh, I see.

[jira] [Commented] (SPARK-32787) requirement failed: The columns of A don't match the number of elements of x. A: 14, x: 10

2020-09-05 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17191174#comment-17191174 ] L. C. Hsieh commented on SPARK-32787: - Oh, I see. You cannot use {{OneHotEncoderEstimator}} to train

[jira] [Updated] (SPARK-32799) Make unionByName optionally fill missing columns with nulls in SparkR

2020-09-05 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh updated SPARK-32799: Description: It would be nicer to expose {{unionByName}} parameter in R APIs as well. Currently

[jira] [Updated] (SPARK-32799) Make unionByName optionally fill missing columns with nulls in SparkR

2020-09-05 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh updated SPARK-32799: Description: It would be nicer to expose {{unionbyName}} parameter in R APIs as well. Currently

[jira] [Commented] (SPARK-32784) java.lang.NoClassDefFoundError: parquet/hadoop/ParquetOutputFormat

2020-09-05 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17191126#comment-17191126 ] L. C. Hsieh commented on SPARK-32784: - This is very basic write operation, it is unlikely to have a

[jira] [Commented] (SPARK-32793) Expose assert_true in Python/Scala APIs and add error message parameter

2020-09-05 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17191099#comment-17191099 ] L. C. Hsieh commented on SPARK-32793: - So this is a new SQL expression which will be rewritten to

[jira] [Updated] (SPARK-32802) Avoid using SpecificInternalRow in RunLengthEncoding#Encoder

2020-09-05 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh updated SPARK-32802: Affects Version/s: (was: 3.0.0) 3.1.0 > Avoid using

[jira] [Updated] (SPARK-32796) Make withField API support nested struct in array

2020-09-05 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh updated SPARK-32796: Description: Currently {{Column.withField}} only supports {{StructType}}. For nested struct in

[jira] [Updated] (SPARK-32799) Make unionByName optionally fill missing columns with nulls in SparkR

2020-09-04 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh updated SPARK-32799: Labels: beginner (was: ) > Make unionByName optionally fill missing columns with nulls in SparkR

[jira] [Updated] (SPARK-32798) Make unionByName optionally fill missing columns with nulls in PySpark

2020-09-04 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh updated SPARK-32798: Labels: beginner (was: ) > Make unionByName optionally fill missing columns with nulls in

[jira] [Updated] (SPARK-32796) Make withField API support nested struct in array

2020-09-04 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh updated SPARK-32796: Summary: Make withField API support nested struct in array (was: Make withField API support

[jira] [Created] (SPARK-32796) Make withField API support ArrayType

2020-09-04 Thread L. C. Hsieh (Jira)
L. C. Hsieh created SPARK-32796: --- Summary: Make withField API support ArrayType Key: SPARK-32796 URL: https://issues.apache.org/jira/browse/SPARK-32796 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-32693) Compare two dataframes with same schema except nullable property

2020-08-31 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17188057#comment-17188057 ] L. C. Hsieh commented on SPARK-32693: - Ok. Thanks [~maropu] > Compare two dataframes with same

[jira] [Commented] (SPARK-19256) Hive bucketing write support

2020-08-27 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-19256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17185649#comment-17185649 ] L. C. Hsieh commented on SPARK-19256: - > Hive on Tez: support zero and multiple files per bucket >

[jira] [Commented] (SPARK-19256) Hive bucketing write support

2020-08-27 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-19256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17185648#comment-17185648 ] L. C. Hsieh commented on SPARK-19256: - > Do not allow for writing Hive non-ORC/Parquet bucketed

[jira] [Updated] (SPARK-32693) Compare two dataframes with same schema except nullable property

2020-08-26 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh updated SPARK-32693: Affects Version/s: 3.1.0 > Compare two dataframes with same schema except nullable property >

[jira] [Commented] (SPARK-32361) Remove project if output is subset of child

2020-08-26 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17184977#comment-17184977 ] L. C. Hsieh commented on SPARK-32361: - Isn't it already in physical plan phase? Removing such

[jira] [Commented] (SPARK-32110) -0.0 vs 0.0 is inconsistent

2020-08-24 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17183747#comment-17183747 ] L. C. Hsieh commented on SPARK-32110: - For `HyperLogLogPlusPlus`, seems its problem is that we use

[jira] [Commented] (SPARK-32110) -0.0 vs 0.0 is inconsistent

2020-08-24 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17183509#comment-17183509 ] L. C. Hsieh commented on SPARK-32110: - [~cloud_fan] Is the idea to not change writing path to

[jira] [Commented] (SPARK-32646) ORC predicate pushdown should work with case-insensitive analysis

2020-08-23 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17182951#comment-17182951 ] L. C. Hsieh commented on SPARK-32646: - All failed tests in hive-1.2 profile are fixed now. I will

[jira] [Updated] (SPARK-32689) HiveSerDeReadWriteSuite and ScriptTransformationSuite are currently failed under hive1.2 profile in branch-3.0 and master

2020-08-23 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh updated SPARK-32689: Summary: HiveSerDeReadWriteSuite and ScriptTransformationSuite are currently failed under hive1.2

[jira] [Comment Edited] (SPARK-32689) HiveSerDeReadWriteSuite and ScriptTransformationSuite are currently failed under hadoop2.7 and hive1.2 profiles in branch-3.0 and master

2020-08-23 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17182797#comment-17182797 ] L. C. Hsieh edited comment on SPARK-32689 at 8/24/20, 5:12 AM: ---

[jira] [Resolved] (SPARK-32689) HiveSerDeReadWriteSuite and ScriptTransformationSuite are currently failed under hadoop2.7 and hive1.2 profiles in branch-3.0 and master

2020-08-23 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh resolved SPARK-32689. - Resolution: Resolved > HiveSerDeReadWriteSuite and ScriptTransformationSuite are currently

[jira] [Updated] (SPARK-32689) HiveSerDeReadWriteSuite and ScriptTransformationSuite are currently failed under hadoop2.7 and hive1.2 profiles in branch-3.0 and master

2020-08-23 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh updated SPARK-32689: Fix Version/s: 3.1.0 3.0.1 > HiveSerDeReadWriteSuite and

[jira] [Comment Edited] (SPARK-32689) HiveSerDeReadWriteSuite and ScriptTransformationSuite are currently failed under hadoop2.7 and hive1.2 profiles in branch-3.0 and master

2020-08-23 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17182750#comment-17182750 ] L. C. Hsieh edited comment on SPARK-32689 at 8/23/20, 7:49 PM: ---

[jira] [Commented] (SPARK-32689) HiveSerDeReadWriteSuite and ScriptTransformationSuite are currently failed under hadoop2.7 and hive1.2 profiles in branch-3.0 and master

2020-08-23 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17182797#comment-17182797 ] L. C. Hsieh commented on SPARK-32689: -

[jira] [Updated] (SPARK-32689) HiveSerDeReadWriteSuite and ScriptTransformationSuite are currently failed under hadoop2.7 and hive1.2 profiles in branch-3.0 and master

2020-08-23 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh updated SPARK-32689: Description: There are three tests which are currently failed under hive1.2 profiles in

[jira] [Updated] (SPARK-32689) HiveSerDeReadWriteSuite and ScriptTransformationSuite are currently failed under hadoop2.7 and hive1.2 profiles in branch-3.0 and master

2020-08-23 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh updated SPARK-32689: Description: There are three tests which are currently failed under hive1.2 profiles in

[jira] [Commented] (SPARK-32689) HiveSerDeReadWriteSuite and ScriptTransformationSuite are currently failed under hadoop2.7 and hive1.2 profiles in branch-3.0 and master

2020-08-23 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17182750#comment-17182750 ] L. C. Hsieh commented on SPARK-32689: -

[jira] [Updated] (SPARK-32689) HiveSerDeReadWriteSuite and ScriptTransformationSuite are currently failed under hadoop2.7 and hive1.2 profiles in branch-3.0 and master

2020-08-23 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh updated SPARK-32689: Description: There are three tests which are currently failed under hive1.2 profiles in

[jira] [Reopened] (SPARK-32646) ORC predicate pushdown should work with case-insensitive analysis

2020-08-23 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh reopened SPARK-32646: - Due to issue of hive-1.2 profile, reverted merged diff and so reopened this. > ORC predicate

[jira] [Commented] (SPARK-32678) Rename EmptyHashedRelationWithAllNullKeys and simplify NAAJ generated code

2020-08-22 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17182588#comment-17182588 ] L. C. Hsieh commented on SPARK-32678: - Issue resolved by pull request 29503

[jira] [Resolved] (SPARK-32678) Rename EmptyHashedRelationWithAllNullKeys and simplify NAAJ generated code

2020-08-22 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh resolved SPARK-32678. - Resolution: Resolved > Rename EmptyHashedRelationWithAllNullKeys and simplify NAAJ generated

[jira] [Assigned] (SPARK-32678) Rename EmptyHashedRelationWithAllNullKeys and simplify NAAJ generated code

2020-08-22 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh reassigned SPARK-32678: --- Assignee: Leanken.Lin > Rename EmptyHashedRelationWithAllNullKeys and simplify NAAJ

[jira] [Updated] (SPARK-32678) Rename EmptyHashedRelationWithAllNullKeys and simplify NAAJ generated code

2020-08-22 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh updated SPARK-32678: Fix Version/s: 3.1.0 > Rename EmptyHashedRelationWithAllNullKeys and simplify NAAJ generated code

[jira] [Updated] (SPARK-32678) Rename EmptyHashedRelationWithAllNullKeys and simplify NAAJ generated code

2020-08-22 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh updated SPARK-32678: Affects Version/s: (was: 3.0.0) 3.1.0 > Rename

[jira] [Updated] (SPARK-32689) HiveSerDeReadWriteSuite and ScriptTransformationSuite are currently failed under hadoop2.7 and hive1.2 profiles in branch-3.0 and master

2020-08-22 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh updated SPARK-32689: Description: There are three tests which are currently failed under hadoop2.7 and hive1.2

[jira] [Updated] (SPARK-32689) HiveSerDeReadWriteSuite and ScriptTransformationSuite are currently failed under hadoop2.7 and hive1.2 profiles in branch-3.0 and master

2020-08-22 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh updated SPARK-32689: Summary: HiveSerDeReadWriteSuite and ScriptTransformationSuite are currently failed under

[jira] [Updated] (SPARK-32689) HiveSerDeReadWriteSuite and ScriptTransformationSuite are currently failed under hadoop2.7 and hive1.2 profiles in branch-3.0

2020-08-22 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh updated SPARK-32689: Affects Version/s: 3.1.0 > HiveSerDeReadWriteSuite and ScriptTransformationSuite are currently

[jira] [Updated] (SPARK-32689) HiveSerDeReadWriteSuite and ScriptTransformationSuite are currently failed under hadoop2.7 and hive1.2 profiles in branch-3.0

2020-08-22 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh updated SPARK-32689: Summary: HiveSerDeReadWriteSuite and ScriptTransformationSuite are currently failed under

[jira] [Updated] (SPARK-32689) HiveSerDeReadWriteSuite and ScriptTransformationSuite are currently failed in branch-3.0

2020-08-22 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh updated SPARK-32689: Description: There are three tests which are currently failed under hadoop2.7 and hive1.2

[jira] [Updated] (SPARK-32689) HiveSerDeReadWriteSuite and ScriptTransformationSuite are currently failed in branch-3.0

2020-08-22 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh updated SPARK-32689: Affects Version/s: (was: 3.0.0) 3.0.1 > HiveSerDeReadWriteSuite and

[jira] [Created] (SPARK-32689) HiveSerDeReadWriteSuite and ScriptTransformationSuite are currently failed in branch-3.0

2020-08-22 Thread L. C. Hsieh (Jira)
L. C. Hsieh created SPARK-32689: --- Summary: HiveSerDeReadWriteSuite and ScriptTransformationSuite are currently failed in branch-3.0 Key: SPARK-32689 URL: https://issues.apache.org/jira/browse/SPARK-32689

[jira] [Updated] (SPARK-32646) ORC predicate pushdown should work with case-insensitive analysis

2020-08-21 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh updated SPARK-32646: Affects Version/s: 3.0.0 > ORC predicate pushdown should work with case-insensitive analysis >

[jira] [Assigned] (SPARK-32376) Make unionByName null-filling behavior work with struct columns

2020-08-18 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh reassigned SPARK-32376: --- Assignee: L. C. Hsieh > Make unionByName null-filling behavior work with struct columns >

[jira] [Commented] (SPARK-32619) converting dataframe to dataset for the json schema

2020-08-18 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179731#comment-17179731 ] L. C. Hsieh commented on SPARK-32619: - Can you provide more details when you reopened it? I still

[jira] [Commented] (SPARK-32376) Make unionByName null-filling behavior work with struct columns

2020-08-18 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17179727#comment-17179727 ] L. C. Hsieh commented on SPARK-32376: - Thanks [~mukulmurthy]. I will try to use these tests. >

[jira] [Created] (SPARK-32646) ORC predicate pushdown should work with case-insensitive analysis

2020-08-17 Thread L. C. Hsieh (Jira)
L. C. Hsieh created SPARK-32646: --- Summary: ORC predicate pushdown should work with case-insensitive analysis Key: SPARK-32646 URL: https://issues.apache.org/jira/browse/SPARK-32646 Project: Spark

[jira] [Commented] (SPARK-32601) Issue in converting an RDD of Arrow RecordBatches in v3.0.0

2020-08-16 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17178691#comment-17178691 ] L. C. Hsieh commented on SPARK-32601: - I think that we changed to use Arrow stream format in

[jira] [Commented] (SPARK-32611) Querying ORC table in Spark3 using spark.sql.orc.impl=hive produces incorrect when timestamp is present in predicate

2020-08-16 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17178576#comment-17178576 ] L. C. Hsieh commented on SPARK-32611: - I also tested on branch-3.0, but still cannot reproduce it.

[jira] [Commented] (SPARK-32611) Querying ORC table in Spark3 using spark.sql.orc.impl=hive produces incorrect when timestamp is present in predicate

2020-08-16 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17178571#comment-17178571 ] L. C. Hsieh commented on SPARK-32611: - Hm, I build from current master branch, but cannot reproduce

[jira] [Commented] (SPARK-32619) converting dataframe to dataset for the json schema

2020-08-16 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17178563#comment-17178563 ] L. C. Hsieh commented on SPARK-32619: - Could you show the schema of the dataframe? By

[jira] [Created] (SPARK-32622) Add case-sensitivity test for ORC predicate pushdown

2020-08-14 Thread L. C. Hsieh (Jira)
L. C. Hsieh created SPARK-32622: --- Summary: Add case-sensitivity test for ORC predicate pushdown Key: SPARK-32622 URL: https://issues.apache.org/jira/browse/SPARK-32622 Project: Spark Issue

[jira] [Commented] (SPARK-32580) Issue accessing a column values after 'explode' function

2020-08-11 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17175672#comment-17175672 ] L. C. Hsieh commented on SPARK-32580: - Yeah, this looks like related to nested column pruning. You

[jira] [Commented] (SPARK-32481) Support truncate table to move the data to trash

2020-08-09 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17173757#comment-17173757 ] L. C. Hsieh commented on SPARK-32481: - Why does this need to be a subtask of SPARK-32480? Will SPARK-32480

[jira] [Commented] (SPARK-32563) spark-sql doesn't support insert into mixed static & dynamic partition

2020-08-09 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17173751#comment-17173751 ] L. C. Hsieh commented on SPARK-32563: - So it is not an issue anymore in 2.4, 3.0 branches, right?

[jira] [Commented] (SPARK-32571) yarnClient.killApplication(appId) is never called

2020-08-08 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17173750#comment-17173750 ] L. C. Hsieh commented on SPARK-32571: - I think by design in cluster mode the spark application is

[jira] [Commented] (SPARK-32502) Please fix CVE related to Guava 14.0.1

2020-08-08 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17173744#comment-17173744 ] L. C. Hsieh commented on SPARK-32502: - Currently I'm working on some changes at Hive side, including

[jira] [Commented] (SPARK-32502) Please fix CVE related to Guava 14.0.1

2020-08-04 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17171197#comment-17171197 ] L. C. Hsieh commented on SPARK-32502: - I did some testings in the PRs. Few changes are required to

[jira] [Commented] (SPARK-32427) Omit USING in CREATE TABLE via JDBC Table Catalog

2020-08-02 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17169610#comment-17169610 ] L. C. Hsieh commented on SPARK-32427: - Do you mean "CREATE TABLE .." without USING indicates using

[jira] [Commented] (SPARK-32376) Make unionByName null-filling behavior work with struct columns

2020-08-01 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17169241#comment-17169241 ] L. C. Hsieh commented on SPARK-32376: - Hi [~mukulmurthy], are you ok if I go to work on this? >

[jira] [Commented] (SPARK-32425) Spark sequence() fails if start and end of range are identical timestamps

2020-07-27 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165810#comment-17165810 ] L. C. Hsieh commented on SPARK-32425: - Thanks [~JinxinTang]. > Spark sequence() fails if start and

[jira] [Resolved] (SPARK-32425) Spark sequence() fails if start and end of range are identical timestamps

2020-07-27 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh resolved SPARK-32425. - Resolution: Duplicate > Spark sequence() fails if start and end of range are identical

[jira] [Created] (SPARK-32450) Upgrade pycodestyle to 2.6.0

2020-07-26 Thread L. C. Hsieh (Jira)
L. C. Hsieh created SPARK-32450: --- Summary: Upgrade pycodestyle to 2.6.0 Key: SPARK-32450 URL: https://issues.apache.org/jira/browse/SPARK-32450 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-32425) Spark sequence() fails if start and end of range are identical timestamps

2020-07-24 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17164748#comment-17164748 ] L. C. Hsieh commented on SPARK-32425: - I ran a test with current master branch. Didn't see that

[jira] [Commented] (SPARK-32376) Make unionByName null-filling behavior work with struct columns

2020-07-24 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17164747#comment-17164747 ] L. C. Hsieh commented on SPARK-32376: - [~mukulmurthy] Is it easy to port your code? If not, I think

[jira] [Commented] (SPARK-32411) GPU Cluster Fail

2020-07-23 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17163742#comment-17163742 ] L. C. Hsieh commented on SPARK-32411: - I think it is because the configs.

[jira] [Resolved] (SPARK-32411) GPU Cluster Fail

2020-07-23 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh resolved SPARK-32411. - Resolution: Not A Problem > GPU Cluster Fail > > > Key:

[jira] [Commented] (SPARK-32357) Investigate test result reporter integration

2020-07-18 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17160572#comment-17160572 ] L. C. Hsieh commented on SPARK-32357: - Thanks for ping. Yeah, I will also look for possible

[jira] [Commented] (SPARK-32253) Make readability better in the test result logs

2020-07-16 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159443#comment-17159443 ] L. C. Hsieh commented on SPARK-32253: - Thanks [~Gengliang.Wang] > Make readability better in the

[jira] [Created] (SPARK-32308) Move by-name resolution logic of unionByName from API code to analysis phase

2020-07-14 Thread L. C. Hsieh (Jira)
L. C. Hsieh created SPARK-32308: --- Summary: Move by-name resolution logic of unionByName from API code to analysis phase Key: SPARK-32308 URL: https://issues.apache.org/jira/browse/SPARK-32308 Project:

[jira] [Comment Edited] (SPARK-32253) Make readability better in the test result logs

2020-07-13 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17157120#comment-17157120 ] L. C. Hsieh edited comment on SPARK-32253 at 7/14/20, 3:19 AM: --- Looks

[jira] [Comment Edited] (SPARK-32253) Make readability better in the test result logs

2020-07-13 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17157120#comment-17157120 ] L. C. Hsieh edited comment on SPARK-32253 at 7/14/20, 3:18 AM: --- Will do

[jira] [Commented] (SPARK-32253) Make readability better in the test result logs

2020-07-13 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17157120#comment-17157120 ] L. C. Hsieh commented on SPARK-32253: - Will do some tests. > Make readability better in the test

[jira] [Created] (SPARK-32258) NormalizeFloatingNumbers can directly normalize on certain children expressions

2020-07-09 Thread L. C. Hsieh (Jira)
L. C. Hsieh created SPARK-32258: --- Summary: NormalizeFloatingNumbers can directly normalize on certain children expressions Key: SPARK-32258 URL: https://issues.apache.org/jira/browse/SPARK-32258

[jira] [Updated] (SPARK-32163) Nested pruning should still work for nested column extractors of attributes with cosmetic variations

2020-07-05 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh updated SPARK-32163: Affects Version/s: (was: 3.1.0) 3.0.0 > Nested pruning should still

[jira] [Updated] (SPARK-32163) Nested pruning should still work for nested column extractors of attributes with cosmetic variations

2020-07-05 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh updated SPARK-32163: Affects Version/s: (was: 3.0.0) 3.1.0 > Nested pruning should still

[jira] [Commented] (SPARK-32169) Allow filter pushdown after a groupBy with collect_list

2020-07-04 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17151196#comment-17151196 ] L. C. Hsieh commented on SPARK-32169: - collect_list is a non-deterministic expression. The optimizer

[jira] [Updated] (SPARK-32163) Nested pruning should still work for nested column extractors of attributes with cosmetic variations

2020-07-02 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh updated SPARK-32163: Issue Type: Bug (was: Improvement) > Nested pruning should still work for nested column

[jira] [Created] (SPARK-32163) Nested pruning should still work for nested column extractors of attributes with cosmetic variations

2020-07-02 Thread L. C. Hsieh (Jira)
L. C. Hsieh created SPARK-32163: --- Summary: Nested pruning should still work for nested column extractors of attributes with cosmetic variations Key: SPARK-32163 URL:

[jira] [Commented] (SPARK-32136) Spark producing incorrect groupBy results when key is a struct with nullable properties

2020-06-30 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17149104#comment-17149104 ] L. C. Hsieh commented on SPARK-32136: - Thanks for the ping. Will look at this. > Spark producing

[jira] [Commented] (SPARK-32096) Support top-N sort for Spark SQL rank window function

2020-06-28 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17147542#comment-17147542 ] L. C. Hsieh commented on SPARK-32096: - Then I think it is not simply a top-N sort... You need to do

[jira] [Commented] (SPARK-32096) Support top-N sort for Spark SQL rank window function

2020-06-28 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17147504#comment-17147504 ] L. C. Hsieh commented on SPARK-32096: - Does a filter of the window rank (e.g. rank <= 100) mean

[jira] [Commented] (SPARK-32114) Change name of the slaves file, to something more acceptable

2020-06-28 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17147227#comment-17147227 ] L. C. Hsieh commented on SPARK-32114: - I think this might be a duplicate of SPARK-32004. > Change

[jira] [Commented] (SPARK-32104) Avoid full outer join OOM on skewed dataset

2020-06-27 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17146806#comment-17146806 ] L. C. Hsieh commented on SPARK-32104: - Is this a duplicate of SPARK-24985? > Avoid full outer join

[jira] [Commented] (SPARK-32063) Spark native temporary table

2020-06-23 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17143233#comment-17143233 ] L. C. Hsieh commented on SPARK-32063: - For 1 and 2, they both seem related to performance. In Spark,
