[jira] [Resolved] (SPARK-34644) UDF returning array followed by explode calls the UDF multiple times and could return wrong results

2021-03-06 Thread Gavrilescu Laurentiu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gavrilescu Laurentiu resolved SPARK-34644. -- Resolution: Not A Problem > UDF returning array followed by explode calls the

[jira] [Commented] (SPARK-34644) UDF returning array followed by explode calls the UDF multiple times and could return wrong results

2021-03-06 Thread Gavrilescu Laurentiu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17296787#comment-17296787 ] Gavrilescu Laurentiu commented on SPARK-34644: -- [~tanelk] [~hyukjin.kwon] yes, it works

[jira] [Commented] (SPARK-34638) Spark SQL reads unnecessary nested fields (another type of pruning case)

2021-03-06 Thread L. C. Hsieh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17296782#comment-17296782 ] L. C. Hsieh commented on SPARK-34638: - Thanks [~yuryn] and [~hyukjin.kwon]. I will look into this.

[jira] [Updated] (SPARK-34634) Self-join with script transformation failed to resolve attribute correctly

2021-03-06 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-34634: - Fix Version/s: (was: 3.2) 3.2.0 > Self-join with script transformation

[jira] [Resolved] (SPARK-34634) Self-join with script transformation failed to resolve attribute correctly

2021-03-06 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-34634. -- Fix Version/s: 3.2 Assignee: EdisonWang (was: Apache Spark) Resolution: Fixed

[jira] [Commented] (SPARK-34623) Deduplicate window expressions

2021-03-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17296769#comment-17296769 ] Apache Spark commented on SPARK-34623: -- User 'tanelk' has created a pull request for this issue:

[jira] [Assigned] (SPARK-34623) Deduplicate window expressions

2021-03-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-34623: Assignee: (was: Apache Spark) > Deduplicate window expressions >

[jira] [Assigned] (SPARK-34623) Deduplicate window expressions

2021-03-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-34623: Assignee: Apache Spark > Deduplicate window expressions > --

[jira] [Commented] (SPARK-34623) Deduplicate window expressions

2021-03-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17296768#comment-17296768 ] Apache Spark commented on SPARK-34623: -- User 'tanelk' has created a pull request for this issue:

[jira] [Commented] (SPARK-34623) Deduplicate window expressions

2021-03-06 Thread Tanel Kiis (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17296767#comment-17296767 ] Tanel Kiis commented on SPARK-34623: I had a typo in the PR title, so it did not link up:

[jira] [Commented] (SPARK-34583) typed udf fails when it refers to type member in abstract class

2021-03-06 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17296751#comment-17296751 ] Hyukjin Kwon commented on SPARK-34583: -- cc [~Ngone51] [~cloud_fan] FYI > typed udf fails when it

[jira] [Resolved] (SPARK-34594) OrcColumnarBatchReader uncaught exception

2021-03-06 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-34594. -- Resolution: Invalid > OrcColumnarBatchReader uncaught exception >

[jira] [Commented] (SPARK-34606) New PySpark documentation has different URLs

2021-03-06 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17296750#comment-17296750 ] Hyukjin Kwon commented on SPARK-34606: -- [~ondrej], are you working on this? Any PR will be very

[jira] [Assigned] (SPARK-34625) Enable Arrow optimization for additional supported types when using SparkR

2021-03-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-34625: Assignee: (was: Apache Spark) > Enable Arrow optimization for additional supported

[jira] [Assigned] (SPARK-34625) Enable Arrow optimization for additional supported types when using SparkR

2021-03-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-34625: Assignee: Apache Spark > Enable Arrow optimization for additional supported types when

[jira] [Commented] (SPARK-34611) Fix grammar issues in sql-migration-guide

2021-03-06 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17296748#comment-17296748 ] Hyukjin Kwon commented on SPARK-34611: -- [~hopefulnick] are you working on this? > Fix grammar

[jira] [Commented] (SPARK-34625) Enable Arrow optimization for additional supported types when using SparkR

2021-03-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17296747#comment-17296747 ] Apache Spark commented on SPARK-34625: -- User 'msummersgill' has created a pull request for this

[jira] [Commented] (SPARK-34623) Deduplicate window expressions

2021-03-06 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17296746#comment-17296746 ] Hyukjin Kwon commented on SPARK-34623: -- [~tanelk] are you working on this? It would be great if

[jira] [Commented] (SPARK-34631) Caught Hive MetaException when query by partition (partition col start with ‘$’)

2021-03-06 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17296745#comment-17296745 ] Hyukjin Kwon commented on SPARK-34631: -- 1. Does it work in Hive? 2. Does setting

[jira] [Commented] (SPARK-34632) Can we create 'SessionState' with a username in 'HiveClientImpl'

2021-03-06 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17296743#comment-17296743 ] Hyukjin Kwon commented on SPARK-34632: -- It would be great if you elaborate on the benefits of doing

[jira] [Commented] (SPARK-34631) Caught Hive MetaException when query by partition (partition col start with ‘$’)

2021-03-06 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17296744#comment-17296744 ] Hyukjin Kwon commented on SPARK-34631: -- Please avoid setting blocker+ which is usually reserved for

[jira] [Updated] (SPARK-34631) Caught Hive MetaException when query by partition (partition col start with ‘$’)

2021-03-06 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-34631: - Priority: Major (was: Blocker) > Caught Hive MetaException when query by partition (partition

[jira] [Resolved] (SPARK-34633) Self-join with script transformation failed to resolve attribute correctly

2021-03-06 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-34633. -- Resolution: Invalid > Self-join with script transformation failed to resolve attribute

[jira] [Commented] (SPARK-34633) Self-join with script transformation failed to resolve attribute correctly

2021-03-06 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17296742#comment-17296742 ] Hyukjin Kwon commented on SPARK-34633: -- [~EdisonWang] please fill the PR description preferably

[jira] [Commented] (SPARK-34638) Spark SQL reads unnecessary nested fields (another type of pruning case)

2021-03-06 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17296741#comment-17296741 ] Hyukjin Kwon commented on SPARK-34638: -- cc [~viirya] FYI > Spark SQL reads unnecessary nested

[jira] [Resolved] (SPARK-34640) unable to access grouping column after groupBy

2021-03-06 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-34640. -- Resolution: Invalid > unable to access grouping column after groupBy >

[jira] [Commented] (SPARK-34640) unable to access grouping column after groupBy

2021-03-06 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17296740#comment-17296740 ] Hyukjin Kwon commented on SPARK-34640: -- You can use backquotes with `$"..."`: {code} scala>

[jira] [Commented] (SPARK-34644) UDF returning array followed by explode calls the UDF multiple times and could return wrong results

2021-03-06 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17296735#comment-17296735 ] Hyukjin Kwon commented on SPARK-34644: -- [~lgavrilescu] can you confirm if calling

[jira] [Commented] (SPARK-34646) TreeNode bind issue for duplicate column name.

2021-03-06 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17296733#comment-17296733 ] Hyukjin Kwon commented on SPARK-34646: -- [~lpn451] do you mind showing the self-contained reproducer

[jira] [Assigned] (SPARK-34650) Exclude zstd-jni transitive dependency from Kafka Client

2021-03-06 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-34650: Assignee: Dongjoon Hyun > Exclude zstd-jni transitive dependency from Kafka Client >

[jira] [Resolved] (SPARK-34650) Exclude zstd-jni transitive dependency from Kafka Client

2021-03-06 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-34650. -- Fix Version/s: 3.2.0 Resolution: Fixed Issue resolved by pull request 31767

[jira] [Commented] (SPARK-34649) org.apache.spark.sql.DataFrameNaFunctions.replace() fails for column name having a dot

2021-03-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17296718#comment-17296718 ] Apache Spark commented on SPARK-34649: -- User 'amandeep-sharma' has created a pull request for this

[jira] [Assigned] (SPARK-34649) org.apache.spark.sql.DataFrameNaFunctions.replace() fails for column name having a dot

2021-03-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-34649: Assignee: Apache Spark > org.apache.spark.sql.DataFrameNaFunctions.replace() fails for

[jira] [Assigned] (SPARK-34649) org.apache.spark.sql.DataFrameNaFunctions.replace() fails for column name having a dot

2021-03-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-34649: Assignee: (was: Apache Spark) > org.apache.spark.sql.DataFrameNaFunctions.replace()

[jira] [Commented] (SPARK-34649) org.apache.spark.sql.DataFrameNaFunctions.replace() fails for column name having a dot

2021-03-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17296717#comment-17296717 ] Apache Spark commented on SPARK-34649: -- User 'amandeep-sharma' has created a pull request for this

[jira] [Assigned] (SPARK-33436) PySpark equivalent of SparkContext.hadoopConfiguration

2021-03-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-33436: Assignee: Apache Spark > PySpark equivalent of SparkContext.hadoopConfiguration >

[jira] [Commented] (SPARK-33436) PySpark equivalent of SparkContext.hadoopConfiguration

2021-03-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17296691#comment-17296691 ] Apache Spark commented on SPARK-33436: -- User 'yaroslav-serhiichuk' has created a pull request for

[jira] [Assigned] (SPARK-33436) PySpark equivalent of SparkContext.hadoopConfiguration

2021-03-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-33436: Assignee: (was: Apache Spark) > PySpark equivalent of

[jira] [Updated] (SPARK-34559) Upgrade to ZSTD JNI 1.4.8-6

2021-03-06 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-34559: -- Parent: SPARK-34651 Issue Type: Sub-task (was: Improvement) > Upgrade to ZSTD JNI

[jira] [Updated] (SPARK-34036) Update ORC data source documentation

2021-03-06 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-34036: -- Parent: SPARK-34651 Issue Type: Sub-task (was: Improvement) > Update ORC data source

[jira] [Updated] (SPARK-33978) Support ZSTD compression in ORC data source

2021-03-06 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-33978: -- Parent: SPARK-34651 Issue Type: Sub-task (was: New Feature) > Support ZSTD

[jira] [Updated] (SPARK-33295) Upgrade ORC to 1.6.6

2021-03-06 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-33295: -- Parent: SPARK-34651 Issue Type: Sub-task (was: New Feature) > Upgrade ORC to 1.6.6

[jira] [Updated] (SPARK-34479) Add zstandard codec to spark.sql.avro.compression.codec

2021-03-06 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-34479: -- Parent: SPARK-34651 Issue Type: Sub-task (was: Improvement) > Add zstandard codec to

[jira] [Updated] (SPARK-34651) Improve ZSTD support and QA

2021-03-06 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-34651: -- Summary: Improve ZSTD support and QA (was: Improve ZSTD support) > Improve ZSTD support and

[jira] [Commented] (SPARK-25769) UnresolvedAttribute.sql() incorrectly escapes nested columns

2021-03-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-25769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17296681#comment-17296681 ] Apache Spark commented on SPARK-25769: -- User 'sarutak' has created a pull request for this issue:

[jira] [Updated] (SPARK-34651) Improve ZSTD support

2021-03-06 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-34651: -- Summary: Improve ZSTD support (was: Improve ZSTD support and QA) > Improve ZSTD support >

[jira] [Updated] (SPARK-34557) Exclude Avro's transitive zstd-jni dependency

2021-03-06 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-34557: -- Parent: SPARK-34651 Issue Type: Sub-task (was: Improvement) > Exclude Avro's

[jira] [Updated] (SPARK-34323) Upgrade zstd-jni to 1.4.8-3

2021-03-06 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-34323: -- Parent: SPARK-34651 Issue Type: Sub-task (was: Improvement) > Upgrade zstd-jni to

[jira] [Updated] (SPARK-34651) Improve ZSTD support

2021-03-06 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-34651: -- Target Version/s: 3.2.0 > Improve ZSTD support > > >

[jira] [Assigned] (SPARK-31717) Remove a fallback version of HiveExternalCatalogVersionsSuite

2021-03-06 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-31717: - Assignee: (was: Dongjoon Hyun) > Remove a fallback version of

[jira] [Closed] (SPARK-31717) Remove a fallback version of HiveExternalCatalogVersionsSuite

2021-03-06 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun closed SPARK-31717. - > Remove a fallback version of HiveExternalCatalogVersionsSuite >

[jira] [Resolved] (SPARK-31717) Remove a fallback version of HiveExternalCatalogVersionsSuite

2021-03-06 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-31717. --- Resolution: Duplicate > Remove a fallback version of HiveExternalCatalogVersionsSuite >

[jira] [Updated] (SPARK-34496) Upgrade ZSTD-JNI to 1.4.8-5 to API compatibility

2021-03-06 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-34496: -- Parent: SPARK-34651 Issue Type: Sub-task (was: Improvement) > Upgrade ZSTD-JNI to

[jira] [Updated] (SPARK-34340) Support ZSTD JNI BufferPool

2021-03-06 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-34340: -- Parent: SPARK-34651 Issue Type: Sub-task (was: Improvement) > Support ZSTD JNI

[jira] [Updated] (SPARK-34387) Add ZStandardBenchmark

2021-03-06 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-34387: -- Parent: SPARK-34651 Issue Type: Sub-task (was: Improvement) > Add ZStandardBenchmark

[jira] [Updated] (SPARK-34503) Use zstd for spark.eventLog.compression.codec by default

2021-03-06 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-34503: -- Parent: SPARK-34651 Issue Type: Sub-task (was: Improvement) > Use zstd for

[jira] [Updated] (SPARK-34390) Enable Zstandard buffer pool by default

2021-03-06 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-34390: -- Parent: SPARK-34651 Issue Type: Sub-task (was: Improvement) > Enable Zstandard

[jira] [Updated] (SPARK-34647) Upgrade ZSTD-JNI to 1.4.8-7 and use NoFinalizer classes

2021-03-06 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-34647: -- Parent: SPARK-34651 Issue Type: Sub-task (was: Improvement) > Upgrade ZSTD-JNI to

[jira] [Updated] (SPARK-34650) Exclude zstd-jni transitive dependency from Kafka Client

2021-03-06 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-34650: -- Parent: SPARK-34651 Issue Type: Sub-task (was: Improvement) > Exclude zstd-jni

[jira] [Created] (SPARK-34651) Improve ZSTD support

2021-03-06 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created SPARK-34651: - Summary: Improve ZSTD support Key: SPARK-34651 URL: https://issues.apache.org/jira/browse/SPARK-34651 Project: Spark Issue Type: Umbrella

[jira] [Commented] (SPARK-34650) Exclude zstd-jni transitive dependency from Kafka Client

2021-03-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17296672#comment-17296672 ] Apache Spark commented on SPARK-34650: -- User 'dongjoon-hyun' has created a pull request for this

[jira] [Commented] (SPARK-34650) Exclude zstd-jni transitive dependency from Kafka Client

2021-03-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17296671#comment-17296671 ] Apache Spark commented on SPARK-34650: -- User 'dongjoon-hyun' has created a pull request for this

[jira] [Assigned] (SPARK-34650) Exclude zstd-jni transitive dependency from Kafka Client

2021-03-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-34650: Assignee: Apache Spark > Exclude zstd-jni transitive dependency from Kafka Client >

[jira] [Assigned] (SPARK-34650) Exclude zstd-jni transitive dependency from Kafka Client

2021-03-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-34650: Assignee: (was: Apache Spark) > Exclude zstd-jni transitive dependency from Kafka

[jira] [Created] (SPARK-34650) Exclude zstd-jni transitive dependency from Kafka Client

2021-03-06 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created SPARK-34650: - Summary: Exclude zstd-jni transitive dependency from Kafka Client Key: SPARK-34650 URL: https://issues.apache.org/jira/browse/SPARK-34650 Project: Spark

[jira] [Commented] (SPARK-34607) NewInstance.resolved should not throw malformed class name error

2021-03-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17296648#comment-17296648 ] Apache Spark commented on SPARK-34607: -- User 'dongjoon-hyun' has created a pull request for this

[jira] [Commented] (SPARK-34607) NewInstance.resolved should not throw malformed class name error

2021-03-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17296647#comment-17296647 ] Apache Spark commented on SPARK-34607: -- User 'dongjoon-hyun' has created a pull request for this

[jira] [Commented] (SPARK-34596) NewInstance.doGenCode should not throw malformed class name error

2021-03-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17296646#comment-17296646 ] Apache Spark commented on SPARK-34596: -- User 'dongjoon-hyun' has created a pull request for this

[jira] [Resolved] (SPARK-34647) Upgrade ZSTD-JNI to 1.4.8-7 and use NoFinalizer classes

2021-03-06 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-34647. --- Fix Version/s: 3.2.0 Resolution: Fixed Issue resolved by pull request 31762

[jira] [Assigned] (SPARK-34647) Upgrade ZSTD-JNI to 1.4.8-7 and use NoFinalizer classes

2021-03-06 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-34647: - Assignee: Dongjoon Hyun > Upgrade ZSTD-JNI to 1.4.8-7 and use NoFinalizer classes >

[jira] [Assigned] (SPARK-34628) Remove GlobalLimit operator if it's child max row <= limit

2021-03-06 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-34628: - Assignee: Yuming Wang (was: Apache Spark) > Remove GlobalLimit operator if it's child

[jira] [Resolved] (SPARK-34628) Remove GlobalLimit operator if it's child max row <= limit

2021-03-06 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-34628. --- Fix Version/s: 3.2.0 Resolution: Fixed Issue resolved by pull request 31750

[jira] [Commented] (SPARK-34607) NewInstance.resolved should not throw malformed class name error

2021-03-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17296593#comment-17296593 ] Apache Spark commented on SPARK-34607: -- User 'dongjoon-hyun' has created a pull request for this

[jira] [Commented] (SPARK-34607) NewInstance.resolved should not throw malformed class name error

2021-03-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17296589#comment-17296589 ] Apache Spark commented on SPARK-34607: -- User 'dongjoon-hyun' has created a pull request for this

[jira] [Commented] (SPARK-34596) NewInstance.doGenCode should not throw malformed class name error

2021-03-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17296588#comment-17296588 ] Apache Spark commented on SPARK-34596: -- User 'dongjoon-hyun' has created a pull request for this

[jira] [Assigned] (SPARK-34615) Support java.time.Period as an external type of the year-month interval type

2021-03-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-34615: Assignee: (was: Apache Spark) > Support java.time.Period as an external type of the

[jira] [Assigned] (SPARK-34615) Support java.time.Period as an external type of the year-month interval type

2021-03-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-34615: Assignee: Apache Spark > Support java.time.Period as an external type of the year-month

[jira] [Commented] (SPARK-34615) Support java.time.Period as an external type of the year-month interval type

2021-03-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17296587#comment-17296587 ] Apache Spark commented on SPARK-34615: -- User 'MaxGekk' has created a pull request for this issue:

[jira] [Assigned] (SPARK-34642) TypeError in Pyspark Linear Regression docs

2021-03-06 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-34642: - Assignee: Sean R. Owen > TypeError in Pyspark Linear Regression docs >

[jira] [Resolved] (SPARK-34642) TypeError in Pyspark Linear Regression docs

2021-03-06 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-34642. --- Fix Version/s: 3.1.2 3.2.0 Resolution: Fixed Issue resolved by

[jira] [Updated] (SPARK-34565) Collapse Window nodes with Project between them

2021-03-06 Thread Tanel Kiis (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tanel Kiis updated SPARK-34565: --- Description: The CollapseWindow optimizer rule can be improved to also collapse Window nodes, that

[jira] [Updated] (SPARK-34623) Deduplicate window expressions

2021-03-06 Thread Tanel Kiis (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tanel Kiis updated SPARK-34623: --- Issue Type: Improvement (was: Bug) > Deduplicate window expressions >

[jira] [Commented] (SPARK-34644) UDF returning array followed by explode calls the UDF multiple times and could return wrong results

2021-03-06 Thread Tanel Kiis (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17296556#comment-17296556 ] Tanel Kiis commented on SPARK-34644: UDF with internal state should be marked as non-deterministic

[jira] [Commented] (SPARK-34615) Support java.time.Period as an external type of the year-month interval type

2021-03-06 Thread Maxim Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17296540#comment-17296540 ] Maxim Gekk commented on SPARK-34615: I am working on this sub-task. > Support java.time.Period as

[jira] [Resolved] (SPARK-34648) Reading Parquet Files in Spark Extremely Slow for Large Number of Files?

2021-03-06 Thread Takeshi Yamamuro (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takeshi Yamamuro resolved SPARK-34648. -- Resolution: Invalid > Reading Parquet Files in Spark Extremely Slow for Large Number

[jira] [Commented] (SPARK-34648) Reading Parquet Files in Spark Extremely Slow for Large Number of Files?

2021-03-06 Thread Takeshi Yamamuro (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17296528#comment-17296528 ] Takeshi Yamamuro commented on SPARK-34648: -- Please use the mailing list (u...@spark.apache.org)

[jira] [Resolved] (SPARK-34595) DPP support RLIKE

2021-03-06 Thread Takeshi Yamamuro (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takeshi Yamamuro resolved SPARK-34595. -- Fix Version/s: 3.2.0 Assignee: chaojun zhang Resolution: Fixed

[jira] [Updated] (SPARK-34644) UDF returning array followed by explode calls the UDF multiple times and could return wrong results

2021-03-06 Thread Gavrilescu Laurentiu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gavrilescu Laurentiu updated SPARK-34644: - Description: *Applying an UDF followed by explode calls the UDF multiple

[jira] [Updated] (SPARK-34644) UDF returning array followed by explode calls the UDF multiple times and could return wrong results

2021-03-06 Thread Gavrilescu Laurentiu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gavrilescu Laurentiu updated SPARK-34644: - Description: *Applying an UDF followed by explode calls the UDF multiple

[jira] [Updated] (SPARK-34644) UDF returning array followed by explode calls the UDF multiple times and could return wrong results

2021-03-06 Thread Gavrilescu Laurentiu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gavrilescu Laurentiu updated SPARK-34644: - Description: *Applying an UDF followed by explode calls the UDF multiple

[jira] [Updated] (SPARK-34644) UDF returning array followed by explode calls the UDF multiple times and could return wrong results

2021-03-06 Thread Gavrilescu Laurentiu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gavrilescu Laurentiu updated SPARK-34644: - Summary: UDF returning array followed by explode calls the UDF multiple times

[jira] [Updated] (SPARK-34644) UDF returning array followed by explode returns wrong results

2021-03-06 Thread Gavrilescu Laurentiu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gavrilescu Laurentiu updated SPARK-34644: - Description: *Applying an UDF followed by explode looks to be calling the UDF

[jira] [Updated] (SPARK-34644) UDF returning array followed by explode returns wrong results

2021-03-06 Thread Gavrilescu Laurentiu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gavrilescu Laurentiu updated SPARK-34644: - Description: *Applying an UDF followed by explode and sum looks to be calling

[jira] [Updated] (SPARK-34644) UDF returning array followed by explode returns wrong results

2021-03-06 Thread Gavrilescu Laurentiu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gavrilescu Laurentiu updated SPARK-34644: - Description: *Applying an UDF followed by explode looks to be calling the UDF

[jira] [Created] (SPARK-34649) org.apache.spark.sql.DataFrameNaFunctions.replace() fails for column name having a dot

2021-03-06 Thread Amandeep Sharma (Jira)
Amandeep Sharma created SPARK-34649: --- Summary: org.apache.spark.sql.DataFrameNaFunctions.replace() fails for column name having a dot Key: SPARK-34649 URL: https://issues.apache.org/jira/browse/SPARK-34649