[jira] [Commented] (SPARK-35579) Fix a bug in janino or work around it in Spark.

2022-07-10 Thread Prashant Singh (Jira)
[ https://issues.apache.org/jira/browse/SPARK-35579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17564797#comment-17564797 ] Prashant Singh commented on SPARK-35579: Should we bump janino to v3.1.7 considering:

[jira] [Resolved] (SPARK-39726) Change the default value of spark.sql.execution.topKSortFallbackThreshold to 800000

2022-07-10 Thread Yuming Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang resolved SPARK-39726. - Resolution: Not A Problem > Change the default value of

[jira] [Commented] (SPARK-39739) Upgrade sbt to 1.7.0

2022-07-10 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17564794#comment-17564794 ] Apache Spark commented on SPARK-39739: -- User 'LuciferYang' has created a pull request for this

[jira] [Assigned] (SPARK-39739) Upgrade sbt to 1.7.0

2022-07-10 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-39739: Assignee: (was: Apache Spark) > Upgrade sbt to 1.7.0 > > >

[jira] [Assigned] (SPARK-39739) Upgrade sbt to 1.7.0

2022-07-10 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-39739: Assignee: Apache Spark > Upgrade sbt to 1.7.0 > > >

[jira] [Created] (SPARK-39739) Upgrade sbt to 1.7.0

2022-07-10 Thread Yang Jie (Jira)
Yang Jie created SPARK-39739: Summary: Upgrade sbt to 1.7.0 Key: SPARK-39739 URL: https://issues.apache.org/jira/browse/SPARK-39739 Project: Spark Issue Type: Improvement Components:

[jira] [Created] (SPARK-39738) ORC uses Protobuf version vulnerable to CVE-2021-22569

2022-07-10 Thread Eugene Shinn (Truveta) (Jira)
Eugene Shinn (Truveta) created SPARK-39738: -- Summary: ORC uses Protobuf version vulnerable to CVE-2021-22569 Key: SPARK-39738 URL: https://issues.apache.org/jira/browse/SPARK-39738 Project:

[jira] [Commented] (SPARK-39737) PERCENTILE_CONT and PERCENTILE_DISC should support aggregate filter

2022-07-10 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17564788#comment-17564788 ] Apache Spark commented on SPARK-39737: -- User 'beliefer' has created a pull request for this issue:

[jira] [Assigned] (SPARK-39737) PERCENTILE_CONT and PERCENTILE_DISC should support aggregate filter

2022-07-10 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-39737: Assignee: Apache Spark > PERCENTILE_CONT and PERCENTILE_DISC should support aggregate

[jira] [Commented] (SPARK-39737) PERCENTILE_CONT and PERCENTILE_DISC should support aggregate filter

2022-07-10 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17564786#comment-17564786 ] Apache Spark commented on SPARK-39737: -- User 'beliefer' has created a pull request for this issue:

[jira] [Assigned] (SPARK-39737) PERCENTILE_CONT and PERCENTILE_DISC should support aggregate filter

2022-07-10 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-39737: Assignee: (was: Apache Spark) > PERCENTILE_CONT and PERCENTILE_DISC should support

[jira] [Updated] (SPARK-39737) PERCENTILE_CONT and PERCENTILE_DISC should support aggregate filter

2022-07-10 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jiaan.geng updated SPARK-39737: --- Summary: PERCENTILE_CONT and PERCENTILE_DISC should support aggregate filter (was:

[jira] [Created] (SPARK-39737) percentile_cont and percentile_disc should support aggregate filter

2022-07-10 Thread jiaan.geng (Jira)
jiaan.geng created SPARK-39737: -- Summary: percentile_cont and percentile_disc should support aggregate filter Key: SPARK-39737 URL: https://issues.apache.org/jira/browse/SPARK-39737 Project: Spark

[jira] [Updated] (SPARK-39735) Enable base image build in lint job

2022-07-10 Thread Yikun Jiang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yikun Jiang updated SPARK-39735: Description: Since sparkr 4.2.x has [below new

[jira] [Updated] (SPARK-39735) Enable base image build in lint job and fix sparkr env

2022-07-10 Thread Yikun Jiang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yikun Jiang updated SPARK-39735: Summary: Enable base image build in lint job and fix sparkr env (was: Enable base image build in

[jira] [Created] (SPARK-39736) Enable base image build in SparkR job

2022-07-10 Thread Yikun Jiang (Jira)
Yikun Jiang created SPARK-39736: --- Summary: Enable base image build in SparkR job Key: SPARK-39736 URL: https://issues.apache.org/jira/browse/SPARK-39736 Project: Spark Issue Type: Sub-task

[jira] [Assigned] (SPARK-39735) Enable base image build in lint job

2022-07-10 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-39735: Assignee: Apache Spark > Enable base image build in lint job >

[jira] [Assigned] (SPARK-39735) Enable base image build in lint job

2022-07-10 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-39735: Assignee: (was: Apache Spark) > Enable base image build in lint job >

[jira] [Commented] (SPARK-39735) Enable base image build in lint job

2022-07-10 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17564784#comment-17564784 ] Apache Spark commented on SPARK-39735: -- User 'Yikun' has created a pull request for this issue:

[jira] [Commented] (SPARK-39735) Enable base image build in lint job

2022-07-10 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17564783#comment-17564783 ] Apache Spark commented on SPARK-39735: -- User 'Yikun' has created a pull request for this issue:

[jira] [Resolved] (SPARK-39720) Implement tableExists/getTable in SparkR for 3L namespace

2022-07-10 Thread Ruifeng Zheng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruifeng Zheng resolved SPARK-39720. --- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 37133

[jira] [Assigned] (SPARK-39720) Implement tableExists/getTable in SparkR for 3L namespace

2022-07-10 Thread Ruifeng Zheng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruifeng Zheng reassigned SPARK-39720: - Assignee: Ruifeng Zheng > Implement tableExists/getTable in SparkR for 3L namespace >

[jira] [Created] (SPARK-39735) Enable base image build in lint job

2022-07-10 Thread Yikun Jiang (Jira)
Yikun Jiang created SPARK-39735: --- Summary: Enable base image build in lint job Key: SPARK-39735 URL: https://issues.apache.org/jira/browse/SPARK-39735 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-39734) Add call_udf to pyspark.sql.functions

2022-07-10 Thread Andrew Ray (Jira)
Andrew Ray created SPARK-39734: -- Summary: Add call_udf to pyspark.sql.functions Key: SPARK-39734 URL: https://issues.apache.org/jira/browse/SPARK-39734 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-39733) Add map_contains_key to pyspark.sql.functions

2022-07-10 Thread Andrew Ray (Jira)
Andrew Ray created SPARK-39733: -- Summary: Add map_contains_key to pyspark.sql.functions Key: SPARK-39733 URL: https://issues.apache.org/jira/browse/SPARK-39733 Project: Spark Issue Type:

[jira] [Commented] (SPARK-39732) pyspark.pandas.DataFrame.drop drops dataframe if axis not specified

2022-07-10 Thread Andreas Saltveit (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17564767#comment-17564767 ] Andreas Saltveit commented on SPARK-39732: -- Introduced after 2022.07.04 >

[jira] [Updated] (SPARK-39732) pyspark.pandas.DataFrame.drop drops dataframe if axis not specified

2022-07-10 Thread Andreas Saltveit (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andreas Saltveit updated SPARK-39732: - Description: import pyspark.pandas as pd data = [\{"Category": 'A', "ID": 1, "Value":

[jira] [Created] (SPARK-39732) pyspark.pandas.DataFrame.drop drops dataframe if axis not specified

2022-07-10 Thread Andreas Saltveit (Jira)
Andreas Saltveit created SPARK-39732: Summary: pyspark.pandas.DataFrame.drop drops dataframe if axis not specified Key: SPARK-39732 URL: https://issues.apache.org/jira/browse/SPARK-39732 Project:

[jira] [Updated] (SPARK-39715) spark UI about file size read

2022-07-10 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-39715: - Target Version/s: (was: 3.2.1) > spark UI about file size read >

[jira] [Commented] (SPARK-39729) Why generate WholeStagecodegen for single operator?

2022-07-10 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17564763#comment-17564763 ] Hyukjin Kwon commented on SPARK-39729: -- Can you turn off {{spark.sql.codegen.wholeStage}}? > Why

[jira] [Assigned] (SPARK-39731) Correctness issue when parsing dates with yyyyMMdd format in CSV

2022-07-10 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-39731: Assignee: (was: Apache Spark) > Correctness issue when parsing dates with yyyyMMdd

[jira] [Commented] (SPARK-39731) Correctness issue when parsing dates with yyyyMMdd format in CSV

2022-07-10 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17564745#comment-17564745 ] Apache Spark commented on SPARK-39731: -- User 'sadikovi' has created a pull request for this issue:

[jira] [Commented] (SPARK-39731) Correctness issue when parsing dates with yyyyMMdd format in CSV

2022-07-10 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17564744#comment-17564744 ] Apache Spark commented on SPARK-39731: -- User 'sadikovi' has created a pull request for this issue:

[jira] [Assigned] (SPARK-39731) Correctness issue when parsing dates with yyyyMMdd format in CSV

2022-07-10 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-39731: Assignee: Apache Spark > Correctness issue when parsing dates with yyyyMMdd format in

[jira] [Updated] (SPARK-39731) Correctness issue when parsing dates with yyyyMMdd format in CSV

2022-07-10 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-39731: - Description: In Spark 3.x, when reading CSV data like this: {code:java} name,mydate 1,2020011

[jira] [Updated] (SPARK-39731) Correctness issue when parsing dates with yyyyMMdd format in CSV

2022-07-10 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-39731: - Description: In Spark 3.x, when reading CSV data like this: {code:java} name,mydate 1,2020011

[jira] [Updated] (SPARK-39731) Correctness issue when parsing dates with yyyyMMdd format in CSV

2022-07-10 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Sadikov updated SPARK-39731: - Description: In Spark 3.x, when reading CSV data like this: {code:java} name,mydate 1,2020011

[jira] [Created] (SPARK-39731) Correctness issue when parsing dates with yyyyMMdd format in CSV

2022-07-10 Thread Ivan Sadikov (Jira)
Ivan Sadikov created SPARK-39731: Summary: Correctness issue when parsing dates with yyyyMMdd format in CSV Key: SPARK-39731 URL: https://issues.apache.org/jira/browse/SPARK-39731 Project: Spark

[jira] [Updated] (SPARK-39099) Add dependencies to Dockerfile for building Spark releases

2022-07-10 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-39099: -- Fix Version/s: 3.2.2 > Add dependencies to Dockerfile for building Spark releases >

[jira] [Updated] (SPARK-37554) Add PyArrow, pandas and plotly to release Docker image dependencies

2022-07-10 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-37554: -- Fix Version/s: 3.2.2 > Add PyArrow, pandas and plotly to release Docker image dependencies >

[jira] [Resolved] (SPARK-39727) Upgrade joda-time from 2.10.13 to 2.10.14

2022-07-10 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-39727. --- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 37143

[jira] [Assigned] (SPARK-39727) Upgrade joda-time from 2.10.13 to 2.10.14

2022-07-10 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-39727: - Assignee: BingKun Pan > Upgrade joda-time from 2.10.13 to 2.10.14 >

[jira] [Commented] (SPARK-39426) Subquery star select creates broken plan in case of self join

2022-07-10 Thread Pablo Langa Blanco (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17564735#comment-17564735 ] Pablo Langa Blanco commented on SPARK-39426: I tested it on master and 3.3.0 and it seems to

[jira] [Created] (SPARK-39730) spark-core: sonatype-2021-1215 & sonatype-2021-1216 vulnerabilities from com.twitter:chill

2022-07-10 Thread Eugene Shinn (Truveta) (Jira)
Eugene Shinn (Truveta) created SPARK-39730: -- Summary: spark-core: sonatype-2021-1215 & sonatype-2021-1216 vulnerabilities from com.twitter:chill Key: SPARK-39730 URL:

[jira] [Commented] (SPARK-37730) plot.hist throws AttributeError on pandas=1.3.5

2022-07-10 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17564734#comment-17564734 ] Apache Spark commented on SPARK-37730: -- User 'dongjoon-hyun' has created a pull request for this

[jira] [Commented] (SPARK-37730) plot.hist throws AttributeError on pandas=1.3.5

2022-07-10 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17564732#comment-17564732 ] Apache Spark commented on SPARK-37730: -- User 'dongjoon-hyun' has created a pull request for this

[jira] [Commented] (SPARK-39623) partitionng by datestamp leads to wrong query on backend?

2022-07-10 Thread Pablo Langa Blanco (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17564730#comment-17564730 ] Pablo Langa Blanco commented on SPARK-39623: I think the problem here is a misunderstanding

[jira] [Updated] (SPARK-37730) plot.hist throws AttributeError on pandas=1.3.5

2022-07-10 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-37730: -- Fix Version/s: 3.2.2 > plot.hist throws AttributeError on pandas=1.3.5 >

[jira] [Commented] (SPARK-37730) plot.hist throws AttributeError on pandas=1.3.5

2022-07-10 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17564727#comment-17564727 ] Dongjoon Hyun commented on SPARK-37730: --- This is backported to branch-3.2 for Apache Spark 3.2.2

[jira] [Resolved] (SPARK-39702) Reduce memory overhead of TransportCipher$EncryptedMessage's byteRawChannel buffer

2022-07-10 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-39702. --- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 37110

[jira] [Commented] (SPARK-39729) Why generate WholeStagecodegen for single operator?

2022-07-10 Thread xiangxiang Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17564676#comment-17564676 ] xiangxiang Shen commented on SPARK-39729: - CC [~tdas] [~dongjoon] Thanks > Why generate

[jira] [Updated] (SPARK-39729) Why generate WholeStagecodegen for single operator?

2022-07-10 Thread xiangxiang Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xiangxiang Shen updated SPARK-39729: Environment: (was: WholeStagecodegen will have better performance in many cases. But

[jira] [Updated] (SPARK-39729) Why generate WholeStagecodegen for single operator?

2022-07-10 Thread xiangxiang Shen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xiangxiang Shen updated SPARK-39729: Description: WholeStagecodegen will have better performance in many cases. But it should

[jira] [Created] (SPARK-39729) Why generate WholeStagecodegen for single operator?

2022-07-10 Thread xiangxiang Shen (Jira)
xiangxiang Shen created SPARK-39729: --- Summary: Why generate WholeStagecodegen for single operator? Key: SPARK-39729 URL: https://issues.apache.org/jira/browse/SPARK-39729 Project: Spark