[jira] [Created] (SPARK-40299) java api calls the count() method to appear: java.lang.ArithmeticException: BigInteger would overflow supported range

2022-08-31 Thread code1v5 (Jira)
code1v5 created SPARK-40299: --- Summary: java api calls the count() method to appear: java.lang.ArithmeticException: BigInteger would overflow supported range Key: SPARK-40299 URL:

[jira] [Commented] (SPARK-40290) Uncatchable exceptions in SparkSession Java API

2022-08-31 Thread Jira
[ https://issues.apache.org/jira/browse/SPARK-40290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598747#comment-17598747 ] Gérald Quintana commented on SPARK-40290: - I agree that if I was using the SparkSession from

[jira] [Assigned] (SPARK-40187) Add doc for using Apache YuniKorn as a customized scheduler

2022-08-31 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-40187: - Assignee: Weiwei Yang > Add doc for using Apache YuniKorn as a customized scheduler >

[jira] [Resolved] (SPARK-40187) Add doc for using Apache YuniKorn as a customized scheduler

2022-08-31 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-40187. --- Fix Version/s: 3.3.1 3.4.0 Resolution: Fixed Issue resolved by

[jira] [Resolved] (SPARK-40285) Simplify the roundTo[Numeric] for Decimal

2022-08-31 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-40285. - Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 37736

[jira] [Assigned] (SPARK-40285) Simplify the roundTo[Numeric] for Decimal

2022-08-31 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-40285: --- Assignee: jiaan.geng > Simplify the roundTo[Numeric] for Decimal >

[jira] [Updated] (SPARK-40298) shuffle data recovery on the reused PVCs no effect

2022-08-31 Thread todd (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] todd updated SPARK-40298: - Description: I use spark3.2.2 to test the [ Support shuffle data recovery on the reused PVCs (SPARK-35593) ]

[jira] [Updated] (SPARK-40298) shuffle data recovery on the reused PVCs no effect

2022-08-31 Thread todd (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] todd updated SPARK-40298: - Attachment: 1662002808396.jpg 1662002822097.jpg > shuffle data recovery on the reused PVCs no

[jira] [Created] (SPARK-40298) shuffle data recovery on the reused PVCs no effect

2022-08-31 Thread todd (Jira)
todd created SPARK-40298: Summary: shuffle data recovery on the reused PVCs no effect Key: SPARK-40298 URL: https://issues.apache.org/jira/browse/SPARK-40298 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-40297) CTE outer reference nested in CTE main body cannot be resolved

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598713#comment-17598713 ] Apache Spark commented on SPARK-40297: -- User 'maryannxue' has created a pull request for this

[jira] [Commented] (SPARK-40297) CTE outer reference nested in CTE main body cannot be resolved

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598712#comment-17598712 ] Apache Spark commented on SPARK-40297: -- User 'maryannxue' has created a pull request for this

[jira] [Assigned] (SPARK-40297) CTE outer reference nested in CTE main body cannot be resolved

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40297: Assignee: (was: Apache Spark) > CTE outer reference nested in CTE main body cannot

[jira] [Assigned] (SPARK-40297) CTE outer reference nested in CTE main body cannot be resolved

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40297: Assignee: Apache Spark > CTE outer reference nested in CTE main body cannot be resolved

[jira] [Created] (SPARK-40297) CTE outer reference nested in CTE main body cannot be resolved

2022-08-31 Thread Wei Xue (Jira)
Wei Xue created SPARK-40297: --- Summary: CTE outer reference nested in CTE main body cannot be resolved Key: SPARK-40297 URL: https://issues.apache.org/jira/browse/SPARK-40297 Project: Spark Issue

[jira] [Resolved] (SPARK-40112) Improve the TO_BINARY() function

2022-08-31 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-40112. - Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 37483

[jira] [Assigned] (SPARK-40112) Improve the TO_BINARY() function

2022-08-31 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-40112: --- Assignee: Vitalii Li > Improve the TO_BINARY() function >

[jira] [Commented] (SPARK-40296) Error Class for DISTINCT function not found

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598695#comment-17598695 ] Apache Spark commented on SPARK-40296: -- User 'amaliujia' has created a pull request for this issue:

[jira] [Assigned] (SPARK-40296) Error Class for DISTINCT function not found

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40296: Assignee: (was: Apache Spark) > Error Class for DISTINCT function not found >

[jira] [Assigned] (SPARK-40296) Error Class for DISTINCT function not found

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40296: Assignee: Apache Spark > Error Class for DISTINCT function not found >

[jira] [Commented] (SPARK-40296) Error Class for DISTINCT function not found

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598694#comment-17598694 ] Apache Spark commented on SPARK-40296: -- User 'amaliujia' has created a pull request for this issue:

[jira] [Created] (SPARK-40296) Error Class for DISTINCT function not found

2022-08-31 Thread Rui Wang (Jira)
Rui Wang created SPARK-40296: Summary: Error Class for DISTINCT function not found Key: SPARK-40296 URL: https://issues.apache.org/jira/browse/SPARK-40296 Project: Spark Issue Type: Task

[jira] [Resolved] (SPARK-40276) reduce the result size of RDD.takeOrdered

2022-08-31 Thread Ruifeng Zheng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruifeng Zheng resolved SPARK-40276. --- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 37728

[jira] [Assigned] (SPARK-40276) reduce the result size of RDD.takeOrdered

2022-08-31 Thread Ruifeng Zheng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruifeng Zheng reassigned SPARK-40276: - Assignee: Ruifeng Zheng > reduce the result size of RDD.takeOrdered >

[jira] [Assigned] (SPARK-40261) DirectTaskResult meta should not be counted into result size

2022-08-31 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen reassigned SPARK-40261: -- Assignee: Ziqi Liu > DirectTaskResult meta should not be counted into result size >

[jira] [Resolved] (SPARK-40261) DirectTaskResult meta should not be counted into result size

2022-08-31 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-40261. Fix Version/s: 3.4.0 Resolution: Fixed Fixed by https://github.com/apache/spark/pull/37713

[jira] [Commented] (SPARK-40262) Expensive UDF evaluation pushed down past a join leads to performance issues

2022-08-31 Thread Erik Krogen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598655#comment-17598655 ] Erik Krogen commented on SPARK-40262: - Good find and thanks for sharing the investigation

[jira] [Assigned] (SPARK-40295) Allow v2 functions with literal args in write distribution and ordering

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40295: Assignee: Apache Spark > Allow v2 functions with literal args in write distribution and

[jira] [Assigned] (SPARK-40295) Allow v2 functions with literal args in write distribution and ordering

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40295: Assignee: (was: Apache Spark) > Allow v2 functions with literal args in write

[jira] [Commented] (SPARK-40295) Allow v2 functions with literal args in write distribution and ordering

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598639#comment-17598639 ] Apache Spark commented on SPARK-40295: -- User 'aokolnychyi' has created a pull request for this

[jira] [Created] (SPARK-40295) Allow v2 functions with literal args in write distribution and ordering

2022-08-31 Thread Anton Okolnychyi (Jira)
Anton Okolnychyi created SPARK-40295: Summary: Allow v2 functions with literal args in write distribution and ordering Key: SPARK-40295 URL: https://issues.apache.org/jira/browse/SPARK-40295

[jira] [Commented] (SPARK-40210) Fix math atan2, hypot, pow and pmod float argument call

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598621#comment-17598621 ] Apache Spark commented on SPARK-40210: -- User 'khalidmammadov' has created a pull request for this

[jira] [Commented] (SPARK-40294) Repeat calls to `PartitionIterator.hasNext` can timeout

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598608#comment-17598608 ] Apache Spark commented on SPARK-40294: -- User 'richardc-db' has created a pull request for this

[jira] [Commented] (SPARK-40294) Repeat calls to `PartitionIterator.hasNext` can timeout

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598609#comment-17598609 ] Apache Spark commented on SPARK-40294: -- User 'richardc-db' has created a pull request for this

[jira] [Assigned] (SPARK-40294) Repeat calls to `PartitionIterator.hasNext` can timeout

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40294: Assignee: (was: Apache Spark) > Repeat calls to `PartitionIterator.hasNext` can

[jira] [Assigned] (SPARK-40294) Repeat calls to `PartitionIterator.hasNext` can timeout

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40294: Assignee: Apache Spark > Repeat calls to `PartitionIterator.hasNext` can timeout >

[jira] [Created] (SPARK-40294) Repeat calls to `PartitionIterator.hasNext` can timeout

2022-08-31 Thread Richard Chen (Jira)
Richard Chen created SPARK-40294: Summary: Repeat calls to `PartitionIterator.hasNext` can timeout Key: SPARK-40294 URL: https://issues.apache.org/jira/browse/SPARK-40294 Project: Spark

[jira] [Commented] (SPARK-31001) Add ability to create a partitioned table via catalog.createTable()

2022-08-31 Thread Kevin Appel (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598582#comment-17598582 ] Kevin Appel commented on SPARK-31001: - I am trying to get SparkR and Sparklyr to work with this and

[jira] [Assigned] (SPARK-40280) Failure to create parquet predicate push down for ints and longs on some valid files

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40280: Assignee: (was: Apache Spark) > Failure to create parquet predicate push down for

[jira] [Assigned] (SPARK-40280) Failure to create parquet predicate push down for ints and longs on some valid files

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40280: Assignee: Apache Spark > Failure to create parquet predicate push down for ints and

[jira] [Commented] (SPARK-40280) Failure to create parquet predicate push down for ints and longs on some valid files

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598577#comment-17598577 ] Apache Spark commented on SPARK-40280: -- User 'revans2' has created a pull request for this issue:

[jira] [Commented] (SPARK-40286) Load Data from S3 deletes data source file

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598574#comment-17598574 ] Sean R. Owen commented on SPARK-40286: -- I could be completely wrong, but then I'd be quite as

[jira] [Commented] (SPARK-40286) Load Data from S3 deletes data source file

2022-08-31 Thread Drew (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598572#comment-17598572 ] Drew commented on SPARK-40286: -- [~srowen] interesting, this is the only information I could find in regards

[jira] [Commented] (SPARK-40286) Load Data from S3 deletes data source file

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598568#comment-17598568 ] Sean R. Owen commented on SPARK-40286: -- No, LOAD DATA does not delete source data. I'm not sure

[jira] [Commented] (SPARK-40233) Unable to load large pandas dataframe to pyspark

2022-08-31 Thread Niranda Perera (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598566#comment-17598566 ] Niranda Perera commented on SPARK-40233: Well, the driver actually hangs after throwing that

[jira] [Commented] (SPARK-39895) pyspark drop doesn't accept *cols

2022-08-31 Thread Santosh Pingale (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598559#comment-17598559 ] Santosh Pingale commented on SPARK-39895: - I am not sure I understand your confusion. > pyspark

[jira] [Commented] (SPARK-40286) Load Data from S3 deletes data source file

2022-08-31 Thread Drew (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598556#comment-17598556 ] Drew commented on SPARK-40286: -- [~srowen] I see, the table is located in s3 in another bucket of mine. So

[jira] [Assigned] (SPARK-40293) Make the V2 table error message more meaningful

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40293: Assignee: (was: Apache Spark) > Make the V2 table error message more meaningful >

[jira] [Assigned] (SPARK-40293) Make the V2 table error message more meaningful

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40293: Assignee: Apache Spark > Make the V2 table error message more meaningful >

[jira] [Commented] (SPARK-40293) Make the V2 table error message more meaningful

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598552#comment-17598552 ] Apache Spark commented on SPARK-40293: -- User 'huaxingao' has created a pull request for this issue:

[jira] [Created] (SPARK-40293) Make the V2 table error message more meaningful

2022-08-31 Thread Huaxin Gao (Jira)
Huaxin Gao created SPARK-40293: -- Summary: Make the V2 table error message more meaningful Key: SPARK-40293 URL: https://issues.apache.org/jira/browse/SPARK-40293 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-39916) Merge SchemaUtils from mlib to SQL

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen resolved SPARK-39916. -- Resolution: Won't Fix > Merge SchemaUtils from mlib to SQL >

[jira] [Resolved] (SPARK-39269) spark3.2.0 commit tmp file is not found when rename

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen resolved SPARK-39269. -- Resolution: Invalid > spark3.2.0 commit tmp file is not found when rename >

[jira] [Commented] (SPARK-40233) Unable to load large pandas dataframe to pyspark

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598550#comment-17598550 ] Sean R. Owen commented on SPARK-40233: -- That's what happens, right? Spark is of course meant to

[jira] [Commented] (SPARK-40233) Unable to load large pandas dataframe to pyspark

2022-08-31 Thread Niranda Perera (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598546#comment-17598546 ] Niranda Perera commented on SPARK-40233: [~srowen] shouldn't spark driver program terminate/

[jira] [Commented] (SPARK-39895) pyspark drop doesn't accept *cols

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598545#comment-17598545 ] Sean R. Owen commented on SPARK-39895: -- Not a big deal, but the example doesn't make sense to me.

[jira] [Commented] (SPARK-33605) Add GCS FS/connector config (dependencies?) akin to S3

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598544#comment-17598544 ] Apache Spark commented on SPARK-33605: -- User 'dongjoon-hyun' has created a pull request for this

[jira] [Assigned] (SPARK-33605) Add GCS FS/connector config (dependencies?) akin to S3

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-33605: Assignee: (was: Apache Spark) > Add GCS FS/connector config (dependencies?) akin to

[jira] [Assigned] (SPARK-33605) Add GCS FS/connector config (dependencies?) akin to S3

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-33605: Assignee: Apache Spark > Add GCS FS/connector config (dependencies?) akin to S3 >

[jira] [Commented] (SPARK-33605) Add GCS FS/connector config (dependencies?) akin to S3

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598543#comment-17598543 ] Apache Spark commented on SPARK-33605: -- User 'dongjoon-hyun' has created a pull request for this

[jira] [Commented] (SPARK-40286) Load Data from S3 deletes data source file

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598542#comment-17598542 ] Sean R. Owen commented on SPARK-40286: -- Where is src stored? LOAD DATA should not affect the

[jira] [Commented] (SPARK-39948) exclude velocity 1.5 jar

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598540#comment-17598540 ] Sean R. Owen commented on SPARK-39948: -- Do any of them affect Spark? > exclude velocity 1.5 jar >

[jira] [Commented] (SPARK-39995) PySpark installation doesn't support Scala 2.13 binaries

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598539#comment-17598539 ] Sean R. Owen commented on SPARK-39995: -- Would scala version generally matter to python users who

[jira] [Resolved] (SPARK-40023) Issue with Spark Core version 3.3.0

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen resolved SPARK-40023. -- Resolution: Invalid > Issue with Spark Core version 3.3.0 >

[jira] [Commented] (SPARK-40286) Load Data from S3 deletes data source file

2022-08-31 Thread Drew (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598538#comment-17598538 ] Drew commented on SPARK-40286: -- Hi [~srowen], In this case, before loading data into the table from my

[jira] (SPARK-40286) Load Data from S3 deletes data source file

2022-08-31 Thread Drew (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40286 ] Drew deleted comment on SPARK-40286: -- was (Author: JIRAUSER295165): In this case, before loading data into the table from my bucket in S3 has `kv1.txt`. Then, when I run the code block above, the

[jira] [Resolved] (SPARK-40122) py4j-0.10.9.5 often produces "Connection reset by peer" in Spark 3.3.0

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen resolved SPARK-40122. -- Resolution: Invalid This itself doesn't mean anything - means the Python process died. It'd

[jira] [Commented] (SPARK-40123) Security Vulnerability CVE-2018-11793 due to mesos-1.4.3-shaded-protobuf.jar

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598536#comment-17598536 ] Sean R. Owen commented on SPARK-40123: -- Mesos is deprecated, but, if you want you can open a PR to

[jira] [Commented] (SPARK-40286) Load Data from S3 deletes data source file

2022-08-31 Thread Drew (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598535#comment-17598535 ] Drew commented on SPARK-40286: -- In this case, before loading data into the table from my bucket in S3 has

[jira] [Resolved] (SPARK-40126) Security scanning spark v3.3.0 docker image results in DSA-5169-1 critical vulnerability

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen resolved SPARK-40126. -- Resolution: Invalid > Security scanning spark v3.3.0 docker image results in DSA-5169-1

[jira] [Commented] (SPARK-40126) Security scanning spark v3.3.0 docker image results in DSA-5169-1 critical vulnerability

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598534#comment-17598534 ] Sean R. Owen commented on SPARK-40126: -- This isn't part of Spark. You're looking at some

[jira] [Resolved] (SPARK-40170) StringCoding UTF8 decode slowly

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen resolved SPARK-40170. -- Resolution: Invalid > StringCoding UTF8 decode slowly > --- > >

[jira] [Commented] (SPARK-40200) unpersist cascades with Kryo, MEMORY_AND_DISK_SER and monotonically_increasing_id

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598532#comment-17598532 ] Sean R. Owen commented on SPARK-40200: -- I can't make out what this is reporting, please start over

[jira] [Resolved] (SPARK-40200) unpersist cascades with Kryo, MEMORY_AND_DISK_SER and monotonically_increasing_id

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen resolved SPARK-40200. -- Resolution: Invalid > unpersist cascades with Kryo, MEMORY_AND_DISK_SER and >

[jira] [Updated] (SPARK-40292) arrays_zip output unexpected alias column names

2022-08-31 Thread Linhong Liu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Linhong Liu updated SPARK-40292: Description: For the below query: {code:sql} with q as ( select named_struct(

[jira] [Updated] (SPARK-40292) arrays_zip output unexpected alias column names

2022-08-31 Thread Linhong Liu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Linhong Liu updated SPARK-40292: Description: For the below query: {code:sql} with q as ( select named_struct(

[jira] [Resolved] (SPARK-40232) KMeans: high variability in results despite high initSteps parameter value

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen resolved SPARK-40232. -- Resolution: Not A Problem No, initSteps controls an aspect of the initialization. I don't

[jira] [Created] (SPARK-40292) arrays_zip output unexpected alias column names

2022-08-31 Thread Linhong Liu (Jira)
Linhong Liu created SPARK-40292: --- Summary: arrays_zip output unexpected alias column names Key: SPARK-40292 URL: https://issues.apache.org/jira/browse/SPARK-40292 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-40233) Unable to load large pandas dataframe to pyspark

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen resolved SPARK-40233. -- Resolution: Not A Problem > Unable to load large pandas dataframe to pyspark >

[jira] [Commented] (SPARK-40233) Unable to load large pandas dataframe to pyspark

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598524#comment-17598524 ] Sean R. Owen commented on SPARK-40233: -- This is more a problem with trying to send a huge amount of

[jira] [Updated] (SPARK-40237) Can't get JDBC type for map in Spark 3.3.0 and PostgreSQL

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen updated SPARK-40237: - Issue Type: Improvement (was: Bug) Priority: Minor (was: Major) > Can't get JDBC type

[jira] [Commented] (SPARK-40291) Improve the message for column not in group by clause error

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598523#comment-17598523 ] Apache Spark commented on SPARK-40291: -- User 'linhongliu-db' has created a pull request for this

[jira] [Assigned] (SPARK-40291) Improve the message for column not in group by clause error

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40291: Assignee: Apache Spark > Improve the message for column not in group by clause error >

[jira] [Commented] (SPARK-40291) Improve the message for column not in group by clause error

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598521#comment-17598521 ] Apache Spark commented on SPARK-40291: -- User 'linhongliu-db' has created a pull request for this

[jira] [Assigned] (SPARK-40291) Improve the message for column not in group by clause error

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40291: Assignee: (was: Apache Spark) > Improve the message for column not in group by

[jira] [Commented] (SPARK-40274) ArrayIndexOutOfBoundsException in BytecodeReadingParanamer

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598522#comment-17598522 ] Sean R. Owen commented on SPARK-40274: -- Yes, it is at least not clear it's due to you using a

[jira] [Created] (SPARK-40291) Improve the message for column not in group by clause error

2022-08-31 Thread Linhong Liu (Jira)
Linhong Liu created SPARK-40291: --- Summary: Improve the message for column not in group by clause error Key: SPARK-40291 URL: https://issues.apache.org/jira/browse/SPARK-40291 Project: Spark

[jira] [Resolved] (SPARK-40277) Use DataFrame's column for referring to DDL schema for from_csv() and from_json()

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen resolved SPARK-40277. -- Resolution: Invalid This doesn't state any problem or specific change > Use DataFrame's

[jira] [Resolved] (SPARK-40282) DataType argument in StructType.add is incorrectly throwing scala.MatchError

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen resolved SPARK-40282. -- Resolution: Not A Problem > DataType argument in StructType.add is incorrectly throwing

[jira] [Commented] (SPARK-40282) DataType argument in StructType.add is incorrectly throwing scala.MatchError

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598515#comment-17598515 ] Sean R. Owen commented on SPARK-40282: -- Try just IntegerType (no parens) as in Scala; otherwise

[jira] [Commented] (SPARK-40284) spark concurrent overwrite mode writes data to files in HDFS format, all request data write success

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598511#comment-17598511 ] Sean R. Owen commented on SPARK-40284: -- You have a race condition where two requests try to delete

[jira] [Updated] (SPARK-40285) Simplify the roundTo[Numeric] for Decimal

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen updated SPARK-40285: - Priority: Minor (was: Major) > Simplify the roundTo[Numeric] for Decimal >

[jira] [Commented] (SPARK-40286) Load Data from S3 deletes data source file

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598510#comment-17598510 ] Sean R. Owen commented on SPARK-40286: -- There is no delete here. Why do you think Spark is deleting

[jira] [Resolved] (SPARK-40290) Uncatchable exceptions in SparkSession Java API

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen resolved SPARK-40290. -- Resolution: Won't Fix > Uncatchable exceptions in SparkSession Java API >

[jira] [Commented] (SPARK-40290) Uncatchable exceptions in SparkSession Java API

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598506#comment-17598506 ] Sean R. Owen commented on SPARK-40290: -- It doesn't make sense to consider it a RuntimeException.

[jira] [Resolved] (SPARK-39708) ALS Model Loading

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen resolved SPARK-39708. -- Resolution: Not A Problem > ALS Model Loading > - > > Key:

[jira] [Assigned] (SPARK-40219) resolved view plan should hold the schema to avoid redundant lookup

2022-08-31 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-40219: --- Assignee: Wenchen Fan > resolved view plan should hold the schema to avoid redundant

[jira] [Resolved] (SPARK-40219) resolved view plan should hold the schema to avoid redundant lookup

2022-08-31 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-40219. - Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 37658

[jira] [Resolved] (SPARK-40040) Push local limit to both sides if join condition is empty

2022-08-31 Thread Yuming Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang resolved SPARK-40040. - Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 37475

[jira] [Assigned] (SPARK-40040) Push local limit to both sides if join condition is empty

2022-08-31 Thread Yuming Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang reassigned SPARK-40040: --- Assignee: Yuming Wang > Push local limit to both sides if join condition is empty >

[jira] [Commented] (SPARK-31001) Add ability to create a partitioned table via catalog.createTable()

2022-08-31 Thread Nicholas Chammas (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598403#comment-17598403 ] Nicholas Chammas commented on SPARK-31001: -- Thanks for sharing these details. This is very
