[jira] [Assigned] (SPARK-40285) Simplify the roundTo[Numeric] for Decimal

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40285: Assignee: (was: Apache Spark) > Simplify the roundTo[Numeric] for Decimal >

[jira] [Commented] (SPARK-40285) Simplify the roundTo[Numeric] for Decimal

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598204#comment-17598204 ] Apache Spark commented on SPARK-40285: -- User 'beliefer' has created a pull request for this issue:

[jira] [Assigned] (SPARK-40285) Simplify the roundTo[Numeric] for Decimal

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40285: Assignee: Apache Spark > Simplify the roundTo[Numeric] for Decimal >

[jira] [Created] (SPARK-40289) The result is strange when casting string to date in ORC reading via Schema Evolution

2022-08-31 Thread Jianbang Xian (Jira)
Jianbang Xian created SPARK-40289: - Summary: The result is strange when casting string to date in ORC reading via Schema Evolution Key: SPARK-40289 URL: https://issues.apache.org/jira/browse/SPARK-40289

[jira] [Commented] (SPARK-40265) Fix the inconsistent behavior for Index.intersection.

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598262#comment-17598262 ] Apache Spark commented on SPARK-40265: -- User 'itholic' has created a pull request for this issue:

[jira] [Assigned] (SPARK-40265) Fix the inconsistent behavior for Index.intersection.

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40265: Assignee: Apache Spark > Fix the inconsistent behavior for Index.intersection. >

[jira] [Assigned] (SPARK-40265) Fix the inconsistent behavior for Index.intersection.

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40265: Assignee: (was: Apache Spark) > Fix the inconsistent behavior for

[jira] [Created] (SPARK-40287) Load Data using Spark by a single partition moves entire dataset under same location in S3

2022-08-31 Thread Drew (Jira)
Drew created SPARK-40287: Summary: Load Data using Spark by a single partition moves entire dataset under same location in S3 Key: SPARK-40287 URL: https://issues.apache.org/jira/browse/SPARK-40287 Project:

[jira] [Updated] (SPARK-40289) The result is strange when casting string to date in ORC reading via Schema Evolution

2022-08-31 Thread Jianbang Xian (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jianbang Xian updated SPARK-40289: -- Description: I created an ORC file by the code as follows. {code:java} val data = Seq(

[jira] [Updated] (SPARK-40286) Load Data from S3 deletes data source file

2022-08-31 Thread Drew (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Drew updated SPARK-40286: - Description: Hello, I'm using spark to [load

[jira] [Updated] (SPARK-40284) spark concurrent overwrite mode writes data to files in HDFS format, all request data write success

2022-08-31 Thread Liu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu updated SPARK-40284: Description: We use Spark as a service. The same Spark service needs to handle multiple requests, but I have a

[jira] [Created] (SPARK-40288) After `RemoveRedundantAggregates`, `PullOutGroupingExpressions` should applied to avoid attribute missing when use complex expression.

2022-08-31 Thread hgs (Jira)
hgs created SPARK-40288: --- Summary: After `RemoveRedundantAggregates`, `PullOutGroupingExpressions` should applied to avoid attribute missing when use complex expression. Key: SPARK-40288 URL:

[jira] [Commented] (SPARK-40055) listCatalogs should also return spark_catalog even spark_catalog implementation is defaultSessionCatalog

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598218#comment-17598218 ] Apache Spark commented on SPARK-40055: -- User 'cloud-fan' has created a pull request for this issue:

[jira] [Assigned] (SPARK-40194) SPLIT function on empty regex should truncate trailing empty string.

2022-08-31 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-40194: --- Assignee: Vitalii Li > SPLIT function on empty regex should truncate trailing empty

[jira] [Resolved] (SPARK-40194) SPLIT function on empty regex should truncate trailing empty string.

2022-08-31 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-40194. - Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 37631

[jira] [Updated] (SPARK-40286) Load Data from S3 deletes data source file

2022-08-31 Thread Drew (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Drew updated SPARK-40286: - Priority: Major (was: Trivial) > Load Data from S3 deletes data source file >

[jira] [Updated] (SPARK-40288) After `RemoveRedundantAggregates`, `PullOutGroupingExpressions` should applied to avoid attribute missing when use complex expression.

2022-08-31 Thread hgs (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hgs updated SPARK-40288: Description: --table create table miss_expr(id int,name string,age

[jira] [Updated] (SPARK-40289) The result is strange when casting string to date in ORC reading via Schema Evolution

2022-08-31 Thread Jianbang Xian (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jianbang Xian updated SPARK-40289: -- Description: I created an ORC file by the code as follows. {code:java} val data = Seq(

[jira] [Created] (SPARK-40286) Load Data from S3 deletes data source file

2022-08-31 Thread Drew (Jira)
Drew created SPARK-40286: Summary: Load Data from S3 deletes data source file Key: SPARK-40286 URL: https://issues.apache.org/jira/browse/SPARK-40286 Project: Spark Issue Type: Question

[jira] [Commented] (SPARK-40055) listCatalogs should also return spark_catalog even spark_catalog implementation is defaultSessionCatalog

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598217#comment-17598217 ] Apache Spark commented on SPARK-40055: -- User 'cloud-fan' has created a pull request for this issue:

[jira] [Created] (SPARK-40299) java api calls the count() method to appear: java.lang.ArithmeticException: BigInteger would overflow supported range

2022-08-31 Thread code1v5 (Jira)
code1v5 created SPARK-40299: --- Summary: java api calls the count() method to appear: java.lang.ArithmeticException: BigInteger would overflow supported range Key: SPARK-40299 URL:

[jira] [Commented] (SPARK-40290) Uncatchable exceptions in SparkSession Java API

2022-08-31 Thread Gérald Quintana (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598747#comment-17598747 ] Gérald Quintana commented on SPARK-40290: - I agree that if I was using the SparkSession from

[jira] [Resolved] (SPARK-40187) Add doc for using Apache YuniKorn as a customized scheduler

2022-08-31 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-40187. --- Fix Version/s: 3.3.1 3.4.0 Resolution: Fixed Issue resolved by

[jira] [Assigned] (SPARK-40187) Add doc for using Apache YuniKorn as a customized scheduler

2022-08-31 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-40187: - Assignee: Weiwei Yang > Add doc for using Apache YuniKorn as a customized scheduler >

[jira] [Created] (SPARK-40290) Uncatchable exceptions in SparkSession Java API

2022-08-31 Thread Gérald Quintana (Jira)
Gérald Quintana created SPARK-40290: --- Summary: Uncatchable exceptions in SparkSession Java API Key: SPARK-40290 URL: https://issues.apache.org/jira/browse/SPARK-40290 Project: Spark Issue

[jira] [Commented] (SPARK-40283) Update mima's previousSparkVersion to 3.3.0

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598332#comment-17598332 ] Apache Spark commented on SPARK-40283: -- User 'LuciferYang' has created a pull request for this

[jira] [Updated] (SPARK-40288) After `RemoveRedundantAggregates`, `PullOutGroupingExpressions` should applied to avoid attribute missing when use complex expression.

2022-08-31 Thread hgs (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hgs updated SPARK-40288: Affects Version/s: (was: 3.0.0) > After `RemoveRedundantAggregates`, `PullOutGroupingExpressions` should >

[jira] [Assigned] (SPARK-40283) Update mima's previousSparkVersion to 3.3.0

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40283: Assignee: Apache Spark > Update mima's previousSparkVersion to 3.3.0 >

[jira] [Assigned] (SPARK-40283) Update mima's previousSparkVersion to 3.3.0

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40283: Assignee: (was: Apache Spark) > Update mima's previousSparkVersion to 3.3.0 >

[jira] [Commented] (SPARK-40283) Update mima's previousSparkVersion to 3.3.0

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598330#comment-17598330 ] Apache Spark commented on SPARK-40283: -- User 'LuciferYang' has created a pull request for this

[jira] [Assigned] (SPARK-40219) resolved view plan should hold the schema to avoid redundant lookup

2022-08-31 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-40219: --- Assignee: Wenchen Fan > resolved view plan should hold the schema to avoid redundant

[jira] [Resolved] (SPARK-40219) resolved view plan should hold the schema to avoid redundant lookup

2022-08-31 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-40219. - Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 37658

[jira] [Resolved] (SPARK-40040) Push local limit to both sides if join condition is empty

2022-08-31 Thread Yuming Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang resolved SPARK-40040. - Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 37475

[jira] [Commented] (SPARK-31001) Add ability to create a partitioned table via catalog.createTable()

2022-08-31 Thread Kevin Appel (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598397#comment-17598397 ] Kevin Appel commented on SPARK-31001: - It is defined here:

[jira] [Comment Edited] (SPARK-31001) Add ability to create a partitioned table via catalog.createTable()

2022-08-31 Thread Kevin Appel (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598397#comment-17598397 ] Kevin Appel edited comment on SPARK-31001 at 8/31/22 2:14 PM: -- It is

[jira] [Commented] (SPARK-31001) Add ability to create a partitioned table via catalog.createTable()

2022-08-31 Thread Nicholas Chammas (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598403#comment-17598403 ] Nicholas Chammas commented on SPARK-31001: -- Thanks for sharing these details. This is very

[jira] (SPARK-38330) Certificate doesn't match any of the subject alternative names: [*.s3.amazonaws.com, s3.amazonaws.com]

2022-08-31 Thread comet (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38330 ] comet deleted comment on SPARK-38330: --- was (Author: JIRAUSER295079): Any update on this ticket? Has anyone tested this on the latest version of Hadoop? I tested but still get the same error >

[jira] [Assigned] (SPARK-40040) Push local limit to both sides if join condition is empty

2022-08-31 Thread Yuming Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang reassigned SPARK-40040: --- Assignee: Yuming Wang > Push local limit to both sides if join condition is empty >

[jira] [Commented] (SPARK-40284) spark concurrent overwrite mode writes data to files in HDFS format, all request data write success

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598511#comment-17598511 ] Sean R. Owen commented on SPARK-40284: -- You have a race condition where two requests try to delete

[jira] [Created] (SPARK-40291) Improve the message for column not in group by clause error

2022-08-31 Thread Linhong Liu (Jira)
Linhong Liu created SPARK-40291: --- Summary: Improve the message for column not in group by clause error Key: SPARK-40291 URL: https://issues.apache.org/jira/browse/SPARK-40291 Project: Spark

[jira] [Assigned] (SPARK-40291) Improve the message for column not in group by clause error

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40291: Assignee: (was: Apache Spark) > Improve the message for column not in group by

[jira] [Commented] (SPARK-40274) ArrayIndexOutOfBoundsException in BytecodeReadingParanamer

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598522#comment-17598522 ] Sean R. Owen commented on SPARK-40274: -- Yes, it is at least not clear it's due to you using a

[jira] [Commented] (SPARK-40291) Improve the message for column not in group by clause error

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598521#comment-17598521 ] Apache Spark commented on SPARK-40291: -- User 'linhongliu-db' has created a pull request for this

[jira] [Assigned] (SPARK-40291) Improve the message for column not in group by clause error

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40291: Assignee: Apache Spark > Improve the message for column not in group by clause error >

[jira] [Created] (SPARK-40292) arrays_zip output unexpected alias column names

2022-08-31 Thread Linhong Liu (Jira)
Linhong Liu created SPARK-40292: --- Summary: arrays_zip output unexpected alias column names Key: SPARK-40292 URL: https://issues.apache.org/jira/browse/SPARK-40292 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-40126) Security scanning spark v3.3.0 docker image results in DSA-5169-1 critical vulnerability

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen resolved SPARK-40126. -- Resolution: Invalid > Security scanning spark v3.3.0 docker image results in DSA-5169-1

[jira] [Commented] (SPARK-40126) Security scanning spark v3.3.0 docker image results in DSA-5169-1 critical vulnerability

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598534#comment-17598534 ] Sean R. Owen commented on SPARK-40126: -- This isn't part of Spark. You're looking at some

[jira] [Resolved] (SPARK-40023) Issue with Spark Core version 3.3.0

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen resolved SPARK-40023. -- Resolution: Invalid > Issue with Spark Core version 3.3.0 >

[jira] [Commented] (SPARK-39895) pyspark drop doesn't accept *cols

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598545#comment-17598545 ] Sean R. Owen commented on SPARK-39895: -- Not a big deal, but the example doesn't make sense to me.

[jira] [Commented] (SPARK-40233) Unable to load large pandas dataframe to pyspark

2022-08-31 Thread Niranda Perera (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598546#comment-17598546 ] Niranda Perera commented on SPARK-40233: [~srowen] shouldn't spark driver program terminate/

[jira] [Resolved] (SPARK-39916) Merge SchemaUtils from mlib to SQL

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen resolved SPARK-39916. -- Resolution: Won't Fix > Merge SchemaUtils from mlib to SQL >

[jira] [Resolved] (SPARK-39708) ALS Model Loading

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen resolved SPARK-39708. -- Resolution: Not A Problem > ALS Model Loading > - > > Key:

[jira] [Updated] (SPARK-40237) Can't get JDBC type for map in Spark 3.3.0 and PostgreSQL

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen updated SPARK-40237: - Issue Type: Improvement (was: Bug) Priority: Minor (was: Major) > Can't get JDBC type

[jira] [Resolved] (SPARK-40232) KMeans: high variability in results despite high initSteps parameter value

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen resolved SPARK-40232. -- Resolution: Not A Problem No, initSteps controls an aspect of the initialization. I don't

[jira] [Updated] (SPARK-40292) arrays_zip output unexpected alias column names

2022-08-31 Thread Linhong Liu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Linhong Liu updated SPARK-40292: Description: For the below query: {code:sql} with q as ( select named_struct(

[jira] [Commented] (SPARK-40200) unpersist cascades with Kryo, MEMORY_AND_DISK_SER and monotonically_increasing_id

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598532#comment-17598532 ] Sean R. Owen commented on SPARK-40200: -- I can't make out what this is reporting, please start over

[jira] [Resolved] (SPARK-40200) unpersist cascades with Kryo, MEMORY_AND_DISK_SER and monotonically_increasing_id

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen resolved SPARK-40200. -- Resolution: Invalid > unpersist cascades with Kryo, MEMORY_AND_DISK_SER and >

[jira] [Commented] (SPARK-40123) Security Vulnerability CVE-2018-11793 due to mesos-1.4.3-shaded-protobuf.jar

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598536#comment-17598536 ] Sean R. Owen commented on SPARK-40123: -- Mesos is deprecated, but, if you want you can open a PR to

[jira] [Commented] (SPARK-40286) Load Data from S3 deletes data source file

2022-08-31 Thread Drew (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598535#comment-17598535 ] Drew commented on SPARK-40286: -- In this case, before loading data into the table from my bucket in S3 has

[jira] [Commented] (SPARK-40286) Load Data from S3 deletes data source file

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598542#comment-17598542 ] Sean R. Owen commented on SPARK-40286: -- Where is src stored? LOAD DATA should not affect the

[jira] [Commented] (SPARK-33605) Add GCS FS/connector config (dependencies?) akin to S3

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598543#comment-17598543 ] Apache Spark commented on SPARK-33605: -- User 'dongjoon-hyun' has created a pull request for this

[jira] [Assigned] (SPARK-33605) Add GCS FS/connector config (dependencies?) akin to S3

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-33605: Assignee: (was: Apache Spark) > Add GCS FS/connector config (dependencies?) akin to

[jira] [Assigned] (SPARK-33605) Add GCS FS/connector config (dependencies?) akin to S3

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-33605: Assignee: Apache Spark > Add GCS FS/connector config (dependencies?) akin to S3 >

[jira] [Resolved] (SPARK-40290) Uncatchable exceptions in SparkSession Java API

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen resolved SPARK-40290. -- Resolution: Won't Fix > Uncatchable exceptions in SparkSession Java API >

[jira] [Commented] (SPARK-40291) Improve the message for column not in group by clause error

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598523#comment-17598523 ] Apache Spark commented on SPARK-40291: -- User 'linhongliu-db' has created a pull request for this

[jira] [Commented] (SPARK-40233) Unable to load large pandas dataframe to pyspark

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598524#comment-17598524 ] Sean R. Owen commented on SPARK-40233: -- This is more a problem with trying to send a huge amount of

[jira] [Resolved] (SPARK-40233) Unable to load large pandas dataframe to pyspark

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen resolved SPARK-40233. -- Resolution: Not A Problem > Unable to load large pandas dataframe to pyspark >

[jira] [Updated] (SPARK-40292) arrays_zip output unexpected alias column names

2022-08-31 Thread Linhong Liu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Linhong Liu updated SPARK-40292: Description: For the below query: {code:sql} with q as ( select named_struct(

[jira] [Resolved] (SPARK-40122) py4j-0.10.9.5 often produces "Connection reset by peer" in Spark 3.3.0

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen resolved SPARK-40122. -- Resolution: Invalid This itself doesn't mean anything - means the Python process died. It'd

[jira] (SPARK-40286) Load Data from S3 deletes data source file

2022-08-31 Thread Drew (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40286 ] Drew deleted comment on SPARK-40286: -- was (Author: JIRAUSER295165): In this case, before loading data into the table from my bucket in S3 has `kv1.txt`. Then, when I run the code block above, the

[jira] [Commented] (SPARK-40286) Load Data from S3 deletes data source file

2022-08-31 Thread Drew (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598538#comment-17598538 ] Drew commented on SPARK-40286: -- Hi [~srowen], In this case, before loading data into the table from my

[jira] [Commented] (SPARK-33605) Add GCS FS/connector config (dependencies?) akin to S3

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598544#comment-17598544 ] Apache Spark commented on SPARK-33605: -- User 'dongjoon-hyun' has created a pull request for this

[jira] [Resolved] (SPARK-39269) spark3.2.0 commit tmp file is not found when rename

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen resolved SPARK-39269. -- Resolution: Invalid > spark3.2.0 commit tmp file is not found when rename >

[jira] [Commented] (SPARK-40290) Uncatchable exceptions in SparkSession Java API

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598506#comment-17598506 ] Sean R. Owen commented on SPARK-40290: -- It doesn't make sense to consider it a RuntimeException.

[jira] [Commented] (SPARK-40286) Load Data from S3 deletes data source file

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598510#comment-17598510 ] Sean R. Owen commented on SPARK-40286: -- There is no delete here. Why do you think Spark is deleting

[jira] [Updated] (SPARK-40285) Simplify the roundTo[Numeric] for Decimal

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen updated SPARK-40285: - Priority: Minor (was: Major) > Simplify the roundTo[Numeric] for Decimal >

[jira] [Resolved] (SPARK-40282) DataType argument in StructType.add is incorrectly throwing scala.MatchError

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen resolved SPARK-40282. -- Resolution: Not A Problem > DataType argument in StructType.add is incorrectly throwing

[jira] [Resolved] (SPARK-40277) Use DataFrame's column for referring to DDL schema for from_csv() and from_json()

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen resolved SPARK-40277. -- Resolution: Invalid This doesn't state any problem or specific change > Use DataFrame's

[jira] [Commented] (SPARK-40282) DataType argument in StructType.add is incorrectly throwing scala.MatchError

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598515#comment-17598515 ] Sean R. Owen commented on SPARK-40282: -- Try just IntegerType (no parens) as in Scala; otherwise

[jira] [Resolved] (SPARK-40170) StringCoding UTF8 decode slowly

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen resolved SPARK-40170. -- Resolution: Invalid > StringCoding UTF8 decode slowly > --- > >

[jira] [Commented] (SPARK-39995) PySpark installation doesn't support Scala 2.13 binaries

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598539#comment-17598539 ] Sean R. Owen commented on SPARK-39995: -- Would scala version generally matter to python users who

[jira] [Commented] (SPARK-39948) exclude velocity 1.5 jar

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598540#comment-17598540 ] Sean R. Owen commented on SPARK-39948: -- Do any of them affect Spark? > exclude velocity 1.5 jar >

[jira] [Commented] (SPARK-40233) Unable to load large pandas dataframe to pyspark

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598550#comment-17598550 ] Sean R. Owen commented on SPARK-40233: -- That's what happens, right? Spark is of course meant to

[jira] [Created] (SPARK-40293) Make the V2 table error message more meaningful

2022-08-31 Thread Huaxin Gao (Jira)
Huaxin Gao created SPARK-40293: -- Summary: Make the V2 table error message more meaningful Key: SPARK-40293 URL: https://issues.apache.org/jira/browse/SPARK-40293 Project: Spark Issue Type:

[jira] [Commented] (SPARK-40286) Load Data from S3 deletes data source file

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598568#comment-17598568 ] Sean R. Owen commented on SPARK-40286: -- No, LOAD DATA does not delete source data. I'm not sure

[jira] [Created] (SPARK-40294) Repeat calls to `PartitionIterator.hasNext` can timeout

2022-08-31 Thread Richard Chen (Jira)
Richard Chen created SPARK-40294: Summary: Repeat calls to `PartitionIterator.hasNext` can timeout Key: SPARK-40294 URL: https://issues.apache.org/jira/browse/SPARK-40294 Project: Spark

[jira] [Commented] (SPARK-40210) Fix math atan2, hypot, pow and pmod float argument call

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598621#comment-17598621 ] Apache Spark commented on SPARK-40210: -- User 'khalidmammadov' has created a pull request for this

[jira] [Created] (SPARK-40295) Allow v2 functions with literal args in write distribution and ordering

2022-08-31 Thread Anton Okolnychyi (Jira)
Anton Okolnychyi created SPARK-40295: Summary: Allow v2 functions with literal args in write distribution and ordering Key: SPARK-40295 URL: https://issues.apache.org/jira/browse/SPARK-40295

[jira] [Commented] (SPARK-40293) Make the V2 table error message more meaningful

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598552#comment-17598552 ] Apache Spark commented on SPARK-40293: -- User 'huaxingao' has created a pull request for this issue:

[jira] [Assigned] (SPARK-40293) Make the V2 table error message more meaningful

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40293: Assignee: (was: Apache Spark) > Make the V2 table error message more meaningful >

[jira] [Assigned] (SPARK-40293) Make the V2 table error message more meaningful

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40293: Assignee: Apache Spark > Make the V2 table error message more meaningful >

[jira] [Commented] (SPARK-40286) Load Data from S3 deletes data source file

2022-08-31 Thread Drew (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598572#comment-17598572 ] Drew commented on SPARK-40286: -- [~srowen] interesting, this is the only information I could find in regards

[jira] [Assigned] (SPARK-40294) Repeat calls to `PartitionIterator.hasNext` can timeout

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40294: Assignee: (was: Apache Spark) > Repeat calls to `PartitionIterator.hasNext` can

[jira] [Commented] (SPARK-40294) Repeat calls to `PartitionIterator.hasNext` can timeout

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598609#comment-17598609 ] Apache Spark commented on SPARK-40294: -- User 'richardc-db' has created a pull request for this

[jira] [Assigned] (SPARK-40294) Repeat calls to `PartitionIterator.hasNext` can timeout

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-40294: Assignee: Apache Spark > Repeat calls to `PartitionIterator.hasNext` can timeout >

[jira] [Commented] (SPARK-40294) Repeat calls to `PartitionIterator.hasNext` can timeout

2022-08-31 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598608#comment-17598608 ] Apache Spark commented on SPARK-40294: -- User 'richardc-db' has created a pull request for this

[jira] [Commented] (SPARK-40286) Load Data from S3 deletes data source file

2022-08-31 Thread Drew (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598556#comment-17598556 ] Drew commented on SPARK-40286: -- [~srowen] I see, the table is located in s3 in another bucket of mine. So

[jira] [Commented] (SPARK-39895) pyspark drop doesn't accept *cols

2022-08-31 Thread Santosh Pingale (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598559#comment-17598559 ] Santosh Pingale commented on SPARK-39895: - I am not sure I understand your confusion. > pyspark

[jira] [Commented] (SPARK-40233) Unable to load large pandas dataframe to pyspark

2022-08-31 Thread Niranda Perera (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598566#comment-17598566 ] Niranda Perera commented on SPARK-40233: Well, the driver actually hangs after throwing that

[jira] [Commented] (SPARK-40286) Load Data from S3 deletes data source file

2022-08-31 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598574#comment-17598574 ] Sean R. Owen commented on SPARK-40286: -- I could be completely wrong, but then I'd be quite as
