[jira] [Resolved] (SPARK-42225) Add `SparkConnectIllegalArgumentException` to handle Spark Connect error precisely.
[ https://issues.apache.org/jira/browse/SPARK-42225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon resolved SPARK-42225.
Fix Version/s: 3.4.0
Resolution: Fixed
Issue resolved by pull request 39783 [https://github.com/apache/spark/pull/39783]
> Add `SparkConnectIllegalArgumentException` to handle Spark Connect error precisely.
>
> Key: SPARK-42225
> URL: https://issues.apache.org/jira/browse/SPARK-42225
> Project: Spark
> Issue Type: Sub-task
> Components: Connect
> Affects Versions: 3.5.0
> Reporter: Haejoon Lee
> Assignee: Haejoon Lee
> Priority: Major
> Fix For: 3.4.0
>
> Add `SparkConnectIllegalArgumentException` to handle Spark Connect errors precisely by catching `IllegalArgumentException` before it is unexpectedly raised as `SparkConnectGrpcException`.
--
This message was sent by Atlassian Jira (v8.20.10#820010)
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
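The idea in this issue, surfacing a server-side `IllegalArgumentException` as a dedicated client exception instead of the generic `SparkConnectGrpcException`, can be sketched roughly as follows. This is a minimal illustration and not the code from pull request 39783: the base class `SparkConnectException`, the lookup table, and the function `convert_server_error` are hypothetical names, and the class hierarchy shown is an assumption.

```python
class SparkConnectException(Exception):
    """Hypothetical base class for the client-side errors in this sketch."""


class SparkConnectGrpcException(SparkConnectException):
    """Generic fallback for server errors that have no dedicated class."""


class SparkConnectIllegalArgumentException(SparkConnectException):
    """Dedicated error for a server-side IllegalArgumentException."""


# Hypothetical lookup table: server-side class name -> client exception type.
_SERVER_CLASS_TO_EXCEPTION = {
    "java.lang.IllegalArgumentException": SparkConnectIllegalArgumentException,
}


def convert_server_error(server_class: str, message: str) -> SparkConnectException:
    """Pick the most specific client exception, else the generic gRPC one."""
    exc_type = _SERVER_CLASS_TO_EXCEPTION.get(server_class, SparkConnectGrpcException)
    return exc_type(message)
```

With a mapping like this, calling code can catch the specific exception type first and treat the generic one as a last resort.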
[jira] [Assigned] (SPARK-42225) Add `SparkConnectIllegalArgumentException` to handle Spark Connect error precisely.
[ https://issues.apache.org/jira/browse/SPARK-42225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon reassigned SPARK-42225:
Assignee: Haejoon Lee
> Add `SparkConnectIllegalArgumentException` to handle Spark Connect error precisely.
>
> Key: SPARK-42225
> URL: https://issues.apache.org/jira/browse/SPARK-42225
> Project: Spark
> Issue Type: Sub-task
> Components: Connect
> Affects Versions: 3.5.0
> Reporter: Haejoon Lee
> Assignee: Haejoon Lee
> Priority: Major
>
> Add `SparkConnectIllegalArgumentException` to handle Spark Connect errors precisely by catching `IllegalArgumentException` before it is unexpectedly raised as `SparkConnectGrpcException`.
[jira] [Resolved] (SPARK-41489) Assign name to _LEGACY_ERROR_TEMP_2415
[ https://issues.apache.org/jira/browse/SPARK-41489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Max Gekk resolved SPARK-41489.
Fix Version/s: 3.5.0
Resolution: Fixed
Issue resolved by pull request 39701 [https://github.com/apache/spark/pull/39701]
> Assign name to _LEGACY_ERROR_TEMP_2415
>
> Key: SPARK-41489
> URL: https://issues.apache.org/jira/browse/SPARK-41489
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.4.0
> Reporter: Haejoon Lee
> Assignee: Haejoon Lee
> Priority: Major
> Fix For: 3.5.0
>
> We should assign a proper name to every LEGACY temp error class.
[jira] [Assigned] (SPARK-41489) Assign name to _LEGACY_ERROR_TEMP_2415
[ https://issues.apache.org/jira/browse/SPARK-41489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Max Gekk reassigned SPARK-41489:
Assignee: Haejoon Lee
> Assign name to _LEGACY_ERROR_TEMP_2415
>
> Key: SPARK-41489
> URL: https://issues.apache.org/jira/browse/SPARK-41489
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.4.0
> Reporter: Haejoon Lee
> Assignee: Haejoon Lee
> Priority: Major
>
> We should assign a proper name to every LEGACY temp error class.
[jira] [Commented] (SPARK-42224) Migrate `TypeError` into error framework for Spark Connect functions
[ https://issues.apache.org/jira/browse/SPARK-42224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17681671#comment-17681671 ]
Apache Spark commented on SPARK-42224:
User 'itholic' has created a pull request for this issue: https://github.com/apache/spark/pull/39787
> Migrate `TypeError` into error framework for Spark Connect functions
>
> Key: SPARK-42224
> URL: https://issues.apache.org/jira/browse/SPARK-42224
> Project: Spark
> Issue Type: Sub-task
> Components: Connect
> Affects Versions: 3.5.0
> Reporter: Haejoon Lee
> Assignee: Haejoon Lee
> Priority: Major
> Fix For: 3.4.0
>
> We should migrate Python built-in errors such as `TypeError` into `PySparkTypeError` for Spark Connect.
[jira] [Assigned] (SPARK-42194) Allow `columns` parameter when creating DataFrame with Series.
[ https://issues.apache.org/jira/browse/SPARK-42194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon reassigned SPARK-42194:
Assignee: Haejoon Lee
> Allow `columns` parameter when creating DataFrame with Series.
>
> Key: SPARK-42194
> URL: https://issues.apache.org/jira/browse/SPARK-42194
> Project: Spark
> Issue Type: Bug
> Components: ps
> Affects Versions: 3.5.0
> Reporter: Haejoon Lee
> Assignee: Haejoon Lee
> Priority: Major
>
> pandas API on Spark doesn't allow creating DataFrame with Series by specifying the `columns` parameter as below:
> {code:java}
> >>> ps.DataFrame(psser, columns=["labels"])
> Traceback (most recent call last):
>   File "", line 1, in
>   File ".../spark/python/pyspark/pandas/frame.py", line 539, in __init__
>     assert columns is None
> AssertionError
> {code}
> We should make it available.
[jira] [Resolved] (SPARK-42194) Allow `columns` parameter when creating DataFrame with Series.
[ https://issues.apache.org/jira/browse/SPARK-42194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon resolved SPARK-42194.
Fix Version/s: 3.4.0
Resolution: Fixed
Issue resolved by pull request 39786 [https://github.com/apache/spark/pull/39786]
> Allow `columns` parameter when creating DataFrame with Series.
>
> Key: SPARK-42194
> URL: https://issues.apache.org/jira/browse/SPARK-42194
> Project: Spark
> Issue Type: Bug
> Components: ps
> Affects Versions: 3.5.0
> Reporter: Haejoon Lee
> Assignee: Haejoon Lee
> Priority: Major
> Fix For: 3.4.0
>
> pandas API on Spark doesn't allow creating DataFrame with Series by specifying the `columns` parameter as below:
> {code:java}
> >>> ps.DataFrame(psser, columns=["labels"])
> Traceback (most recent call last):
>   File "", line 1, in
>   File ".../spark/python/pyspark/pandas/frame.py", line 539, in __init__
>     assert columns is None
> AssertionError
> {code}
> We should make it available.
[jira] [Created] (SPARK-42227) Use approx_percentile function running slower in spark3 than spark2
xuanzhiang created SPARK-42227:
Summary: Use approx_percentile function running slower in spark3 than spark2
Key: SPARK-42227
URL: https://issues.apache.org/jira/browse/SPARK-42227
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 3.2.1
Reporter: xuanzhiang
In Spark 3, `approx_percentile(end_ts-start_ts,0.9) cost_p90` uses the ObjectHashAggregate method, but its shuffle is very slow. When I use `percentile` instead, the query becomes fast. I don't know the reason; I think `approx_percentile` should be fast.
[jira] [Commented] (SPARK-41684) spark3 read the one partition data and write to anthor partition cause error
[ https://issues.apache.org/jira/browse/SPARK-41684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17681657#comment-17681657 ]
xuanzhiang commented on SPARK-41684:
You can set spark.sql.hive.convertMetastoreParquet=false.
> spark3 read the one partition data and write to anthor partition cause error
>
> Key: SPARK-41684
> URL: https://issues.apache.org/jira/browse/SPARK-41684
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 3.2.2
> Reporter: sinlang
> Priority: Major
>
> Spark 3 fails when it reads data from one partition and writes it to another partition of the same table.
> {code:java}
> -- 1. Create temporary view t1
> create temporary view t1 as
> select * from jt_ods.ods_ebi_stm_retail_settle_detail_full_di
> where dt = '2022-12-21'
> union all (
> select * from jt_ods.ods_ebi_stm_retail_settle_detail_full_df as i
> where i.dt = '2022-12-20'
> and not exists(select 1 from jt_ods.ods_ebi_stm_retail_settle_detail_full_di as d where d.dt = '2022-12-21' and i.id = d.id))
> -- 2. Insert data
> insert overwrite table jt_ods.ods_ebi_stm_retail_settle_detail_full_df partition(dt = '2022-12-21')
> select * from t1 distribute by rand()
> {code}
> {code:java}
> 2022-12-22 16:29:48 Driver ERROR org.apache.spark.deploy.yarn.ApplicationMaster User class threw exception: org.apache.spark.sql.AnalysisException: Cannot overwrite a path that is also being read from.
> org.apache.spark.sql.AnalysisException: Cannot overwrite a path that is also being read from.
>   at org.apache.spark.sql.errors.QueryCompilationErrors$.cannotOverwritePathBeingReadFromError(QueryCompilationErrors.scala:1834)
>   at org.apache.spark.sql.execution.command.DDLUtils$.verifyNotReadPath(ddl.scala:980)
>   at org.apache.spark.sql.execution.datasources.DataSourceAnalysis$$anonfun$apply$1.applyOrElse(DataSourceStrategy.scala:221)
> {code}
[jira] [Assigned] (SPARK-42224) Migrate `TypeError` into error framework for Spark Connect functions
[ https://issues.apache.org/jira/browse/SPARK-42224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon reassigned SPARK-42224:
Assignee: Haejoon Lee
> Migrate `TypeError` into error framework for Spark Connect functions
>
> Key: SPARK-42224
> URL: https://issues.apache.org/jira/browse/SPARK-42224
> Project: Spark
> Issue Type: Sub-task
> Components: Connect
> Affects Versions: 3.5.0
> Reporter: Haejoon Lee
> Assignee: Haejoon Lee
> Priority: Major
>
> We should migrate Python built-in errors such as `TypeError` into `PySparkTypeError` for Spark Connect.
[jira] [Resolved] (SPARK-42224) Migrate `TypeError` into error framework for Spark Connect functions
[ https://issues.apache.org/jira/browse/SPARK-42224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon resolved SPARK-42224.
Fix Version/s: 3.4.0
Resolution: Fixed
Issue resolved by pull request 39782 [https://github.com/apache/spark/pull/39782]
> Migrate `TypeError` into error framework for Spark Connect functions
>
> Key: SPARK-42224
> URL: https://issues.apache.org/jira/browse/SPARK-42224
> Project: Spark
> Issue Type: Sub-task
> Components: Connect
> Affects Versions: 3.5.0
> Reporter: Haejoon Lee
> Assignee: Haejoon Lee
> Priority: Major
> Fix For: 3.4.0
>
> We should migrate Python built-in errors such as `TypeError` into `PySparkTypeError` for Spark Connect.
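As a rough illustration of what migrating a built-in `TypeError` into an error-framework class can look like, here is a self-contained sketch. It does not import PySpark: the `PySparkTypeError` below is a local stand-in, and the `ERROR_CLASSES` registry, the `NOT_STR` error class, and the helper `require_str` are hypothetical names chosen for this example only.

```python
from typing import Dict

# Hypothetical registry of message templates, keyed by error class name.
ERROR_CLASSES: Dict[str, str] = {
    "NOT_STR": "Argument `{arg_name}` should be a str, got {arg_type}.",
}


class PySparkTypeError(TypeError):
    """Local stand-in for a framework error; still a TypeError for callers."""

    def __init__(self, error_class: str, message_parameters: Dict[str, str]):
        self.error_class = error_class
        self.message_parameters = message_parameters
        message = ERROR_CLASSES[error_class].format(**message_parameters)
        super().__init__(f"[{error_class}] {message}")


def require_str(value: object, arg_name: str) -> str:
    """Validate an argument; previously this might raise a bare TypeError."""
    if not isinstance(value, str):
        raise PySparkTypeError(
            error_class="NOT_STR",
            message_parameters={
                "arg_name": arg_name,
                "arg_type": type(value).__name__,
            },
        )
    return value
```

Because the stand-in subclasses `TypeError`, existing `except TypeError` handlers keep working while new code can inspect `error_class` and `message_parameters`.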
[jira] [Resolved] (SPARK-42226) Upgrade versions-maven-plugin to 2.14.2
[ https://issues.apache.org/jira/browse/SPARK-42226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dongjoon Hyun resolved SPARK-42226.
Fix Version/s: 3.5.0
Resolution: Fixed
Issue resolved by pull request 39784 [https://github.com/apache/spark/pull/39784]
> Upgrade versions-maven-plugin to 2.14.2
>
> Key: SPARK-42226
> URL: https://issues.apache.org/jira/browse/SPARK-42226
> Project: Spark
> Issue Type: Improvement
> Components: Build
> Affects Versions: 3.5.0
> Reporter: Yang Jie
> Assignee: Apache Spark
> Priority: Minor
> Fix For: 3.5.0
>
> https://github.com/mojohaus/versions/releases/tag/2.14.2
[jira] [Resolved] (SPARK-42220) Upgrade buf from 1.12.0 to 1.13.1
[ https://issues.apache.org/jira/browse/SPARK-42220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dongjoon Hyun resolved SPARK-42220.
Fix Version/s: 3.4.0
Resolution: Fixed
Issue resolved by pull request 39776 [https://github.com/apache/spark/pull/39776]
> Upgrade buf from 1.12.0 to 1.13.1
>
> Key: SPARK-42220
> URL: https://issues.apache.org/jira/browse/SPARK-42220
> Project: Spark
> Issue Type: Improvement
> Components: Build, Connect
> Affects Versions: 3.5.0
> Reporter: BingKun Pan
> Assignee: BingKun Pan
> Priority: Minor
> Fix For: 3.4.0
[jira] [Assigned] (SPARK-42220) Upgrade buf from 1.12.0 to 1.13.1
[ https://issues.apache.org/jira/browse/SPARK-42220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dongjoon Hyun reassigned SPARK-42220:
Assignee: BingKun Pan
> Upgrade buf from 1.12.0 to 1.13.1
>
> Key: SPARK-42220
> URL: https://issues.apache.org/jira/browse/SPARK-42220
> Project: Spark
> Issue Type: Improvement
> Components: Build, Connect
> Affects Versions: 3.5.0
> Reporter: BingKun Pan
> Assignee: BingKun Pan
> Priority: Minor
[jira] [Assigned] (SPARK-42161) Upgrade Arrow to 11.0.0
[ https://issues.apache.org/jira/browse/SPARK-42161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dongjoon Hyun reassigned SPARK-42161:
Assignee: Yang Jie
> Upgrade Arrow to 11.0.0
>
> Key: SPARK-42161
> URL: https://issues.apache.org/jira/browse/SPARK-42161
> Project: Spark
> Issue Type: Improvement
> Components: Build
> Affects Versions: 3.4.0
> Reporter: Yang Jie
> Assignee: Yang Jie
> Priority: Minor
>
> https://github.com/apache/arrow/releases/tag/apache-arrow-11.0.0
[jira] [Resolved] (SPARK-42161) Upgrade Arrow to 11.0.0
[ https://issues.apache.org/jira/browse/SPARK-42161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dongjoon Hyun resolved SPARK-42161.
Fix Version/s: 3.4.0
Resolution: Fixed
Issue resolved by pull request 39707 [https://github.com/apache/spark/pull/39707]
> Upgrade Arrow to 11.0.0
>
> Key: SPARK-42161
> URL: https://issues.apache.org/jira/browse/SPARK-42161
> Project: Spark
> Issue Type: Improvement
> Components: Build
> Affects Versions: 3.4.0
> Reporter: Yang Jie
> Assignee: Yang Jie
> Priority: Minor
> Fix For: 3.4.0
>
> https://github.com/apache/arrow/releases/tag/apache-arrow-11.0.0
[jira] [Assigned] (SPARK-42194) Allow `columns` parameter when creating DataFrame with Series.
[ https://issues.apache.org/jira/browse/SPARK-42194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-42194:
Assignee: (was: Apache Spark)
> Allow `columns` parameter when creating DataFrame with Series.
>
> Key: SPARK-42194
> URL: https://issues.apache.org/jira/browse/SPARK-42194
> Project: Spark
> Issue Type: Bug
> Components: ps
> Affects Versions: 3.5.0
> Reporter: Haejoon Lee
> Priority: Major
>
> pandas API on Spark doesn't allow creating DataFrame with Series by specifying the `columns` parameter as below:
> {code:java}
> >>> ps.DataFrame(psser, columns=["labels"])
> Traceback (most recent call last):
>   File "", line 1, in
>   File ".../spark/python/pyspark/pandas/frame.py", line 539, in __init__
>     assert columns is None
> AssertionError
> {code}
> We should make it available.
[jira] [Commented] (SPARK-42194) Allow `columns` parameter when creating DataFrame with Series.
[ https://issues.apache.org/jira/browse/SPARK-42194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17681629#comment-17681629 ]
Apache Spark commented on SPARK-42194:
User 'itholic' has created a pull request for this issue: https://github.com/apache/spark/pull/39786
> Allow `columns` parameter when creating DataFrame with Series.
>
> Key: SPARK-42194
> URL: https://issues.apache.org/jira/browse/SPARK-42194
> Project: Spark
> Issue Type: Bug
> Components: ps
> Affects Versions: 3.5.0
> Reporter: Haejoon Lee
> Priority: Major
>
> pandas API on Spark doesn't allow creating DataFrame with Series by specifying the `columns` parameter as below:
> {code:java}
> >>> ps.DataFrame(psser, columns=["labels"])
> Traceback (most recent call last):
>   File "", line 1, in
>   File ".../spark/python/pyspark/pandas/frame.py", line 539, in __init__
>     assert columns is None
> AssertionError
> {code}
> We should make it available.
[jira] [Assigned] (SPARK-42194) Allow `columns` parameter when creating DataFrame with Series.
[ https://issues.apache.org/jira/browse/SPARK-42194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-42194:
Assignee: Apache Spark
> Allow `columns` parameter when creating DataFrame with Series.
>
> Key: SPARK-42194
> URL: https://issues.apache.org/jira/browse/SPARK-42194
> Project: Spark
> Issue Type: Bug
> Components: ps
> Affects Versions: 3.5.0
> Reporter: Haejoon Lee
> Assignee: Apache Spark
> Priority: Major
>
> pandas API on Spark doesn't allow creating DataFrame with Series by specifying the `columns` parameter as below:
> {code:java}
> >>> ps.DataFrame(psser, columns=["labels"])
> Traceback (most recent call last):
>   File "", line 1, in
>   File ".../spark/python/pyspark/pandas/frame.py", line 539, in __init__
>     assert columns is None
> AssertionError
> {code}
> We should make it available.
[jira] [Commented] (SPARK-42221) Introduce a new conf for TimestampNTZ schema inference in JSON/CSV
[ https://issues.apache.org/jira/browse/SPARK-42221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17681599#comment-17681599 ]
Apache Spark commented on SPARK-42221:
User 'gengliangwang' has created a pull request for this issue: https://github.com/apache/spark/pull/39777
> Introduce a new conf for TimestampNTZ schema inference in JSON/CSV
>
> Key: SPARK-42221
> URL: https://issues.apache.org/jira/browse/SPARK-42221
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.4.0
> Reporter: Gengliang Wang
> Assignee: Gengliang Wang
> Priority: Major
>
> Introduce a new conf "spark.sql.inferTimestampNTZInDataSources.enabled" for TimestampNTZ schema inference in JSON/CSV
[jira] [Assigned] (SPARK-42192) Migrate the `TypeError` from `pyspark/sql/dataframe.py` into `PySparkTypeError`.
[ https://issues.apache.org/jira/browse/SPARK-42192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-42192:
Assignee: Apache Spark
> Migrate the `TypeError` from `pyspark/sql/dataframe.py` into `PySparkTypeError`.
>
> Key: SPARK-42192
> URL: https://issues.apache.org/jira/browse/SPARK-42192
> Project: Spark
> Issue Type: Sub-task
> Components: PySpark
> Affects Versions: 3.5.0
> Reporter: Haejoon Lee
> Assignee: Apache Spark
> Priority: Major
>
> Migrate the existing errors into the new PySpark error framework.
[jira] [Commented] (SPARK-42192) Migrate the `TypeError` from `pyspark/sql/dataframe.py` into `PySparkTypeError`.
[ https://issues.apache.org/jira/browse/SPARK-42192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17681596#comment-17681596 ]
Apache Spark commented on SPARK-42192:
User 'itholic' has created a pull request for this issue: https://github.com/apache/spark/pull/39785
> Migrate the `TypeError` from `pyspark/sql/dataframe.py` into `PySparkTypeError`.
>
> Key: SPARK-42192
> URL: https://issues.apache.org/jira/browse/SPARK-42192
> Project: Spark
> Issue Type: Sub-task
> Components: PySpark
> Affects Versions: 3.5.0
> Reporter: Haejoon Lee
> Priority: Major
>
> Migrate the existing errors into the new PySpark error framework.
[jira] [Assigned] (SPARK-42192) Migrate the `TypeError` from `pyspark/sql/dataframe.py` into `PySparkTypeError`.
[ https://issues.apache.org/jira/browse/SPARK-42192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-42192:
Assignee: (was: Apache Spark)
> Migrate the `TypeError` from `pyspark/sql/dataframe.py` into `PySparkTypeError`.
>
> Key: SPARK-42192
> URL: https://issues.apache.org/jira/browse/SPARK-42192
> Project: Spark
> Issue Type: Sub-task
> Components: PySpark
> Affects Versions: 3.5.0
> Reporter: Haejoon Lee
> Priority: Major
>
> Migrate the existing errors into the new PySpark error framework.
[jira] [Updated] (SPARK-42192) Migrate the `TypeError` from `pyspark/sql/dataframe.py` into `PySparkTypeError`.
[ https://issues.apache.org/jira/browse/SPARK-42192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Haejoon Lee updated SPARK-42192:
Summary: Migrate the `TypeError` from `pyspark/sql/dataframe.py` into `PySparkTypeError`. (was: Migrate the errors from `pyspark/sql/dataframe.py` into error class.)
> Migrate the `TypeError` from `pyspark/sql/dataframe.py` into `PySparkTypeError`.
>
> Key: SPARK-42192
> URL: https://issues.apache.org/jira/browse/SPARK-42192
> Project: Spark
> Issue Type: Sub-task
> Components: PySpark
> Affects Versions: 3.5.0
> Reporter: Haejoon Lee
> Priority: Major
>
> Migrate the existing errors into the new PySpark error framework.
[jira] [Assigned] (SPARK-42226) Upgrade versions-maven-plugin to 2.14.2
[ https://issues.apache.org/jira/browse/SPARK-42226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-42226:
Assignee: Apache Spark
> Upgrade versions-maven-plugin to 2.14.2
>
> Key: SPARK-42226
> URL: https://issues.apache.org/jira/browse/SPARK-42226
> Project: Spark
> Issue Type: Improvement
> Components: Build
> Affects Versions: 3.5.0
> Reporter: Yang Jie
> Assignee: Apache Spark
> Priority: Minor
>
> https://github.com/mojohaus/versions/releases/tag/2.14.2
[jira] [Assigned] (SPARK-42226) Upgrade versions-maven-plugin to 2.14.2
[ https://issues.apache.org/jira/browse/SPARK-42226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-42226:
Assignee: (was: Apache Spark)
> Upgrade versions-maven-plugin to 2.14.2
>
> Key: SPARK-42226
> URL: https://issues.apache.org/jira/browse/SPARK-42226
> Project: Spark
> Issue Type: Improvement
> Components: Build
> Affects Versions: 3.5.0
> Reporter: Yang Jie
> Priority: Minor
>
> https://github.com/mojohaus/versions/releases/tag/2.14.2
[jira] [Commented] (SPARK-42226) Upgrade versions-maven-plugin to 2.14.2
[ https://issues.apache.org/jira/browse/SPARK-42226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17681587#comment-17681587 ]
Apache Spark commented on SPARK-42226:
User 'LuciferYang' has created a pull request for this issue: https://github.com/apache/spark/pull/39784
> Upgrade versions-maven-plugin to 2.14.2
>
> Key: SPARK-42226
> URL: https://issues.apache.org/jira/browse/SPARK-42226
> Project: Spark
> Issue Type: Improvement
> Components: Build
> Affects Versions: 3.5.0
> Reporter: Yang Jie
> Priority: Minor
>
> https://github.com/mojohaus/versions/releases/tag/2.14.2
[jira] [Assigned] (SPARK-42226) Upgrade versions-maven-plugin to 2.14.2
[ https://issues.apache.org/jira/browse/SPARK-42226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-42226:
Assignee: Apache Spark
> Upgrade versions-maven-plugin to 2.14.2
>
> Key: SPARK-42226
> URL: https://issues.apache.org/jira/browse/SPARK-42226
> Project: Spark
> Issue Type: Improvement
> Components: Build
> Affects Versions: 3.5.0
> Reporter: Yang Jie
> Assignee: Apache Spark
> Priority: Minor
>
> https://github.com/mojohaus/versions/releases/tag/2.14.2
[jira] [Created] (SPARK-42226) Upgrade versions-maven-plugin to 2.14.2
Yang Jie created SPARK-42226:
Summary: Upgrade versions-maven-plugin to 2.14.2
Key: SPARK-42226
URL: https://issues.apache.org/jira/browse/SPARK-42226
Project: Spark
Issue Type: Improvement
Components: Build
Affects Versions: 3.5.0
Reporter: Yang Jie
https://github.com/mojohaus/versions/releases/tag/2.14.2
[jira] [Commented] (SPARK-42224) Migrate `TypeError` into error framework for Spark Connect functions
[ https://issues.apache.org/jira/browse/SPARK-42224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17681585#comment-17681585 ]
Apache Spark commented on SPARK-42224:
User 'itholic' has created a pull request for this issue: https://github.com/apache/spark/pull/39782
> Migrate `TypeError` into error framework for Spark Connect functions
>
> Key: SPARK-42224
> URL: https://issues.apache.org/jira/browse/SPARK-42224
> Project: Spark
> Issue Type: Sub-task
> Components: Connect
> Affects Versions: 3.5.0
> Reporter: Haejoon Lee
> Priority: Major
>
> We should migrate Python built-in errors such as `TypeError` into `PySparkTypeError` for Spark Connect.
[jira] [Assigned] (SPARK-42224) Migrate `TypeError` into error framework for Spark Connect functions
[ https://issues.apache.org/jira/browse/SPARK-42224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-42224:
Assignee: (was: Apache Spark)
> Migrate `TypeError` into error framework for Spark Connect functions
>
> Key: SPARK-42224
> URL: https://issues.apache.org/jira/browse/SPARK-42224
> Project: Spark
> Issue Type: Sub-task
> Components: Connect
> Affects Versions: 3.5.0
> Reporter: Haejoon Lee
> Priority: Major
>
> We should migrate Python built-in errors such as `TypeError` into `PySparkTypeError` for Spark Connect.
[jira] [Assigned] (SPARK-42224) Migrate `TypeError` into error framework for Spark Connect functions
[ https://issues.apache.org/jira/browse/SPARK-42224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-42224:
Assignee: Apache Spark
> Migrate `TypeError` into error framework for Spark Connect functions
>
> Key: SPARK-42224
> URL: https://issues.apache.org/jira/browse/SPARK-42224
> Project: Spark
> Issue Type: Sub-task
> Components: Connect
> Affects Versions: 3.5.0
> Reporter: Haejoon Lee
> Assignee: Apache Spark
> Priority: Major
>
> We should migrate Python built-in errors such as `TypeError` into `PySparkTypeError` for Spark Connect.
[jira] [Commented] (SPARK-42224) Migrate `TypeError` into error framework for Spark Connect functions
[ https://issues.apache.org/jira/browse/SPARK-42224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17681584#comment-17681584 ]
Apache Spark commented on SPARK-42224:
User 'itholic' has created a pull request for this issue: https://github.com/apache/spark/pull/39782
> Migrate `TypeError` into error framework for Spark Connect functions
>
> Key: SPARK-42224
> URL: https://issues.apache.org/jira/browse/SPARK-42224
> Project: Spark
> Issue Type: Sub-task
> Components: Connect
> Affects Versions: 3.5.0
> Reporter: Haejoon Lee
> Priority: Major
>
> We should migrate Python built-in errors such as `TypeError` into `PySparkTypeError` for Spark Connect.
[jira] [Assigned] (SPARK-42225) Add `SparkConnectIllegalArgumentException` to handle Spark Connect error precisely.
[ https://issues.apache.org/jira/browse/SPARK-42225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-42225:
Assignee: (was: Apache Spark)
> Add `SparkConnectIllegalArgumentException` to handle Spark Connect error precisely.
>
> Key: SPARK-42225
> URL: https://issues.apache.org/jira/browse/SPARK-42225
> Project: Spark
> Issue Type: Sub-task
> Components: Connect
> Affects Versions: 3.5.0
> Reporter: Haejoon Lee
> Priority: Major
>
> Add `SparkConnectIllegalArgumentException` to handle Spark Connect errors precisely by catching `IllegalArgumentException` before it is unexpectedly raised as `SparkConnectGrpcException`.
[jira] [Assigned] (SPARK-42225) Add `SparkConnectIllegalArgumentException` to handle Spark Connect error precisely.
[ https://issues.apache.org/jira/browse/SPARK-42225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42225: Assignee: Apache Spark > Add `SparkConnectIllegalArgumentException` to handle Spark Connect error > precisely. > --- > > Key: SPARK-42225 > URL: https://issues.apache.org/jira/browse/SPARK-42225 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 3.5.0 >Reporter: Haejoon Lee >Assignee: Apache Spark >Priority: Major > > Add `SparkConnectIllegalArgumentException` to handle Spark Connect error > precisely by catching `IllegalArgumentException` before unexpectedly raise > `SparkConnectGrpcException`. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42225) Add `SparkConnectIllegalArgumentException` to handle Spark Connect error precisely.
[ https://issues.apache.org/jira/browse/SPARK-42225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17681581#comment-17681581 ] Apache Spark commented on SPARK-42225: -- User 'itholic' has created a pull request for this issue: https://github.com/apache/spark/pull/39783 > Add `SparkConnectIllegalArgumentException` to handle Spark Connect error > precisely. > --- > > Key: SPARK-42225 > URL: https://issues.apache.org/jira/browse/SPARK-42225 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 3.5.0 >Reporter: Haejoon Lee >Priority: Major > > Add `SparkConnectIllegalArgumentException` to handle Spark Connect error > precisely by catching `IllegalArgumentException` before unexpectedly raise > `SparkConnectGrpcException`. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-42225) Add `SparkConnectIllegalArgumentException` to handle Spark Connect error precisely.
[ https://issues.apache.org/jira/browse/SPARK-42225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haejoon Lee updated SPARK-42225: Description: Add `SparkConnectIllegalArgumentException` to handle Spark Connect error precisely by catching `IllegalArgumentException` before unexpectedly raise `SparkConnectGrpcException`. (was: Catch `IllegalArgumentException` before unexpectedly raise `SparkConnectGrpcException`.) > Add `SparkConnectIllegalArgumentException` to handle Spark Connect error > precisely. > --- > > Key: SPARK-42225 > URL: https://issues.apache.org/jira/browse/SPARK-42225 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 3.5.0 >Reporter: Haejoon Lee >Priority: Major > > Add `SparkConnectIllegalArgumentException` to handle Spark Connect error > precisely by catching `IllegalArgumentException` before unexpectedly raise > `SparkConnectGrpcException`. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
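The description above amounts to checking for an `IllegalArgumentException` reported by the server before falling back to the generic `SparkConnectGrpcException`. A minimal sketch of that ordering follows. The exception classes here are local stand-ins that only borrow the names from the ticket, and `convert_server_error` and its `error_class` argument are hypothetical; none of this is taken from the PySpark source.

```python
class SparkConnectException(Exception):
    """Stand-in base class for client-side Spark Connect errors (assumption)."""


class SparkConnectGrpcException(SparkConnectException):
    """Stand-in for the generic error raised for any failed gRPC call."""


class SparkConnectIllegalArgumentException(SparkConnectException):
    """Stand-in for the more precise error proposed in SPARK-42225."""


def convert_server_error(error_class: str, message: str) -> SparkConnectException:
    """Map a server-reported error to a client exception (hypothetical helper).

    The specific case is checked first, so the generic gRPC exception
    is only used when nothing more precise applies.
    """
    if error_class == "java.lang.IllegalArgumentException":
        return SparkConnectIllegalArgumentException(message)
    return SparkConnectGrpcException(message)
```

With this ordering, callers can catch `SparkConnectIllegalArgumentException` on its own, or catch `SparkConnectException` to handle both cases.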
[jira] [Updated] (SPARK-42225) Catch `IllegalArgumentException` before unexpectedly raise `SparkConnectException`.
[ https://issues.apache.org/jira/browse/SPARK-42225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haejoon Lee updated SPARK-42225: Description: Catch `IllegalArgumentException` before unexpectedly raise `SparkConnectGrpcException`. (was: We should catch `IllegalArgumentException` before unexpectedly raise `SparkConnectException`.) > Catch `IllegalArgumentException` before unexpectedly raise > `SparkConnectException`. > --- > > Key: SPARK-42225 > URL: https://issues.apache.org/jira/browse/SPARK-42225 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 3.5.0 >Reporter: Haejoon Lee >Priority: Major > > Catch `IllegalArgumentException` before unexpectedly raise > `SparkConnectGrpcException`. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-42225) Add `SparkConnectIllegalArgumentException` to handle Spark Connect error precisely.
[ https://issues.apache.org/jira/browse/SPARK-42225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haejoon Lee updated SPARK-42225: Summary: Add `SparkConnectIllegalArgumentException` to handle Spark Connect error precisely. (was: Catch `IllegalArgumentException` before unexpectedly raise `SparkConnectException`.) > Add `SparkConnectIllegalArgumentException` to handle Spark Connect error > precisely. > --- > > Key: SPARK-42225 > URL: https://issues.apache.org/jira/browse/SPARK-42225 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 3.5.0 >Reporter: Haejoon Lee >Priority: Major > > Catch `IllegalArgumentException` before unexpectedly raise > `SparkConnectGrpcException`. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-42225) Catch `IllegalArgumentException` before unexpectedly raise `SparkConnectException`.
[ https://issues.apache.org/jira/browse/SPARK-42225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haejoon Lee updated SPARK-42225: Description: We should catch `IllegalArgumentException` before unexpectedly raise `SparkConnectException`. (was: We should handle `SparkConnectException` properly with `PySparkException`.) > Catch `IllegalArgumentException` before unexpectedly raise > `SparkConnectException`. > --- > > Key: SPARK-42225 > URL: https://issues.apache.org/jira/browse/SPARK-42225 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 3.5.0 >Reporter: Haejoon Lee >Priority: Major > > We should catch `IllegalArgumentException` before unexpectedly raise > `SparkConnectException`. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-42225) Catch `IllegalArgumentException` before unexpectedly raise `SparkConnectException`.
[ https://issues.apache.org/jira/browse/SPARK-42225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haejoon Lee updated SPARK-42225: Summary: Catch `IllegalArgumentException` before unexpectedly raise `SparkConnectException`. (was: Handle `SparkConnectException` properly with `PySparkException`.) > Catch `IllegalArgumentException` before unexpectedly raise > `SparkConnectException`. > --- > > Key: SPARK-42225 > URL: https://issues.apache.org/jira/browse/SPARK-42225 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 3.5.0 >Reporter: Haejoon Lee >Priority: Major > > We should handle `SparkConnectException` properly with `PySparkException`. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-42225) Handle `SparkConnectException` properly with `PySparkException`.
[ https://issues.apache.org/jira/browse/SPARK-42225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haejoon Lee updated SPARK-42225: Description: We should handle `SparkConnectException` properly with `PySparkException`. (was: We should migrate Python built-in errors such as `SparkConnectException` into `PySparkException` for Spark Connect DataFrame.) > Handle `SparkConnectException` properly with `PySparkException`. > > > Key: SPARK-42225 > URL: https://issues.apache.org/jira/browse/SPARK-42225 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 3.5.0 >Reporter: Haejoon Lee >Priority: Major > > We should handle `SparkConnectException` properly with `PySparkException`. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-42225) Handle `SparkConnectException` properly with `PySparkException`.
[ https://issues.apache.org/jira/browse/SPARK-42225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haejoon Lee updated SPARK-42225: Description: We should migrate Python built-in errors such as `SparkConnectException` into `PySparkException` for Spark Connect DataFrame. (was: We should migrate Python built-in errors such as `SparkConnectException` into `IllegalArgumentException` for Spark Connect DataFrame.) > Handle `SparkConnectException` properly with `PySparkException`. > > > Key: SPARK-42225 > URL: https://issues.apache.org/jira/browse/SPARK-42225 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 3.5.0 >Reporter: Haejoon Lee >Priority: Major > > We should migrate Python built-in errors such as `SparkConnectException` into > `PySparkException` for Spark Connect DataFrame. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-42225) Handle `SparkConnectException` properly with `PySparkException`.
[ https://issues.apache.org/jira/browse/SPARK-42225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haejoon Lee updated SPARK-42225: Summary: Handle `SparkConnectException` properly with `PySparkException`. (was: Migrate `IllegalArgumentException ` into error framework for Spark Connect DataFrame.) > Handle `SparkConnectException` properly with `PySparkException`. > > > Key: SPARK-42225 > URL: https://issues.apache.org/jira/browse/SPARK-42225 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 3.5.0 >Reporter: Haejoon Lee >Priority: Major > > We should migrate Python built-in errors such as `SparkConnectException` into > `IllegalArgumentException` for Spark Connect DataFrame. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-42225) Migrate `IllegalArgumentException ` into error framework for Spark Connect DataFrame.
[ https://issues.apache.org/jira/browse/SPARK-42225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haejoon Lee updated SPARK-42225: Description: We should migrate Python built-in errors such as `SparkConnectException` into `IllegalArgumentException` for Spark Connect DataFrame. (was: We should migrate Python built-in errors such as `ValueError` into `PySparkValueError` for Spark Connect.) > Migrate `IllegalArgumentException ` into error framework for Spark Connect > DataFrame. > - > > Key: SPARK-42225 > URL: https://issues.apache.org/jira/browse/SPARK-42225 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 3.5.0 >Reporter: Haejoon Lee >Priority: Major > > We should migrate Python built-in errors such as `SparkConnectException` into > `IllegalArgumentException` for Spark Connect DataFrame. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-42225) Migrate `IllegalArgumentException ` into error framework for Spark Connect DataFrame.
[ https://issues.apache.org/jira/browse/SPARK-42225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haejoon Lee updated SPARK-42225: Summary: Migrate `IllegalArgumentException ` into error framework for Spark Connect DataFrame. (was: Migrate `ValueError` into error framework for Spark Connect functions) > Migrate `IllegalArgumentException ` into error framework for Spark Connect > DataFrame. > - > > Key: SPARK-42225 > URL: https://issues.apache.org/jira/browse/SPARK-42225 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 3.5.0 >Reporter: Haejoon Lee >Priority: Major > > We should migrate Python built-in errors such as `ValueError` into > `PySparkValueError` for Spark Connect. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-4224) Support group acls
[ https://issues.apache.org/jira/browse/SPARK-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17681578#comment-17681578 ] Apache Spark commented on SPARK-4224: - User 'itholic' has created a pull request for this issue: https://github.com/apache/spark/pull/39782 > Support group acls > -- > > Key: SPARK-4224 > URL: https://issues.apache.org/jira/browse/SPARK-4224 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 1.2.0 >Reporter: Thomas Graves >Assignee: Dhruve Ashar >Priority: Major > Fix For: 2.0.0 > > > Currently we support view and modify acls but you have to specify a list of > users. It would be nice to also support groups, so that anyone in the group > has permissions. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42225) Migrate `ValueError` into error framework for Spark Connect functions
[ https://issues.apache.org/jira/browse/SPARK-42225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17681576#comment-17681576 ] Haejoon Lee commented on SPARK-42225: - I'm working on it > Migrate `ValueError` into error framework for Spark Connect functions > - > > Key: SPARK-42225 > URL: https://issues.apache.org/jira/browse/SPARK-42225 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 3.5.0 >Reporter: Haejoon Lee >Priority: Major > > We should migrate Python built-in errors such as `ValueError` into > `PySparkValueError` for Spark Connect. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-42225) Migrate `ValueError` into error framework for Spark Connect functions
Haejoon Lee created SPARK-42225: --- Summary: Migrate `ValueError` into error framework for Spark Connect functions Key: SPARK-42225 URL: https://issues.apache.org/jira/browse/SPARK-42225 Project: Spark Issue Type: Sub-task Components: Connect Affects Versions: 3.5.0 Reporter: Haejoon Lee We should migrate Python built-in errors such as `ValueError` into `PySparkValueError` for Spark Connect. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-42224) Migrate `TypeError` into error framework for Spark Connect functions
[ https://issues.apache.org/jira/browse/SPARK-42224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haejoon Lee updated SPARK-42224: Description: We should migrate Python built-in errors such as `TypeError` into `PySparkTypeError` for Spark Connect. (was: We should migrate Python built-in errors such as TypeError, ValueError into PySparkTypeError, PySparkValueError for Spark Connect.) > Migrate `TypeError` into error framework for Spark Connect functions > > > Key: SPARK-42224 > URL: https://issues.apache.org/jira/browse/SPARK-42224 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 3.5.0 >Reporter: Haejoon Lee >Priority: Major > > We should migrate Python built-in errors such as `TypeError` into > `PySparkTypeError` for Spark Connect. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
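As an illustration of what migrating a built-in `TypeError` into `PySparkTypeError` could look like, here is a minimal sketch. The class below is a local stand-in that only borrows the name from the ticket; its constructor arguments, the `NOT_STR` error class, and the `require_str` helper are assumptions made for this example, not the actual PySpark API.

```python
class PySparkTypeError(TypeError):
    """Stand-in error that carries a structured error class and parameters."""

    def __init__(self, error_class: str, message_parameters: dict):
        self.error_class = error_class
        self.message_parameters = message_parameters
        details = ", ".join(
            f"{key}={value}" for key, value in sorted(message_parameters.items())
        )
        super().__init__(f"[{error_class}] {details}")


def require_str(arg_name: str, value: object) -> str:
    """Hypothetical argument check, before and after the migration.

    Before: raise TypeError(f"{arg_name} should be a str")
    After:  raise PySparkTypeError with a named error class and parameters.
    """
    if not isinstance(value, str):
        raise PySparkTypeError(
            "NOT_STR",
            {"arg_name": arg_name, "arg_type": type(value).__name__},
        )
    return value
```

Because the stand-in subclasses `TypeError`, existing `except TypeError` handlers keep working while new code can inspect `error_class` and `message_parameters`.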
[jira] [Updated] (SPARK-42224) Migrate `TypeError` into error framework for Spark Connect functions
[ https://issues.apache.org/jira/browse/SPARK-42224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haejoon Lee updated SPARK-42224: Summary: Migrate `TypeError` into error framework for Spark Connect functions (was: Migrate python-built in errors from Spark Connect into error framework.) > Migrate `TypeError` into error framework for Spark Connect functions > > > Key: SPARK-42224 > URL: https://issues.apache.org/jira/browse/SPARK-42224 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 3.5.0 >Reporter: Haejoon Lee >Priority: Major > > We should migrate Python built-in errors such as TypeError, ValueError into > PySparkTypeError, PySparkValueError for Spark Connect. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42224) Migrate python-built in errors from Spark Connect into error framework.
[ https://issues.apache.org/jira/browse/SPARK-42224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17681562#comment-17681562 ] Haejoon Lee commented on SPARK-42224: - I'm working on it > Migrate python-built in errors from Spark Connect into error framework. > --- > > Key: SPARK-42224 > URL: https://issues.apache.org/jira/browse/SPARK-42224 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 3.5.0 >Reporter: Haejoon Lee >Priority: Major > > We should migrate Python built-in errors such as TypeError, ValueError into > PySparkTypeError, PySparkValueError for Spark Connect. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-42224) Migrate python-built in errors from Spark Connect into error framework.
Haejoon Lee created SPARK-42224: --- Summary: Migrate python-built in errors from Spark Connect into error framework. Key: SPARK-42224 URL: https://issues.apache.org/jira/browse/SPARK-42224 Project: Spark Issue Type: Sub-task Components: Connect Affects Versions: 3.5.0 Reporter: Haejoon Lee We should migrate Python built-in errors such as TypeError, ValueError into PySparkTypeError, PySparkValueError for Spark Connect. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42168) CoGroup with window function returns incorrect result when partition keys differ in order
[ https://issues.apache.org/jira/browse/SPARK-42168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17681560#comment-17681560 ]

Apache Spark commented on SPARK-42168:
--------------------------------------

User 'EnricoMi' has created a pull request for this issue:
https://github.com/apache/spark/pull/39781

> CoGroup with window function returns incorrect result when partition keys differ in order
> ------------------------------------------------------------------------------------------
>
> Key: SPARK-42168
> URL: https://issues.apache.org/jira/browse/SPARK-42168
> Project: Spark
> Issue Type: Bug
> Components: PySpark, SQL
> Affects Versions: 3.0.3, 3.1.3, 3.2.3
> Reporter: Enrico Minack
> Assignee: Enrico Minack
> Priority: Major
> Labels: correctness
> Fix For: 3.2.4
>
>
> The following example returns an incorrect result:
> {code:java}
> import pandas as pd
> from pyspark.sql import SparkSession, Window
> from pyspark.sql.functions import col, lit, sum
> spark = SparkSession \
>     .builder \
>     .getOrCreate()
> ids = 1000
> days = 1000
> parts = 10
> id_df = spark.range(ids)
> day_df = spark.range(days).withColumnRenamed("id", "day")
> id_day_df = id_df.join(day_df)
> left_df = id_day_df.select(col("id").alias("id"), col("day").alias("day"),
>     lit("left").alias("side")).repartition(parts).cache()
> right_df = id_day_df.select(col("id").alias("id"), col("day").alias("day"),
>     lit("right").alias("side")).repartition(parts).cache()
> #.withColumnRenamed("id", "id2")
> # note the column order is different to the groupBy("id", "day") column order below
> window = Window.partitionBy("day", "id")
> left_grouped_df = left_df.groupBy("id", "day")
> right_grouped_df = right_df.withColumn("day_sum",
>     sum(col("day")).over(window)).groupBy("id", "day")
> def cogroup(left: pd.DataFrame, right: pd.DataFrame) -> pd.DataFrame:
>     return pd.DataFrame([{
>         "id": left["id"][0] if not left.empty else (right["id"][0] if not right.empty else None),
>         "day": left["day"][0] if not left.empty else (right["day"][0] if not right.empty else None),
>         "lefts": len(left.index),
>         "rights": len(right.index)
>     }])
> df = left_grouped_df.cogroup(right_grouped_df) \
>     .applyInPandas(cogroup, schema="id long, day long, lefts integer, rights integer")
> df.explain()
> df.show(5)
> {code}
> Output is
> {code}
> == Physical Plan ==
> AdaptiveSparkPlan isFinalPlan=false
> +- FlatMapCoGroupsInPandas [id#8L, day#9L], [id#29L, day#30L], cogroup(id#8L, day#9L, side#10, id#29L, day#30L, side#31, day_sum#54L), [id#64L, day#65L, lefts#66, rights#67]
>    :- Sort [id#8L ASC NULLS FIRST, day#9L ASC NULLS FIRST], false, 0
>    :  +- Exchange hashpartitioning(id#8L, day#9L, 200), ENSURE_REQUIREMENTS, [plan_id=117]
>    :     +- ...
>    +- Sort [id#29L ASC NULLS FIRST, day#30L ASC NULLS FIRST], false, 0
>       +- Project [id#29L, day#30L, id#29L, day#30L, side#31, day_sum#54L]
>          +- Window [sum(day#30L) windowspecdefinition(day#30L, id#29L, specifiedwindowframe(RowFrame, unboundedpreceding$(), unboundedfollowing$())) AS day_sum#54L], [day#30L, id#29L]
>             +- Sort [day#30L ASC NULLS FIRST, id#29L ASC NULLS FIRST], false, 0
>                +- Exchange hashpartitioning(day#30L, id#29L, 200), ENSURE_REQUIREMENTS, [plan_id=112]
>                   +- ...
> +---+---+-----+------+
> | id|day|lefts|rights|
> +---+---+-----+------+
> |  0|  3|    0|     1|
> |  0|  4|    0|     1|
> |  0| 13|    1|     0|
> |  0| 27|    0|     1|
> |  0| 31|    0|     1|
> +---+---+-----+------+
> only showing top 5 rows
> {code}
> The first child is hash-partitioned by {{id}} and {{day}}, while the
> second child is hash-partitioned by {{day}} and {{id}} (required by the
> window function). Therefore, rows end up in different partitions.
> This has been fixed in Spark 3.3 by
> [#32875|https://github.com/apache/spark/pull/32875/files#diff-e938569a4ca4eba8f7e10fe473d4f9c306ea253df151405bcaba880a601f075fR75-R76]:
> {code}
> == Physical Plan ==
> AdaptiveSparkPlan isFinalPlan=false
> +- FlatMapCoGroupsInPandas [id#8L, day#9L], [id#29L, day#30L], cogroup(id#8L, day#9L, side#10, id#29L, day#30L, side#31, day_sum#54L)#63, [id#64L, day#65L, lefts#66, rights#67]
>    :- Sort [id#8L ASC NULLS FIRST, day#9L ASC NULLS FIRST], false, 0
>    :  +- Exchange hashpartitioning(id#8L, day#9L, 200), ENSURE_REQUIREMENTS, [plan_id=117]
>    :     +- ...
>    +- Sort [id#29L ASC NULLS FIRST, day#30L ASC NULLS FIRST], false, 0
>       +- Exchange hashpartitioning(id#29L, day#30L, 200), ENSURE_REQUIREMENTS, [plan_id=118]
>          +- Project [id#29L, day#30L, id#29L, day#30L, side#31, day_sum#54L]
>             +- Window [sum(day#30L) windowspecdefinition(day#30L, id#29L, specifiedwindowframe(RowFrame,
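The explanation in SPARK-42168 (one side partitioned by `id, day`, the other by `day, id`) can be illustrated with a toy model. The sketch below assumes a simple order-sensitive hash, `toy_partition`, which is hypothetical and is not Spark's actual hash partitioning. It only shows that hashing the same key values in a different column order can send a row to a different partition.

```python
def toy_partition(keys, num_partitions):
    """Order-sensitive toy hash (assumption: NOT Spark's real partitioner)."""
    h = 0
    for k in keys:
        h = h * 37 + k
    return h % num_partitions


def count_misplaced(ids, days, num_partitions):
    """Count (id, day) rows whose partition differs between the two key orders."""
    misplaced = 0
    for i in range(ids):
        for d in range(days):
            by_id_day = toy_partition((i, d), num_partitions)
            by_day_id = toy_partition((d, i), num_partitions)
            if by_id_day != by_day_id:
                misplaced += 1
    return misplaced
```

For the row `id=0, day=3` and 10 partitions, `toy_partition((0, 3), 10)` is 3 while `toy_partition((3, 0), 10)` is 1, so the left and right copies of that row would never meet in the same cogroup. This is the kind of mismatch visible in the `lefts`/`rights` counts above.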
[jira] [Assigned] (SPARK-42223) Remove duplicate branches in CASE_WHEN and COALESCE function
[ https://issues.apache.org/jira/browse/SPARK-42223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-42223:
------------------------------------

Assignee: (was: Apache Spark)

> Remove duplicate branches in CASE_WHEN and COALESCE function
> ------------------------------------------------------------
>
> Key: SPARK-42223
> URL: https://issues.apache.org/jira/browse/SPARK-42223
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.4.0
> Reporter: Wan Kun
> Priority: Minor
>
> We can remove the duplicate branches in CASE_WHEN and COALESCE function.
> For example:
> {code:sql}
> SELECT
>   CASE WHEN id = 200 THEN 1
>        WHEN id = 200 THEN 2 -- same condition, can remove
>        WHEN id = 200 THEN 3 -- same condition, can remove
>        ELSE 0 END as c1,
>   coalesce(
>     id + 1,
>     id + 1, -- same expression, can remove
>     id) as c2
> FROM t1
> {code}

--
This message was sent by Atlassian Jira (v8.20.10#820010)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42223) Remove duplicate branches in CASE_WHEN and COALESCE function
[ https://issues.apache.org/jira/browse/SPARK-42223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-42223:
------------------------------------

Assignee: Apache Spark

> Remove duplicate branches in CASE_WHEN and COALESCE function
> ------------------------------------------------------------
>
> Key: SPARK-42223
> URL: https://issues.apache.org/jira/browse/SPARK-42223
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.4.0
> Reporter: Wan Kun
> Assignee: Apache Spark
> Priority: Minor
>
> We can remove the duplicate branches in CASE_WHEN and COALESCE function.
> For example:
> {code:sql}
> SELECT
>   CASE WHEN id = 200 THEN 1
>        WHEN id = 200 THEN 2 -- same condition, can remove
>        WHEN id = 200 THEN 3 -- same condition, can remove
>        ELSE 0 END as c1,
>   coalesce(
>     id + 1,
>     id + 1, -- same expression, can remove
>     id) as c2
> FROM t1
> {code}

--
This message was sent by Atlassian Jira (v8.20.10#820010)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42223) Remove duplicate branches in CASE_WHEN and COALESCE function
[ https://issues.apache.org/jira/browse/SPARK-42223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17681540#comment-17681540 ]

Apache Spark commented on SPARK-42223:
--------------------------------------

User 'wankunde' has created a pull request for this issue:
https://github.com/apache/spark/pull/39780

> Remove duplicate branches in CASE_WHEN and COALESCE function
> ------------------------------------------------------------
>
> Key: SPARK-42223
> URL: https://issues.apache.org/jira/browse/SPARK-42223
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.4.0
> Reporter: Wan Kun
> Priority: Minor
>
> We can remove the duplicate branches in CASE_WHEN and COALESCE function.
> For example:
> {code:sql}
> SELECT
>   CASE WHEN id = 200 THEN 1
>        WHEN id = 200 THEN 2 -- same condition, can remove
>        WHEN id = 200 THEN 3 -- same condition, can remove
>        ELSE 0 END as c1,
>   coalesce(
>     id + 1,
>     id + 1, -- same expression, can remove
>     id) as c2
> FROM t1
> {code}

--
This message was sent by Atlassian Jira (v8.20.10#820010)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-42223) Remove duplicate branches in CASE_WHEN and COALESCE function
[ https://issues.apache.org/jira/browse/SPARK-42223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wan Kun updated SPARK-42223:
----------------------------

Summary: Remove duplicate branches in CASE_WHEN and COALESCE function (was: Remove duplicate branch in CASE_WHEN and COALESCE function)

> Remove duplicate branches in CASE_WHEN and COALESCE function
> ------------------------------------------------------------
>
> Key: SPARK-42223
> URL: https://issues.apache.org/jira/browse/SPARK-42223
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.4.0
> Reporter: Wan Kun
> Priority: Minor
>
> We can remove the duplicate branches in CASE_WHEN and COALESCE function.
> For example:
> {code:sql}
> SELECT
>   CASE WHEN id = 200 THEN 1
>        WHEN id = 200 THEN 2 -- same condition, can remove
>        WHEN id = 200 THEN 3 -- same condition, can remove
>        ELSE 0 END as c1,
>   coalesce(
>     id + 1,
>     id + 1, -- same expression, can remove
>     id) as c2
> FROM t1
> {code}

--
This message was sent by Atlassian Jira (v8.20.10#820010)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-42223) Remove duplicate branch in CASE_WHEN and COALESCE function
[ https://issues.apache.org/jira/browse/SPARK-42223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wan Kun updated SPARK-42223:
----------------------------

Summary: Remove duplicate branch in CASE_WHEN and COALESCE function (was: Remove duplicat branch in CASE_WHEN and COALESCE function)

> Remove duplicate branch in CASE_WHEN and COALESCE function
> ----------------------------------------------------------
>
> Key: SPARK-42223
> URL: https://issues.apache.org/jira/browse/SPARK-42223
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.4.0
> Reporter: Wan Kun
> Priority: Minor
>
> We can remove the duplicate branches in CASE_WHEN and COALESCE function.
> For example:
> {code:sql}
> SELECT
>   CASE WHEN id = 200 THEN 1
>        WHEN id = 200 THEN 2 -- same condition, can remove
>        WHEN id = 200 THEN 3 -- same condition, can remove
>        ELSE 0 END as c1,
>   coalesce(
>     id + 1,
>     id + 1, -- same expression, can remove
>     id) as c2
> FROM t1
> {code}

--
This message was sent by Atlassian Jira (v8.20.10#820010)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-42223) Remove duplicat branch in CASE_WHEN and COALESCE function
Wan Kun created SPARK-42223:
-------------------------------

Summary: Remove duplicat branch in CASE_WHEN and COALESCE function
Key: SPARK-42223
URL: https://issues.apache.org/jira/browse/SPARK-42223
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 3.4.0
Reporter: Wan Kun

We can remove the duplicate branches in CASE_WHEN and COALESCE function.

For example:
{code:sql}
SELECT
  CASE WHEN id = 200 THEN 1
       WHEN id = 200 THEN 2 -- same condition, can remove
       WHEN id = 200 THEN 3 -- same condition, can remove
       ELSE 0 END as c1,
  coalesce(
    id + 1,
    id + 1, -- same expression, can remove
    id) as c2
FROM t1
{code}

--
This message was sent by Atlassian Jira (v8.20.10#820010)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
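The rewrite proposed in SPARK-42223 can be sketched outside of Spark. The example below is only an illustration under simplifying assumptions: conditions and expressions are represented as plain strings, equality of strings stands in for semantic equality of expressions, and all expressions are assumed deterministic. The function names are hypothetical and are not part of Spark's optimizer.

```python
def dedup_case_when(branches):
    """Keep only the first branch for each condition.

    A later branch with the same (deterministic) condition can never be
    reached, because the first matching branch always wins.
    """
    seen = set()
    result = []
    for condition, value in branches:
        if condition not in seen:
            seen.add(condition)
            result.append((condition, value))
    return result


def dedup_coalesce(args):
    """Drop repeated arguments, keeping first occurrences in order.

    If the first occurrence evaluated to NULL, a repeat of the same
    (deterministic) expression would also be NULL, so it adds nothing.
    """
    return list(dict.fromkeys(args))
```

Applied to the example query, `dedup_case_when([("id = 200", 1), ("id = 200", 2), ("id = 200", 3)])` keeps only `("id = 200", 1)`, and `dedup_coalesce(["id + 1", "id + 1", "id"])` yields `["id + 1", "id"]`.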