[jira] [Created] (SPARK-41713) Make CTAS hold a nested execution for data writing
XiDuo You created SPARK-41713:

    Summary: Make CTAS hold a nested execution for data writing
    Key: SPARK-41713
    URL: https://issues.apache.org/jira/browse/SPARK-41713
    Project: Spark
    Issue Type: Improvement
    Components: SQL
    Affects Versions: 3.4.0
    Reporter: XiDuo You

Decouple the create-table command from the data-writing command.

--
This message was sent by Atlassian Jira (v8.20.10#820010)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-41712) Migrate the Spark Connect errors into error classes.
Haejoon Lee created SPARK-41712:

    Summary: Migrate the Spark Connect errors into error classes.
    Key: SPARK-41712
    URL: https://issues.apache.org/jira/browse/SPARK-41712
    Project: Spark
    Issue Type: Sub-task
    Components: Connect, PySpark
    Affects Versions: 3.4.0
    Reporter: Haejoon Lee

We need to migrate the Spark Connect errors into the centralized error framework by leveraging the error class logic.
[jira] [Assigned] (SPARK-41711) Upgrade protobuf-java to 3.21.12
[ https://issues.apache.org/jira/browse/SPARK-41711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-41711:

    Assignee: Apache Spark

> Upgrade protobuf-java to 3.21.12
>
> Key: SPARK-41711
> URL: https://issues.apache.org/jira/browse/SPARK-41711
> Project: Spark
> Issue Type: Improvement
> Components: Build
> Affects Versions: 3.4.0
> Reporter: Yang Jie
> Assignee: Apache Spark
> Priority: Major
>
> https://github.com/protocolbuffers/protobuf/releases/tag/v21.12
[jira] [Assigned] (SPARK-41711) Upgrade protobuf-java to 3.21.12
[ https://issues.apache.org/jira/browse/SPARK-41711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-41711:

    Assignee: (was: Apache Spark)

> Upgrade protobuf-java to 3.21.12
>
> Key: SPARK-41711
> URL: https://issues.apache.org/jira/browse/SPARK-41711
> Project: Spark
> Issue Type: Improvement
> Components: Build
> Affects Versions: 3.4.0
> Reporter: Yang Jie
> Priority: Major
>
> https://github.com/protocolbuffers/protobuf/releases/tag/v21.12
[jira] [Commented] (SPARK-41711) Upgrade protobuf-java to 3.21.12
[ https://issues.apache.org/jira/browse/SPARK-41711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17651945#comment-17651945 ]

Apache Spark commented on SPARK-41711:

User 'LuciferYang' has created a pull request for this issue:
https://github.com/apache/spark/pull/39217

> Upgrade protobuf-java to 3.21.12
>
> Key: SPARK-41711
> URL: https://issues.apache.org/jira/browse/SPARK-41711
> Project: Spark
> Issue Type: Improvement
> Components: Build
> Affects Versions: 3.4.0
> Reporter: Yang Jie
> Priority: Major
>
> https://github.com/protocolbuffers/protobuf/releases/tag/v21.12
[jira] [Created] (SPARK-41711) Upgrade protobuf-java to 3.21.12
Yang Jie created SPARK-41711:

    Summary: Upgrade protobuf-java to 3.21.12
    Key: SPARK-41711
    URL: https://issues.apache.org/jira/browse/SPARK-41711
    Project: Spark
    Issue Type: Improvement
    Components: Build
    Affects Versions: 3.4.0
    Reporter: Yang Jie

https://github.com/protocolbuffers/protobuf/releases/tag/v21.12
[jira] [Commented] (SPARK-41710) Implement `Column.between`
[ https://issues.apache.org/jira/browse/SPARK-41710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17651943#comment-17651943 ]

Apache Spark commented on SPARK-41710:

User 'zhengruifeng' has created a pull request for this issue:
https://github.com/apache/spark/pull/39216

> Implement `Column.between`
>
> Key: SPARK-41710
> URL: https://issues.apache.org/jira/browse/SPARK-41710
> Project: Spark
> Issue Type: Sub-task
> Components: Connect, PySpark
> Affects Versions: 3.4.0
> Reporter: Ruifeng Zheng
> Priority: Major
[jira] [Assigned] (SPARK-41710) Implement `Column.between`
[ https://issues.apache.org/jira/browse/SPARK-41710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-41710:

    Assignee: (was: Apache Spark)

> Implement `Column.between`
>
> Key: SPARK-41710
> URL: https://issues.apache.org/jira/browse/SPARK-41710
> Project: Spark
> Issue Type: Sub-task
> Components: Connect, PySpark
> Affects Versions: 3.4.0
> Reporter: Ruifeng Zheng
> Priority: Major
[jira] [Assigned] (SPARK-41710) Implement `Column.between`
[ https://issues.apache.org/jira/browse/SPARK-41710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-41710:

    Assignee: Apache Spark

> Implement `Column.between`
>
> Key: SPARK-41710
> URL: https://issues.apache.org/jira/browse/SPARK-41710
> Project: Spark
> Issue Type: Sub-task
> Components: Connect, PySpark
> Affects Versions: 3.4.0
> Reporter: Ruifeng Zheng
> Assignee: Apache Spark
> Priority: Major
[jira] [Created] (SPARK-41710) Implement `Column.between`
Ruifeng Zheng created SPARK-41710:

    Summary: Implement `Column.between`
    Key: SPARK-41710
    URL: https://issues.apache.org/jira/browse/SPARK-41710
    Project: Spark
    Issue Type: Sub-task
    Components: Connect, PySpark
    Affects Versions: 3.4.0
    Reporter: Ruifeng Zheng
[jira] [Commented] (SPARK-41709) Explicitly define `Seq` as `collection.Seq` to reduce `toSeq` when create ui objects from protobuf objects for Scala 2.13
[ https://issues.apache.org/jira/browse/SPARK-41709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17651940#comment-17651940 ]

Apache Spark commented on SPARK-41709:

User 'LuciferYang' has created a pull request for this issue:
https://github.com/apache/spark/pull/39215

> Explicitly define `Seq` as `collection.Seq` to reduce `toSeq` when create ui
> objects from protobuf objects for Scala 2.13
>
> Key: SPARK-41709
> URL: https://issues.apache.org/jira/browse/SPARK-41709
> Project: Spark
> Issue Type: Sub-task
> Components: Spark Core, SQL
> Affects Versions: 3.4.0
> Reporter: Yang Jie
> Priority: Major
[jira] [Assigned] (SPARK-41709) Explicitly define `Seq` as `collection.Seq` to reduce `toSeq` when create ui objects from protobuf objects for Scala 2.13
[ https://issues.apache.org/jira/browse/SPARK-41709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-41709:

    Assignee: Apache Spark

> Explicitly define `Seq` as `collection.Seq` to reduce `toSeq` when create ui
> objects from protobuf objects for Scala 2.13
>
> Key: SPARK-41709
> URL: https://issues.apache.org/jira/browse/SPARK-41709
> Project: Spark
> Issue Type: Sub-task
> Components: Spark Core, SQL
> Affects Versions: 3.4.0
> Reporter: Yang Jie
> Assignee: Apache Spark
> Priority: Major
[jira] [Assigned] (SPARK-41709) Explicitly define `Seq` as `collection.Seq` to reduce `toSeq` when create ui objects from protobuf objects for Scala 2.13
[ https://issues.apache.org/jira/browse/SPARK-41709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-41709:

    Assignee: (was: Apache Spark)

> Explicitly define `Seq` as `collection.Seq` to reduce `toSeq` when create ui
> objects from protobuf objects for Scala 2.13
>
> Key: SPARK-41709
> URL: https://issues.apache.org/jira/browse/SPARK-41709
> Project: Spark
> Issue Type: Sub-task
> Components: Spark Core, SQL
> Affects Versions: 3.4.0
> Reporter: Yang Jie
> Priority: Major
[jira] [Commented] (SPARK-41709) Explicitly define `Seq` as `collection.Seq` to reduce `toSeq` when create ui objects from protobuf objects for Scala 2.13
[ https://issues.apache.org/jira/browse/SPARK-41709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17651939#comment-17651939 ]

Apache Spark commented on SPARK-41709:

User 'LuciferYang' has created a pull request for this issue:
https://github.com/apache/spark/pull/39215

> Explicitly define `Seq` as `collection.Seq` to reduce `toSeq` when create ui
> objects from protobuf objects for Scala 2.13
>
> Key: SPARK-41709
> URL: https://issues.apache.org/jira/browse/SPARK-41709
> Project: Spark
> Issue Type: Sub-task
> Components: Spark Core, SQL
> Affects Versions: 3.4.0
> Reporter: Yang Jie
> Priority: Major
[jira] [Created] (SPARK-41709) Explicitly define `Seq` as `collection.Seq` to reduce `toSeq` when create ui objects from protobuf objects for Scala 2.13
Yang Jie created SPARK-41709:

    Summary: Explicitly define `Seq` as `collection.Seq` to reduce `toSeq` when create ui objects from protobuf objects for Scala 2.13
    Key: SPARK-41709
    URL: https://issues.apache.org/jira/browse/SPARK-41709
    Project: Spark
    Issue Type: Sub-task
    Components: Spark Core, SQL
    Affects Versions: 3.4.0
    Reporter: Yang Jie
[jira] [Created] (SPARK-41708) Pull v1write information to write file node
XiDuo You created SPARK-41708:

    Summary: Pull v1write information to write file node
    Key: SPARK-41708
    URL: https://issues.apache.org/jira/browse/SPARK-41708
    Project: Spark
    Issue Type: Improvement
    Components: SQL
    Affects Versions: 3.4.0
    Reporter: XiDuo You

Make WriteFiles hold v1 write information.
[jira] [Commented] (SPARK-41707) Implement initial Catalog.* API
[ https://issues.apache.org/jira/browse/SPARK-41707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17651929#comment-17651929 ]

Apache Spark commented on SPARK-41707:

User 'HyukjinKwon' has created a pull request for this issue:
https://github.com/apache/spark/pull/39214

> Implement initial Catalog.* API
>
> Key: SPARK-41707
> URL: https://issues.apache.org/jira/browse/SPARK-41707
> Project: Spark
> Issue Type: Sub-task
> Components: Connect
> Affects Versions: 3.4.0
> Reporter: Hyukjin Kwon
> Priority: Major
[jira] [Assigned] (SPARK-41707) Implement initial Catalog.* API
[ https://issues.apache.org/jira/browse/SPARK-41707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-41707:

    Assignee: (was: Apache Spark)

> Implement initial Catalog.* API
>
> Key: SPARK-41707
> URL: https://issues.apache.org/jira/browse/SPARK-41707
> Project: Spark
> Issue Type: Sub-task
> Components: Connect
> Affects Versions: 3.4.0
> Reporter: Hyukjin Kwon
> Priority: Major
[jira] [Assigned] (SPARK-41707) Implement initial Catalog.* API
[ https://issues.apache.org/jira/browse/SPARK-41707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-41707:

    Assignee: Apache Spark

> Implement initial Catalog.* API
>
> Key: SPARK-41707
> URL: https://issues.apache.org/jira/browse/SPARK-41707
> Project: Spark
> Issue Type: Sub-task
> Components: Connect
> Affects Versions: 3.4.0
> Reporter: Hyukjin Kwon
> Assignee: Apache Spark
> Priority: Major
[jira] [Commented] (SPARK-41707) Implement initial Catalog.* API
[ https://issues.apache.org/jira/browse/SPARK-41707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17651928#comment-17651928 ]

Apache Spark commented on SPARK-41707:

User 'HyukjinKwon' has created a pull request for this issue:
https://github.com/apache/spark/pull/39214

> Implement initial Catalog.* API
>
> Key: SPARK-41707
> URL: https://issues.apache.org/jira/browse/SPARK-41707
> Project: Spark
> Issue Type: Sub-task
> Components: Connect
> Affects Versions: 3.4.0
> Reporter: Hyukjin Kwon
> Priority: Major
[jira] [Created] (SPARK-41707) Implement initial Catalog.* API
Hyukjin Kwon created SPARK-41707:

    Summary: Implement initial Catalog.* API
    Key: SPARK-41707
    URL: https://issues.apache.org/jira/browse/SPARK-41707
    Project: Spark
    Issue Type: Sub-task
    Components: Connect
    Affects Versions: 3.4.0
    Reporter: Hyukjin Kwon
[jira] [Commented] (SPARK-41706) pyspark_types_to_proto_types should supports MapType
[ https://issues.apache.org/jira/browse/SPARK-41706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17651925#comment-17651925 ]

Apache Spark commented on SPARK-41706:

User 'beliefer' has created a pull request for this issue:
https://github.com/apache/spark/pull/39213

> pyspark_types_to_proto_types should supports MapType
>
> Key: SPARK-41706
> URL: https://issues.apache.org/jira/browse/SPARK-41706
> Project: Spark
> Issue Type: Sub-task
> Components: Connect
> Affects Versions: 3.4.0
> Reporter: jiaan.geng
> Priority: Major
>
> pyspark_types_to_proto_types doesn't support MapType now.
[jira] [Commented] (SPARK-41706) pyspark_types_to_proto_types should supports MapType
[ https://issues.apache.org/jira/browse/SPARK-41706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17651924#comment-17651924 ]

Apache Spark commented on SPARK-41706:

User 'beliefer' has created a pull request for this issue:
https://github.com/apache/spark/pull/39213

> pyspark_types_to_proto_types should supports MapType
>
> Key: SPARK-41706
> URL: https://issues.apache.org/jira/browse/SPARK-41706
> Project: Spark
> Issue Type: Sub-task
> Components: Connect
> Affects Versions: 3.4.0
> Reporter: jiaan.geng
> Priority: Major
>
> pyspark_types_to_proto_types doesn't support MapType now.
[jira] [Assigned] (SPARK-41706) pyspark_types_to_proto_types should supports MapType
[ https://issues.apache.org/jira/browse/SPARK-41706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-41706:

    Assignee: Apache Spark

> pyspark_types_to_proto_types should supports MapType
>
> Key: SPARK-41706
> URL: https://issues.apache.org/jira/browse/SPARK-41706
> Project: Spark
> Issue Type: Sub-task
> Components: Connect
> Affects Versions: 3.4.0
> Reporter: jiaan.geng
> Assignee: Apache Spark
> Priority: Major
>
> pyspark_types_to_proto_types doesn't support MapType now.
[jira] [Assigned] (SPARK-41706) pyspark_types_to_proto_types should supports MapType
[ https://issues.apache.org/jira/browse/SPARK-41706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-41706:

    Assignee: (was: Apache Spark)

> pyspark_types_to_proto_types should supports MapType
>
> Key: SPARK-41706
> URL: https://issues.apache.org/jira/browse/SPARK-41706
> Project: Spark
> Issue Type: Sub-task
> Components: Connect
> Affects Versions: 3.4.0
> Reporter: jiaan.geng
> Priority: Major
>
> pyspark_types_to_proto_types doesn't support MapType now.
[jira] [Created] (SPARK-41706) pyspark_types_to_proto_types should supports MapType
jiaan.geng created SPARK-41706:

    Summary: pyspark_types_to_proto_types should supports MapType
    Key: SPARK-41706
    URL: https://issues.apache.org/jira/browse/SPARK-41706
    Project: Spark
    Issue Type: Sub-task
    Components: Connect
    Affects Versions: 3.4.0
    Reporter: jiaan.geng

pyspark_types_to_proto_types doesn't support MapType now.
[jira] (SPARK-41464) Implement DataFrame.to
[ https://issues.apache.org/jira/browse/SPARK-41464 ]

jiaan.geng deleted comment on SPARK-41464:

was (Author: beliefer):
OK

> Implement DataFrame.to
>
> Key: SPARK-41464
> URL: https://issues.apache.org/jira/browse/SPARK-41464
> Project: Spark
> Issue Type: Sub-task
> Components: Connect, PySpark
> Affects Versions: 3.4.0
> Reporter: Ruifeng Zheng
> Assignee: jiaan.geng
> Priority: Major
> Fix For: 3.4.0
[jira] [Commented] (SPARK-41705) Move generate_protos.sh to dev/
[ https://issues.apache.org/jira/browse/SPARK-41705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17651900#comment-17651900 ]

Apache Spark commented on SPARK-41705:

User 'tedyu' has created a pull request for this issue:
https://github.com/apache/spark/pull/39211

> Move generate_protos.sh to dev/
>
> Key: SPARK-41705
> URL: https://issues.apache.org/jira/browse/SPARK-41705
> Project: Spark
> Issue Type: Task
> Components: Connect
> Affects Versions: 3.4.0
> Reporter: Ted Yu
> Priority: Minor
>
> connector/connect/dev only contains one script. Moving generate_protos.sh to
> dev follows practice for other scripts.
[jira] [Assigned] (SPARK-41705) Move generate_protos.sh to dev/
[ https://issues.apache.org/jira/browse/SPARK-41705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-41705:

    Assignee: (was: Apache Spark)

> Move generate_protos.sh to dev/
>
> Key: SPARK-41705
> URL: https://issues.apache.org/jira/browse/SPARK-41705
> Project: Spark
> Issue Type: Task
> Components: Connect
> Affects Versions: 3.4.0
> Reporter: Ted Yu
> Priority: Minor
>
> connector/connect/dev only contains one script. Moving generate_protos.sh to
> dev follows practice for other scripts.
[jira] [Assigned] (SPARK-41705) Move generate_protos.sh to dev/
[ https://issues.apache.org/jira/browse/SPARK-41705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-41705:

    Assignee: Apache Spark

> Move generate_protos.sh to dev/
>
> Key: SPARK-41705
> URL: https://issues.apache.org/jira/browse/SPARK-41705
> Project: Spark
> Issue Type: Task
> Components: Connect
> Affects Versions: 3.4.0
> Reporter: Ted Yu
> Assignee: Apache Spark
> Priority: Minor
>
> connector/connect/dev only contains one script. Moving generate_protos.sh to
> dev follows practice for other scripts.
[jira] [Created] (SPARK-41705) Move generate_protos.sh to dev/
Ted Yu created SPARK-41705:

    Summary: Move generate_protos.sh to dev/
    Key: SPARK-41705
    URL: https://issues.apache.org/jira/browse/SPARK-41705
    Project: Spark
    Issue Type: Task
    Components: Connect
    Affects Versions: 3.4.0
    Reporter: Ted Yu

connector/connect/dev only contains one script. Moving generate_protos.sh to dev follows practice for other scripts.
[jira] [Commented] (SPARK-41533) GRPC Errors on the client should be cleaned up
[ https://issues.apache.org/jira/browse/SPARK-41533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17651898#comment-17651898 ]

Apache Spark commented on SPARK-41533:

User 'grundprinzip' has created a pull request for this issue:
https://github.com/apache/spark/pull/39212

> GRPC Errors on the client should be cleaned up
>
> Key: SPARK-41533
> URL: https://issues.apache.org/jira/browse/SPARK-41533
> Project: Spark
> Issue Type: Sub-task
> Components: Connect
> Affects Versions: 3.4.0
> Reporter: Martin Grund
> Priority: Major
>
> When the server throws an exception we report a very deep stack trace that is
> not helpful for the user.
> We need to separate the cause from the user-visible exception and wrap the
> error into a custom exception instead of publishing the RPCError from GRPC.
[jira] [Assigned] (SPARK-41533) GRPC Errors on the client should be cleaned up
[ https://issues.apache.org/jira/browse/SPARK-41533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-41533:

    Assignee: Apache Spark

> GRPC Errors on the client should be cleaned up
>
> Key: SPARK-41533
> URL: https://issues.apache.org/jira/browse/SPARK-41533
> Project: Spark
> Issue Type: Sub-task
> Components: Connect
> Affects Versions: 3.4.0
> Reporter: Martin Grund
> Assignee: Apache Spark
> Priority: Major
>
> When the server throws an exception we report a very deep stack trace that is
> not helpful for the user.
> We need to separate the cause from the user-visible exception and wrap the
> error into a custom exception instead of publishing the RPCError from GRPC.
[jira] [Assigned] (SPARK-41533) GRPC Errors on the client should be cleaned up
[ https://issues.apache.org/jira/browse/SPARK-41533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-41533:

    Assignee: (was: Apache Spark)

> GRPC Errors on the client should be cleaned up
>
> Key: SPARK-41533
> URL: https://issues.apache.org/jira/browse/SPARK-41533
> Project: Spark
> Issue Type: Sub-task
> Components: Connect
> Affects Versions: 3.4.0
> Reporter: Martin Grund
> Priority: Major
>
> When the server throws an exception we report a very deep stack trace that is
> not helpful for the user.
> We need to separate the cause from the user-visible exception and wrap the
> error into a custom exception instead of publishing the RPCError from GRPC.
[jira] [Commented] (SPARK-41533) GRPC Errors on the client should be cleaned up
[ https://issues.apache.org/jira/browse/SPARK-41533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17651897#comment-17651897 ]

Apache Spark commented on SPARK-41533:

User 'grundprinzip' has created a pull request for this issue:
https://github.com/apache/spark/pull/39212

> GRPC Errors on the client should be cleaned up
>
> Key: SPARK-41533
> URL: https://issues.apache.org/jira/browse/SPARK-41533
> Project: Spark
> Issue Type: Sub-task
> Components: Connect
> Affects Versions: 3.4.0
> Reporter: Martin Grund
> Priority: Major
>
> When the server throws an exception we report a very deep stack trace that is
> not helpful for the user.
> We need to separate the cause from the user-visible exception and wrap the
> error into a custom exception instead of publishing the RPCError from GRPC.
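
Editor's sketch for SPARK-41533: the description above asks for the transport-level error to be kept as the cause while the user sees a short, dedicated exception. The snippet below is only a language-neutral illustration of that wrapping pattern, written in Scala with no gRPC dependency. `UserFacingConnectException`, `summarize`, and `withCleanErrors` are hypothetical names invented here; they are not part of Spark Connect, and the actual change is in the pull request linked above.

```scala
import scala.util.control.NonFatal

// Hypothetical user-facing exception; the name is illustrative only.
final class UserFacingConnectException(message: String, cause: Throwable)
    extends RuntimeException(message, cause)

object ErrorWrappingSketch {
  // Reduce a possibly multi-line transport message to its first non-empty line.
  // Falls back to the exception's class name when there is no message.
  def summarize(t: Throwable): String =
    Option(t.getMessage)
      .flatMap(_.split('\n').map(_.trim).find(_.nonEmpty))
      .getOrElse(t.getClass.getSimpleName)

  // Run `body`; rethrow any non-fatal failure as a short user-facing
  // exception, keeping the original error reachable through getCause.
  def withCleanErrors[T](body: => T): T =
    try body
    catch {
      case e: UserFacingConnectException => throw e // already wrapped
      case NonFatal(e) => throw new UserFacingConnectException(summarize(e), e)
    }
}
```

A caller would wrap each remote call in `ErrorWrappingSketch.withCleanErrors { ... }`, so the deep trace stays available for debugging through `getCause` without being the first thing the user reads.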
[jira] [Commented] (SPARK-41277) Save and leverage shuffle key in tblproperties
[ https://issues.apache.org/jira/browse/SPARK-41277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17651877#comment-17651877 ]

Ohad Raviv commented on SPARK-41277:

I put together a quick-and-dirty solution, just to be able to check it on existing processes.

I had to set `spark.sql.legacy.createHiveTableByDefault=false`, because the Hive provider, Spark, and bucketing do not play nicely together (Spark uses a different hash function from Hive).

Then I added a custom optimization rule:

{code:java}
import org.apache.spark.sql.SaveMode
import org.apache.spark.sql.catalyst.catalog.BucketSpec
import org.apache.spark.sql.catalyst.expressions.AttributeReference
import org.apache.spark.sql.catalyst.plans.logical.{Aggregate, LogicalPlan}
import org.apache.spark.sql.catalyst.rules.Rule
import org.apache.spark.sql.execution.command.CreateDataSourceTableAsSelectCommand
import org.apache.spark.sql.internal.SQLConf

object BucketingRule extends Rule[LogicalPlan] {
  override def apply(plan: LogicalPlan): LogicalPlan = {
    plan transform {
      // Only rewrite a resolved CTAS that creates a new table.
      case c @ CreateDataSourceTableAsSelectCommand(table, SaveMode.ErrorIfExists, query, _)
          if query.resolved =>
        query match {
          // Bucket by the grouping keys, but only when every key is a plain
          // column reference (the cast below would fail otherwise).
          case Aggregate(grouping, _, _)
              if grouping.nonEmpty && grouping.forall(_.isInstanceOf[AttributeReference]) =>
            val numBuckets = SQLConf.get.numShufflePartitions
            val bucketSpec = BucketSpec(
              numBuckets, grouping.map(_.asInstanceOf[AttributeReference].name), Nil)
            c.copy(table = table.copy(bucketSpec = Some(bucketSpec)))
          case _ => c
        }
    }
  }
}

// Register the rule with the current session's optimizer.
spark.sessionState.experimentalMethods.extraOptimizations ++= BucketingRule :: Nil
{code}

And it works on this mock:

{code:java}
import spark.implicits._  // needed for toDF

(1 to 30).map(i => ("k_" + (i - (1 - i % 2)), "v_" + i))
  .toDF("id", "val")
  .createOrReplaceTempView("t")

// tbl1 is created through the rule above; tbl2 is bucketed explicitly.
spark.sql("create table tbl1 select id, max(val) val, count(1) cnt from t group by id")
spark.table("t").write.bucketBy(3, "id").saveAsTable("tbl2")

// Disable broadcast joins so the join strategy depends on bucketing.
spark.conf.set("spark.sql.autoBroadcastJoinThreshold", -1)
val dfPlan = spark.sql(
  "create table tbl3 as select tbl1.* from tbl1" +
  " join tbl2 on tbl1.id=tbl2.id")
dfPlan.explain(true)
spark.table("tbl3").show()
{code}

You can see that `tbl1` gets created as a bucketed table.

I will try to see whether we get any noticeable performance gain. Meanwhile, could you suggest or point me to a better solution?

> Save and leverage shuffle key in tblproperties
>
> Key: SPARK-41277
> URL: https://issues.apache.org/jira/browse/SPARK-41277
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.3.1
> Reporter: Ohad Raviv
> Priority: Minor
>
> I'm not sure if I'm not missing anything trivial.
> In a typical process, many datasets get materialized, and many of them after a
> shuffle (e.g. a join). Then they would again be involved in further actions and
> often use the same key.
> Wouldn't it make sense to save the shuffle key along with the table to avoid
> unnecessary shuffles?
> Also, the implementation seems quite straightforward - to just leverage the
> bucketing mechanism.
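
Editor's sketch for SPARK-41277: the proposal rests on the fact that two tables bucketed on the same key with the same bucket count can be joined bucket by bucket, with no further shuffle. The Spark-free Scala snippet below only illustrates that property. `bucketOf`, `bucketize`, and `bucketJoin` are hypothetical helpers, and `hashCode` is a stand-in for whatever hash function Spark really applies when bucketing.

```scala
object CoBucketingSketch {
  // Assign a key to one of `numBuckets` buckets.
  // Assumption: hashCode stands in for the engine's real bucketing hash.
  def bucketOf(key: String, numBuckets: Int): Int =
    java.lang.Math.floorMod(key.hashCode, numBuckets)

  // Group rows by the bucket of their key, as a bucketed write would.
  def bucketize[V](rows: Seq[(String, V)], numBuckets: Int): Map[Int, Seq[(String, V)]] =
    rows.groupBy { case (k, _) => bucketOf(k, numBuckets) }

  // Join bucket i of the left side only against bucket i of the right side.
  // Equal keys always hash to the same bucket index, so no row has to move
  // between buckets to find its match.
  def bucketJoin[A, B](
      left: Seq[(String, A)],
      right: Seq[(String, B)],
      numBuckets: Int): Seq[(String, A, B)] = {
    val l = bucketize(left, numBuckets)
    val r = bucketize(right, numBuckets)
    (0 until numBuckets).flatMap { i =>
      for {
        (lk, lv) <- l.getOrElse(i, Seq.empty)
        (rk, rv) <- r.getOrElse(i, Seq.empty)
        if lk == rk
      } yield (lk, lv, rv)
    }
  }
}
```

The per-bucket join yields the same rows as a full join on the key. That is why recording the shuffle key as a bucket spec at write time, as the rule in the comment does, can let a later join on that key skip the exchange.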
[jira] [Assigned] (SPARK-41704) Upgrade `sbt-assembly` from 2.0.0 to 2.1.0
[ https://issues.apache.org/jira/browse/SPARK-41704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-41704:

    Assignee: (was: Apache Spark)

> Upgrade `sbt-assembly` from 2.0.0 to 2.1.0
>
> Key: SPARK-41704
> URL: https://issues.apache.org/jira/browse/SPARK-41704
> Project: Spark
> Issue Type: Improvement
> Components: Build
> Affects Versions: 3.4.0
> Reporter: BingKun Pan
> Priority: Minor
[jira] [Commented] (SPARK-41704) Upgrade `sbt-assembly` from 2.0.0 to 2.1.0
[ https://issues.apache.org/jira/browse/SPARK-41704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17651874#comment-17651874 ]

Apache Spark commented on SPARK-41704:

User 'panbingkun' has created a pull request for this issue:
https://github.com/apache/spark/pull/39210

> Upgrade `sbt-assembly` from 2.0.0 to 2.1.0
>
> Key: SPARK-41704
> URL: https://issues.apache.org/jira/browse/SPARK-41704
> Project: Spark
> Issue Type: Improvement
> Components: Build
> Affects Versions: 3.4.0
> Reporter: BingKun Pan
> Priority: Minor
[jira] [Assigned] (SPARK-41704) Upgrade `sbt-assembly` from 2.0.0 to 2.1.0
[ https://issues.apache.org/jira/browse/SPARK-41704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-41704:

    Assignee: Apache Spark

> Upgrade `sbt-assembly` from 2.0.0 to 2.1.0
>
> Key: SPARK-41704
> URL: https://issues.apache.org/jira/browse/SPARK-41704
> Project: Spark
> Issue Type: Improvement
> Components: Build
> Affects Versions: 3.4.0
> Reporter: BingKun Pan
> Assignee: Apache Spark
> Priority: Minor
[jira] [Created] (SPARK-41704) Upgrade `sbt-assembly` from 2.0.0 to 2.1.0
BingKun Pan created SPARK-41704: --- Summary: Upgrade `sbt-assembly` from 2.0.0 to 2.1.0 Key: SPARK-41704 URL: https://issues.apache.org/jira/browse/SPARK-41704 Project: Spark Issue Type: Improvement Components: Build Affects Versions: 3.4.0 Reporter: BingKun Pan
[jira] [Resolved] (SPARK-41702) Add invalid ops
[ https://issues.apache.org/jira/browse/SPARK-41702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-41702. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 39206 [https://github.com/apache/spark/pull/39206] > Add invalid ops > --- > > Key: SPARK-41702 > URL: https://issues.apache.org/jira/browse/SPARK-41702 > Project: Spark > Issue Type: Sub-task > Components: Connect, PySpark >Affects Versions: 3.4.0 >Reporter: Ruifeng Zheng >Assignee: Ruifeng Zheng >Priority: Major > Fix For: 3.4.0
[jira] [Resolved] (SPARK-41701) Make column op support `decimal`
[ https://issues.apache.org/jira/browse/SPARK-41701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-41701. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 39205 [https://github.com/apache/spark/pull/39205] > Make column op support `decimal` > > > Key: SPARK-41701 > URL: https://issues.apache.org/jira/browse/SPARK-41701 > Project: Spark > Issue Type: Sub-task > Components: Connect, PySpark >Affects Versions: 3.4.0 >Reporter: Ruifeng Zheng >Assignee: Ruifeng Zheng >Priority: Major > Fix For: 3.4.0
[jira] [Assigned] (SPARK-41701) Make column op support `decimal`
[ https://issues.apache.org/jira/browse/SPARK-41701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-41701: Assignee: Ruifeng Zheng > Make column op support `decimal` > > > Key: SPARK-41701 > URL: https://issues.apache.org/jira/browse/SPARK-41701 > Project: Spark > Issue Type: Sub-task > Components: Connect, PySpark >Affects Versions: 3.4.0 >Reporter: Ruifeng Zheng >Assignee: Ruifeng Zheng >Priority: Major
[jira] [Assigned] (SPARK-41702) Add invalid ops
[ https://issues.apache.org/jira/browse/SPARK-41702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-41702: Assignee: Ruifeng Zheng > Add invalid ops > --- > > Key: SPARK-41702 > URL: https://issues.apache.org/jira/browse/SPARK-41702 > Project: Spark > Issue Type: Sub-task > Components: Connect, PySpark >Affects Versions: 3.4.0 >Reporter: Ruifeng Zheng >Assignee: Ruifeng Zheng >Priority: Major
[jira] [Assigned] (SPARK-41703) Combine NullType and typed_null
[ https://issues.apache.org/jira/browse/SPARK-41703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-41703: Assignee: Apache Spark > Combine NullType and typed_null > --- > > Key: SPARK-41703 > URL: https://issues.apache.org/jira/browse/SPARK-41703 > Project: Spark > Issue Type: Sub-task > Components: Connect, PySpark >Affects Versions: 3.4.0 >Reporter: Ruifeng Zheng >Assignee: Apache Spark >Priority: Major
[jira] [Assigned] (SPARK-41703) Combine NullType and typed_null
[ https://issues.apache.org/jira/browse/SPARK-41703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-41703: Assignee: (was: Apache Spark) > Combine NullType and typed_null > --- > > Key: SPARK-41703 > URL: https://issues.apache.org/jira/browse/SPARK-41703 > Project: Spark > Issue Type: Sub-task > Components: Connect, PySpark >Affects Versions: 3.4.0 >Reporter: Ruifeng Zheng >Priority: Major
[jira] [Commented] (SPARK-41703) Combine NullType and typed_null
[ https://issues.apache.org/jira/browse/SPARK-41703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17651870#comment-17651870 ] Apache Spark commented on SPARK-41703: -- User 'zhengruifeng' has created a pull request for this issue: https://github.com/apache/spark/pull/39209 > Combine NullType and typed_null > --- > > Key: SPARK-41703 > URL: https://issues.apache.org/jira/browse/SPARK-41703 > Project: Spark > Issue Type: Sub-task > Components: Connect, PySpark >Affects Versions: 3.4.0 >Reporter: Ruifeng Zheng >Priority: Major
[jira] [Created] (SPARK-41703) Combine NullType and typed_null
Ruifeng Zheng created SPARK-41703: - Summary: Combine NullType and typed_null Key: SPARK-41703 URL: https://issues.apache.org/jira/browse/SPARK-41703 Project: Spark Issue Type: Sub-task Components: Connect, PySpark Affects Versions: 3.4.0 Reporter: Ruifeng Zheng
[jira] [Commented] (SPARK-41334) move SortField from relations.proto to expressions.proto
[ https://issues.apache.org/jira/browse/SPARK-41334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17651868#comment-17651868 ] Apache Spark commented on SPARK-41334: -- User 'zhengruifeng' has created a pull request for this issue: https://github.com/apache/spark/pull/39208 > move SortField from relations.proto to expressions.proto > > > Key: SPARK-41334 > URL: https://issues.apache.org/jira/browse/SPARK-41334 > Project: Spark > Issue Type: Sub-task > Components: Connect, PySpark >Affects Versions: 3.4.0 >Reporter: Ruifeng Zheng >Assignee: Ruifeng Zheng >Priority: Major > Fix For: 3.4.0
[jira] [Commented] (SPARK-41334) move SortField from relations.proto to expressions.proto
[ https://issues.apache.org/jira/browse/SPARK-41334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17651867#comment-17651867 ] Apache Spark commented on SPARK-41334: -- User 'zhengruifeng' has created a pull request for this issue: https://github.com/apache/spark/pull/39208 > move SortField from relations.proto to expressions.proto > > > Key: SPARK-41334 > URL: https://issues.apache.org/jira/browse/SPARK-41334 > Project: Spark > Issue Type: Sub-task > Components: Connect, PySpark >Affects Versions: 3.4.0 >Reporter: Ruifeng Zheng >Assignee: Ruifeng Zheng >Priority: Major > Fix For: 3.4.0
[jira] [Commented] (SPARK-41111) Implement `DataFrame.show`
[ https://issues.apache.org/jira/browse/SPARK-41111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17651865#comment-17651865 ] Apache Spark commented on SPARK-41111: -- User 'zhengruifeng' has created a pull request for this issue: https://github.com/apache/spark/pull/39207 > Implement `DataFrame.show` > -- > > Key: SPARK-41111 > URL: https://issues.apache.org/jira/browse/SPARK-41111 > Project: Spark > Issue Type: Sub-task > Components: Connect, PySpark >Affects Versions: 3.4.0 >Reporter: Ruifeng Zheng >Assignee: Ruifeng Zheng >Priority: Major > Fix For: 3.4.0