[jira] [Updated] (SPARK-40943) Make MSCK optional in MSCK REPAIR TABLE commands
[ https://issues.apache.org/jira/browse/SPARK-40943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun updated SPARK-40943:
----------------------------------
    Issue Type: Improvement  (was: Task)

> Make MSCK optional in MSCK REPAIR TABLE commands
> ------------------------------------------------
>
>                 Key: SPARK-40943
>                 URL: https://issues.apache.org/jira/browse/SPARK-40943
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.4.0
>            Reporter: Ben Zhang
>            Assignee: Ben Zhang
>            Priority: Major
>             Fix For: 3.4.0
>
> The current syntax for `MSCK REPAIR TABLE` is complex and difficult to
> understand. The proposal is to make the `MSCK` keyword optional so that
> `REPAIR TABLE` may be used in its stead.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

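The grammar change described above can be illustrated with a toy sketch. This is not Spark's actual ANTLR-based SQL parser; it only shows what "making the `MSCK` keyword optional" means: both spellings resolve to the same logical command.

```python
import re

def parse_repair_table(sql: str) -> dict:
    """Toy parser: accept REPAIR TABLE with an optional leading MSCK keyword.

    Purely illustrative of the proposed grammar change; Spark's real parser
    is ANTLR-based and far more general.
    """
    m = re.fullmatch(r"(?:MSCK\s+)?REPAIR\s+TABLE\s+(\w+)", sql.strip(), re.IGNORECASE)
    if not m:
        raise ValueError(f"not a REPAIR TABLE command: {sql!r}")
    # Normalize both spellings to one logical command.
    return {"command": "REPAIR TABLE", "table": m.group(1)}

# With MSCK optional, both forms are equivalent:
assert parse_repair_table("MSCK REPAIR TABLE t") == parse_repair_table("REPAIR TABLE t")
```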
[jira] [Updated] (SPARK-40943) Make MSCK optional in MSCK REPAIR TABLE commands
[ https://issues.apache.org/jira/browse/SPARK-40943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun updated SPARK-40943:
----------------------------------
    Affects Version/s: 3.4.0  (was: 3.3.1)

> Make MSCK optional in MSCK REPAIR TABLE commands
> ------------------------------------------------
>
>                 Key: SPARK-40943
>                 URL: https://issues.apache.org/jira/browse/SPARK-40943
>             Project: Spark
>          Issue Type: Task
>          Components: SQL
>    Affects Versions: 3.4.0
>            Reporter: Ben Zhang
>            Assignee: Ben Zhang
>            Priority: Major
>             Fix For: 3.4.0
>
> The current syntax for `MSCK REPAIR TABLE` is complex and difficult to
> understand. The proposal is to make the `MSCK` keyword optional so that
> `REPAIR TABLE` may be used in its stead.

[jira] [Resolved] (SPARK-40943) Make MSCK optional in MSCK REPAIR TABLE commands
[ https://issues.apache.org/jira/browse/SPARK-40943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun resolved SPARK-40943.
-----------------------------------
    Fix Version/s: 3.4.0
       Resolution: Fixed

Issue resolved by pull request 38433
[https://github.com/apache/spark/pull/38433]

> Make MSCK optional in MSCK REPAIR TABLE commands
> ------------------------------------------------
>
>                 Key: SPARK-40943
>                 URL: https://issues.apache.org/jira/browse/SPARK-40943
>             Project: Spark
>          Issue Type: Task
>          Components: SQL
>    Affects Versions: 3.3.1
>            Reporter: Ben Zhang
>            Assignee: Ben Zhang
>            Priority: Major
>             Fix For: 3.4.0
>
> The current syntax for `MSCK REPAIR TABLE` is complex and difficult to
> understand. The proposal is to make the `MSCK` keyword optional so that
> `REPAIR TABLE` may be used in its stead.

[jira] [Assigned] (SPARK-40943) Make MSCK optional in MSCK REPAIR TABLE commands
[ https://issues.apache.org/jira/browse/SPARK-40943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun reassigned SPARK-40943:
-------------------------------------
    Assignee: Ben Zhang

> Make MSCK optional in MSCK REPAIR TABLE commands
> ------------------------------------------------
>
>                 Key: SPARK-40943
>                 URL: https://issues.apache.org/jira/browse/SPARK-40943
>             Project: Spark
>          Issue Type: Task
>          Components: SQL
>    Affects Versions: 3.3.1
>            Reporter: Ben Zhang
>            Assignee: Ben Zhang
>            Priority: Major
>
> The current syntax for `MSCK REPAIR TABLE` is complex and difficult to
> understand. The proposal is to make the `MSCK` keyword optional so that
> `REPAIR TABLE` may be used in its stead.

[jira] [Commented] (SPARK-42463) Clean up the third-party Java source code introduced by SPARK-27180
[ https://issues.apache.org/jira/browse/SPARK-42463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689571#comment-17689571 ]

Apache Spark commented on SPARK-42463:
--------------------------------------

User 'LuciferYang' has created a pull request for this issue:
https://github.com/apache/spark/pull/40052

> Clean up the third-party Java source code introduced by SPARK-27180
> -------------------------------------------------------------------
>
>                 Key: SPARK-42463
>                 URL: https://issues.apache.org/jira/browse/SPARK-42463
>             Project: Spark
>          Issue Type: Improvement
>          Components: Tests, YARN
>    Affects Versions: 3.5.0
>            Reporter: Yang Jie
>            Priority: Minor
>
> * resource-managers/yarn/src/test/java/org/apache/hadoop/net/ServerSocketUtil.java
> * resource-managers/yarn/src/test/java/org/eclipse/jetty/server/SessionManager.java
> * resource-managers/yarn/src/test/java/org/eclipse/jetty/server/session/SessionHandler.java

[jira] [Assigned] (SPARK-42463) Clean up the third-party Java source code introduced by SPARK-27180
[ https://issues.apache.org/jira/browse/SPARK-42463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-42463:
------------------------------------
    Assignee: Apache Spark

> Clean up the third-party Java source code introduced by SPARK-27180
> -------------------------------------------------------------------
>
>                 Key: SPARK-42463
>                 URL: https://issues.apache.org/jira/browse/SPARK-42463
>             Project: Spark
>          Issue Type: Improvement
>          Components: Tests, YARN
>    Affects Versions: 3.5.0
>            Reporter: Yang Jie
>            Assignee: Apache Spark
>            Priority: Minor
>
> * resource-managers/yarn/src/test/java/org/apache/hadoop/net/ServerSocketUtil.java
> * resource-managers/yarn/src/test/java/org/eclipse/jetty/server/SessionManager.java
> * resource-managers/yarn/src/test/java/org/eclipse/jetty/server/session/SessionHandler.java

[jira] [Assigned] (SPARK-42463) Clean up the third-party Java source code introduced by SPARK-27180
[ https://issues.apache.org/jira/browse/SPARK-42463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-42463:
------------------------------------
    Assignee: (was: Apache Spark)

> Clean up the third-party Java source code introduced by SPARK-27180
> -------------------------------------------------------------------
>
>                 Key: SPARK-42463
>                 URL: https://issues.apache.org/jira/browse/SPARK-42463
>             Project: Spark
>          Issue Type: Improvement
>          Components: Tests, YARN
>    Affects Versions: 3.5.0
>            Reporter: Yang Jie
>            Priority: Minor
>
> * resource-managers/yarn/src/test/java/org/apache/hadoop/net/ServerSocketUtil.java
> * resource-managers/yarn/src/test/java/org/eclipse/jetty/server/SessionManager.java
> * resource-managers/yarn/src/test/java/org/eclipse/jetty/server/session/SessionHandler.java

[jira] [Commented] (SPARK-27180) Fix testing issues with yarn module in Hadoop-3
[ https://issues.apache.org/jira/browse/SPARK-27180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689569#comment-17689569 ]

Apache Spark commented on SPARK-27180:
--------------------------------------

User 'LuciferYang' has created a pull request for this issue:
https://github.com/apache/spark/pull/40052

> Fix testing issues with yarn module in Hadoop-3
> -----------------------------------------------
>
>                 Key: SPARK-27180
>                 URL: https://issues.apache.org/jira/browse/SPARK-27180
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Build, Spark Core, YARN
>    Affects Versions: 3.0.0
>            Reporter: Yuming Wang
>            Assignee: Yuming Wang
>            Priority: Major
>             Fix For: 3.0.0

[jira] [Commented] (SPARK-42463) Clean up the third-party Java source code introduced by SPARK-27180
[ https://issues.apache.org/jira/browse/SPARK-42463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689568#comment-17689568 ]

Apache Spark commented on SPARK-42463:
--------------------------------------

User 'LuciferYang' has created a pull request for this issue:
https://github.com/apache/spark/pull/40052

> Clean up the third-party Java source code introduced by SPARK-27180
> -------------------------------------------------------------------
>
>                 Key: SPARK-42463
>                 URL: https://issues.apache.org/jira/browse/SPARK-42463
>             Project: Spark
>          Issue Type: Improvement
>          Components: Tests, YARN
>    Affects Versions: 3.5.0
>            Reporter: Yang Jie
>            Priority: Minor
>
> * resource-managers/yarn/src/test/java/org/apache/hadoop/net/ServerSocketUtil.java
> * resource-managers/yarn/src/test/java/org/eclipse/jetty/server/SessionManager.java
> * resource-managers/yarn/src/test/java/org/eclipse/jetty/server/session/SessionHandler.java

[jira] [Assigned] (SPARK-42460) E2E test should clean-up results
[ https://issues.apache.org/jira/browse/SPARK-42460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun reassigned SPARK-42460:
-------------------------------------
    Assignee: Herman van Hövell

> E2E test should clean-up results
> --------------------------------
>
>                 Key: SPARK-42460
>                 URL: https://issues.apache.org/jira/browse/SPARK-42460
>             Project: Spark
>          Issue Type: Task
>          Components: Connect
>    Affects Versions: 3.4.0
>            Reporter: Herman van Hövell
>            Assignee: Herman van Hövell
>            Priority: Major

[jira] [Resolved] (SPARK-42460) E2E test should clean-up results
[ https://issues.apache.org/jira/browse/SPARK-42460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun resolved SPARK-42460.
-----------------------------------
    Fix Version/s: 3.4.0
       Resolution: Fixed

Issue resolved by pull request 40048
[https://github.com/apache/spark/pull/40048]

> E2E test should clean-up results
> --------------------------------
>
>                 Key: SPARK-42460
>                 URL: https://issues.apache.org/jira/browse/SPARK-42460
>             Project: Spark
>          Issue Type: Task
>          Components: Connect
>    Affects Versions: 3.4.0
>            Reporter: Herman van Hövell
>            Assignee: Herman van Hövell
>            Priority: Major
>             Fix For: 3.4.0

[jira] [Created] (SPARK-42463) Clean up the third-party Java source code introduced by SPARK-27180
Yang Jie created SPARK-42463:
--------------------------------

             Summary: Clean up the third-party Java source code introduced by SPARK-27180
                 Key: SPARK-42463
                 URL: https://issues.apache.org/jira/browse/SPARK-42463
             Project: Spark
          Issue Type: Improvement
          Components: Tests, YARN
    Affects Versions: 3.5.0
            Reporter: Yang Jie

* resource-managers/yarn/src/test/java/org/apache/hadoop/net/ServerSocketUtil.java
* resource-managers/yarn/src/test/java/org/eclipse/jetty/server/SessionManager.java
* resource-managers/yarn/src/test/java/org/eclipse/jetty/server/session/SessionHandler.java

[jira] [Updated] (SPARK-42462) Prevent `docker-image-tool.sh` from publishing OCI manifests
[ https://issues.apache.org/jira/browse/SPARK-42462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun updated SPARK-42462:
----------------------------------
    Fix Version/s: 3.3.3  (was: 3.3.2)

> Prevent `docker-image-tool.sh` from publishing OCI manifests
> ------------------------------------------------------------
>
>                 Key: SPARK-42462
>                 URL: https://issues.apache.org/jira/browse/SPARK-42462
>             Project: Spark
>          Issue Type: Bug
>          Components: Kubernetes
>    Affects Versions: 3.2.4, 3.4.0, 3.3.3
>            Reporter: Dongjoon Hyun
>            Assignee: Dongjoon Hyun
>            Priority: Blocker
>             Fix For: 3.2.4, 3.4.0, 3.3.3
>
> https://github.com/docker/buildx/issues/1509

[jira] [Assigned] (SPARK-42462) Prevent `docker-image-tool.sh` from publishing OCI manifests
[ https://issues.apache.org/jira/browse/SPARK-42462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun reassigned SPARK-42462:
-------------------------------------
    Assignee: Dongjoon Hyun

> Prevent `docker-image-tool.sh` from publishing OCI manifests
> ------------------------------------------------------------
>
>                 Key: SPARK-42462
>                 URL: https://issues.apache.org/jira/browse/SPARK-42462
>             Project: Spark
>          Issue Type: Bug
>          Components: Kubernetes
>    Affects Versions: 3.2.4, 3.4.0, 3.3.3
>            Reporter: Dongjoon Hyun
>            Assignee: Dongjoon Hyun
>            Priority: Blocker
>
> https://github.com/docker/buildx/issues/1509

[jira] [Resolved] (SPARK-42462) Prevent `docker-image-tool.sh` from publishing OCI manifests
[ https://issues.apache.org/jira/browse/SPARK-42462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun resolved SPARK-42462.
-----------------------------------
    Fix Version/s: 3.2.4
                   3.3.2
                   3.4.0
       Resolution: Fixed

Issue resolved by pull request 40051
[https://github.com/apache/spark/pull/40051]

> Prevent `docker-image-tool.sh` from publishing OCI manifests
> ------------------------------------------------------------
>
>                 Key: SPARK-42462
>                 URL: https://issues.apache.org/jira/browse/SPARK-42462
>             Project: Spark
>          Issue Type: Bug
>          Components: Kubernetes
>    Affects Versions: 3.2.4, 3.4.0, 3.3.3
>            Reporter: Dongjoon Hyun
>            Assignee: Dongjoon Hyun
>            Priority: Blocker
>             Fix For: 3.2.4, 3.3.2, 3.4.0
>
> https://github.com/docker/buildx/issues/1509

[jira] [Updated] (SPARK-42462) Prevent `docker-image-tool.sh` from publishing OCI manifests
[ https://issues.apache.org/jira/browse/SPARK-42462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun updated SPARK-42462:
----------------------------------
    Affects Version/s: 3.2.3

> Prevent `docker-image-tool.sh` from publishing OCI manifests
> ------------------------------------------------------------
>
>                 Key: SPARK-42462
>                 URL: https://issues.apache.org/jira/browse/SPARK-42462
>             Project: Spark
>          Issue Type: Bug
>          Components: Kubernetes
>    Affects Versions: 3.2.3, 3.3.2, 3.4.0
>            Reporter: Dongjoon Hyun
>            Priority: Blocker
>
> https://github.com/docker/buildx/issues/1509

[jira] [Updated] (SPARK-42462) Prevent `docker-image-tool.sh` from publishing OCI manifests
[ https://issues.apache.org/jira/browse/SPARK-42462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun updated SPARK-42462:
----------------------------------
    Affects Version/s: 3.2.4
                       3.3.3
                       (was: 3.2.3)
                       (was: 3.3.2)

> Prevent `docker-image-tool.sh` from publishing OCI manifests
> ------------------------------------------------------------
>
>                 Key: SPARK-42462
>                 URL: https://issues.apache.org/jira/browse/SPARK-42462
>             Project: Spark
>          Issue Type: Bug
>          Components: Kubernetes
>    Affects Versions: 3.2.4, 3.4.0, 3.3.3
>            Reporter: Dongjoon Hyun
>            Priority: Blocker
>
> https://github.com/docker/buildx/issues/1509

[jira] [Commented] (SPARK-42462) Prevent `docker-image-tool.sh` from publishing OCI manifests
[ https://issues.apache.org/jira/browse/SPARK-42462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689511#comment-17689511 ]

Apache Spark commented on SPARK-42462:
--------------------------------------

User 'dongjoon-hyun' has created a pull request for this issue:
https://github.com/apache/spark/pull/40051

> Prevent `docker-image-tool.sh` from publishing OCI manifests
> ------------------------------------------------------------
>
>                 Key: SPARK-42462
>                 URL: https://issues.apache.org/jira/browse/SPARK-42462
>             Project: Spark
>          Issue Type: Bug
>          Components: Kubernetes
>    Affects Versions: 3.3.2, 3.4.0
>            Reporter: Dongjoon Hyun
>            Priority: Blocker
>
> https://github.com/docker/buildx/issues/1509

[jira] [Assigned] (SPARK-42462) Prevent `docker-image-tool.sh` from publishing OCI manifests
[ https://issues.apache.org/jira/browse/SPARK-42462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-42462:
------------------------------------
    Assignee: Apache Spark

> Prevent `docker-image-tool.sh` from publishing OCI manifests
> ------------------------------------------------------------
>
>                 Key: SPARK-42462
>                 URL: https://issues.apache.org/jira/browse/SPARK-42462
>             Project: Spark
>          Issue Type: Bug
>          Components: Kubernetes
>    Affects Versions: 3.3.2, 3.4.0
>            Reporter: Dongjoon Hyun
>            Assignee: Apache Spark
>            Priority: Blocker
>
> https://github.com/docker/buildx/issues/1509

[jira] [Commented] (SPARK-42462) Prevent `docker-image-tool.sh` from publishing OCI manifests
[ https://issues.apache.org/jira/browse/SPARK-42462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689509#comment-17689509 ]

Apache Spark commented on SPARK-42462:
--------------------------------------

User 'dongjoon-hyun' has created a pull request for this issue:
https://github.com/apache/spark/pull/40051

> Prevent `docker-image-tool.sh` from publishing OCI manifests
> ------------------------------------------------------------
>
>                 Key: SPARK-42462
>                 URL: https://issues.apache.org/jira/browse/SPARK-42462
>             Project: Spark
>          Issue Type: Bug
>          Components: Kubernetes
>    Affects Versions: 3.3.2, 3.4.0
>            Reporter: Dongjoon Hyun
>            Priority: Blocker
>
> https://github.com/docker/buildx/issues/1509

[jira] [Assigned] (SPARK-42462) Prevent `docker-image-tool.sh` from publishing OCI manifests
[ https://issues.apache.org/jira/browse/SPARK-42462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-42462:
------------------------------------
    Assignee: (was: Apache Spark)

> Prevent `docker-image-tool.sh` from publishing OCI manifests
> ------------------------------------------------------------
>
>                 Key: SPARK-42462
>                 URL: https://issues.apache.org/jira/browse/SPARK-42462
>             Project: Spark
>          Issue Type: Bug
>          Components: Kubernetes
>    Affects Versions: 3.3.2, 3.4.0
>            Reporter: Dongjoon Hyun
>            Priority: Blocker
>
> https://github.com/docker/buildx/issues/1509

[jira] [Updated] (SPARK-42462) Prevent `docker-image-tool.sh` from publishing OCI manifests
[ https://issues.apache.org/jira/browse/SPARK-42462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun updated SPARK-42462:
----------------------------------
    Summary: Prevent `docker-image-tool.sh` from publishing OCI manifests  (was: Prevent `docker buildx` from publishing OCI manifests)

> Prevent `docker-image-tool.sh` from publishing OCI manifests
> ------------------------------------------------------------
>
>                 Key: SPARK-42462
>                 URL: https://issues.apache.org/jira/browse/SPARK-42462
>             Project: Spark
>          Issue Type: Bug
>          Components: Kubernetes
>    Affects Versions: 3.3.2, 3.4.0
>            Reporter: Dongjoon Hyun
>            Priority: Blocker
>
> https://github.com/docker/buildx/issues/1509

[jira] [Created] (SPARK-42462) Prevent `docker buildx` from publishing OCI manifests
Dongjoon Hyun created SPARK-42462:
-------------------------------------

             Summary: Prevent `docker buildx` from publishing OCI manifests
                 Key: SPARK-42462
                 URL: https://issues.apache.org/jira/browse/SPARK-42462
             Project: Spark
          Issue Type: Bug
          Components: Kubernetes
    Affects Versions: 3.3.2, 3.4.0
            Reporter: Dongjoon Hyun

https://github.com/docker/buildx/issues/1509

[jira] [Commented] (SPARK-42452) Remove hadoop-2 profile from Apache Spark
[ https://issues.apache.org/jira/browse/SPARK-42452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689484#comment-17689484 ]

Yang Jie commented on SPARK-42452:
----------------------------------

Thanks for your explanation [~dongjoon]. Let's wait until the right time :D

> Remove hadoop-2 profile from Apache Spark
> -----------------------------------------
>
>                 Key: SPARK-42452
>                 URL: https://issues.apache.org/jira/browse/SPARK-42452
>             Project: Spark
>          Issue Type: Improvement
>          Components: Build
>    Affects Versions: 3.5.0
>            Reporter: Yang Jie
>            Priority: Major
>
> SPARK-40651 Drop Hadoop2 binary distribution from release process and
> SPARK-42447 Remove Hadoop 2 GitHub Action job

[jira] [Commented] (SPARK-42461) Scala Client - Initial Set of Functions
[ https://issues.apache.org/jira/browse/SPARK-42461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689476#comment-17689476 ]

Apache Spark commented on SPARK-42461:
--------------------------------------

User 'hvanhovell' has created a pull request for this issue:
https://github.com/apache/spark/pull/40050

> Scala Client - Initial Set of Functions
> ---------------------------------------
>
>                 Key: SPARK-42461
>                 URL: https://issues.apache.org/jira/browse/SPARK-42461
>             Project: Spark
>          Issue Type: Task
>          Components: Connect
>    Affects Versions: 3.4.0
>            Reporter: Herman van Hövell
>            Priority: Major

[jira] [Assigned] (SPARK-42461) Scala Client - Initial Set of Functions
[ https://issues.apache.org/jira/browse/SPARK-42461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-42461:
------------------------------------
    Assignee: Apache Spark

> Scala Client - Initial Set of Functions
> ---------------------------------------
>
>                 Key: SPARK-42461
>                 URL: https://issues.apache.org/jira/browse/SPARK-42461
>             Project: Spark
>          Issue Type: Task
>          Components: Connect
>    Affects Versions: 3.4.0
>            Reporter: Herman van Hövell
>            Assignee: Apache Spark
>            Priority: Major

[jira] [Assigned] (SPARK-42461) Scala Client - Initial Set of Functions
[ https://issues.apache.org/jira/browse/SPARK-42461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-42461:
------------------------------------
    Assignee: (was: Apache Spark)

> Scala Client - Initial Set of Functions
> ---------------------------------------
>
>                 Key: SPARK-42461
>                 URL: https://issues.apache.org/jira/browse/SPARK-42461
>             Project: Spark
>          Issue Type: Task
>          Components: Connect
>    Affects Versions: 3.4.0
>            Reporter: Herman van Hövell
>            Priority: Major

[jira] [Commented] (SPARK-42398) refine default column value framework
[ https://issues.apache.org/jira/browse/SPARK-42398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689474#comment-17689474 ]

Apache Spark commented on SPARK-42398:
--------------------------------------

User 'cloud-fan' has created a pull request for this issue:
https://github.com/apache/spark/pull/40049

> refine default column value framework
> -------------------------------------
>
>                 Key: SPARK-42398
>                 URL: https://issues.apache.org/jira/browse/SPARK-42398
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.4.0
>            Reporter: Wenchen Fan
>            Priority: Major

[jira] [Commented] (SPARK-42398) refine default column value framework
[ https://issues.apache.org/jira/browse/SPARK-42398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689473#comment-17689473 ]

Apache Spark commented on SPARK-42398:
--------------------------------------

User 'cloud-fan' has created a pull request for this issue:
https://github.com/apache/spark/pull/40049

> refine default column value framework
> -------------------------------------
>
>                 Key: SPARK-42398
>                 URL: https://issues.apache.org/jira/browse/SPARK-42398
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.4.0
>            Reporter: Wenchen Fan
>            Priority: Major

[jira] [Created] (SPARK-42461) Scala Client - Initial Set of Functions
Herman van Hövell created SPARK-42461:
-----------------------------------------

             Summary: Scala Client - Initial Set of Functions
                 Key: SPARK-42461
                 URL: https://issues.apache.org/jira/browse/SPARK-42461
             Project: Spark
          Issue Type: Task
          Components: Connect
    Affects Versions: 3.4.0
            Reporter: Herman van Hövell

[jira] [Assigned] (SPARK-42459) Create pyspark.sql.connect.utils to keep common codes
[ https://issues.apache.org/jira/browse/SPARK-42459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon reassigned SPARK-42459:
------------------------------------
    Assignee: Hyukjin Kwon

> Create pyspark.sql.connect.utils to keep common codes
> -----------------------------------------------------
>
>                 Key: SPARK-42459
>                 URL: https://issues.apache.org/jira/browse/SPARK-42459
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Connect
>    Affects Versions: 3.4.0
>            Reporter: Hyukjin Kwon
>            Assignee: Hyukjin Kwon
>            Priority: Major
>
> SPARK-41457 added `require_minimum_grpc_version` in pandas.utils, which is
> actually unrelated to the connect module. We should move them all to a
> separate utils directory.

[jira] [Resolved] (SPARK-42459) Create pyspark.sql.connect.utils to keep common codes
[ https://issues.apache.org/jira/browse/SPARK-42459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-42459.
----------------------------------
    Fix Version/s: 3.4.0
       Resolution: Fixed

Issue resolved by pull request 40047
[https://github.com/apache/spark/pull/40047]

> Create pyspark.sql.connect.utils to keep common codes
> -----------------------------------------------------
>
>                 Key: SPARK-42459
>                 URL: https://issues.apache.org/jira/browse/SPARK-42459
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Connect
>    Affects Versions: 3.4.0
>            Reporter: Hyukjin Kwon
>            Assignee: Hyukjin Kwon
>            Priority: Major
>             Fix For: 3.4.0
>
> SPARK-41457 added `require_minimum_grpc_version` in pandas.utils, which is
> actually unrelated to the connect module. We should move them all to a
> separate utils directory.

[jira] [Assigned] (SPARK-42460) E2E test should clean-up results
[ https://issues.apache.org/jira/browse/SPARK-42460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-42460:
------------------------------------
    Assignee: (was: Apache Spark)

> E2E test should clean-up results
> --------------------------------
>
>                 Key: SPARK-42460
>                 URL: https://issues.apache.org/jira/browse/SPARK-42460
>             Project: Spark
>          Issue Type: Task
>          Components: Connect
>    Affects Versions: 3.4.0
>            Reporter: Herman van Hövell
>            Priority: Major

[jira] [Assigned] (SPARK-42460) E2E test should clean-up results
[ https://issues.apache.org/jira/browse/SPARK-42460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-42460:
------------------------------------
    Assignee: Apache Spark

> E2E test should clean-up results
> --------------------------------
>
>                 Key: SPARK-42460
>                 URL: https://issues.apache.org/jira/browse/SPARK-42460
>             Project: Spark
>          Issue Type: Task
>          Components: Connect
>    Affects Versions: 3.4.0
>            Reporter: Herman van Hövell
>            Assignee: Apache Spark
>            Priority: Major

[jira] [Commented] (SPARK-42460) E2E test should clean-up results
[ https://issues.apache.org/jira/browse/SPARK-42460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689459#comment-17689459 ]

Apache Spark commented on SPARK-42460:
--------------------------------------

User 'hvanhovell' has created a pull request for this issue:
https://github.com/apache/spark/pull/40048

> E2E test should clean-up results
> --------------------------------
>
>                 Key: SPARK-42460
>                 URL: https://issues.apache.org/jira/browse/SPARK-42460
>             Project: Spark
>          Issue Type: Task
>          Components: Connect
>    Affects Versions: 3.4.0
>            Reporter: Herman van Hövell
>            Priority: Major

[jira] [Created] (SPARK-42460) E2E test should clean-up results
Herman van Hövell created SPARK-42460:
-----------------------------------------

             Summary: E2E test should clean-up results
                 Key: SPARK-42460
                 URL: https://issues.apache.org/jira/browse/SPARK-42460
             Project: Spark
          Issue Type: Task
          Components: Connect
    Affects Versions: 3.4.0
            Reporter: Herman van Hövell

[jira] [Assigned] (SPARK-42451) Remove 3.1 and Java 17 check from filter condition of `testingVersions` in `HiveExternalCatalogVersionsSuite`
[ https://issues.apache.org/jira/browse/SPARK-42451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuming Wang reassigned SPARK-42451:
-----------------------------------
    Assignee: Yang Jie

> Remove 3.1 and Java 17 check from filter condition of `testingVersions` in
> `HiveExternalCatalogVersionsSuite`
> --------------------------------------------------------------------------
>
>                 Key: SPARK-42451
>                 URL: https://issues.apache.org/jira/browse/SPARK-42451
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL, Tests
>    Affects Versions: 3.5.0
>            Reporter: Yang Jie
>            Assignee: Yang Jie
>            Priority: Minor
>
> Spark 3.1 is already EOL and has been deleted from
> https://dist.apache.org/repos/dist/release/spark, so we can simplify the
> filter conditions of `testingVersions`; all remaining versions already
> support Java 17.

[jira] [Resolved] (SPARK-42451) Remove 3.1 and Java 17 check from filter condition of `testingVersions` in `HiveExternalCatalogVersionsSuite`
[ https://issues.apache.org/jira/browse/SPARK-42451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuming Wang resolved SPARK-42451.
---------------------------------
    Fix Version/s: 3.5.0
       Resolution: Fixed

Issue resolved by pull request 40039
[https://github.com/apache/spark/pull/40039]

> Remove 3.1 and Java 17 check from filter condition of `testingVersions` in
> `HiveExternalCatalogVersionsSuite`
> --------------------------------------------------------------------------
>
>                 Key: SPARK-42451
>                 URL: https://issues.apache.org/jira/browse/SPARK-42451
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL, Tests
>    Affects Versions: 3.5.0
>            Reporter: Yang Jie
>            Assignee: Yang Jie
>            Priority: Minor
>             Fix For: 3.5.0
>
> Spark 3.1 is already EOL and has been deleted from
> https://dist.apache.org/repos/dist/release/spark, so we can simplify the
> filter conditions of `testingVersions`; all remaining versions already
> support Java 17.

[jira] [Assigned] (SPARK-41817) SparkSession.read support reading with schema
[ https://issues.apache.org/jira/browse/SPARK-41817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-41817: Assignee: Sandeep Singh > SparkSession.read support reading with schema > - > > Key: SPARK-41817 > URL: https://issues.apache.org/jira/browse/SPARK-41817 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 3.4.0 >Reporter: Sandeep Singh >Assignee: Sandeep Singh >Priority: Major > > {code:java} > File > "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/readwriter.py", > line 122, in pyspark.sql.connect.readwriter.DataFrameReader.load > Failed example: > with tempfile.TemporaryDirectory() as d: > # Write a DataFrame into a CSV file with a header > df = spark.createDataFrame([{"age": 100, "name": "Hyukjin Kwon"}]) > df.write.option("header", > True).mode("overwrite").format("csv").save(d) > # Read the CSV file as a DataFrame with 'nullValue' option set to > 'Hyukjin Kwon', > # and 'header' option set to `True`. 
> df = spark.read.load( > d, schema=df.schema, format="csv", nullValue="Hyukjin Kwon", > header=True) > df.printSchema() > df.show() > Exception raised: > Traceback (most recent call last): > File > "/usr/local/Cellar/python@3.10/3.10.8/Frameworks/Python.framework/Versions/3.10/lib/python3.10/doctest.py", > line 1350, in __run > exec(compile(example.source, filename, "single", > File "<doctest pyspark.sql.connect.readwriter.DataFrameReader.load[1]>", line 10, in <module> > df.printSchema() > File > "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/dataframe.py", > line 1039, in printSchema > print(self._tree_string()) > File > "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/dataframe.py", > line 1035, in _tree_string > query = self._plan.to_proto(self._session.client) > File > "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/plan.py", line > 92, in to_proto > plan.root.CopyFrom(self.plan(session)) > File > "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/plan.py", line > 245, in plan > plan.read.data_source.schema = self.schema > TypeError: bad argument type for built-in operation {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-41817) SparkSession.read support reading with schema
[ https://issues.apache.org/jira/browse/SPARK-41817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-41817. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 40046 [https://github.com/apache/spark/pull/40046] > SparkSession.read support reading with schema > - > > Key: SPARK-41817 > URL: https://issues.apache.org/jira/browse/SPARK-41817 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 3.4.0 >Reporter: Sandeep Singh >Assignee: Sandeep Singh >Priority: Major > Fix For: 3.4.0 > > > {code:java} > File > "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/readwriter.py", > line 122, in pyspark.sql.connect.readwriter.DataFrameReader.load > Failed example: > with tempfile.TemporaryDirectory() as d: > # Write a DataFrame into a CSV file with a header > df = spark.createDataFrame([{"age": 100, "name": "Hyukjin Kwon"}]) > df.write.option("header", > True).mode("overwrite").format("csv").save(d) > # Read the CSV file as a DataFrame with 'nullValue' option set to > 'Hyukjin Kwon', > # and 'header' option set to `True`. 
> df = spark.read.load( > d, schema=df.schema, format="csv", nullValue="Hyukjin Kwon", > header=True) > df.printSchema() > df.show() > Exception raised: > Traceback (most recent call last): > File > "/usr/local/Cellar/python@3.10/3.10.8/Frameworks/Python.framework/Versions/3.10/lib/python3.10/doctest.py", > line 1350, in __run > exec(compile(example.source, filename, "single", > File "<doctest pyspark.sql.connect.readwriter.DataFrameReader.load[1]>", line 10, in <module> > df.printSchema() > File > "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/dataframe.py", > line 1039, in printSchema > print(self._tree_string()) > File > "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/dataframe.py", > line 1035, in _tree_string > query = self._plan.to_proto(self._session.client) > File > "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/plan.py", line > 92, in to_proto > plan.root.CopyFrom(self.plan(session)) > File > "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/plan.py", line > 245, in plan > plan.read.data_source.schema = self.schema > TypeError: bad argument type for built-in operation {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-42453) Implement function max in Scala client
[ https://issues.apache.org/jira/browse/SPARK-42453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-42453. - Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 40041 [https://github.com/apache/spark/pull/40041] > Implement function max in Scala client > -- > > Key: SPARK-42453 > URL: https://issues.apache.org/jira/browse/SPARK-42453 > Project: Spark > Issue Type: Task > Components: Connect >Affects Versions: 3.4.0 >Reporter: Rui Wang >Assignee: Rui Wang >Priority: Major > Fix For: 3.4.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42453) Implement function max in Scala client
[ https://issues.apache.org/jira/browse/SPARK-42453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-42453: --- Assignee: Rui Wang > Implement function max in Scala client > -- > > Key: SPARK-42453 > URL: https://issues.apache.org/jira/browse/SPARK-42453 > Project: Spark > Issue Type: Task > Components: Connect >Affects Versions: 3.4.0 >Reporter: Rui Wang >Assignee: Rui Wang >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-42456) Consolidating the PySpark version upgrade note pages into a single page to make it easier to read
[ https://issues.apache.org/jira/browse/SPARK-42456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-42456. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 40044 [https://github.com/apache/spark/pull/40044] > Consolidating the PySpark version upgrade note pages into a single page to > make it easier to read > - > > Key: SPARK-42456 > URL: https://issues.apache.org/jira/browse/SPARK-42456 > Project: Spark > Issue Type: Documentation > Components: PySpark >Affects Versions: 3.4.0 >Reporter: Allan Folting >Assignee: Allan Folting >Priority: Major > Fix For: 3.4.0 > > > Creating a new PySpark migration guide sub page and consolidating the > existing 9 separate pages into this one new page. This makes it easier to > take a look across multiple version upgrades by simply scrolling on the page. > Also, this is similar to the Spark Core Migration Guide page here: > [https://spark.apache.org/docs/latest/core-migration-guide.html] > > Updating the existing main Migration Guide page to point to this new sub page > and also making some minor language updates to help readers. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42456) Consolidating the PySpark version upgrade note pages into a single page to make it easier to read
[ https://issues.apache.org/jira/browse/SPARK-42456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-42456: Assignee: Allan Folting > Consolidating the PySpark version upgrade note pages into a single page to > make it easier to read > - > > Key: SPARK-42456 > URL: https://issues.apache.org/jira/browse/SPARK-42456 > Project: Spark > Issue Type: Documentation > Components: PySpark >Affects Versions: 3.4.0 >Reporter: Allan Folting >Assignee: Allan Folting >Priority: Major > > Creating a new PySpark migration guide sub page and consolidating the > existing 9 separate pages into this one new page. This makes it easier to > take a look across multiple version upgrades by simply scrolling on the page. > Also, this is similar to the Spark Core Migration Guide page here: > [https://spark.apache.org/docs/latest/core-migration-guide.html] > > Updating the existing main Migration Guide page to point to this new sub page > and also making some minor language updates to help readers. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42459) Create pyspark.sql.connect.utils to keep common codes
[ https://issues.apache.org/jira/browse/SPARK-42459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17689421#comment-17689421 ] Apache Spark commented on SPARK-42459: -- User 'HyukjinKwon' has created a pull request for this issue: https://github.com/apache/spark/pull/40047 > Create pyspark.sql.connect.utils to keep common codes > - > > Key: SPARK-42459 > URL: https://issues.apache.org/jira/browse/SPARK-42459 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 3.4.0 >Reporter: Hyukjin Kwon >Priority: Major > > SPARK-41457 added `require_minimum_grpc_version` in pandas.utils, which is > actually unrelated to the connect module. We should move all of this to a > separate utils directory -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42459) Create pyspark.sql.connect.utils to keep common codes
[ https://issues.apache.org/jira/browse/SPARK-42459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42459: Assignee: Apache Spark > Create pyspark.sql.connect.utils to keep common codes > - > > Key: SPARK-42459 > URL: https://issues.apache.org/jira/browse/SPARK-42459 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 3.4.0 >Reporter: Hyukjin Kwon >Assignee: Apache Spark >Priority: Major > > SPARK-41457 added `require_minimum_grpc_version` in pandas.utils, which is > actually unrelated to the connect module. We should move all of this to a > separate utils directory -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42459) Create pyspark.sql.connect.utils to keep common codes
[ https://issues.apache.org/jira/browse/SPARK-42459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42459: Assignee: (was: Apache Spark) > Create pyspark.sql.connect.utils to keep common codes > - > > Key: SPARK-42459 > URL: https://issues.apache.org/jira/browse/SPARK-42459 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 3.4.0 >Reporter: Hyukjin Kwon >Priority: Major > > SPARK-41457 added `require_minimum_grpc_version` in pandas.utils, which is > actually unrelated to the connect module. We should move all of this to a > separate utils directory -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-41817) SparkSession.read support reading with schema
[ https://issues.apache.org/jira/browse/SPARK-41817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-41817: Assignee: (was: Apache Spark) > SparkSession.read support reading with schema > - > > Key: SPARK-41817 > URL: https://issues.apache.org/jira/browse/SPARK-41817 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 3.4.0 >Reporter: Sandeep Singh >Priority: Major > > {code:java} > File > "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/readwriter.py", > line 122, in pyspark.sql.connect.readwriter.DataFrameReader.load > Failed example: > with tempfile.TemporaryDirectory() as d: > # Write a DataFrame into a CSV file with a header > df = spark.createDataFrame([{"age": 100, "name": "Hyukjin Kwon"}]) > df.write.option("header", > True).mode("overwrite").format("csv").save(d) > # Read the CSV file as a DataFrame with 'nullValue' option set to > 'Hyukjin Kwon', > # and 'header' option set to `True`. > df = spark.read.load( > d, schema=df.schema, format="csv", nullValue="Hyukjin Kwon", > header=True) > df.printSchema() > df.show() > Exception raised: > Traceback (most recent call last): > File > "/usr/local/Cellar/python@3.10/3.10.8/Frameworks/Python.framework/Versions/3.10/lib/python3.10/doctest.py", > line 1350, in __run > exec(compile(example.source, filename, "single", > File "<doctest pyspark.sql.connect.readwriter.DataFrameReader.load[1]>", line 10, in <module> > df.printSchema() > File > "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/dataframe.py", > line 1039, in printSchema > print(self._tree_string()) > File > "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/dataframe.py", > line 1035, in _tree_string > query = self._plan.to_proto(self._session.client) > File > "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/plan.py", line > 92, in to_proto > plan.root.CopyFrom(self.plan(session)) > File > 
"/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/plan.py", line > 245, in plan > plan.read.data_source.schema = self.schema > TypeError: bad argument type for built-in operation {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-41817) SparkSession.read support reading with schema
[ https://issues.apache.org/jira/browse/SPARK-41817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-41817: Assignee: Apache Spark > SparkSession.read support reading with schema > - > > Key: SPARK-41817 > URL: https://issues.apache.org/jira/browse/SPARK-41817 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 3.4.0 >Reporter: Sandeep Singh >Assignee: Apache Spark >Priority: Major > > {code:java} > File > "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/readwriter.py", > line 122, in pyspark.sql.connect.readwriter.DataFrameReader.load > Failed example: > with tempfile.TemporaryDirectory() as d: > # Write a DataFrame into a CSV file with a header > df = spark.createDataFrame([{"age": 100, "name": "Hyukjin Kwon"}]) > df.write.option("header", > True).mode("overwrite").format("csv").save(d) > # Read the CSV file as a DataFrame with 'nullValue' option set to > 'Hyukjin Kwon', > # and 'header' option set to `True`. 
> df = spark.read.load( > d, schema=df.schema, format="csv", nullValue="Hyukjin Kwon", > header=True) > df.printSchema() > df.show() > Exception raised: > Traceback (most recent call last): > File > "/usr/local/Cellar/python@3.10/3.10.8/Frameworks/Python.framework/Versions/3.10/lib/python3.10/doctest.py", > line 1350, in __run > exec(compile(example.source, filename, "single", > File "<doctest pyspark.sql.connect.readwriter.DataFrameReader.load[1]>", line 10, in <module> > df.printSchema() > File > "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/dataframe.py", > line 1039, in printSchema > print(self._tree_string()) > File > "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/dataframe.py", > line 1035, in _tree_string > query = self._plan.to_proto(self._session.client) > File > "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/plan.py", line > 92, in to_proto > plan.root.CopyFrom(self.plan(session)) > File > "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/plan.py", line > 245, in plan > plan.read.data_source.schema = self.schema > TypeError: bad argument type for built-in operation {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-41817) SparkSession.read support reading with schema
[ https://issues.apache.org/jira/browse/SPARK-41817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17689420#comment-17689420 ] Apache Spark commented on SPARK-41817: -- User 'ueshin' has created a pull request for this issue: https://github.com/apache/spark/pull/40046 > SparkSession.read support reading with schema > - > > Key: SPARK-41817 > URL: https://issues.apache.org/jira/browse/SPARK-41817 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 3.4.0 >Reporter: Sandeep Singh >Priority: Major > > {code:java} > File > "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/readwriter.py", > line 122, in pyspark.sql.connect.readwriter.DataFrameReader.load > Failed example: > with tempfile.TemporaryDirectory() as d: > # Write a DataFrame into a CSV file with a header > df = spark.createDataFrame([{"age": 100, "name": "Hyukjin Kwon"}]) > df.write.option("header", > True).mode("overwrite").format("csv").save(d) > # Read the CSV file as a DataFrame with 'nullValue' option set to > 'Hyukjin Kwon', > # and 'header' option set to `True`. 
> df = spark.read.load( > d, schema=df.schema, format="csv", nullValue="Hyukjin Kwon", > header=True) > df.printSchema() > df.show() > Exception raised: > Traceback (most recent call last): > File > "/usr/local/Cellar/python@3.10/3.10.8/Frameworks/Python.framework/Versions/3.10/lib/python3.10/doctest.py", > line 1350, in __run > exec(compile(example.source, filename, "single", > File "<doctest pyspark.sql.connect.readwriter.DataFrameReader.load[1]>", line 10, in <module> > df.printSchema() > File > "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/dataframe.py", > line 1039, in printSchema > print(self._tree_string()) > File > "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/dataframe.py", > line 1035, in _tree_string > query = self._plan.to_proto(self._session.client) > File > "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/plan.py", line > 92, in to_proto > plan.root.CopyFrom(self.plan(session)) > File > "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/plan.py", line > 245, in plan > plan.read.data_source.schema = self.schema > TypeError: bad argument type for built-in operation {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-42459) Create pyspark.sql.connect.utils to keep common codes
Hyukjin Kwon created SPARK-42459: Summary: Create pyspark.sql.connect.utils to keep common codes Key: SPARK-42459 URL: https://issues.apache.org/jira/browse/SPARK-42459 Project: Spark Issue Type: Sub-task Components: Connect Affects Versions: 3.4.0 Reporter: Hyukjin Kwon SPARK-41457 added `require_minimum_grpc_version` in pandas.utils, which is actually unrelated to the connect module. We should move all of this to a separate utils directory -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-42458) createDataFrame should support DDL string as schema
Takuya Ueshin created SPARK-42458: - Summary: createDataFrame should support DDL string as schema Key: SPARK-42458 URL: https://issues.apache.org/jira/browse/SPARK-42458 Project: Spark Issue Type: Sub-task Components: Connect Affects Versions: 3.4.0 Reporter: Takuya Ueshin {code:python} File "/.../python/pyspark/sql/connect/readwriter.py", line 393, in pyspark.sql.connect.readwriter.DataFrameWriter.option Failed example: with tempfile.TemporaryDirectory() as d: # Write a DataFrame into a CSV file with 'nullValue' option set to 'Hyukjin Kwon'. df = spark.createDataFrame([(100, None)], "age INT, name STRING") df.write.option("nullValue", "Hyukjin Kwon").mode("overwrite").format("csv").save(d) # Read the CSV file as a DataFrame. spark.read.schema(df.schema).format('csv').load(d).show() Exception raised: Traceback (most recent call last): File "/.../lib/python3.9/doctest.py", line 1334, in __run exec(compile(example.source, filename, "single", File "", line 3, in df = spark.createDataFrame([(100, None)], "age INT, name STRING") File "/.../python/pyspark/sql/connect/session.py", line 312, in createDataFrame raise ValueError( ValueError: Some of types cannot be determined after inferring, a StructType Schema is required in this case {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
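A DDL schema string such as `"age INT, name STRING"` is just a comma-separated list of `name type` pairs; Spark parses it with its full DDL parser, but the shape of the mapping the issue asks for can be sketched with a toy parser (purely illustrative: it handles only flat, un-nested schemas, unlike Spark's real parser):

```python
def parse_ddl_schema(ddl: str) -> list[tuple[str, str]]:
    """Toy parser: split 'age INT, name STRING' into (name, type) pairs.
    Spark's own parser additionally handles nested types, backticks,
    NOT NULL constraints, comments, etc.
    """
    fields = []
    for part in ddl.split(","):
        # Split on the first whitespace run: name first, type is the rest.
        name, dtype = part.strip().split(None, 1)
        fields.append((name, dtype.strip().upper()))
    return fields
```

The fix for SPARK-42458 would make `createDataFrame` accept such a string and resolve it to a `StructType` instead of raising the inference error shown above.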
[jira] [Resolved] (SPARK-42426) insertInto fails when the column names are different from the table columns
[ https://issues.apache.org/jira/browse/SPARK-42426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-42426. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 40024 [https://github.com/apache/spark/pull/40024] > insertInto fails when the column names are different from the table columns > --- > > Key: SPARK-42426 > URL: https://issues.apache.org/jira/browse/SPARK-42426 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 3.4.0 >Reporter: Takuya Ueshin >Assignee: Takuya Ueshin >Priority: Major > Fix For: 3.4.0 > > > {noformat} > File "/.../python/pyspark/sql/connect/readwriter.py", line 518, in > pyspark.sql.connect.readwriter.DataFrameWriter.insertInto > Failed example: > df.selectExpr("age AS col1", "name AS col2").write.insertInto("tblA") > Exception raised: > Traceback (most recent call last): > File "/.../lib/python3.9/doctest.py", line 1334, in __run > exec(compile(example.source, filename, "single", > File "<doctest pyspark.sql.connect.readwriter.DataFrameWriter.insertInto[3]>", line 1, in <module> > > df.selectExpr("age AS col1", "name AS col2").write.insertInto("tblA") > File "/.../python/pyspark/sql/connect/readwriter.py", line 477, in > insertInto > self.saveAsTable(tableName) > File "/.../python/pyspark/sql/connect/readwriter.py", line 495, in > saveAsTable > > self._spark.client.execute_command(self._write.command(self._spark.client)) > File "/.../python/pyspark/sql/connect/client.py", line 553, in > execute_command > self._execute(req) > File "/.../python/pyspark/sql/connect/client.py", line 648, in _execute > self._handle_error(rpc_error) > File "/.../python/pyspark/sql/connect/client.py", line 718, in > _handle_error > raise convert_exception(info, status.message) from None > pyspark.errors.exceptions.connect.AnalysisException: Cannot resolve 'age' > given input columns: [col1, col2]. 
> {noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
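The contract behind this fix is that `insertInto` matches DataFrame columns to table columns by position, ignoring the DataFrame's column names, whereas `saveAsTable` resolves by name; the traceback above shows the Connect client routing `insertInto` through `saveAsTable` and therefore failing on the renamed columns. A toy model of the positional contract (function and variable names here are illustrative, not Spark's):

```python
def insert_into(table_cols, df_cols, df_rows):
    """Positional semantics: the DataFrame's column names (e.g. col1, col2)
    are ignored; each row's values map onto the table's columns (e.g. age,
    name) purely by position."""
    if len(df_cols) != len(table_cols):
        raise ValueError("column count mismatch")
    return [dict(zip(table_cols, row)) for row in df_rows]
```

Under this model, `df.selectExpr("age AS col1", "name AS col2")` inserts cleanly into a table with columns `(age, name)`, matching the behavior of the non-Connect `DataFrameWriter.insertInto`.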
[jira] [Assigned] (SPARK-42426) insertInto fails when the column names are different from the table columns
[ https://issues.apache.org/jira/browse/SPARK-42426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-42426: Assignee: Takuya Ueshin > insertInto fails when the column names are different from the table columns > --- > > Key: SPARK-42426 > URL: https://issues.apache.org/jira/browse/SPARK-42426 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 3.4.0 >Reporter: Takuya Ueshin >Assignee: Takuya Ueshin >Priority: Major > > {noformat} > File "/.../python/pyspark/sql/connect/readwriter.py", line 518, in > pyspark.sql.connect.readwriter.DataFrameWriter.insertInto > Failed example: > df.selectExpr("age AS col1", "name AS col2").write.insertInto("tblA") > Exception raised: > Traceback (most recent call last): > File "/.../lib/python3.9/doctest.py", line 1334, in __run > exec(compile(example.source, filename, "single", > File "<doctest pyspark.sql.connect.readwriter.DataFrameWriter.insertInto[3]>", line 1, in <module> > > df.selectExpr("age AS col1", "name AS col2").write.insertInto("tblA") > File "/.../python/pyspark/sql/connect/readwriter.py", line 477, in > insertInto > self.saveAsTable(tableName) > File "/.../python/pyspark/sql/connect/readwriter.py", line 495, in > saveAsTable > > self._spark.client.execute_command(self._write.command(self._spark.client)) > File "/.../python/pyspark/sql/connect/client.py", line 553, in > execute_command > self._execute(req) > File "/.../python/pyspark/sql/connect/client.py", line 648, in _execute > self._handle_error(rpc_error) > File "/.../python/pyspark/sql/connect/client.py", line 718, in > _handle_error > raise convert_exception(info, status.message) from None > pyspark.errors.exceptions.connect.AnalysisException: Cannot resolve 'age' > given input columns: [col1, col2]. > {noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-42455) Rename JDBC option inferTimestampNTZType as preferTimestampNTZ
[ https://issues.apache.org/jira/browse/SPARK-42455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-42455. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 40042 [https://github.com/apache/spark/pull/40042] > Rename JDBC option inferTimestampNTZType as preferTimestampNTZ > -- > > Key: SPARK-42455 > URL: https://issues.apache.org/jira/browse/SPARK-42455 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Gengliang Wang >Assignee: Gengliang Wang >Priority: Major > Fix For: 3.4.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-42384) Mask function's generated code does not handle null input
[ https://issues.apache.org/jira/browse/SPARK-42384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang resolved SPARK-42384. Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 39945 [https://github.com/apache/spark/pull/39945] > Mask function's generated code does not handle null input > - > > Key: SPARK-42384 > URL: https://issues.apache.org/jira/browse/SPARK-42384 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.4.0, 3.5.0 >Reporter: Bruce Robbins >Assignee: Bruce Robbins >Priority: Major > Fix For: 3.4.0 > > > Example: > {noformat} > create or replace temp view v1 as > select * from values > (null), > ('AbCD123-@$#') > as data(col1); > cache table v1; > select mask(col1) from v1; > {noformat} > This query results in a {{NullPointerException}}: > {noformat} > 23/02/07 16:36:06 ERROR Executor: Exception in task 0.0 in stage 3.0 (TID 3) > java.lang.NullPointerException > at > org.apache.spark.sql.catalyst.expressions.codegen.UnsafeWriter.write(UnsafeWriter.java:110) > at > org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown > Source) > at > org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43) > at > org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:760) > {noformat} > The generated code calls {{UnsafeWriter.write(0, value_0)}} regardless of > whether {{Mask.transformInput}} returns null or not. The > {{UnsafeWriter.write}} method for {{UTF8String}} does not expect a null > pointer. > {noformat} > /* 031 */ boolean isNull_1 = i.isNullAt(0); > /* 032 */ UTF8String value_1 = isNull_1 ? 
> /* 033 */ null : (i.getUTF8String(0)); > /* 034 */ > /* 035 */ > /* 036 */ > /* 037 */ > /* 038 */ UTF8String value_0 = null; > /* 039 */ value_0 = > org.apache.spark.sql.catalyst.expressions.Mask.transformInput(value_1, > ((UTF8String) references[0] /* literal */), ((UTF8String) references[1] /* > literal */), ((UTF8String) references[2] /* literal */), ((UTF8String) > references[3] /* literal */));; > /* 040 */ if (false) { > /* 041 */ mutableStateArray_0[0].setNullAt(0); > /* 042 */ } else { > /* 043 */ mutableStateArray_0[0].write(0, value_0); > /* 044 */ } > /* 045 */ return (mutableStateArray_0[0].getRow()); > /* 046 */ } > {noformat} > The bug is not exercised by a literal null input value, since there appears > to be some optimization that simply replaces the entire function call with a > null literal: > {noformat} > spark-sql> explain SELECT mask(NULL); > == Physical Plan == > *(1) Project [null AS mask(NULL, X, x, n, NULL)#47] > +- *(1) Scan OneRowRelation[] > Time taken: 0.026 seconds, Fetched 1 row(s) > spark-sql> SELECT mask(NULL); > NULL > Time taken: 0.042 seconds, Fetched 1 row(s) > spark-sql> > {noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
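The fix amounts to null-checking the *result* of `Mask.transformInput` before writing, instead of the hard-coded `if (false)` in the generated code quoted above. A Python model of the required behavior (the default replacement characters X/x/n and pass-through of other characters match the `mask(NULL, X, x, n, NULL)` defaults shown in the plan; the writer function is an illustrative sketch, not the actual codegen):

```python
def transform_input(value, upper="X", lower="x", digit="n"):
    """Model of Mask.transformInput: a None input must propagate as None
    rather than being handed to the writer."""
    if value is None:
        return None
    return "".join(
        upper if ch.isupper()
        else lower if ch.islower()
        else digit if ch.isdigit()
        else ch  # 'other' characters pass through under the NULL default
        for ch in value
    )

def write_masked(value):
    """Fixed write path: test the RESULT for null. The buggy codegen tested
    a constant false and always called UnsafeWriter.write, which NPEs on a
    null UTF8String."""
    masked = transform_input(value)
    return "NULL" if masked is None else masked  # setNullAt(0) vs. write(0, ...)
```

Running this model on the repro's inputs gives `NULL` for the null row and a masked string for `'AbCD123-@$#'`, the behavior the interpreted (non-codegen) path already had.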
[jira] [Assigned] (SPARK-42384) Mask function's generated code does not handle null input
[ https://issues.apache.org/jira/browse/SPARK-42384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang reassigned SPARK-42384: -- Assignee: Bruce Robbins > Mask function's generated code does not handle null input > - > > Key: SPARK-42384 > URL: https://issues.apache.org/jira/browse/SPARK-42384 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.4.0, 3.5.0 >Reporter: Bruce Robbins >Assignee: Bruce Robbins >Priority: Major > > Example: > {noformat} > create or replace temp view v1 as > select * from values > (null), > ('AbCD123-@$#') > as data(col1); > cache table v1; > select mask(col1) from v1; > {noformat} > This query results in a {{NullPointerException}}: > {noformat} > 23/02/07 16:36:06 ERROR Executor: Exception in task 0.0 in stage 3.0 (TID 3) > java.lang.NullPointerException > at > org.apache.spark.sql.catalyst.expressions.codegen.UnsafeWriter.write(UnsafeWriter.java:110) > at > org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown > Source) > at > org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43) > at > org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:760) > {noformat} > The generated code calls {{UnsafeWriter.write(0, value_0)}} regardless of > whether {{Mask.transformInput}} returns null or not. The > {{UnsafeWriter.write}} method for {{UTF8String}} does not expect a null > pointer. > {noformat} > /* 031 */ boolean isNull_1 = i.isNullAt(0); > /* 032 */ UTF8String value_1 = isNull_1 ? 
> /* 033 */ null : (i.getUTF8String(0)); > /* 034 */ > /* 035 */ > /* 036 */ > /* 037 */ > /* 038 */ UTF8String value_0 = null; > /* 039 */ value_0 = > org.apache.spark.sql.catalyst.expressions.Mask.transformInput(value_1, > ((UTF8String) references[0] /* literal */), ((UTF8String) references[1] /* > literal */), ((UTF8String) references[2] /* literal */), ((UTF8String) > references[3] /* literal */));; > /* 040 */ if (false) { > /* 041 */ mutableStateArray_0[0].setNullAt(0); > /* 042 */ } else { > /* 043 */ mutableStateArray_0[0].write(0, value_0); > /* 044 */ } > /* 045 */ return (mutableStateArray_0[0].getRow()); > /* 046 */ } > {noformat} > The bug is not exercised by a literal null input value, since there appears > to be some optimization that simply replaces the entire function call with a > null literal: > {noformat} > spark-sql> explain SELECT mask(NULL); > == Physical Plan == > *(1) Project [null AS mask(NULL, X, x, n, NULL)#47] > +- *(1) Scan OneRowRelation[] > Time taken: 0.026 seconds, Fetched 1 row(s) > spark-sql> SELECT mask(NULL); > NULL > Time taken: 0.042 seconds, Fetched 1 row(s) > spark-sql> > {noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-41591) Implement functionality for training a PyTorch file locally
[ https://issues.apache.org/jira/browse/SPARK-41591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17689407#comment-17689407 ] Apache Spark commented on SPARK-41591: -- User 'rithwik-db' has created a pull request for this issue: https://github.com/apache/spark/pull/40045 > Implement functionality for training a PyTorch file locally > --- > > Key: SPARK-41591 > URL: https://issues.apache.org/jira/browse/SPARK-41591 > Project: Spark > Issue Type: Sub-task > Components: ML >Affects Versions: 3.4.0 >Reporter: Rithwik Ediga Lakhamsani >Assignee: Rithwik Ediga Lakhamsani >Priority: Major > Fix For: 3.4.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42457) Scala Client Session Read API
[ https://issues.apache.org/jira/browse/SPARK-42457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42457: Assignee: Apache Spark > Scala Client Session Read API > - > > Key: SPARK-42457 > URL: https://issues.apache.org/jira/browse/SPARK-42457 > Project: Spark > Issue Type: Improvement > Components: Connect >Affects Versions: 3.4.0 >Reporter: Zhen Li >Assignee: Apache Spark >Priority: Major > > Add SparkSession#read impl to be able to read data. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42457) Scala Client Session Read API
[ https://issues.apache.org/jira/browse/SPARK-42457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42457: Assignee: (was: Apache Spark) > Scala Client Session Read API > - > > Key: SPARK-42457 > URL: https://issues.apache.org/jira/browse/SPARK-42457 > Project: Spark > Issue Type: Improvement > Components: Connect >Affects Versions: 3.4.0 >Reporter: Zhen Li >Priority: Major > > Add SparkSession#read impl to be able to read data. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42457) Scala Client Session Read API
[ https://issues.apache.org/jira/browse/SPARK-42457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17689370#comment-17689370 ] Apache Spark commented on SPARK-42457: -- User 'zhenlineo' has created a pull request for this issue: https://github.com/apache/spark/pull/40025 > Scala Client Session Read API > - > > Key: SPARK-42457 > URL: https://issues.apache.org/jira/browse/SPARK-42457 > Project: Spark > Issue Type: Improvement > Components: Connect >Affects Versions: 3.4.0 >Reporter: Zhen Li >Priority: Major > > Add SparkSession#read impl to be able to read data. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-42457) Scala Client Session Read API
Zhen Li created SPARK-42457: --- Summary: Scala Client Session Read API Key: SPARK-42457 URL: https://issues.apache.org/jira/browse/SPARK-42457 Project: Spark Issue Type: Improvement Components: Connect Affects Versions: 3.4.0 Reporter: Zhen Li Add SparkSession#read impl to be able to read data.
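As a rough illustration of the kind of fluent reader API the ticket describes, here is a self-contained Java sketch of a builder-style reader. The `DataReader` class and its methods are hypothetical stand-ins for the Scala client's eventual API, not actual Spark Connect classes:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical builder-style reader; not Spark's actual DataFrameReader.
class DataReader {
    private String format = "parquet";
    private final Map<String, String> options = new LinkedHashMap<>();

    DataReader format(String fmt) { this.format = fmt; return this; }

    DataReader option(String key, String value) { options.put(key, value); return this; }

    // In the real API this would return a DataFrame; here it only
    // describes the load it would perform.
    String describeLoad(String path) {
        return "load " + path + " as " + format + " with " + options;
    }
}
```

The point of the builder shape is that format and options accumulate before the terminal `load`-style call, which is where the client would send the read request to the server.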
[jira] [Assigned] (SPARK-42456) Consolidating the PySpark version upgrade note pages into a single page to make it easier to read
[ https://issues.apache.org/jira/browse/SPARK-42456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42456: Assignee: (was: Apache Spark) > Consolidating the PySpark version upgrade note pages into a single page to > make it easier to read > - > > Key: SPARK-42456 > URL: https://issues.apache.org/jira/browse/SPARK-42456 > Project: Spark > Issue Type: Documentation > Components: PySpark >Affects Versions: 3.4.0 >Reporter: Allan Folting >Priority: Major > > Creating a new PySpark migration guide sub page and consolidating the > existing 9 separate pages into this one new page. This makes it easier to > take a look across multiple version upgrades by simply scrolling on the page. > Also, this is similar to the Spark Core Migration Guide page here: > [https://spark.apache.org/docs/latest/core-migration-guide.html] > > Updating the existing main Migration Guide page to point to this new sub page > and also making some minor language updates to help readers. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42456) Consolidating the PySpark version upgrade note pages into a single page to make it easier to read
[ https://issues.apache.org/jira/browse/SPARK-42456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42456: Assignee: Apache Spark > Consolidating the PySpark version upgrade note pages into a single page to > make it easier to read > - > > Key: SPARK-42456 > URL: https://issues.apache.org/jira/browse/SPARK-42456 > Project: Spark > Issue Type: Documentation > Components: PySpark >Affects Versions: 3.4.0 >Reporter: Allan Folting >Assignee: Apache Spark >Priority: Major > > Creating a new PySpark migration guide sub page and consolidating the > existing 9 separate pages into this one new page. This makes it easier to > take a look across multiple version upgrades by simply scrolling on the page. > Also, this is similar to the Spark Core Migration Guide page here: > [https://spark.apache.org/docs/latest/core-migration-guide.html] > > Updating the existing main Migration Guide page to point to this new sub page > and also making some minor language updates to help readers. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42456) Consolidating the PySpark version upgrade note pages into a single page to make it easier to read
[ https://issues.apache.org/jira/browse/SPARK-42456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17689365#comment-17689365 ] Apache Spark commented on SPARK-42456: -- User 'allanf-db' has created a pull request for this issue: https://github.com/apache/spark/pull/40044 > Consolidating the PySpark version upgrade note pages into a single page to > make it easier to read > - > > Key: SPARK-42456 > URL: https://issues.apache.org/jira/browse/SPARK-42456 > Project: Spark > Issue Type: Documentation > Components: PySpark >Affects Versions: 3.4.0 >Reporter: Allan Folting >Priority: Major > > Creating a new PySpark migration guide sub page and consolidating the > existing 9 separate pages into this one new page. This makes it easier to > take a look across multiple version upgrades by simply scrolling on the page. > Also, this is similar to the Spark Core Migration Guide page here: > [https://spark.apache.org/docs/latest/core-migration-guide.html] > > Updating the existing main Migration Guide page to point to this new sub page > and also making some minor language updates to help readers. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-42456) Consolidating the PySpark version upgrade note pages into a single page to make it easier to read
[ https://issues.apache.org/jira/browse/SPARK-42456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allan Folting updated SPARK-42456: -- Description: Creating a new PySpark migration guide sub page and consolidating the existing 9 separate pages into this one new page. This makes it easier to take a look across multiple version upgrades by simply scrolling on the page. Also, this is similar to the Spark Core Migration Guide page here: [https://spark.apache.org/docs/latest/core-migration-guide.html] Updating the existing main Migration Guide page to point to this new sub page and also making some minor language updates to help readers. was: Creating a new PySpark migration guide and consolidating the existing 9 separate pages into this one new page. This makes it easier to take a look across multiple version upgrades by simply scrolling on the page. Also, this is similar to the Spark Core Migration Guide page here: [https://spark.apache.org/docs/latest/core-migration-guide.html] Updating the existing main Migration Guide page to point to this new sub page and also making some minor language updates to help readers. > Consolidating the PySpark version upgrade note pages into a single page to > make it easier to read > - > > Key: SPARK-42456 > URL: https://issues.apache.org/jira/browse/SPARK-42456 > Project: Spark > Issue Type: Documentation > Components: PySpark >Affects Versions: 3.4.0 >Reporter: Allan Folting >Priority: Major > > Creating a new PySpark migration guide sub page and consolidating the > existing 9 separate pages into this one new page. This makes it easier to > take a look across multiple version upgrades by simply scrolling on the page. > Also, this is similar to the Spark Core Migration Guide page here: > [https://spark.apache.org/docs/latest/core-migration-guide.html] > > Updating the existing main Migration Guide page to point to this new sub page > and also making some minor language updates to help readers. 
-- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-42456) Consolidating the PySpark version upgrade note pages into a single page to make it easier to read
Allan Folting created SPARK-42456: - Summary: Consolidating the PySpark version upgrade note pages into a single page to make it easier to read Key: SPARK-42456 URL: https://issues.apache.org/jira/browse/SPARK-42456 Project: Spark Issue Type: Documentation Components: PySpark Affects Versions: 3.4.0 Reporter: Allan Folting Creating a new PySpark migration guide and consolidating the existing 9 separate pages into this one new page. This makes it easier to take a look across multiple version upgrades by simply scrolling on the page. Also, this is similar to the Spark Core Migration Guide page here: [https://spark.apache.org/docs/latest/core-migration-guide.html] Updating the existing main Migration Guide page to point to this new sub page and also making some minor language updates to help readers. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42452) Remove hadoop-2 profile from Apache Spark
[ https://issues.apache.org/jira/browse/SPARK-42452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17689354#comment-17689354 ] Dongjoon Hyun commented on SPARK-42452: --- Not yet~ I simply removed the broken one because no one will take a look. Before the Apache Spark 3.4 release, we cannot change the `master` branch dramatically. We still need to back-port many bug fixes during RC1 ~ RCx, [~LuciferYang]. So, please hold on your passion a little more. ;) > Remove hadoop-2 profile from Apache Spark > - > > Key: SPARK-42452 > URL: https://issues.apache.org/jira/browse/SPARK-42452 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 3.5.0 >Reporter: Yang Jie >Priority: Major > > SPARK-40651 Drop Hadoop2 binary distribution from release process and > SPARK-42447 Remove Hadoop 2 GitHub Action job
[jira] [Commented] (SPARK-39904) Rename inferDate to preferDate and fix an issue when inferring schema
[ https://issues.apache.org/jira/browse/SPARK-39904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17689349#comment-17689349 ] Apache Spark commented on SPARK-39904: -- User 'gengliangwang' has created a pull request for this issue: https://github.com/apache/spark/pull/40043 > Rename inferDate to preferDate and fix an issue when inferring schema > - > > Key: SPARK-39904 > URL: https://issues.apache.org/jira/browse/SPARK-39904 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.4.0 >Reporter: Ivan Sadikov >Assignee: Ivan Sadikov >Priority: Major > Fix For: 3.4.0 > > > Follow-up for https://issues.apache.org/jira/browse/SPARK-39469. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42455) Rename JDBC option inferTimestampNTZType as preferTimestampNTZ
[ https://issues.apache.org/jira/browse/SPARK-42455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42455: Assignee: Gengliang Wang (was: Apache Spark) > Rename JDBC option inferTimestampNTZType as preferTimestampNTZ > -- > > Key: SPARK-42455 > URL: https://issues.apache.org/jira/browse/SPARK-42455 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Gengliang Wang >Assignee: Gengliang Wang >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42455) Rename JDBC option inferTimestampNTZType as preferTimestampNTZ
[ https://issues.apache.org/jira/browse/SPARK-42455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42455: Assignee: Apache Spark (was: Gengliang Wang) > Rename JDBC option inferTimestampNTZType as preferTimestampNTZ > -- > > Key: SPARK-42455 > URL: https://issues.apache.org/jira/browse/SPARK-42455 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Gengliang Wang >Assignee: Apache Spark >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42455) Rename JDBC option inferTimestampNTZType as preferTimestampNTZ
[ https://issues.apache.org/jira/browse/SPARK-42455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17689343#comment-17689343 ] Apache Spark commented on SPARK-42455: -- User 'gengliangwang' has created a pull request for this issue: https://github.com/apache/spark/pull/40042 > Rename JDBC option inferTimestampNTZType as preferTimestampNTZ > -- > > Key: SPARK-42455 > URL: https://issues.apache.org/jira/browse/SPARK-42455 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Gengliang Wang >Assignee: Gengliang Wang >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-42455) Rename JDBC option inferTimestampNTZType as preferTimestampNTZ
Gengliang Wang created SPARK-42455: -- Summary: Rename JDBC option inferTimestampNTZType as preferTimestampNTZ Key: SPARK-42455 URL: https://issues.apache.org/jira/browse/SPARK-42455 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.4.0 Reporter: Gengliang Wang Assignee: Gengliang Wang
[jira] [Created] (SPARK-42454) SPJ: encapsulate all SPJ related parameters in BatchScanExec
Chao Sun created SPARK-42454: Summary: SPJ: encapsulate all SPJ related parameters in BatchScanExec Key: SPARK-42454 URL: https://issues.apache.org/jira/browse/SPARK-42454 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.3.1 Reporter: Chao Sun The list of SPJ parameters in {{BatchScanExec}} keeps growing, which is annoying since there are many places which do pattern-matching on {{BatchScanExec}} and they have to change accordingly. To make this less disruptive, we can introduce a struct for all the SPJ classes and use that as the parameter for {{BatchScanExec}}.
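The proposed refactor can be sketched as grouping the SPJ parameters into one value type, so call sites that deconstruct the scan node only name the holder. A hedged Java sketch, with field names invented for illustration (the real `BatchScanExec` parameters differ):

```java
// Hypothetical holder for the SPJ parameters; adding a new SPJ knob means
// adding a field here, not another constructor parameter on the scan node.
record SpjParams(boolean pushedDown, int numPartitions, boolean replicatePartitions) {}

// The scan node then carries one SpjParams instead of a growing list of
// fields, so pattern-matching call sites stay stable as SPJ evolves.
record BatchScanSketch(String table, SpjParams spj) {}
```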
[jira] [Resolved] (SPARK-40653) Protobuf Support in Structured Streaming
[ https://issues.apache.org/jira/browse/SPARK-40653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raghu Angadi resolved SPARK-40653. -- Fix Version/s: 3.4.0 Resolution: Fixed Protobuf functions have been in use for a couple of months. > Protobuf Support in Structured Streaming > > > Key: SPARK-40653 > URL: https://issues.apache.org/jira/browse/SPARK-40653 > Project: Spark > Issue Type: Epic > Components: Protobuf, Structured Streaming >Affects Versions: 3.4.0 >Reporter: Raghu Angadi >Priority: Major > Fix For: 3.4.0 > > > Add support for Protobuf messages in streaming sources. This would be similar > to Avro format support. This includes features like schema-registry, Python > support, schema evolution, etc.
[jira] [Assigned] (SPARK-42453) Implement function max in Scala client
[ https://issues.apache.org/jira/browse/SPARK-42453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42453: Assignee: (was: Apache Spark) > Implement function max in Scala client > -- > > Key: SPARK-42453 > URL: https://issues.apache.org/jira/browse/SPARK-42453 > Project: Spark > Issue Type: Task > Components: Connect >Affects Versions: 3.4.0 >Reporter: Rui Wang >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42453) Implement function max in Scala client
[ https://issues.apache.org/jira/browse/SPARK-42453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17689307#comment-17689307 ] Apache Spark commented on SPARK-42453: -- User 'amaliujia' has created a pull request for this issue: https://github.com/apache/spark/pull/40041 > Implement function max in Scala client > -- > > Key: SPARK-42453 > URL: https://issues.apache.org/jira/browse/SPARK-42453 > Project: Spark > Issue Type: Task > Components: Connect >Affects Versions: 3.4.0 >Reporter: Rui Wang >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42453) Implement function max in Scala client
[ https://issues.apache.org/jira/browse/SPARK-42453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42453: Assignee: Apache Spark > Implement function max in Scala client > -- > > Key: SPARK-42453 > URL: https://issues.apache.org/jira/browse/SPARK-42453 > Project: Spark > Issue Type: Task > Components: Connect >Affects Versions: 3.4.0 >Reporter: Rui Wang >Assignee: Apache Spark >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-42441) Scala Client - Implement Column API
[ https://issues.apache.org/jira/browse/SPARK-42441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hövell resolved SPARK-42441. --- Fix Version/s: 3.4.0 Assignee: Herman van Hövell Resolution: Fixed > Scala Client - Implement Column API > --- > > Key: SPARK-42441 > URL: https://issues.apache.org/jira/browse/SPARK-42441 > Project: Spark > Issue Type: Task > Components: Connect >Affects Versions: 3.4.0 >Reporter: Herman van Hövell >Assignee: Herman van Hövell >Priority: Major > Fix For: 3.4.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-42453) Implement function max in Scala client
Rui Wang created SPARK-42453: Summary: Implement function max in Scala client Key: SPARK-42453 URL: https://issues.apache.org/jira/browse/SPARK-42453 Project: Spark Issue Type: Task Components: Connect Affects Versions: 3.4.0 Reporter: Rui Wang
[jira] [Updated] (SPARK-42445) Fix SparkR install.spark function
[ https://issues.apache.org/jira/browse/SPARK-42445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-42445: -- Fix Version/s: 3.3.3 > Fix SparkR install.spark function > - > > Key: SPARK-42445 > URL: https://issues.apache.org/jira/browse/SPARK-42445 > Project: Spark > Issue Type: Bug > Components: R >Affects Versions: 3.3.0, 3.3.1, 3.3.2, 3.4.0 >Reporter: Dongjoon Hyun >Assignee: Dongjoon Hyun >Priority: Major > Fix For: 3.4.0, 3.3.3 > > > {code} > $ R > R version 4.2.1 (2022-06-23) -- "Funny-Looking Kid" > Copyright (C) 2022 The R Foundation for Statistical Computing > Platform: aarch64-apple-darwin20 (64-bit) > R is free software and comes with ABSOLUTELY NO WARRANTY. > You are welcome to redistribute it under certain conditions. > Type 'license()' or 'licence()' for distribution details. > Natural language support but running in an English locale > R is a collaborative project with many contributors. > Type 'contributors()' for more information and > 'citation()' on how to cite R or R packages in publications. > Type 'demo()' for some demos, 'help()' for on-line help, or > 'help.start()' for an HTML browser interface to help. > Type 'q()' to quit R. > > library(SparkR) > Attaching package: ‘SparkR’ > The following objects are masked from ‘package:stats’: > cov, filter, lag, na.omit, predict, sd, var, window > The following objects are masked from ‘package:base’: > as.data.frame, colnames, colnames<-, drop, endsWith, intersect, > rank, rbind, sample, startsWith, subset, summary, transform, union > > install.spark() > Spark not found in the cache directory. Installation will start. > MirrorUrl not provided. > Looking for preferred site from apache website... 
> Preferred mirror site found: https://dlcdn.apache.org/spark > Downloading spark-3.3.2 for Hadoop 2.7 from: > - https://dlcdn.apache.org/spark/spark-3.3.2/spark-3.3.2-bin-hadoop2.7.tgz > trying URL > 'https://dlcdn.apache.org/spark/spark-3.3.2/spark-3.3.2-bin-hadoop2.7.tgz' > simpleWarning in download.file(remotePath, localPath): downloaded length 0 != > reported length 196 > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-42445) Fix SparkR install.spark function
[ https://issues.apache.org/jira/browse/SPARK-42445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-42445. --- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 40031 [https://github.com/apache/spark/pull/40031] > Fix SparkR install.spark function > - > > Key: SPARK-42445 > URL: https://issues.apache.org/jira/browse/SPARK-42445 > Project: Spark > Issue Type: Bug > Components: R >Affects Versions: 3.3.0, 3.3.1, 3.3.2, 3.4.0 >Reporter: Dongjoon Hyun >Assignee: Dongjoon Hyun >Priority: Major > Fix For: 3.4.0 > > > {code} > $ R > R version 4.2.1 (2022-06-23) -- "Funny-Looking Kid" > Copyright (C) 2022 The R Foundation for Statistical Computing > Platform: aarch64-apple-darwin20 (64-bit) > R is free software and comes with ABSOLUTELY NO WARRANTY. > You are welcome to redistribute it under certain conditions. > Type 'license()' or 'licence()' for distribution details. > Natural language support but running in an English locale > R is a collaborative project with many contributors. > Type 'contributors()' for more information and > 'citation()' on how to cite R or R packages in publications. > Type 'demo()' for some demos, 'help()' for on-line help, or > 'help.start()' for an HTML browser interface to help. > Type 'q()' to quit R. > > library(SparkR) > Attaching package: ‘SparkR’ > The following objects are masked from ‘package:stats’: > cov, filter, lag, na.omit, predict, sd, var, window > The following objects are masked from ‘package:base’: > as.data.frame, colnames, colnames<-, drop, endsWith, intersect, > rank, rbind, sample, startsWith, subset, summary, transform, union > > install.spark() > Spark not found in the cache directory. Installation will start. > MirrorUrl not provided. > Looking for preferred site from apache website... 
> Preferred mirror site found: https://dlcdn.apache.org/spark > Downloading spark-3.3.2 for Hadoop 2.7 from: > - https://dlcdn.apache.org/spark/spark-3.3.2/spark-3.3.2-bin-hadoop2.7.tgz > trying URL > 'https://dlcdn.apache.org/spark/spark-3.3.2/spark-3.3.2-bin-hadoop2.7.tgz' > simpleWarning in download.file(remotePath, localPath): downloaded length 0 != > reported length 196 > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42445) Fix SparkR install.spark function
[ https://issues.apache.org/jira/browse/SPARK-42445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-42445: - Assignee: Dongjoon Hyun > Fix SparkR install.spark function > - > > Key: SPARK-42445 > URL: https://issues.apache.org/jira/browse/SPARK-42445 > Project: Spark > Issue Type: Bug > Components: R >Affects Versions: 3.3.0, 3.3.1, 3.3.2, 3.4.0 >Reporter: Dongjoon Hyun >Assignee: Dongjoon Hyun >Priority: Major > > {code} > $ R > R version 4.2.1 (2022-06-23) -- "Funny-Looking Kid" > Copyright (C) 2022 The R Foundation for Statistical Computing > Platform: aarch64-apple-darwin20 (64-bit) > R is free software and comes with ABSOLUTELY NO WARRANTY. > You are welcome to redistribute it under certain conditions. > Type 'license()' or 'licence()' for distribution details. > Natural language support but running in an English locale > R is a collaborative project with many contributors. > Type 'contributors()' for more information and > 'citation()' on how to cite R or R packages in publications. > Type 'demo()' for some demos, 'help()' for on-line help, or > 'help.start()' for an HTML browser interface to help. > Type 'q()' to quit R. > > library(SparkR) > Attaching package: ‘SparkR’ > The following objects are masked from ‘package:stats’: > cov, filter, lag, na.omit, predict, sd, var, window > The following objects are masked from ‘package:base’: > as.data.frame, colnames, colnames<-, drop, endsWith, intersect, > rank, rbind, sample, startsWith, subset, summary, transform, union > > install.spark() > Spark not found in the cache directory. Installation will start. > MirrorUrl not provided. > Looking for preferred site from apache website... 
> Preferred mirror site found: https://dlcdn.apache.org/spark > Downloading spark-3.3.2 for Hadoop 2.7 from: > - https://dlcdn.apache.org/spark/spark-3.3.2/spark-3.3.2-bin-hadoop2.7.tgz > trying URL > 'https://dlcdn.apache.org/spark/spark-3.3.2/spark-3.3.2-bin-hadoop2.7.tgz' > simpleWarning in download.file(remotePath, localPath): downloaded length 0 != > reported length 196 > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42399) CONV() silently overflows returning wrong results
[ https://issues.apache.org/jira/browse/SPARK-42399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42399: Assignee: Apache Spark > CONV() silently overflows returning wrong results > - > > Key: SPARK-42399 > URL: https://issues.apache.org/jira/browse/SPARK-42399 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.4.0 >Reporter: Serge Rielau >Assignee: Apache Spark >Priority: Critical > > spark-sql> SELECT > CONV(SUBSTRING('0x', > 3), 16, 10); > 18446744073709551615 > Time taken: 2.114 seconds, Fetched 1 row(s) > spark-sql> set spark.sql.ansi.enabled = true; > spark.sql.ansi.enabled true > Time taken: 0.068 seconds, Fetched 1 row(s) > spark-sql> SELECT > CONV(SUBSTRING('0x', > 3), 16, 10); > 18446744073709551615 > Time taken: 0.05 seconds, Fetched 1 row(s) > In ANSI mode we should raise an error for sure. > In non-ANSI mode either an error or a NULL may be acceptable. > Alternatively, of course, we could consider if we can support arbitrary > domains since the result is a STRING again.
[jira] [Commented] (SPARK-42399) CONV() silently overflows returning wrong results
[ https://issues.apache.org/jira/browse/SPARK-42399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17689226#comment-17689226 ] Apache Spark commented on SPARK-42399: -- User 'NarekDW' has created a pull request for this issue: https://github.com/apache/spark/pull/40040 > CONV() silently overflows returning wrong results > - > > Key: SPARK-42399 > URL: https://issues.apache.org/jira/browse/SPARK-42399 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.4.0 >Reporter: Serge Rielau >Priority: Critical > > spark-sql> SELECT > CONV(SUBSTRING('0x', > 3), 16, 10); > 18446744073709551615 > Time taken: 2.114 seconds, Fetched 1 row(s) > spark-sql> set spark.sql.ansi.enabled = true; > spark.sql.ansi.enabled true > Time taken: 0.068 seconds, Fetched 1 row(s) > spark-sql> SELECT > CONV(SUBSTRING('0x', > 3), 16, 10); > 18446744073709551615 > Time taken: 0.05 seconds, Fetched 1 row(s) > In ANSI mode we should raise an error for sure. > In non-ANSI mode either an error or a NULL may be acceptable. > Alternatively, of course, we could consider if we can support arbitrary > domains since the result is a STRING again.
[jira] [Assigned] (SPARK-42399) CONV() silently overflows returning wrong results
[ https://issues.apache.org/jira/browse/SPARK-42399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42399: Assignee: (was: Apache Spark) > CONV() silently overflows returning wrong results > - > > Key: SPARK-42399 > URL: https://issues.apache.org/jira/browse/SPARK-42399 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.4.0 >Reporter: Serge Rielau >Priority: Critical > > spark-sql> SELECT > CONV(SUBSTRING('0x', > 3), 16, 10); > 18446744073709551615 > Time taken: 2.114 seconds, Fetched 1 row(s) > spark-sql> set spark.sql.ansi.enabled = true; > spark.sql.ansi.enabled true > Time taken: 0.068 seconds, Fetched 1 row(s) > spark-sql> SELECT > CONV(SUBSTRING('0x', > 3), 16, 10); > 18446744073709551615 > Time taken: 0.05 seconds, Fetched 1 row(s) > In ANSI mode we should raise an error for sure. > In non-ANSI mode either an error or a NULL may be acceptable. > Alternatively, of course, we could consider if we can support arbitrary > domains since the result is a STRING again. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
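The overflow described above can be modeled outside Spark. The sketch below is a hypothetical re-implementation (not Spark's actual code) showing why every out-of-range input collapses to the same wrong answer: the digits are read as a 64-bit unsigned integer, so anything above 2^64 - 1 saturates at 18446744073709551615. The 17-digit hex literal is illustrative; the report's own input string is truncated in the quote above.

```python
# Sketch (assumption): model CONV's silent overflow as 64-bit unsigned
# saturation. Illustrative re-implementation, not Spark source code.
UNSIGNED_LONG_MAX = 2**64 - 1  # 18446744073709551615

def conv_clamping(num: str, from_base: int) -> str:
    """Convert num from from_base to base 10, clamping above 2^64 - 1."""
    value = int(num, from_base)
    return str(min(value, UNSIGNED_LONG_MAX))

# 17 hex digits need more than 64 bits, so the result silently saturates:
print(conv_clamping("fffffffffffffffff", 16))  # 18446744073709551615
# An in-range input converts normally:
print(conv_clamping("ff", 16))  # 255
```

Under ANSI mode the proposal is to replace the `min(...)` clamp with an error; in non-ANSI mode a NULL would also signal the overflow instead of hiding it.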
[jira] [Commented] (SPARK-42452) Remove hadoop-2 profile from Apache Spark
[ https://issues.apache.org/jira/browse/SPARK-42452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17689222#comment-17689222 ] Yang Jie commented on SPARK-42452: -- Is it time to clean up the hadoop-2 profile? [~dongjoon] [~gurwls223] > Remove hadoop-2 profile from Apache Spark > - > > Key: SPARK-42452 > URL: https://issues.apache.org/jira/browse/SPARK-42452 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 3.5.0 >Reporter: Yang Jie >Priority: Major > > SPARK-40651 Drop Hadoop2 binary distribution from release process and > SPARK-42447 Remove Hadoop 2 GitHub Action job > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-42452) Remove hadoop-2 profile from Apache Spark
[ https://issues.apache.org/jira/browse/SPARK-42452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Jie updated SPARK-42452: - Summary: Remove hadoop-2 profile from Apache Spark (was: Remove hadoop-2 profile from Spark) > Remove hadoop-2 profile from Apache Spark > - > > Key: SPARK-42452 > URL: https://issues.apache.org/jira/browse/SPARK-42452 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 3.5.0 >Reporter: Yang Jie >Priority: Major > > SPARK-40651 Drop Hadoop2 binary distribution from release process and > SPARK-42447 Remove Hadoop 2 GitHub Action job > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-42452) Remove hadoop-2 profile from Spark
[ https://issues.apache.org/jira/browse/SPARK-42452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Jie updated SPARK-42452: - Description: SPARK-40651 Drop Hadoop2 binary distribution from release process and SPARK-42447 Remove Hadoop 2 GitHub Action job > Remove hadoop-2 profile from Spark > -- > > Key: SPARK-42452 > URL: https://issues.apache.org/jira/browse/SPARK-42452 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 3.5.0 >Reporter: Yang Jie >Priority: Major > > SPARK-40651 Drop Hadoop2 binary distribution from release process and > SPARK-42447 Remove Hadoop 2 GitHub Action job > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-42452) Remove hadoop-2 profile from Spark
Yang Jie created SPARK-42452: Summary: Remove hadoop-2 profile from Spark Key: SPARK-42452 URL: https://issues.apache.org/jira/browse/SPARK-42452 Project: Spark Issue Type: Improvement Components: Build Affects Versions: 3.5.0 Reporter: Yang Jie -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42451) Remove 3.1 and Java 17 check from filter condition of `testingVersions` in `HiveExternalCatalogVersionsSuite`
[ https://issues.apache.org/jira/browse/SPARK-42451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17689215#comment-17689215 ] Apache Spark commented on SPARK-42451: -- User 'LuciferYang' has created a pull request for this issue: https://github.com/apache/spark/pull/40039 > Remove 3.1 and Java 17 check from filter condition of `testingVersions` in > `HiveExternalCatalogVersionsSuite` > -- > > Key: SPARK-42451 > URL: https://issues.apache.org/jira/browse/SPARK-42451 > Project: Spark > Issue Type: Improvement > Components: SQL, Tests >Affects Versions: 3.5.0 >Reporter: Yang Jie >Priority: Minor > > Spark 3.1 is already EOL and has been deleted from > [https://dist.apache.org/repos/dist/release/spark,] so we can simplify the > filter conditions of `testingVersions`; all remaining versions already support Java 17 > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42451) Remove 3.1 and Java 17 check from filter condition of `testingVersions` in `HiveExternalCatalogVersionsSuite`
[ https://issues.apache.org/jira/browse/SPARK-42451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42451: Assignee: (was: Apache Spark) > Remove 3.1 and Java 17 check from filter condition of `testingVersions` in > `HiveExternalCatalogVersionsSuite` > -- > > Key: SPARK-42451 > URL: https://issues.apache.org/jira/browse/SPARK-42451 > Project: Spark > Issue Type: Improvement > Components: SQL, Tests >Affects Versions: 3.5.0 >Reporter: Yang Jie >Priority: Minor > > Spark 3.1 is already EOL and has been deleted from > [https://dist.apache.org/repos/dist/release/spark,] so we can simplify the > filter conditions of `testingVersions`; all remaining versions already support Java 17 > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42451) Remove 3.1 and Java 17 check from filter condition of `testingVersions` in `HiveExternalCatalogVersionsSuite`
[ https://issues.apache.org/jira/browse/SPARK-42451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17689214#comment-17689214 ] Apache Spark commented on SPARK-42451: -- User 'LuciferYang' has created a pull request for this issue: https://github.com/apache/spark/pull/40039 > Remove 3.1 and Java 17 check from filter condition of `testingVersions` in > `HiveExternalCatalogVersionsSuite` > -- > > Key: SPARK-42451 > URL: https://issues.apache.org/jira/browse/SPARK-42451 > Project: Spark > Issue Type: Improvement > Components: SQL, Tests >Affects Versions: 3.5.0 >Reporter: Yang Jie >Priority: Minor > > Spark 3.1 is already EOL and has been deleted from > [https://dist.apache.org/repos/dist/release/spark,] so we can simplify the > filter conditions of `testingVersions`; all remaining versions already support Java 17 > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42451) Remove 3.1 and Java 17 check from filter condition of `testingVersions` in `HiveExternalCatalogVersionsSuite`
[ https://issues.apache.org/jira/browse/SPARK-42451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42451: Assignee: Apache Spark > Remove 3.1 and Java 17 check from filter condition of `testingVersions` in > `HiveExternalCatalogVersionsSuite` > -- > > Key: SPARK-42451 > URL: https://issues.apache.org/jira/browse/SPARK-42451 > Project: Spark > Issue Type: Improvement > Components: SQL, Tests >Affects Versions: 3.5.0 >Reporter: Yang Jie >Assignee: Apache Spark >Priority: Minor > > Spark 3.1 is already EOL and has been deleted from > [https://dist.apache.org/repos/dist/release/spark,] so we can simplify the > filter conditions of `testingVersions`; all remaining versions already support Java 17 > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-42451) Remove 3.1 and Java 17 check from filter condition of `testingVersions` in `HiveExternalCatalogVersionsSuite`
[ https://issues.apache.org/jira/browse/SPARK-42451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Jie updated SPARK-42451: - Summary: Remove 3.1 and Java 17 check from filter condition of `testingVersions` in `HiveExternalCatalogVersionsSuite` (was: Remove 3.1 and Java 17 check from `testingVersions` in `HiveExternalCatalogVersionsSuite`) > Remove 3.1 and Java 17 check from filter condition of `testingVersions` in > `HiveExternalCatalogVersionsSuite` > -- > > Key: SPARK-42451 > URL: https://issues.apache.org/jira/browse/SPARK-42451 > Project: Spark > Issue Type: Improvement > Components: SQL, Tests >Affects Versions: 3.5.0 >Reporter: Yang Jie >Priority: Minor > > Spark 3.1 is already EOL and has been deleted from > [https://dist.apache.org/repos/dist/release/spark,] so we can simplify the > filter conditions of `testingVersions`; all remaining versions already support Java 17 > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-42451) Remove 3.1 and Java 17 check from `testingVersions` in `HiveExternalCatalogVersionsSuite`
[ https://issues.apache.org/jira/browse/SPARK-42451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Jie updated SPARK-42451: - Summary: Remove 3.1 and Java 17 check from `testingVersions` in `HiveExternalCatalogVersionsSuite` (was: Remove 3.1 and Java 17 condition check from `testingVersions` in `HiveExternalCatalogVersionsSuite`) > Remove 3.1 and Java 17 check from `testingVersions` in > `HiveExternalCatalogVersionsSuite` > - > > Key: SPARK-42451 > URL: https://issues.apache.org/jira/browse/SPARK-42451 > Project: Spark > Issue Type: Improvement > Components: SQL, Tests >Affects Versions: 3.5.0 >Reporter: Yang Jie >Priority: Minor > > Spark 3.1 is already EOL and has been deleted from > [https://dist.apache.org/repos/dist/release/spark,] so we can simplify the > filter conditions of `testingVersions`; all remaining versions already support Java 17 > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-42451) Remove 3.1 and Java 17 condition check from `testingVersions` in `HiveExternalCatalogVersionsSuite`
Yang Jie created SPARK-42451: Summary: Remove 3.1 and Java 17 condition check from `testingVersions` in `HiveExternalCatalogVersionsSuite` Key: SPARK-42451 URL: https://issues.apache.org/jira/browse/SPARK-42451 Project: Spark Issue Type: Improvement Components: SQL, Tests Affects Versions: 3.5.0 Reporter: Yang Jie Spark 3.1 is already EOL and has been deleted from [https://dist.apache.org/repos/dist/release/spark,] so we can simplify the filter conditions of `testingVersions`; all remaining versions already support Java 17 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
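The simplification described above can be pictured with a toy filter (the version strings and the Java check below are placeholders, not the suite's real code): once 3.1 is gone from the dist mirrors and every remaining release supports Java 17, the filter excludes nothing and can be dropped.

```python
# Hypothetical sketch of the filter removal; versions are placeholders.
fetched_versions = ["3.2.4", "3.3.2"]  # what the dist mirror now serves
is_java17 = True

# Before: 3.1.x lines were excluded when running on Java 17
before = [v for v in fetched_versions
          if not (is_java17 and v.startswith("3.1"))]
# After: the raw list, since no fetched version needs excluding
after = list(fetched_versions)

print(before == after)  # True
```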
[jira] [Updated] (SPARK-42449) Fix `native-image.properties` in Scala Client
[ https://issues.apache.org/jira/browse/SPARK-42449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhen Li updated SPARK-42449: Description: The content of the `native-image.properties` file is not correct. This file is used to create a native image using GraalVM; see more info: https://docs.oracle.com/en/graalvm/enterprise/20/docs/reference-manual/native-image/BuildConfiguration/ https://www.graalvm.org/22.1/reference-manual/native-image/BuildConfiguration/ e.g. The content in `META-INF/native-image/io.netty` should also be relocated, just as in `grpc-netty-shaded`. Now, the content of `META-INF/native-image/io.netty/netty-codec-http2/native-image.properties` is ``` Args = --initialize-at-build-time=io.netty \ --initialize-at-run-time=io.netty.handler.codec.http2.Http2CodecUtil,io.netty.handler.codec.http2.Http2ClientUpgradeCodec,io.netty.handler.codec.http2.Http2ConnectionHandler,io.netty.handler.codec.http2.DefaultHttp2FrameWriter ``` but it should look like ``` Args = --initialize-at-build-time=org.sparkproject.connect.client.io.netty \ --initialize-at-run-time=org.sparkproject.connect.client.io.netty.handler.codec.http2.Http2CodecUtil,org.sparkproject.connect.client.io.netty.handler.codec.http2.Http2ClientUpgradeCodec,org.sparkproject.connect.client.io.netty.handler.codec.http2.Http2ConnectionHandler,org.sparkproject.connect.client.io.netty.handler.codec.http2.DefaultHttp2FrameWriter ``` Other transformers may need to be added. See more info in this discussion thread https://github.com/apache/spark/pull/39866#discussion_r1098833915 was: The content of the `native-image.properties` file is not correct. This file is used by GraalVM; see https://docs.oracle.com/en/graalvm/enterprise/20/docs/reference-manual/native-image/BuildConfiguration/. e.g. The content in `META-INF/native-image/io.netty` should also be relocated, just as in `grpc-netty-shaded`. 
Now, the content of `META-INF/native-image/io.netty/netty-codec-http2/native-image.properties` is ``` Args = --initialize-at-build-time=io.netty \ --initialize-at-run-time=io.netty.handler.codec.http2.Http2CodecUtil,io.netty.handler.codec.http2.Http2ClientUpgradeCodec,io.netty.handler.codec.http2.Http2ConnectionHandler,io.netty.handler.codec.http2.DefaultHttp2FrameWriter ``` but it should look like ``` Args = --initialize-at-build-time=org.sparkproject.connect.client.io.netty \ --initialize-at-run-time=org.sparkproject.connect.client.io.netty.handler.codec.http2.Http2CodecUtil,org.sparkproject.connect.client.io.netty.handler.codec.http2.Http2ClientUpgradeCodec,org.sparkproject.connect.client.io.netty.handler.codec.http2.Http2ConnectionHandler,org.sparkproject.connect.client.io.netty.handler.codec.http2.DefaultHttp2FrameWriter ``` Other transformers may need to be added. See more info in this discussion thread https://github.com/apache/spark/pull/39866#discussion_r1098833915 > Fix `native-image.properties` in Scala Client > > > Key: SPARK-42449 > URL: https://issues.apache.org/jira/browse/SPARK-42449 > Project: Spark > Issue Type: Improvement > Components: Connect >Affects Versions: 3.4.0 >Reporter: Zhen Li >Priority: Minor > > The content of the `native-image.properties` file is not correct. This file is > used to create a native image using GraalVM; see more info: > https://docs.oracle.com/en/graalvm/enterprise/20/docs/reference-manual/native-image/BuildConfiguration/ > https://www.graalvm.org/22.1/reference-manual/native-image/BuildConfiguration/ > e.g. > The content in `META-INF/native-image/io.netty` should also be relocated, just > as in `grpc-netty-shaded`. 
> Now, the content of > `META-INF/native-image/io.netty/netty-codec-http2/native-image.properties` is > ``` > Args = --initialize-at-build-time=io.netty \ > > --initialize-at-run-time=io.netty.handler.codec.http2.Http2CodecUtil,io.netty.handler.codec.http2.Http2ClientUpgradeCodec,io.netty.handler.codec.http2.Http2ConnectionHandler,io.netty.handler.codec.http2.DefaultHttp2FrameWriter > ``` > but it should look like > ``` > Args = --initialize-at-build-time=org.sparkproject.connect.client.io.netty \ > > --initialize-at-run-time=org.sparkproject.connect.client.io.netty.handler.codec.http2.Http2CodecUtil,org.sparkproject.connect.client.io.netty.handler.codec.http2.Http2ClientUpgradeCodec,org.sparkproject.connect.client.io.netty.handler.codec.http2.Http2ConnectionHandler,org.sparkproject.connect.client.io.netty.handler.codec.http2.DefaultHttp2FrameWriter > > ``` > Other transformers may need to be added. > See more info in this discussion thread > https://github.com/apache/spark/pull/39866#discussion_r1098833915 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
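The fix amounts to applying the jar's package relocation to the properties resource as well. A minimal sketch of such a rewrite follows; the shaded prefix comes from the issue, but the rewriting function itself is hypothetical and is not the actual shade-plugin transformer configuration:

```python
import re

# Assumption: every io.netty reference in the Args value follows '=' or ','.
SHADED_PREFIX = "org.sparkproject.connect.client."

def relocate_args(args: str) -> str:
    """Rewrite io.netty package references to their shaded location."""
    return re.sub(r"(?<=[=,])io\.netty", SHADED_PREFIX + "io.netty", args)

original = ("Args = --initialize-at-build-time=io.netty "
            "--initialize-at-run-time="
            "io.netty.handler.codec.http2.Http2CodecUtil,"
            "io.netty.handler.codec.http2.Http2ClientUpgradeCodec")
print(relocate_args(original))
```

In the real build this rewrite would be done by a resource transformer during shading, so the relocated properties file ships inside the client jar.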
[jira] [Updated] (SPARK-42450) dataset.where() omits quotes if the IN clause has more than 10 operands
[ https://issues.apache.org/jira/browse/SPARK-42450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vadim updated SPARK-42450: -- Description: dataset.where()/filter() omits string quotes if the IN clause has more than 10 operands. With DataSourceV1 it works as expected. Attached files: java-code.txt, stacktrace.txt, sql.txt - Spark version 3.3.0 - Scala version 2.12 - DatasourceV2 - Postgres - Postgres JDBC Driver: 42+ - Java8 was: dataset.where()/filter() omits string quotes if the IN clause has more than 10 operands. With DataSourceV1 it works as expected. Attached files: java-code.txt, stacktrace.txt - Spark version 3.3.0 - Scala version 2.12 - DatasourceV2 - Postgres - Postgres JDBC Driver: 42+ - Java8 *Expected query:* SELECT "flight_id", "flight_no" FROM "bookings"."flights" WHERE ( "flight_no"IN ( '55', '826', '845', '799', '561', '39', '385', '549', '576', '15', '857', '248', '324', '569', '267' ) ) *actual query:* SELECT "flight_id", "flight_no", FROM "bookings"."flights" WHERE ( "flight_no"IN ( 55, 826, 845, 799, 561, 39, 385, 549, 576, 15, 857, 248, 324, 569, 267 ) ) > dataset.where() omits quotes if the IN clause has more than 10 operands > > > Key: SPARK-42450 > URL: https://issues.apache.org/jira/browse/SPARK-42450 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.3.0 >Reporter: Vadim >Priority: Major > Fix For: 3.3.2, 3.4.0 > > Attachments: java-code.txt, sql.txt, stacktrace.txt > > > dataset.where()/filter() omits string quotes if the IN clause has more than > 10 operands. With DataSourceV1 it works as expected. > Attached files: java-code.txt, stacktrace.txt, sql.txt > - Spark version 3.3.0 > - Scala version 2.12 > - DatasourceV2 > - Postgres > - Postgres JDBC Driver: 42+ > - Java8 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
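For contrast, the expected behavior keeps the quotes no matter how many operands the IN list has. A minimal sketch of quote-preserving compilation (the column name and values are illustrative and this is not Spark's actual JDBC dialect code):

```python
# Sketch: string-typed IN operands must stay quoted regardless of count.
def compile_in(col: str, values: list) -> str:
    """Build a SQL IN predicate, single-quoting every string operand."""
    quoted = ", ".join(f"'{v}'" for v in values)
    return f'"{col}" IN ({quoted})'

many = ["55", "826", "845", "799", "561", "39",
        "385", "549", "576", "15", "857", "248"]
print(compile_in("flight_no", many[:3]))
# "flight_no" IN ('55', '826', '845')
print(compile_in("flight_no", many))  # still quoted with 12 operands
```

The bug report shows the opposite: past 10 operands the V2 pushdown emitted the values unquoted, turning string comparisons into numeric ones.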