[jira] [Assigned] (SPARK-42759) Avoid repeated downloads of maven.tar.gz
[ https://issues.apache.org/jira/browse/SPARK-42759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-42759:
------------------------------------

    Assignee: Apache Spark

> Avoid repeated downloads of maven.tar.gz
> ----------------------------------------
>
> Key: SPARK-42759
> URL: https://issues.apache.org/jira/browse/SPARK-42759
> Project: Spark
> Issue Type: Improvement
> Components: Build
> Affects Versions: 3.4.0, 3.5.0
> Reporter: Yang Jie
> Assignee: Apache Spark
> Priority: Major
>

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42758) Remove dependency on breeze
[ https://issues.apache.org/jira/browse/SPARK-42758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-42758:
------------------------------------

    Assignee: Apache Spark

> Remove dependency on breeze
> ---------------------------
>
> Key: SPARK-42758
> URL: https://issues.apache.org/jira/browse/SPARK-42758
> Project: Spark
> Issue Type: Improvement
> Components: Build, MLlib
> Affects Versions: 3.5.0
> Reporter: BingKun Pan
> Assignee: Apache Spark
> Priority: Minor
>
[jira] [Assigned] (SPARK-42758) Remove dependency on breeze
[ https://issues.apache.org/jira/browse/SPARK-42758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-42758:
------------------------------------

    Assignee: (was: Apache Spark)

> Remove dependency on breeze
> ---------------------------
>
> Key: SPARK-42758
> URL: https://issues.apache.org/jira/browse/SPARK-42758
> Project: Spark
> Issue Type: Improvement
> Components: Build, MLlib
> Affects Versions: 3.5.0
> Reporter: BingKun Pan
> Priority: Minor
>
[jira] [Commented] (SPARK-42758) Remove dependency on breeze
[ https://issues.apache.org/jira/browse/SPARK-42758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17699214#comment-17699214 ]

Apache Spark commented on SPARK-42758:
--------------------------------------

User 'panbingkun' has created a pull request for this issue:
https://github.com/apache/spark/pull/40378

> Remove dependency on breeze
> ---------------------------
>
> Key: SPARK-42758
> URL: https://issues.apache.org/jira/browse/SPARK-42758
> Project: Spark
> Issue Type: Improvement
> Components: Build, MLlib
> Affects Versions: 3.5.0
> Reporter: BingKun Pan
> Priority: Minor
>
[jira] [Commented] (SPARK-42757) Implement textFile for DataFrameReader
[ https://issues.apache.org/jira/browse/SPARK-42757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17699204#comment-17699204 ]

Apache Spark commented on SPARK-42757:
--------------------------------------

User 'panbingkun' has created a pull request for this issue:
https://github.com/apache/spark/pull/40377

> Implement textFile for DataFrameReader
> --------------------------------------
>
> Key: SPARK-42757
> URL: https://issues.apache.org/jira/browse/SPARK-42757
> Project: Spark
> Issue Type: Improvement
> Components: Connect
> Affects Versions: 3.4.1
> Reporter: BingKun Pan
> Priority: Minor
>
[jira] [Commented] (SPARK-42757) Implement textFile for DataFrameReader
[ https://issues.apache.org/jira/browse/SPARK-42757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17699203#comment-17699203 ]

Apache Spark commented on SPARK-42757:
--------------------------------------

User 'panbingkun' has created a pull request for this issue:
https://github.com/apache/spark/pull/40377

> Implement textFile for DataFrameReader
> --------------------------------------
>
> Key: SPARK-42757
> URL: https://issues.apache.org/jira/browse/SPARK-42757
> Project: Spark
> Issue Type: Improvement
> Components: Connect
> Affects Versions: 3.4.1
> Reporter: BingKun Pan
> Priority: Minor
>
[jira] [Assigned] (SPARK-42757) Implement textFile for DataFrameReader
[ https://issues.apache.org/jira/browse/SPARK-42757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-42757:
------------------------------------

    Assignee: Apache Spark

> Implement textFile for DataFrameReader
> --------------------------------------
>
> Key: SPARK-42757
> URL: https://issues.apache.org/jira/browse/SPARK-42757
> Project: Spark
> Issue Type: Improvement
> Components: Connect
> Affects Versions: 3.4.1
> Reporter: BingKun Pan
> Assignee: Apache Spark
> Priority: Minor
>
[jira] [Assigned] (SPARK-42757) Implement textFile for DataFrameReader
[ https://issues.apache.org/jira/browse/SPARK-42757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-42757:
------------------------------------

    Assignee: (was: Apache Spark)

> Implement textFile for DataFrameReader
> --------------------------------------
>
> Key: SPARK-42757
> URL: https://issues.apache.org/jira/browse/SPARK-42757
> Project: Spark
> Issue Type: Improvement
> Components: Connect
> Affects Versions: 3.4.1
> Reporter: BingKun Pan
> Priority: Minor
>
[jira] [Assigned] (SPARK-42756) Helper function to convert proto literal to value in Python Client
[ https://issues.apache.org/jira/browse/SPARK-42756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-42756:
------------------------------------

    Assignee: Apache Spark

> Helper function to convert proto literal to value in Python Client
> ------------------------------------------------------------------
>
> Key: SPARK-42756
> URL: https://issues.apache.org/jira/browse/SPARK-42756
> Project: Spark
> Issue Type: Sub-task
> Components: Connect, PySpark
> Affects Versions: 3.4.0
> Reporter: Ruifeng Zheng
> Assignee: Apache Spark
> Priority: Major
>
[jira] [Assigned] (SPARK-42756) Helper function to convert proto literal to value in Python Client
[ https://issues.apache.org/jira/browse/SPARK-42756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-42756:
------------------------------------

    Assignee: (was: Apache Spark)

> Helper function to convert proto literal to value in Python Client
> ------------------------------------------------------------------
>
> Key: SPARK-42756
> URL: https://issues.apache.org/jira/browse/SPARK-42756
> Project: Spark
> Issue Type: Sub-task
> Components: Connect, PySpark
> Affects Versions: 3.4.0
> Reporter: Ruifeng Zheng
> Priority: Major
>
[jira] [Commented] (SPARK-42756) Helper function to convert proto literal to value in Python Client
[ https://issues.apache.org/jira/browse/SPARK-42756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17699196#comment-17699196 ]

Apache Spark commented on SPARK-42756:
--------------------------------------

User 'zhengruifeng' has created a pull request for this issue:
https://github.com/apache/spark/pull/40376

> Helper function to convert proto literal to value in Python Client
> ------------------------------------------------------------------
>
> Key: SPARK-42756
> URL: https://issues.apache.org/jira/browse/SPARK-42756
> Project: Spark
> Issue Type: Sub-task
> Components: Connect, PySpark
> Affects Versions: 3.4.0
> Reporter: Ruifeng Zheng
> Priority: Major
>
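The helper described in this ticket can be pictured with a small, dependency-free sketch: a dispatch table that maps a literal's type tag to a Python conversion. This is illustrative only — the real Python client converts protobuf `Expression.Literal` messages, and the dict representation, field names, and `literal_to_value` helper below are hypothetical stand-ins.

```python
# Illustrative sketch of a literal-to-value converter. The real client
# works on protobuf Expression.Literal messages; here a plain dict of
# the form {kind: payload} stands in for the proto message.
import datetime
import decimal


def literal_to_value(lit: dict):
    """Convert a {kind: payload} literal description to a Python value."""
    kind, payload = next(iter(lit.items()))
    converters = {
        "null": lambda v: None,
        "boolean": bool,
        "integer": int,
        "double": float,
        "string": str,
        "decimal": decimal.Decimal,
        # days-since-origin stored as a proleptic ordinal in this sketch
        "date": datetime.date.fromordinal,
    }
    if kind not in converters:
        raise ValueError(f"unsupported literal type: {kind}")
    return converters[kind](payload)
```

A dispatch table keeps the per-type logic in one place, which is the usual shape of such helpers regardless of the wire format.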
[jira] [Commented] (SPARK-42755) Factor literal value conversion out to connect-common
[ https://issues.apache.org/jira/browse/SPARK-42755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17699184#comment-17699184 ]

Apache Spark commented on SPARK-42755:
--------------------------------------

User 'zhengruifeng' has created a pull request for this issue:
https://github.com/apache/spark/pull/40375

> Factor literal value conversion out to connect-common
> -----------------------------------------------------
>
> Key: SPARK-42755
> URL: https://issues.apache.org/jira/browse/SPARK-42755
> Project: Spark
> Issue Type: Sub-task
> Components: Connect
> Affects Versions: 3.4.0
> Reporter: Ruifeng Zheng
> Priority: Major
>
[jira] [Assigned] (SPARK-42755) Factor literal value conversion out to connect-common
[ https://issues.apache.org/jira/browse/SPARK-42755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-42755:
------------------------------------

    Assignee: Apache Spark

> Factor literal value conversion out to connect-common
> -----------------------------------------------------
>
> Key: SPARK-42755
> URL: https://issues.apache.org/jira/browse/SPARK-42755
> Project: Spark
> Issue Type: Sub-task
> Components: Connect
> Affects Versions: 3.4.0
> Reporter: Ruifeng Zheng
> Assignee: Apache Spark
> Priority: Major
>
[jira] [Commented] (SPARK-42755) Factor literal value conversion out to connect-common
[ https://issues.apache.org/jira/browse/SPARK-42755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17699183#comment-17699183 ]

Apache Spark commented on SPARK-42755:
--------------------------------------

User 'zhengruifeng' has created a pull request for this issue:
https://github.com/apache/spark/pull/40375

> Factor literal value conversion out to connect-common
> -----------------------------------------------------
>
> Key: SPARK-42755
> URL: https://issues.apache.org/jira/browse/SPARK-42755
> Project: Spark
> Issue Type: Sub-task
> Components: Connect
> Affects Versions: 3.4.0
> Reporter: Ruifeng Zheng
> Priority: Major
>
[jira] [Assigned] (SPARK-42755) Factor literal value conversion out to connect-common
[ https://issues.apache.org/jira/browse/SPARK-42755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-42755:
------------------------------------

    Assignee: (was: Apache Spark)

> Factor literal value conversion out to connect-common
> -----------------------------------------------------
>
> Key: SPARK-42755
> URL: https://issues.apache.org/jira/browse/SPARK-42755
> Project: Spark
> Issue Type: Sub-task
> Components: Connect
> Affects Versions: 3.4.0
> Reporter: Ruifeng Zheng
> Priority: Major
>
[jira] [Commented] (SPARK-42721) Add an Interceptor to log RPCs in connect-server
[ https://issues.apache.org/jira/browse/SPARK-42721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17699153#comment-17699153 ]

Apache Spark commented on SPARK-42721:
--------------------------------------

User 'dongjoon-hyun' has created a pull request for this issue:
https://github.com/apache/spark/pull/40374

> Add an Interceptor to log RPCs in connect-server
> ------------------------------------------------
>
> Key: SPARK-42721
> URL: https://issues.apache.org/jira/browse/SPARK-42721
> Project: Spark
> Issue Type: Improvement
> Components: Connect
> Affects Versions: 3.5.0
> Reporter: Raghu Angadi
> Assignee: Raghu Angadi
> Priority: Major
> Fix For: 3.4.1
>
> It would be useful to be able to log RPCs to the connect server during development. It makes it simpler to see the flow of messages.
>
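The interceptor idea from this ticket can be sketched without gRPC at all: wrap each handler so every call is logged on the way in and on the way out. The actual change registers a gRPC server interceptor in connect-server; the names below (`logging_interceptor`, `handle`) and the dict-based request shape are illustrative assumptions, not Spark code.

```python
# Dependency-free sketch of an RPC logging interceptor: a decorator
# that logs the method name, request, and response around the handler.
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("rpc")


def logging_interceptor(handler):
    def wrapped(method: str, request):
        log.info("RPC %s request=%r", method, request)
        response = handler(method, request)
        log.info("RPC %s response=%r", method, response)
        return response
    return wrapped


@logging_interceptor
def handle(method, request):
    # Stand-in for dispatching to the real service implementation.
    return {"echo": request}
```

Because the wrapper is transparent (same arguments in, same result out), it can be added or removed during development without touching the handlers themselves — the property that makes interceptors attractive for debugging message flow.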
[jira] [Assigned] (SPARK-42752) Unprintable IllegalArgumentException with Hive catalog enabled in "Hadoop Free" distribution
[ https://issues.apache.org/jira/browse/SPARK-42752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-42752:
------------------------------------

    Assignee: Apache Spark

> Unprintable IllegalArgumentException with Hive catalog enabled in "Hadoop
> Free" distribution
> -------------------------------------------------------------------------
>
> Key: SPARK-42752
> URL: https://issues.apache.org/jira/browse/SPARK-42752
> Project: Spark
> Issue Type: Bug
> Components: PySpark, SQL
> Affects Versions: 3.1.3, 3.2.4, 3.3.3, 3.4.1, 3.5.0
> Environment: local
> Reporter: Gera Shegalov
> Assignee: Apache Spark
> Priority: Major
>
> Reproduction steps:
> 1. Download a standard "Hadoop Free" build
> 2. Start the pyspark REPL with Hive support:
> {code:java}
> SPARK_DIST_CLASSPATH=$(~/dist/hadoop-3.4.0-SNAPSHOT/bin/hadoop classpath) \
>   ~/dist/spark-3.2.3-bin-without-hadoop/bin/pyspark \
>   --conf spark.sql.catalogImplementation=hive
> {code}
> 3. Execute any simple dataframe operation:
> {code:java}
> >>> spark.range(100).show()
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/home/user/dist/spark-3.2.3-bin-without-hadoop/python/pyspark/sql/session.py", line 416, in range
>     jdf = self._jsparkSession.range(0, int(start), int(step), int(numPartitions))
>   File "/home/user/dist/spark-3.2.3-bin-without-hadoop/python/lib/py4j-0.10.9.5-src.zip/py4j/java_gateway.py", line 1321, in __call__
>   File "/home/user/dist/spark-3.2.3-bin-without-hadoop/python/pyspark/sql/utils.py", line 117, in deco
>     raise converted from None
> pyspark.sql.utils.IllegalArgumentException:
> {code}
> 4. In fact, just accessing spark.conf triggers this issue:
> {code:java}
> >>> spark.conf
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ...
> {code}
>
> There are probably two issues here:
> 1) Hive support should be gracefully disabled if the dependency is not on the classpath, as claimed by https://spark.apache.org/docs/latest/sql-data-sources-hive-tables.html
> 2) at the very least, the user should be able to see the exception to understand the issue and take action
>
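The second issue — an exception whose text is empty — can be reproduced in plain Python, independent of Spark: if a conversion layer constructs the new exception without carrying over the original message, and `raise ... from None` suppresses the chained cause, then `str(exc)` is empty and the user sees nothing actionable. The snippet below is a simplified stand-in for the conversion pattern in `pyspark.sql.utils`, not its actual code.

```python
# Simplified stand-in for an exception-conversion layer that loses the
# original message. The class name mirrors PySpark's, but this is an
# illustrative sketch, not Spark code.
class IllegalArgumentException(Exception):
    pass


def convert_and_raise(original_message: str):
    # Buggy pattern: original_message is dropped, and `from None`
    # suppresses the chained cause, so nothing is left to print.
    raise IllegalArgumentException() from None


try:
    convert_and_raise("Error while instantiating 'HiveSessionStateBuilder'")
except IllegalArgumentException as exc:
    assert str(exc) == ""  # the user has nothing to read
```

The fix direction the reporter asks for amounts to passing the original detail through (e.g. constructing the converted exception with the message, or not suppressing the cause).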
[jira] [Commented] (SPARK-42752) Unprintable IllegalArgumentException with Hive catalog enabled in "Hadoop Free" distribution
[ https://issues.apache.org/jira/browse/SPARK-42752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17699108#comment-17699108 ]

Apache Spark commented on SPARK-42752:
--------------------------------------

User 'gerashegalov' has created a pull request for this issue:
https://github.com/apache/spark/pull/40372

> Unprintable IllegalArgumentException with Hive catalog enabled in "Hadoop
> Free" distribution
> -------------------------------------------------------------------------
>
> Key: SPARK-42752
> URL: https://issues.apache.org/jira/browse/SPARK-42752
> Project: Spark
> Issue Type: Bug
> Components: PySpark, SQL
> Affects Versions: 3.1.3, 3.2.4, 3.3.3, 3.4.1, 3.5.0
> Environment: local
> Reporter: Gera Shegalov
> Priority: Major
>
> Reproduction steps:
> 1. Download a standard "Hadoop Free" build
> 2. Start the pyspark REPL with Hive support:
> {code:java}
> SPARK_DIST_CLASSPATH=$(~/dist/hadoop-3.4.0-SNAPSHOT/bin/hadoop classpath) \
>   ~/dist/spark-3.2.3-bin-without-hadoop/bin/pyspark \
>   --conf spark.sql.catalogImplementation=hive
> {code}
> 3. Execute any simple dataframe operation:
> {code:java}
> >>> spark.range(100).show()
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/home/user/dist/spark-3.2.3-bin-without-hadoop/python/pyspark/sql/session.py", line 416, in range
>     jdf = self._jsparkSession.range(0, int(start), int(step), int(numPartitions))
>   File "/home/user/dist/spark-3.2.3-bin-without-hadoop/python/lib/py4j-0.10.9.5-src.zip/py4j/java_gateway.py", line 1321, in __call__
>   File "/home/user/dist/spark-3.2.3-bin-without-hadoop/python/pyspark/sql/utils.py", line 117, in deco
>     raise converted from None
> pyspark.sql.utils.IllegalArgumentException:
> {code}
> 4. In fact, just accessing spark.conf triggers this issue:
> {code:java}
> >>> spark.conf
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ...
> {code}
>
> There are probably two issues here:
> 1) Hive support should be gracefully disabled if the dependency is not on the classpath, as claimed by https://spark.apache.org/docs/latest/sql-data-sources-hive-tables.html
> 2) at the very least, the user should be able to see the exception to understand the issue and take action
>
[jira] [Assigned] (SPARK-42752) Unprintable IllegalArgumentException with Hive catalog enabled in "Hadoop Free" distribution
[ https://issues.apache.org/jira/browse/SPARK-42752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-42752:
------------------------------------

    Assignee: (was: Apache Spark)

> Unprintable IllegalArgumentException with Hive catalog enabled in "Hadoop
> Free" distribution
> -------------------------------------------------------------------------
>
> Key: SPARK-42752
> URL: https://issues.apache.org/jira/browse/SPARK-42752
> Project: Spark
> Issue Type: Bug
> Components: PySpark, SQL
> Affects Versions: 3.1.3, 3.2.4, 3.3.3, 3.4.1, 3.5.0
> Environment: local
> Reporter: Gera Shegalov
> Priority: Major
>
> Reproduction steps:
> 1. Download a standard "Hadoop Free" build
> 2. Start the pyspark REPL with Hive support:
> {code:java}
> SPARK_DIST_CLASSPATH=$(~/dist/hadoop-3.4.0-SNAPSHOT/bin/hadoop classpath) \
>   ~/dist/spark-3.2.3-bin-without-hadoop/bin/pyspark \
>   --conf spark.sql.catalogImplementation=hive
> {code}
> 3. Execute any simple dataframe operation:
> {code:java}
> >>> spark.range(100).show()
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/home/user/dist/spark-3.2.3-bin-without-hadoop/python/pyspark/sql/session.py", line 416, in range
>     jdf = self._jsparkSession.range(0, int(start), int(step), int(numPartitions))
>   File "/home/user/dist/spark-3.2.3-bin-without-hadoop/python/lib/py4j-0.10.9.5-src.zip/py4j/java_gateway.py", line 1321, in __call__
>   File "/home/user/dist/spark-3.2.3-bin-without-hadoop/python/pyspark/sql/utils.py", line 117, in deco
>     raise converted from None
> pyspark.sql.utils.IllegalArgumentException:
> {code}
> 4. In fact, just accessing spark.conf triggers this issue:
> {code:java}
> >>> spark.conf
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ...
> {code}
>
> There are probably two issues here:
> 1) Hive support should be gracefully disabled if the dependency is not on the classpath, as claimed by https://spark.apache.org/docs/latest/sql-data-sources-hive-tables.html
> 2) at the very least, the user should be able to see the exception to understand the issue and take action
>
[jira] [Commented] (SPARK-42752) Unprintable IllegalArgumentException with Hive catalog enabled in "Hadoop Free" distribution
[ https://issues.apache.org/jira/browse/SPARK-42752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17699107#comment-17699107 ]

Apache Spark commented on SPARK-42752:
--------------------------------------

User 'gerashegalov' has created a pull request for this issue:
https://github.com/apache/spark/pull/40372

> Unprintable IllegalArgumentException with Hive catalog enabled in "Hadoop
> Free" distribution
> -------------------------------------------------------------------------
>
> Key: SPARK-42752
> URL: https://issues.apache.org/jira/browse/SPARK-42752
> Project: Spark
> Issue Type: Bug
> Components: PySpark, SQL
> Affects Versions: 3.1.3, 3.2.4, 3.3.3, 3.4.1, 3.5.0
> Environment: local
> Reporter: Gera Shegalov
> Priority: Major
>
> Reproduction steps:
> 1. Download a standard "Hadoop Free" build
> 2. Start the pyspark REPL with Hive support:
> {code:java}
> SPARK_DIST_CLASSPATH=$(~/dist/hadoop-3.4.0-SNAPSHOT/bin/hadoop classpath) \
>   ~/dist/spark-3.2.3-bin-without-hadoop/bin/pyspark \
>   --conf spark.sql.catalogImplementation=hive
> {code}
> 3. Execute any simple dataframe operation:
> {code:java}
> >>> spark.range(100).show()
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/home/user/dist/spark-3.2.3-bin-without-hadoop/python/pyspark/sql/session.py", line 416, in range
>     jdf = self._jsparkSession.range(0, int(start), int(step), int(numPartitions))
>   File "/home/user/dist/spark-3.2.3-bin-without-hadoop/python/lib/py4j-0.10.9.5-src.zip/py4j/java_gateway.py", line 1321, in __call__
>   File "/home/user/dist/spark-3.2.3-bin-without-hadoop/python/pyspark/sql/utils.py", line 117, in deco
>     raise converted from None
> pyspark.sql.utils.IllegalArgumentException:
> {code}
> 4. In fact, just accessing spark.conf triggers this issue:
> {code:java}
> >>> spark.conf
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ...
> {code}
>
> There are probably two issues here:
> 1) Hive support should be gracefully disabled if the dependency is not on the classpath, as claimed by https://spark.apache.org/docs/latest/sql-data-sources-hive-tables.html
> 2) at the very least, the user should be able to see the exception to understand the issue and take action
>
[jira] [Assigned] (SPARK-41498) Union does not propagate Metadata output
[ https://issues.apache.org/jira/browse/SPARK-41498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-41498:
------------------------------------

    Assignee: Apache Spark

> Union does not propagate Metadata output
> ----------------------------------------
>
> Key: SPARK-41498
> URL: https://issues.apache.org/jira/browse/SPARK-41498
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.1.2, 3.2.0, 3.1.3, 3.2.1, 3.3.0, 3.2.2, 3.3.1
> Reporter: Fredrik Klauß
> Assignee: Apache Spark
> Priority: Major
>
> Currently, the Union operator does not propagate any metadata output. This makes it impossible to access any metadata if a Union operator is used, even though the children have the exact same metadata output.
>
> Example:
> {code:java}
> val df1 = spark.read.load(path1)
> val df2 = spark.read.load(path2)
> df1.union(df2).select("_metadata.file_path") // <-- fails
> {code}
>
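The propagation rule this bug calls for can be sketched in a few lines of illustrative pseudologic (not Spark's actual implementation): a Union node may expose a metadata column only when every child exposes the same metadata output; otherwise it conservatively exposes none. Here metadata outputs are modeled as lists of column names.

```python
# Illustrative sketch: compute a Union node's metadata output from its
# children's metadata outputs, modeled as lists of column names.
from typing import List


def union_metadata_output(children_outputs: List[List[str]]) -> List[str]:
    """Propagate metadata only when all children agree on it."""
    first = children_outputs[0]
    if all(out == first for out in children_outputs[1:]):
        return first
    return []  # conservative: no common metadata output to expose
```

With this rule, the `df1.union(df2).select("_metadata.file_path")` example from the report would resolve, since both children are file scans exposing the same `_metadata` columns.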
[jira] [Assigned] (SPARK-41498) Union does not propagate Metadata output
[ https://issues.apache.org/jira/browse/SPARK-41498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-41498:
------------------------------------

    Assignee: (was: Apache Spark)

> Union does not propagate Metadata output
> ----------------------------------------
>
> Key: SPARK-41498
> URL: https://issues.apache.org/jira/browse/SPARK-41498
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.1.2, 3.2.0, 3.1.3, 3.2.1, 3.3.0, 3.2.2, 3.3.1
> Reporter: Fredrik Klauß
> Priority: Major
>
> Currently, the Union operator does not propagate any metadata output. This makes it impossible to access any metadata if a Union operator is used, even though the children have the exact same metadata output.
>
> Example:
> {code:java}
> val df1 = spark.read.load(path1)
> val df2 = spark.read.load(path2)
> df1.union(df2).select("_metadata.file_path") // <-- fails
> {code}
>
[jira] [Commented] (SPARK-41498) Union does not propagate Metadata output
[ https://issues.apache.org/jira/browse/SPARK-41498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17699010#comment-17699010 ]

Apache Spark commented on SPARK-41498:
--------------------------------------

User 'cloud-fan' has created a pull request for this issue:
https://github.com/apache/spark/pull/40371

> Union does not propagate Metadata output
> ----------------------------------------
>
> Key: SPARK-41498
> URL: https://issues.apache.org/jira/browse/SPARK-41498
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.1.2, 3.2.0, 3.1.3, 3.2.1, 3.3.0, 3.2.2, 3.3.1
> Reporter: Fredrik Klauß
> Assignee: Fredrik Klauß
> Priority: Major
> Fix For: 3.4.0
>
> Currently, the Union operator does not propagate any metadata output. This makes it impossible to access any metadata if a Union operator is used, even though the children have the exact same metadata output.
>
> Example:
> {code:java}
> val df1 = spark.read.load(path1)
> val df2 = spark.read.load(path2)
> df1.union(df2).select("_metadata.file_path") // <-- fails
> {code}
>
[jira] [Assigned] (SPARK-42620) Add `inclusive` parameter for (DataFrame|Series).between_time
[ https://issues.apache.org/jira/browse/SPARK-42620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-42620:
------------------------------------

    Assignee: (was: Apache Spark)

> Add `inclusive` parameter for (DataFrame|Series).between_time
> -------------------------------------------------------------
>
> Key: SPARK-42620
> URL: https://issues.apache.org/jira/browse/SPARK-42620
> Project: Spark
> Issue Type: Sub-task
> Components: Pandas API on Spark
> Affects Versions: 3.5.0
> Reporter: Haejoon Lee
> Priority: Major
>
> See https://github.com/pandas-dev/pandas/pull/43248
>
[jira] [Assigned] (SPARK-42620) Add `inclusive` parameter for (DataFrame|Series).between_time
[ https://issues.apache.org/jira/browse/SPARK-42620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-42620:
------------------------------------

    Assignee: Apache Spark

> Add `inclusive` parameter for (DataFrame|Series).between_time
> -------------------------------------------------------------
>
> Key: SPARK-42620
> URL: https://issues.apache.org/jira/browse/SPARK-42620
> Project: Spark
> Issue Type: Sub-task
> Components: Pandas API on Spark
> Affects Versions: 3.5.0
> Reporter: Haejoon Lee
> Assignee: Apache Spark
> Priority: Major
>
> See https://github.com/pandas-dev/pandas/pull/43248
>
[jira] [Commented] (SPARK-42620) Add `inclusive` parameter for (DataFrame|Series).between_time
[ https://issues.apache.org/jira/browse/SPARK-42620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17698999#comment-17698999 ]

Apache Spark commented on SPARK-42620:
--------------------------------------

User 'dzhigimont' has created a pull request for this issue:
https://github.com/apache/spark/pull/40370

> Add `inclusive` parameter for (DataFrame|Series).between_time
> -------------------------------------------------------------
>
> Key: SPARK-42620
> URL: https://issues.apache.org/jira/browse/SPARK-42620
> Project: Spark
> Issue Type: Sub-task
> Components: Pandas API on Spark
> Affects Versions: 3.5.0
> Reporter: Haejoon Lee
> Priority: Major
>
> See https://github.com/pandas-dev/pandas/pull/43248
>
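The `inclusive` semantics being ported from pandas ("both", "left", "right", "neither") can be sketched over plain `datetime.time` values. This mirrors the behaviour in the pandas PR referenced above under the simplifying assumption that start <= end (no midnight wrap-around); it is not the pandas-on-Spark implementation itself.

```python
# Sketch of pandas-style `inclusive` boundary semantics for a
# between_time-like filter over datetime.time values.
import datetime
import operator


def between_time(times, start, end, inclusive="both"):
    """Keep times in [start, end], with boundary inclusion per `inclusive`."""
    lo = operator.ge if inclusive in ("both", "left") else operator.gt
    hi = operator.le if inclusive in ("both", "right") else operator.lt
    return [t for t in times if lo(t, start) and hi(t, end)]
```

The four modes differ only in whether each boundary uses a strict or non-strict comparison, which is why the sketch reduces to choosing two operators.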
[jira] [Commented] (SPARK-42398) refine default column value framework
[ https://issues.apache.org/jira/browse/SPARK-42398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17698982#comment-17698982 ]

Apache Spark commented on SPARK-42398:
--------------------------------------

User 'cloud-fan' has created a pull request for this issue:
https://github.com/apache/spark/pull/40369

> refine default column value framework
> -------------------------------------
>
> Key: SPARK-42398
> URL: https://issues.apache.org/jira/browse/SPARK-42398
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.4.0
> Reporter: Wenchen Fan
> Assignee: Wenchen Fan
> Priority: Major
> Fix For: 3.4.0
>
[jira] [Commented] (SPARK-42748) Server-side Artifact Management
[ https://issues.apache.org/jira/browse/SPARK-42748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17698952#comment-17698952 ]

Apache Spark commented on SPARK-42748:
--------------------------------------

User 'vicennial' has created a pull request for this issue:
https://github.com/apache/spark/pull/40368

> Server-side Artifact Management
> -------------------------------
>
> Key: SPARK-42748
> URL: https://issues.apache.org/jira/browse/SPARK-42748
> Project: Spark
> Issue Type: New Feature
> Components: Connect
> Affects Versions: 3.4.0
> Reporter: Venkata Sai Akhil Gudesa
> Priority: Major
>
> https://issues.apache.org/jira/browse/SPARK-42653 implements the client-side transfer of artifacts to the server but currently, the server does not process these requests.
>
> We need to implement a server-side management mechanism to handle storage of these artifacts on the driver as well as perform further processing (such as adding jars and moving class files to the right directories)
>
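The "further processing" this ticket describes — routing uploaded artifacts into the right driver-side locations — can be sketched as a small routing function. The directory names and the `store_artifact` helper below are hypothetical illustrations, not the actual connect-server code.

```python
# Illustrative sketch: persist an uploaded artifact on the driver,
# routing jars and class files to different directories so they can
# later be added to the classpath. Names are hypothetical.
from pathlib import Path


def store_artifact(root: Path, name: str, payload: bytes) -> Path:
    if name.endswith(".jar"):
        dest = root / "jars" / name
    elif name.endswith(".class"):
        dest = root / "classes" / name
    else:
        dest = root / "files" / name
    dest.parent.mkdir(parents=True, exist_ok=True)
    dest.write_bytes(payload)
    return dest
```

Separating the storage layout by artifact kind is what lets a later step treat each directory uniformly (e.g. add everything under `jars/` to the classpath) without re-inspecting individual files.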
[jira] [Assigned] (SPARK-42748) Server-side Artifact Management
[ https://issues.apache.org/jira/browse/SPARK-42748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-42748:
------------------------------------

    Assignee: Apache Spark

> Server-side Artifact Management
> -------------------------------
>
> Key: SPARK-42748
> URL: https://issues.apache.org/jira/browse/SPARK-42748
> Project: Spark
> Issue Type: New Feature
> Components: Connect
> Affects Versions: 3.4.0
> Reporter: Venkata Sai Akhil Gudesa
> Assignee: Apache Spark
> Priority: Major
>
> https://issues.apache.org/jira/browse/SPARK-42653 implements the client-side transfer of artifacts to the server but currently, the server does not process these requests.
>
> We need to implement a server-side management mechanism to handle storage of these artifacts on the driver as well as perform further processing (such as adding jars and moving class files to the right directories)
>
[jira] [Assigned] (SPARK-42748) Server-side Artifact Management
[ https://issues.apache.org/jira/browse/SPARK-42748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-42748:
------------------------------------

    Assignee: (was: Apache Spark)

> Server-side Artifact Management
> -------------------------------
>
> Key: SPARK-42748
> URL: https://issues.apache.org/jira/browse/SPARK-42748
> Project: Spark
> Issue Type: New Feature
> Components: Connect
> Affects Versions: 3.4.0
> Reporter: Venkata Sai Akhil Gudesa
> Priority: Major
>
> https://issues.apache.org/jira/browse/SPARK-42653 implements the client-side transfer of artifacts to the server but currently, the server does not process these requests.
>
> We need to implement a server-side management mechanism to handle storage of these artifacts on the driver as well as perform further processing (such as adding jars and moving class files to the right directories)
>
[jira] [Commented] (SPARK-42747) Fix incorrect internal status of LoR and AFT
[ https://issues.apache.org/jira/browse/SPARK-42747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17698943#comment-17698943 ] Apache Spark commented on SPARK-42747: -- User 'zhengruifeng' has created a pull request for this issue: https://github.com/apache/spark/pull/40367
> Fix incorrect internal status of LoR and AFT
>
> Key: SPARK-42747
> URL: https://issues.apache.org/jira/browse/SPARK-42747
> Project: Spark
> Issue Type: Bug
> Components: ML, PySpark
> Affects Versions: 3.1.0, 3.2.0, 3.3.0, 3.4.0
> Reporter: Ruifeng Zheng
> Priority: Major
>
> LoR and AFT use internal status to optimize prediction/transform, but the status is not correctly updated in some cases:
> {code:java}
> from pyspark.sql import Row
> from pyspark.ml.classification import *
> from pyspark.ml.linalg import Vectors
> df = spark.createDataFrame(
>     [
>         (1.0, 1.0, Vectors.dense(0.0, 5.0)),
>         (0.0, 2.0, Vectors.dense(1.0, 2.0)),
>         (1.0, 3.0, Vectors.dense(2.0, 1.0)),
>         (0.0, 4.0, Vectors.dense(3.0, 3.0)),
>     ],
>     ["label", "weight", "features"],
> )
> lor = LogisticRegression(weightCol="weight")
> model = lor.fit(df)
> # status change 1
> for t in [0.0, 0.1, 0.2, 0.5, 1.0]:
>     model.setThreshold(t).transform(df)
> # status change 2
> [model.setThreshold(t).predict(Vectors.dense(0.0, 5.0)) for t in [0.0, 0.1, 0.2, 0.5, 1.0]]
> for t in [0.0, 0.1, 0.2, 0.5, 1.0]:
>     print(t)
>     model.setThreshold(t).transform(df).show()  # <- wrong results
> {code}
> results (the same table is printed for every threshold 0.0, 0.1, 0.2, 0.5 and 1.0, i.e. the prediction column never changes):
> {code:java}
> +-----+------+---------+--------------------+--------------------+----------+
> |label|weight| features|       rawPrediction|         probability|prediction|
> +-----+------+---------+--------------------+--------------------+----------+
> |  1.0|   1.0|[0.0,5.0]|[0.10932013376341...|[0.52730284774069...|       0.0|
> |  0.0|   2.0|[1.0,2.0]|[-0.8619624039359...|[0.29692950635762...|       0.0|
> |  1.0|   3.0|[2.0,1.0]|[-0.3634508721860...|[0.41012446452385...|       0.0|
> |  0.0|   4.0|[3.0,3.0]|[2.33975176373760...|[0.91211618852612...|       0.0|
> +-----+------+---------+--------------------+--------------------+----------+
> {code}
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (SPARK-42747) Fix incorrect internal status of LoR and AFT
[ https://issues.apache.org/jira/browse/SPARK-42747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42747: Assignee: Apache Spark
> Fix incorrect internal status of LoR and AFT
>
> Key: SPARK-42747
> URL: https://issues.apache.org/jira/browse/SPARK-42747
> Project: Spark
> Issue Type: Bug
> Components: ML, PySpark
> Affects Versions: 3.1.0, 3.2.0, 3.3.0, 3.4.0
> Reporter: Ruifeng Zheng
> Assignee: Apache Spark
> Priority: Major
>
> LoR and AFT use internal status to optimize prediction/transform, but the status is not correctly updated in some cases:
> {code:java}
> from pyspark.sql import Row
> from pyspark.ml.classification import *
> from pyspark.ml.linalg import Vectors
> df = spark.createDataFrame(
>     [
>         (1.0, 1.0, Vectors.dense(0.0, 5.0)),
>         (0.0, 2.0, Vectors.dense(1.0, 2.0)),
>         (1.0, 3.0, Vectors.dense(2.0, 1.0)),
>         (0.0, 4.0, Vectors.dense(3.0, 3.0)),
>     ],
>     ["label", "weight", "features"],
> )
> lor = LogisticRegression(weightCol="weight")
> model = lor.fit(df)
> # status change 1
> for t in [0.0, 0.1, 0.2, 0.5, 1.0]:
>     model.setThreshold(t).transform(df)
> # status change 2
> [model.setThreshold(t).predict(Vectors.dense(0.0, 5.0)) for t in [0.0, 0.1, 0.2, 0.5, 1.0]]
> for t in [0.0, 0.1, 0.2, 0.5, 1.0]:
>     print(t)
>     model.setThreshold(t).transform(df).show()  # <- wrong results
> {code}
> results (the same table is printed for every threshold 0.0, 0.1, 0.2, 0.5 and 1.0, i.e. the prediction column never changes):
> {code:java}
> +-----+------+---------+--------------------+--------------------+----------+
> |label|weight| features|       rawPrediction|         probability|prediction|
> +-----+------+---------+--------------------+--------------------+----------+
> |  1.0|   1.0|[0.0,5.0]|[0.10932013376341...|[0.52730284774069...|       0.0|
> |  0.0|   2.0|[1.0,2.0]|[-0.8619624039359...|[0.29692950635762...|       0.0|
> |  1.0|   3.0|[2.0,1.0]|[-0.3634508721860...|[0.41012446452385...|       0.0|
> |  0.0|   4.0|[3.0,3.0]|[2.33975176373760...|[0.91211618852612...|       0.0|
> +-----+------+---------+--------------------+--------------------+----------+
> {code}
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (SPARK-42747) Fix incorrect internal status of LoR and AFT
[ https://issues.apache.org/jira/browse/SPARK-42747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42747: Assignee: (was: Apache Spark)
> Fix incorrect internal status of LoR and AFT
>
> Key: SPARK-42747
> URL: https://issues.apache.org/jira/browse/SPARK-42747
> Project: Spark
> Issue Type: Bug
> Components: ML, PySpark
> Affects Versions: 3.1.0, 3.2.0, 3.3.0, 3.4.0
> Reporter: Ruifeng Zheng
> Priority: Major
>
> LoR and AFT use internal status to optimize prediction/transform, but the status is not correctly updated in some cases:
> {code:java}
> from pyspark.sql import Row
> from pyspark.ml.classification import *
> from pyspark.ml.linalg import Vectors
> df = spark.createDataFrame(
>     [
>         (1.0, 1.0, Vectors.dense(0.0, 5.0)),
>         (0.0, 2.0, Vectors.dense(1.0, 2.0)),
>         (1.0, 3.0, Vectors.dense(2.0, 1.0)),
>         (0.0, 4.0, Vectors.dense(3.0, 3.0)),
>     ],
>     ["label", "weight", "features"],
> )
> lor = LogisticRegression(weightCol="weight")
> model = lor.fit(df)
> # status change 1
> for t in [0.0, 0.1, 0.2, 0.5, 1.0]:
>     model.setThreshold(t).transform(df)
> # status change 2
> [model.setThreshold(t).predict(Vectors.dense(0.0, 5.0)) for t in [0.0, 0.1, 0.2, 0.5, 1.0]]
> for t in [0.0, 0.1, 0.2, 0.5, 1.0]:
>     print(t)
>     model.setThreshold(t).transform(df).show()  # <- wrong results
> {code}
> results (the same table is printed for every threshold 0.0, 0.1, 0.2, 0.5 and 1.0, i.e. the prediction column never changes):
> {code:java}
> +-----+------+---------+--------------------+--------------------+----------+
> |label|weight| features|       rawPrediction|         probability|prediction|
> +-----+------+---------+--------------------+--------------------+----------+
> |  1.0|   1.0|[0.0,5.0]|[0.10932013376341...|[0.52730284774069...|       0.0|
> |  0.0|   2.0|[1.0,2.0]|[-0.8619624039359...|[0.29692950635762...|       0.0|
> |  1.0|   3.0|[2.0,1.0]|[-0.3634508721860...|[0.41012446452385...|       0.0|
> |  0.0|   4.0|[3.0,3.0]|[2.33975176373760...|[0.91211618852612...|       0.0|
> +-----+------+---------+--------------------+--------------------+----------+
> {code}
-- This message was sent by Atlassian Jira (v8.20.10#820010)
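The bug report above shows the prediction column stuck at 0.0 for every threshold. For reference, a minimal sketch of the intended thresholding rule (a binary model predicts class 1 when the class-1 probability exceeds the threshold; function name and the ~0.527 probability for [0.0, 5.0] are taken from the report, the rest is illustrative, not Spark's actual code):

```python
# Sketch of the correct threshold behavior for a binary classifier:
# with a fixed class-1 probability, the predicted label must flip as the
# threshold crosses it -- unlike the constant 0.0 output in the bug report.
def predict_with_threshold(prob_class1, threshold):
    # threshold 0.0 always yields 1.0; threshold 1.0 always yields 0.0
    return 1.0 if prob_class1 > threshold else 0.0

# class-1 probability ~0.527 for features [0.0, 5.0], per the report
preds = [predict_with_threshold(0.527, t) for t in [0.0, 0.1, 0.2, 0.5, 1.0]]
```

With a correct internal status, the last threshold (1.0) is the only one that should yield prediction 0.0 for this row.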
[jira] [Assigned] (SPARK-42691) Implement Dataset.semanticHash
[ https://issues.apache.org/jira/browse/SPARK-42691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42691: Assignee: (was: Apache Spark) > Implement Dataset.semanticHash > -- > > Key: SPARK-42691 > URL: https://issues.apache.org/jira/browse/SPARK-42691 > Project: Spark > Issue Type: New Feature > Components: Connect >Affects Versions: 3.4.0 >Reporter: Herman van Hövell >Priority: Major > > Implement Dataset.semanticHash: > {code:java} > /** > * Returns a `hashCode` of the logical query plan against this [[Dataset]]. > * > * @note Unlike the standard `hashCode`, the hash is calculated against the > query plan > * simplified by tolerating the cosmetic differences such as attribute names. > * @since 3.4.0 > */ > @DeveloperApi > def semanticHash(): Int{code} > This has to be computed on the Spark Connect server. Please extend the > AnalyzePlanRequest and AnalyzePlanResponse messages for this. > Also make sure this works in PySpark. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42691) Implement Dataset.semanticHash
[ https://issues.apache.org/jira/browse/SPARK-42691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42691: Assignee: Apache Spark > Implement Dataset.semanticHash > -- > > Key: SPARK-42691 > URL: https://issues.apache.org/jira/browse/SPARK-42691 > Project: Spark > Issue Type: New Feature > Components: Connect >Affects Versions: 3.4.0 >Reporter: Herman van Hövell >Assignee: Apache Spark >Priority: Major > > Implement Dataset.semanticHash: > {code:java} > /** > * Returns a `hashCode` of the logical query plan against this [[Dataset]]. > * > * @note Unlike the standard `hashCode`, the hash is calculated against the > query plan > * simplified by tolerating the cosmetic differences such as attribute names. > * @since 3.4.0 > */ > @DeveloperApi > def semanticHash(): Int{code} > This has to be computed on the Spark Connect server. Please extend the > AnalyzePlanRequest and AnalyzePlanResponse messages for this. > Also make sure this works in PySpark. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42691) Implement Dataset.semanticHash
[ https://issues.apache.org/jira/browse/SPARK-42691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17698935#comment-17698935 ] Apache Spark commented on SPARK-42691: -- User 'beliefer' has created a pull request for this issue: https://github.com/apache/spark/pull/40366 > Implement Dataset.semanticHash > -- > > Key: SPARK-42691 > URL: https://issues.apache.org/jira/browse/SPARK-42691 > Project: Spark > Issue Type: New Feature > Components: Connect >Affects Versions: 3.4.0 >Reporter: Herman van Hövell >Priority: Major > > Implement Dataset.semanticHash: > {code:java} > /** > * Returns a `hashCode` of the logical query plan against this [[Dataset]]. > * > * @note Unlike the standard `hashCode`, the hash is calculated against the > query plan > * simplified by tolerating the cosmetic differences such as attribute names. > * @since 3.4.0 > */ > @DeveloperApi > def semanticHash(): Int{code} > This has to be computed on the Spark Connect server. Please extend the > AnalyzePlanRequest and AnalyzePlanResponse messages for this. > Also make sure this works in PySpark. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
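The "tolerating cosmetic differences such as attribute names" note in the semanticHash doc can be illustrated with a toy canonicalization: two plans that differ only in attribute names hash equal once each distinct name is replaced by a positional id. This is a conceptual sketch only, not Spark's actual plan normalization:

```python
# Toy "semantic" hash: canonicalize attribute names to positional ids so
# that cosmetically different but structurally identical plans hash equal.
def semantic_hash(plan_ops):
    names = {}
    canon = []
    for op, attr in plan_ops:
        # first distinct attribute becomes attr0, the next attr1, ...
        key = names.setdefault(attr, f"attr{len(names)}")
        canon.append((op, key))
    return hash(tuple(canon))

h1 = semantic_hash([("project", "x"), ("filter", "x")])
h2 = semantic_hash([("project", "y"), ("filter", "y")])  # same plan, renamed
h3 = semantic_hash([("project", "x"), ("filter", "z")])  # different structure
```

In Spark Connect the client cannot compute this locally, which is why the ticket asks to extend AnalyzePlanRequest/AnalyzePlanResponse: the analyzed plan only exists on the server.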
[jira] [Assigned] (SPARK-42743) Support analyze TimestampNTZ columns
[ https://issues.apache.org/jira/browse/SPARK-42743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42743: Assignee: Gengliang Wang (was: Apache Spark) > Support analyze TimestampNTZ columns > > > Key: SPARK-42743 > URL: https://issues.apache.org/jira/browse/SPARK-42743 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Gengliang Wang >Assignee: Gengliang Wang >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42743) Support analyze TimestampNTZ columns
[ https://issues.apache.org/jira/browse/SPARK-42743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42743: Assignee: Apache Spark (was: Gengliang Wang) > Support analyze TimestampNTZ columns > > > Key: SPARK-42743 > URL: https://issues.apache.org/jira/browse/SPARK-42743 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Gengliang Wang >Assignee: Apache Spark >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42743) Support analyze TimestampNTZ columns
[ https://issues.apache.org/jira/browse/SPARK-42743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17698876#comment-17698876 ] Apache Spark commented on SPARK-42743: -- User 'gengliangwang' has created a pull request for this issue: https://github.com/apache/spark/pull/40362 > Support analyze TimestampNTZ columns > > > Key: SPARK-42743 > URL: https://issues.apache.org/jira/browse/SPARK-42743 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Gengliang Wang >Assignee: Gengliang Wang >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42745) Improved AliasAwareOutputExpression works with DSv2
[ https://issues.apache.org/jira/browse/SPARK-42745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42745: Assignee: (was: Apache Spark) > Improved AliasAwareOutputExpression works with DSv2 > --- > > Key: SPARK-42745 > URL: https://issues.apache.org/jira/browse/SPARK-42745 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.4.0, 3.5.0 >Reporter: Peter Toth >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42745) Improved AliasAwareOutputExpression works with DSv2
[ https://issues.apache.org/jira/browse/SPARK-42745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17698851#comment-17698851 ] Apache Spark commented on SPARK-42745: -- User 'peter-toth' has created a pull request for this issue: https://github.com/apache/spark/pull/40364 > Improved AliasAwareOutputExpression works with DSv2 > --- > > Key: SPARK-42745 > URL: https://issues.apache.org/jira/browse/SPARK-42745 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.4.0, 3.5.0 >Reporter: Peter Toth >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42745) Improved AliasAwareOutputExpression works with DSv2
[ https://issues.apache.org/jira/browse/SPARK-42745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42745: Assignee: Apache Spark > Improved AliasAwareOutputExpression works with DSv2 > --- > > Key: SPARK-42745 > URL: https://issues.apache.org/jira/browse/SPARK-42745 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.4.0, 3.5.0 >Reporter: Peter Toth >Assignee: Apache Spark >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42741) Do not unwrap casts in binary comparison when literal is null
[ https://issues.apache.org/jira/browse/SPARK-42741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17698750#comment-17698750 ] Apache Spark commented on SPARK-42741: -- User 'wangyum' has created a pull request for this issue: https://github.com/apache/spark/pull/40360 > Do not unwrap casts in binary comparison when literal is null > - > > Key: SPARK-42741 > URL: https://issues.apache.org/jira/browse/SPARK-42741 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.5.0 >Reporter: Yuming Wang >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42741) Do not unwrap casts in binary comparison when literal is null
[ https://issues.apache.org/jira/browse/SPARK-42741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42741: Assignee: (was: Apache Spark) > Do not unwrap casts in binary comparison when literal is null > - > > Key: SPARK-42741 > URL: https://issues.apache.org/jira/browse/SPARK-42741 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.5.0 >Reporter: Yuming Wang >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42741) Do not unwrap casts in binary comparison when literal is null
[ https://issues.apache.org/jira/browse/SPARK-42741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42741: Assignee: Apache Spark > Do not unwrap casts in binary comparison when literal is null > - > > Key: SPARK-42741 > URL: https://issues.apache.org/jira/browse/SPARK-42741 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.5.0 >Reporter: Yuming Wang >Assignee: Apache Spark >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42741) Do not unwrap casts in binary comparison when literal is null
[ https://issues.apache.org/jira/browse/SPARK-42741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17698749#comment-17698749 ] Apache Spark commented on SPARK-42741: -- User 'wangyum' has created a pull request for this issue: https://github.com/apache/spark/pull/40360 > Do not unwrap casts in binary comparison when literal is null > - > > Key: SPARK-42741 > URL: https://issues.apache.org/jira/browse/SPARK-42741 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.5.0 >Reporter: Yuming Wang >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
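The rationale behind SPARK-42741 follows from SQL's three-valued logic: a binary comparison against a NULL literal evaluates to NULL (never true) regardless of any cast, so unwrapping `cast(col AS T) = NULL` buys nothing and risks an unsound rewrite. A minimal sketch of that semantics (illustrative, not Spark code):

```python
# SQL three-valued equality: comparing anything with NULL yields NULL
# (unknown), which a WHERE clause never treats as a match.
def sql_eq(a, b):
    if a is None or b is None:
        return None  # unknown, never True
    return a == b

rows = [1, 2, None]
# No row can satisfy "col = NULL", cast or no cast.
matched = [r for r in rows if sql_eq(r, None) is True]
```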
[jira] [Assigned] (SPARK-42740) Fix the bug that pushdown offset or paging is invalid for some built-in dialect
[ https://issues.apache.org/jira/browse/SPARK-42740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42740: Assignee: (was: Apache Spark) > Fix the bug that pushdown offset or paging is invalid for some built-in > dialect > > > Key: SPARK-42740 > URL: https://issues.apache.org/jira/browse/SPARK-42740 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: jiaan.geng >Priority: Major > > Currently, offset pushdown uses the default OFFSET n syntax, but some built-in > dialects do not support it, so when Spark pushes an offset down to > these databases they throw errors. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42740) Fix the bug that pushdown offset or paging is invalid for some built-in dialect
[ https://issues.apache.org/jira/browse/SPARK-42740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17698742#comment-17698742 ] Apache Spark commented on SPARK-42740: -- User 'beliefer' has created a pull request for this issue: https://github.com/apache/spark/pull/40359 > Fix the bug that pushdown offset or paging is invalid for some built-in > dialect > > > Key: SPARK-42740 > URL: https://issues.apache.org/jira/browse/SPARK-42740 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: jiaan.geng >Priority: Major > > Currently, offset pushdown uses the default OFFSET n syntax, but some built-in > dialects do not support it, so when Spark pushes an offset down to > these databases they throw errors. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42740) Fix the bug that pushdown offset or paging is invalid for some built-in dialect
[ https://issues.apache.org/jira/browse/SPARK-42740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42740: Assignee: Apache Spark > Fix the bug that pushdown offset or paging is invalid for some built-in > dialect > > > Key: SPARK-42740 > URL: https://issues.apache.org/jira/browse/SPARK-42740 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: jiaan.geng >Assignee: Apache Spark >Priority: Major > > Currently, offset pushdown uses the default OFFSET n syntax, but some built-in > dialects do not support it, so when Spark pushes an offset down to > these databases they throw errors. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
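The dialect problem in SPARK-42740 can be sketched as a per-dialect clause builder: the default `OFFSET n` is not universal (SQL Server, for instance, requires `OFFSET n ROWS` after an ORDER BY). The function and dialect names below are illustrative, not Spark's JdbcDialect API:

```python
# Hypothetical sketch of dialect-aware OFFSET generation. A single default
# syntax breaks on databases with their own paging grammar, which is the
# class of bug this ticket fixes.
def offset_clause(dialect, n):
    if dialect == "mssql":
        # SQL Server paging: ORDER BY ... OFFSET n ROWS
        return f"OFFSET {n} ROWS"
    # default syntax used by e.g. PostgreSQL
    return f"OFFSET {n}"
```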
[jira] [Commented] (SPARK-42733) df.write.format().save() should support calling with no path or table name
[ https://issues.apache.org/jira/browse/SPARK-42733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17698729#comment-17698729 ] Apache Spark commented on SPARK-42733: -- User 'zhenlineo' has created a pull request for this issue: https://github.com/apache/spark/pull/40358 > df.write.format().save() should support calling with no path or table name > -- > > Key: SPARK-42733 > URL: https://issues.apache.org/jira/browse/SPARK-42733 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 3.4.0 >Reporter: Martin Grund >Priority: Major > Fix For: 3.4.0 > > > When calling `session.range(5).write.format("xxx").options().save()` Spark > Connect currently throws an assertion error because it expects that either > path or tableName is present. According to our current PySpark > implementation that is not necessary though.
> {code:python}
> if format is not None:
>     self.format(format)
> if path is None:
>     self._jwrite.save()
> else:
>     self._jwrite.save(path)
> {code}
-- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
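The branching quoted in SPARK-42733 can be exercised with a small stand-in: a writer whose `save()` accepts being called with no path at all, matching the PySpark behavior the ticket wants Spark Connect to follow. `FakeWriter` is a stub for illustration, not the real DataFrameWriter:

```python
# Stub mirroring the quoted PySpark branching: save() must be callable
# without a path (some data sources need neither a path nor a table name).
class FakeWriter:
    def __init__(self):
        self.calls = []

    def save(self, path=None):
        if path is None:
            self.calls.append(("save",))        # no-path variant, no assertion error
        else:
            self.calls.append(("save", path))

w = FakeWriter()
w.save()            # no path, no table name: accepted
w.save("/tmp/out")  # path variant still works
```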
[jira] [Assigned] (SPARK-42702) Support parameterized CTE
[ https://issues.apache.org/jira/browse/SPARK-42702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42702: Assignee: Max Gekk (was: Apache Spark) > Support parameterized CTE > - > > Key: SPARK-42702 > URL: https://issues.apache.org/jira/browse/SPARK-42702 > Project: Spark > Issue Type: New Feature > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Assignee: Max Gekk >Priority: Major > > Support named parameters in named common table expressions (CTE). At the > moment, such queries failed: > {code:java} > CREATE TABLE tbl(namespace STRING) USING parquet > INSERT INTO tbl SELECT 'abc' > WITH transitions AS ( > SELECT * FROM tbl WHERE namespace = :namespace > ) SELECT * FROM transitions {code} > w/ the following error: > {code:java} > [UNBOUND_SQL_PARAMETER] Found the unbound parameter: `namespace`. Please, fix > `args` and provide a mapping of the parameter to a SQL literal.; line 3 pos > 38; > 'WithCTE > :- 'CTERelationDef 0, false > : +- 'SubqueryAlias transitions > : +- 'Project [*] > : +- 'Filter (namespace#3 = parameter(namespace)) > : +- SubqueryAlias spark_catalog.default.tbl > : +- Relation spark_catalog.default.tbl[namespace#3] parquet > +- 'Project [*] > +- 'SubqueryAlias transitions > +- 'CTERelationRef 0, falseorg.apache.spark.sql.AnalysisException: > [UNBOUND_SQL_PARAMETER] Found the unbound parameter: `namespace`. 
Please, fix > `args` and provide a mapping of the parameter to a SQL literal.; line 3 pos > 38; > 'WithCTE > :- 'CTERelationDef 0, false > : +- 'SubqueryAlias transitions > : +- 'Project [*] > : +- 'Filter (namespace#3 = parameter(namespace)) > : +- SubqueryAlias spark_catalog.default.tbl > : +- Relation spark_catalog.default.tbl[namespace#3] parquet > +- 'Project [*] > +- 'SubqueryAlias transitions > +- 'CTERelationRef 0, false at > org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:52) > at > org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis0$5(CheckAnalysis.scala:339) > at > org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis0$5$adapted(CheckAnalysis.scala:244) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42702) Support parameterized CTE
[ https://issues.apache.org/jira/browse/SPARK-42702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42702: Assignee: Apache Spark (was: Max Gekk) > Support parameterized CTE > - > > Key: SPARK-42702 > URL: https://issues.apache.org/jira/browse/SPARK-42702 > Project: Spark > Issue Type: New Feature > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Assignee: Apache Spark >Priority: Major > > Support named parameters in named common table expressions (CTE). At the > moment, such queries failed: > {code:java} > CREATE TABLE tbl(namespace STRING) USING parquet > INSERT INTO tbl SELECT 'abc' > WITH transitions AS ( > SELECT * FROM tbl WHERE namespace = :namespace > ) SELECT * FROM transitions {code} > w/ the following error: > {code:java} > [UNBOUND_SQL_PARAMETER] Found the unbound parameter: `namespace`. Please, fix > `args` and provide a mapping of the parameter to a SQL literal.; line 3 pos > 38; > 'WithCTE > :- 'CTERelationDef 0, false > : +- 'SubqueryAlias transitions > : +- 'Project [*] > : +- 'Filter (namespace#3 = parameter(namespace)) > : +- SubqueryAlias spark_catalog.default.tbl > : +- Relation spark_catalog.default.tbl[namespace#3] parquet > +- 'Project [*] > +- 'SubqueryAlias transitions > +- 'CTERelationRef 0, falseorg.apache.spark.sql.AnalysisException: > [UNBOUND_SQL_PARAMETER] Found the unbound parameter: `namespace`. 
Please, fix > `args` and provide a mapping of the parameter to a SQL literal.; line 3 pos > 38; > 'WithCTE > :- 'CTERelationDef 0, false > : +- 'SubqueryAlias transitions > : +- 'Project [*] > : +- 'Filter (namespace#3 = parameter(namespace)) > : +- SubqueryAlias spark_catalog.default.tbl > : +- Relation spark_catalog.default.tbl[namespace#3] parquet > +- 'Project [*] > +- 'SubqueryAlias transitions > +- 'CTERelationRef 0, false at > org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:52) > at > org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis0$5(CheckAnalysis.scala:339) > at > org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis0$5$adapted(CheckAnalysis.scala:244) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
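The UNBOUND_SQL_PARAMETER failure above comes from a `:namespace` marker that never gets a value during analysis. The idea of binding named parameters can be sketched in plain Python; note this textual substitution is illustrative only (Spark binds parameters in the analyzer via `spark.sql(query, args=...)`, not by string rewriting), and `bind_params` is an invented helper:

```python
import re

# Illustrative named-parameter binding: replace each :name marker with a
# SQL literal from args, or fail like the analyzer does for unbound names.
def bind_params(sql, args):
    def repl(m):
        name = m.group(1)
        if name not in args:
            raise KeyError(f"UNBOUND_SQL_PARAMETER: {name}")
        v = args[name]
        if isinstance(v, str):
            return "'" + v.replace("'", "''") + "'"  # quote string literals
        return str(v)
    return re.sub(r":(\w+)", repl, sql)

bound = bind_params(
    "SELECT * FROM tbl WHERE namespace = :namespace",
    {"namespace": "abc"},
)
```

The fix in SPARK-42702 is to make this binding reach into named CTE bodies as well, so `:namespace` inside `WITH transitions AS (...)` resolves like it does in a top-level query.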
[jira] [Commented] (SPARK-42739) Ensure release tag to be pushed to release branch
[ https://issues.apache.org/jira/browse/SPARK-42739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17698663#comment-17698663 ] Apache Spark commented on SPARK-42739: -- User 'xinrong-meng' has created a pull request for this issue: https://github.com/apache/spark/pull/40357 > Ensure release tag to be pushed to release branch > - > > Key: SPARK-42739 > URL: https://issues.apache.org/jira/browse/SPARK-42739 > Project: Spark > Issue Type: Sub-task > Components: Build >Affects Versions: 3.4.0, 3.5.0 >Reporter: Xinrong Meng >Priority: Major > > In the release script, add a check to ensure the release tag is pushed to the > release branch. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42739) Ensure release tag to be pushed to release branch
[ https://issues.apache.org/jira/browse/SPARK-42739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17698662#comment-17698662 ] Apache Spark commented on SPARK-42739: -- User 'xinrong-meng' has created a pull request for this issue: https://github.com/apache/spark/pull/40357 > Ensure release tag to be pushed to release branch > - > > Key: SPARK-42739 > URL: https://issues.apache.org/jira/browse/SPARK-42739 > Project: Spark > Issue Type: Sub-task > Components: Build >Affects Versions: 3.4.0, 3.5.0 >Reporter: Xinrong Meng >Priority: Major > > In the release script, add a check to ensure release tag to be pushed to > release branch -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42739) Ensure release tag to be pushed to release branch
[ https://issues.apache.org/jira/browse/SPARK-42739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42739: Assignee: (was: Apache Spark) > Ensure release tag to be pushed to release branch > - > > Key: SPARK-42739 > URL: https://issues.apache.org/jira/browse/SPARK-42739 > Project: Spark > Issue Type: Sub-task > Components: Build >Affects Versions: 3.4.0, 3.5.0 >Reporter: Xinrong Meng >Priority: Major > > In the release script, add a check to ensure release tag to be pushed to > release branch -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42739) Ensure release tag to be pushed to release branch
[ https://issues.apache.org/jira/browse/SPARK-42739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42739: Assignee: Apache Spark > Ensure release tag to be pushed to release branch > - > > Key: SPARK-42739 > URL: https://issues.apache.org/jira/browse/SPARK-42739 > Project: Spark > Issue Type: Sub-task > Components: Build >Affects Versions: 3.4.0, 3.5.0 >Reporter: Xinrong Meng >Assignee: Apache Spark >Priority: Major > > In the release script, add a check to ensure release tag to be pushed to > release branch -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42733) df.write.format().save() should support calling with no path or table name
[ https://issues.apache.org/jira/browse/SPARK-42733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17698608#comment-17698608 ] Apache Spark commented on SPARK-42733: -- User 'ueshin' has created a pull request for this issue: https://github.com/apache/spark/pull/40356 > df.write.format().save() should support calling with no path or table name > -- > > Key: SPARK-42733 > URL: https://issues.apache.org/jira/browse/SPARK-42733 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 3.4.0 >Reporter: Martin Grund >Priority: Major > > When calling `session.range(5).write.format("xxx").options().save()` Spark > Connect currently throws an assertion error because it expects that either > path or tableName are present. According to our current PySpark > implementation that is not necessary though. > > {code:python} > if format is not None: > self.format(format) > if path is None: > self._jwrite.save() > else: > self._jwrite.save(path) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42733) df.write.format().save() should support calling with no path or table name
[ https://issues.apache.org/jira/browse/SPARK-42733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42733: Assignee: Apache Spark > df.write.format().save() should support calling with no path or table name > -- > > Key: SPARK-42733 > URL: https://issues.apache.org/jira/browse/SPARK-42733 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 3.4.0 >Reporter: Martin Grund >Assignee: Apache Spark >Priority: Major > > When calling `session.range(5).write.format("xxx").options().save()` Spark > Connect currently throws an assertion error because it expects that either > path or tableName are present. According to our current PySpark > implementation that is not necessary though. > > {code:python} > if format is not None: > self.format(format) > if path is None: > self._jwrite.save() > else: > self._jwrite.save(path) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42733) df.write.format().save() should support calling with no path or table name
[ https://issues.apache.org/jira/browse/SPARK-42733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42733: Assignee: (was: Apache Spark) > df.write.format().save() should support calling with no path or table name > -- > > Key: SPARK-42733 > URL: https://issues.apache.org/jira/browse/SPARK-42733 > Project: Spark > Issue Type: Sub-task > Components: Connect >Affects Versions: 3.4.0 >Reporter: Martin Grund >Priority: Major > > When calling `session.range(5).write.format("xxx").options().save()` Spark > Connect currently throws an assertion error because it expects that either > path or tableName are present. According to our current PySpark > implementation that is not necessary though. > > {code:python} > if format is not None: > self.format(format) > if path is None: > self._jwrite.save() > else: > self._jwrite.save(path) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42604) Implement functions.typedlit
[ https://issues.apache.org/jira/browse/SPARK-42604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17698434#comment-17698434 ] Apache Spark commented on SPARK-42604: -- User 'beliefer' has created a pull request for this issue: https://github.com/apache/spark/pull/40355 > Implement functions.typedlit > > > Key: SPARK-42604 > URL: https://issues.apache.org/jira/browse/SPARK-42604 > Project: Spark > Issue Type: New Feature > Components: Connect >Affects Versions: 3.4.0 >Reporter: Herman van Hövell >Priority: Major > > We need to add functions.typedlit. This requires a change to the connect > protocol. See SPARK-42579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42604) Implement functions.typedlit
[ https://issues.apache.org/jira/browse/SPARK-42604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42604: Assignee: (was: Apache Spark) > Implement functions.typedlit > > > Key: SPARK-42604 > URL: https://issues.apache.org/jira/browse/SPARK-42604 > Project: Spark > Issue Type: New Feature > Components: Connect >Affects Versions: 3.4.0 >Reporter: Herman van Hövell >Priority: Major > > We need to add functions.typedlit. This requires a change to the connect > protocol. See SPARK-42579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42604) Implement functions.typedlit
[ https://issues.apache.org/jira/browse/SPARK-42604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42604: Assignee: Apache Spark > Implement functions.typedlit > > > Key: SPARK-42604 > URL: https://issues.apache.org/jira/browse/SPARK-42604 > Project: Spark > Issue Type: New Feature > Components: Connect >Affects Versions: 3.4.0 >Reporter: Herman van Hövell >Assignee: Apache Spark >Priority: Major > > We need to add functions.typedlit. This requires a change to the connect > protocol. See SPARK-42579 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42735) Allow passing additional confs to server in RemoteSparkSession
[ https://issues.apache.org/jira/browse/SPARK-42735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17698427#comment-17698427 ] Apache Spark commented on SPARK-42735: -- User 'tomvanbussel' has created a pull request for this issue: https://github.com/apache/spark/pull/40354 > Allow passing additional confs to server in RemoteSparkSession > -- > > Key: SPARK-42735 > URL: https://issues.apache.org/jira/browse/SPARK-42735 > Project: Spark > Issue Type: Improvement > Components: Connect >Affects Versions: 3.4.0 >Reporter: Tom van Bussel >Priority: Minor > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42735) Allow passing additional confs to server in RemoteSparkSession
[ https://issues.apache.org/jira/browse/SPARK-42735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42735: Assignee: Apache Spark > Allow passing additional confs to server in RemoteSparkSession > -- > > Key: SPARK-42735 > URL: https://issues.apache.org/jira/browse/SPARK-42735 > Project: Spark > Issue Type: Improvement > Components: Connect >Affects Versions: 3.4.0 >Reporter: Tom van Bussel >Assignee: Apache Spark >Priority: Minor > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42735) Allow passing additional confs to server in RemoteSparkSession
[ https://issues.apache.org/jira/browse/SPARK-42735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42735: Assignee: (was: Apache Spark) > Allow passing additional confs to server in RemoteSparkSession > -- > > Key: SPARK-42735 > URL: https://issues.apache.org/jira/browse/SPARK-42735 > Project: Spark > Issue Type: Improvement > Components: Connect >Affects Versions: 3.4.0 >Reporter: Tom van Bussel >Priority: Minor > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42732) Support spark connect session getActiveSession
[ https://issues.apache.org/jira/browse/SPARK-42732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17698337#comment-17698337 ] Apache Spark commented on SPARK-42732: -- User 'WeichenXu123' has created a pull request for this issue: https://github.com/apache/spark/pull/40353 > Support spark connect session getActiveSession > -- > > Key: SPARK-42732 > URL: https://issues.apache.org/jira/browse/SPARK-42732 > Project: Spark > Issue Type: New Feature > Components: Connect, PySpark >Affects Versions: 3.5.0 >Reporter: Weichen Xu >Priority: Major > > Support spark connect session getActiveSession method. > Spark connect ML needs this API to get active session in some cases (e.g. > fetching model attributes from server side). -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42732) Support spark connect session getActiveSession
[ https://issues.apache.org/jira/browse/SPARK-42732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42732: Assignee: Apache Spark > Support spark connect session getActiveSession > -- > > Key: SPARK-42732 > URL: https://issues.apache.org/jira/browse/SPARK-42732 > Project: Spark > Issue Type: New Feature > Components: Connect, PySpark >Affects Versions: 3.5.0 >Reporter: Weichen Xu >Assignee: Apache Spark >Priority: Major > > Support spark connect session getActiveSession method. > Spark connect ML needs this API to get active session in some cases (e.g. > fetching model attributes from server side). -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42732) Support spark connect session getActiveSession
[ https://issues.apache.org/jira/browse/SPARK-42732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17698334#comment-17698334 ] Apache Spark commented on SPARK-42732: -- User 'WeichenXu123' has created a pull request for this issue: https://github.com/apache/spark/pull/40353 > Support spark connect session getActiveSession > -- > > Key: SPARK-42732 > URL: https://issues.apache.org/jira/browse/SPARK-42732 > Project: Spark > Issue Type: New Feature > Components: Connect, PySpark >Affects Versions: 3.5.0 >Reporter: Weichen Xu >Priority: Major > > Support spark connect session getActiveSession method. > Spark connect ML needs this API to get active session in some cases (e.g. > fetching model attributes from server side). -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42732) Support spark connect session getActiveSession
[ https://issues.apache.org/jira/browse/SPARK-42732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42732: Assignee: (was: Apache Spark) > Support spark connect session getActiveSession > -- > > Key: SPARK-42732 > URL: https://issues.apache.org/jira/browse/SPARK-42732 > Project: Spark > Issue Type: New Feature > Components: Connect, PySpark >Affects Versions: 3.5.0 >Reporter: Weichen Xu >Priority: Major > > Support spark connect session getActiveSession method. > Spark connect ML needs this API to get active session in some cases (e.g. > fetching model attributes from server side). -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42664) Support bloomFilter for DataFrameStatFunctions
[ https://issues.apache.org/jira/browse/SPARK-42664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17698332#comment-17698332 ] Apache Spark commented on SPARK-42664: -- User 'LuciferYang' has created a pull request for this issue: https://github.com/apache/spark/pull/40352 > Support bloomFilter for DataFrameStatFunctions > -- > > Key: SPARK-42664 > URL: https://issues.apache.org/jira/browse/SPARK-42664 > Project: Spark > Issue Type: Improvement > Components: Connect >Affects Versions: 3.4.0 >Reporter: Yang Jie >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42664) Support bloomFilter for DataFrameStatFunctions
[ https://issues.apache.org/jira/browse/SPARK-42664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42664: Assignee: (was: Apache Spark) > Support bloomFilter for DataFrameStatFunctions > -- > > Key: SPARK-42664 > URL: https://issues.apache.org/jira/browse/SPARK-42664 > Project: Spark > Issue Type: Improvement > Components: Connect >Affects Versions: 3.4.0 >Reporter: Yang Jie >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42664) Support bloomFilter for DataFrameStatFunctions
[ https://issues.apache.org/jira/browse/SPARK-42664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17698331#comment-17698331 ] Apache Spark commented on SPARK-42664: -- User 'LuciferYang' has created a pull request for this issue: https://github.com/apache/spark/pull/40352 > Support bloomFilter for DataFrameStatFunctions > -- > > Key: SPARK-42664 > URL: https://issues.apache.org/jira/browse/SPARK-42664 > Project: Spark > Issue Type: Improvement > Components: Connect >Affects Versions: 3.4.0 >Reporter: Yang Jie >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42664) Support bloomFilter for DataFrameStatFunctions
[ https://issues.apache.org/jira/browse/SPARK-42664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42664: Assignee: Apache Spark > Support bloomFilter for DataFrameStatFunctions > -- > > Key: SPARK-42664 > URL: https://issues.apache.org/jira/browse/SPARK-42664 > Project: Spark > Issue Type: Improvement > Components: Connect >Affects Versions: 3.4.0 >Reporter: Yang Jie >Assignee: Apache Spark >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42727) Support executing spark commands in the root directory when local mode is specified
[ https://issues.apache.org/jira/browse/SPARK-42727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17698279#comment-17698279 ] Apache Spark commented on SPARK-42727: -- User 'huangxiaopingRD' has created a pull request for this issue: https://github.com/apache/spark/pull/40351 > Support executing spark commands in the root directory when local mode is > specified > --- > > Key: SPARK-42727 > URL: https://issues.apache.org/jira/browse/SPARK-42727 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 3.5.0 >Reporter: xiaoping.huang >Priority: Minor > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42727) Support executing spark commands in the root directory when local mode is specified
[ https://issues.apache.org/jira/browse/SPARK-42727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42727: Assignee: Apache Spark > Support executing spark commands in the root directory when local mode is > specified > --- > > Key: SPARK-42727 > URL: https://issues.apache.org/jira/browse/SPARK-42727 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 3.5.0 >Reporter: xiaoping.huang >Assignee: Apache Spark >Priority: Minor > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42727) Support executing spark commands in the root directory when local mode is specified
[ https://issues.apache.org/jira/browse/SPARK-42727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17698277#comment-17698277 ] Apache Spark commented on SPARK-42727: -- User 'huangxiaopingRD' has created a pull request for this issue: https://github.com/apache/spark/pull/40351 > Support executing spark commands in the root directory when local mode is > specified > --- > > Key: SPARK-42727 > URL: https://issues.apache.org/jira/browse/SPARK-42727 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 3.5.0 >Reporter: xiaoping.huang >Priority: Minor > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42727) Support executing spark commands in the root directory when local mode is specified
[ https://issues.apache.org/jira/browse/SPARK-42727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42727: Assignee: (was: Apache Spark) > Support executing spark commands in the root directory when local mode is > specified > --- > > Key: SPARK-42727 > URL: https://issues.apache.org/jira/browse/SPARK-42727 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 3.5.0 >Reporter: xiaoping.huang >Priority: Minor > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42726) Implement `DataFrame.mapInArrow`
[ https://issues.apache.org/jira/browse/SPARK-42726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42726: Assignee: Apache Spark > Implement `DataFrame.mapInArrow` > > > Key: SPARK-42726 > URL: https://issues.apache.org/jira/browse/SPARK-42726 > Project: Spark > Issue Type: Sub-task > Components: Connect, PySpark >Affects Versions: 3.4.0 >Reporter: Xinrong Meng >Assignee: Apache Spark >Priority: Major > > Implement `DataFrame.mapInArrow` -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42726) Implement `DataFrame.mapInArrow`
[ https://issues.apache.org/jira/browse/SPARK-42726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42726: Assignee: (was: Apache Spark) > Implement `DataFrame.mapInArrow` > > > Key: SPARK-42726 > URL: https://issues.apache.org/jira/browse/SPARK-42726 > Project: Spark > Issue Type: Sub-task > Components: Connect, PySpark >Affects Versions: 3.4.0 >Reporter: Xinrong Meng >Priority: Major > > Implement `DataFrame.mapInArrow` -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42726) Implement `DataFrame.mapInArrow`
[ https://issues.apache.org/jira/browse/SPARK-42726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17698273#comment-17698273 ] Apache Spark commented on SPARK-42726: -- User 'xinrong-meng' has created a pull request for this issue: https://github.com/apache/spark/pull/40350 > Implement `DataFrame.mapInArrow` > > > Key: SPARK-42726 > URL: https://issues.apache.org/jira/browse/SPARK-42726 > Project: Spark > Issue Type: Sub-task > Components: Connect, PySpark >Affects Versions: 3.4.0 >Reporter: Xinrong Meng >Priority: Major > > Implement `DataFrame.mapInArrow` -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42710) Rename FrameMap proto to MapPartitions
[ https://issues.apache.org/jira/browse/SPARK-42710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17698271#comment-17698271 ] Apache Spark commented on SPARK-42710: -- User 'xinrong-meng' has created a pull request for this issue: https://github.com/apache/spark/pull/40350 > Rename FrameMap proto to MapPartitions > -- > > Key: SPARK-42710 > URL: https://issues.apache.org/jira/browse/SPARK-42710 > Project: Spark > Issue Type: Sub-task > Components: Connect, PySpark >Affects Versions: 3.4.0 >Reporter: Xinrong Meng >Assignee: Xinrong Meng >Priority: Major > Fix For: 3.4.0 > > > For readability. > Frame Map API refers to mapInPandas and mapInArrow, which are equivalent to > MapPartitions. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42725) Make LiteralExpression support array
[ https://issues.apache.org/jira/browse/SPARK-42725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17698207#comment-17698207 ] Apache Spark commented on SPARK-42725: -- User 'zhengruifeng' has created a pull request for this issue: https://github.com/apache/spark/pull/40349 > Make LiteralExpression support array > > > Key: SPARK-42725 > URL: https://issues.apache.org/jira/browse/SPARK-42725 > Project: Spark > Issue Type: Sub-task > Components: Connect, PySpark >Affects Versions: 3.4.0 >Reporter: Ruifeng Zheng >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42725) Make LiteralExpression support array
[ https://issues.apache.org/jira/browse/SPARK-42725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42725: Assignee: (was: Apache Spark) > Make LiteralExpression support array > > > Key: SPARK-42725 > URL: https://issues.apache.org/jira/browse/SPARK-42725 > Project: Spark > Issue Type: Sub-task > Components: Connect, PySpark >Affects Versions: 3.4.0 >Reporter: Ruifeng Zheng >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42725) Make LiteralExpression support array
[ https://issues.apache.org/jira/browse/SPARK-42725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42725: Assignee: Apache Spark > Make LiteralExpression support array > > > Key: SPARK-42725 > URL: https://issues.apache.org/jira/browse/SPARK-42725 > Project: Spark > Issue Type: Sub-task > Components: Connect, PySpark >Affects Versions: 3.4.0 >Reporter: Ruifeng Zheng >Assignee: Apache Spark >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42725) Make LiteralExpression support array
[ https://issues.apache.org/jira/browse/SPARK-42725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17698206#comment-17698206 ] Apache Spark commented on SPARK-42725: -- User 'zhengruifeng' has created a pull request for this issue: https://github.com/apache/spark/pull/40349 > Make LiteralExpression support array > > > Key: SPARK-42725 > URL: https://issues.apache.org/jira/browse/SPARK-42725 > Project: Spark > Issue Type: Sub-task > Components: Connect, PySpark >Affects Versions: 3.4.0 >Reporter: Ruifeng Zheng >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42711) build/sbt usage error messages about java-home
[ https://issues.apache.org/jira/browse/SPARK-42711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17698155#comment-17698155 ] Apache Spark commented on SPARK-42711: -- User 'liang3zy22' has created a pull request for this issue: https://github.com/apache/spark/pull/40347 > build/sbt usage error messages about java-home > -- > > Key: SPARK-42711 > URL: https://issues.apache.org/jira/browse/SPARK-42711 > Project: Spark > Issue Type: Bug > Components: Build >Affects Versions: 3.3.2 >Reporter: Liang Yan >Priority: Minor > > The build/sbt tool's usage information about java-home is wrong: > # java version (default: java from PATH, currently $(java -version 2>&1 | > grep version)) > -java-home alternate JAVA_HOME -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42711) build/sbt usage error messages about java-home
[ https://issues.apache.org/jira/browse/SPARK-42711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42711: Assignee: (was: Apache Spark) > build/sbt usage error messages about java-home > -- > > Key: SPARK-42711 > URL: https://issues.apache.org/jira/browse/SPARK-42711 > Project: Spark > Issue Type: Bug > Components: Build >Affects Versions: 3.3.2 >Reporter: Liang Yan >Priority: Minor > > The build/sbt tool's usage information about java-home is wrong: > # java version (default: java from PATH, currently $(java -version 2>&1 | > grep version)) > -java-home alternate JAVA_HOME -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42711) build/sbt usage error messages about java-home
[ https://issues.apache.org/jira/browse/SPARK-42711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42711: Assignee: Apache Spark > build/sbt usage error messages about java-home > -- > > Key: SPARK-42711 > URL: https://issues.apache.org/jira/browse/SPARK-42711 > Project: Spark > Issue Type: Bug > Components: Build >Affects Versions: 3.3.2 >Reporter: Liang Yan >Assignee: Apache Spark >Priority: Minor > > The build/sbt tool's usage information about java-home is wrong: > # java version (default: java from PATH, currently $(java -version 2>&1 | > grep version)) > -java-home alternate JAVA_HOME -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42724) Upgrade buf to v1.15.1
[ https://issues.apache.org/jira/browse/SPARK-42724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-42724: Assignee: (was: Apache Spark) > Upgrade buf to v1.15.1 > -- > > Key: SPARK-42724 > URL: https://issues.apache.org/jira/browse/SPARK-42724 > Project: Spark > Issue Type: Sub-task > Components: Build, Connect >Affects Versions: 3.4.1 >Reporter: BingKun Pan >Priority: Minor > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42724) Upgrade buf to v1.15.1
[ https://issues.apache.org/jira/browse/SPARK-42724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17698156#comment-17698156 ] Apache Spark commented on SPARK-42724: -- User 'panbingkun' has created a pull request for this issue: https://github.com/apache/spark/pull/40348 > Upgrade buf to v1.15.1 > -- > > Key: SPARK-42724 > URL: https://issues.apache.org/jira/browse/SPARK-42724 > Project: Spark > Issue Type: Sub-task > Components: Build, Connect >Affects Versions: 3.4.1 >Reporter: BingKun Pan >Priority: Minor > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-42724) Upgrade buf to v1.15.1

[ https://issues.apache.org/jira/browse/SPARK-42724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-42724:
------------------------------------

    Assignee: Apache Spark

> Upgrade buf to v1.15.1
> ----------------------
>
>                 Key: SPARK-42724
>                 URL: https://issues.apache.org/jira/browse/SPARK-42724
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Build, Connect
>    Affects Versions: 3.4.1
>            Reporter: BingKun Pan
>            Assignee: Apache Spark
>            Priority: Minor
>
[jira] [Commented] (SPARK-42667) Spark Connect: newSession API

[ https://issues.apache.org/jira/browse/SPARK-42667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17698119#comment-17698119 ]

Apache Spark commented on SPARK-42667:
--------------------------------------

User 'amaliujia' has created a pull request for this issue:
https://github.com/apache/spark/pull/40346

> Spark Connect: newSession API
> -----------------------------
>
>                 Key: SPARK-42667
>                 URL: https://issues.apache.org/jira/browse/SPARK-42667
>             Project: Spark
>          Issue Type: Task
>          Components: Connect
>    Affects Versions: 3.4.1
>            Reporter: Rui Wang
>            Assignee: Rui Wang
>            Priority: Major
>             Fix For: 3.4.1
>
[jira] [Commented] (SPARK-42723) Support parser data type json "timestamp_ltz" as TimestampType

[ https://issues.apache.org/jira/browse/SPARK-42723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17698096#comment-17698096 ]

Apache Spark commented on SPARK-42723:
--------------------------------------

User 'gengliangwang' has created a pull request for this issue:
https://github.com/apache/spark/pull/40345

> Support parser data type json "timestamp_ltz" as TimestampType
> --------------------------------------------------------------
>
>                 Key: SPARK-42723
>                 URL: https://issues.apache.org/jira/browse/SPARK-42723
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 3.4.0
>            Reporter: Gengliang Wang
>            Assignee: Gengliang Wang
>            Priority: Major
>
[jira] [Assigned] (SPARK-42723) Support parser data type json "timestamp_ltz" as TimestampType

[ https://issues.apache.org/jira/browse/SPARK-42723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-42723:
------------------------------------

    Assignee: Apache Spark  (was: Gengliang Wang)

> Support parser data type json "timestamp_ltz" as TimestampType
> --------------------------------------------------------------
>
>                 Key: SPARK-42723
>                 URL: https://issues.apache.org/jira/browse/SPARK-42723
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 3.4.0
>            Reporter: Gengliang Wang
>            Assignee: Apache Spark
>            Priority: Major
>
[jira] [Assigned] (SPARK-42723) Support parser data type json "timestamp_ltz" as TimestampType

[ https://issues.apache.org/jira/browse/SPARK-42723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-42723:
------------------------------------

    Assignee: Gengliang Wang  (was: Apache Spark)

> Support parser data type json "timestamp_ltz" as TimestampType
> --------------------------------------------------------------
>
>                 Key: SPARK-42723
>                 URL: https://issues.apache.org/jira/browse/SPARK-42723
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 3.4.0
>            Reporter: Gengliang Wang
>            Assignee: Gengliang Wang
>            Priority: Major
>
[jira] [Commented] (SPARK-42723) Support parser data type json "timestamp_ltz" as TimestampType

[ https://issues.apache.org/jira/browse/SPARK-42723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17698095#comment-17698095 ]

Apache Spark commented on SPARK-42723:
--------------------------------------

User 'gengliangwang' has created a pull request for this issue:
https://github.com/apache/spark/pull/40345

> Support parser data type json "timestamp_ltz" as TimestampType
> --------------------------------------------------------------
>
>                 Key: SPARK-42723
>                 URL: https://issues.apache.org/jira/browse/SPARK-42723
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 3.4.0
>            Reporter: Gengliang Wang
>            Assignee: Gengliang Wang
>            Priority: Major
>
[jira] [Commented] (SPARK-42656) Spark Connect Scala Client Shell Script

[ https://issues.apache.org/jira/browse/SPARK-42656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17698081#comment-17698081 ]

Apache Spark commented on SPARK-42656:
--------------------------------------

User 'zhenlineo' has created a pull request for this issue:
https://github.com/apache/spark/pull/40344

> Spark Connect Scala Client Shell Script
> ---------------------------------------
>
>                 Key: SPARK-42656
>                 URL: https://issues.apache.org/jira/browse/SPARK-42656
>             Project: Spark
>          Issue Type: Improvement
>          Components: Connect
>    Affects Versions: 3.4.0
>            Reporter: Zhen Li
>            Assignee: Zhen Li
>            Priority: Major
>             Fix For: 3.4.0
>
> Adding a shell script to run scala client in a scala REPL to allow users to connect to spark connect.
[jira] [Commented] (SPARK-42722) Python Connect def schema() should not cache the schema

[ https://issues.apache.org/jira/browse/SPARK-42722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17698073#comment-17698073 ]

Apache Spark commented on SPARK-42722:
--------------------------------------

User 'amaliujia' has created a pull request for this issue:
https://github.com/apache/spark/pull/40343

> Python Connect def schema() should not cache the schema
> --------------------------------------------------------
>
>                 Key: SPARK-42722
>                 URL: https://issues.apache.org/jira/browse/SPARK-42722
>             Project: Spark
>          Issue Type: Improvement
>          Components: Connect
>    Affects Versions: 3.4.0
>            Reporter: Rui Wang
>            Assignee: Rui Wang
>            Priority: Major
>
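The rationale behind this ticket — a schema cached on first access can go stale if the server-side answer later changes — can be sketched in plain shell, independent of Spark Connect. Here `fetch` and the temp file are illustrative stand-ins for the RPC and the server's state, not Spark APIs:

```shell
# Illustrative sketch only -- not Spark Connect's code. The temp file
# stands in for server-side state; fetch() stands in for the RPC that
# returns the current schema.
tmp=$(mktemp)
fetch() { cat "$tmp"; }

echo "id INT" > "$tmp"
cached=$(fetch)                       # schema cached on first access

echo "id INT, name STRING" > "$tmp"   # "server-side" change afterwards

echo "cached: $cached"                # still the stale value: id INT
echo "fresh:  $(fetch)"               # re-fetching sees: id INT, name STRING
```

Re-fetching on each `schema` access, as the ticket proposes, trades an extra round trip for never serving a stale answer.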
[jira] [Assigned] (SPARK-42722) Python Connect def schema() should not cache the schema

[ https://issues.apache.org/jira/browse/SPARK-42722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-42722:
------------------------------------

    Assignee: Apache Spark  (was: Rui Wang)

> Python Connect def schema() should not cache the schema
> --------------------------------------------------------
>
>                 Key: SPARK-42722
>                 URL: https://issues.apache.org/jira/browse/SPARK-42722
>             Project: Spark
>          Issue Type: Improvement
>          Components: Connect
>    Affects Versions: 3.4.0
>            Reporter: Rui Wang
>            Assignee: Apache Spark
>            Priority: Major
>
[jira] [Commented] (SPARK-42722) Python Connect def schema() should not cache the schema

[ https://issues.apache.org/jira/browse/SPARK-42722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17698072#comment-17698072 ]

Apache Spark commented on SPARK-42722:
--------------------------------------

User 'amaliujia' has created a pull request for this issue:
https://github.com/apache/spark/pull/40343

> Python Connect def schema() should not cache the schema
> --------------------------------------------------------
>
>                 Key: SPARK-42722
>                 URL: https://issues.apache.org/jira/browse/SPARK-42722
>             Project: Spark
>          Issue Type: Improvement
>          Components: Connect
>    Affects Versions: 3.4.0
>            Reporter: Rui Wang
>            Assignee: Rui Wang
>            Priority: Major
>