[jira] [Created] (SPARK-41110) Implement `DataFrame.sparkSession` in Python client
Rui Wang created SPARK-41110:

Summary: Implement `DataFrame.sparkSession` in Python client
Key: SPARK-41110
URL: https://issues.apache.org/jira/browse/SPARK-41110
Project: Spark
Issue Type: Sub-task
Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang
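A minimal sketch of what this could look like in the Connect Python client, assuming the DataFrame keeps a reference to the session that created it (class and attribute names are illustrative, not the actual implementation):

```python
class RemoteSparkSession:
    """Hypothetical stand-in for the Spark Connect client session."""

class DataFrame:
    def __init__(self, session: RemoteSparkSession):
        # The session that created this DataFrame, kept so that plan
        # execution can be routed back to the right client connection.
        self._session = session

    @property
    def sparkSession(self) -> RemoteSparkSession:
        """Return the Spark session that created this DataFrame."""
        return self._session
```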
[jira] [Created] (SPARK-41105) Adopt `optional` keyword from proto3 which offers `hasXXX` to differentiate if a field is set or unset
Rui Wang created SPARK-41105:

Summary: Adopt `optional` keyword from proto3 which offers `hasXXX` to differentiate if a field is set or unset
Key: SPARK-41105
URL: https://issues.apache.org/jira/browse/SPARK-41105
Project: Spark
Issue Type: Sub-task
Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang
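For context, plain proto3 scalar fields have no presence tracking: an unset field is indistinguishable from one set to its default value. The `optional` keyword restores presence via a synthetic one-field oneof, which is what generates the `hasXXX` accessors. A runnable analogue using the well-known `Value` type, whose scalar fields live in a oneof and therefore support `HasField` the same way:

```python
from google.protobuf import struct_pb2

v = struct_pb2.Value()
print(v.HasField("number_value"))   # False: the field is unset

v.number_value = 0.0                # explicitly set to the default value
print(v.HasField("number_value"))   # True: "set to 0" != "never set"
```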
[jira] [Commented] (SPARK-41104) Can insert NULL into hive table with NOT NULL column
[ https://issues.apache.org/jira/browse/SPARK-41104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17631946#comment-17631946 ]

Rui Wang commented on SPARK-41104:

It looks like Hive only enforces `NOT NULL` since Hive 3.0.0: https://issues.apache.org/jira/browse/HIVE-16575

> Can insert NULL into hive table with NOT NULL column
>
> Key: SPARK-41104
> URL: https://issues.apache.org/jira/browse/SPARK-41104
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 3.4.0
> Reporter: Serge Rielau
> Priority: Critical
>
> spark-sql> CREATE TABLE tttd(c1 int not null);
> 22/11/10 14:04:28 WARN ResolveSessionCatalog: A Hive serde table will be created as there is no table provider specified. You can set spark.sql.legacy.createHiveTableByDefault to false so that native data source table will be created instead.
> 22/11/10 14:04:28 WARN HiveMetaStore: Location: file:/Users/serge.rielau/spark/spark-warehouse/tttd specified for non-external table:tttd
> Time taken: 0.078 seconds
> spark-sql> INSERT INTO tttd VALUES(null);
> Time taken: 0.36 seconds
> spark-sql> SELECT * FROM tttd;
> NULL
> Time taken: 0.074 seconds, Fetched 1 row(s)
>
> Does Hive not support NOT NULL? That's fine, but then we should fail on CREATE TABLE.
[jira] [Created] (SPARK-41103) Document how to add a new proto field of messages
Rui Wang created SPARK-41103:

Summary: Document how to add a new proto field of messages
Key: SPARK-41103
URL: https://issues.apache.org/jira/browse/SPARK-41103
Project: Spark
Issue Type: Sub-task
Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang
[jira] [Created] (SPARK-41102) Merge SparkConnectPlanner and SparkConnectCommandPlanner
Rui Wang created SPARK-41102:

Summary: Merge SparkConnectPlanner and SparkConnectCommandPlanner
Key: SPARK-41102
URL: https://issues.apache.org/jira/browse/SPARK-41102
Project: Spark
Issue Type: Sub-task
Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang
[jira] [Updated] (SPARK-41090) Enhance Dataset.createTempView testing coverage for db_name.view_name
[ https://issues.apache.org/jira/browse/SPARK-41090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rui Wang updated SPARK-41090:

Description:
Add the following test case to `DatasetSuite`:

dataset.createTempView("test_db.tempView")
spark.catalog.tableExists("test_db.tempView")

> Enhance Dataset.createTempView testing coverage for db_name.view_name
>
> Key: SPARK-41090
> URL: https://issues.apache.org/jira/browse/SPARK-41090
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.4.0
> Reporter: Rui Wang
> Priority: Major
>
> Add the following test case to `DatasetSuite`:
> dataset.createTempView("test_db.tempView")
> spark.catalog.tableExists("test_db.tempView")
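A PySpark analogue of the intended coverage (the real test would live in the Scala `DatasetSuite`); whether creation succeeds or raises for a dotted name is exactly what the test is meant to pin down:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
df = spark.range(1)

try:
    df.createTempView("test_db.tempView")
    # If creation is allowed, the catalog should agree the view exists.
    assert spark.catalog.tableExists("test_db.tempView")
except Exception as e:
    # Otherwise the rejection itself is the behavior being documented.
    print(f"createTempView rejected the dotted name: {e}")
```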
[jira] [Created] (SPARK-41090) Enhance Dataset.createTempView testing coverage for db_name.view_name
Rui Wang created SPARK-41090:

Summary: Enhance Dataset.createTempView testing coverage for db_name.view_name
Key: SPARK-41090
URL: https://issues.apache.org/jira/browse/SPARK-41090
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 3.4.0
Reporter: Rui Wang
[jira] [Created] (SPARK-41086) Consolidate SecondArgumentXXX error to INVALID_PARAMETER_VALUE
Rui Wang created SPARK-41086:

Summary: Consolidate SecondArgumentXXX error to INVALID_PARAMETER_VALUE
Key: SPARK-41086
URL: https://issues.apache.org/jira/browse/SPARK-41086
Project: Spark
Issue Type: Sub-task
Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang
[jira] [Updated] (SPARK-41086) Consolidate SecondArgumentXXX error to INVALID_PARAMETER_VALUE
[ https://issues.apache.org/jira/browse/SPARK-41086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rui Wang updated SPARK-41086:

Description:
SECOND_FUNCTION_ARGUMENT_NOT_INTEGER
_LEGACY_ERROR_TEMP_1104

> Consolidate SecondArgumentXXX error to INVALID_PARAMETER_VALUE
>
> Key: SPARK-41086
> URL: https://issues.apache.org/jira/browse/SPARK-41086
> Project: Spark
> Issue Type: Sub-task
> Components: Connect
> Affects Versions: 3.4.0
> Reporter: Rui Wang
> Priority: Major
>
> SECOND_FUNCTION_ARGUMENT_NOT_INTEGER
> _LEGACY_ERROR_TEMP_1104
[jira] [Created] (SPARK-41078) DataFrame `withColumnsRenamed` can be implemented through `RenameColumns` proto
Rui Wang created SPARK-41078:

Summary: DataFrame `withColumnsRenamed` can be implemented through `RenameColumns` proto
Key: SPARK-41078
URL: https://issues.apache.org/jira/browse/SPARK-41078
Project: Spark
Issue Type: Sub-task
Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang
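A rough sketch of how the Python client could lower `withColumnsRenamed` onto a dedicated rename node; `RenameColumns` mirrors the proto message named in the summary, but the plan classes and field names here are assumptions for illustration:

```python
from typing import Dict, Optional

class LogicalPlan:
    """Hypothetical base class for client-side Connect plan nodes."""
    def __init__(self, child: Optional["LogicalPlan"] = None):
        self._child = child

class RenameColumns(LogicalPlan):
    def __init__(self, child: LogicalPlan, rename_map: Dict[str, str]):
        # Carries the old-name -> new-name mapping to the server, which
        # resolves it against the child plan's schema.
        super().__init__(child)
        self.rename_map = rename_map

class DataFrame:
    def __init__(self, plan: LogicalPlan):
        self._plan = plan

    def withColumnsRenamed(self, colsMap: Dict[str, str]) -> "DataFrame":
        # No data moves client-side; we only grow the unresolved plan.
        return DataFrame(RenameColumns(self._plan, dict(colsMap)))
```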
[jira] [Created] (SPARK-41077) Rename `ColumnRef` to `Column` in Python client implementation
Rui Wang created SPARK-41077:

Summary: Rename `ColumnRef` to `Column` in Python client implementation
Key: SPARK-41077
URL: https://issues.apache.org/jira/browse/SPARK-41077
Project: Spark
Issue Type: Sub-task
Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang
[jira] [Created] (SPARK-41061) Support SelectExpr which applies Projection by expressions in Strings in Connect DSL
Rui Wang created SPARK-41061:

Summary: Support SelectExpr which applies Projection by expressions in Strings in Connect DSL
Key: SPARK-41061
URL: https://issues.apache.org/jira/browse/SPARK-41061
Project: Spark
Issue Type: Sub-task
Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang
[jira] [Comment Edited] (SPARK-41057) Support other data type conversion in the DataTypeProtoConverter
[ https://issues.apache.org/jira/browse/SPARK-41057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17630706#comment-17630706 ]

Rui Wang edited comment on SPARK-41057 at 11/9/22 2:55 AM:

[~dengziming] Are you interested in this JIRA?

was (Author: amaliujia):
@dengziming Are you interested in this JIRA?

> Support other data type conversion in the DataTypeProtoConverter
>
> Key: SPARK-41057
> URL: https://issues.apache.org/jira/browse/SPARK-41057
> Project: Spark
> Issue Type: Sub-task
> Components: Connect
> Affects Versions: 3.4.0
> Reporter: Rui Wang
> Priority: Major
>
> In https://github.com/apache/spark/blob/master/connector/connect/src/main/scala/org/apache/spark/sql/connect/planner/DataTypeProtoConverter.scala#L34 we only support INT, STRING and STRUCT type conversion to and from catalyst and connect proto.
> We should be able to support all the types defined by https://github.com/apache/spark/blob/master/connector/connect/src/main/protobuf/spark/connect/types.proto
[jira] [Commented] (SPARK-41057) Support other data type conversion in the DataTypeProtoConverter
[ https://issues.apache.org/jira/browse/SPARK-41057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17630706#comment-17630706 ]

Rui Wang commented on SPARK-41057:

@dengziming Are you interested in this JIRA?

> Support other data type conversion in the DataTypeProtoConverter
>
> Key: SPARK-41057
> URL: https://issues.apache.org/jira/browse/SPARK-41057
> Project: Spark
> Issue Type: Sub-task
> Components: Connect
> Affects Versions: 3.4.0
> Reporter: Rui Wang
> Priority: Major
>
> In https://github.com/apache/spark/blob/master/connector/connect/src/main/scala/org/apache/spark/sql/connect/planner/DataTypeProtoConverter.scala#L34 we only support INT, STRING and STRUCT type conversion to and from catalyst and connect proto.
> We should be able to support all the types defined by https://github.com/apache/spark/blob/master/connector/connect/src/main/protobuf/spark/connect/types.proto
[jira] [Updated] (SPARK-41057) Support other data type conversion in the DataTypeProtoConverter
[ https://issues.apache.org/jira/browse/SPARK-41057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rui Wang updated SPARK-41057:

Description:
In https://github.com/apache/spark/blob/master/connector/connect/src/main/scala/org/apache/spark/sql/connect/planner/DataTypeProtoConverter.scala#L34 we only support INT, STRING and STRUCT type conversion to and from catalyst and connect proto.

We should be able to support all the types defined by https://github.com/apache/spark/blob/master/connector/connect/src/main/protobuf/spark/connect/types.proto

> Support other data type conversion in the DataTypeProtoConverter
>
> Key: SPARK-41057
> URL: https://issues.apache.org/jira/browse/SPARK-41057
> Project: Spark
> Issue Type: Sub-task
> Components: Connect
> Affects Versions: 3.4.0
> Reporter: Rui Wang
> Priority: Major
>
> In https://github.com/apache/spark/blob/master/connector/connect/src/main/scala/org/apache/spark/sql/connect/planner/DataTypeProtoConverter.scala#L34 we only support INT, STRING and STRUCT type conversion to and from catalyst and connect proto.
> We should be able to support all the types defined by https://github.com/apache/spark/blob/master/connector/connect/src/main/protobuf/spark/connect/types.proto
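The shape of the change is a total, two-way mapping between the proto type messages and Catalyst types, instead of the current INT/STRING/STRUCT subset. The real converter is Scala and the proto variant names below are invented for illustration; this Python sketch only shows the mapping pattern:

```python
from pyspark.sql import types as T

# Hypothetical proto "kind" names on the left; extend until every
# variant in spark/connect/types.proto is covered.
_PROTO_TO_CATALYST = {
    "i32": T.IntegerType(),
    "i64": T.LongType(),
    "string": T.StringType(),
    "double": T.DoubleType(),
    "boolean": T.BooleanType(),
}

def proto_kind_to_catalyst(kind: str) -> T.DataType:
    try:
        return _PROTO_TO_CATALYST[kind]
    except KeyError:
        # Mirrors the converter's failure mode for unsupported types.
        raise ValueError(f"unsupported proto type kind: {kind}")
```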
[jira] [Created] (SPARK-41057) Support other data type conversion in the DataTypeProtoConverter
Rui Wang created SPARK-41057:

Summary: Support other data type conversion in the DataTypeProtoConverter
Key: SPARK-41057
URL: https://issues.apache.org/jira/browse/SPARK-41057
Project: Spark
Issue Type: Sub-task
Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang
[jira] [Commented] (SPARK-41046) Support CreateView in Connect DSL
[ https://issues.apache.org/jira/browse/SPARK-41046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17630630#comment-17630630 ]

Rui Wang commented on SPARK-41046:

[~dengziming] Ah, I have a working version locally already. Sorry about that.

> Support CreateView in Connect DSL
>
> Key: SPARK-41046
> URL: https://issues.apache.org/jira/browse/SPARK-41046
> Project: Spark
> Issue Type: Sub-task
> Components: Connect
> Affects Versions: 3.4.0
> Reporter: Rui Wang
> Priority: Major
[jira] [Created] (SPARK-41046) Support CreateView in Connect DSL
Rui Wang created SPARK-41046:

Summary: Support CreateView in Connect DSL
Key: SPARK-41046
URL: https://issues.apache.org/jira/browse/SPARK-41046
Project: Spark
Issue Type: Sub-task
Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang
[jira] [Created] (SPARK-41036) `columns` API should use `schema` API to avoid data fetching
Rui Wang created SPARK-41036:

Summary: `columns` API should use `schema` API to avoid data fetching
Key: SPARK-41036
URL: https://issues.apache.org/jira/browse/SPARK-41036
Project: Spark
Issue Type: Sub-task
Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang
[jira] [Created] (SPARK-41034) Connect DataFrame should require RemoteSparkSession
Rui Wang created SPARK-41034:

Summary: Connect DataFrame should require RemoteSparkSession
Key: SPARK-41034
URL: https://issues.apache.org/jira/browse/SPARK-41034
Project: Spark
Issue Type: Sub-task
Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang
[jira] [Created] (SPARK-41033) RemoteSparkSession should only accept one `user_id`
Rui Wang created SPARK-41033:

Summary: RemoteSparkSession should only accept one `user_id`
Key: SPARK-41033
URL: https://issues.apache.org/jira/browse/SPARK-41033
Project: Spark
Issue Type: Sub-task
Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang
[jira] [Created] (SPARK-41026) Support Repartition in Connect DSL
Rui Wang created SPARK-41026:

Summary: Support Repartition in Connect DSL
Key: SPARK-41026
URL: https://issues.apache.org/jira/browse/SPARK-41026
Project: Spark
Issue Type: Sub-task
Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang
[jira] [Created] (SPARK-41010) Complete Support for Except and Intersect in Python client
Rui Wang created SPARK-41010:

Summary: Complete Support for Except and Intersect in Python client
Key: SPARK-41010
URL: https://issues.apache.org/jira/browse/SPARK-41010
Project: Spark
Issue Type: Sub-task
Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang
[jira] [Updated] (SPARK-41002) Compatible `take`, `head` and `first` API in Python client
[ https://issues.apache.org/jira/browse/SPARK-41002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rui Wang updated SPARK-41002:

Summary: Compatible `take`, `head` and `first` API in Python client (was: Compatible `take` and `head` API in Python client)

> Compatible `take`, `head` and `first` API in Python client
>
> Key: SPARK-41002
> URL: https://issues.apache.org/jira/browse/SPARK-41002
> Project: Spark
> Issue Type: Sub-task
> Components: Connect
> Affects Versions: 3.4.0
> Reporter: Rui Wang
> Priority: Major
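In classic PySpark these three are thin wrappers over one another: `take(n)` is `limit(n).collect()`, `head(n)` is `take(n)`, and `head()` with no argument returns a single row or None. A compatible client can keep exactly that layering; a sketch with stubbed plan methods:

```python
class DataFrame:
    def limit(self, n: int) -> "DataFrame":
        return self  # stub: the real client grows the unresolved plan

    def collect(self) -> list:
        return []    # stub: the real client executes against the server

    def take(self, num: int) -> list:
        # Same contract as classic PySpark: up to `num` rows as a list.
        return self.limit(num).collect()

    def head(self, n=None):
        # head() -> first row or None; head(n) -> list of up to n rows.
        if n is None:
            rows = self.take(1)
            return rows[0] if rows else None
        return self.take(n)

    def first(self):
        return self.head()
```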
[jira] [Created] (SPARK-41002) Compatible `take` and `head` API in Python client
Rui Wang created SPARK-41002:

Summary: Compatible `take` and `head` API in Python client
Key: SPARK-41002
URL: https://issues.apache.org/jira/browse/SPARK-41002
Project: Spark
Issue Type: Sub-task
Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang
[jira] [Created] (SPARK-40993) Migrate markdown style README to python/docs/development/testing.rst
Rui Wang created SPARK-40993:

Summary: Migrate markdown style README to python/docs/development/testing.rst
Key: SPARK-40993
URL: https://issues.apache.org/jira/browse/SPARK-40993
Project: Spark
Issue Type: Sub-task
Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang
[jira] [Created] (SPARK-40992) Support toDF(columnNames) in Connect DSL
Rui Wang created SPARK-40992:

Summary: Support toDF(columnNames) in Connect DSL
Key: SPARK-40992
URL: https://issues.apache.org/jira/browse/SPARK-40992
Project: Spark
Issue Type: Sub-task
Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang
[jira] [Created] (SPARK-40989) Improve `session.sql` testing coverage in Python client
Rui Wang created SPARK-40989:

Summary: Improve `session.sql` testing coverage in Python client
Key: SPARK-40989
URL: https://issues.apache.org/jira/browse/SPARK-40989
Project: Spark
Issue Type: Sub-task
Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang
[jira] [Created] (SPARK-40981) Support session.range in Python client
Rui Wang created SPARK-40981:

Summary: Support session.range in Python client
Key: SPARK-40981
URL: https://issues.apache.org/jira/browse/SPARK-40981
Project: Spark
Issue Type: Sub-task
Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang
[jira] [Updated] (SPARK-40980) Support session.sql in Connect DSL
[ https://issues.apache.org/jira/browse/SPARK-40980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rui Wang updated SPARK-40980:

Summary: Support session.sql in Connect DSL (was: Improve test coverage for session.sql)

> Support session.sql in Connect DSL
>
> Key: SPARK-40980
> URL: https://issues.apache.org/jira/browse/SPARK-40980
> Project: Spark
> Issue Type: Sub-task
> Components: Connect
> Affects Versions: 3.4.0
> Reporter: Rui Wang
> Priority: Major
[jira] [Created] (SPARK-40980) Improve test coverage for session.sql
Rui Wang created SPARK-40980:

Summary: Improve test coverage for session.sql
Key: SPARK-40980
URL: https://issues.apache.org/jira/browse/SPARK-40980
Project: Spark
Issue Type: Sub-task
Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang
[jira] [Commented] (SPARK-40951) pyspark-connect tests should be skipped if pandas doesn't exist
[ https://issues.apache.org/jira/browse/SPARK-40951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17626878#comment-17626878 ]

Rui Wang commented on SPARK-40951:

[~dongjoon] Is this JIRA fully resolved already? Can we close this JIRA now?

> pyspark-connect tests should be skipped if pandas doesn't exist
>
> Key: SPARK-40951
> URL: https://issues.apache.org/jira/browse/SPARK-40951
> Project: Spark
> Issue Type: Sub-task
> Components: PySpark, Tests
> Affects Versions: 3.4.0
> Reporter: Dongjoon Hyun
> Priority: Minor
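The usual PySpark pattern for gating tests on an optional dependency is a module-level import probe plus `unittest.skipIf`; a minimal self-contained version (the class and message text are illustrative):

```python
import unittest

try:
    import pandas  # noqa: F401
    have_pandas = True
except ImportError:
    have_pandas = False

@unittest.skipIf(not have_pandas, "pandas is not installed")
class SparkConnectTests(unittest.TestCase):
    def test_to_pandas(self):
        # Runs only when pandas is importable.
        self.assertTrue(have_pandas)

if __name__ == "__main__":
    unittest.main()
```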
[jira] [Created] (SPARK-40977) Complete Support for Union in Python client
Rui Wang created SPARK-40977:

Summary: Complete Support for Union in Python client
Key: SPARK-40977
URL: https://issues.apache.org/jira/browse/SPARK-40977
Project: Spark
Issue Type: Sub-task
Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang
[jira] [Created] (SPARK-40971) Imports more from connect proto package to avoid calling `proto.` for Connect DSL
Rui Wang created SPARK-40971:

Summary: Imports more from connect proto package to avoid calling `proto.` for Connect DSL
Key: SPARK-40971
URL: https://issues.apache.org/jira/browse/SPARK-40971
Project: Spark
Issue Type: Sub-task
Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang
[jira] [Updated] (SPARK-40970) Support List[ColumnRef] for Join's on argument.
[ https://issues.apache.org/jira/browse/SPARK-40970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rui Wang updated SPARK-40970:

Description:
Right now Join's `on` does not support a list of ColumnRef: [df.age == df2.age, df.name == df2.name]. We can improve the expression system to figure out a way to support it.

> Support List[ColumnRef] for Join's on argument.
>
> Key: SPARK-40970
> URL: https://issues.apache.org/jira/browse/SPARK-40970
> Project: Spark
> Issue Type: Sub-task
> Components: Connect
> Affects Versions: 3.4.0
> Reporter: Rui Wang
> Priority: Major
>
> Right now Join's `on` does not support a list of ColumnRef: [df.age == df2.age, df.name == df2.name]. We can improve the expression system to figure out a way to support it.
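The target call shape matches classic PySpark, where `on` may be a list of boolean Columns that are AND-ed together. One common way a client can reduce such a list into the single join condition the plan needs (the helper is a sketch, not the actual Connect implementation):

```python
from functools import reduce

def combine_join_conditions(conditions):
    # [df.age == df2.age, df.name == df2.name] -> one AND-ed expression.
    # Column objects overload `&`, so reduce chains them pairwise.
    return reduce(lambda left, right: left & right, conditions)

# Desired API, mirroring classic PySpark:
# joined = df.join(df2, on=[df.age == df2.age, df.name == df2.name])
```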
[jira] [Created] (SPARK-40970) Support List[ColumnRef] for Join's on argument.
Rui Wang created SPARK-40970:

Summary: Support List[ColumnRef] for Join's on argument.
Key: SPARK-40970
URL: https://issues.apache.org/jira/browse/SPARK-40970
Project: Spark
Issue Type: Sub-task
Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang
[jira] [Created] (SPARK-40938) Support Alias for every Relation
Rui Wang created SPARK-40938:

Summary: Support Alias for every Relation
Key: SPARK-40938
URL: https://issues.apache.org/jira/browse/SPARK-40938
Project: Spark
Issue Type: Sub-task
Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang
[jira] [Created] (SPARK-40930) Support Collect() in Python client
Rui Wang created SPARK-40930:

Summary: Support Collect() in Python client
Key: SPARK-40930
URL: https://issues.apache.org/jira/browse/SPARK-40930
Project: Spark
Issue Type: Sub-task
Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang
[jira] [Created] (SPARK-40926) Refactor server side tests to only use DataFrame API
Rui Wang created SPARK-40926:

Summary: Refactor server side tests to only use DataFrame API
Key: SPARK-40926
URL: https://issues.apache.org/jira/browse/SPARK-40926
Project: Spark
Issue Type: Sub-task
Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang
[jira] [Created] (SPARK-40915) Improve `on` in Join in Python client
Rui Wang created SPARK-40915:

Summary: Improve `on` in Join in Python client
Key: SPARK-40915
URL: https://issues.apache.org/jira/browse/SPARK-40915
Project: Spark
Issue Type: Sub-task
Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang
[jira] [Updated] (SPARK-40914) Mark internal API to be private[connect]
[ https://issues.apache.org/jira/browse/SPARK-40914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rui Wang updated SPARK-40914:

Summary: Mark internal API to be private[connect] (was: Mark private API to be private[connect])

> Mark internal API to be private[connect]
>
> Key: SPARK-40914
> URL: https://issues.apache.org/jira/browse/SPARK-40914
> Project: Spark
> Issue Type: Sub-task
> Components: Connect
> Affects Versions: 3.4.0
> Reporter: Rui Wang
> Priority: Major
[jira] [Created] (SPARK-40914) Mark private API to be private[connect]
Rui Wang created SPARK-40914:

Summary: Mark private API to be private[connect]
Key: SPARK-40914
URL: https://issues.apache.org/jira/browse/SPARK-40914
Project: Spark
Issue Type: Sub-task
Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang
[jira] [Created] (SPARK-40883) Support Range in Connect proto
Rui Wang created SPARK-40883:

Summary: Support Range in Connect proto
Key: SPARK-40883
URL: https://issues.apache.org/jira/browse/SPARK-40883
Project: Spark
Issue Type: Sub-task
Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang
[jira] [Created] (SPARK-40879) Support Join UsingColumns in proto
Rui Wang created SPARK-40879:

Summary: Support Join UsingColumns in proto
Key: SPARK-40879
URL: https://issues.apache.org/jira/browse/SPARK-40879
Project: Spark
Issue Type: Sub-task
Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang
[jira] [Created] (SPARK-40875) Add .agg() to Connect DSL
Rui Wang created SPARK-40875:

Summary: Add .agg() to Connect DSL
Key: SPARK-40875
URL: https://issues.apache.org/jira/browse/SPARK-40875
Project: Spark
Issue Type: Sub-task
Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang
[jira] [Commented] (SPARK-40875) Add .agg() to Connect DSL
[ https://issues.apache.org/jira/browse/SPARK-40875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17622510#comment-17622510 ]

Rui Wang commented on SPARK-40875:

I am working on this.

> Add .agg() to Connect DSL
>
> Key: SPARK-40875
> URL: https://issues.apache.org/jira/browse/SPARK-40875
> Project: Spark
> Issue Type: Sub-task
> Components: Connect
> Affects Versions: 3.4.0
> Reporter: Rui Wang
> Priority: Major
[jira] [Updated] (SPARK-40839) [Python] Implement `DataFrame.sample`
[ https://issues.apache.org/jira/browse/SPARK-40839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rui Wang updated SPARK-40839:

Summary: [Python] Implement `DataFrame.sample` (was: Implement `DataFrame.sample`)

> [Python] Implement `DataFrame.sample`
>
> Key: SPARK-40839
> URL: https://issues.apache.org/jira/browse/SPARK-40839
> Project: Spark
> Issue Type: Sub-task
> Components: Connect, PySpark
> Affects Versions: 3.4.0
> Reporter: Ruifeng Zheng
> Assignee: Ruifeng Zheng
> Priority: Major
[jira] [Created] (SPARK-40836) AnalyzeResult should use struct for schema
Rui Wang created SPARK-40836:

Summary: AnalyzeResult should use struct for schema
Key: SPARK-40836
URL: https://issues.apache.org/jira/browse/SPARK-40836
Project: Spark
Issue Type: Sub-task
Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang
[jira] [Created] (SPARK-40828) Drop Python test tables before and after unit tests
Rui Wang created SPARK-40828:

Summary: Drop Python test tables before and after unit tests
Key: SPARK-40828
URL: https://issues.apache.org/jira/browse/SPARK-40828
Project: Spark
Issue Type: Sub-task
Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang
[jira] [Created] (SPARK-40823) Connect Proto should carry unparsed identifiers
Rui Wang created SPARK-40823:

Summary: Connect Proto should carry unparsed identifiers
Key: SPARK-40823
URL: https://issues.apache.org/jira/browse/SPARK-40823
Project: Spark
Issue Type: Sub-task
Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang
[jira] [Created] (SPARK-40816) Python: rename LogicalPlan.collect to LogicalPlan.to_proto
Rui Wang created SPARK-40816:

Summary: Python: rename LogicalPlan.collect to LogicalPlan.to_proto
Key: SPARK-40816
URL: https://issues.apache.org/jira/browse/SPARK-40816
Project: Spark
Issue Type: Sub-task
Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang
[jira] [Created] (SPARK-40813) Add limit and offset to Connect DSL
Rui Wang created SPARK-40813:

Summary: Add limit and offset to Connect DSL
Key: SPARK-40813
URL: https://issues.apache.org/jira/browse/SPARK-40813
Project: Spark
Issue Type: Sub-task
Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang
[jira] [Created] (SPARK-40812) Add Deduplicate to Connect proto
Rui Wang created SPARK-40812:

Summary: Add Deduplicate to Connect proto
Key: SPARK-40812
URL: https://issues.apache.org/jira/browse/SPARK-40812
Project: Spark
Issue Type: Sub-task
Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang
[jira] [Updated] (SPARK-40809) Add as(alias: String) to connect DSL
[ https://issues.apache.org/jira/browse/SPARK-40809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rui Wang updated SPARK-40809:

Summary: Add as(alias: String) to connect DSL (was: Add as(alias) to connect DSL)

> Add as(alias: String) to connect DSL
>
> Key: SPARK-40809
> URL: https://issues.apache.org/jira/browse/SPARK-40809
> Project: Spark
> Issue Type: Sub-task
> Components: Connect
> Affects Versions: 3.4.0
> Reporter: Rui Wang
> Priority: Major
[jira] [Created] (SPARK-40809) Add as(alias) to connect DSL
Rui Wang created SPARK-40809:

Summary: Add as(alias) to connect DSL
Key: SPARK-40809
URL: https://issues.apache.org/jira/browse/SPARK-40809
Project: Spark
Issue Type: Sub-task
Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang
[jira] [Created] (SPARK-40780) Add WHERE to Connect proto and DSL
Rui Wang created SPARK-40780:

Summary: Add WHERE to Connect proto and DSL
Key: SPARK-40780
URL: https://issues.apache.org/jira/browse/SPARK-40780
Project: Spark
Issue Type: Sub-task
Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang
[jira] [Created] (SPARK-40774) Add Sample to proto and DSL
Rui Wang created SPARK-40774:

Summary: Add Sample to proto and DSL
Key: SPARK-40774
URL: https://issues.apache.org/jira/browse/SPARK-40774
Project: Spark
Issue Type: Sub-task
Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang
[jira] [Created] (SPARK-40743) StructType should contain a list of StructField and each field should have a name
Rui Wang created SPARK-40743:

Summary: StructType should contain a list of StructField and each field should have a name
Key: SPARK-40743
URL: https://issues.apache.org/jira/browse/SPARK-40743
Project: Spark
Issue Type: Sub-task
Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang
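For reference, this matches how the user-facing API already models schemas: a StructType is an ordered list of StructFields, and every field carries its own name. In PySpark:

```python
from pyspark.sql.types import StructType, StructField, IntegerType, StringType

# Each field owns its name; the struct is just an ordered list of fields.
schema = StructType([
    StructField("id", IntegerType(), nullable=False),
    StructField("name", StringType(), nullable=True),
])
print([f.name for f in schema.fields])  # ['id', 'name']
```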
[jira] [Created] (SPARK-40717) Support Column Alias in connect DSL
Rui Wang created SPARK-40717:

Summary: Support Column Alias in connect DSL
Key: SPARK-40717
URL: https://issues.apache.org/jira/browse/SPARK-40717
Project: Spark
Issue Type: Sub-task
Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang
[jira] [Created] (SPARK-40713) Improve SET operation support in the proto and the server
Rui Wang created SPARK-40713:

Summary: Improve SET operation support in the proto and the server
Key: SPARK-40713
URL: https://issues.apache.org/jira/browse/SPARK-40713
Project: Spark
Issue Type: Sub-task
Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang
[jira] [Created] (SPARK-40707) Add groupby to connect DSL and test more than one grouping expression
Rui Wang created SPARK-40707:

Summary: Add groupby to connect DSL and test more than one grouping expression
Key: SPARK-40707
URL: https://issues.apache.org/jira/browse/SPARK-40707
Project: Spark
Issue Type: Sub-task
Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang
[jira] [Updated] (SPARK-40693) mypy complains accessing the variable defined in the class method
[ https://issues.apache.org/jira/browse/SPARK-40693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rui Wang updated SPARK-40693:

Description:
This is one example: for SparkConnectTestsPlanOnly, those unit tests access the mock remote session by `self.connect`, however mypy complains with `error: "SparkConnectTestsPlanOnly" has no attribute "connect" [attr-defined]`

was:
This is one example: for SparkConnectTestsPlanOnly, those unit tests access the mock remote session by `self.connect`, how mypy complains in as `error: "SparkConnectTestsPlanOnly" has no attribute "connect" [attr-defined]`

> mypy complains accessing the variable defined in the class method
>
> Key: SPARK-40693
> URL: https://issues.apache.org/jira/browse/SPARK-40693
> Project: Spark
> Issue Type: Sub-task
> Components: Connect
> Affects Versions: 3.4.0
> Reporter: Rui Wang
> Assignee: Rui Wang
> Priority: Major
> Fix For: 3.4.0
>
> This is one example: for SparkConnectTestsPlanOnly, those unit tests access the mock remote session by `self.connect`, however mypy complains with `error: "SparkConnectTestsPlanOnly" has no attribute "connect" [attr-defined]`
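The usual fix for this class of mypy error is to declare the attribute at class scope with a type annotation, so an assignment made in `setUpClass` (rather than `__init__`) is still visible to the checker. A minimal sketch, with `object` standing in for the real mock session type:

```python
import unittest

class SparkConnectTestsPlanOnly(unittest.TestCase):
    # Declaring the attribute here tells mypy it exists, even though the
    # value is only assigned inside setUpClass at runtime.
    connect: object  # in the real suite this would be the mock session type

    @classmethod
    def setUpClass(cls) -> None:
        cls.connect = object()  # stand-in for the mock remote session

    def test_attribute_is_visible(self) -> None:
        self.assertIsNotNone(self.connect)  # no attr-defined error now
```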
[jira] [Updated] (SPARK-40693) mypy complains accessing the variable defined in the class method
[ https://issues.apache.org/jira/browse/SPARK-40693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rui Wang updated SPARK-40693:

Description:
This is one example: for SparkConnectTestsPlanOnly, those unit tests access the mock remote session by `self.connect`, how mypy complains in as `error: "SparkConnectTestsPlanOnly" has no attribute "connect" [attr-defined]`

was:
This is one example: for SparkConnectTestsPlanOnly, those unit tests access the remote session by `self.connect`, how mypy complains in as `error: "SparkConnectTestsPlanOnly" has no attribute "connect" [attr-defined]`

> mypy complains accessing the variable defined in the class method
>
> Key: SPARK-40693
> URL: https://issues.apache.org/jira/browse/SPARK-40693
> Project: Spark
> Issue Type: Sub-task
> Components: Connect
> Affects Versions: 3.4.0
> Reporter: Rui Wang
> Priority: Major
>
> This is one example: for SparkConnectTestsPlanOnly, those unit tests access the mock remote session by `self.connect`, how mypy complains in as `error: "SparkConnectTestsPlanOnly" has no attribute "connect" [attr-defined]`
[jira] [Updated] (SPARK-40693) mypy complains accessing the variable defined in the class method
[ https://issues.apache.org/jira/browse/SPARK-40693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rui Wang updated SPARK-40693:

Description:
This is one example: for SparkConnectTestsPlanOnly, those unit tests access the remote session by `self.connect`, how mypy complains in as `error: "SparkConnectTestsPlanOnly" has no attribute "connect" [attr-defined]`

> mypy complains accessing the variable defined in the class method
>
> Key: SPARK-40693
> URL: https://issues.apache.org/jira/browse/SPARK-40693
> Project: Spark
> Issue Type: Sub-task
> Components: Connect
> Affects Versions: 3.4.0
> Reporter: Rui Wang
> Priority: Major
>
> This is one example: for SparkConnectTestsPlanOnly, those unit tests access the remote session by `self.connect`, how mypy complains in as `error: "SparkConnectTestsPlanOnly" has no attribute "connect" [attr-defined]`
[jira] [Created] (SPARK-40693) mypy complains accessing the variable defined in the class method
Rui Wang created SPARK-40693:

Summary: mypy complains accessing the variable defined in the class method
Key: SPARK-40693
URL: https://issues.apache.org/jira/browse/SPARK-40693
Project: Spark
Issue Type: Sub-task
Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang
[jira] [Created] (SPARK-40645) Throw exception for Collect() and recommend to use toPandas()
Rui Wang created SPARK-40645:

Summary: Throw exception for Collect() and recommend to use toPandas()
Key: SPARK-40645
URL: https://issues.apache.org/jira/browse/SPARK-40645
Project: Spark
Issue Type: Sub-task
Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang

The current Connect `Collect()` returns a Pandas DataFrame, which does not match the PySpark DataFrame API: https://github.com/apache/spark/blob/ceb8527413288b4d5c54d3afd76d00c9e26817a1/python/pyspark/sql/connect/data_frame.py#L227. The underlying implementation has been generating a Pandas DataFrame, though. Given that, we can move this behavior to `toPandas()` and throw an exception for `Collect()`.
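A sketch of the proposed stopgap, assuming the existing pandas-producing path simply moves behind `toPandas()` (method bodies are placeholders, not the real client code):

```python
class DataFrame:
    def _fetch_as_pandas(self):
        # Placeholder for the existing path that pulls result batches from
        # the server and materializes them as a pandas DataFrame.
        raise NotImplementedError

    def toPandas(self):
        # Keep the behavior, under the name that matches what it returns.
        return self._fetch_as_pandas()

    def collect(self):
        # Until collect() can return List[Row] like PySpark, fail loudly
        # rather than returning a surprising pandas DataFrame.
        raise NotImplementedError(
            "collect() is not supported yet; use toPandas() instead."
        )
```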
[jira] [Created] (SPARK-40587) SELECT * shouldn't be empty project list in proto.
Rui Wang created SPARK-40587:

Summary: SELECT * shouldn't be empty project list in proto.
Key: SPARK-40587
URL: https://issues.apache.org/jira/browse/SPARK-40587
Project: Spark
Issue Type: Sub-task
Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang

The current proto uses an empty project list for `SELECT *`. However, this is implicit, and it is hard to differentiate `not set` from `set but empty`. For longer-term proto compatibility, we should always use explicit fields to pass through information.
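One way to make the star explicit, sketched with invented names (the real Connect proto layout may differ): model `SELECT *` as its own field rather than as the absence of project expressions, so "not set" and "set but empty" stay distinguishable.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class UnresolvedStar:
    """Explicit marker for SELECT * (hypothetical message)."""

@dataclass
class Project:
    # Implicit design: expressions == [] has to mean "star", so an
    # accidentally empty list is indistinguishable from SELECT *.
    expressions: List[object] = field(default_factory=list)
    # Explicit design: the star is its own field; set <=> SELECT *.
    star: Optional[UnresolvedStar] = None
```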
[jira] [Created] (SPARK-40586) Decouple plan transformation and validation on server side
Rui Wang created SPARK-40586:

Summary: Decouple plan transformation and validation on server side
Key: SPARK-40586
URL: https://issues.apache.org/jira/browse/SPARK-40586
Project: Spark
Issue Type: Sub-task
Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang

Project Connect can, from some perspectives, be thought of as replacing the SQL parser to generate a parsed (but unresolved) plan, which is then passed to the analyzer. This means Connect should also validate the proto, since there are many invalid-parse cases that the analyzer does not expect to see, which could cause problems if Connect only passed the proto through (translated, of course) to the analyzer.

Meanwhile, it is a good idea to decouple validation and transformation so that we have two stages:
Stage 1: proto validation, for example validating that necessary fields are populated.
Stage 2: transformation, which converts the proto to a plan under the assumption that the proto is a valid parsed version of the plan.
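The two stages the description proposes, as a generic sketch (names are illustrative; the real server-side code is Scala):

```python
class InvalidPlanInput(Exception):
    """Raised when a proto plan fails stage-1 validation."""

def validate(plan) -> None:
    # Stage 1: structural checks only, e.g. required fields populated.
    if getattr(plan, "input", None) is None:
        raise InvalidPlanInput("relation is missing its input")

def transform(plan):
    # Stage 2: assume the proto is valid and translate it into an
    # unresolved logical plan, leaving resolution to the analyzer.
    ...

def handle(plan):
    validate(plan)
    return transform(plan)
```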
[jira] [Created] (SPARK-40296) Error Class for DISTINCT function not found
Rui Wang created SPARK-40296:

Summary: Error Class for DISTINCT function not found
Key: SPARK-40296
URL: https://issues.apache.org/jira/browse/SPARK-40296
Project: Spark
Issue Type: Task
Components: SQL
Affects Versions: 3.4.0
Reporter: Rui Wang
[jira] [Created] (SPARK-40055) listCatalogs should also return spark_catalog even when the spark_catalog implementation is defaultSessionCatalog
Rui Wang created SPARK-40055:

Summary: listCatalogs should also return spark_catalog even when the spark_catalog implementation is defaultSessionCatalog
Key: SPARK-40055
URL: https://issues.apache.org/jira/browse/SPARK-40055
Project: Spark
Issue Type: Sub-task
Components: SQL
Affects Versions: 3.4.0
Reporter: Rui Wang
[jira] [Updated] (SPARK-39810) Catalog.tableExists should handle nested namespace
[ https://issues.apache.org/jira/browse/SPARK-39810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rui Wang updated SPARK-39810:

Summary: Catalog.tableExists should handle nested namespace (was: tableExists can reuse getTable code)

> Catalog.tableExists should handle nested namespace
>
> Key: SPARK-39810
> URL: https://issues.apache.org/jira/browse/SPARK-39810
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.4.0
> Reporter: Rui Wang
> Assignee: Rui Wang
> Priority: Major
> Fix For: 3.4.0
[jira] [Commented] (SPARK-39828) Catalog.listTables() should respect currentCatalog
[ https://issues.apache.org/jira/browse/SPARK-39828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17569140#comment-17569140 ]

Rui Wang commented on SPARK-39828:

However, it seems that temp tables will not be checked.

> Catalog.listTables() should respect currentCatalog
>
> Key: SPARK-39828
> URL: https://issues.apache.org/jira/browse/SPARK-39828
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.4.0
> Reporter: Rui Wang
> Priority: Major
[jira] [Created] (SPARK-39828) Catalog.listTables() should respect currentCatalog
Rui Wang created SPARK-39828:

Summary: Catalog.listTables() should respect currentCatalog
Key: SPARK-39828
URL: https://issues.apache.org/jira/browse/SPARK-39828
Project: Spark
Issue Type: Sub-task
Components: SQL
Affects Versions: 3.4.0
Reporter: Rui Wang
[jira] [Created] (SPARK-39810) tableExists can reuse getTable code
Rui Wang created SPARK-39810:

Summary: tableExists can reuse getTable code
Key: SPARK-39810
URL: https://issues.apache.org/jira/browse/SPARK-39810
Project: Spark
Issue Type: Sub-task
Components: SQL
Affects Versions: 3.4.0
Reporter: Rui Wang
[jira] [Created] (SPARK-39700) Deprecate API that has parameters (DBName, tableName/FunctionName)
Rui Wang created SPARK-39700:

Summary: Deprecate API that has parameters (DBName, tableName/FunctionName)
Key: SPARK-39700
URL: https://issues.apache.org/jira/browse/SPARK-39700
Project: Spark
Issue Type: Sub-task
Components: SQL
Affects Versions: 3.4.0
Reporter: Rui Wang
[jira] [Created] (SPARK-39583) Make RefreshTable be compatible with 3 layer namespace
Rui Wang created SPARK-39583:

Summary: Make RefreshTable be compatible with 3 layer namespace
Key: SPARK-39583
URL: https://issues.apache.org/jira/browse/SPARK-39583
Project: Spark
Issue Type: Sub-task
Components: SQL
Affects Versions: 3.4.0
Reporter: Rui Wang
[jira] [Updated] (SPARK-39548) CreateView Command with a window clause query hit a wrong window definition not found issue
[ https://issues.apache.org/jira/browse/SPARK-39548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rui Wang updated SPARK-39548:

Summary: CreateView Command with a window clause query hit a wrong window definition not found issue (was: CreateView Command with a window clause query hit a wrong window definition not found issue.)

> CreateView Command with a window clause query hit a wrong window definition not found issue
>
> Key: SPARK-39548
> URL: https://issues.apache.org/jira/browse/SPARK-39548
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.3.0
> Reporter: Rui Wang
> Priority: Major
>
> This query hits a "window definition w2 not found" error in the `WindowSubstitute` rule; however, this is a bug, since the w2 definition is defined in the query.
> ```
> create or replace temporary view test_temp_view as
> with step_1 as (
>   select *, min(a) over w2 as min_a_over_w2
>   from (select 1 as a, 2 as b, 3 as c)
>   window w2 as (partition by b order by c)),
> step_2 as (
>   select *, max(e) over w1 as max_a_over_w1
>   from (select 1 as e, 2 as f, 3 as g)
>   join step_1 on true
>   window w1 as (partition by f order by g)
> )
> select *
> from step_2
> ```
> Also, we can move the unresolved window expression check from the `WindowSubstitute` rule to the `CheckAnalysis` phase.
[jira] [Created] (SPARK-39548) CreateView Command with a window clause query hit a wrong window definition not found issue.
Rui Wang created SPARK-39548:

Summary: CreateView Command with a window clause query hit a wrong window definition not found issue.
Key: SPARK-39548
URL: https://issues.apache.org/jira/browse/SPARK-39548
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 3.3.0
Reporter: Rui Wang

This query hits a "window definition w2 not found" error in the `WindowSubstitute` rule; however, this is a bug, since the w2 definition is defined in the query.

```
create or replace temporary view test_temp_view as
with step_1 as (
  select *, min(a) over w2 as min_a_over_w2
  from (select 1 as a, 2 as b, 3 as c)
  window w2 as (partition by b order by c)),
step_2 as (
  select *, max(e) over w1 as max_a_over_w1
  from (select 1 as e, 2 as f, 3 as g)
  join step_1 on true
  window w1 as (partition by f order by g)
)
select *
from step_2
```

Also, we can move the unresolved window expression check from the `WindowSubstitute` rule to the `CheckAnalysis` phase.
[jira] [Created] (SPARK-39506) CacheTable, isCached, UncacheTable, setCurrentCatalog, currentCatalog, listCatalogs
Rui Wang created SPARK-39506:

Summary: CacheTable, isCached, UncacheTable, setCurrentCatalog, currentCatalog, listCatalogs
Key: SPARK-39506
URL: https://issues.apache.org/jira/browse/SPARK-39506
Project: Spark
Issue Type: Sub-task
Components: SQL
Affects Versions: 3.2.0
Reporter: Rui Wang
[jira] [Created] (SPARK-39263) GetTable, TableExists and DatabaseExists
Rui Wang created SPARK-39263:

Summary: GetTable, TableExists and DatabaseExists
Key: SPARK-39263
URL: https://issues.apache.org/jira/browse/SPARK-39263
Project: Spark
Issue Type: Sub-task
Components: SQL
Affects Versions: 3.3.0
Reporter: Rui Wang
[jira] [Commented] (SPARK-39236) Make CreateTable API and ListTables API compatible
[ https://issues.apache.org/jira/browse/SPARK-39236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17539762#comment-17539762 ]

Rui Wang commented on SPARK-39236:

https://github.com/apache/spark/pull/36586

> Make CreateTable API and ListTables API compatible
>
> Key: SPARK-39236
> URL: https://issues.apache.org/jira/browse/SPARK-39236
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.3.0
> Reporter: Rui Wang
> Priority: Major
>
> https://github.com/apache/spark/blob/c6dccc7dd412a95007f5bb2584d69b85ff9ebf8e/sql/core/src/main/scala/org/apache/spark/sql/internal/CatalogImpl.scala#L364
> https://github.com/apache/spark/blob/c6dccc7dd412a95007f5bb2584d69b85ff9ebf8e/sql/core/src/main/scala/org/apache/spark/sql/internal/CatalogImpl.scala#L99
[jira] [Created] (SPARK-39236) Make CreateTable API and ListTables API compatible
Rui Wang created SPARK-39236:

Summary: Make CreateTable API and ListTables API compatible
Key: SPARK-39236
URL: https://issues.apache.org/jira/browse/SPARK-39236
Project: Spark
Issue Type: Sub-task
Components: SQL
Affects Versions: 3.3.0
Reporter: Rui Wang

https://github.com/apache/spark/blob/c6dccc7dd412a95007f5bb2584d69b85ff9ebf8e/sql/core/src/main/scala/org/apache/spark/sql/internal/CatalogImpl.scala#L364
https://github.com/apache/spark/blob/c6dccc7dd412a95007f5bb2584d69b85ff9ebf8e/sql/core/src/main/scala/org/apache/spark/sql/internal/CatalogImpl.scala#L99
[jira] [Updated] (SPARK-39235) Make Catalog API be compatible with 3-layer-namespace
[ https://issues.apache.org/jira/browse/SPARK-39235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Wang updated SPARK-39235: - Component/s: SQL (was: Spark Core) > Make Catalog API be compatible with 3-layer-namespace > - > > Key: SPARK-39235 > URL: https://issues.apache.org/jira/browse/SPARK-39235 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.3.0 >Reporter: Rui Wang >Priority: Major > > We can make the Catalog API support a 3-layer namespace: > catalog_name.database_name.table_name -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-39235) Make Catalog API be compatible with 3-layer-namespace
Rui Wang created SPARK-39235: Summary: Make Catalog API be compatible with 3-layer-namespace Key: SPARK-39235 URL: https://issues.apache.org/jira/browse/SPARK-39235 Project: Spark Issue Type: Improvement Components: Spark Core Affects Versions: 3.3.0 Reporter: Rui Wang We can make the Catalog API support a 3-layer namespace: catalog_name.database_name.table_name -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
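[Editorial note] A sketch of what the proposal would look like from the user side, assuming the existing Catalog API entry points simply accept qualified names; the my_catalog / my_db / my_table identifiers are hypothetical, not from the ticket.
```scala
// Illustrative sketch of 3-layer-namespace Catalog API usage; the
// catalog/database/table names below are hypothetical.
import org.apache.spark.sql.SparkSession

object CatalogNamespaceSketch extends App {
  val spark = SparkSession.builder().master("local[*]").getOrCreate()

  // Today these calls take "database.table"; the proposal is for the same
  // entry points to also accept "catalog.database.table".
  spark.catalog.tableExists("my_catalog.my_db.my_table")
  spark.catalog.getTable("my_catalog.my_db.my_table")
  spark.catalog.listTables("my_catalog.my_db")
}
```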
[jira] [Updated] (SPARK-39144) Nested subquery expressions deduplicate relations should be done bottom up
[ https://issues.apache.org/jira/browse/SPARK-39144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Wang updated SPARK-39144: - Description: When we have nested subquery expressions, there is a chance that relation deduplication could replace an attribute with a wrong one. This is because the attribute replacement is done top down rather than bottom up. This can happen if a subplan goes through relation deduplication first (leaving two copies of the same relation with different attribute IDs); a more complex plan built on top of that subplan (e.g. a UNION of queries with nested subquery expressions) can then trigger this wrong attribute replacement. > Nested subquery expressions deduplicate relations should be done bottom up > -- > > Key: SPARK-39144 > URL: https://issues.apache.org/jira/browse/SPARK-39144 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.3.0 >Reporter: Rui Wang >Priority: Major > > When we have nested subquery expressions, there is a chance that relation deduplication could replace an attribute with a wrong one. This is because the attribute replacement is done top down rather than bottom up. This can happen if a subplan goes through relation deduplication first (leaving two copies of the same relation with different attribute IDs); a more complex plan built on top of that subplan (e.g. a UNION of queries with nested subquery expressions) can then trigger this wrong attribute replacement. -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
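[Editorial note] A schematic sketch of the query shape being described; this illustrates the structure only and is not the exact reproduction from the linked PR. All subquery expressions reference the same relation, so deduplication must assign fresh attribute IDs to each copy.
```scala
// Schematic sketch only: a UNION whose branches contain nested subquery
// expressions, all over the same relation `t`.
import org.apache.spark.sql.SparkSession

object DedupShapeSketch extends App {
  val spark = SparkSession.builder().master("local[*]").getOrCreate()
  spark.sql("create or replace temporary view t as select 1 as a, 2 as b")

  val query =
    """select a from t
      |where exists (select 1 from t where a in (select max(a) from t))
      |union
      |select b from t
      |where exists (select 1 from t where b in (select max(b) from t))""".stripMargin

  // If attribute IDs are rewired top down during deduplication, an inner
  // subquery's attributes can end up pointing at the wrong copy of `t`;
  // rewriting bottom up avoids that.
  spark.sql(query).explain(true)
}
```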
[jira] [Updated] (SPARK-39144) Nested subquery expressions deduplicate relations should be done bottom up
[ https://issues.apache.org/jira/browse/SPARK-39144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Wang updated SPARK-39144: - Summary: Nested subquery expressions deduplicate relations should be done bottom up (was: Spark SQL replace wrong attributes for nested subquery expression in which all tables are the same relation) > Nested subquery expressions deduplicate relations should be done bottom up > -- > > Key: SPARK-39144 > URL: https://issues.apache.org/jira/browse/SPARK-39144 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.3.0 >Reporter: Rui Wang >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-39144) Spark SQL replace wrong attributes for nested subquery expression in which all tables are the same relation
[ https://issues.apache.org/jira/browse/SPARK-39144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17534580#comment-17534580 ] Rui Wang commented on SPARK-39144: -- Testing in https://github.com/apache/spark/pull/36503. Will come up with an example. > Spark SQL replace wrong attributes for nested subquery expression in which > all tables are the same relation > --- > > Key: SPARK-39144 > URL: https://issues.apache.org/jira/browse/SPARK-39144 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.3.0 >Reporter: Rui Wang >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-39144) Spark SQL replace wrong attributes for nested subquery expression in which all tables are the same relation
Rui Wang created SPARK-39144: Summary: Spark SQL replace wrong attributes for nested subquery expression in which all tables are the same relation Key: SPARK-39144 URL: https://issues.apache.org/jira/browse/SPARK-39144 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 3.3.0 Reporter: Rui Wang -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-39012) SparkSQL parse partition value does not support all data types
[ https://issues.apache.org/jira/browse/SPARK-39012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Wang updated SPARK-39012: - Summary: SparkSQL parse partition value does not support all data types (was: SparkSQL infer schema does not support all data types) > SparkSQL parse partition value does not support all data types > -- > > Key: SPARK-39012 > URL: https://issues.apache.org/jira/browse/SPARK-39012 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.3.0 >Reporter: Rui Wang >Priority: Major > > When Spark needs to infer a schema, it needs to parse a string into a type. Not all data types are supported in this path; for example, binary is known to not be supported. If a user uses a binary column and does not use a metastore, SparkSQL can fall back to schema inference and then fail during the table scan. This is a bug, since schema inference is supported but some types are missing. A string can be converted to all types except ARRAY, MAP, STRUCT, etc. Also, when converting from a string, a smaller type will not be identified if a larger type also matches (for example, short versus long). Based on the Spark SQL data types (https://spark.apache.org/docs/latest/sql-ref-datatypes.html), we can support the following types: BINARY, BOOLEAN. And there are two types that I am not sure SparkSQL supports: YearMonthIntervalType, DayTimeIntervalType. -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
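[Editorial note] A minimal sketch of the code path under discussion, assuming a local session; the /tmp/t output location is hypothetical. Partition values are written into directory names and must be parsed back into typed columns when reading without a metastore.
```scala
// Minimal sketch: partition values end up in directory names
// (e.g. .../flag=true/) and are parsed back on read.
import org.apache.spark.sql.SparkSession

object PartitionValueSketch extends App {
  val spark = SparkSession.builder().master("local[*]").getOrCreate()
  import spark.implicits._

  Seq((1, true), (2, false)).toDF("id", "flag")
    .write.mode("overwrite").partitionBy("flag").parquet("/tmp/t")

  // Without a metastore schema, "true"/"false" in the directory names must
  // be parsed back into a typed partition column; types the parser does not
  // handle fall back to string, which is the gap the ticket describes.
  spark.read.parquet("/tmp/t").printSchema()
}
```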
[jira] [Updated] (SPARK-39012) SparkSQL infer schema does not support all data types
[ https://issues.apache.org/jira/browse/SPARK-39012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Wang updated SPARK-39012: - Description: When Spark needs to infer a schema, it needs to parse a string into a type. Not all data types are supported in this path; for example, binary is known to not be supported. If a user uses a binary column and does not use a metastore, SparkSQL can fall back to schema inference and then fail during the table scan. This is a bug, since schema inference is supported but some types are missing. A string can be converted to all types except ARRAY, MAP, STRUCT, etc. Also, when converting from a string, a smaller type will not be identified if a larger type also matches (for example, short versus long). Based on the Spark SQL data types (https://spark.apache.org/docs/latest/sql-ref-datatypes.html), we can support the following types: BINARY, BOOLEAN. And there are two types that I am not sure SparkSQL supports: YearMonthIntervalType, DayTimeIntervalType. was: When Spark needs to infer a schema, it needs to parse a string into a type. Not all data types are supported in this path; for example, binary is known to not be supported. A string can be converted to all types except ARRAY, MAP, STRUCT, etc. Also, when converting from a string, a smaller type will not be identified if a larger type also matches (for example, short versus long). Based on the Spark SQL data types (https://spark.apache.org/docs/latest/sql-ref-datatypes.html), we can support the following types: BINARY, BOOLEAN. And there are two types that I am not sure SparkSQL supports: YearMonthIntervalType, DayTimeIntervalType. > SparkSQL infer schema does not support all data types > - > > Key: SPARK-39012 > URL: https://issues.apache.org/jira/browse/SPARK-39012 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.3.0 >Reporter: Rui Wang >Priority: Major > > When Spark needs to infer a schema, it needs to parse a string into a type. Not all data types are supported in this path; for example, binary is known to not be supported. If a user uses a binary column and does not use a metastore, SparkSQL can fall back to schema inference and then fail during the table scan. This is a bug, since schema inference is supported but some types are missing. A string can be converted to all types except ARRAY, MAP, STRUCT, etc. Also, when converting from a string, a smaller type will not be identified if a larger type also matches (for example, short versus long). Based on the Spark SQL data types (https://spark.apache.org/docs/latest/sql-ref-datatypes.html), we can support the following types: BINARY, BOOLEAN. And there are two types that I am not sure SparkSQL supports: YearMonthIntervalType, DayTimeIntervalType. -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-39012) SparkSQL infer schema does not support all data types
[ https://issues.apache.org/jira/browse/SPARK-39012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Wang updated SPARK-39012: - Description: When Spark needs to infer a schema, it needs to parse a string into a type. Not all data types are supported in this path; for example, binary is known to not be supported. A string can be converted to all types except ARRAY, MAP, STRUCT, etc. Also, when converting from a string, a smaller type will not be identified if a larger type also matches (for example, short versus long). Based on the Spark SQL data types (https://spark.apache.org/docs/latest/sql-ref-datatypes.html), we can support the following types: BINARY, BOOLEAN. And there are two types that I am not sure SparkSQL supports: YearMonthIntervalType, DayTimeIntervalType. was: When Spark needs to infer a schema, it needs to parse a string into a type. Not all data types are supported in this path; for example, binary is known to not be supported. A string can be converted to all types except ARRAY, MAP, STRUCT, etc. Spark SQL data types: https://spark.apache.org/docs/latest/sql-ref-datatypes.html > SparkSQL infer schema does not support all data types > - > > Key: SPARK-39012 > URL: https://issues.apache.org/jira/browse/SPARK-39012 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.3.0 >Reporter: Rui Wang >Priority: Major > > When Spark needs to infer a schema, it needs to parse a string into a type. Not all data types are supported in this path; for example, binary is known to not be supported. > A string can be converted to all types except ARRAY, MAP, STRUCT, etc. Also, when converting from a string, a smaller type will not be identified if a larger type also matches (for example, short versus long). > Based on the Spark SQL data types (https://spark.apache.org/docs/latest/sql-ref-datatypes.html), we can support the following types: > BINARY > BOOLEAN > And there are two types that I am not sure SparkSQL supports: > YearMonthIntervalType > DayTimeIntervalType -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-39012) SparkSQL infer schema does not support all data types
[ https://issues.apache.org/jira/browse/SPARK-39012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17527767#comment-17527767 ] Rui Wang commented on SPARK-39012: -- A PR is ready to support the binary type: https://github.com/apache/spark/pull/36344 > SparkSQL infer schema does not support all data types > - > > Key: SPARK-39012 > URL: https://issues.apache.org/jira/browse/SPARK-39012 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.3.0 >Reporter: Rui Wang >Priority: Major > > When Spark needs to infer a schema, it needs to parse a string into a type. Not all data types are supported in this path; for example, binary is known to not be supported. > A string can be converted to all types except ARRAY, MAP, STRUCT, etc. > Spark SQL data types: > https://spark.apache.org/docs/latest/sql-ref-datatypes.html -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-39012) SparkSQL Infer schema path does not support all data types
Rui Wang created SPARK-39012: Summary: SparkSQL Infer schema path does not support all data types Key: SPARK-39012 URL: https://issues.apache.org/jira/browse/SPARK-39012 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 3.3.0 Reporter: Rui Wang When Spark needs to infer a schema, it needs to parse a string into a type. Not all data types are supported in this path; for example, binary is known to not be supported. A string can be converted to all types except ARRAY, MAP, STRUCT, etc. Spark SQL data types: https://spark.apache.org/docs/latest/sql-ref-datatypes.html -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-39012) SparkSQL infer schema does not support all data types
[ https://issues.apache.org/jira/browse/SPARK-39012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Wang updated SPARK-39012: - Summary: SparkSQL infer schema does not support all data types (was: SparkSQL Infer schema path does not support all data types) > SparkSQL infer schema does not support all data types > - > > Key: SPARK-39012 > URL: https://issues.apache.org/jira/browse/SPARK-39012 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.3.0 >Reporter: Rui Wang >Priority: Major > > When Spark needs to infer a schema, it needs to parse a string into a type. Not all data types are supported in this path; for example, binary is known to not be supported. > A string can be converted to all types except ARRAY, MAP, STRUCT, etc. > Spark SQL data types: > https://spark.apache.org/docs/latest/sql-ref-datatypes.html -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-38063) Support SQL split_part function
[ https://issues.apache.org/jira/browse/SPARK-38063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Wang updated SPARK-38063: - Description: `split_part()` is a commonly supported function in other systems such as Postgres. The Spark equivalent is `element_at(split(arg, delim), part)` h5. Function Specification h6. Syntax {code:java} split_part(str, delimiter, partNum) {code} h6. Arguments {code:java} str: string type delimiter: string type partNum: Integer type {code} h6. Note {code:java} 1. This function splits `str` by `delimiter` and returns the requested part of the split (1-based). 2. If any input parameter is NULL, returns NULL. 3. If the index is out of range of split parts, returns an empty string. 4. If `partNum` is 0, throws an error. 5. If `partNum` is negative, the parts are counted backward from the end of the string. 6. When delimiter is empty, str is considered not split, so there is just 1 split part. {code} h6. Examples {code:java} > SELECT _FUNC_('11.12.13', '.', 3); 13 > SELECT _FUNC_(NULL, '.', 3); NULL > SELECT _FUNC_('11.12.13', '', 1); '11.12.13' {code} was: `split_part()` is a commonly supported function in other systems such as Postgres. The Spark equivalent is `element_at(split(arg, delim), part)` h5. Function Specification h6. Syntax {code:java} split_part(str, delimiter, partNum) {code} h6. Arguments {code:java} str: string type delimiter: string type partNum: Integer type {code} h6. Note {code:java} 1. This function splits `str` by `delimiter` and returns the requested part of the split (1-based). 2. If any input parameter is NULL, returns NULL. 3. If the index is out of range of split parts, returns an empty string. 4. If `partNum` is 0, throws an error. 5. If `partNum` is negative, the parts are counted backward from the end of the string. 6. When delimiter is empty, str is considered not split, so there is just 1 split part. {code} h6. Examples {code:java} > SELECT _FUNC_('11.12.13', '.', 3); 13 > SELECT _FUNC_(NULL, '.', 3); NULL > SELECT _FUNC_('11.12.13', '', 1); '11.12.13' {code} > Support SQL split_part function > --- > > Key: SPARK-38063 > URL: https://issues.apache.org/jira/browse/SPARK-38063 > Project: Spark > Issue Type: Task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Rui Wang >Priority: Major > > `split_part()` is a commonly supported function in other systems such as Postgres. The Spark equivalent is `element_at(split(arg, delim), part)` > h5. Function Specification > h6. Syntax > {code:java} > split_part(str, delimiter, partNum) > {code} > h6. Arguments > {code:java} > str: string type > delimiter: string type > partNum: Integer type > {code} > h6. Note > {code:java} > 1. This function splits `str` by `delimiter` and returns the requested part of the split (1-based). > 2. If any input parameter is NULL, returns NULL. > 3. If the index is out of range of split parts, returns an empty string. > 4. If `partNum` is 0, throws an error. > 5. If `partNum` is negative, the parts are counted backward from the end of the string. > 6. When delimiter is empty, str is considered not split, so there is just 1 split part. > {code} > h6. Examples > {code:java} > > SELECT _FUNC_('11.12.13', '.', 3); > 13 > > SELECT _FUNC_(NULL, '.', 3); > NULL > > SELECT _FUNC_('11.12.13', '', 1); > '11.12.13' > {code} -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
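[Editorial note] A quick sketch of the existing equivalent the ticket names, runnable as-is in a local session. Note that `split()` takes a Java regex, so a literal '.' delimiter must be escaped, which is part of why a dedicated `split_part()` is friendlier.
```scala
// The equivalent spelling named in the ticket; split() takes a Java regex,
// so the literal '.' delimiter is escaped.
import org.apache.spark.sql.SparkSession

object SplitPartEquivalent extends App {
  val spark = SparkSession.builder().master("local[*]").getOrCreate()

  spark.sql("""select element_at(split('11.12.13', '\\.'), 3)""").show()  // 13
  // element_at also accepts negative indexes, counting from the end:
  spark.sql("""select element_at(split('11.12.13', '\\.'), -1)""").show() // 13
}
```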
[jira] [Updated] (SPARK-38063) Support SQL split_part function
[ https://issues.apache.org/jira/browse/SPARK-38063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Wang updated SPARK-38063: - Description: `split_part()` is a commonly supported function in other systems such as Postgres. The Spark equivalent is `element_at(split(arg, delim), part)` h5. Function Specification h6. Syntax {code:java} split_part(str, delimiter, partNum) {code} h6. Arguments {code:java} str: string type delimiter: string type partNum: Integer type {code} h6. Note {code:java} 1. This function splits `str` by `delimiter` and returns the requested part of the split (1-based). 2. If any input parameter is NULL, returns NULL. 3. If the index is out of range of split parts, returns an empty string. 4. If `partNum` is 0, throws an error. 5. If `partNum` is negative, the parts are counted backward from the end of the string. 6. When delimiter is empty, str is considered not split, so there is just 1 split part. {code} h6. Examples {code:java} > SELECT _FUNC_('11.12.13', '.', 3); 13 > SELECT _FUNC_(NULL, '.', 3); NULL > SELECT _FUNC_('11.12.13', '', 1); '11.12.13' {code} was: `split_part()` is a commonly supported function in other systems such as Postgres. The Spark equivalent is `element_at(split(arg, delim), part)` h5. Function Specification h6. Syntax {code:java} split_part(str, delimiter, partNum) {code} h6. Arguments {code:java} str: string type delimiter: string type partNum: Integer type {code} h6. Note {code:java} 1. This function splits `str` by `delimiter` and returns the requested part of the split (1-based). 2. If any input parameter is NULL, returns NULL. 3. If the index is out of range of split parts, returns null. 4. If `partNum` is 0, throws an error. 5. If `partNum` is negative, the parts are counted backward from the end of the string. 6. When delimiter is empty, str is considered not split, so there is just 1 split part. {code} h6. Examples {code:java} > SELECT _FUNC_('11.12.13', '.', 3); 13 > SELECT _FUNC_(NULL, '.', 3); NULL > SELECT _FUNC_('11.12.13', '', 1); '11.12.13' {code} > Support SQL split_part function > --- > > Key: SPARK-38063 > URL: https://issues.apache.org/jira/browse/SPARK-38063 > Project: Spark > Issue Type: Task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Rui Wang >Priority: Major > > `split_part()` is a commonly supported function in other systems such as Postgres. The Spark equivalent is `element_at(split(arg, delim), part)` > h5. Function Specification > h6. Syntax > {code:java} > split_part(str, delimiter, partNum) > {code} > h6. Arguments > {code:java} > str: string type > delimiter: string type > partNum: Integer type > {code} > h6. Note > {code:java} > 1. This function splits `str` by `delimiter` and returns the requested part of the split (1-based). > 2. If any input parameter is NULL, returns NULL. > 3. If the index is out of range of split parts, returns an empty string. > 4. If `partNum` is 0, throws an error. > 5. If `partNum` is negative, the parts are counted backward from the end of the string. > 6. When delimiter is empty, str is considered not split, so there is just 1 split part. > {code} > h6. Examples > {code:java} > > SELECT _FUNC_('11.12.13', '.', 3); > 13 > > SELECT _FUNC_(NULL, '.', 3); > NULL > > SELECT _FUNC_('11.12.13', '', 1); > '11.12.13' > {code} -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-38063) Support SQL split_part function
[ https://issues.apache.org/jira/browse/SPARK-38063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Wang updated SPARK-38063: - Description: `split_part()` is a commonly supported function in other systems such as Postgres. The Spark equivalent is `element_at(split(arg, delim), part)` h5. Function Specification h6. Syntax {code:java} split_part(str, delimiter, partNum) {code} h6. Arguments {code:java} str: string type delimiter: string type partNum: Integer type {code} h6. Note {code:java} 1. This function splits `str` by `delimiter` and returns the requested part of the split (1-based). 2. If any input parameter is NULL, returns NULL. 3. If the index is out of range of split parts, returns null. 4. If `partNum` is 0, throws an error. 5. If `partNum` is negative, the parts are counted backward from the end of the string. 6. When delimiter is empty, str is considered not split, so there is just 1 split part. {code} h6. Examples {code:java} > SELECT _FUNC_('11.12.13', '.', 3); 13 > SELECT _FUNC_(NULL, '.', 3); NULL > SELECT _FUNC_('11.12.13', '', 1); '11.12.13' {code} was: `split_part()` is a commonly supported function in other systems such as Postgres. The Spark equivalent is `element_at(split(arg, delim), part)` h5. Function Specification h6. Syntax {code:java} split_part(str, delimiter, partNum) {code} h6. Arguments {code:java} str: string type delimiter: string type partNum: Integer type {code} h6. Note {code:java} 1. This function splits `str` by `delimiter` and returns the requested part of the split (1-based). 2. If any input parameter is NULL, returns NULL. 3. If the index is out of range of split parts, returns null. 4. If `partNum` is 0, throws an error. 5. If `partNum` is negative, the parts are counted backward from the end of the string. 6. When delimiter is empty, str is considered not split, so there is just 1 split part. {code} > SELECT _FUNC_('11.12.13', '.', 3); 13 > SELECT _FUNC_(NULL, '.', 3); NULL > SELECT _FUNC_('11.12.13', '', 1); '11.12.13' {code} > Support SQL split_part function > --- > > Key: SPARK-38063 > URL: https://issues.apache.org/jira/browse/SPARK-38063 > Project: Spark > Issue Type: Task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Rui Wang >Priority: Major > > `split_part()` is a commonly supported function in other systems such as Postgres. The Spark equivalent is `element_at(split(arg, delim), part)` > h5. Function Specification > h6. Syntax > {code:java} > split_part(str, delimiter, partNum) > {code} > h6. Arguments > {code:java} > str: string type > delimiter: string type > partNum: Integer type > {code} > h6. Note > {code:java} > 1. This function splits `str` by `delimiter` and returns the requested part of the split (1-based). > 2. If any input parameter is NULL, returns NULL. > 3. If the index is out of range of split parts, returns null. > 4. If `partNum` is 0, throws an error. > 5. If `partNum` is negative, the parts are counted backward from the end of the string. > 6. When delimiter is empty, str is considered not split, so there is just 1 split part. > {code} > h6. Examples > {code:java} > > SELECT _FUNC_('11.12.13', '.', 3); > 13 > > SELECT _FUNC_(NULL, '.', 3); > NULL > > SELECT _FUNC_('11.12.13', '', 1); > '11.12.13' > {code} -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-38063) Support SQL split_part function
[ https://issues.apache.org/jira/browse/SPARK-38063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Wang updated SPARK-38063: - Description: `split_part()` is a commonly supported function in other systems such as Postgres. The Spark equivalent is `element_at(split(arg, delim), part)` h5. Function Specification h6. Syntax {code:java} split_part(str, delimiter, partNum) {code} h6. Arguments {code:java} str: string type delimiter: string type partNum: Integer type {code} h6. Note {code:java} 1. This function splits `str` by `delimiter` and returns the requested part of the split (1-based). 2. If any input parameter is NULL, returns NULL. 3. If the index is out of range of split parts, returns null. 4. If `partNum` is 0, throws an error. 5. If `partNum` is negative, the parts are counted backward from the end of the string. 6. When delimiter is empty, str is considered not split, so there is just 1 split part. {code} h6.Examples: {code:java} > SELECT _FUNC_('11.12.13', '.', 3); 13 > SELECT _FUNC_(NULL, '.', 3); NULL > SELECT _FUNC_('11.12.13', '', 1); '11.12.13' {code} was: `split_part()` is a commonly supported function in other systems such as Postgres. The Spark equivalent is `element_at(split(arg, delim), part)` h5. Function Specification h6. Syntax {code:java} `split_part(str, delimiter, partNum)` {code} h6. Arguments {code:java} str: string type delimiter: string type partNum: Integer type {code} h6. Note {code:java} 1. This function splits `str` by `delimiter` and returns the requested part of the split (1-based). 2. If any input parameter is NULL, returns NULL. 3. If the index is out of range of split parts, returns null. 4. If `partNum` is 0, throws an error. 5. If `partNum` is negative, the parts are counted backward from the end of the string. 6. When delimiter is empty, str is considered not split, so there is just 1 split part. {code} h6.Examples: {code:java} > SELECT _FUNC_('11.12.13', '.', 3); 13 > SELECT _FUNC_(NULL, '.', 3); NULL > SELECT _FUNC_('11.12.13', '', 1); '11.12.13' {code} > Support SQL split_part function > --- > > Key: SPARK-38063 > URL: https://issues.apache.org/jira/browse/SPARK-38063 > Project: Spark > Issue Type: Task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Rui Wang >Priority: Major > > `split_part()` is a commonly supported function in other systems such as Postgres. The Spark equivalent is `element_at(split(arg, delim), part)` > h5. Function Specification > h6. Syntax > {code:java} > split_part(str, delimiter, partNum) > {code} > h6. Arguments > {code:java} > str: string type > delimiter: string type > partNum: Integer type > {code} > h6. Note > {code:java} > 1. This function splits `str` by `delimiter` and returns the requested part of the split (1-based). > 2. If any input parameter is NULL, returns NULL. > 3. If the index is out of range of split parts, returns null. > 4. If `partNum` is 0, throws an error. > 5. If `partNum` is negative, the parts are counted backward from the end of the string. > 6. When delimiter is empty, str is considered not split, so there is just 1 split part. > {code} > h6.Examples: > {code:java} > > SELECT _FUNC_('11.12.13', '.', 3); > 13 > > SELECT _FUNC_(NULL, '.', 3); > NULL > > SELECT _FUNC_('11.12.13', '', 1); > '11.12.13' > {code} -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-38063) Support SQL split_part function
[ https://issues.apache.org/jira/browse/SPARK-38063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Wang updated SPARK-38063: - Description: `split_part()` is a commonly supported function in other systems such as Postgres. The Spark equivalent is `element_at(split(arg, delim), part)` h5. Function Specification h6. Syntax {code:java} split_part(str, delimiter, partNum) {code} h6. Arguments {code:java} str: string type delimiter: string type partNum: Integer type {code} h6. Note {code:java} 1. This function splits `str` by `delimiter` and returns the requested part of the split (1-based). 2. If any input parameter is NULL, returns NULL. 3. If the index is out of range of split parts, returns null. 4. If `partNum` is 0, throws an error. 5. If `partNum` is negative, the parts are counted backward from the end of the string. 6. When delimiter is empty, str is considered not split, so there is just 1 split part. {code} > SELECT _FUNC_('11.12.13', '.', 3); 13 > SELECT _FUNC_(NULL, '.', 3); NULL > SELECT _FUNC_('11.12.13', '', 1); '11.12.13' {code} was: `split_part()` is a commonly supported function in other systems such as Postgres. The Spark equivalent is `element_at(split(arg, delim), part)` h5. Function Specification h6. Syntax {code:java} split_part(str, delimiter, partNum) {code} h6. Arguments {code:java} str: string type delimiter: string type partNum: Integer type {code} h6. Note {code:java} 1. This function splits `str` by `delimiter` and returns the requested part of the split (1-based). 2. If any input parameter is NULL, returns NULL. 3. If the index is out of range of split parts, returns null. 4. If `partNum` is 0, throws an error. 5. If `partNum` is negative, the parts are counted backward from the end of the string. 6. When delimiter is empty, str is considered not split, so there is just 1 split part. {code} h6.Examples: {code:java} > SELECT _FUNC_('11.12.13', '.', 3); 13 > SELECT _FUNC_(NULL, '.', 3); NULL > SELECT _FUNC_('11.12.13', '', 1); '11.12.13' {code} > Support SQL split_part function > --- > > Key: SPARK-38063 > URL: https://issues.apache.org/jira/browse/SPARK-38063 > Project: Spark > Issue Type: Task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Rui Wang >Priority: Major > > `split_part()` is a commonly supported function in other systems such as Postgres. The Spark equivalent is `element_at(split(arg, delim), part)` > h5. Function Specification > h6. Syntax > {code:java} > split_part(str, delimiter, partNum) > {code} > h6. Arguments > {code:java} > str: string type > delimiter: string type > partNum: Integer type > {code} > h6. Note > {code:java} > 1. This function splits `str` by `delimiter` and returns the requested part of the split (1-based). > 2. If any input parameter is NULL, returns NULL. > 3. If the index is out of range of split parts, returns null. > 4. If `partNum` is 0, throws an error. > 5. If `partNum` is negative, the parts are counted backward from the end of the string. > 6. When delimiter is empty, str is considered not split, so there is just 1 split part. > {code} > > SELECT _FUNC_('11.12.13', '.', 3); > 13 > > SELECT _FUNC_(NULL, '.', 3); > NULL > > SELECT _FUNC_('11.12.13', '', 1); > '11.12.13' > {code} -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-38063) Support SQL split_part function
[ https://issues.apache.org/jira/browse/SPARK-38063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Wang updated SPARK-38063: - Description: `split_part()` is a commonly supported function in other systems such as Postgres. The Spark equivalent is `element_at(split(arg, delim), part)` h5. Function Specification h6. Syntax {code:java} `split_part(str, delimiter, partNum)` {code} h6. Arguments {code:java} str: string type delimiter: string type partNum: Integer type {code} h6. Note {code:java} 1. This function splits `str` by `delimiter` and returns the requested part of the split (1-based). 2. If any input parameter is NULL, returns NULL. 3. If the index is out of range of split parts, returns null. 4. If `partNum` is 0, throws an error. 5. If `partNum` is negative, the parts are counted backward from the end of the string. 6. When delimiter is empty, str is considered not split, so there is just 1 split part. {code} h6.Examples: {code:java} > SELECT _FUNC_('11.12.13', '.', 3); 13 > SELECT _FUNC_(NULL, '.', 3); NULL > SELECT _FUNC_('11.12.13', '', 1); '11.12.13' {code} was: `split_part()` is a commonly supported function in other systems such as Postgres. The Spark equivalent is `element_at(split(arg, delim), part)` h5. Function Specification {code:java} `split_part(str, delimiter, partNum)` str: string type delimiter: string type partNum: Integer type 1. This function splits `str` by `delimiter` and returns the requested part of the split (1-based). 2. If any input parameter is NULL, returns NULL. 3. If the index is out of range of split parts, returns null. 4. If `partNum` is 0, throws an error. 5. If `partNum` is negative, the parts are counted backward from the end of the string. 6. When delimiter is empty, str is considered not split, so there is just 1 split part. Examples: ``` > SELECT _FUNC_('11.12.13', '.', 3); 13 > SELECT _FUNC_(NULL, '.', 3); NULL > SELECT _FUNC_('11.12.13', '', 1); '11.12.13' ``` {code} > Support SQL split_part function > --- > > Key: SPARK-38063 > URL: https://issues.apache.org/jira/browse/SPARK-38063 > Project: Spark > Issue Type: Task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Rui Wang >Priority: Major > > `split_part()` is a commonly supported function in other systems such as Postgres. The Spark equivalent is `element_at(split(arg, delim), part)` > h5. Function Specification > h6. Syntax > {code:java} > `split_part(str, delimiter, partNum)` > {code} > h6. Arguments > {code:java} > str: string type > delimiter: string type > partNum: Integer type > {code} > h6. Note > {code:java} > 1. This function splits `str` by `delimiter` and returns the requested part of the split (1-based). > 2. If any input parameter is NULL, returns NULL. > 3. If the index is out of range of split parts, returns null. > 4. If `partNum` is 0, throws an error. > 5. If `partNum` is negative, the parts are counted backward from the end of the string. > 6. When delimiter is empty, str is considered not split, so there is just 1 split part. > {code} > h6.Examples: > {code:java} > > SELECT _FUNC_('11.12.13', '.', 3); > 13 > > SELECT _FUNC_(NULL, '.', 3); > NULL > > SELECT _FUNC_('11.12.13', '', 1); > '11.12.13' > {code} -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org