[jira] [Created] (SPARK-41110) Implement `DataFrame.sparkSession` in Python client

2022-11-10 Thread Rui Wang (Jira)
Rui Wang created SPARK-41110:


 Summary: Implement `DataFrame.sparkSession` in Python client
 Key: SPARK-41110
 URL: https://issues.apache.org/jira/browse/SPARK-41110
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang
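
A minimal sketch of what this could look like, assuming the Connect DataFrame 
keeps a reference to the session it was created from (the `_session` attribute 
name is hypothetical; `RemoteSparkSession` is the Connect client session type 
referenced elsewhere in this series):

```
class DataFrame:
    def __init__(self, session: "RemoteSparkSession"):
        self._session = session

    @property
    def sparkSession(self) -> "RemoteSparkSession":
        # Mirror pyspark.sql.DataFrame.sparkSession: return the session
        # this DataFrame is bound to.
        return self._session
```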









[jira] [Created] (SPARK-41105) Adopt `optional` keyword from proto3, which offers `hasXXX` to differentiate whether a field is set or unset

2022-11-10 Thread Rui Wang (Jira)
Rui Wang created SPARK-41105:


 Summary: Adopt `optional` keyword from proto3, which offers 
`hasXXX` to differentiate whether a field is set or unset 
 Key: SPARK-41105
 URL: https://issues.apache.org/jira/browse/SPARK-41105
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang
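
For context, proto3's `optional` keyword generates explicit presence accessors 
(`hasXXX` in Java; `HasField` in Python). A minimal sketch of what this unlocks, 
using a hypothetical `Sample` message for illustration:

```
# Hypothetical message, not an actual Spark Connect proto:
#   message Sample { optional double fraction = 1; }
msg = Sample()
print(msg.HasField("fraction"))  # False: the field was never set
msg.fraction = 0.0
print(msg.HasField("fraction"))  # True: set, even though 0.0 is the default value
```

Without `optional`, a scalar field holding its default value cannot be told 
apart from one that was never set.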









[jira] [Commented] (SPARK-41104) Can insert NULL into Hive table with NOT NULL column

2022-11-10 Thread Rui Wang (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-41104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17631946#comment-17631946
 ] 

Rui Wang commented on SPARK-41104:
--

It looks like Hive has only enforced `NOT NULL` since Hive 3.0.0: 
https://issues.apache.org/jira/browse/HIVE-16575

> Can insert NULL into Hive table with NOT NULL column
> --
>
> Key: SPARK-41104
> URL: https://issues.apache.org/jira/browse/SPARK-41104
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.4.0
>Reporter: Serge Rielau
>Priority: Critical
>
> spark-sql> CREATE TABLE tttd(c1 int not null);
> 22/11/10 14:04:28 WARN ResolveSessionCatalog: A Hive serde table will be 
> created as there is no table provider specified. You can set 
> spark.sql.legacy.createHiveTableByDefault to false so that native data source 
> table will be created instead.
> 22/11/10 14:04:28 WARN HiveMetaStore: Location: 
> file:/Users/serge.rielau/spark/spark-warehouse/tttd specified for 
> non-external table:tttd
> Time taken: 0.078 seconds
> spark-sql> INSERT INTO tttd VALUES(null);
> Time taken: 0.36 seconds
> spark-sql> SELECT * FROM tttd;
> NULL
> Time taken: 0.074 seconds, Fetched 1 row(s)
> spark-sql> 
> Does Hive not support NOT NULL? That's fine, but then we should fail on 
> CREATE TABLE.
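
As the ResolveSessionCatalog warning in the repro suggests, the Hive serde path 
can be sidestepped by creating a native data source table instead; a sketch of 
that variant (whether the constraint is then enforced, or rejected at CREATE 
TABLE time, is exactly what this ticket asks to pin down):

```
# Equivalent to setting spark.sql.legacy.createHiveTableByDefault=false
# for a CREATE TABLE without a USING clause.
spark.sql("CREATE TABLE tttd2(c1 INT NOT NULL) USING parquet")
spark.sql("INSERT INTO tttd2 VALUES (NULL)")  # per the report, this should fail
```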






[jira] [Created] (SPARK-41103) Document how to add a new proto field to messages

2022-11-10 Thread Rui Wang (Jira)
Rui Wang created SPARK-41103:


 Summary: Document how to add a new proto field to messages
 Key: SPARK-41103
 URL: https://issues.apache.org/jira/browse/SPARK-41103
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Created] (SPARK-41102) Merge SparkConnectPlanner and SparkConnectCommandPlanner

2022-11-10 Thread Rui Wang (Jira)
Rui Wang created SPARK-41102:


 Summary: Merge SparkConnectPlanner and SparkConnectCommandPlanner
 Key: SPARK-41102
 URL: https://issues.apache.org/jira/browse/SPARK-41102
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Updated] (SPARK-41090) Enhance Dataset.createTempView testing coverage for db_name.view_name

2022-11-09 Thread Rui Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang updated SPARK-41090:
-
Description: 
Add the following test case to `DatasetSuite`:

dataset.createTempView("test_db.tempView")
spark.catalog.tableExists("test_db.tempView")

> Enhance Dataset.createTempView testing coverage for db_name.view_name
> -
>
> Key: SPARK-41090
> URL: https://issues.apache.org/jira/browse/SPARK-41090
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Rui Wang
>Priority: Major
>
> Add the following test case to `DatasetSuite`:
> dataset.createTempView("test_db.tempView")
> spark.catalog.tableExists("test_db.tempView")
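
The ticket targets the Scala `DatasetSuite`; a rough PySpark rendering of the 
same check, for illustration (whether the dotted `db_name.view_name` form is 
accepted for a temp view is precisely what the added coverage should exercise):

```
df = spark.range(1)
df.createTempView("test_db.tempView")
print(spark.catalog.tableExists("test_db.tempView"))
```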






[jira] [Created] (SPARK-41090) Enhance Dataset.createTempView testing coverage for db_name.view_name

2022-11-09 Thread Rui Wang (Jira)
Rui Wang created SPARK-41090:


 Summary: Enhance Dataset.createTempView testing coverage for 
db_name.view_name
 Key: SPARK-41090
 URL: https://issues.apache.org/jira/browse/SPARK-41090
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Created] (SPARK-41086) Consolidate SecondArgumentXXX error to INVALID_PARAMETER_VALUE

2022-11-09 Thread Rui Wang (Jira)
Rui Wang created SPARK-41086:


 Summary: Consolidate SecondArgumentXXX error to 
INVALID_PARAMETER_VALUE
 Key: SPARK-41086
 URL: https://issues.apache.org/jira/browse/SPARK-41086
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Updated] (SPARK-41086) Consolidate SecondArgumentXXX error to INVALID_PARAMETER_VALUE

2022-11-09 Thread Rui Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang updated SPARK-41086:
-
Description: 
SECOND_FUNCTION_ARGUMENT_NOT_INTEGER
_LEGACY_ERROR_TEMP_1104

> Consolidate SecondArgumentXXX error to INVALID_PARAMETER_VALUE
> --
>
> Key: SPARK-41086
> URL: https://issues.apache.org/jira/browse/SPARK-41086
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Rui Wang
>Priority: Major
>
> SECOND_FUNCTION_ARGUMENT_NOT_INTEGER
> _LEGACY_ERROR_TEMP_1104






[jira] [Created] (SPARK-41078) DataFrame `withColumnsRenamed` can be implemented through `RenameColumns` proto

2022-11-09 Thread Rui Wang (Jira)
Rui Wang created SPARK-41078:


 Summary: DataFrame `withColumnsRenamed` can be implemented through 
`RenameColumns` proto
 Key: SPARK-41078
 URL: https://issues.apache.org/jira/browse/SPARK-41078
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang
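
For reference, the client-side call this relation would back, following the 
PySpark `withColumnsRenamed` signature (a dict mapping existing column names to 
new ones); `df` here is a hypothetical DataFrame:

```
# One RenameColumns relation can carry the whole map, instead of a chain
# of single-column withColumnRenamed plan nodes.
df2 = df.withColumnsRenamed({"age": "age2", "name": "full_name"})
```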









[jira] [Created] (SPARK-41077) Rename `ColumnRef` to `Column` in Python client implementation

2022-11-09 Thread Rui Wang (Jira)
Rui Wang created SPARK-41077:


 Summary: Rename `ColumnRef` to `Column` in Python client 
implementation 
 Key: SPARK-41077
 URL: https://issues.apache.org/jira/browse/SPARK-41077
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Created] (SPARK-41061) Support SelectExpr, which applies a projection from string expressions, in Connect DSL

2022-11-08 Thread Rui Wang (Jira)
Rui Wang created SPARK-41061:


 Summary: Support SelectExpr, which applies a projection from string 
expressions, in Connect DSL
 Key: SPARK-41061
 URL: https://issues.apache.org/jira/browse/SPARK-41061
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Comment Edited] (SPARK-41057) Support other data type conversion in the DataTypeProtoConverter

2022-11-08 Thread Rui Wang (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-41057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17630706#comment-17630706
 ] 

Rui Wang edited comment on SPARK-41057 at 11/9/22 2:55 AM:
---

[~dengziming]
Are you interested in this JIRA?


was (Author: amaliujia):
@dengziming

Are you interested in this JIRA?

> Support other data type conversion in the DataTypeProtoConverter
> 
>
> Key: SPARK-41057
> URL: https://issues.apache.org/jira/browse/SPARK-41057
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Rui Wang
>Priority: Major
>
> In 
> https://github.com/apache/spark/blob/master/connector/connect/src/main/scala/org/apache/spark/sql/connect/planner/DataTypeProtoConverter.scala#L34
>  we only support INT, STRING and STRUCT type conversion between Catalyst 
> and the Connect proto.
> We should be able to support all the types defined by 
> https://github.com/apache/spark/blob/master/connector/connect/src/main/protobuf/spark/connect/types.proto






[jira] [Commented] (SPARK-41057) Support other data type conversion in the DataTypeProtoConverter

2022-11-08 Thread Rui Wang (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-41057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17630706#comment-17630706
 ] 

Rui Wang commented on SPARK-41057:
--

@dengziming

Are you interested in this JIRA?

> Support other data type conversion in the DataTypeProtoConverter
> 
>
> Key: SPARK-41057
> URL: https://issues.apache.org/jira/browse/SPARK-41057
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Rui Wang
>Priority: Major
>
> In 
> https://github.com/apache/spark/blob/master/connector/connect/src/main/scala/org/apache/spark/sql/connect/planner/DataTypeProtoConverter.scala#L34
>  we only support INT, STRING and STRUCT type conversion between Catalyst 
> and the Connect proto.
> We should be able to support all the types defined by 
> https://github.com/apache/spark/blob/master/connector/connect/src/main/protobuf/spark/connect/types.proto






[jira] [Updated] (SPARK-41057) Support other data type conversion in the DataTypeProtoConverter

2022-11-08 Thread Rui Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang updated SPARK-41057:
-
Description: 
In 
https://github.com/apache/spark/blob/master/connector/connect/src/main/scala/org/apache/spark/sql/connect/planner/DataTypeProtoConverter.scala#L34
 we only support INT, STRING and STRUCT type conversion between Catalyst 
and the Connect proto.

We should be able to support all the types defined by 
https://github.com/apache/spark/blob/master/connector/connect/src/main/protobuf/spark/connect/types.proto

> Support other data type conversion in the DataTypeProtoConverter
> 
>
> Key: SPARK-41057
> URL: https://issues.apache.org/jira/browse/SPARK-41057
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Rui Wang
>Priority: Major
>
> In 
> https://github.com/apache/spark/blob/master/connector/connect/src/main/scala/org/apache/spark/sql/connect/planner/DataTypeProtoConverter.scala#L34
>  we only support INT, STRING and STRUCT type conversion between Catalyst 
> and the Connect proto.
> We should be able to support all the types defined by 
> https://github.com/apache/spark/blob/master/connector/connect/src/main/protobuf/spark/connect/types.proto
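
A schematic sketch of the direction, written in Python for brevity (the real 
DataTypeProtoConverter is Scala, and all names below are illustrative): replace 
the three hard-coded cases with a total two-way mapping over types.proto:

```
# Illustrative only; one entry per type defined in spark/connect/types.proto.
CATALYST_TO_PROTO = {
    "integer": "I32",
    "string": "STRING",
    "boolean": "BOOL",
    # ... remaining types from types.proto ...
}
PROTO_TO_CATALYST = {v: k for k, v in CATALYST_TO_PROTO.items()}

def to_proto_type(catalyst_name: str) -> str:
    if catalyst_name not in CATALYST_TO_PROTO:
        raise ValueError(f"Unsupported Catalyst type: {catalyst_name}")
    return CATALYST_TO_PROTO[catalyst_name]
```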






[jira] [Created] (SPARK-41057) Support other data type conversion in the DataTypeProtoConverter

2022-11-08 Thread Rui Wang (Jira)
Rui Wang created SPARK-41057:


 Summary: Support other data type conversion in the 
DataTypeProtoConverter
 Key: SPARK-41057
 URL: https://issues.apache.org/jira/browse/SPARK-41057
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Commented] (SPARK-41046) Support CreateView in Connect DSL

2022-11-08 Thread Rui Wang (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-41046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17630630#comment-17630630
 ] 

Rui Wang commented on SPARK-41046:
--

[~dengziming]

Ah, I already have a working version locally. Sorry about that.

> Support CreateView in Connect DSL
> -
>
> Key: SPARK-41046
> URL: https://issues.apache.org/jira/browse/SPARK-41046
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Rui Wang
>Priority: Major
>







[jira] [Created] (SPARK-41046) Support CreateView in Connect DSL

2022-11-07 Thread Rui Wang (Jira)
Rui Wang created SPARK-41046:


 Summary: Support CreateView in Connect DSL
 Key: SPARK-41046
 URL: https://issues.apache.org/jira/browse/SPARK-41046
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Created] (SPARK-41036) `columns` API should use `schema` API to avoid data fetching

2022-11-07 Thread Rui Wang (Jira)
Rui Wang created SPARK-41036:


 Summary: `columns` API should use `schema` API to avoid data 
fetching
 Key: SPARK-41036
 URL: https://issues.apache.org/jira/browse/SPARK-41036
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang
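
A minimal sketch of the intent, with hypothetical attribute names: derive the 
column list from the analyzed schema (a metadata-only round trip) rather than 
fetching any rows:

```
from typing import List

class DataFrame:
    @property
    def columns(self) -> List[str]:
        # self.schema comes from plan analysis, so listing column names
        # triggers no query execution and no data transfer.
        return [field.name for field in self.schema.fields]
```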









[jira] [Created] (SPARK-41034) Connect DataFrame should require RemoteSparkSession

2022-11-07 Thread Rui Wang (Jira)
Rui Wang created SPARK-41034:


 Summary: Connect DataFrame should require RemoteSparkSession
 Key: SPARK-41034
 URL: https://issues.apache.org/jira/browse/SPARK-41034
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Created] (SPARK-41033) RemoteSparkSession should only accept one `user_id`

2022-11-07 Thread Rui Wang (Jira)
Rui Wang created SPARK-41033:


 Summary: RemoteSparkSession should only accept one `user_id`
 Key: SPARK-41033
 URL: https://issues.apache.org/jira/browse/SPARK-41033
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Created] (SPARK-41026) Support Repartition in Connect DSL

2022-11-06 Thread Rui Wang (Jira)
Rui Wang created SPARK-41026:


 Summary: Support Repartition in Connect DSL
 Key: SPARK-41026
 URL: https://issues.apache.org/jira/browse/SPARK-41026
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Created] (SPARK-41010) Complete Support for Except and Intersect in Python client

2022-11-03 Thread Rui Wang (Jira)
Rui Wang created SPARK-41010:


 Summary: Complete Support for Except and Intersect in Python client
 Key: SPARK-41010
 URL: https://issues.apache.org/jira/browse/SPARK-41010
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Updated] (SPARK-41002) Compatible `take`, `head` and `first` API in Python client

2022-11-03 Thread Rui Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang updated SPARK-41002:
-
Summary: Compatible `take`, `head` and `first` API in Python client   (was: 
Compatible `take` and `head` API in Python client )

> Compatible `take`, `head` and `first` API in Python client 
> ---
>
> Key: SPARK-41002
> URL: https://issues.apache.org/jira/browse/SPARK-41002
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Rui Wang
>Priority: Major
>
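
For reference, the PySpark semantics these methods should match, sketched as 
hypothetical method bodies on the Connect DataFrame:

```
def take(self, num):
    # take(n) is limit(n) followed by collect()
    return self.limit(num).collect()

def head(self, n=None):
    # head() returns a single Row (or None); head(n) returns a list of Rows
    if n is None:
        rows = self.take(1)
        return rows[0] if rows else None
    return self.take(n)

def first(self):
    # first() is an alias for head()
    return self.head()
```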







[jira] [Created] (SPARK-41002) Compatible `take` and `head` API in Python client

2022-11-02 Thread Rui Wang (Jira)
Rui Wang created SPARK-41002:


 Summary: Compatible `take` and `head` API in Python client 
 Key: SPARK-41002
 URL: https://issues.apache.org/jira/browse/SPARK-41002
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Created] (SPARK-40993) Migrate markdown style README to python/docs/development/testing.rst

2022-11-01 Thread Rui Wang (Jira)
Rui Wang created SPARK-40993:


 Summary: Migrate markdown style README to 
python/docs/development/testing.rst
 Key: SPARK-40993
 URL: https://issues.apache.org/jira/browse/SPARK-40993
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Created] (SPARK-40992) Support toDF(columnNames) in Connect DSL

2022-11-01 Thread Rui Wang (Jira)
Rui Wang created SPARK-40992:


 Summary: Support toDF(columnNames) in Connect DSL
 Key: SPARK-40992
 URL: https://issues.apache.org/jira/browse/SPARK-40992
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Created] (SPARK-40989) Improve `session.sql` testing coverage in Python client

2022-11-01 Thread Rui Wang (Jira)
Rui Wang created SPARK-40989:


 Summary: Improve `session.sql` testing coverage in Python client
 Key: SPARK-40989
 URL: https://issues.apache.org/jira/browse/SPARK-40989
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Created] (SPARK-40981) Support session.range in Python client

2022-10-31 Thread Rui Wang (Jira)
Rui Wang created SPARK-40981:


 Summary: Support session.range in Python client
 Key: SPARK-40981
 URL: https://issues.apache.org/jira/browse/SPARK-40981
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Updated] (SPARK-40980) Support session.sql in Connect DSL

2022-10-31 Thread Rui Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-40980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang updated SPARK-40980:
-
Summary: Support session.sql in Connect DSL  (was: Improve test coverage 
for session.sql)

> Support session.sql in Connect DSL
> --
>
> Key: SPARK-40980
> URL: https://issues.apache.org/jira/browse/SPARK-40980
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Rui Wang
>Priority: Major
>







[jira] [Created] (SPARK-40980) Improve test coverage for session.sql

2022-10-31 Thread Rui Wang (Jira)
Rui Wang created SPARK-40980:


 Summary: Improve test coverage for session.sql
 Key: SPARK-40980
 URL: https://issues.apache.org/jira/browse/SPARK-40980
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Commented] (SPARK-40951) pyspark-connect tests should be skipped if pandas doesn't exist

2022-10-31 Thread Rui Wang (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-40951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17626878#comment-17626878
 ] 

Rui Wang commented on SPARK-40951:
--

[~dongjoon] Is this JIRA fully resolved already? If so, can we close it now? 

> pyspark-connect tests should be skipped if pandas doesn't exist
> ---
>
> Key: SPARK-40951
> URL: https://issues.apache.org/jira/browse/SPARK-40951
> Project: Spark
>  Issue Type: Sub-task
>  Components: PySpark, Tests
>Affects Versions: 3.4.0
>Reporter: Dongjoon Hyun
>Priority: Minor
>







[jira] [Created] (SPARK-40977) Complete Support for Union in Python client

2022-10-31 Thread Rui Wang (Jira)
Rui Wang created SPARK-40977:


 Summary: Complete Support for Union in Python client
 Key: SPARK-40977
 URL: https://issues.apache.org/jira/browse/SPARK-40977
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Created] (SPARK-40971) Import more from the connect proto package to avoid the `proto.` prefix in Connect DSL

2022-10-31 Thread Rui Wang (Jira)
Rui Wang created SPARK-40971:


 Summary: Import more from the connect proto package to avoid the 
`proto.` prefix in Connect DSL
 Key: SPARK-40971
 URL: https://issues.apache.org/jira/browse/SPARK-40971
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Updated] (SPARK-40970) Support List[ColumnRef] for Join's on argument.

2022-10-30 Thread Rui Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-40970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang updated SPARK-40970:
-
Description: Right now Join's `on` does not support a list of ColumnRef, e.g. 
[df.age == df2.age, df.name == df2.name]; we can improve the expression system 
to support it.

> Support List[ColumnRef] for Join's on argument.
> ---
>
> Key: SPARK-40970
> URL: https://issues.apache.org/jira/browse/SPARK-40970
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Rui Wang
>Priority: Major
>
> Right now Join's `on` does not support a list of ColumnRef, e.g. [df.age == df2.age, 
> df.name == df2.name]; we can improve the expression system to support it.
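
For reference, the call shape being asked for, matching existing PySpark `join` 
semantics where a list of boolean Column expressions is combined with AND (`df` 
and `df2` are hypothetical DataFrames):

```
# Desired: a list of column expressions...
joined = df.join(df2, on=[df.age == df2.age, df.name == df2.name])
# ...equivalent to the single ANDed expression:
joined = df.join(df2, on=(df.age == df2.age) & (df.name == df2.name))
```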






[jira] [Created] (SPARK-40970) Support List[ColumnRef] for Join's on argument.

2022-10-30 Thread Rui Wang (Jira)
Rui Wang created SPARK-40970:


 Summary: Support List[ColumnRef] for Join's on argument.
 Key: SPARK-40970
 URL: https://issues.apache.org/jira/browse/SPARK-40970
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Created] (SPARK-40938) Support Alias for every Relation

2022-10-27 Thread Rui Wang (Jira)
Rui Wang created SPARK-40938:


 Summary: Support Alias for every Relation
 Key: SPARK-40938
 URL: https://issues.apache.org/jira/browse/SPARK-40938
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Created] (SPARK-40930) Support Collect() in Python client

2022-10-26 Thread Rui Wang (Jira)
Rui Wang created SPARK-40930:


 Summary: Support Collect() in Python client
 Key: SPARK-40930
 URL: https://issues.apache.org/jira/browse/SPARK-40930
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Created] (SPARK-40926) Refactor server side tests to only use DataFrame API

2022-10-26 Thread Rui Wang (Jira)
Rui Wang created SPARK-40926:


 Summary: Refactor server side tests to only use DataFrame API
 Key: SPARK-40926
 URL: https://issues.apache.org/jira/browse/SPARK-40926
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Created] (SPARK-40915) Improve `on` in Join in Python client

2022-10-25 Thread Rui Wang (Jira)
Rui Wang created SPARK-40915:


 Summary: Improve `on` in Join in Python client
 Key: SPARK-40915
 URL: https://issues.apache.org/jira/browse/SPARK-40915
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Updated] (SPARK-40914) Mark internal API to be private[connect]

2022-10-25 Thread Rui Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-40914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang updated SPARK-40914:
-
Summary: Mark internal API to be private[connect]  (was: Mark private API 
to be private[connect])

> Mark internal API to be private[connect]
> 
>
> Key: SPARK-40914
> URL: https://issues.apache.org/jira/browse/SPARK-40914
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Rui Wang
>Priority: Major
>







[jira] [Created] (SPARK-40914) Mark private API to be private[connect]

2022-10-25 Thread Rui Wang (Jira)
Rui Wang created SPARK-40914:


 Summary: Mark private API to be private[connect]
 Key: SPARK-40914
 URL: https://issues.apache.org/jira/browse/SPARK-40914
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Created] (SPARK-40883) Support Range in Connect proto

2022-10-22 Thread Rui Wang (Jira)
Rui Wang created SPARK-40883:


 Summary: Support Range in Connect proto
 Key: SPARK-40883
 URL: https://issues.apache.org/jira/browse/SPARK-40883
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Created] (SPARK-40879) Support Join UsingColumns in proto

2022-10-21 Thread Rui Wang (Jira)
Rui Wang created SPARK-40879:


 Summary: Support Join UsingColumns in proto
 Key: SPARK-40879
 URL: https://issues.apache.org/jira/browse/SPARK-40879
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Created] (SPARK-40875) Add .agg() to Connect DSL

2022-10-21 Thread Rui Wang (Jira)
Rui Wang created SPARK-40875:


 Summary: Add .agg() to Connect DSL
 Key: SPARK-40875
 URL: https://issues.apache.org/jira/browse/SPARK-40875
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Commented] (SPARK-40875) Add .agg() to Connect DSL

2022-10-21 Thread Rui Wang (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-40875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17622510#comment-17622510
 ] 

Rui Wang commented on SPARK-40875:
--

I am working on this.

> Add .agg() to Connect DSL
> -
>
> Key: SPARK-40875
> URL: https://issues.apache.org/jira/browse/SPARK-40875
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Rui Wang
>Priority: Major
>







[jira] [Updated] (SPARK-40839) [Python] Implement `DataFrame.sample`

2022-10-18 Thread Rui Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-40839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang updated SPARK-40839:
-
Summary: [Python] Implement `DataFrame.sample`  (was: Implement 
`DataFrame.sample`)

> [Python] Implement `DataFrame.sample`
> -
>
> Key: SPARK-40839
> URL: https://issues.apache.org/jira/browse/SPARK-40839
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect, PySpark
>Affects Versions: 3.4.0
>Reporter: Ruifeng Zheng
>Assignee: Ruifeng Zheng
>Priority: Major
>







[jira] [Created] (SPARK-40836) AnalyzeResult should use struct for schema

2022-10-18 Thread Rui Wang (Jira)
Rui Wang created SPARK-40836:


 Summary: AnalyzeResult should use struct for schema
 Key: SPARK-40836
 URL: https://issues.apache.org/jira/browse/SPARK-40836
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Created] (SPARK-40828) Drop Python test tables before and after unit tests

2022-10-17 Thread Rui Wang (Jira)
Rui Wang created SPARK-40828:


 Summary: Drop Python test tables before and after unit tests
 Key: SPARK-40828
 URL: https://issues.apache.org/jira/browse/SPARK-40828
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Created] (SPARK-40823) Connect Proto should carry unparsed identifiers

2022-10-17 Thread Rui Wang (Jira)
Rui Wang created SPARK-40823:


 Summary: Connect Proto should carry unparsed identifiers
 Key: SPARK-40823
 URL: https://issues.apache.org/jira/browse/SPARK-40823
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Created] (SPARK-40816) Python: rename LogicalPlan.collect to LogicalPlan.to_proto

2022-10-16 Thread Rui Wang (Jira)
Rui Wang created SPARK-40816:


 Summary: Python: rename LogicalPlan.collect to LogicalPlan.to_proto
 Key: SPARK-40816
 URL: https://issues.apache.org/jira/browse/SPARK-40816
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Created] (SPARK-40813) Add limit and offset to Connect DSL

2022-10-16 Thread Rui Wang (Jira)
Rui Wang created SPARK-40813:


 Summary: Add limit and offset to Connect DSL
 Key: SPARK-40813
 URL: https://issues.apache.org/jira/browse/SPARK-40813
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Created] (SPARK-40812) Add Deduplicate to Connect proto

2022-10-16 Thread Rui Wang (Jira)
Rui Wang created SPARK-40812:


 Summary: Add Deduplicate to Connect proto
 Key: SPARK-40812
 URL: https://issues.apache.org/jira/browse/SPARK-40812
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Updated] (SPARK-40809) Add as(alias: String) to connect DSL

2022-10-16 Thread Rui Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-40809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang updated SPARK-40809:
-
Summary: Add as(alias: String) to connect DSL  (was: Add as(alias) to 
connect DSL)

> Add as(alias: String) to connect DSL
> 
>
> Key: SPARK-40809
> URL: https://issues.apache.org/jira/browse/SPARK-40809
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Rui Wang
>Priority: Major
>







[jira] [Created] (SPARK-40809) Add as(alias) to connect DSL

2022-10-16 Thread Rui Wang (Jira)
Rui Wang created SPARK-40809:


 Summary: Add as(alias) to connect DSL
 Key: SPARK-40809
 URL: https://issues.apache.org/jira/browse/SPARK-40809
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Created] (SPARK-40780) Add WHERE to Connect proto and DSL

2022-10-12 Thread Rui Wang (Jira)
Rui Wang created SPARK-40780:


 Summary: Add WHERE to Connect proto and DSL
 Key: SPARK-40780
 URL: https://issues.apache.org/jira/browse/SPARK-40780
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Created] (SPARK-40774) Add Sample to proto and DSL

2022-10-12 Thread Rui Wang (Jira)
Rui Wang created SPARK-40774:


 Summary: Add Sample to proto and DSL
 Key: SPARK-40774
 URL: https://issues.apache.org/jira/browse/SPARK-40774
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Created] (SPARK-40743) StructType should contain a list of StructField and each field should have a name

2022-10-11 Thread Rui Wang (Jira)
Rui Wang created SPARK-40743:


 Summary: StructType should contain a list of StructField and each 
field should have a name
 Key: SPARK-40743
 URL: https://issues.apache.org/jira/browse/SPARK-40743
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Created] (SPARK-40717) Support Column Alias in connect DSL

2022-10-09 Thread Rui Wang (Jira)
Rui Wang created SPARK-40717:


 Summary: Support Column Alias in connect DSL
 Key: SPARK-40717
 URL: https://issues.apache.org/jira/browse/SPARK-40717
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Created] (SPARK-40713) Improve SET operation support in the proto and the server

2022-10-08 Thread Rui Wang (Jira)
Rui Wang created SPARK-40713:


 Summary: Improve SET operation support in the proto and the server
 Key: SPARK-40713
 URL: https://issues.apache.org/jira/browse/SPARK-40713
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Created] (SPARK-40707) Add groupby to connect DSL and test more than one grouping expression

2022-10-07 Thread Rui Wang (Jira)
Rui Wang created SPARK-40707:


 Summary: Add groupby to connect DSL and test more than one 
grouping expression
 Key: SPARK-40707
 URL: https://issues.apache.org/jira/browse/SPARK-40707
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Updated] (SPARK-40693) mypy complains accessing the variable defined in the class method

2022-10-06 Thread Rui Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-40693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang updated SPARK-40693:
-
Description: 
This is one example: 

For SparkConnectTestsPlanOnly, the unit tests access the mock remote session 
via `self.connect`; however, mypy complains with `error: 
"SparkConnectTestsPlanOnly" has no attribute "connect"  [attr-defined]`


  was:
This is one example: 

for SparkConnectTestsPlanOnly, those unit tests access the mock remote session 
by `self.connect`, how mypy complains in as `error: "SparkConnectTestsPlanOnly" 
has no attribute "connect"  [attr-defined]`



> mypy complains accessing the variable defined in the class method 
> --
>
> Key: SPARK-40693
> URL: https://issues.apache.org/jira/browse/SPARK-40693
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
> Fix For: 3.4.0
>
>
> This is one example: 
> For SparkConnectTestsPlanOnly, the unit tests access the mock remote 
> session via `self.connect`; however, mypy complains with `error: 
> "SparkConnectTestsPlanOnly" has no attribute "connect"  [attr-defined]`






[jira] [Updated] (SPARK-40693) mypy complains accessing the variable defined in the class method

2022-10-06 Thread Rui Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-40693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang updated SPARK-40693:
-
Description: 
This is one example: 

for SparkConnectTestsPlanOnly, those unit tests access the mock remote session 
by `self.connect`, how mypy complains in as `error: "SparkConnectTestsPlanOnly" 
has no attribute "connect"  [attr-defined]`


  was:
This is one example: 

for SparkConnectTestsPlanOnly, those unit tests access the remote session by 
`self.connect`, how mypy complains in as `error: "SparkConnectTestsPlanOnly" 
has no attribute "connect"  [attr-defined]`



> mypy complains accessing the variable defined in the class method 
> --
>
> Key: SPARK-40693
> URL: https://issues.apache.org/jira/browse/SPARK-40693
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Rui Wang
>Priority: Major
>
> This is one example: 
> for SparkConnectTestsPlanOnly, those unit tests access the mock remote 
> session by `self.connect`, how mypy complains in as `error: 
> "SparkConnectTestsPlanOnly" has no attribute "connect"  [attr-defined]`






[jira] [Updated] (SPARK-40693) mypy complains accessing the variable defined in the class method

2022-10-06 Thread Rui Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-40693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang updated SPARK-40693:
-
Description: 
This is one example: 

for SparkConnectTestsPlanOnly, those unit tests access the remote session by 
`self.connect`, how mypy complains in as `error: "SparkConnectTestsPlanOnly" 
has no attribute "connect"  [attr-defined]`


> mypy complains accessing the variable defined in the class method 
> --
>
> Key: SPARK-40693
> URL: https://issues.apache.org/jira/browse/SPARK-40693
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Rui Wang
>Priority: Major
>
> This is one example: 
> for SparkConnectTestsPlanOnly, those unit tests access the remote session by 
> `self.connect`, how mypy complains in as `error: "SparkConnectTestsPlanOnly" 
> has no attribute "connect"  [attr-defined]`






[jira] [Created] (SPARK-40693) mypy complains accessing the variable defined in the class method

2022-10-06 Thread Rui Wang (Jira)
Rui Wang created SPARK-40693:


 Summary: mypy complains accessing the variable defined in the 
class method 
 Key: SPARK-40693
 URL: https://issues.apache.org/jira/browse/SPARK-40693
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Created] (SPARK-40645) Throw exception for Collect() and recommend to use toPandas()

2022-10-03 Thread Rui Wang (Jira)
Rui Wang created SPARK-40645:


 Summary: Throw exception for Collect() and recommend to use 
toPandas()
 Key: SPARK-40645
 URL: https://issues.apache.org/jira/browse/SPARK-40645
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang


Currently, connect `Collect()` returns a Pandas DataFrame, which does not match 
the PySpark DataFrame API: 
https://github.com/apache/spark/blob/ceb8527413288b4d5c54d3afd76d00c9e26817a1/python/pyspark/sql/connect/data_frame.py#L227.

The underlying implementation has been generating a Pandas DataFrame anyway, so 
we can route that behavior to `toPandas()` and throw an exception for 
`Collect()`. 
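
A minimal sketch of the proposed behavior (method bodies are illustrative, not 
the actual client code):

```
def toPandas(self):
    # The execution path already materializes a Pandas DataFrame.
    return self._session.execute(self._plan.to_proto())

def collect(self):
    # Until collect() can return List[Row] as in PySpark, fail loudly and
    # point users at toPandas() instead of returning the wrong type.
    raise NotImplementedError("collect() is not supported yet; use toPandas().")
```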






[jira] [Created] (SPARK-40587) SELECT * shouldn't be empty project list in proto.

2022-09-27 Thread Rui Wang (Jira)
Rui Wang created SPARK-40587:


 Summary: SELECT * shouldn't be empty project list in proto.
 Key: SPARK-40587
 URL: https://issues.apache.org/jira/browse/SPARK-40587
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang


The current proto uses an empty project list for `SELECT *`. However, this 
encoding is implicit, and it is hard to differentiate `not set` from `set but 
empty`. For longer-term proto compatibility, we should always use explicit 
fields to pass information through.
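
The ambiguity is inherent to proto3 repeated fields, which carry no presence 
information; a sketch using illustrative generated-class names:

```
# "Never populated" and "explicitly set but empty" produce identical bytes:
p1 = Project()                # SELECT * encoded implicitly as "no expressions"
p2 = Project(expressions=[])  # caller explicitly chose an empty projection
assert p1.SerializeToString() == p2.SerializeToString()  # indistinguishable
```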






[jira] [Created] (SPARK-40586) Decouple plan transformation and validation on server side

2022-09-27 Thread Rui Wang (Jira)
Rui Wang created SPARK-40586:


 Summary: Decouple plan transformation and validation on server 
side 
 Key: SPARK-40586
 URL: https://issues.apache.org/jira/browse/SPARK-40586
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang


Spark Connect can, from some perspectives, be thought of as replacing the SQL 
parser: it generates a parsed (but unresolved) plan, which is then passed to the 
analyzer. This means that Connect should also validate the proto, as there are 
many invalid-parse cases that the analyzer does not expect to see; these could 
cause problems if Connect only passed the proto through (translated, of course) 
to the analyzer.


Meanwhile, it is a good idea to decouple validation and transformation into two 
stages:
stage 1: proto validation. For example, validate whether the necessary fields 
are populated.
stage 2: transformation, which converts the proto to a plan under the assumption 
that the proto is a valid parsed version of the plan.
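
A schematic of the two stages (the server side is Scala; the names here are 
illustrative Python):

```
def handle(rel):
    validate(rel)           # stage 1: reject malformed protos up front
    return transform(rel)   # stage 2: assume a valid "parsed" plan and convert it

def validate(rel):
    # e.g. fail fast when a required field was never populated
    if not rel.HasField("read"):
        raise ValueError("Relation is missing its input")
```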






[jira] [Created] (SPARK-40296) Error Class for DISTINCT function not found

2022-08-31 Thread Rui Wang (Jira)
Rui Wang created SPARK-40296:


 Summary: Error Class for DISTINCT function not found
 Key: SPARK-40296
 URL: https://issues.apache.org/jira/browse/SPARK-40296
 Project: Spark
  Issue Type: Task
  Components: SQL
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Created] (SPARK-40055) listCatalogs should also return spark_catalog even when the spark_catalog implementation is defaultSessionCatalog

2022-08-11 Thread Rui Wang (Jira)
Rui Wang created SPARK-40055:


 Summary: listCatalogs should also return spark_catalog even when the 
spark_catalog implementation is defaultSessionCatalog
 Key: SPARK-40055
 URL: https://issues.apache.org/jira/browse/SPARK-40055
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Updated] (SPARK-39810) Catalog.tableExists should handle nested namespace

2022-07-20 Thread Rui Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-39810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang updated SPARK-39810:
-
Summary: Catalog.tableExists should handle nested namespace  (was: 
tableExists can reuse getTable code)

> Catalog.tableExists should handle nested namespace
> --
>
> Key: SPARK-39810
> URL: https://issues.apache.org/jira/browse/SPARK-39810
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
> Fix For: 3.4.0
>
>







[jira] [Commented] (SPARK-39828) Catalog.listTables() should respect currentCatalog

2022-07-20 Thread Rui Wang (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-39828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17569140#comment-17569140
 ] 

Rui Wang commented on SPARK-39828:
--

However, it seems that temp tables will not be checked.

> Catalog.listTables() should respect currentCatalog
> --
>
> Key: SPARK-39828
> URL: https://issues.apache.org/jira/browse/SPARK-39828
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Rui Wang
>Priority: Major
>







[jira] [Created] (SPARK-39828) Catalog.listTables() should respect currentCatalog

2022-07-20 Thread Rui Wang (Jira)
Rui Wang created SPARK-39828:


 Summary: Catalog.listTables() should respect currentCatalog
 Key: SPARK-39828
 URL: https://issues.apache.org/jira/browse/SPARK-39828
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Created] (SPARK-39810) tableExists can reuse getTable code

2022-07-18 Thread Rui Wang (Jira)
Rui Wang created SPARK-39810:


 Summary: tableExists can reuse getTable code
 Key: SPARK-39810
 URL: https://issues.apache.org/jira/browse/SPARK-39810
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Created] (SPARK-39700) Deprecate API that has parameters (DBName, tableName/FunctionName)

2022-07-06 Thread Rui Wang (Jira)
Rui Wang created SPARK-39700:


 Summary: Deprecate API that has parameters (DBName, 
tableName/FunctionName)
 Key: SPARK-39700
 URL: https://issues.apache.org/jira/browse/SPARK-39700
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Created] (SPARK-39583) Make RefreshTable compatible with 3-layer namespace

2022-06-24 Thread Rui Wang (Jira)
Rui Wang created SPARK-39583:


 Summary: Make RefreshTable compatible with 3-layer namespace
 Key: SPARK-39583
 URL: https://issues.apache.org/jira/browse/SPARK-39583
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Updated] (SPARK-39548) CreateView Command with a window clause query hit a wrong window definition not found issue

2022-06-21 Thread Rui Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-39548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang updated SPARK-39548:
-
Summary: CreateView Command with a window clause query hit a wrong window 
definition not found issue  (was: CreateView Command with a window clause query 
hit a wrong window definition not found issue.)

> CreateView Command with a window clause query hit a wrong window definition 
> not found issue
> ---
>
> Key: SPARK-39548
> URL: https://issues.apache.org/jira/browse/SPARK-39548
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Rui Wang
>Priority: Major
>
> This query hits a "window definition w2 not found" error in the 
> `WindowSubstitute` rule; however, this is a bug, since w2 is defined in the query.
> ```
> create or replace temporary view test_temp_view as
> with step_1 as (
> select * , min(a) over w2 as min_a_over_w2 from (select 1 as a, 2 as b, 3 as 
> c) window w2 as (partition by b order by c)) , step_2 as
> (
> select *, max(e) over w1 as max_a_over_w1
> from (select 1 as e, 2 as f, 3 as g)
> join step_1 on true
> window w1 as (partition by f order by g)
> )
> select *
> from step_2
> ```
> We can also move the unresolved window expression check from the 
> `WindowSubstitute` rule to the `CheckAnalysis` phase.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-39548) CreateView Command with a window clause query hit a wrong window definition not found issue.

2022-06-21 Thread Rui Wang (Jira)
Rui Wang created SPARK-39548:


 Summary: CreateView Command with a window clause query hit a wrong 
window definition not found issue.
 Key: SPARK-39548
 URL: https://issues.apache.org/jira/browse/SPARK-39548
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 3.3.0
Reporter: Rui Wang


The following query hits a "window definition w2 not found" error in the 
`WindowSubstitute` rule. This is a bug, since w2 is defined in the query.

```
create or replace temporary view test_temp_view as
with step_1 as (
select * , min(a) over w2 as min_a_over_w2 from (select 1 as a, 2 as b, 3 as c) 
window w2 as (partition by b order by c)) , step_2 as
(
select *, max(e) over w1 as max_a_over_w1
from (select 1 as e, 2 as f, 3 as g)
join step_1 on true
window w1 as (partition by f order by g)
)
select *
from step_2
```


We can also move the unresolved window expression check from the 
`WindowSubstitute` rule to the `CheckAnalysis` phase.




--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-39506) CacheTable, isCached, UncacheTable, setCurrentCatalog, currentCatalog, listCatalogs

2022-06-17 Thread Rui Wang (Jira)
Rui Wang created SPARK-39506:


 Summary: CacheTable, isCached, UncacheTable, setCurrentCatalog, 
currentCatalog, listCatalogs
 Key: SPARK-39506
 URL: https://issues.apache.org/jira/browse/SPARK-39506
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.2.0
Reporter: Rui Wang
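
The catalog surface this sub-task covers, as it might be exercised once 
3-layer names are supported (all identifiers illustrative):

{code:java}
spark.catalog.cacheTable("my_catalog.my_db.my_table")
spark.catalog.isCached("my_catalog.my_db.my_table")
spark.catalog.uncacheTable("my_catalog.my_db.my_table")
spark.catalog.setCurrentCatalog("my_catalog")
spark.catalog.currentCatalog()
spark.catalog.listCatalogs().show()
{code}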






--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-39263) GetTable, TableExists and DatabaseExists

2022-05-23 Thread Rui Wang (Jira)
Rui Wang created SPARK-39263:


 Summary: GetTable, TableExists and DatabaseExists
 Key: SPARK-39263
 URL: https://issues.apache.org/jira/browse/SPARK-39263
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.3.0
Reporter: Rui Wang






--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-39236) Make CreateTable API and ListTables API compatible

2022-05-19 Thread Rui Wang (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-39236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17539762#comment-17539762
 ] 

Rui Wang commented on SPARK-39236:
--

https://github.com/apache/spark/pull/36586

> Make CreateTable API and ListTables API compatible 
> ---
>
> Key: SPARK-39236
> URL: https://issues.apache.org/jira/browse/SPARK-39236
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Rui Wang
>Priority: Major
>
> https://github.com/apache/spark/blob/c6dccc7dd412a95007f5bb2584d69b85ff9ebf8e/sql/core/src/main/scala/org/apache/spark/sql/internal/CatalogImpl.scala#L364
> https://github.com/apache/spark/blob/c6dccc7dd412a95007f5bb2584d69b85ff9ebf8e/sql/core/src/main/scala/org/apache/spark/sql/internal/CatalogImpl.scala#L99



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-39236) Make CreateTable API and ListTables API compatible

2022-05-19 Thread Rui Wang (Jira)
Rui Wang created SPARK-39236:


 Summary: Make CreateTable API and ListTables API compatible 
 Key: SPARK-39236
 URL: https://issues.apache.org/jira/browse/SPARK-39236
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.3.0
Reporter: Rui Wang


https://github.com/apache/spark/blob/c6dccc7dd412a95007f5bb2584d69b85ff9ebf8e/sql/core/src/main/scala/org/apache/spark/sql/internal/CatalogImpl.scala#L364

https://github.com/apache/spark/blob/c6dccc7dd412a95007f5bb2584d69b85ff9ebf8e/sql/core/src/main/scala/org/apache/spark/sql/internal/CatalogImpl.scala#L99



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-39235) Make Catalog API be compatible with 3-layer-namespace

2022-05-19 Thread Rui Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-39235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang updated SPARK-39235:
-
Component/s: SQL
 (was: Spark Core)

> Make Catalog API be compatible with 3-layer-namespace
> -
>
> Key: SPARK-39235
> URL: https://issues.apache.org/jira/browse/SPARK-39235
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Rui Wang
>Priority: Major
>
> We can make the Catalog API support a 3-layer namespace: 
> catalog_name.database_name.table_name



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-39235) Make Catalog API be compatible with 3-layer-namespace

2022-05-19 Thread Rui Wang (Jira)
Rui Wang created SPARK-39235:


 Summary: Make Catalog API be compatible with 3-layer-namespace
 Key: SPARK-39235
 URL: https://issues.apache.org/jira/browse/SPARK-39235
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Affects Versions: 3.3.0
Reporter: Rui Wang


We can make the Catalog API support a 3-layer namespace: 
catalog_name.database_name.table_name
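
For example (illustrative identifiers), each existing API should accept the 
fully qualified form:

{code:java}
spark.catalog.tableExists("my_catalog.my_db.my_table")
spark.catalog.databaseExists("my_catalog.my_db")
spark.catalog.listTables("my_catalog.my_db")
{code}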



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-39144) Nested subquery expressions deduplicate relations should be done bottom up

2022-05-12 Thread Rui Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-39144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang updated SPARK-39144:
-
Description: When we have nested subquery expressions, there is a chance 
that relation deduplication could replace an attribute with the wrong one. 
This is because the attribute replacement is done top down rather than bottom 
up. It can happen when a subplan has its relations deduplicated first (leaving 
two copies of the same relation with different attribute ids), and a more 
complex plan built on top of that subplan (e.g. a UNION of queries with nested 
subquery expressions) then triggers the wrong attribute replacement.

> Nested subquery expressions deduplicate relations should be done bottom up
> --
>
> Key: SPARK-39144
> URL: https://issues.apache.org/jira/browse/SPARK-39144
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Rui Wang
>Priority: Major
>
> When we have nested subquery expressions, there is a chance that relation 
> deduplication could replace an attribute with the wrong one. This is because 
> the attribute replacement is done top down rather than bottom up. It can 
> happen when a subplan has its relations deduplicated first (leaving two 
> copies of the same relation with different attribute ids), and a more complex 
> plan built on top of that subplan (e.g. a UNION of queries with nested 
> subquery expressions) then triggers the wrong attribute replacement.
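
A hypothetical shape of an affected query (the concrete repro is deferred to 
the PR; every reference below is the same relation, so deduplication must 
rewrite attribute ids in the innermost subqueries first):

{code:java}
spark.range(10).createOrReplaceTempView("t")
spark.sql("""
  SELECT id FROM t
  WHERE id IN (SELECT id FROM t WHERE id IN (SELECT id FROM t))
  UNION
  SELECT id FROM t
  WHERE id IN (SELECT id FROM t WHERE id IN (SELECT id FROM t))
""").show()
{code}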



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-39144) Nested subquery expressions deduplicate relations should be done bottom up

2022-05-12 Thread Rui Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-39144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang updated SPARK-39144:
-
Summary: Nested subquery expressions deduplicate relations should be done 
bottom up  (was: Spark SQL replace wrong attributes for nested subquery 
expression in which all tables are the same relation)

> Nested subquery expressions deduplicate relations should be done bottom up
> --
>
> Key: SPARK-39144
> URL: https://issues.apache.org/jira/browse/SPARK-39144
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Rui Wang
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-39144) Spark SQL replace wrong attributes for nested subquery expression in which all tables are the same relation

2022-05-10 Thread Rui Wang (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-39144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17534580#comment-17534580
 ] 

Rui Wang commented on SPARK-39144:
--

Testing in https://github.com/apache/spark/pull/36503. Will come up with an 
example.

> Spark SQL replace wrong attributes for nested subquery expression in which 
> all tables are the same relation
> ---
>
> Key: SPARK-39144
> URL: https://issues.apache.org/jira/browse/SPARK-39144
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Rui Wang
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-39144) Spark SQL replace wrong attributes for nested subquery expression in which all tables are the same relation

2022-05-10 Thread Rui Wang (Jira)
Rui Wang created SPARK-39144:


 Summary: Spark SQL replace wrong attributes for nested subquery 
expression in which all tables are the same relation
 Key: SPARK-39144
 URL: https://issues.apache.org/jira/browse/SPARK-39144
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 3.3.0
Reporter: Rui Wang






--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-39012) SparkSQL parse partition value does not support all data types

2022-04-29 Thread Rui Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-39012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang updated SPARK-39012:
-
Summary: SparkSQL parse partition value does not support all data types  
(was: SparkSQL infer schema does not support all data types)

> SparkSQL parse partition value does not support all data types
> --
>
> Key: SPARK-39012
> URL: https://issues.apache.org/jira/browse/SPARK-39012
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Rui Wang
>Priority: Major
>
> When Spark needs to infer a schema, it has to parse strings back into typed 
> values. Not all data types are supported on this path; for example, binary is 
> known to not be supported. If a user has a binary column and does not use a 
> metastore, SparkSQL can fall back to schema inference and then fail during 
> the table scan. This is a bug, since schema inference is supported but some 
> types are missing.
> A string can be converted to any type except ARRAY, MAP, STRUCT, etc. Note 
> also that when converting from a string, a smaller-scale type cannot be told 
> apart from a larger-scale one (for example, short versus long).
> Based on the Spark SQL data types 
> (https://spark.apache.org/docs/latest/sql-ref-datatypes.html), we can support 
> the following types:
> BINARY
> BOOLEAN
> And there are two types that I am not sure SparkSQL supports:
> YearMonthIntervalType
> DayTimeIntervalType



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-39012) SparkSQL infer schema does not support all data types

2022-04-25 Thread Rui Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-39012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang updated SPARK-39012:
-
Description: 
When Spark needs to infer a schema, it has to parse strings back into typed 
values. Not all data types are supported on this path; for example, binary is 
known to not be supported. If a user has a binary column and does not use a 
metastore, SparkSQL can fall back to schema inference and then fail during the 
table scan. This is a bug, since schema inference is supported but some types 
are missing.

A string can be converted to any type except ARRAY, MAP, STRUCT, etc. Note 
also that when converting from a string, a smaller-scale type cannot be told 
apart from a larger-scale one (for example, short versus long).

Based on the Spark SQL data types 
(https://spark.apache.org/docs/latest/sql-ref-datatypes.html), we can support 
the following types:

BINARY
BOOLEAN

And there are two types that I am not sure SparkSQL supports:
YearMonthIntervalType
DayTimeIntervalType


  was:
When Spark needs to infer a schema, it has to parse strings back into typed 
values. Not all data types are supported on this path; for example, binary is 
known to not be supported.

A string can be converted to any type except ARRAY, MAP, STRUCT, etc. Note 
also that when converting from a string, a smaller-scale type cannot be told 
apart from a larger-scale one (for example, short versus long).

Based on the Spark SQL data types 
(https://spark.apache.org/docs/latest/sql-ref-datatypes.html), we can support 
the following types:

BINARY
BOOLEAN

And there are two types that I am not sure SparkSQL supports:
YearMonthIntervalType
DayTimeIntervalType



> SparkSQL infer schema does not support all data types
> -
>
> Key: SPARK-39012
> URL: https://issues.apache.org/jira/browse/SPARK-39012
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Rui Wang
>Priority: Major
>
> When Spark needs to infer a schema, it has to parse strings back into typed 
> values. Not all data types are supported on this path; for example, binary is 
> known to not be supported. If a user has a binary column and does not use a 
> metastore, SparkSQL can fall back to schema inference and then fail during 
> the table scan. This is a bug, since schema inference is supported but some 
> types are missing.
> A string can be converted to any type except ARRAY, MAP, STRUCT, etc. Note 
> also that when converting from a string, a smaller-scale type cannot be told 
> apart from a larger-scale one (for example, short versus long).
> Based on the Spark SQL data types 
> (https://spark.apache.org/docs/latest/sql-ref-datatypes.html), we can support 
> the following types:
> BINARY
> BOOLEAN
> And there are two types that I am not sure SparkSQL supports:
> YearMonthIntervalType
> DayTimeIntervalType



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-39012) SparkSQL infer schema does not support all data types

2022-04-25 Thread Rui Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-39012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang updated SPARK-39012:
-
Description: 
When Spark needs to infer a schema, it has to parse strings back into typed 
values. Not all data types are supported on this path; for example, binary is 
known to not be supported.

A string can be converted to any type except ARRAY, MAP, STRUCT, etc. Note 
also that when converting from a string, a smaller-scale type cannot be told 
apart from a larger-scale one (for example, short versus long).

Based on the Spark SQL data types 
(https://spark.apache.org/docs/latest/sql-ref-datatypes.html), we can support 
the following types:

BINARY
BOOLEAN

And there are two types that I am not sure SparkSQL supports:
YearMonthIntervalType
DayTimeIntervalType


  was:
When Spark needs to infer a schema, it has to parse strings back into typed 
values. Not all data types are supported on this path; for example, binary is 
known to not be supported.

A string can be converted to any type except ARRAY, MAP, STRUCT, etc.

Spark SQL data types: 
https://spark.apache.org/docs/latest/sql-ref-datatypes.html


> SparkSQL infer schema does not support all data types
> -
>
> Key: SPARK-39012
> URL: https://issues.apache.org/jira/browse/SPARK-39012
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Rui Wang
>Priority: Major
>
> When Spark needs to infer a schema, it has to parse strings back into typed 
> values. Not all data types are supported on this path; for example, binary is 
> known to not be supported.
> A string can be converted to any type except ARRAY, MAP, STRUCT, etc. Note 
> also that when converting from a string, a smaller-scale type cannot be told 
> apart from a larger-scale one (for example, short versus long).
> Based on the Spark SQL data types 
> (https://spark.apache.org/docs/latest/sql-ref-datatypes.html), we can support 
> the following types:
> BINARY
> BOOLEAN
> And there are two types that I am not sure SparkSQL supports:
> YearMonthIntervalType
> DayTimeIntervalType



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-39012) SparkSQL infer schema does not support all data types

2022-04-25 Thread Rui Wang (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-39012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17527767#comment-17527767
 ] 

Rui Wang commented on SPARK-39012:
--

A PR is ready to add binary type support: https://github.com/apache/spark/pull/36344

> SparkSQL infer schema does not support all data types
> -
>
> Key: SPARK-39012
> URL: https://issues.apache.org/jira/browse/SPARK-39012
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Rui Wang
>Priority: Major
>
> When Spark needs to infer a schema, it has to parse strings back into typed 
> values. Not all data types are supported on this path; for example, binary is 
> known to not be supported.
> A string can be converted to any type except ARRAY, MAP, STRUCT, etc.
> Spark SQL data types: 
> https://spark.apache.org/docs/latest/sql-ref-datatypes.html



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-39012) SparkSQL Infer schema path does not support all data types

2022-04-25 Thread Rui Wang (Jira)
Rui Wang created SPARK-39012:


 Summary: SparkSQL Infer schema path does not support all data types
 Key: SPARK-39012
 URL: https://issues.apache.org/jira/browse/SPARK-39012
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 3.3.0
Reporter: Rui Wang


When Spark needs to infer a schema, it has to parse strings back into typed 
values. Not all data types are supported on this path; for example, binary is 
known to not be supported.

A string can be converted to any type except ARRAY, MAP, STRUCT, etc.

Spark SQL data types: 
https://spark.apache.org/docs/latest/sql-ref-datatypes.html
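
To make the failure mode concrete (an illustrative sketch, not from the 
ticket; assumes a spark-shell session where `spark.implicits._` is in scope):

{code:java}
// Partition values are carried in directory names, e.g. .../flag=true/...,
// and must be parsed from strings back into typed literals at read time.
Seq((1, true)).toDF("id", "flag")
  .write.partitionBy("flag").mode("overwrite").parquet("/tmp/spark39012")
// If the partition-value parser does not handle booleans, `flag` comes back
// as a string column instead of a boolean one.
spark.read.parquet("/tmp/spark39012").printSchema()
{code}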



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-39012) SparkSQL infer schema does not support all data types

2022-04-25 Thread Rui Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-39012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang updated SPARK-39012:
-
Summary: SparkSQL infer schema does not support all data types  (was: 
SparkSQL Infer schema path does not support all data types)

> SparkSQL infer schema does not support all data types
> -
>
> Key: SPARK-39012
> URL: https://issues.apache.org/jira/browse/SPARK-39012
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Rui Wang
>Priority: Major
>
> When Spark needs to infer a schema, it has to parse strings back into typed 
> values. Not all data types are supported on this path; for example, binary is 
> known to not be supported.
> A string can be converted to any type except ARRAY, MAP, STRUCT, etc.
> Spark SQL data types: 
> https://spark.apache.org/docs/latest/sql-ref-datatypes.html



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-38063) Support SQL split_part function

2022-02-10 Thread Rui Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-38063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang updated SPARK-38063:
-
Description: 
`split_part()` is commonly supported by other systems such as Postgres. The 
Spark equivalent is `element_at(split(arg, delim), part)`.



h5. Function Specification

h6. Syntax

{code:java}
split_part(str, delimiter, partNum)
{code}

h6. Arguments
{code:java}
str: string type
delimiter: string type
partNum: Integer type
{code}

h6. Note
{code:java}
1. This function splits `str` by `delimiter` and returns the requested part of 
the split (1-based). 
2. If any input parameter is NULL, returns NULL.
3. If the index is out of range of split parts, returns an empty string.
4. If `partNum` is 0, throws an error.
5. If `partNum` is negative, the parts are counted backward from the end of the 
string.
6. When the delimiter is empty, `str` is considered not split, so there is just 
one split part. 
{code}

h6. Examples
{code:java}
> SELECT _FUNC_('11.12.13', '.', 3);
13
> SELECT _FUNC_(NULL, '.', 3);
NULL
> SELECT _FUNC_('11.12.13', '', 1);
'11.12.13'
{code}






  was:
`split_part()` is commonly supported by other systems such as Postgres. The 
Spark equivalent is `element_at(split(arg, delim), part)`.



h5. Function Specification

h6. Syntax

{code:java}
split_part(str, delimiter, partNum)
{code}

h6. Arguments
{code:java}
str: string type
delimiter: string type
partNum: Integer type
{code}

h6. Note
{code:java}
1. This function splits `str` by `delimiter` and returns the requested part of 
the split (1-based). 
2. If any input parameter is NULL, returns NULL.
3. If the index is out of range of split parts, returns an empty string.
4. If `partNum` is 0, throws an error.
5. If `partNum` is negative, the parts are counted backward from the end of the 
string.
6. When the delimiter is empty, `str` is considered not split, so there is just 
one split part. 
{code}

h6. Examples
{code:java}
> SELECT _FUNC_('11.12.13', '.', 3);
13
> SELECT _FUNC_(NULL, '.', 3);
NULL
> SELECT _FUNC_('11.12.13', '', 1);
'11.12.13'
{code}







> Support SQL split_part function
> ---
>
> Key: SPARK-38063
> URL: https://issues.apache.org/jira/browse/SPARK-38063
> Project: Spark
>  Issue Type: Task
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Rui Wang
>Priority: Major
>
> `split_part()` is commonly supported by other systems such as Postgres. The 
> Spark equivalent is `element_at(split(arg, delim), part)`.
> h5. Function Specification
> h6. Syntax
> {code:java}
> split_part(str, delimiter, partNum)
> {code}
> h6. Arguments
> {code:java}
> str: string type
> delimiter: string type
> partNum: Integer type
> {code}
> h6. Note
> {code:java}
> 1. This function splits `str` by `delimiter` and returns the requested part 
> of the split (1-based). 
> 2. If any input parameter is NULL, returns NULL.
> 3. If the index is out of range of split parts, returns an empty string.
> 4. If `partNum` is 0, throws an error.
> 5. If `partNum` is negative, the parts are counted backward from the end of 
> the string.
> 6. When the delimiter is empty, `str` is considered not split, so there is 
> just one split part. 
> {code}
> h6. Examples
> {code:java}
> > SELECT _FUNC_('11.12.13', '.', 3);
> 13
> > SELECT _FUNC_(NULL, '.', 3);
> NULL
> > SELECT _FUNC_('11.12.13', '', 1);
> '11.12.13'
> {code}
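
A runnable sketch of the proposed function next to today's workaround (assumes 
Spark 3.3+ for `split_part`; note that `split()` takes a regular expression, 
so a literal dot is written as the character class `[.]`):

{code:java}
spark.sql("SELECT split_part('11.12.13', '.', 3)").show()           // 13
spark.sql("SELECT element_at(split('11.12.13', '[.]'), 3)").show()  // 13
spark.sql("SELECT split_part('11.12.13', '.', -1)").show()          // 13 (counts from the end)
{code}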



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-38063) Support SQL split_part function

2022-02-10 Thread Rui Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-38063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang updated SPARK-38063:
-
Description: 
`split_part()` is commonly supported by other systems such as Postgres. The 
Spark equivalent is `element_at(split(arg, delim), part)`.



h5. Function Specification

h6. Syntax

{code:java}
split_part(str, delimiter, partNum)
{code}

h6. Arguments
{code:java}
str: string type
delimiter: string type
partNum: Integer type
{code}

h6. Note
{code:java}
1. This function splits `str` by `delimiter` and returns the requested part of 
the split (1-based). 
2. If any input parameter is NULL, returns NULL.
3. If the index is out of range of split parts, returns an empty string.
4. If `partNum` is 0, throws an error.
5. If `partNum` is negative, the parts are counted backward from the end of the 
string.
6. When the delimiter is empty, `str` is considered not split, so there is just 
one split part. 
{code}

h6. Examples
{code:java}
> SELECT _FUNC_('11.12.13', '.', 3);
13
> SELECT _FUNC_(NULL, '.', 3);
NULL
> SELECT _FUNC_('11.12.13', '', 1);
'11.12.13'
{code}






  was:
`split_part()` is commonly supported by other systems such as Postgres. The 
Spark equivalent is `element_at(split(arg, delim), part)`.



h5. Function Specification

h6. Syntax

{code:java}
split_part(str, delimiter, partNum)
{code}

h6. Arguments
{code:java}
str: string type
delimiter: string type
partNum: Integer type
{code}

h6. Note
{code:java}
1. This function splits `str` by `delimiter` and returns the requested part of 
the split (1-based). 
2. If any input parameter is NULL, returns NULL.
3. If the index is out of range of split parts, returns null.
4. If `partNum` is 0, throws an error.
5. If `partNum` is negative, the parts are counted backward from the end of the 
string.
6. When the delimiter is empty, `str` is considered not split, so there is just 
one split part. 
{code}

h6. Examples
{code:java}
> SELECT _FUNC_('11.12.13', '.', 3);
13
> SELECT _FUNC_(NULL, '.', 3);
NULL
> SELECT _FUNC_('11.12.13', '', 1);
'11.12.13'
{code}







> Support SQL split_part function
> ---
>
> Key: SPARK-38063
> URL: https://issues.apache.org/jira/browse/SPARK-38063
> Project: Spark
>  Issue Type: Task
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Rui Wang
>Priority: Major
>
> `split_part()` is commonly supported by other systems such as Postgres. The 
> Spark equivalent is `element_at(split(arg, delim), part)`.
> h5. Function Specification
> h6. Syntax
> {code:java}
> split_part(str, delimiter, partNum)
> {code}
> h6. Arguments
> {code:java}
> str: string type
> delimiter: string type
> partNum: Integer type
> {code}
> h6. Note
> {code:java}
> 1. This function splits `str` by `delimiter` and returns the requested part 
> of the split (1-based). 
> 2. If any input parameter is NULL, returns NULL.
> 3. If the index is out of range of split parts, returns an empty string.
> 4. If `partNum` is 0, throws an error.
> 5. If `partNum` is negative, the parts are counted backward from the end of 
> the string.
> 6. When the delimiter is empty, `str` is considered not split, so there is 
> just one split part. 
> {code}
> h6. Examples
> {code:java}
> > SELECT _FUNC_('11.12.13', '.', 3);
> 13
> > SELECT _FUNC_(NULL, '.', 3);
> NULL
> > SELECT _FUNC_('11.12.13', '', 1);
> '11.12.13'
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-38063) Support SQL split_part function

2022-02-09 Thread Rui Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-38063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang updated SPARK-38063:
-
Description: 
`split_part()` is commonly supported by other systems such as Postgres. The 
Spark equivalent is `element_at(split(arg, delim), part)`.



h5. Function Specification

h6. Syntax

{code:java}
split_part(str, delimiter, partNum)
{code}

h6. Arguments
{code:java}
str: string type
delimiter: string type
partNum: Integer type
{code}

h6. Note
{code:java}
1. This function splits `str` by `delimiter` and returns the requested part of 
the split (1-based). 
2. If any input parameter is NULL, returns NULL.
3. If the index is out of range of split parts, returns null.
4. If `partNum` is 0, throws an error.
5. If `partNum` is negative, the parts are counted backward from the end of the 
string.
6. When the delimiter is empty, `str` is considered not split, so there is just 
one split part. 
{code}

h6. Examples
{code:java}
> SELECT _FUNC_('11.12.13', '.', 3);
13
> SELECT _FUNC_(NULL, '.', 3);
NULL
> SELECT _FUNC_('11.12.13', '', 1);
'11.12.13'
{code}






  was:
`split_part()` is commonly supported by other systems such as Postgres. The 
Spark equivalent is `element_at(split(arg, delim), part)`.



h5. Function Specification

h6. Syntax

{code:java}
split_part(str, delimiter, partNum)
{code}

h6. Arguments
{code:java}
str: string type
delimiter: string type
partNum: Integer type
{code}

h6. Note
{code:java}
1. This function splits `str` by `delimiter` and returns the requested part of 
the split (1-based). 
2. If any input parameter is NULL, returns NULL.
3. If the index is out of range of split parts, returns null.
4. If `partNum` is 0, throws an error.
5. If `partNum` is negative, the parts are counted backward from the end of the 
string.
6. When the delimiter is empty, `str` is considered not split, so there is just 
one split part. 
{code}

h6. Examples
{code:java}
> SELECT _FUNC_('11.12.13', '.', 3);
13
> SELECT _FUNC_(NULL, '.', 3);
NULL
> SELECT _FUNC_('11.12.13', '', 1);
'11.12.13'
{code}







> Support SQL split_part function
> ---
>
> Key: SPARK-38063
> URL: https://issues.apache.org/jira/browse/SPARK-38063
> Project: Spark
>  Issue Type: Task
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Rui Wang
>Priority: Major
>
> `split_part()` is commonly supported by other systems such as Postgres. The 
> Spark equivalent is `element_at(split(arg, delim), part)`.
> h5. Function Specification
> h6. Syntax
> {code:java}
> split_part(str, delimiter, partNum)
> {code}
> h6. Arguments
> {code:java}
> str: string type
> delimiter: string type
> partNum: Integer type
> {code}
> h6. Note
> {code:java}
> 1. This function splits `str` by `delimiter` and returns the requested part 
> of the split (1-based). 
> 2. If any input parameter is NULL, returns NULL.
> 3. If the index is out of range of split parts, returns null.
> 4. If `partNum` is 0, throws an error.
> 5. If `partNum` is negative, the parts are counted backward from the end of 
> the string.
> 6. When the delimiter is empty, `str` is considered not split, so there is 
> just one split part. 
> {code}
> h6. Examples
> {code:java}
> > SELECT _FUNC_('11.12.13', '.', 3);
> 13
> > SELECT _FUNC_(NULL, '.', 3);
> NULL
> > SELECT _FUNC_('11.12.13', '', 1);
> '11.12.13'
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-38063) Support SQL split_part function

2022-02-09 Thread Rui Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-38063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang updated SPARK-38063:
-
Description: 
`split_part()` is commonly supported by other systems such as Postgres. The 
Spark equivalent is `element_at(split(arg, delim), part)`.



h5. Function Specification

h6. Syntax

{code:java}
split_part(str, delimiter, partNum)
{code}

h6. Arguments
{code:java}
str: string type
delimiter: string type
partNum: Integer type
{code}

h6. Note
{code:java}
1. This function splits `str` by `delimiter` and returns the requested part of 
the split (1-based). 
2. If any input parameter is NULL, returns NULL.
3. If the index is out of range of split parts, returns null.
4. If `partNum` is 0, throws an error.
5. If `partNum` is negative, the parts are counted backward from the end of the 
string.
6. When the delimiter is empty, `str` is considered not split, so there is just 
one split part. 
{code}

h6. Examples
{code:java}
  > SELECT _FUNC_('11.12.13', '.', 3);
   13
  > SELECT _FUNC_(NULL, '.', 3);
  NULL
  > SELECT _FUNC_('11.12.13', '', 1);
  '11.12.13'
{code}






  was:
`split_part()` is commonly supported by other systems such as Postgres. The 
Spark equivalent is `element_at(split(arg, delim), part)`.



h5. Function Specification

h6. Syntax

{code:java}
`split_part(str, delimiter, partNum)`
{code}

h6. Arguments
{code:java}
str: string type
delimiter: string type
partNum: Integer type
{code}

h6. Note
{code:java}
1. This function splits `str` by `delimiter` and returns the requested part of 
the split (1-based). 
2. If any input parameter is NULL, returns NULL.
3. If the index is out of range of split parts, returns null.
4. If `partNum` is 0, throws an error.
5. If `partNum` is negative, the parts are counted backward from the end of the 
string.
6. When the delimiter is empty, `str` is considered not split, so there is just 
one split part. 
{code}

h6. Examples
{code:java}
  > SELECT _FUNC_('11.12.13', '.', 3);
   13
  > SELECT _FUNC_(NULL, '.', 3);
  NULL
  > SELECT _FUNC_('11.12.13', '', 1);
  '11.12.13'
{code}







> Support SQL split_part function
> ---
>
> Key: SPARK-38063
> URL: https://issues.apache.org/jira/browse/SPARK-38063
> Project: Spark
>  Issue Type: Task
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Rui Wang
>Priority: Major
>
> `split_part()` is commonly supported by other systems such as Postgres. The 
> Spark equivalent is `element_at(split(arg, delim), part)`.
> h5. Function Specification
> h6. Syntax
> {code:java}
> split_part(str, delimiter, partNum)
> {code}
> h6. Arguments
> {code:java}
> str: string type
> delimiter: string type
> partNum: Integer type
> {code}
> h6. Note
> {code:java}
> 1. This function splits `str` by `delimiter` and returns the requested part 
> of the split (1-based). 
> 2. If any input parameter is NULL, returns NULL.
> 3. If the index is out of range of split parts, returns null.
> 4. If `partNum` is 0, throws an error.
> 5. If `partNum` is negative, the parts are counted backward from the end of 
> the string.
> 6. When the delimiter is empty, `str` is considered not split, so there is 
> just one split part. 
> {code}
> h6. Examples
> {code:java}
>   > SELECT _FUNC_('11.12.13', '.', 3);
>13
>   > SELECT _FUNC_(NULL, '.', 3);
>   NULL
>   > SELECT _FUNC_('11.12.13', '', 1);
>   '11.12.13'
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-38063) Support SQL split_part function

2022-02-09 Thread Rui Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-38063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang updated SPARK-38063:
-
Description: 
`split_part()` is commonly supported by other systems such as Postgres. The 
Spark equivalent is `element_at(split(arg, delim), part)`.



h5. Function Specification

h6. Syntax

{code:java}
split_part(str, delimiter, partNum)
{code}

h6. Arguments
{code:java}
str: string type
delimiter: string type
partNum: Integer type
{code}

h6. Note
{code:java}
1. This function splits `str` by `delimiter` and returns the requested part of 
the split (1-based). 
2. If any input parameter is NULL, returns NULL.
3. If the index is out of range of split parts, returns null.
4. If `partNum` is 0, throws an error.
5. If `partNum` is negative, the parts are counted backward from the end of the 
string.
6. When the delimiter is empty, `str` is considered not split, so there is just 
one split part. 
{code}

h6. Examples
{code:java}
> SELECT _FUNC_('11.12.13', '.', 3);
13
> SELECT _FUNC_(NULL, '.', 3);
NULL
> SELECT _FUNC_('11.12.13', '', 1);
'11.12.13'
{code}






  was:
`split_part()` is commonly supported by other systems such as Postgres. The 
Spark equivalent is `element_at(split(arg, delim), part)`.



h5. Function Specification

h6. Syntax

{code:java}
split_part(str, delimiter, partNum)
{code}

h6. Arguments
{code:java}
str: string type
delimiter: string type
partNum: Integer type
{code}

h6. Note
{code:java}
1. This function splits `str` by `delimiter` and returns the requested part of 
the split (1-based). 
2. If any input parameter is NULL, returns NULL.
3. If the index is out of range of split parts, returns null.
4. If `partNum` is 0, throws an error.
5. If `partNum` is negative, the parts are counted backward from the end of the 
string.
6. When the delimiter is empty, `str` is considered not split, so there is just 
one split part. 
{code}

h6. Examples
{code:java}
  > SELECT _FUNC_('11.12.13', '.', 3);
   13
  > SELECT _FUNC_(NULL, '.', 3);
  NULL
  > SELECT _FUNC_('11.12.13', '', 1);
  '11.12.13'
{code}







> Support SQL split_part function
> ---
>
> Key: SPARK-38063
> URL: https://issues.apache.org/jira/browse/SPARK-38063
> Project: Spark
>  Issue Type: Task
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Rui Wang
>Priority: Major
>
> `split_part()` is commonly supported by other systems such as Postgres. The 
> Spark equivalent is `element_at(split(arg, delim), part)`.
> h5. Function Specification
> h6. Syntax
> {code:java}
> split_part(str, delimiter, partNum)
> {code}
> h6. Arguments
> {code:java}
> str: string type
> delimiter: string type
> partNum: Integer type
> {code}
> h6. Note
> {code:java}
> 1. This function splits `str` by `delimiter` and returns the requested part 
> of the split (1-based). 
> 2. If any input parameter is NULL, returns NULL.
> 3. If the index is out of range of split parts, returns null.
> 4. If `partNum` is 0, throws an error.
> 5. If `partNum` is negative, the parts are counted backward from the end of 
> the string.
> 6. When the delimiter is empty, `str` is considered not split, so there is 
> just one split part. 
> {code}
> h6. Examples
> {code:java}
> > SELECT _FUNC_('11.12.13', '.', 3);
> 13
> > SELECT _FUNC_(NULL, '.', 3);
> NULL
> > SELECT _FUNC_('11.12.13', '', 1);
> '11.12.13'
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-38063) Support SQL split_part function

2022-02-09 Thread Rui Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-38063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang updated SPARK-38063:
-
Description: 
`split_part()` is commonly supported by other systems such as Postgres. The 
Spark equivalent is `element_at(split(arg, delim), part)`.



h5. Function Specification

h6. Syntax

{code:java}
`split_part(str, delimiter, partNum)`
{code}

h6. Arguments
{code:java}
str: string type
delimiter: string type
partNum: Integer type
{code}

h6. Note
{code:java}
1. This function splits `str` by `delimiter` and returns the requested part of 
the split (1-based). 
2. If any input parameter is NULL, returns NULL.
3. If the index is out of range of split parts, returns null.
4. If `partNum` is 0, throws an error.
5. If `partNum` is negative, the parts are counted backward from the end of the 
string.
6. When the delimiter is empty, `str` is considered not split, so there is just 
one split part. 
{code}

h6. Examples
{code:java}
  > SELECT _FUNC_('11.12.13', '.', 3);
   13
  > SELECT _FUNC_(NULL, '.', 3);
  NULL
  > SELECT _FUNC_('11.12.13', '', 1);
  '11.12.13'
{code}






  was:
`split_part()` is commonly supported by other systems such as Postgres. The 
Spark equivalent is `element_at(split(arg, delim), part)`.



h5. Function Specification


{code:java}

`split_part(str, delimiter, partNum)`

str: string type
delimiter: string type
partNum: Integer type

1. This function splits `str` by `delimiter` and returns the requested part of 
the split (1-based). 
2. If any input parameter is NULL, returns NULL.
3. If the index is out of range of split parts, returns null.
4. If `partNum` is 0, throws an error.
5. If `partNum` is negative, the parts are counted backward from the end of the 
string.
6. When the delimiter is empty, `str` is considered not split, so there is just 
one split part. 

Examples:
```
  > SELECT _FUNC_('11.12.13', '.', 3);
   13
  > SELECT _FUNC_(NULL, '.', 3);
  NULL
  > SELECT _FUNC_('11.12.13', '', 1);
  '11.12.13'
```
{code}






> Support SQL split_part function
> ---
>
> Key: SPARK-38063
> URL: https://issues.apache.org/jira/browse/SPARK-38063
> Project: Spark
>  Issue Type: Task
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Rui Wang
>Priority: Major
>
> `split_part()` is commonly supported by other systems such as Postgres. The 
> Spark equivalent is `element_at(split(arg, delim), part)`.
> h5. Function Specification
> h6. Syntax
> {code:java}
> `split_part(str, delimiter, partNum)`
> {code}
> h6. Arguments
> {code:java}
> str: string type
> delimiter: string type
> partNum: Integer type
> {code}
> h6. Note
> {code:java}
> 1. This function splits `str` by `delimiter` and returns the requested part 
> of the split (1-based). 
> 2. If any input parameter is NULL, returns NULL.
> 3. If the index is out of range of split parts, returns null.
> 4. If `partNum` is 0, throws an error.
> 5. If `partNum` is negative, the parts are counted backward from the end of 
> the string.
> 6. When the delimiter is empty, `str` is considered not split, so there is 
> just one split part. 
> {code}
> h6. Examples
> {code:java}
>   > SELECT _FUNC_('11.12.13', '.', 3);
>13
>   > SELECT _FUNC_(NULL, '.', 3);
>   NULL
>   > SELECT _FUNC_('11.12.13', '', 1);
>   '11.12.13'
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org


