
[jira] [Updated] (SPARK-47211) Fix ignored PySpark Connect string collation

2024-02-28 Thread ASF GitHub Bot (Jira)



 [ https://issues.apache.org/jira/browse/SPARK-47211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated SPARK-47211:
---
Labels: pull-request-available  (was: )

> Fix ignored PySpark Connect string collation
> 
>
> Key: SPARK-47211
> URL: https://issues.apache.org/jira/browse/SPARK-47211
> Project: Spark
> Issue Type: Bug
> Components: Connect, PySpark
> Affects Versions: 4.0.0
> Reporter: Nikola Mandic
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
>
>
> When using Connect with PySpark, string collation silently gets dropped:
> {code:java}
> Client connected to the Spark Connect server at localhost
> SparkSession available as 'spark'.
> >>> spark.sql("select 'abc' collate 'UNICODE'")
> DataFrame[collate(abc): string]
> >>> from pyspark.sql.types import StructType, StringType, StructField
> >>> spark.createDataFrame([], StructType([StructField('id', StringType(2))]))
> DataFrame[id: string]
> {code}
> In both cases the DataFrame reports the column type as "string"; it should 
> report "string COLLATE 'UNICODE'".
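
The report does not state why the collation is lost. As a purely illustrative sketch, the toy round trip below shows one way such a silent drop can happen: a client-to-server conversion that omits the collation field, so the receiving side falls back to the default. Everything here is hypothetical (the `StringType` stand-in, `to_wire_lossy`, `to_wire`, `from_wire`, and the wire format); none of it is PySpark or Spark Connect code. The mapping of collation id 2 to UNICODE is inferred from the expected output in the report.

```python
from dataclasses import dataclass

# Assumed id-to-name table; id 2 -> UNICODE is inferred from the report.
COLLATION_NAMES = {0: "UTF8_BINARY", 2: "UNICODE"}


@dataclass(frozen=True)
class StringType:
    """Toy stand-in for a collated string type (not pyspark's StringType)."""

    collation_id: int = 0

    def simple_string(self) -> str:
        # The default collation prints as plain "string".
        if self.collation_id == 0:
            return "string"
        return f"string COLLATE '{COLLATION_NAMES[self.collation_id]}'"


def to_wire_lossy(t: StringType) -> dict:
    """Hypothetical conversion that forgets the collation field."""
    return {"type": "string"}


def to_wire(t: StringType) -> dict:
    """Hypothetical conversion that carries the collation id along."""
    return {"type": "string", "collation_id": t.collation_id}


def from_wire(msg: dict) -> StringType:
    """Rebuild the type; a missing collation id falls back to the default."""
    return StringType(msg.get("collation_id", 0))


unicode_string = StringType(2)
print(from_wire(to_wire_lossy(unicode_string)).simple_string())  # string
print(from_wire(to_wire(unicode_string)).simple_string())  # string COLLATE 'UNICODE'
```

No error is raised on the lossy path, which matches the "silently dropped" symptom: the only visible sign is the plain "string" in the printed schema.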



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

[jira] [Updated] (SPARK-47211) Fix ignored PySpark Connect string collation

2024-02-28 Thread Nikola Mandic (Jira)



 [ https://issues.apache.org/jira/browse/SPARK-47211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nikola Mandic updated SPARK-47211:
--
Component/s: Connect

> Fix ignored PySpark Connect string collation
> 
>
> Key: SPARK-47211
> URL: https://issues.apache.org/jira/browse/SPARK-47211
> Project: Spark
> Issue Type: Bug
> Components: Connect, PySpark
> Affects Versions: 4.0.0
> Reporter: Nikola Mandic
> Priority: Major
> Fix For: 4.0.0
>
>
> When using Connect with PySpark, string collation silently gets dropped:
> {code:java}
> Client connected to the Spark Connect server at localhost
> SparkSession available as 'spark'.
> >>> spark.sql("select 'abc' collate 'UNICODE'")
> DataFrame[collate(abc): string]
> >>> from pyspark.sql.types import StructType, StringType, StructField
> >>> spark.createDataFrame([], StructType([StructField('id', StringType(2))]))
> DataFrame[id: string]
> {code}
> In both cases the DataFrame reports the column type as "string"; it should 
> report "string COLLATE 'UNICODE'".



