[jira] [Updated] (SPARK-47211) Fix ignored PySpark Connect string collation
[ https://issues.apache.org/jira/browse/SPARK-47211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated SPARK-47211:
---
    Labels: pull-request-available  (was: )

> Fix ignored PySpark Connect string collation
>
>
>                 Key: SPARK-47211
>                 URL: https://issues.apache.org/jira/browse/SPARK-47211
>             Project: Spark
>          Issue Type: Bug
>          Components: Connect, PySpark
>    Affects Versions: 4.0.0
>            Reporter: Nikola Mandic
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>
> When using Connect with PySpark, string collation silently gets dropped:
> {code:java}
> Client connected to the Spark Connect server at localhost
> SparkSession available as 'spark'.
> >>> spark.sql("select 'abc' collate 'UNICODE'")
> DataFrame[collate(abc): string]
> >>> from pyspark.sql.types import StructType, StringType, StructField
> >>> spark.createDataFrame([], StructType([StructField('id', StringType(2))]))
> DataFrame[id: string]
> {code}
> Instead of "string" type in dataframe, we should be seeing "string COLLATE
> 'UNICODE'".

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-47211) Fix ignored PySpark Connect string collation
[ https://issues.apache.org/jira/browse/SPARK-47211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nikola Mandic updated SPARK-47211:
--
    Component/s: Connect

> Fix ignored PySpark Connect string collation
>
>
>                 Key: SPARK-47211
>                 URL: https://issues.apache.org/jira/browse/SPARK-47211
>             Project: Spark
>          Issue Type: Bug
>          Components: Connect, PySpark
>    Affects Versions: 4.0.0
>            Reporter: Nikola Mandic
>            Priority: Major
>             Fix For: 4.0.0
>
>
> When using Connect with PySpark, string collation silently gets dropped:
> {code:java}
> Client connected to the Spark Connect server at localhost
> SparkSession available as 'spark'.
> >>> spark.sql("select 'abc' collate 'UNICODE'")
> DataFrame[collate(abc): string]
> >>> from pyspark.sql.types import StructType, StringType, StructField
> >>> spark.createDataFrame([], StructType([StructField('id', StringType(2))]))
> DataFrame[id: string]
> {code}
> Instead of "string" type in dataframe, we should be seeing "string COLLATE
> 'UNICODE'".
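
The symptom in the report can be checked mechanically from the DataFrame repr alone. What follows is only a minimal sketch in plain Python, not part of PySpark and not the fix tracked by this issue: the helper names column_types and collation_dropped are hypothetical, and the expected type string "string COLLATE 'UNICODE'" is taken from the report text rather than from a running server.

```python
import re


def column_types(df_repr):
    """Map column name -> type string from a repr like 'DataFrame[id: string]'."""
    match = re.fullmatch(r"DataFrame\[(.*)\]", df_repr.strip())
    if match is None:
        raise ValueError(f"not a DataFrame repr: {df_repr!r}")
    if not match.group(1):
        return {}  # a DataFrame with no columns
    columns = {}
    # Assumption: neither column names nor type strings contain ", " or ": ".
    for field in match.group(1).split(", "):
        name, _, type_name = field.rpartition(": ")
        columns[name] = type_name
    return columns


def collation_dropped(df_repr, column, collation="UNICODE"):
    """True if `column` is a plain string type with no trace of `collation`."""
    type_name = column_types(df_repr)[column]
    return (
        type_name.startswith("string")
        and collation.lower() not in type_name.lower()
    )


# Output quoted in the report: the collation is missing.
print(collation_dropped("DataFrame[collate(abc): string]", "collate(abc)"))  # True
# Output the report says should appear instead.
print(collation_dropped(
    "DataFrame[collate(abc): string COLLATE 'UNICODE']", "collate(abc)"
))  # False
```

Against a live Spark Connect session, the same check could presumably be fed repr(spark.sql("select 'abc' collate 'UNICODE'")) in place of the literal strings.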