[ 
https://issues.apache.org/jira/browse/SPARK-46612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kent Yao reassigned SPARK-46612:
--------------------------------

    Assignee: Nguyen Phan Huy

> Clickhouse's JDBC throws `java.lang.IllegalArgumentException: Unknown data 
> type: string` when write array string with Apache Spark scala
> ----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-46612
>                 URL: https://issues.apache.org/jira/browse/SPARK-46612
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.5.0
>            Reporter: Nguyen Phan Huy
>            Assignee: Nguyen Phan Huy
>            Priority: Major
>              Labels: pull-request-available
>
> This issue is also reported on ClickHouse's GitHub: 
> [https://github.com/ClickHouse/clickhouse-java/issues/1505] 
> h3. Bug description
> When using Spark (Scala) to write an array of strings to ClickHouse, the 
> driver throws 
> {{java.lang.IllegalArgumentException: Unknown data type: string}}.
> The exception is thrown by: 
> [https://github.com/ClickHouse/clickhouse-java/blob/aa3870eadb1a2d3675fd5119714c85851800f076/clickhouse-data/src/main/java/com/clickhouse/data/ClickHouseDataType.java#L238]
> This is caused by Spark's JdbcUtils converting the type name to lower case 
> ({{String}} -> {{string}}), which the driver does not recognize:
> [https://github.com/apache/spark/blob/6b931530d75cb4f00236f9c6283de8ef450963ad/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala#L639]
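> The failure mode can be illustrated with a minimal, self-contained sketch. 
> The {{knownTypes}} set and {{lookup}} function are hypothetical stand-ins for 
> a case-sensitive type registry; they are not the ClickHouse driver's or 
> Spark's actual code. The sketch only shows why lower-casing a type name 
> breaks a lookup that expects the exact spelling {{String}}.
> {code:scala}
> import java.util.Locale
> import scala.util.Try
>
> object CaseSensitiveTypeLookup {
>   // Hypothetical registry of type names; lookups are case-sensitive.
>   private val knownTypes = Set("String", "Int32", "Float64")
>
>   // Returns the name if it is registered, otherwise throws, mirroring the
>   // shape of the error message quoted in this report.
>   def lookup(name: String): String =
>     if (knownTypes.contains(name)) name
>     else throw new IllegalArgumentException(s"Unknown data type: $name")
>
>   def main(args: Array[String]): Unit = {
>     val original = "String"
>     val lowered = original.toLowerCase(Locale.ROOT) // "string"
>
>     println(lookup(original)) // exact spelling is accepted
>     // The lower-cased name is rejected by the case-sensitive lookup.
>     println(Try(lookup(lowered)).failed.map(_.getMessage).getOrElse("ok"))
>   }
> }
> {code}
> Under these assumptions, any fix has to either preserve the original casing 
> of the type name or make the lookup case-insensitive.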
> h3. Steps to reproduce
>  # Create a ClickHouse table with a String array field 
> ([https://clickhouse.com/]).
>  # Write data to the table with Spark (Scala), via ClickHouse's JDBC driver 
> ([https://github.com/ClickHouse/clickhouse-java]):
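> {code:scala}
>     // Code excerpt; requires a Scala Spark job (with a SparkSession named
>     // `spark`) and the ClickHouse JDBC driver on the classpath.
>     import java.util.Properties
>     import org.apache.spark.sql.{Row, SaveMode}
>     import org.apache.spark.sql.types.{ArrayType, StringType, StructField, StructType}
>
>     // Schema with a single array-of-strings column
>     val clickHouseSchema = StructType(
>       Seq(
>         StructField("str_array", ArrayType(StringType))
>       )
>     )
>     val data = Seq(
>       Row(
>         Seq("a", "b")
>       )
>     )
>     val clickHouseDf = spark.createDataFrame(
>       spark.sparkContext.parallelize(data), clickHouseSchema)
>
>     val props = new Properties
>     props.put("user", "default")
>     clickHouseDf.write
>       .mode(SaveMode.Append)
>       // The driver class name must be passed as a string
>       .option("driver", "com.clickhouse.jdbc.ClickHouseDriver")
>       .jdbc("jdbc:clickhouse://localhost:8123/foo", "bar", props)
> {code}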
> h3. Fix
>  - [https://github.com/apache/spark/pull/44459] 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

