[ https://issues.apache.org/jira/browse/SPARK-46612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kent Yao resolved SPARK-46612.
------------------------------
    Fix Version/s: 4.0.0
       Resolution: Fixed

Issue resolved by pull request 44459
[https://github.com/apache/spark/pull/44459]

> Clickhouse's JDBC throws `java.lang.IllegalArgumentException: Unknown data type: string` when write array string with Apache Spark scala
> ----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-46612
>                 URL: https://issues.apache.org/jira/browse/SPARK-46612
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.5.0
>            Reporter: Nguyen Phan Huy
>            Assignee: Nguyen Phan Huy
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>
> The issue is also reported on ClickHouse's GitHub:
> [https://github.com/ClickHouse/clickhouse-java/issues/1505]
> h3. Bug description
> When Spark (Scala) writes an array of strings to ClickHouse, the driver
> throws a {{java.lang.IllegalArgumentException: Unknown data type: string}}
> exception.
> The exception is thrown by:
> [https://github.com/ClickHouse/clickhouse-java/blob/aa3870eadb1a2d3675fd5119714c85851800f076/clickhouse-data/src/main/java/com/clickhouse/data/ClickHouseDataType.java#L238]
> This is caused by Spark's JdbcUtils converting the type name to lower case
> ({{String}} -> {{string}}):
> [https://github.com/apache/spark/blob/6b931530d75cb4f00236f9c6283de8ef450963ad/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala#L639]
> h3. Steps to reproduce
> # Create a ClickHouse table with a String array field
> ([https://clickhouse.com/]).
> # Write data to the table with Spark (Scala) via ClickHouse's JDBC driver
> ([https://github.com/ClickHouse/clickhouse-java]):
> {code:java}
> // Extracted snippet; assumes an existing SparkSession named `spark`
> // and the ClickHouse JDBC driver on the classpath.
> import java.util.Properties
> import org.apache.spark.sql.{Row, SaveMode}
> import org.apache.spark.sql.types.{ArrayType, StringType, StructField, StructType}
>
> val clickHouseSchema = StructType(
>   Seq(
>     StructField("str_array", ArrayType(StringType))
>   )
> )
> val data = Seq(
>   Row(
>     Seq("a", "b")
>   )
> )
> val clickHouseDf = spark.createDataFrame(spark.sparkContext.parallelize(data),
>   clickHouseSchema)
>
> val props = new Properties
> props.put("user", "default")
> clickHouseDf.write
>   .mode(SaveMode.Append)
>   // The driver class name is passed as a string.
>   .option("driver", "com.clickhouse.jdbc.ClickHouseDriver")
>   .jdbc("jdbc:clickhouse://localhost:8123/foo", table = "bar", props)
> {code}
> h2. Fix
> - [https://github.com/apache/spark/pull/44459]
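
The failure mode in the bug description can be sketched without Spark or ClickHouse. The snippet below is only an illustrative model, not code from JdbcUtils or from the ClickHouse driver: the object `TypeNameLookup`, its `knownTypes` set, and `resolve` are hypothetical names standing in for a case-sensitive type registry. It shows that a name that resolves as {{String}} stops resolving once it has been lower-cased.

```scala
import java.util.Locale

// Hypothetical stand-in for a case-sensitive type registry.
// The names here are assumptions made for illustration only.
object TypeNameLookup {
  // Type names are matched exactly, including case.
  private val knownTypes: Set[String] = Set("String", "Int32", "Float64")

  // Returns the name if it is registered, otherwise fails the same way
  // the report describes ("Unknown data type: ...").
  def resolve(name: String): String =
    if (knownTypes.contains(name)) name
    else throw new IllegalArgumentException(s"Unknown data type: $name")

  def main(args: Array[String]): Unit = {
    val typeName = "String"

    // The original spelling resolves.
    println(resolve(typeName))

    // Lower-casing the name before the lookup makes it fail.
    val lowered = typeName.toLowerCase(Locale.ROOT)
    try {
      resolve(lowered)
    } catch {
      case e: IllegalArgumentException => println(e.getMessage)
    }
  }
}
```

Under these assumptions, passing the type name through unchanged is enough for the lookup to succeed; any normalisation of case has to match what the receiving side expects.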