Marvin Rösch created SPARK-38327:
------------------------------------

             Summary: JDBC Source with MariaDB connection returns column names as values
                 Key: SPARK-38327
                 URL: https://issues.apache.org/jira/browse/SPARK-38327
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 3.2.1
         Environment: MariaDB version 10.3.10
Running with spark-k8s-operator
            Reporter: Marvin Rösch


Using a JDBC source with the official MariaDB JDBC driver and a connection URL like the following does not work as expected:

{noformat}
jdbc:mariadb://db.example.com:3306/schema
{noformat}

Assume we have a table "values" like the following in MariaDB:

||id (binary)||name (varchar)||
|0xAB|Name 1|
|0xBC|Name 2|

We intend to create and display a data frame from it like this:

{code:scala}
spark.read
  .format("jdbc")
  .option("url", "jdbc:mariadb://db.example.com:3306/schema")
  .option("dbtable", "values")
  .load()
  .show()
{code}

*Expected Behavior*

Using such a connection URL on an arbitrary MariaDB table or query results in a data frame that correctly reflects the table's structure and content, with columns of the correct types and values. The output of the above should be:

{noformat}
+----+------+
|  id|  name|
+----+------+
|[AB]|Name 1|
|[BC]|Name 2|
+----+------+
{noformat}

*Observed Behavior*

Result rows contain the column names as values, making them effectively useless to work with. The actual output is:

{noformat}
+-------+----+
|     id|name|
+-------+----+
|[69 64]|name|
|[69 64]|name|
+-------+----+
{noformat}

(Note that {{[69 64]}} is the ASCII encoding of the string "id", i.e. the binary column is filled with its own column name.)

*Further Information*

An easy workaround appears to be specifying "mysql" instead of "mariadb" in the connection URL while explicitly specifying the MariaDB driver class. I'd expect the mariadb URL to work out of the box, however. This appears to have been an issue since at least 2016, according to a [StackOverflow post|https://stackoverflow.com/questions/38808463/incorrect-data-while-loading-jdbc-table-in-spark-sql].

--
This message was sent by Atlassian Jira
(v8.20.1#820001)
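For reference, the workaround described above can be sketched as follows, assuming an active SparkSession named {{spark}} and the same placeholder host/schema/table names as in the example; {{org.mariadb.jdbc.Driver}} is the driver class shipped with MariaDB Connector/J:

{code:scala}
// Workaround sketch: keep the MariaDB driver on the classpath, but use the
// "mysql" sub-protocol in the URL and force the driver class explicitly
// via the "driver" option so Spark does not mis-detect the dialect.
val df = spark.read
  .format("jdbc")
  .option("url", "jdbc:mysql://db.example.com:3306/schema") // "mysql" instead of "mariadb"
  .option("driver", "org.mariadb.jdbc.Driver")              // still the MariaDB driver
  .option("dbtable", "values")
  .load()

df.show()
{code}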