Re: Character encoding corruption in Spark JDBC connector

2016-09-13 Thread Sean Owen
Based on your description, this isn't a problem in Spark. It means your JDBC connector isn't interpreting bytes from the database according to the encoding in which they were written. It could be Latin1, sure. But if "new String(ResultSet.getBytes())" works, it's only because your platform's
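The point Sean is making can be shown concretely: `new String(bytes)` decodes with the JVM's platform default charset, so it only "works" when that default happens to match the encoding the database wrote. A minimal, self-contained sketch (the byte values and strings here are illustrative, not from the thread):

```java
import java.nio.charset.StandardCharsets;

public class CharsetDemo {
    public static void main(String[] args) {
        // "café" as MySQL latin1 (ISO-8859-1) bytes: 0x63 0x61 0x66 0xE9
        byte[] latin1Bytes = {0x63, 0x61, 0x66, (byte) 0xE9};

        // new String(bytes) uses the platform default charset -- result
        // varies by JVM configuration, which is exactly the trap here.
        String fragile = new String(latin1Bytes);

        // Naming the charset explicitly is deterministic:
        String correct = new String(latin1Bytes, StandardCharsets.ISO_8859_1);
        System.out.println(correct); // prints "café"

        // Decoding the same bytes as UTF-8 mangles the non-ASCII byte,
        // since a lone 0xE9 is not valid UTF-8 (it becomes U+FFFD):
        String wrong = new String(latin1Bytes, StandardCharsets.UTF_8);
        System.out.println(wrong);
    }
}
```

In other words, `ResultSet.getBytes()` plus an explicit charset pins down the decoding, whereas the no-charset `String` constructor silently depends on the environment.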

Character encoding corruption in Spark JDBC connector

2016-09-13 Thread Mark Bittmann
Hello Spark community, I'm reading from a MySQL database into a Spark dataframe using the JDBC connector functionality, and I'm experiencing some character encoding issues. The default encoding for MySQL strings is latin1, but the mysql JDBC connector implementation of "ResultSet.getString()"
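For a setup like the one Mark describes, the usual fix is to tell MySQL Connector/J which encoding to use on the connection itself, via URL parameters, rather than re-decoding strings after the fact. A hedged sketch (host, database, table, and credentials are hypothetical; the `useUnicode`/`characterEncoding` parameters are real Connector/J options, but verify the exact behavior against your driver version):

```java
import java.util.Properties;

public class JdbcUrlDemo {
    public static void main(String[] args) {
        // Hypothetical host/db. Asking Connector/J for a UTF-8 connection
        // lets the server transcode latin1 column data on the way out.
        String url = "jdbc:mysql://dbhost:3306/mydb"
                   + "?useUnicode=true&characterEncoding=UTF-8";

        Properties props = new Properties();
        props.setProperty("user", "reader");     // hypothetical credentials
        props.setProperty("password", "secret");

        System.out.println(url);
    }
}
```

This URL and `Properties` pair would then be handed to Spark's JDBC reader (e.g. `spark.read().jdbc(url, "mytable", props)`), so the encoding is handled once at the driver level instead of per-string in the DataFrame.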
