[ https://issues.apache.org/jira/browse/SPARK-47707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
王俊博 updated SPARK-47707:
--------------------------

Description:

The MySQL JDBC driver {{mysql-connector-java-5.1.49.jar}} reports the JSON type as {{java.sql.Types.CHAR}} with a precision of {{Int.MaxValue}}. When Spark receives a CHAR column with that precision, the executor throws {{java.lang.OutOfMemoryError: Requested array size exceeds VM limit}}.

With {{mysql-connector-java-5.1.49.jar}}, the JSON sqlType is {{CHAR}} and the precision is {{Int.MaxValue}}.
With {{mysql-connector-java-8.0.16.jar}}, the JSON sqlType is {{LONGVARCHAR}} and the precision is {{Int.MaxValue}}.
Spark handles {{mysql-connector-java-8.0.16.jar}} correctly, because {{LONGVARCHAR}} already maps to {{StringType}}:

{code:java}
private def getCatalystType(
    sqlType: Int,
    typeName: String,
    precision: Int,
    scale: Int,
    signed: Boolean,
    isTimestampNTZ: Boolean): DataType = sqlType match {
  ...
  case java.sql.Types.LONGVARCHAR => StringType
  ...
}
{code}

If compatibility with 5.1.49 is not required, the current code is sufficient; supporting it would mean special-casing that driver's CHAR/JSON reporting (see the sketch after the issue details below).

> Special handling of JSON type for MySQL connector
> -------------------------------------------------
>
>                 Key: SPARK-47707
>                 URL: https://issues.apache.org/jira/browse/SPARK-47707
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 3.4.0
>         Environment: mysql-connector-java-5.1.49.jar
>                      spark-3.5.0
>            Reporter: 王俊博
>            Priority: Minor
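
A minimal sketch of such special handling, assuming a user-registered {{JdbcDialect}} rather than a change to Spark's built-in MySQL dialect; the object name {{MySqlJsonDialect}} and the exact match condition are illustrative, not Spark's actual fix:

{code:java}
import java.sql.Types

import org.apache.spark.sql.jdbc.{JdbcDialect, JdbcDialects}
import org.apache.spark.sql.types.{DataType, MetadataBuilder, StringType}

// Hypothetical dialect: maps the CHAR/Int.MaxValue combination that
// mysql-connector-java-5.1.49 uses for JSON columns to StringType, so the
// executor never tries to allocate an Int.MaxValue-sized char buffer.
object MySqlJsonDialect extends JdbcDialect {
  override def canHandle(url: String): Boolean = url.startsWith("jdbc:mysql")

  override def getCatalystType(
      sqlType: Int,
      typeName: String,
      size: Int,
      md: MetadataBuilder): Option[DataType] = {
    // The 5.1.x driver reports JSON as Types.CHAR with typeName "JSON".
    if (sqlType == Types.CHAR && "JSON".equalsIgnoreCase(typeName)) {
      Some(StringType)
    } else {
      None // defer to Spark's default type mapping for everything else
    }
  }
}
{code}

Calling {{JdbcDialects.registerDialect(MySqlJsonDialect)}} before the read prepends the dialect to the registered list, so its {{getCatalystType}} should be consulted ahead of the built-in MySQL mapping for {{jdbc:mysql}} URLs.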