[ https://issues.apache.org/jira/browse/SPARK-10849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14909972#comment-14909972 ]
Suresh Thalamati commented on SPARK-10849:
------------------------------------------

I am working on creating a pull request for this issue.

> Allow user to specify database column type for data frame fields when writing
> data to jdbc data sources.
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-10849
>                 URL: https://issues.apache.org/jira/browse/SPARK-10849
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 1.5.0
>            Reporter: Suresh Thalamati
>            Priority: Minor
>
> Mapping data frame field types to database column types is addressed to a large
> extent by adding dialects, and by the maxlength option added in SPARK-10101 to
> set the VARCHAR length.
> In some cases it is hard to determine the maximum supported VARCHAR size. For
> example, DB2 z/OS VARCHAR size depends on the page size, and some databases
> also have row size limits for VARCHAR. Specifying CLOB by default for all String
> columns will likely make reads and writes slow.
> Allowing users to specify the database type corresponding to a data frame field
> will be useful in cases where users want to fine-tune the mapping for one or two
> fields and are fine with the defaults for all other fields.
> I propose to make the following two properties available for users to set in
> the data frame metadata when writing to JDBC data sources:
>   database.column.type -- column type to use for create table.
>   jdbc.column.type -- jdbc type to use for setting null values.
> Example :
>   import org.apache.spark.sql.functions.col
>   import org.apache.spark.sql.types.MetadataBuilder
>   // Assumes sqlContext.implicits._ is in scope (for toDF) and mysqlProps is a java.util.Properties.
>   val secdf = sc.parallelize(Array(("Apple", "Revenue ..."),
>     ("Google", "Income:123213"))).toDF("name", "report")
>   val metadataBuilder = new MetadataBuilder()
>   metadataBuilder.putString("database.column.type", "CLOB(100K)")
>   metadataBuilder.putLong("jdbc.type", java.sql.Types.CLOB)
>   val metadata = metadataBuilder.build()
>   val secReportDF = secdf.withColumn("report", col("report").as("report", metadata))
>   secReportDF.write.jdbc("jdbc:mysql://<URL>/secdata", "reports", mysqlProps)
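
For illustration only, here is a minimal sketch of how a JDBC writer might honor the proposed "database.column.type" property when building the CREATE TABLE statement. The object ColumnTypeOverride, its method names, and the default type mapping are hypothetical and are not existing Spark code; only the metadata key comes from the proposal, and StructField, Metadata.contains and Metadata.getString are the standard Spark SQL types API.

```scala
import org.apache.spark.sql.types._

// Hypothetical helper, not part of Spark: resolves the column type for CREATE TABLE.
object ColumnTypeOverride {
  // Property name taken from the proposal in this issue.
  val DatabaseColumnTypeKey = "database.column.type"

  // Simplified stand-in for a dialect's default mapping (an assumption for this sketch).
  def defaultColumnType(dataType: DataType): String = dataType match {
    case StringType  => "TEXT"
    case IntegerType => "INTEGER"
    case LongType    => "BIGINT"
    case DoubleType  => "DOUBLE PRECISION"
    case other       =>
      throw new IllegalArgumentException(s"No default mapping for $other")
  }

  // A user-specified type in the field metadata wins; otherwise use the default.
  def columnType(field: StructField): String =
    if (field.metadata.contains(DatabaseColumnTypeKey)) {
      field.metadata.getString(DatabaseColumnTypeKey)
    } else {
      defaultColumnType(field.dataType)
    }
}

// Usage: one field carries the override, the other falls back to the default.
val reportMetadata = new MetadataBuilder()
  .putString(ColumnTypeOverride.DatabaseColumnTypeKey, "CLOB(100K)")
  .build()
val reportField = StructField("report", StringType, nullable = true, reportMetadata)
val nameField = StructField("name", StringType)

val reportColumnType = ColumnTypeOverride.columnType(reportField)
val nameColumnType = ColumnTypeOverride.columnType(nameField)
```

Keeping the lookup per field means only the one or two columns that need tuning carry metadata, while every other column goes through the default mapping unchanged.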