[ https://issues.apache.org/jira/browse/SPARK-10856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14944786#comment-14944786 ]

Henrik Behrens commented on SPARK-10856:
----------------------------------------

Thank you for creating a pull request; I had made the same change on my side.

In the meantime, I found two other types that need a special mapping for MS
SQL Server:

1. java.sql.Types.BIT needs to be mapped to BIT instead of BIT(1), otherwise
we get the following error message from Spark:

com.microsoft.sqlserver.jdbc.SQLServerException: Column, parameter, or variable #10: Cannot specify a column width on data type bit.
        at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDatabaseError(SQLServerException.java:217)
        at com.microsoft.sqlserver.jdbc.SQLServerStatement.getNextResult(SQLServerStatement.java:1635)
        at com.microsoft.sqlserver.jdbc.SQLServerPreparedStatement.doExecutePreparedStatement(SQLServerPreparedStatement.java:426)
        at com.microsoft.sqlserver.jdbc.SQLServerPreparedStatement$PrepStmtExecCmd.doExecute(SQLServerPreparedStatement.java:372)
        at com.microsoft.sqlserver.jdbc.TDSCommand.execute(IOBuffer.java:6276)
        at com.microsoft.sqlserver.jdbc.SQLServerConnection.executeCommand(SQLServerConnection.java:1793)
        at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeCommand(SQLServerStatement.java:184)
        at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeStatement(SQLServerStatement.java:159)
        at com.microsoft.sqlserver.jdbc.SQLServerPreparedStatement.executeUpdate(SQLServerPreparedStatement.java:315)
        at org.apache.spark.sql.DataFrameWriter.jdbc(DataFrameWriter.scala:277)

2. Strings of unlimited length should be mapped to NVARCHAR(MAX) in SQL Server 
instead of TEXT, because TEXT is deprecated (see 
https://msdn.microsoft.com/en-us/library/ms187993.aspx).

I tested the following patch successfully:

  override def getJDBCType(dt: DataType): Option[JdbcType] = dt match {
    // SQL Server's TIMESTAMP is a row-version column, not a point in time,
    // so java.sql.Timestamp has to become DATETIME.
    case TimestampType => Some(JdbcType("DATETIME", java.sql.Types.TIMESTAMP))
    // TEXT is deprecated; NVARCHAR(MAX) holds strings of unlimited length.
    case StringType => Some(JdbcType("NVARCHAR(MAX)", java.sql.Types.NVARCHAR))
    // BIT takes no column width, so the default BIT(1) is rejected.
    case BooleanType => Some(JdbcType("BIT", java.sql.Types.BIT))
    case _ => None
  }
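
In case it helps with testing, the same mapping can also be applied from user
code without rebuilding Spark, by registering a custom dialect through the
public JdbcDialects API. A minimal sketch (the object name MsSqlServerDialect
and the URL prefix check are my own assumptions, not part of the pull request):

  import org.apache.spark.sql.jdbc.{JdbcDialect, JdbcDialects, JdbcType}
  import org.apache.spark.sql.types._

  // Hypothetical standalone dialect; the name and URL check are assumptions.
  object MsSqlServerDialect extends JdbcDialect {
    // Claim JDBC URLs that target SQL Server.
    override def canHandle(url: String): Boolean =
      url.startsWith("jdbc:sqlserver")

    override def getJDBCType(dt: DataType): Option[JdbcType] = dt match {
      case TimestampType => Some(JdbcType("DATETIME", java.sql.Types.TIMESTAMP))
      case StringType => Some(JdbcType("NVARCHAR(MAX)", java.sql.Types.NVARCHAR))
      case BooleanType => Some(JdbcType("BIT", java.sql.Types.BIT))
      case _ => None
    }
  }

  // Register once before calling df.write.jdbc(url, tablename, props).
  JdbcDialects.registerDialect(MsSqlServerDialect)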

Please consider including these two changes in your pull request as well. If
you prefer, I can create two new bug reports instead.

> SQL Server dialect needs to map java.sql.Timestamp to DATETIME instead of 
> TIMESTAMP
> -----------------------------------------------------------------------------------
>
>                 Key: SPARK-10856
>                 URL: https://issues.apache.org/jira/browse/SPARK-10856
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.4.0, 1.4.1, 1.5.0
>            Reporter: Henrik Behrens
>              Labels: patch
>
> When saving a DataFrame to MS SQL Server, an error is thrown if there is more 
> than one TIMESTAMP column:
> df.printSchema
> root
>  |-- Id: string (nullable = false)
>  |-- TypeInformation_CreatedBy: string (nullable = false)
>  |-- TypeInformation_ModifiedBy: string (nullable = true)
>  |-- TypeInformation_TypeStatus: integer (nullable = false)
>  |-- TypeInformation_CreatedAtDatabase: timestamp (nullable = false)
>  |-- TypeInformation_ModifiedAtDatabase: timestamp (nullable = true)
> df.write.mode("overwrite").jdbc(url, tablename, props)
> com.microsoft.sqlserver.jdbc.SQLServerException: A table can only have one 
> timestamp column. Because table 'DebtorTypeSet1' already has one, the column 
> 'TypeInformation_ModifiedAtDatabase' cannot be added.
>         at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDatabaseError(SQLServerException.java:217)
>         at com.microsoft.sqlserver.jdbc.SQLServerStatement.getNextResult(SQLServerStatement.java:1635)
>         at com.microsoft.sqlserver.jdbc.SQLServerPreparedStatement.doExecutePreparedStatement(SQLServerPreparedStatement.java:426)
>         at com.microsoft.sqlserver.jdbc.SQLServerPreparedStatement$PrepStmtExecCmd.doExecute(SQLServerPreparedStatement.java:372)
>         at com.microsoft.sqlserver.jdbc.TDSCommand.execute(IOBuffer.java:6276)
>         at com.microsoft.sqlserver.jdbc.SQLServerConnection.executeCommand(SQLServerConnection.java:1793)
>         at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeCommand(SQLServerStatement.java:184)
>         at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeStatement(SQLServerStatement.java:159)
>         at com.microsoft.sqlserver.jdbc.SQLServerPreparedStatement.executeUpdate(SQLServerPreparedStatement.java:315)
> I tested this on Windows and SQL Server 12 using Spark 1.4.1.
> I think this can be fixed in a similar way to SPARK-10419.
> As a reference, here is the type mapping according to the SQL Server JDBC 
> driver (basicDT.java, extracted from sqljdbc_4.2.6420.100_enu.exe):
>    private static void displayRow(String title, ResultSet rs) {
>       try {
>          System.out.println(title);
>          System.out.println(rs.getInt(1) + " , " +                 // SQL integer type.
>                rs.getString(2) + " , " +                           // SQL char type.
>                rs.getString(3) + " , " +                           // SQL varchar type.
>                rs.getBoolean(4) + " , " +                          // SQL bit type.
>                rs.getDouble(5) + " , " +                           // SQL decimal type.
>                rs.getDouble(6) + " , " +                           // SQL money type.
>                rs.getTimestamp(7) + " , " +                        // SQL datetime type.
>                rs.getDate(8) + " , " +                             // SQL date type.
>                rs.getTime(9) + " , " +                             // SQL time type.
>                rs.getTimestamp(10) + " , " +                       // SQL datetime2 type.
>                ((SQLServerResultSet) rs).getDateTimeOffset(11));   // SQL datetimeoffset type.
>       }
>       catch (Exception e) {
>          e.printStackTrace();
>       }
>    }


