When attempting to write a DataFrame to SQL Server that contains java.sql.Timestamp or java.lang.Boolean columns, I get errors saying the generated query is invalid. Specifically, java.sql.Timestamp values are written as the TIMESTAMP type, which is not a date/time storage type in T-SQL; they should be DATETIME or DATETIME2.
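For illustration, here is a minimal sketch of the kind of type override I would expect Spark to apply for SQL Server. The names (TSQL_TYPE_OVERRIDES, tsql_column_type) are hypothetical, not actual Spark API; the real fix would presumably live in a JDBC dialect inside Spark:

```python
# Hypothetical sketch of a SQL Server type mapping; not actual Spark API.
# In T-SQL, TIMESTAMP is a rowversion, not a date/time type, and BIT takes
# no width specifier, so both generic JDBC mappings need overriding.

TSQL_TYPE_OVERRIDES = {
    "TimestampType": "DATETIME2",
    "BooleanType": "BIT",
}

def tsql_column_type(spark_type_name, generic_jdbc_type):
    """Return the T-SQL column type for a Spark SQL type, overriding the
    generic JDBC mapping where SQL Server disagrees with it."""
    return TSQL_TYPE_OVERRIDES.get(spark_type_name, generic_jdbc_type)
```

With a mapping like this, a TimestampType column would become DATETIME2 instead of the invalid TIMESTAMP, a BooleanType column would become a plain BIT with no width, and everything else (strings, ints) would keep its generic mapping, which already works.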
java.lang.Boolean errors out because Spark tries to specify a width for the BIT field, which SQL Server doesn't accept. However, I can write strings/varchars and ints without any issues.

________________________________________
From: Davies Liu [dav...@databricks.com]
Sent: Monday, July 20, 2015 9:08 AM
To: Young, Matthew T
Cc: user@spark.apache.org
Subject: Re: Spark and SQL Server

Sorry for the confusion. What are the other issues?

On Mon, Jul 20, 2015 at 8:26 AM, Young, Matthew T <matthew.t.yo...@intel.com> wrote:
> Thanks Davies, that resolves the issue with Python.
>
> I was using the Java/Scala DataFrame documentation
> <https://spark.apache.org/docs/1.4.0/api/java/org/apache/spark/sql/DataFrameWriter.html>
> and assuming that it was the same for PySpark
> <http://spark.apache.org/docs/1.4.0/api/python/pyspark.sql.html#pyspark.sql.DataFrameWriter>.
> I will keep this distinction in mind going forward.
>
> I guess we have to wait for Microsoft to release a SQL Server connector for
> Spark to resolve the other issues.
>
> Cheers,
>
> -- Matthew Young
>
> ________________________________________
> From: Davies Liu [dav...@databricks.com]
> Sent: Saturday, July 18, 2015 12:45 AM
> To: Young, Matthew T
> Cc: user@spark.apache.org
> Subject: Re: Spark and SQL Server
>
> I think you have a mistake in your call to jdbc(); it should be:
>
> jdbc(self, url, table, mode, properties)
>
> You had passed properties as the third parameter.
>
> On Fri, Jul 17, 2015 at 10:15 AM, Young, Matthew T
> <matthew.t.yo...@intel.com> wrote:
>> Hello,
>>
>> I am testing Spark interoperation with SQL Server via JDBC with Microsoft's
>> 4.2 JDBC Driver. Reading from the database works fine, but I have
>> encountered a couple of issues writing back. In Scala 2.10 I can write
>> back to the database except for a couple of types.
>>
>> 1. When I read a DataFrame from a table that contains a datetime column,
>> it comes in as a java.sql.Timestamp object in the DataFrame.
>> This is alright for Spark purposes, but when I go to write this back to
>> the database with df.write.jdbc(...) it errors out because Spark tries to
>> write the TIMESTAMP type to SQL Server, which is not a date/time storage
>> type in T-SQL. I think it should be writing a DATETIME, but I'm not sure
>> how to tell Spark this.
>>
>> 2. A related misunderstanding happens when I try to write a
>> java.lang.Boolean to the database; it errors out because Spark tries to
>> specify the width of the BIT type, which is illegal in SQL Server (error
>> msg: Cannot specify a column width on data type bit). Do I need to edit
>> the Spark source to fix this behavior, or is there a configuration option
>> somewhere that I am not aware of?
>>
>> When I attempt to write back to SQL Server from an IPython notebook, py4j
>> seems unable to convert a Python dict into a Java HashMap, which is
>> necessary for parameter passing. I've documented the details of this
>> problem, with code examples, here:
>> <http://stackoverflow.com/questions/31417653/java-util-hashmap-missing-in-pyspark-session>
>> Any advice would be appreciated.
>>
>> Thank you for your time,
>>
>> -- Matthew Young
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>> For additional commands, e-mail: user-h...@spark.apache.org
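To make the argument-order mistake Davies describes concrete, here is a stand-in mock (not actual PySpark) with the same parameter order the thread quotes for DataFrameWriter.jdbc, showing how passing the properties dict as the third positional argument silently binds it to mode:

```python
# Mock with the parameter order quoted in the thread for PySpark 1.4's
# DataFrameWriter.jdbc: jdbc(self, url, table, mode, properties).
# This is a stand-in for illustration, not the real Spark method; it just
# reports how the arguments were bound.

def jdbc(url, table, mode=None, properties=None):
    return {"url": url, "table": table, "mode": mode,
            "properties": properties or {}}

props = {"user": "spark", "password": "secret"}  # placeholder credentials

# Mistake: properties passed third, so it lands in `mode`.
wrong = jdbc("jdbc:sqlserver://host;database=db", "dbo.SomeTable", props)

# Fix: pass properties by keyword (or as the fourth argument).
right = jdbc("jdbc:sqlserver://host;database=db", "dbo.SomeTable",
             mode="append", properties=props)
```

In the wrong call the credentials dict ends up as the write mode and no connection properties reach the driver, which matches the "mistake on call jdbc()" Davies points out.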