[ 
https://issues.apache.org/jira/browse/SPARK-9078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14720747#comment-14720747
 ] 

Suresh Thalamati commented on SPARK-9078:
-----------------------------------------

@Bob, Reynold

I ran into the same issue when trying to write data frames into an existing 
table in DB2 database.
if you are not working on the fix, I would like to give it a try. 

Thank you for the analysis of the issue , my understanding is fix should do the 
following to address table exists problem with LIMIT syntax.:

-- Add table Exists method to the JdbcDialect interface, and allow dialects 
override the method as required for specific databases.
-- Default implementation of  table exists method should use 
DatabaseMetaData.getTables() to find if table exists. If that particular 
interface is not implement use  the query "select 1 from $table where 1=0".
-- Add table exist method that use LIMIT query to  MySQL , and Postgres 
dialects.

* Enhancing registering of dialect :  (I think this may have to be separate 
Jira to avoid confusion).

@Reynold : I am not understanding your comment on adding  option to pass 
through  the jdbc data source. If you can give an example that will be great. 

Are you referring to some thing like the following ?
 df.write.option("datasource.jdbc.dialects" 
"org.apache.DerbyDialect").jdbc("jdbc:deryby://<server:port/SAMPLE", "emp", 
properties)


> Use of non-standard LIMIT keyword in JDBC tableExists code
> ----------------------------------------------------------
>
>                 Key: SPARK-9078
>                 URL: https://issues.apache.org/jira/browse/SPARK-9078
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.3.1, 1.4.0
>            Reporter: Robert Beauchemin
>            Priority: Minor
>
> tableExists in  
> spark/sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcUtils.scala uses 
> non-standard SQL (specifically, the LIMIT keyword) to determine whether a 
> table exists in a JDBC data source. This will cause an exception in many/most 
> JDBC databases that doesn't support LIMIT keyword. See 
> http://stackoverflow.com/questions/1528604/how-universal-is-the-limit-statement-in-sql
> To check for table existence or an exception, it could be recrafted around 
> "select 1 from $table where 0 = 1" which isn't the same (it returns an empty 
> resultset rather than the value '1'), but would support more data sources and 
> also support empty tables. Arguably ugly and possibly queries every row on 
> sources that don't support constant folding, but better than failing on JDBC 
> sources that don't support LIMIT. 
> Perhaps "supports LIMIT" could be a field in the JdbcDialect class for 
> databases that support keyword this to override. The ANSI standard is (OFFSET 
> and) FETCH. 
> The standard way to check for table existence would be to use 
> information_schema.tables which is a SQL standard but may not work for other 
> JDBC data sources that support SQL, but not the information_schema. The JDBC 
> DatabaseMetaData interface provides getSchemas()  that allows checking for 
> the information_schema in drivers that support it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

Reply via email to