Re: Use of non-standard LIMIT keyword in JDBC tableExists code
Hi Bob,

Thanks for the email. You can select Spark as the project when you file a JIRA ticket at https://issues.apache.org/jira/browse/SPARK

Regarding select 1 from $table where 0=1: if the database's optimizer doesn't do constant folding and short-circuit execution, could the query end up scanning all the data?

On Wed, Jul 15, 2015 at 11:29 AM, Bob Beauchemin b...@sqlskills.com wrote:
Use of non-standard LIMIT keyword in JDBC tableExists code
tableExists in spark/sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcUtils.scala uses non-standard SQL (specifically, the LIMIT keyword) to determine whether a table exists in a JDBC data source. This will cause an exception in many, if not most, JDBC databases, namely those that don't support the LIMIT keyword. See http://stackoverflow.com/questions/1528604/how-universal-is-the-limit-statement-in-sql

To check for table existence by whether the query succeeds or throws an exception, it could be recrafted around

select 1 from $table where 0 = 1

which isn't the same (it returns an empty resultset rather than the value '1'), but would support more data sources and also support empty tables.

The standard way to check for existence would be to query information_schema.tables, which is part of the SQL standard but may not work for JDBC data sources that support SQL but not the information_schema.

As an aside, I tried to submit a bug through JIRA, but using Create Bug-Quick or Bug-Detailed through the Spark issue-tracker webpage appears to file the bug against the Atlas project rather than the Spark project.

Cheers, Bob

--
View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Use-of-non-standard-LIMIT-keyword-in-JDBC-tableExists-code-tp13253.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.

To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org
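The probe query proposed in the message above can be sketched roughly as follows. This is only an illustration, not Spark's code: it uses Python's built-in sqlite3 module instead of JDBC so that it runs on its own, and the table_exists helper and the table names are hypothetical. SQLite happens to accept LIMIT, so the sketch does not reproduce the portability failure; it only shows that the "where 0 = 1" form returns an empty result for an existing table (empty or not) and raises an error for a missing one.

```python
import sqlite3


def table_exists(conn: sqlite3.Connection, table: str) -> bool:
    """Hypothetical helper: probe for a table with a query that returns no rows."""
    try:
        # "WHERE 0 = 1" is always false, so the result set is empty whether or
        # not the table holds rows; only a missing table makes the query fail.
        # The name is interpolated directly, which is unsafe for untrusted
        # input; real code would need to validate or quote it.
        conn.execute(f"SELECT 1 FROM {table} WHERE 0 = 1").fetchall()
        return True
    except sqlite3.OperationalError:
        # sqlite3 reports "no such table" as OperationalError. Other failures
        # can raise the same exception type, so this check is deliberately crude.
        return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE empty_t (id INTEGER)")
conn.execute("CREATE TABLE full_t (id INTEGER)")
conn.execute("INSERT INTO full_t VALUES (1)")

results = {name: table_exists(conn, name) for name in ("empty_t", "full_t", "missing_t")}
print(results)
```

Whether the probe is cheap on a large table still depends on the database recognising that the predicate is constant, which this sketch does not measure.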
Re: Use of non-standard LIMIT keyword in JDBC tableExists code
Granted, the 1=0 thing is ugly, and it either assumes constant-folding support or reads way too much data. I submitted JIRA SPARK-9078 (thanks for the pointers) and expounded on possible solutions a little more there.

Cheers, and thanks, Bob

--
View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Use-of-non-standard-LIMIT-keyword-in-JDBC-tableExists-code-tp13253p13265.html