Re: Use of non-standard LIMIT keyword in JDBC tableExists code

2015-07-15 Thread Reynold Xin
Hi Bob,

Thanks for the email. You can select Spark as the project when you file a
JIRA ticket at https://issues.apache.org/jira/browse/SPARK



Regarding "select 1 from $table where 0 = 1": if the database's optimizer
doesn't do constant folding and short-circuit execution, could the query
end up scanning all the data?
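
Concretely, the probe in question would look something like this over
plain JDBC (a minimal sketch only; the names are placeholders, not the
actual JdbcUtils code):

    import java.sql.{Connection, SQLException}

    def tableExistsProbe(conn: Connection, table: String): Boolean = {
      // Standard SQL: returns an empty resultset when the table exists and
      // throws when it doesn't. The catch: if the optimizer doesn't
      // constant-fold "0 = 1" into an empty scan, the predicate may be
      // evaluated row by row, i.e. the whole table could be read just to
      // return nothing.
      val sql = s"SELECT 1 FROM $table WHERE 0 = 1"
      try {
        val stmt = conn.createStatement()
        try { stmt.executeQuery(sql); true }
        finally { stmt.close() }
      } catch {
        case _: SQLException => false
      }
    }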


On Wed, Jul 15, 2015 at 11:29 AM, Bob Beauchemin b...@sqlskills.com wrote:

 tableExists in
 spark/sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcUtils.scala
 uses non-standard SQL (specifically, the LIMIT keyword) to determine
 whether a table exists in a JDBC data source. This will cause an
 exception in the many/most JDBC databases that don't support the LIMIT
 keyword. See

 http://stackoverflow.com/questions/1528604/how-universal-is-the-limit-statement-in-sql

 To check for table existence (or get an exception), it could be recrafted
 around "select 1 from $table where 0 = 1". That isn't quite the same (it
 returns an empty resultset rather than the value '1'), but it would
 support more data sources and also work on empty tables. The standard way
 to check for existence would be to query information_schema.tables, which
 is part of the SQL standard but may not work for JDBC data sources that
 support SQL without exposing the information_schema.

 As an aside, I tried to submit a bug through JIRA, but using Create
 Bug-Quick or Bug-Detailed on the Spark issue-tracker webpage appears to
 file the bug against the Atlas project rather than the Spark project.

 Cheers, Bob







Use of non-standard LIMIT keyword in JDBC tableExists code

2015-07-15 Thread Bob Beauchemin
tableExists in
spark/sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcUtils.scala uses
non-standard SQL (specifically, the LIMIT keyword) to determine whether a
table exists in a JDBC data source. This will cause an exception in the
many/most JDBC databases that don't support the LIMIT keyword. See
http://stackoverflow.com/questions/1528604/how-universal-is-the-limit-statement-in-sql

To check for table existence (or get an exception), it could be recrafted
around "select 1 from $table where 0 = 1". That isn't quite the same (it
returns an empty resultset rather than the value '1'), but it would support
more data sources and also work on empty tables. The standard way to check
for existence would be to query information_schema.tables, which is part of
the SQL standard but may not work for JDBC data sources that support SQL
without exposing the information_schema.
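
The information_schema route could look roughly like this (a sketch only,
assuming an open java.sql.Connection and an unqualified table name; schema
qualification and identifier-case handling are glossed over):

    import java.sql.Connection

    def existsViaInformationSchema(conn: Connection, table: String): Boolean = {
      // information_schema.tables is part of the SQL standard, but not every
      // JDBC data source that speaks SQL actually exposes it.
      val ps = conn.prepareStatement(
        "SELECT 1 FROM information_schema.tables WHERE table_name = ?")
      try {
        ps.setString(1, table)
        val rs = ps.executeQuery()
        try rs.next() // a row comes back iff a matching table exists
        finally rs.close()
      } finally {
        ps.close()
      }
    }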

As an aside, I tried to submit a bug through JIRA, but using Create
Bug-Quick or Bug-Detailed on the Spark issue-tracker webpage appears to
file the bug against the Atlas project rather than the Spark project.

Cheers, Bob






Re: Use of non-standard LIMIT keyword in JDBC tableExists code

2015-07-15 Thread Bob Beauchemin
Granted, the "0 = 1" trick is ugly; it either assumes constant-folding
support or reads way too much data.

Submitted JIRA SPARK-9078 (thanks for the pointers) and expounded a bit
more on possible solutions there.
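
For completeness, one more dialect-free direction (illustrative only, and
not necessarily what's proposed in SPARK-9078) is to skip probe queries
entirely and ask the driver's metadata API, which every JDBC driver must
implement:

    import java.sql.Connection

    def existsViaMetadata(conn: Connection, table: String): Boolean = {
      // DatabaseMetaData.getTables returns a resultset of matching tables;
      // identifier-case handling varies by driver and is glossed over here.
      val rs = conn.getMetaData.getTables(null, null, table, Array("TABLE"))
      try rs.next() // non-empty iff a matching table was found
      finally rs.close()
    }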

Cheers, and thanks, Bob 


