[GitHub] spark pull request: [SPARK-9078] [SQL] Allow jdbc dialects to over...

2015-09-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/8676


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9078] [SQL] Allow jdbc dialects to over...

2015-09-15 Thread yhuai
Github user yhuai commented on the pull request:

https://github.com/apache/spark/pull/8676#issuecomment-140608496
  
It has been merged to master.





[GitHub] spark pull request: [SPARK-9078] [SQL] Allow jdbc dialects to over...

2015-09-15 Thread vanzin
Github user vanzin commented on the pull request:

https://github.com/apache/spark/pull/8676#issuecomment-140607740
  
@rxin I reverted the patch that caused those.





[GitHub] spark pull request: [SPARK-9078] [SQL] Allow jdbc dialects to over...

2015-09-15 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/8676#issuecomment-140607734
  
I've merged this.






[GitHub] spark pull request: [SPARK-9078] [SQL] Allow jdbc dialects to over...

2015-09-15 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/8676#issuecomment-140607639
  
@vanzin do you know what's going on with the tests? 

[error] Execution of test test.org.apache.spark.sql.JavaApplySchemaSuite failed: java.lang.ClassNotFoundException: org.apache.spark.deploy.yarn.ExtendedYarnTest






[GitHub] spark pull request: [SPARK-9078] [SQL] Allow jdbc dialects to over...

2015-09-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8676#issuecomment-140578273
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-9078] [SQL] Allow jdbc dialects to over...

2015-09-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8676#issuecomment-140578275
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/42504/





[GitHub] spark pull request: [SPARK-9078] [SQL] Allow jdbc dialects to over...

2015-09-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8676#issuecomment-140578152
  
[Test build #42504 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/42504/console) for PR 8676 at commit [`ee7b842`](https://github.com/apache/spark/commit/ee7b8426c0cdadeecb2f0f07d4f62024daefed19).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-9078] [SQL] Allow jdbc dialects to over...

2015-09-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8676#issuecomment-140545384
  
[Test build #42504 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/42504/consoleFull) for PR 8676 at commit [`ee7b842`](https://github.com/apache/spark/commit/ee7b8426c0cdadeecb2f0f07d4f62024daefed19).





[GitHub] spark pull request: [SPARK-9078] [SQL] Allow jdbc dialects to over...

2015-09-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8676#issuecomment-140541508
  
Merged build started.





[GitHub] spark pull request: [SPARK-9078] [SQL] Allow jdbc dialects to over...

2015-09-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8676#issuecomment-140541446
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-9078] [SQL] Allow jdbc dialects to over...

2015-09-15 Thread vanzin
Github user vanzin commented on the pull request:

https://github.com/apache/spark/pull/8676#issuecomment-140539473
  
retest this please





[GitHub] spark pull request: [SPARK-9078] [SQL] Allow jdbc dialects to over...

2015-09-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8676#issuecomment-140526528
  
[Test build #1760 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1760/console) for PR 8676 at commit [`ee7b842`](https://github.com/apache/spark/commit/ee7b8426c0cdadeecb2f0f07d4f62024daefed19).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-9078] [SQL] Allow jdbc dialects to over...

2015-09-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8676#issuecomment-140520458
  
[Test build #1760 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1760/consoleFull) for PR 8676 at commit [`ee7b842`](https://github.com/apache/spark/commit/ee7b8426c0cdadeecb2f0f07d4f62024daefed19).





[GitHub] spark pull request: [SPARK-9078] [SQL] Allow jdbc dialects to over...

2015-09-15 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/8676#issuecomment-140520174
  
(Oops spoke too soon - I will merge after tests pass)





[GitHub] spark pull request: [SPARK-9078] [SQL] Allow jdbc dialects to over...

2015-09-15 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/8676#issuecomment-140520067
  
Thanks. Merging this in master.






[GitHub] spark pull request: [SPARK-9078] [SQL] Allow jdbc dialects to over...

2015-09-15 Thread sureshthalamati
Github user sureshthalamati commented on the pull request:

https://github.com/apache/spark/pull/8676#issuecomment-140519265
  
@rxin Thank you for reviewing the patch. Just to make sure, I tested without
the next() call on MySQL, Postgres, and DB2, and it worked fine. I have updated
the pull request.





[GitHub] spark pull request: [SPARK-9078] [SQL] Allow jdbc dialects to over...

2015-09-11 Thread sureshthalamati
Github user sureshthalamati commented on the pull request:

https://github.com/apache/spark/pull/8676#issuecomment-139611781
  
Typo in my previous comment: I meant when the query is where 1=0.





[GitHub] spark pull request: [SPARK-9078] [SQL] Allow jdbc dialects to over...

2015-09-11 Thread sureshthalamati
Github user sureshthalamati commented on the pull request:

https://github.com/apache/spark/pull/8676#issuecomment-139611398
  
next() will return false because the result set will be empty when the query is
where 1!=0. executeQuery() will throw an exception if the table is not found.
The next() call is not really required to find out whether the table exists.
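
The point above rests on how scala.util.Try behaves: isSuccess reports only whether the wrapped expression threw, not which value it returned. A minimal sketch using only the standard library; the object and helper names are illustrative stand-ins, not part of the patch, and no real JDBC connection is involved:

```scala
import scala.util.Try

object TrySemantics {
  // Stand-in for ResultSet.next() on an empty result set:
  // it returns false but does not throw.
  def emptyResultNext(): Boolean = false

  // Stand-in for executeQuery() against a missing table: it throws.
  def missingTableQuery(): Boolean =
    throw new java.sql.SQLException("table not found")

  // True: the body completed normally, even though it evaluated to false.
  val emptyTableCheck: Boolean = Try(emptyResultNext()).isSuccess

  // False: the body threw, so Try captured a Failure.
  val missingTableCheck: Boolean = Try(missingTableQuery()).isSuccess
}
```

Under these assumptions, the Boolean returned by next() never influences the outcome, which is consistent with the check working whether or not next() is called.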





[GitHub] spark pull request: [SPARK-9078] [SQL] Allow jdbc dialects to over...

2015-09-10 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/8676#discussion_r39230872
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala ---
@@ -42,10 +42,13 @@ object JdbcUtils extends Logging {
   /**
    * Returns true if the table already exists in the JDBC database.
    */
-  def tableExists(conn: Connection, table: String): Boolean = {
+  def tableExists(conn: Connection, url: String, table: String): Boolean = {
+    val dialect = JdbcDialects.get(url)
+
     // Somewhat hacky, but there isn't a good way to identify whether a table exists for all
-    // SQL database systems, considering "table" could also include the database name.
-    Try(conn.prepareStatement(s"SELECT 1 FROM $table LIMIT 1").executeQuery().next()).isSuccess
+    // SQL database systems using JDBC meta data calls, considering "table" could also include
+    // the database name. Query used to find table exists can be overriden by the dialects.
+    Try(conn.prepareStatement(dialect.getTableExistsQuery(table)).executeQuery().next()).isSuccess
--- End diff --

Will next() still return success if the query is where 1 = 0? There is no
result, is there?





[GitHub] spark pull request: [SPARK-9078] [SQL] Allow jdbc dialects to over...

2015-09-10 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/8676#discussion_r39230841
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala ---
@@ -88,6 +88,17 @@ abstract class JdbcDialect {
   def quoteIdentifier(colName: String): String = {
     s""""$colName""""
   }
+
+  /**
+   * Get the SQL query that should be used to find if the given table exists. Dialects can
+   * override this method to return a query that works best in a particular database.
+   * @param table  The name of the table.
+   * @return The SQL query to use for checking the table.
+   */
+  def getTableExistsQuery(table: String): String = {
+    s"SELECT * FROM $table WHERE 1=0"
--- End diff --

actually never mind we cannot quote it.





[GitHub] spark pull request: [SPARK-9078] [SQL] Allow jdbc dialects to over...

2015-09-10 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/8676#discussion_r39227358
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala ---
@@ -88,6 +88,17 @@ abstract class JdbcDialect {
   def quoteIdentifier(colName: String): String = {
     s""""$colName""""
   }
+
+  /**
+   * Get the SQL query that should be used to find if the given table exists. Dialects can
+   * override this method to return a query that works best in a particular database.
+   * @param table  The name of the table.
+   * @return The SQL query to use for checking the table.
+   */
+  def getTableExistsQuery(table: String): String = {
+    s"SELECT * FROM $table WHERE 1=0"
--- End diff --

maybe we should quote the table here actually





[GitHub] spark pull request: [SPARK-9078] [SQL] Allow jdbc dialects to over...

2015-09-10 Thread sureshthalamati
Github user sureshthalamati commented on the pull request:

https://github.com/apache/spark/pull/8676#issuecomment-139410536
  
@rxin

Even if Spark is running on JDK 1.7, customers using older versions of drivers
will run into an AbstractMethodError. I think requiring customers to use new
drivers that implement the getSchema() function would be unnecessary.

After implementing the current approach, I got curious about how the JDBC read
functionality finds the metadata, and learned that
org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD.resolveTable also uses
s"SELECT * FROM $table WHERE 1=0" to get column information.

An alternative approach is to add getMetadataQuery(table: String) to the
JdbcDialect interface, instead of getTableExistsQuery() as implemented in the
current pull request. It would help determine whether the table exists in the
write case and provide column type information in the read case. It might be a
millisecond slower on write calls for dialects that specify
"select 1 from $table limit 1" instead of "select * from $table limit 1". The
advantage is that one method on the interface will address both cases.

Any comments?
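
The alternative described above can be sketched roughly as follows. This is only an illustration under stated assumptions: MetadataDialectSketch is a simplified stand-in for JdbcDialect, getMetadataQuery is the proposed method and is not part of the merged patch, and the `run` function is a hypothetical placeholder for executing a query over JDBC and returning column names.

```scala
import scala.util.Try

// Simplified stand-in for Spark's JdbcDialect. getMetadataQuery is the
// proposed (hypothetical) method, not something the merged patch adds.
abstract class MetadataDialectSketch {
  def getMetadataQuery(table: String): String =
    s"SELECT * FROM $table WHERE 1=0"
}

object MetadataQuerySketch {
  // Hypothetical placeholder for running a query over JDBC and returning the
  // column names from the result set metadata. It throws if the query fails.
  type Runner = String => Seq[String]

  // Write path: the table exists if the metadata query runs without throwing.
  def tableExists(dialect: MetadataDialectSketch, table: String, run: Runner): Boolean =
    Try(run(dialect.getMetadataQuery(table))).isSuccess

  // Read path: the same query yields the column information.
  def resolveColumns(dialect: MetadataDialectSketch, table: String, run: Runner): Seq[String] =
    run(dialect.getMetadataQuery(table))
}
```

In this sketch a dialect overrides a single method and both the write and read paths pick up the change, which is the advantage the comment describes.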






[GitHub] spark pull request: [SPARK-9078] [SQL] Allow jdbc dialects to over...

2015-09-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8676#issuecomment-139097036
  
Can one of the admins verify this patch?





[GitHub] spark pull request: [SPARK-9078] [SQL] Allow jdbc dialects to over...

2015-09-09 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/8676#issuecomment-139097004
  
FWIW, we dropped JVM 1.6 support in Spark 1.5.






[GitHub] spark pull request: [SPARK-9078] [SQL] Allow jdbc dialects to over...

2015-09-09 Thread sureshthalamati
GitHub user sureshthalamati opened a pull request:

https://github.com/apache/spark/pull/8676

[SPARK-9078] [SQL] Allow jdbc dialects to override the query used to check 
the table.

The current implementation uses a query with a LIMIT clause to find out whether
a table already exists. This syntax works only in some database systems. This
patch changes the default query to one that is likely to work on most
databases, and adds a new method to the JdbcDialect abstract class to allow
dialects to override the default query.

I looked at using the JDBC metadata calls, but it turns out there is no common
way to find the current schema, catalog, etc. There is a new method,
Connection.getSchema(), but it is available only starting with JDK 1.7, and
existing JDBC drivers may not have implemented it. Another option was to use
the JDBC escape syntax for LIMIT, but I am not sure how well it is supported
across databases. After looking at all the JDBC metadata options, my conclusion
was that the most common way is to use a simple select query with 'where 1=0'
and allow dialects to customize it as needed.
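
As a rough illustration of the override mechanism described above, here is a minimal sketch. SketchDialect is a simplified stand-in for Spark's JdbcDialect that models only getTableExistsQuery, and LimitDialect is a hypothetical dialect invented for this example; neither is code from the patch.

```scala
// Simplified stand-in for Spark's JdbcDialect; only the method discussed
// in this pull request is modeled.
abstract class SketchDialect {
  // Default from the patch: returns no rows, but the query fails if the
  // table does not exist.
  def getTableExistsQuery(table: String): String =
    s"SELECT * FROM $table WHERE 1=0"
}

// Uses the default query unchanged.
object DefaultDialect extends SketchDialect

// Hypothetical dialect for a database where a LIMIT query is preferred.
object LimitDialect extends SketchDialect {
  override def getTableExistsQuery(table: String): String =
    s"SELECT 1 FROM $table LIMIT 1"
}
```

The caller asks whichever dialect matches the connection URL for its query, so database-specific syntax stays inside the dialect.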


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sureshthalamati/spark table_exists_spark-9078

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/8676.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #8676


commit d4787548cc0ec9408c36deaa64443554d19e7f5f
Author: sureshthalamati 
Date:   2015-09-10T01:35:24Z

Modifying query to check table exists to be more generic, and allow dialect 
implementations to specify the query.



