[GitHub] spark pull request: [SPARK-2710] [SQL] Build SchemaRDD from a Jdbc...

chutium Tue, 26 Aug 2014 03:25:02 -0700

Github user chutium commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1612#discussion_r16705825
  
    --- Diff: core/src/main/scala/org/apache/spark/rdd/JdbcRDD.scala ---
    @@ -81,8 +113,14 @@ class JdbcRDD[T: ClassTag](
           logInfo("statement fetch size set to: " + stmt.getFetchSize + " to force MySQL streaming ")
         }
     
    -    stmt.setLong(1, part.lower)
    -    stmt.setLong(2, part.upper)
    +    val parameterCount = stmt.getParameterMetaData.getParameterCount
    +    if (parameterCount > 0) {
    --- End diff --
    
    I am afraid they do not consider it a problem. The original doc comment of JdbcRDD says:
    ```
     * @param sql the text of the query.
     *   The query must contain two ? placeholders for parameters used to partition the results.
     *   E.g. "select title, author from books where ? <= id and id <= ?"
    ```
    
    But I believe many users simply want to pull a whole table out of an RDBMS
    and then do their calculations in Spark. In that common use case the tables
    stored in the RDBMS are small and the number of partitions created does not
    matter, so the two ? placeholders for partitioning are not always necessary.
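    
    A minimal sketch of the idea, assuming a hypothetical helper `bindBounds`
    (this is not the code in the pull request): bind the partition bounds only
    when the query actually declares the two ? placeholders, and otherwise run
    the query as written. Rejecting any other placeholder count is my own
    assumption, and driver support for `getParameterMetaData` may vary, so
    treat this as an illustration only.
    
    ```scala
    import java.sql.PreparedStatement

    object PartitionBinding {
      // Hypothetical helper: returns true if the bounds were bound,
      // false if the query has no placeholders and is left as written.
      def bindBounds(stmt: PreparedStatement, lower: Long, upper: Long): Boolean =
        stmt.getParameterMetaData.getParameterCount match {
          case 0 =>
            false // e.g. "select title, author from books"
          case 2 =>
            stmt.setLong(1, lower)
            stmt.setLong(2, upper)
            true
          case n =>
            // Assumption: any other count is a malformed partitioning query.
            throw new IllegalArgumentException(
              s"expected 0 or 2 ? placeholders, found $n")
        }
    }
    ```
    
    With zero placeholders the whole result would be read as a single
    partition, which seems acceptable for the small tables described above.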


