[ https://issues.apache.org/jira/browse/SPARK-11261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen resolved SPARK-11261.
-------------------------------
    Resolution: Won't Fix

> Provide a more flexible alternative to Jdbc RDD
> -----------------------------------------------
>
>                 Key: SPARK-11261
>                 URL: https://issues.apache.org/jira/browse/SPARK-11261
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>            Reporter: Richard Marscher
>
> The existing JdbcRDD covers only a limited set of use cases, because it requires the query's semantics to operate on upper- and lower-bound predicates, e.g.:
> "select title, author from books where ? <= id and id <= ?"
> However, many use cases cannot be expressed this way, or become much less efficient when forced into it.
> For example, we have a MySQL table partitioned on a partition key. We have no range values to look up; instead we want to fetch all entries matching a predicate, with Spark running one query per RDD partition against each logical partition of the MySQL table, e.g.:
> "select * from devices where partition_id = ? and app_id = 'abcd'"
> Another use case is looking up a distinct set of identifiers that have no natural ordering, e.g.:
> "select * from users where user_id in (?,?,?,?,?,?,?)"
> The number of identifiers may be large and/or dynamic.
> Solution:
> Instead of addressing each use case with a new RDD type, provide an alternate, general-purpose RDD that gives the user direct control over how the query is partitioned across Spark and how the placeholders are filled in.
> The user should be able to control which placeholder values are available on each partition of the RDD and how they are bound to the PreparedStatement. Ideally it would also support dynamic placeholder values, such as inserting a set of values for an IN clause.
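For reference, a rough sketch of what such an RDD could look like follows. This is not an existing Spark API: the names FlexibleJdbcRDD, placeholderGroups, bindPlaceholders, and mapRow are hypothetical, and the sketch materializes each partition's rows in memory rather than streaming them the way JdbcRDD does. One RDD partition is created per group of placeholder values, and the caller decides how each group is bound to the PreparedStatement.

    import java.sql.{Connection, DriverManager, PreparedStatement, ResultSet}
    import scala.reflect.ClassTag

    import org.apache.spark.{Partition, SparkContext, TaskContext}
    import org.apache.spark.rdd.RDD

    // One partition per group of placeholder values supplied by the caller.
    private class JdbcPlaceholderPartition(val index: Int, val values: Seq[Any])
      extends Partition

    // Hypothetical general-purpose JDBC RDD sketching the proposal above.
    class FlexibleJdbcRDD[T: ClassTag](
        sc: SparkContext,
        getConnection: () => Connection,           // caller-supplied connection factory
        sql: String,                               // query with '?' placeholders
        placeholderGroups: Seq[Seq[Any]],          // one group of values per partition
        bindPlaceholders: (PreparedStatement, Seq[Any]) => Unit, // caller controls binding
        mapRow: ResultSet => T)                    // row mapper
      extends RDD[T](sc, Nil) {

      override def getPartitions: Array[Partition] =
        placeholderGroups.zipWithIndex.map { case (values, i) =>
          new JdbcPlaceholderPartition(i, values): Partition
        }.toArray

      override def compute(split: Partition, context: TaskContext): Iterator[T] = {
        val part = split.asInstanceOf[JdbcPlaceholderPartition]
        val conn = getConnection()
        val stmt = conn.prepareStatement(sql)
        bindPlaceholders(stmt, part.values)        // the caller decides how '?' are filled
        val rs = stmt.executeQuery()

        // Simplification: collect the partition's rows, then close JDBC resources.
        val rows = scala.collection.mutable.ArrayBuffer.empty[T]
        while (rs.next()) rows += mapRow(rs)
        rs.close(); stmt.close(); conn.close()
        rows.iterator
      }
    }

A hypothetical usage for the "one query per logical MySQL partition" case described above (the connection URL, column name, and the existing SparkContext sc are assumed for illustration):

    val partitionIds = (0 until 16).map(id => Seq(id))   // one RDD partition per MySQL partition
    val devices = new FlexibleJdbcRDD[String](
      sc,
      () => DriverManager.getConnection("jdbc:mysql://host/db", "user", "pass"),
      "select * from devices where partition_id = ? and app_id = 'abcd'",
      partitionIds,
      (stmt, values) => stmt.setInt(1, values.head.asInstanceOf[Int]),
      rs => rs.getString("device_id"))

The IN-clause case would follow the same pattern: pass each distinct chunk of identifiers as one placeholder group and have bindPlaceholders set one statement parameter per identifier.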