Hi Ningjun,

I haven't done this myself, but your question made me curious, and I found
this article which you might find useful:
http://www.sparkexpert.com/2015/03/28/loading-database-data-into-spark-using-data-sources-api/

According to this article, you can pass your SQL statement in the "dbtable"
option as a parenthesized subquery, ie, something like:

val jdbcDF = sqlContext.read.format("jdbc")
  .options(Map(
    "url" -> "jdbc:postgresql:dbserver",
    "dbtable" -> "(select docid, title, docText from dbo.document where docid between 10 and 1000) as doc"))
  .load()

Note that the subquery generally needs an alias (the "as doc" above), since
the JDBC data source wraps whatever you put in "dbtable" in a SELECT of its
own.
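As a side note, if the result set is large you can also ask Spark to
parallelize the read across executors. This is just a sketch based on the
standard JDBC data source options (partitionColumn, lowerBound, upperBound,
numPartitions); the connection URL and table/column names are placeholders
adapted from your example, so adjust them for your SQL Server setup:

```scala
// Sketch: partitioned JDBC read, so each Spark task fetches a slice of the
// docid range instead of one executor pulling every row.
// "partitionColumn" must be a numeric column; lower/upper bounds only shape
// the partition stride, they do not filter rows.
val jdbcDF = sqlContext.read.format("jdbc")
  .options(Map(
    "url"             -> "jdbc:sqlserver://dbserver;databaseName=mydb", // placeholder URL
    "dbtable"         -> "(select docid, title, docText from dbo.document) as doc",
    "partitionColumn" -> "docid",
    "lowerBound"      -> "10",
    "upperBound"      -> "1000",
    "numPartitions"   -> "4"))
  .load()
```

This needs the SQL Server JDBC driver on the classpath (e.g. via --jars),
and of course a reachable database, so I haven't run this exact snippet.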

-sujit

On Mon, Dec 7, 2015 at 8:26 AM, Wang, Ningjun (LNG-NPV) <
ningjun.w...@lexisnexis.com> wrote:

> How can I create a RDD from a SQL query against SQLServer database? Here
> is the example of dataframe
>
>
>
> http://spark.apache.org/docs/latest/sql-programming-guide.html#overview
>
>
>
>
>
> val jdbcDF = sqlContext.read.format("jdbc").options(
>
>   Map("url" -> "jdbc:postgresql:dbserver",
>
>   "dbtable" -> "schema.tablename")).load()
>
>
>
> This code creates a dataframe from a table. How can I create a dataframe
> from a query, e.g. “select docid, title, docText from dbo.document where
> docid between 10 and 1000”?
>
>
>
> Ningjun
>
>
>
