Fwd: Spark for Oracle sample code

2015-09-25 Thread Cui Lin
Hello all, the JDBC connection examples I have found mostly read the whole table and then perform operations such as joins.
val jdbcDF = sqlContext.read.format("jdbc").options(Map("url" -> "jdbc:postgresql:dbserver", "dbtable" -> "schema.tablename")).load()
Sometimes it is not practical

Spark for Oracle sample code

2015-09-25 Thread Cui Lin
Hello all, the JDBC connection examples I have found mostly read the whole table and then perform operations such as joins.
val jdbcDF = sqlContext.read.format("jdbc").options(Map("url" -> "jdbc:postgresql:dbserver", "dbtable" -> "schema.tablename")).load()
Sometimes it is not practical

Re: Spark for Oracle sample code

2015-09-25 Thread Jonathan Yue
In your dbtable option you can put a "select ..." query instead of a table name. I have never tried it, but I have seen an example on the web.
Best regards, Jonathan
From: Cui Lin <icecreamlc...@gmail.com> To: user <user@spark.apache.org> Sent: Friday, September 25, 2015 4:12 PM Subject: S
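A minimal Scala sketch of the suggestion above, not tested against a live Oracle instance. The Spark SQL documentation notes that anything valid in a FROM clause can be given as dbtable, including a subquery in parentheses, so the select has to be wrapped before it is passed in. The object name JdbcSubquery, the alias t, the emp table and the connection URL are hypothetical placeholders, not taken from the thread. Oracle does not accept AS before a table alias, so the alias follows the closing parenthesis directly.

```scala
object JdbcSubquery {
  // Wrap a SELECT statement so it can stand in for a table name in a FROM clause.
  // A trailing semicolon is dropped because it is not valid inside a subquery.
  def asDbTable(query: String, alias: String = "t"): String = {
    val body = query.trim.stripSuffix(";")
    s"($body) $alias"
  }

  // Options map in the same shape as the one in the original post.
  def options(url: String, query: String): Map[String, String] =
    Map("url" -> url, "dbtable" -> asDbTable(query))
}

// In spark-shell, where sqlContext is predefined (hypothetical URL and query):
//   val opts = JdbcSubquery.options(
//     "jdbc:oracle:thin:@//dbhost:1521/service",
//     "select id, name from emp where dept = 10")
//   val jdbcDF = sqlContext.read.format("jdbc").options(opts).load()
```

With this shape the filter and column list are part of the query the database runs, so only the selected rows and columns should be sent to Spark.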

Re: Spark for Oracle sample code

2015-09-25 Thread Michael Armbrust
In most cases, predicates that you add to jdbcDF will be pushed down into Oracle, preventing the whole table from being sent over.
df.where("column = 1")
Another common pattern is to save the table to Parquet or a similar format for repeated querying.
Michael
On Fri, Sep 25, 2015 at 3:13 PM, Cui Lin
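A rough Scala sketch of the two patterns mentioned above, assuming the Spark 1.5 DataFrame API (sqlContext.read, DataFrame.where with a string condition, DataFrame.write.parquet). The object name OracleSlice, its method names and all arguments are hypothetical, and whether a given predicate is actually pushed down depends on the Spark version and on the predicate itself.

```scala
import org.apache.spark.sql.{DataFrame, SQLContext}

object OracleSlice {
  // Define the JDBC read, then filter. Nothing is fetched until an action runs,
  // which gives Spark the chance to push simple predicates to the database.
  def loadFiltered(sqlContext: SQLContext, url: String,
                   table: String, predicate: String): DataFrame =
    sqlContext.read.format("jdbc")
      .options(Map("url" -> url, "dbtable" -> table))
      .load()
      .where(predicate)

  // Write the filtered rows to Parquet once.
  def snapshot(df: DataFrame, path: String): Unit =
    df.write.parquet(path)

  // Serve repeat queries from the Parquet copy instead of the database.
  def reload(sqlContext: SQLContext, path: String): DataFrame =
    sqlContext.read.parquet(path)
}
```

In spark-shell this could be used as OracleSlice.loadFiltered(sqlContext, url, "schema.tablename", "column = 1"), followed by snapshot and reload with a path of your choosing. Calling explain() on the resulting DataFrame prints the physical plan, which is one way to look for evidence that the filter reached the JDBC scan.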