Re: DataFrames initial jdbc loading - will it be utilizing a filter predicate?

Zhan Zhang Wed, 18 Nov 2015 13:51:52 -0800

For the query quoted below, the filter 'account === "acct1" is pushed down 
to the JDBC source, so the generated query includes "where account = 'acct1'".
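
One way to check this on your own setup is to look at the plan Spark prints for the filtered DataFrame. Below is a minimal sketch, assuming the Spark 1.x sqlContext API used in this thread. The JDBC URL, table name, credentials, partition column and bounds are all made-up placeholders, and how pushed filters appear in the explain output differs between Spark versions.

```scala
import java.util.Properties
import org.apache.spark.sql.{DataFrame, SQLContext}

object JdbcPushdownCheck {
  // Every connection detail below is a hypothetical placeholder.
  def filteredAccounts(sqlContext: SQLContext): DataFrame = {
    import sqlContext.implicits._ // enables the 'account symbol-to-column syntax

    val props = new Properties()
    props.setProperty("user", "spark_user")  // hypothetical credentials
    props.setProperty("password", "secret")

    // Partitioned JDBC read: Spark derives one range condition per partition
    // from the partition column and the lower/upper bounds.
    val df = sqlContext.read.jdbc(
      "jdbc:postgresql://dbhost:5432/sales", // hypothetical URL
      "accounts",                            // hypothetical table
      "id",                                  // partition column, assumed numeric
      1L,                                    // lowerBound
      1000000L,                              // upperBound
      8,                                     // numPartitions
      props)

    val filtered = df.filter('account === "acct1")

    // Prints the logical and physical plans; inspect the JDBC scan node to see
    // whether the account filter was handed to the data source.
    filtered.explain(true)
    filtered
  }
}
```

Running this needs a reachable database and the matching JDBC driver on the classpath. Logging the statements on the database side is another way to confirm what is actually sent.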


Thanks.

Zhan Zhang

On Nov 18, 2015, at 11:36 AM, Eran Medan <eran.me...@gmail.com> wrote:

I understand that the following are equivalent

    df.filter('account === "acct1")

    sql("select * from tempTableName where account = 'acct1'")


But is Spark SQL smart enough to also push filter predicates down for the 
initial load?

e.g.
    sqlContext.read.jdbc(…).filter('account === "acct1")

Is Spark smart enough to do this for each partition?

    select … where account = 'acct1' AND (partition where clause here)

Or do I have to add it to each partition's where clause myself, because 
otherwise Spark will load the entire set and only then filter it in memory?
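
To make the per-partition query shape in the question concrete, here is a small plain-Scala sketch of a pushed-down filter being ANDed with each partition's range condition. This is an illustration only, not Spark's code. The object, method, table and column names are hypothetical.

```scala
object PartitionedQuerySketch {
  // WHERE clause for one partition: the pushed-down filter (if any)
  // ANDed with that partition's own range condition (if any).
  def whereClause(pushedFilter: Option[String],
                  partitionClause: Option[String]): String =
    (pushedFilter.toList ++ partitionClause.toList) match {
      case Nil   => ""
      case parts => parts.mkString("WHERE ", " AND ", "")
    }

  // Full query text for one partition of the (hypothetical) table.
  def partitionQuery(table: String,
                     pushedFilter: Option[String],
                     partitionClause: Option[String]): String =
    s"SELECT * FROM $table ${whereClause(pushedFilter, partitionClause)}".trim

  def main(args: Array[String]): Unit = {
    // Two hypothetical partitions split on a numeric id column.
    val partitions = Seq("id < 500", "id >= 500")
    partitions.foreach { p =>
      println(partitionQuery("accounts", Some("account = 'acct1'"), Some(p)))
    }
  }
}
```

With the filter present, every partition's query carries both conditions, so each partition fetches only the matching rows instead of the full range.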
