I understand that the following are equivalent
df.filter('account === "acct1")
sql("select * from tempTableName where account = 'acct1'")
But is Spark SQL "smart" enough to also push the filter predicate down for the
initial load?
e.g.
sqlContext.read.jdbc(…).filter('account === "acct1")
Is Spark "smart enough" to this for each partition?
‘select … where account= ‘acc1’ AND (partition where clause here)?
Or do I have to add it to each partition's where clause myself; otherwise will it
load the entire set and only then filter it in memory?
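(If it does not get pushed down, I assume the workaround is the read.jdbc overload that takes explicit per-partition predicates, something like the sketch below; the URL, table name, credentials, and id ranges are placeholders from my setup:)

import java.util.Properties

val props = new Properties()
props.setProperty("user", "username")       // placeholder credentials
props.setProperty("password", "password")

// One element per partition; the account filter is repeated in each clause
val predicates = Array(
  "account = 'acct1' AND id BETWEEN 0 AND 999999",
  "account = 'acct1' AND id BETWEEN 1000000 AND 1999999"
)

val df = sqlContext.read.jdbc(
  "jdbc:postgresql://host:5432/mydb",   // placeholder URL
  "accounts",                           // placeholder table
  predicates,
  props
)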