How are you running your test here? Are you perhaps doing a .count()?

On Sat, Jan 17, 2015 at 12:54 PM, Corey Nolet <cjno...@gmail.com> wrote:
> Michael,
>
> What I'm seeing (in Spark 1.2.0) is that the required columns being pushed
> down to the DataRelation are not the product of the SELECT clause but
> rather just the columns explicitly included in the WHERE clause.
>
> Examples from my testing:
>
> SELECT * FROM myTable --> The required columns are empty.
> SELECT key1 FROM myTable --> The required columns are empty.
> SELECT * FROM myTable where key1 = 'val1' --> The required columns
> contain key1.
> SELECT key1,key2 FROM myTable where key1 = 'val1' --> The required columns
> contain key1.
> SELECT key1,key2 FROM myTable where key1 = 'val1' and key2 = 'val2' -->
> The required columns contain key1,key2.
>
> I created SPARK-5296 for the OR predicate to be pushed down in some
> capacity.
>
> On Sat, Jan 17, 2015 at 3:38 PM, Michael Armbrust <mich...@databricks.com>
> wrote:
>
>>> 1) The fields in the SELECT clause are not pushed down to the predicate
>>> pushdown API. I have many optimizations that allow fields to be filtered
>>> out before the resulting object is serialized on the Accumulo tablet
>>> server. How can I get the selection information from the execution plan?
>>> I'm a little hesitant to implement the data relation that allows me to see
>>> the logical plan because it's noted in the comments that it could change
>>> without warning.
>>>
>>
>> I'm not sure I understand. The list of required columns should be pushed
>> down to the data source. Are you looking for something more complicated?
>>
>>
>>> 2) I'm surprised to find that the predicate pushdown filters get
>>> completely removed when I do anything more complex in a where clause other
>>> than simple AND statements. Using an OR statement caused the filter array
>>> that was passed into the PrunedFilteredDataSource to be empty.
>>>
>>
>> This was just an initial cut at the set of predicates to push down. We
>> can add Or. Mind opening a JIRA?
>>
>
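
To make the .count() question concrete, here is a toy model in plain Python of how a planner could arrive at the column lists reported above. It is not Spark's planner code; the names Eq, And, Or and plan_scan are hypothetical, and the rules are assumptions: the required columns are taken to be whatever the final output needs plus whatever the WHERE clause references, only simple equality conjuncts are pushed down, and a count() is assumed to need no output columns at all.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple, Union


@dataclass(frozen=True)
class Eq:
    """Hypothetical predicate: column = 'value'."""
    column: str
    value: str


@dataclass(frozen=True)
class And:
    left: "Pred"
    right: "Pred"


@dataclass(frozen=True)
class Or:
    left: "Pred"
    right: "Pred"


Pred = Union[Eq, And, Or]


def split_conjuncts(pred: Optional[Pred]) -> List[Pred]:
    """Break a WHERE clause into its top-level AND-ed parts."""
    if pred is None:
        return []
    if isinstance(pred, And):
        return split_conjuncts(pred.left) + split_conjuncts(pred.right)
    return [pred]


def references(pred: Pred) -> List[str]:
    """Columns mentioned anywhere inside a predicate."""
    if isinstance(pred, Eq):
        return [pred.column]
    return references(pred.left) + references(pred.right)


def plan_scan(output_columns: Sequence[str],
              where: Optional[Pred]) -> Tuple[List[str], List[Eq]]:
    """Return (required_columns, pushed_filters) under the assumptions above.

    output_columns is what the consumer of the rows actually reads:
    the SELECT list for a collect(), but nothing at all for a count().
    """
    conjuncts = split_conjuncts(where)
    # Assumption: only plain equality conjuncts are pushed; Or is dropped.
    pushed = [p for p in conjuncts if isinstance(p, Eq)]
    filter_columns = [c for p in conjuncts for c in references(p)]
    # De-duplicate while keeping first-seen order.
    required = list(dict.fromkeys(list(output_columns) + filter_columns))
    return required, pushed


key1 = Eq("key1", "val1")
key2 = Eq("key2", "val2")

# SELECT key1,key2 ... WHERE key1 = 'val1', consumed by a count():
count_plan = plan_scan([], key1)
# The same query when the rows themselves are read:
collect_plan = plan_scan(["key1", "key2"], key1)
# WHERE key1 = 'val1' OR key2 = 'val2', consumed by a count():
or_plan = plan_scan([], Or(key1, key2))
```

Under these assumptions count_plan requires only key1, collect_plan requires key1 and key2, and or_plan pushes no filters, which is the pattern described in the quoted message. Whether that matches the actual test depends on how the query results were consumed.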