Re: Spark SQL Custom Predicate Pushdown

2015-01-17 Thread Michael Armbrust
1) The fields in the SELECT clause are not pushed down to the predicate pushdown API. I have many optimizations that allow fields to be filtered out before the resulting object is serialized on the Accumulo tablet server. How can I get the selection information from the execution plan? I'm a…
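For reference, the selection information is exactly what the Spark 1.2 external data sources API hands to a relation that extends PrunedFilteredScan. Abridged from org.apache.spark.sql.sources (signatures as of 1.2.0):

    abstract class BaseRelation {
      def sqlContext: SQLContext
      def schema: StructType
    }

    // requiredColumns carries the pruned SELECT list; filters holds the WHERE
    // predicates that Spark SQL could translate into sources.Filter values.
    abstract class PrunedFilteredScan extends BaseRelation {
      def buildScan(requiredColumns: Array[String], filters: Array[Filter]): RDD[Row]
    }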

Re: Spark SQL Custom Predicate Pushdown

2015-01-17 Thread Corey Nolet
I see now. It optimizes the selection semantics so that fewer columns need to be included just to do a count(). Very nice. I did a collect() instead of a count() just to see what would happen, and it looks like all the expected select fields were propagated down as expected. Thanks.
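The difference is visible from the query side. A minimal sketch, with the table and column names assumed for illustration:

    // Same relation, different pruning: count() needs no column values, so
    // only the WHERE column survives; collect() materializes rows, so the
    // full SELECT list is pushed down.
    val results = sqlContext.sql("SELECT key, value FROM myTable WHERE key = 'k1'")
    results.count()   // buildScan sees requiredColumns = Array("key")
    results.collect() // buildScan sees requiredColumns = Array("key", "value")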

Re: Spark SQL Custom Predicate Pushdown

2015-01-17 Thread Corey Nolet
Michael, What I'm seeing (in Spark 1.2.0) is that the required columns being pushed down to the DataRelation are not the product of the SELECT clause but rather just the columns explicitly included in the WHERE clause. Examples from my testing: SELECT * FROM myTable -- The required columns are…
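One way to see exactly what arrives for each query shape is a stub relation that logs its arguments and returns nothing. A sketch under assumed names, with Spark 1.2-era imports (the type classes moved to org.apache.spark.sql.types in 1.3):

    import org.apache.spark.rdd.RDD
    import org.apache.spark.sql.{Row, SQLContext, StringType, StructField, StructType}
    import org.apache.spark.sql.sources.{Filter, PrunedFilteredScan}

    // Logs whatever the planner pushes down, then returns an empty scan.
    case class LoggingRelation(@transient val sqlContext: SQLContext)
      extends PrunedFilteredScan {

      override val schema = StructType(Seq(
        StructField("key", StringType, nullable = true),
        StructField("value", StringType, nullable = true)))

      override def buildScan(requiredColumns: Array[String],
                             filters: Array[Filter]): RDD[Row] = {
        println("requiredColumns = " + requiredColumns.mkString(", "))
        println("filters = " + filters.mkString(", "))
        sqlContext.sparkContext.parallelize(Seq.empty[Row])
      }
    }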

Re: Spark SQL Custom Predicate Pushdown

2015-01-17 Thread Michael Armbrust
How are you running your test here? Are you perhaps doing a .count()?

Re: Spark SQL Custom Predicate Pushdown

2015-01-17 Thread Corey Nolet
Quoting the original message (Corey Nolet, cjno...@gmail.com, sent Friday, January 16, 2015 1:51 PM, to user): I have document storage services in Accumulo that I'd like to expose to Spark SQL. I am able to push down predicate logic to Accumulo to have it perform only the seeks necessary…

Re: Spark SQL Custom Predicate Pushdown

2015-01-16 Thread Corey Nolet
Quoting the original message (Corey Nolet, cjno...@gmail.com, sent Friday, January 16, 2015 1:51 PM, to user): I have document storage services in Accumulo that I'd like to expose to Spark SQL. I am able to push down predicate logic to Accumulo to have it perform only the seeks…

Spark SQL Custom Predicate Pushdown

2015-01-15 Thread Corey Nolet
I have document storage services in Accumulo that I'd like to expose to Spark SQL. I am able to push down predicate logic to Accumulo to have it perform only the seeks necessary on each tablet server to grab the results being asked for. I'm interested in using Spark SQL to push those predicates…
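A skeleton of how such a source plugs into the 1.2 data sources API; every name here (package, option keys, relation class, scan wiring) is a placeholder, not the actual implementation:

    import org.apache.spark.rdd.RDD
    import org.apache.spark.sql.{Row, SQLContext, StructType}
    import org.apache.spark.sql.sources.{BaseRelation, Filter, PrunedFilteredScan, RelationProvider}

    // Entry point Spark SQL looks for when the source package is named in USING.
    class DefaultSource extends RelationProvider {
      override def createRelation(sqlContext: SQLContext,
                                  parameters: Map[String, String]): BaseRelation =
        AccumuloRelation(parameters("table"))(sqlContext)
    }

    // Hypothetical relation: translate filters into Accumulo seeks and
    // project only requiredColumns on the tablet servers.
    case class AccumuloRelation(table: String)(@transient val sqlContext: SQLContext)
      extends PrunedFilteredScan {
      override def schema: StructType = ??? // derived from the stored documents
      override def buildScan(requiredColumns: Array[String],
                             filters: Array[Filter]): RDD[Row] = ???
    }

Registered from SQL, that would look something like CREATE TEMPORARY TABLE docs USING com.example.accumulo OPTIONS (table 'myDocs'), after which SELECT and WHERE clauses against docs flow into buildScan.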

RE: Spark SQL Custom Predicate Pushdown

2015-01-15 Thread Cheng, Hao
/spark/blob/master/sql/core/src/test/scala/org/apache/spark/sql/sources
Quoting the original message (Corey Nolet, cjno...@gmail.com, sent Friday, January 16, 2015 1:51 PM, to user): I have document storage services in Accumulo that I'd like to expose to Spark SQL. I am…
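The suites in that directory exercise relations that interpret the pushed-down sources.Filter values themselves. A minimal sketch of that translation step (rows are plain Maps here only to keep the sketch self-contained; a real Accumulo relation would build ranges and server-side iterators instead):

    import org.apache.spark.sql.sources.{EqualTo, Filter, GreaterThan, LessThan}

    // Compile the Filters Spark SQL pushed down into row predicates.
    def compile(filter: Filter): Map[String, Int] => Boolean = filter match {
      case EqualTo(attr, value: Int)     => row => row(attr) == value
      case GreaterThan(attr, value: Int) => row => row(attr) > value
      case LessThan(attr, value: Int)    => row => row(attr) < value
      case _                             => _ => true // unhandled: keep the row
    }

    // WHERE a = 1 AND b > 2 arrives as two separate Filter values.
    val pushed: Seq[Filter] = Seq(EqualTo("a", 1), GreaterThan("b", 2))
    val keep = (row: Map[String, Int]) => pushed.map(compile).forall(_(row))

Returning extra rows from an unhandled filter is safe because Spark SQL re-evaluates the predicates above the scan; the pushdown is purely an optimization.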