yes, that's exactly what I was looking for, thanks for the pointer ;-)

On Thu, Jul 28, 2016 at 1:07 AM, Takeshi Yamamuro <linguin....@gmail.com> wrote:
> Hi,
>
> Have you seen this ticket?
> https://issues.apache.org/jira/browse/SPARK-12449
>
> // maropu
>
> On Thu, Jul 28, 2016 at 2:13 AM, Timothy Potter <thelabd...@gmail.com> wrote:
>>
>> I'm not looking for a one-off solution for a specific query that can
>> be solved on the client side as you suggest, but rather a generic
>> solution that can be implemented within the DataSource impl itself
>> when it knows a sub-query can be pushed down into the engine. In other
>> words, I'd like to intercept the query planning process to be able to
>> push computation down into the engine when it makes sense.
>>
>> On Wed, Jul 27, 2016 at 8:04 AM, Marco Colombo
>> <ing.marco.colo...@gmail.com> wrote:
>> > Why don't you create a filtered DataFrame, register it as a temporary
>> > table, and then use it in your query? You can also cache it, if
>> > multiple queries on the same inner query are expected.
>> >
>> > On Wednesday, July 27, 2016, Timothy Potter <thelabd...@gmail.com> wrote:
>> >>
>> >> Take this simple join:
>> >>
>> >> SELECT m.title as title, solr.aggCount as aggCount FROM movies m INNER
>> >> JOIN (SELECT movie_id, COUNT(*) as aggCount FROM ratings WHERE rating
>> >> >= 4 GROUP BY movie_id ORDER BY aggCount desc LIMIT 10) as solr ON
>> >> solr.movie_id = m.movie_id ORDER BY aggCount DESC
>> >>
>> >> I would like the ability to push the inner sub-query aliased as "solr"
>> >> down into the data source engine, in this case Solr, as it will
>> >> greatly reduce the amount of data that has to be transferred from
>> >> Solr into Spark. I would imagine this issue comes up frequently if the
>> >> underlying engine is a JDBC data source as well ...
>> >>
>> >> Is this possible?
>> >> Of course, my example is a bit cherry-picked, so
>> >> determining whether a sub-query can be pushed down into the data source
>> >> engine is probably not a trivial task, but I'm wondering if Spark has
>> >> the hooks to allow me to try ;-)
>> >>
>> >> Cheers,
>> >> Tim
>> >>
>> >> ---------------------------------------------------------------------
>> >> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>> >
>> > --
>> > Ing. Marco Colombo
>
> --
> ---
> Takeshi Yamamuro
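[Editor's note: the pushdown Tim describes can be illustrated with a small runnable sketch. Spark and Solr aren't needed to show the idea; below, Python's built-in sqlite3 stands in for the external SQL-speaking engine (Solr or a JDBC source), and the table and column names (movies, ratings, aggCount) plus the sample data are taken from or invented around the query in the thread. The point is that the inner sub-query executes inside the engine, so only the top-N (movie_id, aggCount) pairs cross the wire instead of every ratings row.]

```python
import sqlite3

# sqlite3 stands in for the external engine (Solr / a JDBC source).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE movies (movie_id INTEGER, title TEXT)")
cur.execute("CREATE TABLE ratings (movie_id INTEGER, rating INTEGER)")
cur.executemany("INSERT INTO movies VALUES (?, ?)",
                [(1, "Alien"), (2, "Brazil"), (3, "Clue")])
cur.executemany("INSERT INTO ratings VALUES (?, ?)",
                [(1, 5), (1, 4), (2, 5), (2, 3), (3, 2)])

# Pushed-down form of the inner sub-query from the thread: the filter,
# aggregation, ordering, and LIMIT all run inside the engine, so only
# (at most) ten small rows leave it -- not every ratings row.
pushed = cur.execute("""
    SELECT movie_id, COUNT(*) AS aggCount FROM ratings
    WHERE rating >= 4 GROUP BY movie_id
    ORDER BY aggCount DESC LIMIT 10
""").fetchall()
print(pushed)  # top movies by count of ratings >= 4
conn.close()
```

Without pushdown, Spark would instead fetch the whole ratings table from the source and aggregate it itself; making the planner emit the pushed-down form generically, per data source, is roughly what SPARK-12449 proposes.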