-dev +user

No, lambda functions and other code are black boxes to Spark SQL. If you want those kinds of optimizations, you need to express the required columns in either SQL or the DataFrame DSL (coming in 1.3).

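To make that concrete, here is a rough sketch of the two options, written against the 1.3 DataFrame API. It is only an illustration under some assumptions: the `Person` case class, the sample rows, the local master, and the `wantNames` flag (standing in for `condition` in the quoted example) are all made up for this example. The point is that the column choice is made in SQL or in the DSL, where the optimizer can see it, rather than inside a `map` closure. How much work this actually saves depends on the data source; a columnar format such as Parquet can skip unread columns, while an in-memory RDD like the one below has nothing to prune at the source.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object ProjectionExample {
  // Hypothetical schema, used only for this sketch.
  case class Person(name: String, age: Int)

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("ProjectionExample").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    // Made-up sample data, registered under the table name used in the question.
    val people = sc.parallelize(Seq(Person("Ann", 15), Person("Bob", 30))).toDF()
    people.registerTempTable("people")

    // Hypothetical stand-in for `condition` in the original example.
    val wantNames = args.isEmpty

    // Option 1: put the needed column into the SQL text itself.
    val column = if (wantNames) "name" else "age"
    val viaSql = sqlContext.sql(
      s"SELECT $column FROM people WHERE age >= 13 AND age <= 19")

    // Option 2: express the same filter and projection in the DataFrame DSL.
    val teenagers = people.filter(people("age") >= 13 && people("age") <= 19)
    val viaDsl = if (wantNames) teenagers.select("name") else teenagers.select("age")

    // Either way the result has a single column, so index 0 is the value we asked for.
    viaSql.collect().foreach(row => println(row(0)))
    viaDsl.collect().foreach(row => println(row(0)))

    sc.stop()
  }
}
```

Calling `explain()` on either result prints the physical plan, which is a reasonable way to check what the optimizer did with the projection for your particular data source.
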
On Mon, Mar 2, 2015 at 1:55 AM, Wail <w.alkowail...@cces-kacst-mit.org> wrote:

> Dears,
>
> I'm just curious about the complexity of the query optimizer. Can the
> optimizer evaluate what comes after the SQL? Maybe it's a stupid question, but
> here is an example to show the case:
>
> From the Spark SQL example:
>
> val teenagers = sqlContext.sql("SELECT * FROM people WHERE age >= 13 AND age <= 19")
>
> if (condition)
> {
>   teenagers.map(t => "Name: " + t(0)).collect().foreach(println)
> }
> else
> {
>   teenagers.map(t => "Age: " + t(1)).collect().foreach(println)
> }
>
> For instance, is the optimizer aware that I need only one column, and does it
> push down the projection to bring in only that one?
>
> Thanks!
>
> --
> View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Is-SparkSQL-optimizer-aware-of-the-needed-data-after-the-query-tp10835.html
> Sent from the Apache Spark Developers List mailing list archive at Nabble.com.