Hi Devs,

Sorry for bothering you with my questions (and concerns), but I really need to understand this piece of code (= my personal challenge :)).
Is it true that SparkPlan.doExecute (to "execute" a physical operator) is only used when whole-stage code generation is disabled (which it is not by default)? May I call this execution path traditional (even "old-fashioned")?

Is it true that these days SparkPlan.doProduce and SparkPlan.doConsume (and others) are used for "executing" a physical operator (i.e. to generate the Java source code), since whole-stage code generation is enabled by default and is currently the main execution path?

P.S. SparkPlan.doExecute is still used to trigger whole-stage code generation by WholeStageCodegenExec (and InputAdapter), but that's all the code that gets executed by doExecute, isn't it?

Pozdrawiam,
Jacek Laskowski
----
https://about.me/JacekLaskowski
Mastering Spark SQL https://bit.ly/mastering-spark-sql
Spark Structured Streaming https://bit.ly/spark-structured-streaming
Mastering Kafka Streams https://bit.ly/mastering-kafka-streams
Follow me at https://twitter.com/jaceklaskowski

On Fri, Sep 7, 2018 at 7:24 PM Jacek Laskowski <ja...@japila.pl> wrote:
> Hi Spark Devs,
>
> I really need your help understanding the relationship between
> HashAggregateExec, TungstenAggregationIterator and
> UnsafeFixedWidthAggregationMap.
>
> While exploring UnsafeFixedWidthAggregationMap and how it's used, I've
> noticed that it's used by HashAggregateExec and TungstenAggregationIterator
> exclusively. And given that TungstenAggregationIterator is used exclusively
> in HashAggregateExec, and the use of UnsafeFixedWidthAggregationMap in both
> seems to be almost the same (if not the same), I've got a question I cannot
> seem to answer myself.
>
> Since HashAggregateExec supports whole-stage code generation,
> HashAggregateExec.doExecute won't be used at all, but doConsume and
> doProduce will (unless codegen is disabled). Is that correct?
>
> If so, TungstenAggregationIterator is not used at all, but
> UnsafeFixedWidthAggregationMap is used directly instead (in the generated
> Java code that calls createHashMap or finishAggregate). Is that correct?
>
> Pozdrawiam,
> Jacek Laskowski
> ----
> https://about.me/JacekLaskowski
> Mastering Spark SQL https://bit.ly/mastering-spark-sql
> Spark Structured Streaming https://bit.ly/spark-structured-streaming
> Mastering Kafka Streams https://bit.ly/mastering-kafka-streams
> Follow me at https://twitter.com/jaceklaskowski
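P.P.S. To make sure I'm asking about the right mental model, here is a minimal, self-contained Scala sketch of how I picture the two paths. This is NOT Spark code — all the names (RangeOp, FilterOp, WholeStageOp, emit, isEven) are made up to mirror Spark's — it just contrasts the "traditional" path, where each operator's doExecute chains iterators, with the codegen-style path, where operators only contribute source fragments that one enclosing operator fuses into a single loop:

```scala
object CodegenAnalogy {

  sealed trait Op {
    // "Traditional" path: every operator evaluates itself, row by row.
    def doExecute(): Iterator[Int]
    // Codegen-style path: the operator only produces source text; nothing
    // runs until the whole stage is assembled by the enclosing operator.
    def doProduce(consume: String => String): String
  }

  final case class RangeOp(n: Int) extends Op {
    def doExecute(): Iterator[Int] = Iterator.range(0, n)
    // Produce a loop and hand each "row" (here just the loop variable)
    // to the downstream consume callback.
    def doProduce(consume: String => String): String =
      s"for (i <- 0 until $n) { ${consume("i")} }"
  }

  final case class FilterOp(child: Op, pred: Int => Boolean, predSrc: String)
      extends Op {
    def doExecute(): Iterator[Int] = child.doExecute().filter(pred)
    // Analogous to a doConsume: wrap the downstream consume call in a test,
    // so the filter is fused into the child's loop instead of iterating.
    def doProduce(consume: String => String): String =
      child.doProduce(row => s"if ($predSrc($row)) { ${consume(row)} }")
  }

  // Analogous to a WholeStageCodegenExec-like parent: its only job is to
  // trigger code generation for the whole subtree.
  final case class WholeStageOp(child: Op) {
    def generatedSource: String = child.doProduce(row => s"emit($row)")
  }

  def main(args: Array[String]): Unit = {
    val plan = FilterOp(RangeOp(5), _ % 2 == 0, "isEven")
    // Traditional path: actual evaluation via chained iterators.
    println(plan.doExecute().toList)              // List(0, 2, 4)
    // Codegen-style path: one fused loop, as source text.
    println(WholeStageOp(plan).generatedSource)
    // for (i <- 0 until 5) { if (isEven(i)) { emit(i) } }
  }
}
```

Is that, at least in spirit, how doExecute vs. doProduce/doConsume divide the work?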