The compilation happens in parallel on all of the machines, so it's not really clear that generating the bytecode on the driver and shipping it would be a win from a latency perspective. Honestly, though, I just took the easiest path, which didn't require more machinery for extracting and shipping bytecode.
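[For context, a minimal sketch of the mechanism being discussed: compiling generated Scala source at runtime with the reflection ToolBox, which is what each executor does locally. This is only an illustration under that assumption, not Spark's actual CodeGenerator code.]

```scala
// Minimal sketch: runtime compilation with the Scala reflection ToolBox.
// Illustrative only; not Spark's actual codegen path.
// Requires scala-reflect and scala-compiler on the classpath.
import scala.reflect.runtime.currentMirror
import scala.tools.reflect.ToolBox

object ToolBoxExample {
  def main(args: Array[String]): Unit = {
    val toolBox = currentMirror.mkToolBox()

    // Source for a generated "projection": a function from an input value to
    // a transformed value. Spark generates similar source from an expression tree.
    val source = "(x: Int) => x * 2 + 1"

    // parse() builds an AST from the source; compile() returns a thunk that,
    // when invoked, yields the compiled value (here, the function itself).
    val tree = toolBox.parse(source)
    val compiled = toolBox.compile(tree)().asInstanceOf[Int => Int]

    println(compiled(20)) // prints 41
  }
}
```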
On Mon, Apr 6, 2015 at 3:07 PM, Akshat Aranya <aara...@gmail.com> wrote:

> Thanks for the info, Michael. Is there a reason to do so, as opposed to
> shipping out the bytecode and loading it via the classloader? Is it more
> complex? I can imagine caching to be effective for repeated queries, but not
> when the subsequent queries are different.
>
> On Mon, Apr 6, 2015 at 2:41 PM, Michael Armbrust <mich...@databricks.com> wrote:
>
>> It is generated and cached on each of the executors.
>>
>> On Mon, Apr 6, 2015 at 2:32 PM, Akshat Aranya <aara...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I'm curious as to how Spark does code generation for SQL queries.
>>>
>>> Following through the code, I saw that an expression is parsed and
>>> compiled into a class using the Scala reflection toolbox. However, it's
>>> unclear to me whether the actual bytecode is generated on the master or on
>>> each of the executors. If it is generated on the master, how is the bytecode
>>> shipped out to the executors?
>>>
>>> Thanks,
>>> Akshat
>>>
>>> https://databricks.com/blog/2014/06/02/exciting-performance-improvements-on-the-horizon-for-spark-sql.html
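[To illustrate the "generated and cached on each of the executors" point above, here is a hedged sketch of a per-JVM cache keyed by the generated source. The object and method names (`GeneratedCodeCache`, `compileOnce`) are made up for illustration; this is not Spark's actual implementation.]

```scala
// Illustrative per-JVM cache of compiled generated code, keyed by source text.
// Hypothetical names; not Spark's actual CodeGenerator.
import scala.collection.concurrent.TrieMap
import scala.reflect.runtime.currentMirror
import scala.tools.reflect.ToolBox

object GeneratedCodeCache {
  // One ToolBox and one cache per JVM: this object is a singleton within a
  // JVM, so each executor builds and caches its own compiled code.
  private lazy val toolBox = currentMirror.mkToolBox()
  private val cache = TrieMap.empty[String, Any]

  // Compile the generated source once per JVM; later calls with the same
  // source (e.g. the same query run again) reuse the cached result.
  def compileOnce(source: String): Any =
    cache.getOrElseUpdate(source, toolBox.compile(toolBox.parse(source))())
}
```

[Because the cache is JVM-local, each executor pays the compilation cost once per distinct generated source, which is why repeated queries benefit from caching while each new query shape still compiles fresh.]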