Re: Nested "struct" function call creates a compilation error in Spark SQL

2017-06-15 Thread Michael Armbrust
You might also try with a newer version.  Several instances of code
generation failures have been fixed since 2.0.
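
If upgrading is not possible right away, one stopgap that may be worth
testing is to turn off whole-stage code generation, so that the
GeneratedIterator class named in the error is not produced at all. This is
an assumption, not something confirmed in this thread:
spark.sql.codegen.wholeStage is an internal Spark SQL setting, and disabling
it can cost performance.

```
# spark-defaults.conf entry (assumed stopgap, not verified against 2.0.2);
# the spark-submit equivalent is: --conf spark.sql.codegen.wholeStage=false
spark.sql.codegen.wholeStage  false
```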

On Thu, Jun 15, 2017 at 1:15 PM, Olivier Girardot <
o.girar...@lateral-thoughts.com> wrote:

> Hi Michael,
> Spark 2.0.2, but I actually have a very interesting test case.
> The optimiser seems to be at fault in a way. I've attached to this email the
> explain output for 2 levels of struct mutation and for 5 levels.
> As you can see, the optimiser seems to be doing a lot more in the latter
> case.
> After further investigation, the code is not "failing" per se: Spark
> attempts whole-stage codegen, compilation of the generated code fails, and
> I think it then falls back to the non-codegen path.
>
> I'll try to create a simpler test case to reproduce this if I can. What do
> you think?
>
> Regards,
>
> Olivier.


Re: Nested "struct" function call creates a compilation error in Spark SQL

2017-06-15 Thread Michael Armbrust
Which version of Spark?  If it's recent I'd open a JIRA.


Nested "struct" function call creates a compilation error in Spark SQL

2017-06-15 Thread Olivier Girardot
Hi everyone,
When we nest calls to "struct" (up to 5 levels) to extend a complex data
structure, we end up with the following compilation error:

org.codehaus.janino.JaninoRuntimeException: Code of method
"(I[Lscala/collection/Iterator;)V" of class
"org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator"
grows beyond 64 KB

The CreateStruct code itself properly uses the ctx.splitExpression
command, but the "end result" of df.select(struct(struct(struct())))
ends up being too large.
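
The shape of expression described above can be sketched as follows. This is
only an illustration: the helper name nested_struct_expr and the column names
a and b are hypothetical, the real job's schema is not shown in this thread,
and the sketch is not claimed to reproduce the 64 KB error. It builds a Spark
SQL expression string in plain Python so the nesting depth is easy to vary.

```python
def nested_struct_expr(leaf_columns, depth):
    """Build a Spark SQL expression that nests struct() `depth` levels deep.

    Hypothetical helper for illustration; column names are assumptions.
    """
    if depth < 1:
        raise ValueError("depth must be >= 1")
    fields = ", ".join(leaf_columns)
    # Innermost level: a struct over the leaf columns only.
    expr = f"struct({fields})"
    for _ in range(depth - 1):
        # Each outer level wraps the previous struct and adds the leaf
        # columns again, mimicking "extending" a nested structure.
        expr = f"struct({expr}, {fields})"
    return expr


two_levels = nested_struct_expr(["a", "b"], 2)
five_levels = nested_struct_expr(["a", "b"], 5)
print(two_levels)
print(five_levels)
```

Assuming a DataFrame df with columns a and b, the result could be passed to
df.selectExpr(five_levels + " AS nested") and compared against the two-level
variant with df.explain().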

Should I open a JIRA or is there a workaround ?

Regards,

-- 
*Olivier Girardot* | Associé
o.girar...@lateral-thoughts.com