You might also try with a newer version. Several instances of code generation failures have been fixed since 2.0.
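
For comparing versions, a minimal sketch of a smaller test case, assuming PySpark 2.x and a local session. The helper names `nested_struct_expr` and `run_repro` are hypothetical, and the field names `inner` and `extra_N` are made up for illustration. The expression only mimics the shape of `struct(struct(struct(...)))`; a struct this small may well stay under the 64 KB limit, so the real schema would have to be substituted to hit the error. Setting `spark.sql.codegen.wholeStage` to `false` skips whole-stage codegen, which may be worth trying as a workaround on 2.0.x.

```python
def nested_struct_expr(depth, column="id"):
    """Build a Spark SQL expression that nests named_struct `depth` times.

    The field names ('inner', 'extra_N') are illustrative only.
    """
    expr = column
    for level in range(1, depth + 1):
        # Wrap the previous expression in a new struct and add one more field.
        expr = "named_struct('inner', {0}, 'extra_{1}', {2})".format(
            expr, level, column
        )
    return expr


def run_repro(depth=5, whole_stage_codegen=True):
    """Run the nested select on a local SparkSession (requires pyspark)."""
    # Imported here so the expression builder can be used without Spark.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.master("local[1]")
        .appName("nested-struct-repro")
        .getOrCreate()
    )
    # Assumption: turning this off avoids compiling the generated iterator.
    spark.conf.set(
        "spark.sql.codegen.wholeStage", str(whole_stage_codegen).lower()
    )
    df = spark.range(10).selectExpr(nested_struct_expr(depth) + " AS s")
    df.explain(True)  # compare the plans at depth 2 and depth 5
    return df.collect()
```

Calling `run_repro(2)` and `run_repro(5)` prints the two plans for comparison, and `run_repro(5, whole_stage_codegen=False)` runs the same query with whole-stage codegen disabled.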

On Thu, Jun 15, 2017 at 1:15 PM, Olivier Girardot <o.girar...@lateral-thoughts.com> wrote:

> Hi Michael,
> Spark 2.0.2 - but I have a very interesting test case actually.
> The optimiser seems to be at fault in a way. I've attached to this email the
> explain output when I limit myself to 2 levels of struct mutation and when it
> goes to 5.
> As you can see, the optimiser seems to be doing a lot more in the latter
> case.
> After further investigation, the code is not "failing" per se - Spark is
> trying whole-stage codegen, the compilation is failing due to the
> compilation error, and I think it's falling back to the "non codegen" way.
>
> I'll try to create a simpler test case to reproduce this if I can. What do
> you think?
>
> Regards,
>
> Olivier.
>
> 2017-06-15 21:08 GMT+02:00 Michael Armbrust <mich...@databricks.com>:
>
>> Which version of Spark? If it's recent I'd open a JIRA.
>>
>> On Thu, Jun 15, 2017 at 6:04 AM, Olivier Girardot <
>> o.girar...@lateral-thoughts.com> wrote:
>>
>>> Hi everyone,
>>> when we create recursive calls to "struct" (up to 5 levels) for
>>> extending a complex data structure, we end up with the following
>>> compilation error:
>>>
>>> org.codehaus.janino.JaninoRuntimeException: Code of method
>>> "(I[Lscala/collection/Iterator;)V" of class
>>> "org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator"
>>> grows beyond 64 KB
>>>
>>> The CreateStruct code itself is properly using the ctx.splitExpression
>>> command, but the "end result" of the df.select( struct(struct(struct(....)
>>> ))) ends up being too much.
>>>
>>> Should I open a JIRA or is there a workaround?
>>>
>>> Regards,
>>>
>>> --
>>> *Olivier Girardot* | Associé
>>> o.girar...@lateral-thoughts.com
>>>
>>
>
> --
> *Olivier Girardot* | Associé
> o.girar...@lateral-thoughts.com
> +33 6 24 09 17 94