[ https://issues.apache.org/jira/browse/SPARK-25044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16572775#comment-16572775 ]
Lukas Rytz commented on SPARK-25044:
------------------------------------

The encoding is as expected. To expand on a few details:

* The function
{code:java}
(x: Int, y: Int) => ""{code}
is not specialized, neither in 2.11 nor in 2.12. Function2 doesn't have a specialized variant for Int*Int*AnyRef. So this creates an instance of Function2, not Function2$sp$XXX, and the arguments are boxed when invoking the method.
* The 2.11 encoding always generates the apply method with the types as they appear in source code, and then generates a bridge method if necessary. So the above will generate an apply(II)LString; with the implementation, and a bridge apply(LObject;LObject;)LObject; that unboxes and delegates to the implementation. Callsites will always box and invoke the bridge method.
* The 2.12 encoding generates an $anonfun$foo$1(II)LString; method in the enclosing class with the lambda body. In addition, it creates an $anonfun$foo$1$adapted(LObject;LObject;)LString; method that unboxes and invokes the body method. The adapted method is used for the LMF. The SAM interface passed to the LMF is Function2, whose abstract method is apply(LObject;LObject;)LObject;
* You're right that LMF can do boxing adaptations internally, so we could pass the $anonfun$foo$1 method to LMF (instead of the $adapted). However, the boxing semantics are not exactly those that we need for Scala. In particular, unboxing null gives 0 in Scala, but an NPE in Java. That's why we emit and use the $adapted method.

On the other hand:

* The function
{code:java}
(x: Int, y: Int) => x + y{code}
is specialized.
* In 2.11, the closure class extends Function2$mcIII$sp
* 2.12 creates a $anonfun$foo$2(II)I method in the enclosing class. This method is used for the LMF, the SAM interface is Lscala/runtime/java8/JFunction2$mcIII$sp. The signature of the abstract method in that interface matches exactly.
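The null-unboxing difference described in the comment above can be seen from plain Scala. The following is only a minimal sketch, not code from Spark or from the compiler: the names NullUnboxingDemo, describe, add and erased are hypothetical, and the cast to (Any, Any) => Any merely imitates a caller that only holds boxed values and therefore goes through the erased apply(Object, Object) entry point.

```scala
import scala.util.Try

// Hypothetical demo object; none of these names come from Spark or scalac.
object NullUnboxingDemo {
  // Not specialized: Function2 has no Int*Int*AnyRef variant.
  val describe: (Int, Int) => String = (x: Int, y: Int) => s"$x,$y"

  // Specialized for Int*Int*Int.
  val add: (Int, Int) => Int = (x: Int, y: Int) => x + y

  // View a function through its erased signature, as a caller holding
  // only boxed values (possibly null) would. The cast is unchecked.
  def erased(f: AnyRef): (Any, Any) => Any =
    f.asInstanceOf[(Any, Any) => Any]

  // Scala semantics: a null argument is unboxed to 0 before the body runs.
  val describedWithNull: Any = erased(describe)(null, 5)
  val addedWithNull: Any = erased(add)(null, 5)

  // The same rule is visible in a plain cast.
  val castNull: Int = null.asInstanceOf[Int]

  // Java-style unboxing calls intValue() on the box, which fails on null.
  val javaStyleUnbox: Try[Int] = Try {
    val boxed: java.lang.Integer = null
    boxed.intValue()
  }

  def main(args: Array[String]): Unit = {
    println(describedWithNull)
    println(addedWithNull)
    println(castNull)
    println(javaStyleUnbox.isFailure)
  }
}
```

In both the specialized and the non-specialized case, a null passed through the erased entry point reaches the lambda body as 0 and no exception is thrown, which matches the Spark Answer column in the failing test quoted below.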
I don't know what the SQL implementation does internally, but maybe the above gives enough information to understand the problem? Let me know if I can help.

> Address translation of LMF closure primitive args to Object in Scala 2.12
> -------------------------------------------------------------------------
>
>                 Key: SPARK-25044
>                 URL: https://issues.apache.org/jira/browse/SPARK-25044
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Spark Core, SQL
>    Affects Versions: 2.4.0
>            Reporter: Sean Owen
>            Priority: Major
>
> A few SQL-related tests fail in Scala 2.12, such as UDFSuite's "SPARK-24891
> Fix HandleNullInputsForUDF rule":
> {code:java}
> - SPARK-24891 Fix HandleNullInputsForUDF rule *** FAILED ***
>   Results do not match for query:
>   ...
>   == Results ==
>   == Results ==
>   !== Correct Answer - 3 ==   == Spark Answer - 3 ==
>   !struct<>                   struct<a:bigint,b:int,c:int>
>   ![0,10,null]                [0,10,0]
>   ![1,12,null]                [1,12,1]
>   ![2,14,null]                [2,14,2] (QueryTest.scala:163){code}
> You can kind of get what's going on reading the test:
> {code:java}
> test("SPARK-24891 Fix HandleNullInputsForUDF rule") {
>   // assume(!ClosureCleanerSuite2.supportsLMFs)
>   // This test won't test what it intends to in 2.12, as lambda metafactory closures
>   // have arg types that are not primitive, but Object
>   val udf1 = udf({(x: Int, y: Int) => x + y})
>   val df = spark.range(0, 3).toDF("a")
>     .withColumn("b", udf1($"a", udf1($"a", lit(10))))
>     .withColumn("c", udf1($"a", lit(null)))
>   val plan = spark.sessionState.executePlan(df.logicalPlan).analyzed
>   comparePlans(df.logicalPlan, plan)
>   checkAnswer(
>     df,
>     Seq(
>       Row(0, 10, null),
>       Row(1, 12, null),
>       Row(2, 14, null)))
> }{code}
>
> It seems that the closure that is fed in as a UDF changes behavior, in a way
> that primitive-type arguments are handled differently. For example an Int
> argument, when fed 'null', acts like 0.
> I'm sure it's a difference in the LMF closure and how its types are
> understood, but not exactly sure of the cause yet.