[ https://issues.apache.org/jira/browse/SPARK-24935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16833982#comment-16833982 ]
Wenchen Fan commented on SPARK-24935:
-------------------------------------

I have sent https://github.com/apache/spark/pull/24539 to backport it.

> Problem with Executing Hive UDF's from Spark 2.2 Onwards
> --------------------------------------------------------
>
>                 Key: SPARK-24935
>                 URL: https://issues.apache.org/jira/browse/SPARK-24935
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.2.0, 2.3.1
>            Reporter: Parth Gandhi
>            Assignee: Parth Gandhi
>            Priority: Major
>             Fix For: 3.0.0, 2.4.3
>
>
> A user of sketches library (https://github.com/DataSketches/sketches-hive)
> reported an issue with HLL Sketch Hive UDAF that seems to be a bug in Spark
> or Hive. Their code runs fine in 2.1 but has an issue from 2.2 onwards. For
> more details on the issue, you can refer to the discussion in the
> sketches-user list:
> [https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/sketches-user/GmH4-OlHP9g/MW-J7Hg4BwAJ]
>
> On further debugging, we figured out that from 2.2 onwards, Spark hive UDAF
> provides support for partial aggregation, and has removed the functionality
> that supported complete mode aggregation (Refer
> https://issues.apache.org/jira/browse/SPARK-19060 and
> https://issues.apache.org/jira/browse/SPARK-18186). Thus, instead of
> expecting update method to be called, merge method is called here
> (https://github.com/DataSketches/sketches-hive/blob/master/src/main/java/com/yahoo/sketches/hive/hll/SketchEvaluator.java#L56)
> which throws the exception as described in the forums above.
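
To make the mode change in the description concrete, here is a toy Python model of the two execution paths. It is only a sketch under stated assumptions, not Spark or sketches-hive code: the class and function names are hypothetical, an exact set stands in for the HLL sketch, and the method names loosely follow Hive's iterate/merge/terminatePartial/terminate evaluator contract. In the complete-mode path only iterate is called; in the partial-then-final path the final step receives partial results through merge, so an evaluator that was only ever exercised in complete mode can fail there.

```python
class DistinctCountEvaluator:
    """Toy stand-in for a sketch UDAF; an exact set replaces the HLL sketch."""

    def __init__(self):
        self.calls = []  # records which methods the "engine" invoked

    def new_buffer(self):
        return set()

    def iterate(self, buffer, value):
        # Consumes one raw input row.
        self.calls.append("iterate")
        buffer.add(value)

    def merge(self, buffer, partial):
        # Consumes one partial result produced by terminate_partial().
        self.calls.append("merge")
        buffer.update(partial)

    def terminate_partial(self, buffer):
        return frozenset(buffer)

    def terminate(self, buffer):
        return len(buffer)


class CompleteOnlyEvaluator(DistinctCountEvaluator):
    """Hypothetical evaluator that was never prepared for the merge path."""

    def merge(self, buffer, partial):
        raise RuntimeError("merge() was not expected to be called")


def run_complete(evaluator, rows):
    """Complete mode: raw rows go straight to the final result."""
    buffer = evaluator.new_buffer()
    for row in rows:
        evaluator.iterate(buffer, row)
    return evaluator.terminate(buffer)


def run_partial_then_final(evaluator, partitions):
    """Partial aggregation: one partial result per partition, then a merge."""
    partials = []
    for rows in partitions:
        buffer = evaluator.new_buffer()
        for row in rows:
            evaluator.iterate(buffer, row)
        partials.append(evaluator.terminate_partial(buffer))

    final_buffer = evaluator.new_buffer()
    for partial in partials:
        evaluator.merge(final_buffer, partial)
    return evaluator.terminate(final_buffer)


if __name__ == "__main__":
    rows = ["a", "b", "a", "c", "b"]
    partitions = [rows[:2], rows[2:]]

    print(run_complete(DistinctCountEvaluator(), rows))
    print(run_partial_then_final(DistinctCountEvaluator(), partitions))

    try:
        run_partial_then_final(CompleteOnlyEvaluator(), partitions)
    except RuntimeError as exc:
        print("partial plan failed:", exc)
```

Both runners return the same count for a well-behaved evaluator; the difference is only in which methods the engine calls, which is the behaviour change the issue attributes to Spark 2.2.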