[ 
https://issues.apache.org/jira/browse/SPARK-24935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773574#comment-16773574
 ] 

gavin hu commented on SPARK-24935:
----------------------------------

Hi [~pgandhi]  "A user of sketches library..." That's me! I'm so excited that 
the issue I reported couple months back has got fixed here. Kudos to you for 
the great work!

As you may know many users including me are moving towards Spark 2.4.1, getting 
the fix there will definitely benefit a lot of people. Looking forwards to 
that. 

 

> Problem with Executing Hive UDF's from Spark 2.2 Onwards
> --------------------------------------------------------
>
>                 Key: SPARK-24935
>                 URL: https://issues.apache.org/jira/browse/SPARK-24935
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.2.0, 2.3.1
>            Reporter: Parth Gandhi
>            Priority: Major
>
> A user of sketches library(https://github.com/DataSketches/sketches-hive) 
> reported an issue with HLL Sketch Hive UDAF that seems to be a bug in Spark 
> or Hive. Their code runs fine in 2.1 but has an issue from 2.2 onwards. For 
> more details on the issue, you can refer to the discussion in the 
> sketches-user list:
> [https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/sketches-user/GmH4-OlHP9g/MW-J7Hg4BwAJ]
>  
> On further debugging, we figured out that from 2.2 onwards, Spark hive UDAF 
> provides support for partial aggregation, and has removed the functionality 
> that supported complete mode aggregation(Refer 
> https://issues.apache.org/jira/browse/SPARK-19060 and 
> https://issues.apache.org/jira/browse/SPARK-18186). Thus, instead of 
> expecting update method to be called, merge method is called here 
> ([https://github.com/DataSketches/sketches-hive/blob/master/src/main/java/com/yahoo/sketches/hive/hll/SketchEvaluator.java#L56)]
>  which throws the exception as described in the forums above.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

Reply via email to