Hi,

While creating a custom generic Hive UDF recently, I came across unexpected behavior in the evaluate method. The UDF increments a counter and writes it to a file. When I execute it directly, without involving any table, it always returns one count too many, i.e. 2. When I added some logging (sysout) inside the evaluate method, I saw that the logs were printed twice.

On further research I came across the @UDFType annotation and found that if a custom UDF does not provide it, the default is deterministic = true. When I add the annotation and set @UDFType( deterministic = false ), the logs are printed only once and the UDF returns the accurate count, i.e. 1, which implies that evaluate is called only once when @UDFType( deterministic = false ) is set.

I would like to understand the connection between @UDFType and the evaluate method when a UDF is invoked directly without a table.
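To make the setup concrete, here is a small self-contained Java sketch of the behavior described above. It does not use the Hive libraries and is not my actual UDF. The nested `UDFType` annotation is a local stand-in that models only the `deterministic` flag, and `CountingUdf`, `DefaultUdf`, `NonDeterministicUdf` and `runQuery` are hypothetical names. The toy engine assumes that a deterministic function with constant arguments gets one extra evaluation before the per-row call. That is only my guess at something that would reproduce what I see, not a statement about what Hive does internally.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

public class UdfTypeSketch {

    // Local stand-in for Hive's @UDFType; only 'deterministic' is modeled,
    // and it defaults to true as described in the post.
    @Retention(RetentionPolicy.RUNTIME)
    public @interface UDFType {
        boolean deterministic() default true;
    }

    // Hypothetical function with a side effect: every call bumps a counter.
    public abstract static class CountingUdf {
        private int calls = 0;

        public int evaluate() {
            calls++;
            System.out.println("evaluate called, count = " + calls);
            return calls;
        }
    }

    // No annotation: treated as deterministic by default.
    public static class DefaultUdf extends CountingUdf {}

    // Explicitly marked as non-deterministic.
    @UDFType(deterministic = false)
    public static class NonDeterministicUdf extends CountingUdf {}

    // A missing annotation counts as deterministic = true.
    public static boolean isDeterministic(CountingUdf udf) {
        UDFType type = udf.getClass().getAnnotation(UDFType.class);
        return type == null || type.deterministic();
    }

    // Toy engine. ASSUMPTION: a deterministic function whose arguments are
    // all constants (no table involved) is evaluated once ahead of time,
    // and then once more for each row.
    public static int runQuery(CountingUdf udf, boolean constantArgs, int rows) {
        if (constantArgs && isDeterministic(udf)) {
            udf.evaluate(); // assumed extra evaluation
        }
        int last = 0;
        for (int i = 0; i < rows; i++) {
            last = udf.evaluate();
        }
        return last;
    }

    public static void main(String[] args) {
        // Direct invocation, default annotation: prints 2
        System.out.println(runQuery(new DefaultUdf(), true, 1));
        // Direct invocation, deterministic = false: prints 1
        System.out.println(runQuery(new NonDeterministicUdf(), true, 1));
        // Invocation on a one-row table, default annotation: prints 1
        System.out.println(runQuery(new DefaultUdf(), false, 1));
    }
}
```

Under that assumption the sketch reproduces all three observations: a count of 2 for the direct call with the default annotation, 1 with deterministic = false, and 1 when the argument comes from a table.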
Note: When I invoke my UDF on a table, I get the correct count even with @UDFType( deterministic = true ).

Thanks in advance. :)

Regards,
PradeepKumar Yadav