Hi all,
I'm trying to implement a UDF that uses data structures such as a
binary tree. However, it seems that Hive instantiates a new UDF object
for each row in the table, so the data structures would be initialized
again and again, once per row. In contrast, the book "Programming Hive"
uses a GeoIP function as an example and says that a LookupService object
"is saved in a reference so it only needs to be initialized once in the
lifetime of a map or reduce task that initializes it".
The code for this function can be found here
(https://github.com/edwardcapriolo/hive-geoip/).
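To make the question concrete, here is a minimal sketch of the pattern as I understand it from that quote. It is plain Java with no Hive classes, and the names ExpensiveLookup and LookupUdf are hypothetical, made up for illustration. A real UDF would extend Hive's UDF base class instead. The structure is kept in a field and built on the first call only, which can only help if the same object is reused across rows.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for an expensive structure (e.g. a tree loaded from a file).
class ExpensiveLookup {
    static int buildCount = 0; // counts how many times the structure is built

    private final TreeMap<Integer, String> tree = new TreeMap<>();

    ExpensiveLookup() {
        buildCount++;
        tree.put(0, "low");
        tree.put(100, "mid");
        tree.put(1000, "high");
    }

    String find(int key) {
        // floorEntry returns the entry with the greatest key <= the given key
        Map.Entry<Integer, String> e = tree.floorEntry(key);
        return e == null ? "unknown" : e.getValue();
    }
}

// Hypothetical UDF-like class (assumption: not tied to any Hive API).
class LookupUdf {
    private ExpensiveLookup lookup; // saved in a reference, built on first use

    public String evaluate(int key) {
        if (lookup == null) {
            lookup = new ExpensiveLookup(); // runs once per LookupUdf object
        }
        return lookup.find(key);
    }
}
```

With this sketch, calling evaluate() many times on one LookupUdf object builds the tree only once. If a new object really is created per row, the field would be rebuilt each time, which is the behaviour I am asking about.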
Could anyone suggest how to make the UDF object initialize only once
in the lifetime of a map or reduce task?
Best regards,
ypg