[ https://issues.apache.org/jira/browse/HIVE-4961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13749189#comment-13749189 ]
Eric Hanson commented on HIVE-4961:
-----------------------------------

Completed working version of bridge to allow custom UDFs that are subclasses of UDF to work in vectorized mode. This supports UDFs with evaluate() methods that take and return boxed types (e.g. Long), Writable types (e.g. LongWritable) and standard types (e.g. long). Generic UDFs are not supported. That will be the subject of a future patch.

I did manual testing for a large set of UDFs taking and returning the types supported by vectorization: tinyint, smallint, int, bigint, float, double, boolean, string, timestamp. UDFs with one argument and with multiple arguments were tested. Both constant and variable arguments were tested.

Including the tests with the patch, or doing another patch with end-to-end tests, is yet to be done.

> Create bridge for custom UDFs to operate in vectorized mode
> -----------------------------------------------------------
>
>                 Key: HIVE-4961
>                 URL: https://issues.apache.org/jira/browse/HIVE-4961
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Eric Hanson
>            Assignee: Eric Hanson
>         Attachments: vectorUDF.4.patch, vectorUDF.5.patch
>
>
> Suppose you have a custom UDF myUDF() that you've created to extend Hive. The
> goal of this JIRA is to create a facility where if you run a query that uses
> myUDF() in an expression, the query will run in vectorized mode.
> This would be a general-purpose bridge for custom UDFs that users add to
> Hive. It would work with existing UDFs.
> I'm considering a separate JIRA for a new kind of custom UDF implementation
> that is vectorized from the beginning, to optimize performance. That is not
> covered by this JIRA.
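
For readers unfamiliar with the idea, the following is a rough, self-contained sketch of what "bridging" a row-at-a-time UDF into a batch loop might look like for the three evaluate() styles mentioned in the comment (standard long, boxed Long, and a Writable). It is not the code in the attached patches. The names VectorUdfBridgeSketch, apply, findEvaluate and the AddOne* classes are hypothetical, and the nested UDF and LongWritable classes are simplified stand-ins for the real Hive and Hadoop classes so that the example compiles without any dependencies. Null handling, multiple arguments and constant arguments are left out.

```java
import java.lang.reflect.Method;
import java.util.Arrays;

public class VectorUdfBridgeSketch {

    /** Hypothetical stand-in for Hive's UDF base class (assumption, not the real class). */
    public static class UDF {}

    /** Hypothetical stand-in for Hadoop's LongWritable (assumption, not the real class). */
    public static class LongWritable {
        private final long value;
        public LongWritable(long value) { this.value = value; }
        public long get() { return value; }
    }

    /** evaluate() on a standard (primitive) type. */
    public static class AddOnePrimitive extends UDF {
        public long evaluate(long x) { return x + 1; }
    }

    /** evaluate() on a boxed type. */
    public static class AddOneBoxed extends UDF {
        public Long evaluate(Long x) { return x + 1; }
    }

    /** evaluate() on a Writable type. */
    public static class AddOneWritable extends UDF {
        public LongWritable evaluate(LongWritable x) { return new LongWritable(x.get() + 1); }
    }

    /**
     * Applies a one-argument, row-at-a-time UDF to every entry of a long column
     * and returns a new output column. The argument is wrapped to match the
     * parameter type of evaluate(), and the result is unwrapped back to long.
     */
    public static long[] apply(UDF udf, long[] column) {
        Method evaluate = findEvaluate(udf);
        boolean wantsWritable = evaluate.getParameterTypes()[0] == LongWritable.class;
        long[] out = new long[column.length];
        try {
            for (int i = 0; i < column.length; i++) {
                // A boxed Long is accepted by reflection for both long and Long parameters.
                Object arg = wantsWritable ? new LongWritable(column[i]) : Long.valueOf(column[i]);
                Object result = evaluate.invoke(udf, arg);
                out[i] = (result instanceof LongWritable)
                        ? ((LongWritable) result).get()
                        : (Long) result;
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("evaluate() call failed", e);
        }
        return out;
    }

    /** Finds the first public one-argument method named evaluate on the UDF. */
    private static Method findEvaluate(UDF udf) {
        for (Method m : udf.getClass().getMethods()) {
            if (m.getName().equals("evaluate") && m.getParameterCount() == 1) {
                return m;
            }
        }
        throw new IllegalArgumentException("no one-argument evaluate() method found");
    }

    public static void main(String[] args) {
        long[] column = {1, 2, 3};
        UDF[] udfs = {new AddOnePrimitive(), new AddOneBoxed(), new AddOneWritable()};
        for (UDF udf : udfs) {
            // Each line prints the class name followed by [2, 3, 4].
            System.out.println(udf.getClass().getSimpleName() + " -> "
                    + Arrays.toString(apply(udf, column)));
        }
    }
}
```

The sketch calls evaluate() once per row through reflection, so it only shows how an unmodified UDF could be driven from a batch-oriented loop. It makes no claim about how the patch wraps arguments, handles nulls, or what performance it achieves.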