[ https://issues.apache.org/jira/browse/PIG-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13127282#comment-13127282 ]
Jonathan Coveney commented on PIG-2234:
---------------------------------------

How do you envision this being implemented on the backend?

> Algebraic udf Init and Intermediate functions should be able to return non-tuple data types
> --------------------------------------------------------------------------------------------
>
>                 Key: PIG-2234
>                 URL: https://issues.apache.org/jira/browse/PIG-2234
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.8.0, 0.9.0
>            Reporter: Thejas M Nair
>
> The exec() calls of the Algebraic UDF Initial and Intermediate classes are
> required to return a Tuple. This has been done because the output is
> collected in a DataBag and passed to the Intermediate.exec() and Final.exec()
> calls, and a DataBag in Pig needs to contain Tuples. But this results in
> additional Tuple objects getting created and also adds additional
> (de)serialization costs. Functions such as COUNT and SUM also have to wrap
> their initial and intermediate results in Tuples.
> The Algebraic interface needs to change to reduce the costs for UDFs that
> don't need an intermediate tuple.
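
To make the cost described in the issue concrete, here is a minimal sketch of a COUNT-style computation done two ways: once with every partial result wrapped in a tuple, and once with bare longs. It assumes nothing about Pig's real classes. SimpleTuple, the plain List used as a "bag", and all method names are hypothetical stand-ins invented for illustration; this is not the Pig Algebraic API and not a proposed backend implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: SimpleTuple and the List-based "bag" are hypothetical
// stand-ins, NOT Pig's Tuple/DataBag classes or its Algebraic interface.
final class AlgebraicCountSketch {

    /** Hypothetical minimal tuple: an ordered list of fields. */
    static final class SimpleTuple {
        private final List<Object> fields = new ArrayList<>();

        SimpleTuple(Object... values) {
            for (Object v : values) {
                fields.add(v);
            }
        }

        Object get(int i) {
            return fields.get(i);
        }
    }

    // Wrapped style, as the issue describes the current contract:
    // every stage returns a tuple, so each partial count costs one wrapper object.
    static SimpleTuple initialWrapped(String record) {
        return new SimpleTuple(Long.valueOf(1L));
    }

    static SimpleTuple intermediateWrapped(List<SimpleTuple> bag) {
        long sum = 0L;
        for (SimpleTuple t : bag) {
            sum += (Long) t.get(0); // unwrap each partial count
        }
        return new SimpleTuple(sum); // wrap again for the next stage
    }

    static long finalWrapped(List<SimpleTuple> bag) {
        long sum = 0L;
        for (SimpleTuple t : bag) {
            sum += (Long) t.get(0);
        }
        return sum;
    }

    // Bare style, the direction the issue asks for: partial results are plain values.
    static long initialBare(String record) {
        return 1L;
    }

    static long sumBare(List<Long> partials) {
        long sum = 0L;
        for (long p : partials) {
            sum += p;
        }
        return sum;
    }

    /** Runs initial -> intermediate -> final with tuple wrapping at each step. */
    static long countWrapped(List<String> records) {
        List<SimpleTuple> initialBag = new ArrayList<>();
        for (String r : records) {
            initialBag.add(initialWrapped(r));
        }
        List<SimpleTuple> intermediateBag = new ArrayList<>();
        intermediateBag.add(intermediateWrapped(initialBag));
        return finalWrapped(intermediateBag);
    }

    /** Same pipeline without wrapper tuples. */
    static long countBare(List<String> records) {
        List<Long> partials = new ArrayList<>();
        for (String r : records) {
            partials.add(initialBare(r));
        }
        List<Long> combined = new ArrayList<>();
        combined.add(sumBare(partials));
        return sumBare(combined);
    }

    public static void main(String[] args) {
        List<String> records = List.of("a", "b", "c");
        System.out.println("wrapped: " + countWrapped(records));
        System.out.println("bare:    " + countBare(records));
    }
}
```

Both paths compute the same count. The wrapped path allocates one extra tuple per input record plus one per intermediate result, which is the overhead the issue wants to avoid. How a real backend would collect non-tuple values remains the open question raised in the comment.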