[ https://issues.apache.org/jira/browse/PIG-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13127298#comment-13127298 ]
Jonathan Coveney commented on PIG-2234:
---------------------------------------

Are you thinking of a "PrimitiveBag" that allows you to use primitive types without a Tuple wrapper, or of altering DataBag itself to support that? It seems like the latter would break a lot of contracts in Pig, many of which were put in place for a convenient reason... but the former could be intriguing.

> Algebraic udf Init and Intermediate functions should be able to return non tuple data types
> -------------------------------------------------------------------------------------------
>
>                 Key: PIG-2234
>                 URL: https://issues.apache.org/jira/browse/PIG-2234
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.8.0, 0.9.0
>            Reporter: Thejas M Nair
>
> The exec() calls of the Algebraic UDF Initial and Intermediate classes are
> required to return a Tuple. This is because the output is collected in a
> DataBag and passed to the Intermediate.exec() and Final.exec() calls, and a
> DataBag in Pig needs to contain Tuples. But this results in additional Tuple
> objects being created and also adds (de)serialization costs. Functions such
> as COUNT and SUM also have to wrap their initial and intermediate results in
> Tuples.
> The Algebraic interface needs to change to reduce the costs for UDFs that
> don't need an intermediate tuple.
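To make the overhead described in the issue concrete, here is a minimal, self-contained Java sketch of a COUNT-style aggregation done two ways. It deliberately does not use Pig's classes: `OneFieldTuple`, `initialWrapped`, `combineWrapped`, `countWrapped`, and `countPrimitive` are hypothetical names invented for illustration, and the "primitive" path is only one guess at what a non-tuple partial result could look like, not a proposed Pig API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class AlgebraicCountSketch {

    /** Hypothetical stand-in for a one-field Tuple; NOT Pig's Tuple class. */
    static final class OneFieldTuple {
        final Object field;
        OneFieldTuple(Object field) { this.field = field; }
    }

    // Wrapped style: every partial result is boxed and placed in a wrapper,
    // mirroring the requirement that each stage hands back a Tuple.
    static OneFieldTuple initialWrapped(String record) {
        return new OneFieldTuple(Long.valueOf(1L)); // one wrapper per input record
    }

    static OneFieldTuple combineWrapped(List<OneFieldTuple> partials) {
        long sum = 0L;
        for (OneFieldTuple t : partials) {
            sum += (Long) t.field; // cast and unbox each partial count
        }
        return new OneFieldTuple(Long.valueOf(sum)); // wrap the combined result again
    }

    public static long countWrapped(List<String> records) {
        List<OneFieldTuple> partials = new ArrayList<>();
        for (String r : records) {
            partials.add(initialWrapped(r));
        }
        return (Long) combineWrapped(partials).field;
    }

    // Primitive style (assumption): partial counts stay plain longs,
    // so no wrapper objects are allocated along the way.
    public static long countPrimitive(List<String> records) {
        long sum = 0L;
        for (String r : records) {
            sum += 1L;
        }
        return sum;
    }

    public static void main(String[] args) {
        List<String> records = Arrays.asList("a", "b", "c");
        System.out.println("wrapped:   " + countWrapped(records));
        System.out.println("primitive: " + countPrimitive(records));
    }
}
```

Both paths compute the same count; the wrapped path allocates one wrapper object per input record plus one per combined result, which is the kind of extra object creation the issue attributes to the Tuple requirement. The sketch says nothing about serialization cost, which depends on Pig internals not modeled here.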