Algebraic UDF Initial and Intermediate functions should be able to return 
non-tuple data types
------------------------------------------------------------------------------------------

                 Key: PIG-2234
                 URL: https://issues.apache.org/jira/browse/PIG-2234
             Project: Pig
          Issue Type: Improvement
    Affects Versions: 0.9.0, 0.8.0
            Reporter: Thejas M Nair


The exec() methods of an Algebraic UDF's Initial and Intermediate classes are 
required to return a Tuple. This is because their output is collected in a 
DataBag and passed to the Intermediate.exec() and Final.exec() calls, and a 
DataBag in Pig can only contain Tuples. As a result, additional Tuple objects 
are created and extra (de)serialization costs are incurred. Even functions 
such as COUNT and SUM have to wrap their initial and intermediate results in 
Tuples.

The Algebraic interface needs to change to reduce these costs for UDFs that 
don't need an intermediate Tuple.
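
A rough sketch of the overhead described above, for a COUNT-like function. 
This is not Pig's API: plain Java lists stand in for Tuple and DataBag, and 
every class and method name here is hypothetical. It only contrasts wrapping 
each partial count in a one-field tuple with passing the bare value through, 
assuming a single combiner pass.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins, not Pig classes: a "tuple" is a List<Object>
// and a "bag" is a List of such tuples.
final class AlgebraicCountSketch {

    // Wrap one value in a one-field tuple. This is the extra allocation
    // (and, in a real job, the extra serialized object) the issue describes.
    static List<Object> wrap(long value) {
        List<Object> tuple = new ArrayList<>(1);
        tuple.add(value);
        return tuple;
    }

    // Current contract: Initial must return a tuple, even for a single long.
    static List<Object> initialWrapped(List<Object> record) {
        return wrap(1L);
    }

    // Current contract: Intermediate unwraps each partial, sums, and wraps again.
    static List<Object> intermediateWrapped(List<List<Object>> bag) {
        return wrap(finalWrapped(bag));
    }

    // Final unwraps the partial counts and returns the total.
    static long finalWrapped(List<List<Object>> bag) {
        long sum = 0;
        for (List<Object> tuple : bag) {
            sum += (Long) tuple.get(0);
        }
        return sum;
    }

    // Assumed alternative: Initial returns the bare value, no tuple.
    static long initialUnwrapped(List<Object> record) {
        return 1L;
    }

    // Assumed alternative: Intermediate and Final both just sum bare values.
    static long combineUnwrapped(List<Long> partials) {
        long sum = 0;
        for (long partial : partials) {
            sum += partial;
        }
        return sum;
    }

    // Runs Initial -> Intermediate -> Final with tuple wrapping at each step.
    static long countWrapped(int n) {
        List<List<Object>> mapOutput = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            List<Object> record = new ArrayList<>();
            record.add("row" + i);
            mapOutput.add(initialWrapped(record));
        }
        List<List<Object>> combined = new ArrayList<>();
        combined.add(intermediateWrapped(mapOutput));
        return finalWrapped(combined);
    }

    // Runs the same three stages without any intermediate tuples.
    static long countUnwrapped(int n) {
        List<Long> mapOutput = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            List<Object> record = new ArrayList<>();
            record.add("row" + i);
            mapOutput.add(initialUnwrapped(record));
        }
        List<Long> combined = new ArrayList<>();
        combined.add(combineUnwrapped(mapOutput));
        return combineUnwrapped(combined);
    }
}
```

Both paths produce the same count; the wrapped path allocates one extra 
tuple per input record plus one per combiner output, which is the cost this 
issue proposes to avoid.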



--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
