Andy Schlaikjer created PIG-2991:
------------------------------------
Summary: Clarify documentation of Algebraic contracts
Key: PIG-2991
URL: https://issues.apache.org/jira/browse/PIG-2991
Project: Pig
Issue Type: Improvement
Components: documentation
Affects Versions: 0.10.0
Reporter: Andy Schlaikjer
The documentation of the Algebraic contracts is somewhat confusing.
It took me a while to understand that the Initial implementation's exec method
is passed a singleton bag of X and should return that single X value, so that
the Intermed exec method receives a proper bag of X.
The builtins like SUM and COUNT are generally clearly written, but this
specific point isn't easy to deduce from those implementations either.
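To make the point above concrete, here is a minimal plain-Java sketch of the
data flow as I understand it. It deliberately does not use the real Pig classes:
the class name AlgebraicCountSketch, the methods initial, intermed, finalStage
and count, and the chunkSize parameter are all hypothetical, and bags are
modelled as ordinary Lists. It only illustrates that Initial sees a singleton
bag and returns one value, while Intermed and Final see bags of those values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Plain-Java simulation of the Initial/Intermed/Final flow; not real Pig classes. */
public class AlgebraicCountSketch {

    // Initial: receives a bag holding exactly one input value and returns a
    // single partial result (here 1 for a non-null value, 0 for null).
    public static long initial(List<String> singletonBag) {
        if (singletonBag.size() != 1) {
            throw new IllegalArgumentException("Initial expects a singleton bag");
        }
        return singletonBag.get(0) == null ? 0L : 1L;
    }

    // Intermed: receives a bag of partial results and returns one combined
    // partial result.
    public static long intermed(List<Long> partials) {
        long sum = 0L;
        for (long p : partials) {
            sum += p;
        }
        return sum;
    }

    // Final: receives a bag of partial results and returns the final answer.
    public static long finalStage(List<Long> partials) {
        return intermed(partials);
    }

    // Hypothetical driver: splits the input into chunks (assumption: stands in
    // for however the framework groups partial results) and runs the stages.
    public static long count(List<String> values, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<Long> combined = new ArrayList<>();
        for (int start = 0; start < values.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, values.size());
            List<Long> partials = new ArrayList<>();
            for (String v : values.subList(start, end)) {
                // Each input value is wrapped in its own singleton bag.
                partials.add(initial(Collections.singletonList(v)));
            }
            combined.add(intermed(partials));
        }
        return finalStage(combined);
    }

    public static void main(String[] args) {
        // Four non-null values out of five.
        System.out.println(count(Arrays.asList("a", "b", null, "d", "e"), 2)); // prints 4
    }
}
```

The result does not depend on chunkSize, which is the property the contract
relies on: partial results can be combined in any grouping.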
It would be great if the discussion at the following URL could be improved to
make all Algebraic contracts more explicit:
http://pig.apache.org/docs/r0.10.0/udf.html#algebraic-interface
Also, detailed answers to the following questions would be great to include in
some form:
Q: Does Pig make use of the outputSchema methods of the Initial, Intermed, and
Final classes? If so, how?
Q: If my Intermed or Final classes additionally implement the Accumulator
interface, does Pig take advantage of this?
Q: Should the parent UDF's outputSchema method always expect to be passed the
same input schema, regardless of the context (algebraic, accumulative, regular
exec) in which it is used?