[
https://issues.apache.org/jira/browse/PIG-2651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256065#comment-13256065
]
Alan Gates commented on PIG-2651:
---------------------------------
In general looks good. This will be great for performance in certain
situations. A few issues:
* The new files need Apache License headers.
* The new files need javadocs.
* The new files need stability and audience annotations.
* We need tests. In particular, they should verify that this works standalone,
that multiple instances work together, and that it works with other
non-TerminatingAccumulator accumulators.
> Provide a much easier to use accumulator interface
> --------------------------------------------------
>
> Key: PIG-2651
> URL: https://issues.apache.org/jira/browse/PIG-2651
> Project: Pig
> Issue Type: New Feature
> Reporter: Jonathan Coveney
> Assignee: Jonathan Coveney
> Fix For: 0.11, 0.10.1
>
> Attachments: PIG-2651-0.patch
>
>
> This introduces a new interface, IteratingAccumulatorEvalFunc (that name is
> NOT final...). The cool thing about this patch is that it is built purely on
> top of the existing Accumulator code (well, it uses PIG-2066, but it could
> easily work without it). That is to say, it's an easier way to write
> accumulators without having to fork the Pig codebase.
> The downside is that the only way I am able to provide such a clean interface
> is by using a second thread. I need to explore any potential performance
> implications, but given that most of the easy to use Pig stuff has
> performance implications, I think as long as we measure and document
> them, it's worth the much more usable interface. Plus I don't think it will
> be too bad as one thread does the heavy lifting, while another just ferries
> values in between. SUM could now be written as:
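> {code}
> import java.io.IOException;
> import java.util.Iterator;
> import org.apache.pig.data.Tuple;
>
> public class SUM extends IteratingAccumulatorEvalFunc<Long> {
>     // Sums the first field of every tuple the iterator yields.
>     public Long exec(Iterator<Tuple> it) throws IOException {
>         long sum = 0;
>         while (it.hasNext()) {
>             sum += (Long) it.next().get(0);
>         }
>         return sum;
>     }
> }
> {code}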
> Besides performance tests, I need to figure out how to properly test this
> sort of thing. I particularly welcome advice on that front.
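
For readers trying to picture the two-thread approach described above, here is a minimal, hypothetical sketch of the general technique: a push-style accumulate() call hands values through a blocking queue to a worker thread that consumes them via a plain Iterator. It is not the code in PIG-2651-0.patch. The class name IteratorFerrySketch, its methods, and the END sentinel are all assumptions made for illustration, and it uses only java.util.concurrent rather than any Pig API.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

/** Hypothetical sketch only; every name here is an assumption, not Pig's API. */
class IteratorFerrySketch<T, R> {
    // Sentinel marking the end of input. Null elements are not supported.
    private static final Object END = new Object();

    private final BlockingQueue<Object> queue = new SynchronousQueue<>();
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final Future<R> result;

    /** Starts the user's pull-style function on a second thread. */
    IteratorFerrySketch(Function<Iterator<T>, R> exec) {
        Callable<R> task = () -> exec.apply(new QueueIterator());
        result = worker.submit(task);
    }

    /** Push-style entry point: ferries one batch of values to the worker. */
    void accumulate(Iterable<T> batch) throws InterruptedException {
        for (T value : batch) {
            if (!hand(value)) {
                return; // the consumer finished early, so drop the rest
            }
        }
    }

    /** Signals end of input and waits for the worker's result. */
    R getValue() throws InterruptedException, ExecutionException {
        hand(END);
        try {
            return result.get();
        } finally {
            worker.shutdown();
        }
    }

    // Offers an item until it is taken; returns false if the consumer is done.
    private boolean hand(Object item) throws InterruptedException {
        while (!queue.offer(item, 10, TimeUnit.MILLISECONDS)) {
            if (result.isDone()) {
                return false;
            }
        }
        return true;
    }

    /** Iterator view of the queue, used only on the worker thread. */
    private final class QueueIterator implements Iterator<T> {
        private Object next; // buffered element, or null if none is buffered

        public boolean hasNext() {
            if (next == null) {
                try {
                    next = queue.take();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    next = END; // treat interruption as end of input
                }
            }
            return next != END;
        }

        @SuppressWarnings("unchecked")
        public T next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            T value = (T) next;
            next = null;
            return value;
        }
    }

    public static void main(String[] args) throws Exception {
        IteratorFerrySketch<Long, Long> sum = new IteratorFerrySketch<>(it -> {
            long total = 0;
            while (it.hasNext()) {
                total += it.next();
            }
            return total;
        });
        sum.accumulate(Arrays.asList(1L, 2L));
        sum.accumulate(Arrays.asList(3L));
        System.out.println(sum.getValue()); // prints 6
    }
}
```

The SynchronousQueue means each value is handed over only when the worker is ready for it, so one thread does the computation while the other merely ferries values. The isDone() check keeps accumulate() from blocking forever if the consumer returns before draining its input. Measuring the hand-off cost per value would be one way to quantify the performance implications mentioned above.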