[ https://issues.apache.org/jira/browse/DATAFU-91?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14579505#comment-14579505 ]
Matthew Hayes commented on DATAFU-91:
-------------------------------------

[~user1234321], HyperLogLogPlus is too big to be output from Initial. Having Initial output the Long hashed value makes the most sense to me. Therefore Final needs to be able to handle either Long or DataByteArray. It isn't guaranteed that Intermediate is executed.

Keeping the accumulator implementation as you suggest seems reasonable to me.

> pig version of HyperLogLog estimator should be Algebraic and use combiners
> --------------------------------------------------------------------------
>
>                 Key: DATAFU-91
>                 URL: https://issues.apache.org/jira/browse/DATAFU-91
>             Project: DataFu
>          Issue Type: Bug
>    Affects Versions: 1.3.0
>            Reporter: Ido Hadanny
>            Assignee: Ido Hadanny
>            Priority: Minor
>             Fix For: 1.3.0
>
>         Attachments: hyper-log-log-algebraic-3.diff, hyper-log-log-algebraic.diff, hyper-log-log-algebraic.diff
>
>
> Matt: I don't remember if there was a particular reason I didn't implement this as AlgebraicEvalFunc. It seems like it could be. I believe the Java MapReduce version leverages the combiner. If you want to try making this Algebraic we would be happy to accept a patch :)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
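
For illustration only, here is a rough sketch of the Initial/Intermediate/Final contract described in the comment: Initial emits just a Long hash, Intermediate (which may never run) folds partial results into a serialized sketch, and Final accepts either form. This is not the code from the attached patches. It deliberately avoids the Pig and stream-lib APIs so that it runs on its own; the class name `HllAlgebraicSketch`, the method names, the 16-register array standing in for HyperLogLogPlus, the `byte[]` standing in for DataByteArray, and the hash function are all assumptions made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for an Algebraic HyperLogLog UDF; no Pig or stream-lib types are used.
final class HllAlgebraicSketch {
    // Assumed precision: 2^4 = 16 registers, kept tiny for the example.
    static final int P = 4;

    // Initial: emit only the 64-bit hash of the value, not a whole sketch.
    static Long initial(Object value) {
        return hash64(value.hashCode());
    }

    // Intermediate: fold partial results into serialized registers (stand-in for DataByteArray).
    static byte[] intermediate(List<Object> partials) {
        return fold(partials);
    }

    // Final: must accept Long (straight from Initial) or byte[] (from Intermediate).
    static byte[] finalRegisters(List<Object> partials) {
        return fold(partials);
    }

    static byte[] fold(List<Object> partials) {
        byte[] regs = new byte[1 << P];
        for (Object o : partials) {
            if (o instanceof Long) {
                offer(regs, (Long) o);      // raw hash, Intermediate was skipped
            } else if (o instanceof byte[]) {
                merge(regs, (byte[]) o);    // partial sketch from Intermediate
            } else {
                throw new IllegalArgumentException("expected Long or byte[], got " + o);
            }
        }
        return regs;
    }

    // Standard HLL register update: top P bits pick the register, the rest give the rank.
    static void offer(byte[] regs, long hash) {
        int idx = (int) (hash >>> (64 - P));
        long rest = (hash << P) | (1L << (P - 1)); // sentinel bit keeps the rank bounded
        byte rank = (byte) (Long.numberOfLeadingZeros(rest) + 1);
        if (rank > regs[idx]) {
            regs[idx] = rank;
        }
    }

    // Merging two register arrays is an element-wise max.
    static void merge(byte[] regs, byte[] other) {
        for (int i = 0; i < regs.length; i++) {
            if (other[i] > regs[i]) {
                regs[i] = other[i];
            }
        }
    }

    // SplitMix64 finalizer, used here only as a convenient 64-bit mixer (an assumption).
    static long hash64(long x) {
        x += 0x9E3779B97F4A7C15L;
        x = (x ^ (x >>> 30)) * 0xBF58476D1CE4E5B9L;
        x = (x ^ (x >>> 27)) * 0x94D049BB133111EBL;
        return x ^ (x >>> 31);
    }

    public static void main(String[] args) {
        List<Object> hashes = new ArrayList<>();
        for (int i = 0; i < 200; i++) {
            hashes.add(initial("member-" + i));
        }
        // Path 1: Intermediate never runs, Final sees only Longs.
        byte[] direct = finalRegisters(hashes);
        // Path 2: Intermediate runs on part of the data, Final sees a mix.
        List<Object> mixed = new ArrayList<>();
        mixed.add(intermediate(hashes.subList(0, 120)));
        mixed.addAll(hashes.subList(120, 200));
        byte[] viaCombiner = finalRegisters(mixed);
        System.out.println(Arrays.equals(direct, viaCombiner));
    }
}
```

Because the register merge is an element-wise max, both paths end with identical registers, which is why Final can safely treat a Long and a serialized sketch interchangeably. A real UDF would feed the merged state to the estimator instead of returning raw registers.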