Hey, sorry for the delay. I took a look at the diff and replied with some comments in the JIRA. Please take a look, thanks.

-Matt

On Sat, May 9, 2015 at 10:53 PM, Ido Hadanny <ido.hada...@gmail.com> wrote:

> Hey, I see that this is still in open and un-assigned - can you assign it
> to me so I can mark it as "patch available"? or do you want me just to mark
> it as "fixed"?
>
> On 28 April 2015 at 08:37, Ido Hadanny <ido.hada...@gmail.com> wrote:
>
>> https://issues.apache.org/jira/browse/DATAFU-91
>>
>> On 27 April 2015 at 18:02, Matthew Hayes <matthew.terence.ha...@gmail.com> wrote:
>>
>>> Great thanks :) Please file a JIRA and attach the patch there.
>>>
>>> -Matt
>>>
>>> On Apr 27, 2015, at 6:26 AM, Ido Hadanny <ido.hada...@gmail.com> wrote:
>>>
>>> Hey guys,
>>> patch is attached + tested on unit-tests + We're testing it on a
>>> 1000-nodes real hadoop cluster as we speak.
>>> Do you want us to create a jira issue for this, or is this good enough?
>>> Thanks, Ilia and Ido
>>>
>>> On 7 March 2015 at 23:09, Matthew Hayes <matthew.terence.ha...@gmail.com> wrote:
>>>
>>>> I don't remember if there was a particular reason I didn't implement
>>>> this as AlgebraicEvalFunc. It seems like it could be. I believe the Java
>>>> MapReduce version leverages the combiner. If you want to try making this
>>>> Algebraic we would be happy to accept a patch :)
>>>>
>>>> -Matt
>>>>
>>>> > On Mar 7, 2015, at 12:11 PM, Ido Hadanny <ido.hada...@gmail.com> wrote:
>>>> >
>>>> > data.fu has a nice implementation of HyperLogLog for estimating
>>>> > cardinality here
>>>> > <https://github.com/apache/incubator-datafu/blob/master/datafu-pig/src/main/java/datafu/pig/stats/HyperLogLogPlusPlus.java>
>>>> >
>>>> > However, it's implemented as Accumulator which means it will run only at
>>>> > the reducer and not in the combiner (but it will never load the entire set
>>>> > into memory as in normal EvalFunc). Why couldn't data.fu implement it as
>>>> > Algebraic - and fill the registers at every combiner, then merge and reduce
>>>> > the result? Am I missing something here?
>>>> > also available here:
>>>> > http://stackoverflow.com/questions/28908217/why-is-data-fu-implementing-hyperloglog-as-an-accumulator-and-not-as-algebraic
>>>> >
>>>> > thanks!
>>>> >
>>>> > --
>>>> > Sent from my androido
>>>>
>>>
>>> --
>>> Sent from my androido
>>>
>>> <hyper-log-log-algebraic.diff>
>>>
>>
>> --
>> Sent from my androido
>>
>
> --
> Sent from my androido
>
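
For readers following the thread, the idea raised in the original question (fill the registers at every combiner, then merge and reduce) can be sketched roughly as below. This is only an illustration in Python, not the DataFu patch or its API: it uses plain HyperLogLog rather than the HyperLogLog++ variant DataFu implements, the precision `P`, the SHA-1 based hash, and the function names `initial`, `merge` and `estimate` are all assumptions made for the example. The names loosely mirror the Initial / Intermed / Final stages of Pig's Algebraic interface.

```python
import hashlib
import math

P = 10          # number of index bits (assumed value, for illustration only)
M = 1 << P      # number of registers


def _hash64(item: str) -> int:
    """64-bit hash taken from the first 8 bytes of SHA-1 (an arbitrary choice)."""
    return int.from_bytes(hashlib.sha1(item.encode("utf-8")).digest()[:8], "big")


def initial(items):
    """Combiner-side step: fill a register array from one partition of the input."""
    regs = [0] * M
    for item in items:
        h = _hash64(item)
        idx = h >> (64 - P)                   # top P bits select the register
        rest = h & ((1 << (64 - P)) - 1)      # remaining 64-P bits
        # rank = 1-based position of the leftmost 1-bit in the remaining bits
        rank = (64 - P) - rest.bit_length() + 1
        if rank > regs[idx]:
            regs[idx] = rank
    return regs


def merge(a, b):
    """Intermediate step: combine two partial register arrays by element-wise max."""
    return [max(x, y) for x, y in zip(a, b)]


def estimate(regs):
    """Final step: turn the merged registers into a cardinality estimate."""
    alpha = 0.7213 / (1 + 1.079 / M)
    raw = alpha * M * M / sum(2.0 ** -r for r in regs)
    zeros = regs.count(0)
    if raw <= 2.5 * M and zeros:
        # small-range correction (linear counting) from the original HLL paper
        return M * math.log(M / zeros)
    return raw
```

Because each register only ever holds a maximum, merging partial register arrays in any order gives the same registers as a single pass over all the data, which is the property that lets the work move into the combiner. For example, `estimate(merge(initial(part_a), initial(part_b)))` should match `estimate(initial(part_a + part_b))`.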