> On May 18, 2016, 10:19 p.m., Matthew Hayes wrote:
> > datafu-pig/src/main/java/datafu/pig/bags/CountDistinctUpTo.java, line 147
> > <https://reviews.apache.org/r/46701/diff/1/?file=1361707#file1361707line147>
> >
> >     What about clearing the set so we don't have to garbage collect?
> 
> Eyal Allweil wrote:
>     I just reassigned it because the clear() method in HashSet uses
>     Arrays.fill, and I thought that would be more expensive than just
>     letting the set be garbage collected and creating a new one.

I would expect garbage collecting the HashSet to be more expensive than 
clearing it.  I ran a quick benchmark, and clear() appears to be significantly 
faster for both large and small HashSets.
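
For anyone who wants to try the comparison themselves, here is a rough sketch 
of how it could be set up.  This is an illustration only, not the benchmark 
mentioned above: the class and method names are hypothetical, and a plain 
timing loop like this is sensitive to JIT warm-up and GC timing, so a harness 
such as JMH would give more trustworthy numbers.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper class, not part of DataFu.
class ClearVsReallocate {

    // Fill the set with `size` distinct values, then empty it in place.
    // The same instance (and its already-grown table) is reused next round.
    static Set<Integer> fillThenClear(Set<Integer> set, int size) {
        for (int i = 0; i < size; i++) {
            set.add(i);
        }
        set.clear();
        return set;
    }

    // Fill the set with `size` distinct values, then drop it and hand back
    // a fresh instance, leaving the old one to the garbage collector.
    static Set<Integer> fillThenReallocate(Set<Integer> set, int size) {
        for (int i = 0; i < size; i++) {
            set.add(i);
        }
        return new HashSet<>();
    }

    // Elapsed nanoseconds for `rounds` fill/reset cycles.
    // Note that the time spent in add() is included in both variants.
    static long timeNanos(boolean reuse, int size, int rounds) {
        Set<Integer> set = new HashSet<>();
        long start = System.nanoTime();
        for (int r = 0; r < rounds; r++) {
            set = reuse ? fillThenClear(set, size) : fillThenReallocate(set, size);
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        for (int size : new int[] {10, 100_000}) {
            // Crude warm-up so both code paths are likely JIT-compiled.
            timeNanos(true, size, 50);
            timeNanos(false, size, 50);

            long clearNs = timeNanos(true, size, 200);
            long reallocNs = timeNanos(false, size, 200);
            System.out.printf("size=%d  clear=%d us  reallocate=%d us%n",
                    size, clearNs / 1_000, reallocNs / 1_000);
        }
    }
}
```

One thing the sketch makes visible: a cleared set keeps its grown internal 
table, so later rounds avoid resizing, whereas a fresh HashSet starts from the 
default capacity and has to grow again.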


- Matthew


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/46701/#review133803
-----------------------------------------------------------


On April 27, 2016, 7:44 a.m., Eyal Allweil wrote:
> 
> 
> (Updated April 27, 2016, 7:44 a.m.)
> 
> 
> Review request for DataFu.
> 
> 
> Repository: datafu
> 
> 
> Description
> -------
> 
> DATAFU-117 - New UDF - CountDistinctUpTo
> 
> 
> Diffs
> -----
> 
>   datafu-pig/src/main/java/datafu/pig/bags/CountDistinctUpTo.java 
> PRE-CREATION 
>   datafu-pig/src/test/java/datafu/test/pig/bags/BagTests.java 
> 28292db0c01a1967ea53d9cc3d316e9906d942a8 
> 
> Diff: https://reviews.apache.org/r/46701/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Eyal Allweil
> 
>
