On Thu, Dec 26, 2013 at 10:19 AM, Suneel Marthi <suneel_mar...@yahoo.com> wrote:

> I heard from people outside of dev@ and user@ who tried running Streaming
> KMeans (from 0.8) on their production clusters on large datasets and saw
> the job crash in the Reduce phase due to OOM errors (this is with
> -Xmx2GB).
>

Excessive memory usage in reduce was a known bug that was addressed
(supposedly) by using a combiner.

This really smells like the bug was somehow resurrected.  If so, that
also means that our unit tests are insufficient.
