Re: Streaming and incremental cooccurrence

Ted Dunning Wed, 22 Apr 2015 22:03:07 -0700

On Wed, Apr 22, 2015 at 8:07 PM, Pat Ferrel <p...@occamsmachete.com> wrote:


> I think we have been talking about an idea that does an incremental
> approximation, then a refresh every so often to remove any approximation so
> in an ideal world we need both.


Actually, the method I was pushing is exact.  If the sampling is made
deterministic using clever seeds, then deletion is even possible since you
can determine whether an observation was thrown away rather than used to
increment counts.
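
A minimal sketch of that idea, not the Mahout implementation. It assumes
that "clever seeds" can be read as hashing a fixed seed together with the
observation itself, so the keep-or-discard decision can be recomputed at
deletion time. The names keeps, CooccurrenceCounts, SEED and SAMPLE_RATE
are hypothetical, as is the choice of a simple per-observation sampling
rate.

```python
import hashlib
from collections import defaultdict

SAMPLE_RATE = 0.5            # hypothetical sampling rate
SEED = b"cooccurrence-v1"    # hypothetical fixed seed


def keeps(user, item, rate=SAMPLE_RATE, seed=SEED):
    """Deterministic sampling: the same (user, item) always gets the same answer."""
    digest = hashlib.sha256(seed + f"|{user}|{item}".encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a number in [0, 1).
    u = int.from_bytes(digest[:8], "big") / 2**64
    return u < rate


class CooccurrenceCounts:
    """Incremental item-item cooccurrence counts over sampled observations."""

    def __init__(self, rate=SAMPLE_RATE):
        self.rate = rate
        self.history = defaultdict(set)  # user -> items that were kept
        self.counts = defaultdict(int)   # (item_a, item_b) -> count

    def add(self, user, item):
        # Discarded or duplicate observations never touch the counts.
        if not keeps(user, item, self.rate) or item in self.history[user]:
            return False
        for other in self.history[user]:
            self.counts[(item, other)] += 1
            self.counts[(other, item)] += 1
        self.history[user].add(item)
        return True

    def delete(self, user, item):
        # Recompute the sampling decision: if the observation was thrown
        # away when it arrived, there is nothing to undo.
        kept_items = self.history.get(user, set())
        if not keeps(user, item, self.rate) or item not in kept_items:
            return False
        kept_items.remove(item)
        for other in kept_items:
            for pair in ((item, other), (other, item)):
                self.counts[pair] -= 1
                if self.counts[pair] == 0:
                    del self.counts[pair]  # prune entries that reach zero
        return True
```

Because keeps() depends only on the seed and the observation, adding a
set of observations and then deleting them all returns the counts to
empty, whatever the sampling rate.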

The only creeping crud aspect of this is the accumulation of zero rows as
things fall out of the accumulation window.  I would be tempted not to
allow deletion and to just restart periodically, as Pat is suggesting.
