Re: outlier detection in time-series using Mahout

Ted Dunning Mon, 01 Nov 2010 08:22:39 -0700

There is nothing explicit in Mahout for this, but you could use the Dirichlet
process mixture model clustering to do this.

The idea would be to express your different observed time series, or short
segments of time sequences, as mixture models and then find regions that are
not well described by this mixture model.  Ideally, you would have a Markov
model underneath the mixture coefficients, but that is out of scope for what
Mahout does for you right off the bat.  It wouldn't be too hard to merge the
HMM code and the DP clustering to get this, though.
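
To make the "regions that are not well described by the mixture" idea
concrete, here is a rough sketch in plain Python.  It is not Mahout code and
it is not a Dirichlet process model: as a simplifying assumption it fits a
finite two-component 1-D Gaussian mixture with ordinary EM, then flags fixed
windows whose average log-likelihood under that mixture is low.  All names
(fit_mixture, poorly_described_windows), the window size and the threshold
are made up for illustration, and there is no Markov structure on the
mixture coefficients.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D normal distribution with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def fit_mixture(values, k=2, iterations=50, min_var=1e-3):
    """Fit a finite 1-D Gaussian mixture with EM (a stand-in for DP clustering)."""
    ordered = sorted(values)
    # Deterministic start: place the means at evenly spaced quantiles.
    means = [ordered[int((i + 0.5) * len(ordered) / k)] for i in range(k)]
    centre = sum(values) / len(values)
    spread = max(sum((x - centre) ** 2 for x in values) / len(values), min_var)
    variances = [spread] * k
    weights = [1.0 / k] * k
    for _ in range(iterations):
        # E-step: how strongly each component claims each point.
        resp = []
        for x in values:
            p = [w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weight, mean and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # component claimed nothing; leave it unchanged
            weights[j] = nj / len(values)
            means[j] = sum(r[j] * x for r, x in zip(resp, values)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, values)) / nj
            variances[j] = max(var_j, min_var)
    return weights, means, variances

def log_likelihood(x, model):
    """Log density of a single observation under the fitted mixture."""
    weights, means, variances = model
    density = sum(w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances))
    return math.log(max(density, 1e-300))

def poorly_described_windows(series, model, window=5, threshold=-6.0):
    """Start indices of non-overlapping windows with low mean log-likelihood."""
    flagged = []
    for start in range(0, len(series) - window + 1, window):
        segment = series[start:start + window]
        score = sum(log_likelihood(x, model) for x in segment) / window
        if score < threshold:
            flagged.append(start)
    return flagged

# Synthetic data (an assumption for the example): two normal regimes near 0 and 10.
jitter = [0.1 * (i % 5 - 2) for i in range(50)]
training = [0.0 + j for j in jitter] + [10.0 + j for j in jitter]
model = fit_mixture(training)

# A series that visits both regimes, with one stretch near 5 that fits neither.
series = jitter[:10] + [5.0 + j for j in jitter[:5]] + [10.0 + j for j in jitter[:10]]
flagged = poorly_described_windows(series, model)
print(flagged)
```

On this toy data only the window covering the stretch near 5 is flagged.  A
real version would score multivariate segment features rather than raw
values, and the threshold would have to be calibrated on held-out data.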

So the short answer is no, there is nothing ready-made.

But Mahout would be a decent substrate for building your own.

On Mon, Nov 1, 2010 at 8:02 AM, Srivathsan Srinivas <
[email protected]> wrote:

> Hi,
>       Any pointers to techniques/papers that detect outliers in time-series
> of very large data sets using Mahout? I am interested in seeing what
> techniques are favorable for use in large-scale distributed systems using
> Hadoop/Mahout.
>
> Thanks,
> Sri.
>
