Re: mahout examples

prasenjit mukherjee Sat, 21 Nov 2009 22:00:49 -0800

On Sun, Nov 22, 2009 at 9:54 AM, Ted Dunning <[email protected]> wrote:
> Expressing your symbolic sequences by tiling these phrases gives you much of
> the temporality that you are interested in and lets you use algorithms like
> k-means pretty much directly.


The approach sounds interesting. Can you explain a bit about how you intend
to represent a sequence as a vector here? Take the sequence "a b a a c"
as an example. I was thinking of the following two approaches:

If I use symbols as my basis and time-slices as the coefficients, then
I would lose the information about recurring symbols (symbol a in my
example), e.g. the vector representation of "a b a a c": 1(a) + 2(b) +
5(c) (problem: how to incorporate 3a, 4a?)
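
To make the problem with this first approach concrete, here is a minimal
Python sketch. The function name symbol_basis_vector and the choice to keep
only the first time-slice of each symbol are my own assumptions for
illustration, not anything from Mahout:

```python
def symbol_basis_vector(sequence, alphabet):
    """One coefficient per symbol: the 1-based time-slice of its first occurrence.

    Hypothetical helper for illustration; 0 means the symbol never occurs.
    """
    first_seen = {symbol: 0 for symbol in alphabet}
    for t, symbol in enumerate(sequence, start=1):
        if first_seen[symbol] == 0:
            first_seen[symbol] = t  # later occurrences have nowhere to go
    return [first_seen[symbol] for symbol in alphabet]


alphabet = ["a", "b", "c"]
v1 = symbol_basis_vector("a b a a c".split(), alphabet)
v2 = symbol_basis_vector("a b b b c".split(), alphabet)

# Two different sequences collapse onto the same vector.
print(v1, v2, v1 == v2)
```

Both sequences map to the same vector, so the recurrences at t3 and t4 are
simply not representable with one coefficient per symbol.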

On the other hand, if I use time-slices as my basis and some mapping of
terms as the coefficients, then a simple Euclidean measure won't make
any sense, e.g. let a->1, b->2, c->3; then the vector representation of
"a b a a c" is: 1(t1) + 2(t2) + 1(t3) + 1(t4) + 3(t5)
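
A small Python sketch of why the Euclidean measure is questionable for this
second approach. The helper names are hypothetical, and the a->1, b->2, c->3
coding is just the arbitrary mapping from my example:

```python
import math


def time_slice_vector(sequence, code):
    """One coefficient per time-slice: the integer code of the symbol there."""
    return [code[symbol] for symbol in sequence]


def euclidean(u, v):
    """Plain Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


code = {"a": 1, "b": 2, "c": 3}  # arbitrary labels, not real magnitudes
base = time_slice_vector("a b a a c".split(), code)
ends_in_b = time_slice_vector("a b a a b".split(), code)
ends_in_a = time_slice_vector("a b a a a".split(), code)

# Each variant differs from the base in exactly one time-slice, yet the
# distances differ only because of how the symbols happened to be numbered.
d_b = euclidean(base, ends_in_b)
d_a = euclidean(base, ends_in_a)
print(d_b, d_a)
```

Replacing the last symbol by b or by a is the same kind of change, but the
distance comes out twice as large for a, purely as an artifact of the coding.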

-Prasen

>
> If you don't have symbolic sequences, you have another problem, but you
> might get similar results by doing vector quantization on your continuous
> time-series expressed in terms of multi-scale localized spectral detectors.
> Some problems work well with those techniques, some definitely need more
> interesting feature detectors.  The spectral processing and vector
> quantization are fairly natural tasks for map-reduce which is nice.  In
> fact, vector quantization is commonly done with some variant on k-means.
>
