Anomaly Detection ---- Re: This is the high level scenario ---- Re: Can Mahout do pattern recognition?

2013-06-02 Thread Mimi Tam

Thank you very much for your response, Andrew.

It's my bad that I did not convey my question/scenario clearly.

Although I have gazillions of streams of wireless call control data coming 
in from many M2M devices, they are aggregated in the middleware, and within 
the middleware my Pattern Recognition software will take in just one 
constant stream of input with everything mixed in.


My goal is to segregate all this mixed RF, Base Station, and call control 
data on a per-customer/corporation, per-device basis and then detect 
anomalous patterns.
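
A minimal sketch of the segregation step, assuming each record in the mixed 
stream carries hypothetical `company`, `device`, and `event` fields (the 
actual record layout is not given in this thread). It is plain Python for 
illustration only, not a Mahout or Storm API:

```python
from collections import defaultdict

def demultiplex(records):
    """Group a mixed record stream by (company, device), keeping arrival order."""
    per_device = defaultdict(list)
    for record in records:
        # 'company', 'device', and 'event' are assumed field names.
        key = (record["company"], record["device"])
        per_device[key].append(record["event"])
    return dict(per_device)

# Hypothetical mixed input: two companies, one device each, interleaved.
mixed = [
    {"company": "A", "device": "d1", "event": "ATTACH"},
    {"company": "B", "device": "d7", "event": "ATTACH"},
    {"company": "A", "device": "d1", "event": "DATA"},
    {"company": "B", "device": "d7", "event": "END_SESSION"},
    {"company": "A", "device": "d1", "event": "END_SESSION"},
]
streams = demultiplex(mixed)
```

Each per-device list can then be handed to whatever anomaly check is chosen, 
independently of the other devices.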


It is this "Anomaly Detection" that I am hoping Mahout can facilitate so I 
don't have to explore the route of neural networks using FANN.


Any thoughts? Any advice, suggestions, comments will be greatly appreciated.

Many Thanks...Mimi





Re: This is the high level scenario ---- Re: Can Mahout do pattern recognition?

2013-05-31 Thread Andrew Musselman
This sounds like something Storm was purpose-built for:
http://storm-project.net/

It lets you do computation on streams coming in.

Hope this helps.




This is the high level scenario ---- Re: Can Mahout do pattern recognition?

2013-05-31 Thread Mimi Tam
I have a gazillion streams of wireless call control data coming in from 
gazillions of M2M (Machine-to-Machine) devices. The device data belong to 
different customers of ours, all intermingled when we (as a Wireless 
Carrier) receive them; e.g. all the devices that belong to Company A are 
sending their data to us together with those of Companies B, C, D, E, etc. 
simultaneously. It's Hadoop data right now.


I want to be able to detect problems from these incoming records for each 
and every company. For example, if 7 records coming in are the same and the 
8th one is the end-of-session record, we know that a device might have made 
repeated requests to connect to a network but failed. I want to be able to 
tell or predict what is happening or what will happen based on the sequence 
of events (as records coming in all mixed up) on a per-device basis. I will 
also check some key value indicators, e.g. Signal Strength, together with 
the pattern recognized (i.e. the call sequence, expressed as records coming 
in all mixed up with all other call sequences and all other devices), to 
determine if we can fix something before it happens.
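
The "7 identical records followed by end-of-session" rule above can be 
sketched as a small per-device state machine. This is only an illustration 
under stated assumptions: the `END_SESSION` marker, the event names, and the 
`RepeatThenEndDetector` class are hypothetical, and nothing here is a Mahout 
or Storm API.

```python
from collections import defaultdict

END_SESSION = "END_SESSION"  # assumed marker for the end-of-session record

class RepeatThenEndDetector:
    """Flags a device whose session ends right after a run of identical records."""

    def __init__(self, min_repeats=7):
        self.min_repeats = min_repeats
        self.last_event = {}                # device -> most recent event
        self.run_length = defaultdict(int)  # device -> length of current run

    def observe(self, device, event):
        """Feed one record; return True if it completes the suspicious pattern."""
        if event == END_SESSION:
            suspicious = self.run_length[device] >= self.min_repeats
            # Reset state so the next session starts clean.
            self.last_event.pop(device, None)
            self.run_length[device] = 0
            return suspicious
        if self.last_event.get(device) == event:
            self.run_length[device] += 1
        else:
            self.last_event[device] = event
            self.run_length[device] = 1
        return False

# Hypothetical mixed stream: d1 repeats a connect request 7 times, d2 is normal.
stream = (
    [("d2", "ATTACH")]
    + [("d1", "CONNECT_REQ")] * 3
    + [("d2", "DATA")]
    + [("d1", "CONNECT_REQ")] * 4
    + [("d2", END_SESSION), ("d1", END_SESSION)]
)
detector = RepeatThenEndDetector()
flagged = [device for device, event in stream if detector.observe(device, event)]
```

Because the state is keyed by device, records from other devices interleaved 
in the stream do not break a device's run.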


So basically, I envision classifying or clustering records that belong to 
one device, where the device belongs to a particular company. By knowing 
which records are coming in and comparing them to the training pattern that 
I fed in (i.e. the expected record sequence), I should be able to tell which 
device from which company is misbehaving and why, or which device from which 
company is about to be bumped from a cell tower because of that tower's 
traffic saturation.
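
One simple way to compare an observed per-device sequence with expected 
training sequences is to learn which event-to-event transitions occur in the 
training data and report any transition never seen there. The sketch below 
assumes this first-order check is acceptable and uses hypothetical event 
names; it is plain Python, not what Mahout provides, and a real system would 
likely need a richer sequence model.

```python
def learn_transitions(training_sequences):
    """Collect every (previous_event, next_event) pair seen in training."""
    allowed = set()
    for seq in training_sequences:
        allowed.update(zip(seq, seq[1:]))
    return allowed

def unexpected_transitions(sequence, allowed):
    """Return (index, (prev, next)) for each transition absent from training.

    The index is the position of the second event of the pair in `sequence`.
    """
    return [
        (i + 1, pair)
        for i, pair in enumerate(zip(sequence, sequence[1:]))
        if pair not in allowed
    ]

# Hypothetical expected session shape used as the training pattern.
training = [["ATTACH", "DATA", "DATA", "END_SESSION"]]
allowed = learn_transitions(training)

normal = ["ATTACH", "DATA", "END_SESSION"]
odd = ["ATTACH", "DATA", "CONNECT_REQ", "END_SESSION"]

normal_report = unexpected_transitions(normal, allowed)
odd_report = unexpected_transitions(odd, allowed)
```

Here `normal_report` is empty, while `odd_report` lists the two transitions 
that involve the unexpected `CONNECT_REQ` record.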


Hope I am not too confusing.

If Mahout can do what I am looking for and I don't need to use FANN, that'd 
be splendid. But, I'd like to find a source to tell me exactly how I can 
implement it if Mahout is my solution candidate.


Many Thanks in advance...Mimi







-Original Message- 
From: Ted Dunning

Sent: Friday, May 31, 2013 4:40 PM
To: Mahout Dev List
Subject: Re: Can Mahout do pattern recognition?

On Fri, May 31, 2013 at 4:07 PM, Mimi Tam  wrote:


Please if you will direct me to where I can find out how this can be done.



Right here is a good place.

But you have to give to get.  What kind of data do you have?  What does
similar mean?  What kinds of clustering do you want to do?  Why?