On Wed, Jun 5, 2013 at 5:04 PM, Mimi Tam <mimi....@ieee.org> wrote:

> I have one input stream of many Radio network related Signaling records in
> XML from various sources. I need to group records that belong to a Base
> Station/Mobile Operator/mobile device and monitor its usage pattern e.g.
> high traffic at noon time and low traffic at midnight, location of device
> path tracking,... etc. The detected usage pattern will be used for various
> purposes. These are not personal device but M2M devices.
>
> Where can I find info on an implementation/integration of this kind of
> pattern recognition algorithm in Mahout?
>
Mahout does not have anything that specifically does these things.

But ...

Mahout does have good k-means clustering (thanks to Dan making my prototype good enough to commit). With k-means clustering you can do vector quantization on positions, which turns paths into sequences of symbols (each position is replaced by the ID of its nearest cluster). You can then do cooccurrence analysis on lagged versions of these sequences to get interesting path predictions.

You can also use similar techniques to find traffic anomalies. The basic idea is that you can use pure spatial or spatio-temporal clustering to build models that predict traffic. Comparing the non-temporal models against the temporal ones can tell you where time is important, and comparing actual versus predicted traffic can help you find anomalies. These anomalies might be network disruptions or might be disturbances in your own systems.
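
To make the path idea concrete, here is a minimal sketch in plain Python rather than Mahout. It assumes the cluster centroids have already been produced by some k-means run; the centroid values, the sample paths, and the helper names (quantize, lagged_cooccurrence, predict_next) are all made up for illustration and are not part of any Mahout API.

```python
from collections import Counter, defaultdict
from math import dist  # Python 3.8+


def quantize(path, centroids):
    """Replace each (x, y) position with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))
        for point in path
    ]


def lagged_cooccurrence(sequences, lag=1):
    """Count how often symbol b appears `lag` steps after symbol a."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for a, b in zip(seq, seq[lag:]):
            counts[a][b] += 1
    return counts


def predict_next(counts, symbol):
    """Return the most frequent follower of `symbol`, or None if unseen."""
    followers = counts.get(symbol)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical centroids, standing in for the output of a k-means run.
centroids = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)]

# Hypothetical device paths as lists of (x, y) positions.
paths = [
    [(0.5, 0.2), (9.1, 0.4), (10.2, 9.7)],
    [(0.1, -0.3), (9.8, 1.0), (9.9, 10.5)],
    [(0.3, 0.3), (9.5, 0.1), (0.2, 0.4)],
]

sequences = [quantize(p, centroids) for p in paths]
counts = lagged_cooccurrence(sequences, lag=1)
print(sequences)
print(predict_next(counts, 1))
```

The same counts can serve as a crude traffic model: a transition that shows up in new data but is rare or absent in the counts is a candidate anomaly. A real system would likely want a proper statistical test on the cooccurrence counts rather than raw frequencies.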