Hello, I've finally found some time to ask some boring questions :)
I more or less stumbled across the Mahout project at ApacheCon 08 in Amsterdam, but I haven't found the time to look into it deeply. I would like to ask for some hints, links, or directions for a 'predictions' feature. I read through the Mahout wiki and found some interesting links, but since I come more from the application side and am not that much into databases, I need some help getting started.

We develop a reporting application for a telecommunications company. We mainly store the data in an Oracle cluster, organized as a star schema. The application mainly lets users create reports on two data sources: costs and traffic. The data volume is about 1-2 terabytes.

The idea came up to implement some 'alarming' features, so that customers could set limits for contracts, phone numbers, etc. and get notified once the limits are reached or the data 'behaves strangely' (increases that are too strong over a period; other ideas to come...).

I would like to ask whether there is something of use in Mahout, or whether you would recommend keeping such features 'simple' on a statistical basis and not using learning techniques at all.

On the other hand, the more boring question: do I need a Hadoop cluster for your implementations, or could I run them on Oracle-based clusters as well?

Any links, recommendations, books, hints, or thoughts are welcome! :)

Kind regards,
Werner
