Re: Question about data warehousing and mining through Mahout

hdev ml Tue, 31 Aug 2010 14:55:34 -0700

Hi Sean,

I can't divulge much about the business, both for confidentiality reasons
and because I am a new employee here :), but the log data contains:

1. Different types of user requests - each request type and its related
data.
2. Different session parameters - when the user session started, how sticky
the user is, etc.
3. Different context parameters - such as the user's location, where he is
going, etc.

Per my understanding of Hive, we can do some statistical reporting, such as
the frequency of user sessions, which geographical region they come from,
which device a user uses the most, etc.
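
To make the reporting side concrete, here is a minimal Python sketch of the
kind of aggregation I mean. The record layout (user_id, region, device) and
the sample rows are made up for illustration and are not our real log
format; at our data volume the same grouping would presumably be expressed
as a Hive query rather than run in memory.

```python
from collections import Counter

# Hypothetical, made-up session records: (user_id, region, device).
sessions = [
    ("u1", "US-West", "phone"),
    ("u1", "US-West", "phone"),
    ("u2", "US-West", "tablet"),
    ("u3", "EU", "phone"),
    ("u3", "EU", "desktop"),
    ("u3", "EU", "desktop"),
]


def sessions_per_region(records):
    """Count how many sessions were seen in each region."""
    return Counter(region for _, region, _ in records)


def top_device_per_region(records):
    """Return {region: (device, count)} for the most used device per region."""
    counts = Counter((region, device) for _, region, device in records)
    best = {}
    for (region, device), n in counts.items():
        # Keep the device with the highest session count seen so far.
        if region not in best or n > best[region][1]:
            best[region] = (device, n)
    return best


if __name__ == "__main__":
    print(sessions_per_region(sessions))
    print(top_device_per_region(sessions))
```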

But we also want to mine this data for predictive capabilities. For example,
how likely is it that a user will use the same device again? Or, once we get
sales/marketing data (on the roadmap for the future), which regions should
get more marketing/sales effort? What is the growth pattern of the user
base, and in which geographical regions? What is the pattern of failing user
requests? There are a number of requirements like these from the business.
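
As a rough illustration of the first question, the simplest baseline I can
think of is an empirical rate: over all consecutive pairs of sessions by the
same user, what fraction used the same device? This is only a sketch under
assumed inputs; the (user_id, timestamp, device) layout and the function
name are hypothetical, and it is plain counting rather than anything Mahout
provides.

```python
from collections import defaultdict


def repeat_device_rate(events):
    """Estimate how often a user's next session uses the same device.

    events: iterable of (user_id, timestamp, device) tuples (assumed layout).
    Returns the fraction of consecutive same-user session pairs that share
    a device, or None if no user has at least two sessions.
    """
    by_user = defaultdict(list)
    for user, ts, device in events:
        by_user[user].append((ts, device))

    same = total = 0
    for visits in by_user.values():
        visits.sort()  # order each user's sessions by timestamp
        for (_, prev), (_, cur) in zip(visits, visits[1:]):
            total += 1
            same += prev == cur
    return same / total if total else None


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    sample = [
        ("u1", 1, "phone"),
        ("u1", 2, "phone"),
        ("u1", 3, "tablet"),
        ("u2", 1, "desktop"),
        ("u2", 5, "desktop"),
    ]
    print(repeat_device_rate(sample))
```

A real predictive model would condition on more than the previous device
(region, time of day, request type), but a baseline like this gives a number
to compare such a model against.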

Does that fit the data mining bill? Or am I looking in the wrong place?

Again, thanks for your time and help.

HDev

On Tue, Aug 31, 2010 at 2:40 PM, Sean Owen <[email protected]> wrote:

> I think you'd have to begin to define what you want to do with the
> logs? What do you mean when you say "data mining"?
>
> On Tue, Aug 31, 2010 at 10:21 PM, hdev ml <[email protected]> wrote:
> > Hi all,
> >
> > I am currently trying to find out what frameworks/software/product will
> > support data warehousing/data mining the best.
> >
> > We get around 1.5+ TB of log data every month and we want to do some
> > reporting on top of that and later on move on to data mining.
> >
> > I am a total newbie in this world, coming from a RDBMS background and
> > wanted to get your opinion on what is the best approach to take in this
> > regard.
> >
> > I looked around the hadoop movement and the corresponding sub projects.
> >
> > I found Hive as a framework can support and scale for this large data.
> >
> > So the first phase of reporting can be done using hive. But can I reuse
> > the same data for data mining through the Mahout project?
> >
> > Can somebody please guide me regarding this?
> >
> > Thanks for your help.
> >
> > HDev.
> >
>
