I think you are overestimating the compute needs.

100,000 customers generating 1 KB of data a day on average (I suspect that
is a high number) for 1 year comes to about 36 GB of data. Not that much.
With an I/O subsystem that can do 100 MB/sec of read access (3ware comes to
mind as a cheap way; you can get above 200 MB/sec for a lot more money, but
you are PCI bus limited), that is about 360 seconds for one pass through the
data. 6 minutes is not that bad, and there is no real-time need for this
problem.

Build a system based on the ServerWorks HE chipset. 4 GB of memory is only
$2000 these days (and should make the programmers' lives easier), and the
chipset has dual PIII and dual PCI support. Put in a gigabit interface for
the LAN connection if you want to offload the problem. 2.2 GHz of CPU should
solve most of your computation issues (statistics and simple matching).

About the only thing that could be a problem is your programmers. Hopefully
they can use a one-pass solution, N squared at worst. What you are doing
should be an I/O-bound problem. The beowulfen are used for compute-bound
tasks like weather, key breaking, simulations, etc., where the data set can
be broken down and acted on independently. Also, do not underestimate the
issues the programmers will have writing code for that kind of system; it is
a different mindset. Giving them a speedy box with fast I/O and lots of
memory is more likely to let them get the job done. But on the other hand,
I am guessing/assuming about the problem.
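
As a rough check of the arithmetic above, here is a minimal Python sketch.
The constant and function names are made up for illustration, and it assumes
"1k" means 1000 bytes and "100 meg/sec" means 10^8 bytes per second; other
readings of those units shift the result by a few percent.

```python
# Back-of-envelope sizing using the figures quoted in this message.
CUSTOMERS = 100_000
BYTES_PER_CUSTOMER_PER_DAY = 1_000   # "1k a day", read as 1000 bytes (assumption)
DAYS = 365                           # one year of data
READ_BYTES_PER_SEC = 100_000_000     # "100 meg/sec", read as 10**8 B/s (assumption)


def dataset_bytes(customers, bytes_per_day, days):
    """Total raw data volume accumulated over the period."""
    return customers * bytes_per_day * days


def scan_seconds(total_bytes, read_bytes_per_sec):
    """Time for one sequential pass, assuming the disks are the only limit."""
    return total_bytes / read_bytes_per_sec


total = dataset_bytes(CUSTOMERS, BYTES_PER_CUSTOMER_PER_DAY, DAYS)
seconds = scan_seconds(total, READ_BYTES_PER_SEC)
print(f"{total / 1e9:.1f} GB, {seconds:.0f} s per pass ({seconds / 60:.1f} min)")
```

This only counts raw sequential read time; index overhead, row format, and
query processing are ignored, so treat it as a floor rather than a forecast.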

MJM
----- Original Message -----
From: "Joey Kelly" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Cc: "Gary Huntress" <[EMAIL PROTECTED]>
Sent: Tuesday, July 17, 2001 9:13 PM
Subject: Re: beowulfen and mysql


> Gary,
>
> Thanks for your response.
>
> Basically, what we are doing is building a huge database that will
> hold a lot of different info on each of our customers. I can't tell you
> what we do for a living, but basically we want to find out (for
> instance) how many of our customers used our product (the green one,
> not the blue or red one) last month, and travelled to Tennessee while
> using it. We might be able to sell this info to our vendors so they can
> make a better product next year. We have well over 100,000 customers.
> We will probably need to run other types of queries against the data
> next month, and I have 2 programmers to write the queries.
>
> What kind of algorithms my coders will construct is up to them (I'm
> just the admin).
>
> I do know that beowulfen are great number crunchers, and that a
> huge number of selects might not run faster on a cluster than on
> one huge machine because of the I/O bottlenecks between
> machines. We are planning on doing what we can to compensate
> for this. That said, what kind of experiences have you and others
> had with mysql and clusters?
>
> --Joey
>
> From:           "Gary Huntress" <[EMAIL PROTECTED]>
> To:             "Joey Kelly" <[EMAIL PROTECTED]>, <[EMAIL PROTECTED]>
> Subject:        Re: beowulfen and mysql
> Date sent:      Sun, 15 Jul 2001 21:19:02 -0400
>
> > First, let's make sure we're all talking about the same thing.  A general
> > definition of data mining would be "the automated extraction of predictive
> > information from large databases".   Automated implies some sort of agent,
> > and by its very nature the predictions are statistical (hence the large
> > data sets).   Most agents that I have seen could be classified as decision
> > tree, neural network, or genetic algorithm.
> >
> > Note that data mining is not considered to be data warehousing, ad hoc
> > querying, OLTP or visualization.
> >
> > Are you trying to build some sort of predictive model?  Perhaps you can
> > describe it a bit further.  In general, once you pick an algorithm or
> > approach you would have to build an application layer above your query
> > layer.   I imagine you would do that in C, for speed.
> >
> > You probably want to read about MySQL replication here
> > http://www.mysql.com/documentation/mysql/bychapter/manual_Replication.html#Replication
> >
> > Now, regarding Beowulf clusters.   They are defined as (by the people that
> > created it, Donald Becker and company) solely a computational cluster and
> > not something geared toward data mining.   You certainly can learn some
> > great lessons from their architecture (channel bonded ethernet or myrinet)
> > but don't expect them to answer any questions in the database arena!
> >
> > Regards,
> > Gary "SuperID" Huntress
> > =======================================================
> > FreeSQL.org offering free database hosting to developers
> > Visit http://www.freesql.org
> >
> >
> >
> >
> >
> > ----- Original Message -----
> > From: "Joey Kelly" <[EMAIL PROTECTED]>
> > To: <[EMAIL PROTECTED]>
> > Sent: Sunday, July 15, 2001 8:26 PM
> > Subject: beowulfen and mysql
> >
> >
> > > Howdy.
> > >
> > > My company needs to implement a data mining setup. I am
> > > building a cluster using dual athlons and perhaps firewire instead of
> > > 100baseTX.
> > >
> > > I need to find out as much as I can from those who have done
> > > mysql on beowulfen. Please contact me at [EMAIL PROTECTED]
> > >
> > > Thanks :)
> > >
> > > +++
> > >
> > > Joey Kelly
> > > /Minister of the Gospel | Computer Networking Consultant/
> > > http://nolalinuxcoop.dhs.org/~jkelly/home/
> > >
> > > "Experience hath shewn, that even under the best forms [of government]
> > > those entrusted with power have, in time, and by slow operations,
> > > perverted it into tyranny." - Thomas Jefferson
> > >
> > >
> > > ---------------------------------------------------------------------
> > > Before posting, please check:
> > >    http://www.mysql.com/manual.php   (the manual)
> > >    http://lists.mysql.com/           (the list archive)
> > >
> > > To request this thread, e-mail <[EMAIL PROTECTED]>
> > > To unsubscribe, e-mail <[EMAIL PROTECTED]>
> > > Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php
> > >
> >
>
>
>

