Sorry for the double post - I intended to mention this, but forgot.
For your automatic summary generation you could consider a cron job
piping in SQL, a la
*/3 * * * * mysql -D databaseName -e "insert into summaryTable select
now(), sum(numberColumn) from liveTable where datestamp > date_sub(
now(), interval 3 minute)"

though a more reliable approach in one of my own applications was to
key off maximum ID found each time, rather than a date field, since
data arrived from different sources asynchronously.  More like (in
pseudo SQL):

select MAXID as max(id) from liveTable
select MINID as minid from statusTable
insert into summaryTable select now(), sum(numberColumn) from
liveTable where id > MINID and id <= MAXID
update statusTable set minid = MAXID

so you're examining a moving window of ID numbers rather than a window
of date/timestamps.  Works for the odd case where an early datestamp
slips into your table, or for when the database server is down for a
period of time.


As an alternative to cron, MySQL 5.1 has a new feature for scheduled events:
http://dev.mysql.com/doc/refman/5.1/en/events-syntax.html
5.1 isn't production just yet but depending on your timeframe it might
be worth a look.

HTH,
Dan



On 8/22/06, Dan Buettner <[EMAIL PROTECTED]> wrote:
Andy, performance will generally depend on several factors:

- size of each record
- amount of RAM
- speed of disks, for when RAM isn't enough
- concurrent inserts/writes (using InnoDB or MyISAM tables?)

At one level, 6 million records is no problem, and really not even
very many.  If every record is several MB in size though then it
becomes another story.  One system I designed has collected some 15
million records, and performance is still good, though records are
fairly small (a few hundred bytes).  Real-time reporting on old data
in a Web browser is no longer feasible though, so I started generating
summary data for "live" reporting (see below).

I would suggest taking some time to architect it well, using archival
tables for old data, possibly with the MERGE engine.  If you need to
plot data in close to real time with a large recordset, a good
approach may be to automatically generate summary data into a side
table periodically (every 3 minutes, every 30 minutes, whatever) off
the last <x> minutes.  If your summary process can run off a smallish
"live" data table and key off an indexed field, it will hopefully run
quickly each time.  Then you plot off that summary table while the
live processes go about their business, largely unaffected by your
reporting.

Read up on MySQL performance - there's a good section ("optimization")
in the online manual, and also I recommend Jeremy Zawodny's book "High
Performance MySQL" from O'Reilly as a good source of information.

Within reason, a well-architected database on modest hardware can
often perform better than a poorly-architected one on a superfancy
box.

As an aside, I ran Sybase on both E420R's (very similar to the 450)
and V240's at my last job, and the V240 ran circles around the 420.
The V240's are nice little boxes (though I had no real complaints
about the 420's, for what they were - cheap, reliable data storage).
Loading the thing up on RAM should help.

Best,
Dan


On 8/22/06, Andy Ford <[EMAIL PROTECTED]> wrote:
> Hi everyone
>
> I have a requirement to store in the region of 3 million entries per month 
(30 ish days) of data in a mySQL database totalling a maximum of 2 months (6 
Million entries).
> The database will be 'probably' reside on a Sun V240. I have a similar size 
database running on an old Sun E450 and it runs like a dog (with three legs).
> I'll need to access the data and plot the output almost real time, or as 
close to it as possible.
>
> I may have to go down the route of buying another database like DB2, Informix 
or Oracle but I have no idea of the costs involved or whether I will get improved 
performance form these compared to mySQL.
>
>
> Anyone have experience of large datasets within mySQL.
>
> Thanks
>
> Regards
>
> Andy
>
> This e-mail is private and may be confidential and is for the intended 
recipient only.  If misdirected, please notify us by telephone and confirm that it 
has been deleted from your system and any copies destroyed.  If you are not the 
intended recipient you are strictly prohibited from using, printing, copying, 
distributing or disseminating this e-mail or any information contained in it.  We 
use reasonable endeavours to virus scan all e-mails leaving the Company but no 
warranty is given that this e-mail and any attachments are virus free.  You should 
undertake your own virus checking.  The right to monitor e-mail communications 
through our network is reserved by us.
>
>
>
>


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Reply via email to