Re: [sqlite] Re: How to optimize a select that gets 17 rows from a 700k row DB (via perl using DBI)?

Jay Sprenkle Sun, 14 Jan 2007 08:49:24 -0800

On 1/13/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:


Jay:

The closer to real-time, the better.  The most often a cron job can run
under Linux is minutely, and minutely is pretty good.  I guess I could
have the summary process occur at the end of the script that polls the
machines.  It could generate static HTML, which would presumably make the
page load super fast.  However, under the current regime, the process of
creating that summary is going to take at least 10 seconds.  40 seconds
for polling + 10 seconds for summarizing=50 seconds, and that number is
only going to get bigger!  So I'll have to figure out a better table
structure anyway.


You don't have to run under cron for something like that. Loading and
unloading the program several times a minute is not very efficient anyway.
Just let it run continuously and use sleep() (or a timer) to yield your time
slice until the next time you want to run.
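
A minimal sketch of that long-running loop, written in Python rather than the Perl used in this thread. The function name run_periodically and its parameters are made up for illustration; the idea is to run the task, then sleep for whatever is left of the interval.

```python
import time


def run_periodically(task, interval_seconds, iterations=None,
                     sleep=time.sleep, clock=time.monotonic):
    """Call task() once per interval, sleeping between runs.

    iterations=None means run forever; a number is handy for testing.
    sleep and clock are injectable so the loop can be exercised without waiting.
    """
    done = 0
    while iterations is None or done < iterations:
        started = clock()
        task()
        elapsed = clock() - started
        # Sleep only for the remainder of the interval, never a negative amount.
        sleep(max(0.0, interval_seconds - elapsed))
        done += 1


if __name__ == "__main__":
    # Hypothetical stand-in for "poll the machines and write the summary".
    run_periodically(lambda: print("polling..."), interval_seconds=0.1, iterations=3)
```

If one polling pass takes longer than the interval, this version simply starts the next pass immediately instead of letting runs pile up.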

What's the advantage of a database for this application? If all you need is
to load-balance, it would seem simpler to just query each machine for its
load and react accordingly. I'm not sure if Perl supports SOAP
interfaces or serializing data over an HTTP connection. You might look
into that for later.

Are indices something that only work if you create them BEFORE you
start adding data?

No. The index on stats.Timestamp should speed up finding
the record with max(Timestamp). It will speed up queries on existing data too.
It's like a table of contents for a book.
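
To illustrate that an index can be added after the data is loaded, here is a small sketch using Python's sqlite3 module (the thread itself uses Perl DBI, but the SQL is the same). The table layout, the sample rows and the index name idx_stats_timestamp are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stats (Machine TEXT, Timestamp INTEGER, Load REAL)")

# Add data first, with no index in place.
rows = [("m%d" % (i % 17), 1000 + i, (i % 5) / 10.0) for i in range(1000)]
conn.executemany("INSERT INTO stats VALUES (?, ?, ?)", rows)

# Create the index afterwards; SQLite builds it from the rows already there.
conn.execute("CREATE INDEX idx_stats_timestamp ON stats (Timestamp)")

# Ask the planner how it would run a lookup by Timestamp.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM stats WHERE Timestamp = ?", (1500,)
).fetchall()
plan_text = " ".join(str(col) for row in plan for col in row)
print(plan_text)

latest = conn.execute("SELECT max(Timestamp) FROM stats").fetchone()[0]
print(latest)
```

The plan text should name the index, which shows the planner is using it even though it was created after the inserts. The exact wording of the plan varies between SQLite versions.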

Here's what I would try:

1. Can you speed up this?
select Machine, max(Timestamp) as M from stats group by machine

If this is trying to get the machine with the latest time stamp then
perhaps this might be faster:
 select Machine, Timestamp as M from stats order by Timestamp desc limit 1
It gets one record instead of summarizing a lot of data.
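
The two queries answer different questions, so it is worth checking which one you need. A small sketch with made-up data (Python's sqlite3 here, not the Perl DBI from the thread; the machine names and timestamps are invented, and an ORDER BY Machine is added to the first query only to make the output order predictable):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stats (Machine TEXT, Timestamp INTEGER)")
conn.executemany(
    "INSERT INTO stats VALUES (?, ?)",
    [("alpha", 100), ("alpha", 160), ("beta", 110), ("beta", 150)],
)

# Latest timestamp for each machine: one row per machine.
per_machine = conn.execute(
    "SELECT Machine, max(Timestamp) AS M FROM stats "
    "GROUP BY Machine ORDER BY Machine"
).fetchall()

# Single most recent row in the whole table: one row in total.
newest_only = conn.execute(
    "SELECT Machine, Timestamp AS M FROM stats ORDER BY Timestamp DESC LIMIT 1"
).fetchall()

print(per_machine)   # [('alpha', 160), ('beta', 150)]
print(newest_only)   # [('alpha', 160)]
```

If the summary page needs the latest reading from every machine, only the GROUP BY form gives that; the LIMIT 1 form is the faster choice only when a single newest row is enough.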

Your code does a join of two tables on the machine column and timestamp:

select a.* from stats a, (select Machine, max(Timestamp) as M from stats
group by machine) b where a.machine=b.machine and a.timestamp=b.M order by
load, Mem*MemPctFree desc, Scratch desc;

Did you index both tables on ( machine, timestamp )?
The join has to match rows from both sides on those two columns, so an index
on them will speed up the lookup on each side.
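
A sketch of that composite index together with the join from above, again using Python's sqlite3 in place of Perl DBI. The column types, the sample rows and the index name idx_stats_machine_ts are assumptions; only the column names come from the query in this thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stats (Machine TEXT, Timestamp INTEGER, "
    "Load REAL, Mem INTEGER, MemPctFree REAL, Scratch INTEGER)"
)
conn.executemany(
    "INSERT INTO stats VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("alpha", 100, 0.9, 2048, 0.10, 50),
        ("alpha", 160, 0.2, 2048, 0.60, 50),
        ("beta", 110, 0.1, 1024, 0.80, 20),
        ("beta", 150, 0.7, 1024, 0.30, 20),
    ],
)

# One index covering both join columns, in the order they are matched.
conn.execute("CREATE INDEX idx_stats_machine_ts ON stats (Machine, Timestamp)")

query = (
    "SELECT a.* FROM stats a, "
    "(SELECT Machine, max(Timestamp) AS M FROM stats GROUP BY Machine) b "
    "WHERE a.Machine = b.Machine AND a.Timestamp = b.M "
    "ORDER BY Load, Mem*MemPctFree DESC, Scratch DESC"
)
latest_rows = conn.execute(query).fetchall()
print(latest_rows)

# Show how SQLite plans the join; the wording differs between versions.
for row in conn.execute("EXPLAIN QUERY PLAN " + query):
    print(row)
```

Whether the planner actually picks the index depends on the SQLite version and the data, so the EXPLAIN QUERY PLAN output is the thing to check on the real 700k-row table.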


Additional thoughts:

In general, I think splitting the tables up is the way to go.  Any further
comments/suggestions appreciated!

Jonathan




--
The PixAddixImage Collector suite:
http://groups-beta.google.com/group/pixaddix

SqliteImporter and SqliteReplicator: Command line utilities for Sqlite
http://www.reddawn.net/~jsprenkl/Sqlite

Cthulhu Bucks!
http://www.cthulhubucks.com
