Have you considered running a process that summarizes the data from the table
and just fetching the last summary for display? Will a periodic snapshot work
for your reporting or do you need realtime summarization?
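
If it helps, here is a rough sketch of what I mean, assuming your existing
stats schema and a table name (latest_stats) picked purely for illustration:
have the minutely cron job keep exactly one row per machine in a small table,
and point the CGI at that table instead of the full history.

    -- same columns as stats, one row per machine; the name is just an example
    CREATE TABLE latest_stats (
        Machine text not null primary key,
        Load real, Scratch text(4), Mem int, MemPctFree int,
        Procs int, Users int, Timestamp text(20) not null, Message text
    );

    -- in the cron job, alongside the INSERT into stats
    -- (values taken from your peaklx6 example row; timestamp is a placeholder)
    INSERT OR REPLACE INTO latest_stats
        VALUES ('peaklx6', 0, '21G', 12, 96, 1, 1, '<timestamp>', '');

    -- the CGI then only ever reads ~17 rows
    SELECT * FROM latest_stats
     ORDER BY Load, Mem*MemPctFree DESC, Scratch DESC;

Whether that is good enough depends on whether a snapshot that is up to a
minute old is acceptable to your users.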


On 1/12/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
Hi all.  I will try to post all the data I can think of.  The basic issue is
that I have a Perl CGI script that shows the most recent load stats for a
bunch of servers we use at our site for long, computation-intensive programs.
The environment is all RedHat Enterprise Linux 4 (2.6.9-42.26.ELsmp #1 SMP
i686 i686 i386 GNU/Linux), and the SQLite version is 3.2.2.  As the number of
records in my database has increased, the page load time has also increased,
now to about 10 seconds, which is unacceptable, IMO.  The idea is to give
users an idea of the least loaded machine, to help them decide which one to
use for their next program.  How best to do this is, of course, a complicated
issue in and of itself; at my org. there is no other system in place, so this
is how I do it, and I've gotten a big positive response from users.

The previous implementation used a CSV file, and gathering the data from the
servers took long enough that the cron job could only run every five minutes.
Since switching to SQLite, the cron job runs every minute and, as far as I
know, has always completed successfully in under a minute.  I'd like to focus
on optimizing my current implementation first and then, if relevant, talk
about other, perhaps better, ways to do it.  Currently both the old
five-minute CSV version and the beta SQLite version run in parallel, and the
CSV version is still more widely used since that page actually loads in a
reasonable amount of time.  The CSV data are overwritten each time, so the
process of reading them in is always the same.  The SQLite database is
INSERTed into because I want to keep historical data; the rationale for this
is explained later.

The current system uses a DB with a table called stats that has more or
less minutely data on various conditions for ~17 so-called peak machines.
The table can be described thus:

CREATE TABLE stats (
    Machine text not null,
    Load real,
    Scratch text(4),
    Mem int,
    MemPctFree int,
    Procs int,
    Users int,
    Timestamp text(20) not null,
    Message text
);

My CGI script fetches the most recent observation for each machine with a
select statement of the form:

select a.* from stats a,
       (select Machine, max(Timestamp) as M from stats group by Machine) b
 where a.Machine = b.Machine and a.Timestamp = b.M
 order by Load, Mem*MemPctFree desc, Scratch desc;

I floundered about trying to find a select statement that would simply
give me the right answer; I suspect the main opportunity for optimization
is in rewriting that statement.
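
For concreteness, one rewrite I have been wondering about (just a sketch; I
have not verified that it is actually faster on my data) moves the max() into
a correlated subquery, so that each machine's latest timestamp can, at least
in principle, be answered straight from an index on (Machine, Timestamp) like
the PKEY index I mention below:

    select a.*
      from stats a
     where a.Timestamp = (select max(Timestamp) from stats b
                           where b.Machine = a.Machine)
     order by Load, Mem*MemPctFree desc, Scratch desc;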

The CGI script uses DBI and the exact statement is my $aref =
$dbh->selectall_arrayref($ss2); where $ss2 is the select statement above.

The idea is that the least loaded machines appear first in the list.  The
usefulness of the stats reported is of course debatable, but again, let's
treat that as a separate issue.  Some of the machines have very large scratch
partitions, which makes them particularly suitable for jobs that generate
very large files.

A typical group of rows might look like this:


Machine   Load  Scratch space  Total memory  Free memory  Running    User(s)  Timestamp                Message
                available      (GB)          (%)          processes
peaklx6   0     21G            12            96           1          1        [EMAIL PROTECTED]:28:35
peaklx1   0     58G            12            88           1          1        [EMAIL PROTECTED]:29:03
peaklx5   0     19G            12            86           1          1        [EMAIL PROTECTED]:28:33
peaklx4   0                    12            84           1          4        [EMAIL PROTECTED]:28:31
peaklx2   0     59G            12            65           1          2        [EMAIL PROTECTED]:29:05
peaklx21  0                    4             14           1          2        [EMAIL PROTECTED]:29:07
peaklx3   0.1                  12            78           1          3        [EMAIL PROTECTED]:28:27
fstlx1    0.2   20G            4             5            1          11       [EMAIL PROTECTED]:28:37
peaklx22  1                    4             53           2          0        [EMAIL PROTECTED]:29:09
peaklx24  2                    4             58           3          0        [EMAIL PROTECTED]:29:12
peaklx29  2                    4             40           3          3        [EMAIL PROTECTED]:28:25
peaklx23  2                    4             39           3          0        [EMAIL PROTECTED]:29:11
peaklx30  2                    4             29           3          0        [EMAIL PROTECTED]:28:29
peaklx28  2                    4             28           3          0        [EMAIL PROTECTED]:28:23
peaklx26  3                    12            69           4          0        [EMAIL PROTECTED]:29:17
peaklx25  4                    4             21           5          0        [EMAIL PROTECTED]:29:15
peaklx27  4                    4             17           5          2        [EMAIL PROTECTED]:28:21


The problem I am having is that the page load times are getting very, very
long.  The table now has about 700,000 rows, and the select statement is
taking about 10 seconds to execute (I benchmarked various parts of the
program, and it's not the DB connect or loading the modules or whatnot).
The DB file lives on an NFS-mounted share, but running the same query locally
and interactively takes approximately the same amount of time.  The DB file
itself is currently 89MB.  The web server (and all the servers in question)
are dual-processor 2.8GHz Xeons with a minimum of 4GB of memory, all running
the same OS.  Some have hyperthreading turned on and others don't.

I tried putting a copy of the DB file in the same directory as the script;
no dice.

I tried timing the subquery and found that the query 'select Machine,
max(Timestamp) as M from stats group by machine' takes about 2 seconds
while the entire query takes about 10.  I tried adding several indices to
the table:

CREATE INDEX PKEY on stats (Machine, Timestamp);
CREATE INDEX TS on stats (Timestamp);
CREATE INDEX MACH on stats (Machine);

and reran the same benchmarks after creating each index, but the times were
unchanged.  Are indices something that only work if you create them BEFORE
you start adding data?
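
If it is useful, one check I could run to see whether the planner touches any
of these indices (assuming EXPLAIN QUERY PLAN is available in 3.2.2; plain
EXPLAIN certainly is, though its opcode dump is harder to read) would be:

    EXPLAIN QUERY PLAN
    select a.* from stats a,
         (select Machine, max(Timestamp) as M from stats group by Machine) b
     where a.Machine = b.Machine and a.Timestamp = b.M;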

I have tried changing the default cache size to an obscenely large
700000000 with no change in my benchmarks.  I assume increasing page size
won't make a difference if cache size is that big.
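
For reference, my understanding is that cache_size is counted in pages (1KB
each by default on this version, I believe), so a far more modest setting
already covers the whole 89MB file:

    PRAGMA page_size;             -- reports the page size; 1024 by default
    PRAGMA cache_size = 100000;   -- ~100MB at 1KB pages; per-connection
                                  -- (default_cache_size persists in the file)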

I tried the very dangerous PRAGMA synchronous = OFF; however, that setting
doesn't seem to persist between sessions, and I can't figure out from within
Perl whether it actually took effect.  In any case, it didn't make a
difference interactively.
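
In case it matters, synchronous is a per-connection setting (and, as I
understand it, only affects write speed, so it may be moot for the SELECT
anyway).  A rough sketch of how I could confirm it takes effect from Perl,
with the filename just a placeholder:

    use DBI;
    my $dbh = DBI->connect("dbi:SQLite:dbname=stats.db", "", "",
                           { RaiseError => 1 });
    $dbh->do("PRAGMA synchronous = OFF");
    # querying the pragma with no value reports the current setting (0 = OFF)
    my ($sync) = $dbh->selectrow_array("PRAGMA synchronous");
    print "synchronous = $sync\n";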

Finally, PRAGMA temp_store = MEMORY; did not help either.

Anyway, I am a novice SQL programmer, though I'd describe myself as a rapidly
improving, competent Perl programmer.  The only solution I can think of is
creating a second table that holds just the most recent info, and keeping the
big table for historical reasons (eventually I hope to create charts from the
historical data to show trends, and to give my users, and perhaps someday a
sort of poor man's load-balancing program, more info to go on when picking
which machine to run their big job on).  I have also read about preparing a
statement and then executing the resulting statement handle, but does that
work for SQLite?  I've had great luck just using fetchall_(array|hash)ref.
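
For what it is worth, the prepare/execute pattern I have read about would look
roughly like this under DBI/DBD::SQLite (a sketch; it should be equivalent to
the selectall_arrayref call above):

    my $sth = $dbh->prepare(q{
        select a.* from stats a,
             (select Machine, max(Timestamp) as M from stats group by Machine) b
         where a.Machine = b.Machine and a.Timestamp = b.M
         order by Load, Mem*MemPctFree desc, Scratch desc
    });
    $sth->execute;
    my $aref = $sth->fetchall_arrayref;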

From what I read, I should be able to get better performance out of this
app.  I probably have wasted a lot of time pursuing dead ends, but I
figured I'd do my due diligence before emailing the list.  I must also
admit I am pretty impressed that, despite my mucking around, the overall
app remains as robust as ever (sure, every once in a while while I was
building the indices and whatnot the UPDATEs failed, but for the most part
she takes a licking and keeps on ticking).

So there it is; what next, my friends?

TIA,

Jonathan








