Have you considered running a process that summarizes the data from the table and just fetching the last summary for display? Will a periodic snapshot work for your reporting or do you need realtime summarization?
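For example, the same cron job that INSERTs into stats could also keep a one-row-per-machine summary table that the CGI script reads directly. Here's a minimal sketch, assuming DBI and a hypothetical table called latest_stats (not part of your current schema); the CREATE would run once, and the INSERT OR REPLACE once per machine per collection pass:

  # One-time setup: a summary table keyed on Machine. The big stats
  # table is untouched and keeps the full history.
  $dbh->do(q{
      CREATE TABLE latest_stats (
          Machine    text PRIMARY KEY,
          Load       real,
          Scratch    text,
          Mem        int,
          MemPctFree int,
          Procs      int,
          Users      int,
          Timestamp  text,
          Message    text
      )
  });

  # In the gathering job: after inserting the new observation into
  # stats, overwrite that machine's row in the summary table.
  my $upd = $dbh->prepare(
      "INSERT OR REPLACE INTO latest_stats VALUES (?,?,?,?,?,?,?,?,?)"
  );
  $upd->execute(@row);   # the same nine values just written to stats

  # In the CGI script, the page becomes a scan of ~17 rows:
  my $aref = $dbh->selectall_arrayref(
      "select * from latest_stats
       order by Load, Mem*MemPctFree desc, Scratch desc"
  );

Whether you take a snapshot like that or keep summarizing on the fly, the historical stats table stays exactly as it is.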
On 1/12/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
Hi all. I will try to post all the data I can think of. The basic issue is that I have a Perl CGI script that shows the most recent load stats for a bunch of servers we use at our site for long, compute-intensive programs. The environment is all Red Hat Enterprise Linux 4 (2.6.9-42.26.ELsmp #1 SMP i686 i686 i386 GNU/Linux), and the SQLite version is 3.2.2. As the number of records in my database has increased, the page load time has also increased, now to about 10 seconds, which is unacceptable, IMO.

The idea is to give users an idea of the least loaded machine to help them decide which one to use for their next program. How best to do this is, of course, a complicated issue in and of itself. At my org there is no other system in place, so this is how I do it, and I've gotten a big positive response from users. The last implementation used a CSV file, and each iteration of gathering the data from the servers took long enough that the cron job could only run every five minutes. Since switching to SQLite, the cron job, which now runs every minute, has never failed to complete in under one minute that I know of. I'd like to focus on optimizing my current implementation, and then, if relevant, talk about other, perhaps better, ways to do it.

Currently, both the old five-minute CSV version and the current beta (SQLite) version run in parallel, and the CSV version is still more widely used since that page actually loads in a reasonable amount of time. The CSV data are overwritten each time, so the process of reading that data in is always the same. The SQLite database is INSERTed into because I want to keep historical data; the rationale for this is explained later.

The current system uses a DB with a table called stats that holds more or less minutely data on various conditions for ~17 so-called peak machines. The table can be described thus:

  CREATE TABLE stats (
      Machine text not null,
      Load real,
      Scratch text(4),
      Mem int,
      MemPctFree int,
      Procs int,
      Users int,
      Timestamp text(20) not null,
      Message text
  );

My CGI script fetches the most recent observation for each machine with a select statement of the form:

  select a.*
  from stats a,
       (select Machine, max(Timestamp) as M from stats group by Machine) b
  where a.Machine = b.Machine
    and a.Timestamp = b.M
  order by Load, Mem*MemPctFree desc, Scratch desc;

I floundered about trying to find a select statement that would simply give me the right answer; I suspect the main opportunity for optimization is in rewriting that statement. The CGI script uses DBI, and the exact call is

  my $aref = $dbh->selectall_arrayref($ss2);

where $ss2 is the select statement above. The idea is that the least loaded machines come first in the list. The usefulness of the stats reported is of course debatable, but again, let's keep that separate from the performance problem. Some of the machines have very large scratch partitions, which makes them particularly suitable for jobs that generate very large files.
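For completeness, the relevant part of the CGI script boils down to roughly this (trimmed, and with an illustrative local path; the real DB file is on an NFS share):

  #!/usr/bin/perl
  use strict;
  use warnings;
  use DBI;

  my $dbh = DBI->connect("dbi:SQLite:dbname=stats.db", "", "",
                         { RaiseError => 1 });

  # Most recent observation per machine, least loaded first.
  my $ss2 = q{
      select a.*
      from stats a,
           (select Machine, max(Timestamp) as M
              from stats group by Machine) b
      where a.Machine = b.Machine
        and a.Timestamp = b.M
      order by Load, Mem*MemPctFree desc, Scratch desc
  };

  my $aref = $dbh->selectall_arrayref($ss2);
  # $aref then gets formatted into the HTML table the users see.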
A typical group of rows might look like this (columns are Machine, Load, scratch space available, total memory in GB, free memory %, running processes, users, and Timestamp; the Message column is empty in all of these rows, and "-" means no Scratch value was recorded):

  Machine   Load  Scratch  Mem(GB)  MemFree(%)  Procs  Users  Timestamp
  peaklx6   0     21G      12       96          1      1      [EMAIL PROTECTED]:28:35
  peaklx1   0     58G      12       88          1      1      [EMAIL PROTECTED]:29:03
  peaklx5   0     19G      12       86          1      1      [EMAIL PROTECTED]:28:33
  peaklx4   0     -        12       84          1      4      [EMAIL PROTECTED]:28:31
  peaklx2   0     59G      12       65          1      2      [EMAIL PROTECTED]:29:05
  peaklx21  0     -        4        14          1      2      [EMAIL PROTECTED]:29:07
  peaklx3   0.1   -        12       78          1      3      [EMAIL PROTECTED]:28:27
  fstlx1    0.2   20G      4        5           1      11     [EMAIL PROTECTED]:28:37
  peaklx22  1     -        4        53          2      0      [EMAIL PROTECTED]:29:09
  peaklx24  2     -        4        58          3      0      [EMAIL PROTECTED]:29:12
  peaklx29  2     -        4        40          3      3      [EMAIL PROTECTED]:28:25
  peaklx23  2     -        4        39          3      0      [EMAIL PROTECTED]:29:11
  peaklx30  2     -        4        29          3      0      [EMAIL PROTECTED]:28:29
  peaklx28  2     -        4        28          3      0      [EMAIL PROTECTED]:28:23
  peaklx26  3     -        12       69          4      0      [EMAIL PROTECTED]:29:17
  peaklx25  4     -        4        21          5      0      [EMAIL PROTECTED]:29:15
  peaklx27  4     -        4        17          5      2      [EMAIL PROTECTED]:28:21

The problem I am having is that the page load times are getting very, very long. The table now has about 700,000 rows, and the select statement takes about 10 seconds to execute (I benchmarked various parts of the program, and it's not the DB connect or loading the modules or whatnot). The DB file lives on an NFS-mounted share, but running the same query on a local copy, and interactively, takes approximately the same amount of time. The size of the DB file itself is currently 89MB. The web server (and all the servers in question) are dual-processor 2.8GHz Xeons with a minimum of 4GB of memory, all running the same OS; some have hyperthreading turned on and others don't. I tried putting a copy of the DB file in the same directory as the script; no dice.

I tried timing the subquery and found that 'select Machine, max(Timestamp) as M from stats group by machine' takes about 2 seconds, while the entire query takes about 10. I tried adding several indices to the table:

  CREATE INDEX PKEY on stats (Machine, Timestamp);
  CREATE INDEX TS on stats (Timestamp);
  CREATE INDEX MACH on stats (Machine);

and reran the same benchmarks after each index creation, but got the same times. Are indices something that only work if you create them BEFORE you start adding data?

I have tried changing the default cache size to an obscenely large 700000000, with no change in my benchmarks; I assume increasing the page size won't make a difference if the cache size is that big. I tried the very dangerous PRAGMA synchronous = OFF, but that setting doesn't seem to persist between sessions, and I can't figure out whether it actually took effect from within Perl; in any case it didn't make a difference interactively. Finally, PRAGMA temp_store = MEMORY did not help either.

Anyway, I am a novice SQL programmer; I would describe myself as a rapidly improving, competent Perl programmer. The only solution I can think of is creating a second table that holds only the most recent info, and keeping the big table for historical reasons (eventually I hope to create charts from the historical data to show trends, and perhaps give my users, and maybe eventually a sort of poor-man's load-balancing program, more info to go on when picking which machine to run a big job on). I have also read about preparing a statement and then executing the resulting statement handle, but does that work for SQLite? I've had great luck just using fetchall_(array|hash)ref. From what I read, I should be able to get better performance out of this app.
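In case it matters, my pragma experiments from Perl looked roughly like this; it's just a sketch, and I'm not certain that reading the pragma back with selectrow_array is the right way to check it:

  # synchronous is a per-connection setting, which I assume is why it
  # doesn't persist between sessions.
  $dbh->do("PRAGMA synchronous = OFF");

  # Read the current value back: 0 = OFF, 1 = NORMAL, 2 = FULL.
  my ($sync) = $dbh->selectrow_array("PRAGMA synchronous");
  print STDERR "synchronous is now $sync\n";

  # And the prepare/execute form I was asking about, in case that is
  # what people mean:
  my $sth = $dbh->prepare($ss2);
  $sth->execute;
  my $rows = $sth->fetchall_arrayref;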
I probably have wasted a lot of time pursuing dead ends, but I figured I'd do my due diligence before emailing the list. I must also admit I am pretty impressed that, despite my mucking around, the overall app remains as robust as ever (sure, every once in a while as I was building the indices the UPDATEs failed, but for the most part she takes a licking and keeps on ticking).

So there it is; what next, my friends?

TIA,
Jonathan
--
The PixAddixImage Collector suite:
http://groups-beta.google.com/group/pixaddix

SqliteImporter and SqliteReplicator: Command line utilities for Sqlite
http://www.reddawn.net/~jsprenkl/Sqlite

Cthulhu Bucks!
http://www.cthulhubucks.com