I have an FB 2.1.5 Classic server running on a Windows 2003 server,
with a single hard drive for the operating system, and a 3 disk raid 5
array for the database.  We have one database on this machine, which
is a dialect 1 database that was started on IB6.0 many years ago,
currently at 90GB.  We have sweep disabled, and each night run gbak,
gfix –sweep, as well as reindex all tables via a script.

 

The database has very little OLTP, and is mostly used for reporting
and serving web pages to internal business users.  We do alot of ETL
starting very early each morning, and create a mix of scheduled
reports as well as allowing users to specify parameters to create
pre-designed reports in an ad-hoc manner.

 

Once or twice per month, the system slows down tremendously.  One ETL
process typically runs at a pace of about 1000 records per 10 seconds.
During these slow periods, the same ETL will run 1000 transactions per
60-80 seconds.  When processing a file with 1mil+ records, this slow
down costs us hours.

 

I have not been able to determine the reason for these slow periods.
They do not coincide with higher cpu or disk usage – most of the time
I'm seeing very little usage of anything – disk/cpu/network/memory.  I
do see more connections to the database during these periods –
typically we have 10 to 15 connections, and the number may double
during the problem times.  This is due to the fact that the reports
that users are requesting are taking longer to run, and our
pooled-connection application server or web server creates more
connections to satisfy new user requests.

 

I do see more queries running by the time I am notified of the
problem.  Again though, this is a coincidence of the slowdown and not
the cause – queries are taking longer to run, and therefore we have
more chance of overlapping user requests than usual.

 

Most of the time we need to reboot once or even twice to fix the
problem.  That is not a viable long-term solution though, and I'm
looking for more ideas to determine what may be happening.  Any ideas
would be most helpful.

 

I have included gstat –h output of the database as it is suffering
from the issue below:

 

Database header page information:

        Flags                   0

        Checksum                12345

        Generation              43136192

        Page size               8192

        ODS version             11.1

        Oldest transaction      40789582

        Oldest active           41467442

        Oldest snapshot         41467442

        Next transaction        42431040

        Bumped transaction      1

        Sequence number         0

        Next attachment ID      705070

        Implementation ID       16

        Shadow count            0

        Page buffers            2048

        Next header page        0

        Database dialect        1

        Creation date           May 2, 2009 22:22:39

        Attributes              force write, no reserve

 

    Variable header data:

        Sweep interval:         0

        *END*

 

Thank you for your assistance,

 

Bob M..



[Non-text portions of this message have been removed]

Reply via email to