uDig/GeoTools/JDBC clobbers Postgres vacuums

I was wondering if anyone had any experience with the following problem. We 
have encountered a situation where uDig 1.1.RC4 interferes with Postgres 
ability to perform maintenance on tables, eventually causing the database 
management system to fail. I'm not sure whether the problem is in uDig, 
GeoTools, or the JDBC itself.  If it lies in uDig or GeoTools, then I suspect 
that the problem can be fixed easily.

Background:

We run a high performance, real-time application that tracks aircraft movements 
on a 7-by-24 basis. The system uses Postgres as its database engine. We are 
currently working on integrating the system with GeoServer and uDig and have 
recently installed PostGIS. A key feature of the database is that it features a 
nearly 100 percent turnover of its data every 24 hours and must run for weeks 
at a time without being shutdown for maintenance. We stress the heck out of 
Postgres and it runs like a champ.

Recently, I have seen occasions where database performance fell through the 
floor. Maintenance procedures that normally ran in two minutes were taking over 
11000 seconds.  After some investigation, and many false leads, I determined 
that these problems occurred when I left uDig running unattended.  To make the 
problem happen, I opened a PostGIS Data Source, plotted, performed one info 
request, and went home.

The cause is that somewhere in uDig something happens to interfere with the 
vacuum operation.   In Postgres, when you delete a record, the disk space it 
formerly occupied is not made available for re-use until you or the system 
performs a maintenance procedure known as a vacuum. Thus, if this maintenance 
procedure does not occur, the database accumulates a lot of dead space in its 
files if an application performs a lot of deletions (or SQL updates... which 
are essentially copy the old record, delete it, and rewrite it in its updated 
form).  As dead space accumulates, data becomes scatted over many pages in the 
database file system and badly fragmented, until all queries fairly crawl). 
Over the last couple of years, the Postgres folks introduced a feature to 
perform this vacuum operation automatically, but we don't use it because it 
does add some overhead and slow performance (at least when we tested the 
feature a couple of releases ago). Fortunately, because we were performing the 
Vacuum manually, we had the means to verify that it was indeed uDig (or one of 
its components) that was interfering with the Vacuum.

The Cause:

There are certain database operations that can stall a Postgres vacuum process. 
For example, if a table is locked, or if an application is in the middle of 
performing a SQL transaction (“Begin” and “Commit”), the Vacuum operation waits 
until the lock is released or the transaction is committed before performing 
any further operations on the table.  When we run a Vacuum manually, we can put 
it in “verbose” mode and watch its progress.  Last night, I started uDig, 
plotted a layer, and started the Vacuum.  It moved along merrily until it 
reached the table that I had plotted in uDig, at which point it stalled.  When 
I terminated uDig several hours later, the Vacuum resumed almost immediately.

I believe that uDig is either locking the table of interest or holding a 
transaction open. And I’m not sure whether this is “a bug or a feature”.  From 
a certain perspective, this is not an unreasonable thing for it to do since one 
can view uDig as a data-editing tool…  and if a user has exclusive authorship 
of a database, then he might very well want a transaction open for long periods 
of time. However, in a case where uDig is being used to inspect a database (or 
a user just walks away and leaves his editing session open), it can interfere 
with other operations.  So, if it is a feature rather than a bug, it may be a 
feature that needs to be re-thought.

Suggestion:

I don’t know enough about uDig and GeoTools to make a suggestion with any kind 
of authority, but I wonder if the quickest fix might be to allow uDig to open a 
PostGIS Data Source on a read-only basis and ensure that when it does so, no 
transactions or locks are left in place between rendering operations. A more 
sophisticated approach would allow multiple users to operate on the database, 
but may be harder to implement.

Qualification:

Even as I write this, I must admit that I do not understand enough about the 
workings of uDig (the source code of which I find quite challenging to follow), 
or even Postgres, to claim that my analysis is completely correct.  But I did 
conduct the experiment as described above multiple times and have no doubt that 
the problems clear up almost immediately when uDig terminates. I hope that my 
observations can contribute to the continuing process of improvement of the 
uDig application.

Thank you for your attention in this matter.

Gary


---
Gary W. Lucas, Senior Software Engineer
Sonalysts, Inc
215 Parkway North
Waterford, CT 06320
(860) 326-3682
_______________________________________________
User-friendly Desktop Internet GIS (uDig)
http://udig.refractions.net
http://lists.refractions.net/mailman/listinfo/udig-devel

Reply via email to