Thanks for the tips.
I ran the vacuumdb loop a few times, and it looks like sl_log_1 is
the table that's bloating.
Over a roughly 10-minute period, sl_log_1 ends up with just about as
many dead rows as live rows (in the 20K range). sl_confirm gets about
a thousand rows, and sl_seqlog gets ~60 live rows plus roughly as
many dead ones over the same period.
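To keep an eye on how much sl_log_1 grows between cleanup runs, I can
also watch the planner stats directly -- a rough check like this
(relpages/reltuples are only estimates as of the last VACUUM/ANALYZE)
should tell the same story without a full vacuum pass:

  SELECT c.relname, c.relpages, c.reltuples
    FROM pg_class c
    JOIN pg_namespace n ON n.oid = c.relnamespace
   WHERE n.nspname = '_slony'
     AND c.relname IN ('sl_log_1', 'sl_log_2', 'sl_confirm', 'sl_seqlog');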
Like you said, perhaps the confirms aren't getting through, but I'm
not sure how to remedy that.
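For what it's worth, here's roughly how I'd check whether the
confirmations are lagging -- this assumes the standard sl_status view
in the _slony schema:

  SELECT st_origin, st_received,
         st_lag_num_events, st_lag_time
    FROM _slony.sl_status
   ORDER BY st_lag_num_events DESC;

If st_lag_num_events keeps climbing for some node, I'd take that as
the confirms not making it through.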
In terms of long-running transactions, I'm getting a handful of
15-30s SELECTs on various tables every now and then (every 15 minutes
or so). Are you suggesting that it's not so much that Slony is
struggling, but that I need to tweak postgresql.conf?
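Also, in case it's useful, a query along these lines against
pg_stat_activity should catch those long-running statements in the
act (column names vary by PostgreSQL version, and it needs
stats_command_string turned on):

  SELECT procpid, usename, query_start,
         now() - query_start AS runtime, current_query
    FROM pg_stat_activity
   WHERE current_query <> '<IDLE>'
   ORDER BY query_start;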
The output of one run is below:
INFO: vacuuming "_slony.sl_confirm"
INFO: index "sl_confirm_idx1" now contains 2398 row versions in 515
pages
DETAIL: 1420 index row versions were removed.
493 index pages have been deleted, 484 are currently reusable.
INFO: index "sl_confirm_idx2" now contains 2402 row versions in 438
pages
DETAIL: 1420 index row versions were removed.
421 index pages have been deleted, 413 are currently reusable.
INFO: "sl_confirm": removed 1420 row versions in 17 pages
INFO: "sl_confirm": found 1420 removable, 2398 nonremovable row
versions in 28 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
INFO: "sl_confirm": scanned 28 of 28 pages, containing 2408 live
rows and 0 dead rows; 2408 rows in sample, 2408 estimated total rows
VACUUM
INFO: vacuuming "_slony.sl_setsync"
INFO: index "sl_setsync-pkey" now contains 0 row versions in 1 pages
DETAIL: 0 index pages have been deleted, 0 are currently reusable.
INFO: "sl_setsync": found 0 removable, 0 nonremovable row versions
in 0 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
INFO: vacuuming "pg_toast.pg_toast_357381189"
INFO: index "pg_toast_357381189_index" now contains 0 row versions
in 1 pages
DETAIL: 0 index pages have been deleted, 0 are currently reusable.
INFO: "pg_toast_357381189": found 0 removable, 0 nonremovable row
versions in 0 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
INFO: "sl_setsync": scanned 0 of 0 pages, containing 0 live rows and
0 dead rows; 0 rows in sample, 0 estimated total rows
VACUUM
INFO: vacuuming "_slony.sl_log_1"
INFO: index "sl_log_1_idx1" now contains 55103 row versions in 10468
pages
DETAIL: 33856 index row versions were removed.
10174 index pages have been deleted, 9996 are currently reusable.
INFO: index "sl_log_1_idx2" now contains 56952 row versions in 5166
pages
DETAIL: 33856 index row versions were removed.
4991 index pages have been deleted, 4888 are currently reusable.
INFO: "sl_log_1": removed 33856 row versions in 1491 pages
INFO: "sl_log_1": found 33856 removable, 54743 nonremovable row
versions in 4223 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 243532 unused item pointers.
0 pages are entirely empty.
INFO: vacuuming "pg_toast.pg_toast_357381304"
INFO: index "pg_toast_357381304_index" now contains 0 row versions
in 2 pages
DETAIL: 0 index pages have been deleted, 0 are currently reusable.
INFO: "pg_toast_357381304": found 0 removable, 0 nonremovable row
versions in 0 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
INFO: "sl_log_1": scanned 3000 of 4223 pages, containing 15546 live
rows and 25252 dead rows; 3000 rows in sample, 21884 estimated total
rows
VACUUM
INFO: vacuuming "_slony.sl_log_2"
INFO: index "sl_log_2_idx1" now contains 0 row versions in 1 pages
DETAIL: 0 index pages have been deleted, 0 are currently reusable.
INFO: index "sl_log_2_idx2" now contains 0 row versions in 1 pages
DETAIL: 0 index pages have been deleted, 0 are currently reusable.
INFO: "sl_log_2": found 0 removable, 0 nonremovable row versions in
0 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
INFO: vacuuming "pg_toast.pg_toast_357381311"
INFO: index "pg_toast_357381311_index" now contains 0 row versions
in 1 pages
DETAIL: 0 index pages have been deleted, 0 are currently reusable.
INFO: "pg_toast_357381311": found 0 removable, 0 nonremovable row
versions in 0 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
INFO: "sl_log_2": scanned 0 of 0 pages, containing 0 live rows and 0
dead rows; 0 rows in sample, 0 estimated total rows
VACUUM
INFO: vacuuming "_slony.sl_seqlog"
INFO: index "sl_seqlog_idx" now contains 98 row versions in 134 pages
DETAIL: 131 index pages have been deleted, 131 are currently reusable.
INFO: "sl_seqlog": found 0 removable, 98 nonremovable row versions
in 1 pages
DETAIL: 53 dead row versions cannot be removed yet.
There were 47 unused item pointers.
0 pages are entirely empty.
INFO: "sl_seqlog": scanned 1 of 1 pages, containing 45 live rows and
53 dead rows; 45 rows in sample, 45 estimated total rows
VACUUM
Thanks a lot for your help!
--Richard
On Sep 28, 2006, at 12:35 PM, Christopher Browne wrote:
> Richard Yen <[EMAIL PROTECTED]> writes:
>> Looking at my logs, the cleanupThread runs about once every 11
>> minutes, and the vacuuming takes anywhere from 90s to 200s to finish.
>>
>> My slon_conf_vac_frequency is set at 1, so I can't make the vacuums
>> finish any faster.
>
>
> The control of how long between cleanup calls is in slon.h:
>
> #define SLON_CLEANUP_SLEEP 600 /* sleep 10 minutes between */
>
> Changing 600 to something less, and recompiling, wouldn't be "major
> surgery."
>
> To my mind, the real question is what table or tables have bloated,
> and why.
>
> I'd suggest running the following, assuming the Slony-I schema is
> _mycluster...
>
> for table in confirm setsync log_1 log_2 seqlog; do
> vacuumdb -t "_mycluster".sl_${table} -v -z
> done
>
> (I'm assuming PGDATABASE, PGPORT, PGUSER, PGHOST are all set
> suitably...)
>
> This will give some idea as to which table is bloated.
>
> There are two particularly plausible directions:
>
> 1. sl_log_1 and sl_seqlog have a lot of live tuples, typically caused
> by confirmations not getting thru.
>
> 2. sl_log_1 and sl_seqlog have a lot of dead tuples, typically caused
> by having some long running transaction holding onto things.
>
> Note that in neither case is the answer to vacuum more frequently...
> --
> "cbbrowne","@","ca.afilias.info"
> <http://dba2.int.libertyrms.com/>
> Christopher Browne
> (416) 673-4124 (land)