Thanks! You were right. The database needed some tuning. I used pgtune and applied the suggested changes. Now the DELETE only takes 27 s instead of 8 h. The plan also looks very different: it's now actually using the rhn_pkg_clr_cld_uq index, unlike before:

spacewalk=# explain analyze verbose DELETE FROM rhnPackageChangeLogData
spacewalk-# WHERE id NOT IN ( SELECT DISTINCT changelog_data_id FROM rhnPackageChangeLogRec );
                                                                                
  QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on rhnpackagechangelogdata  (cost=440805.00..455833.24 rows=200770 width=6) (actual time=27688.250..27688.250 rows=0 loops=1)
   Output:
   Filter: (NOT (hashed SubPlan 1))
   SubPlan 1
     ->  Unique  (cost=0.00..440642.63 rows=64947 width=8) (actual time=0.113..25919.024 rows=401528 loops=1)
           Output: rhnpackagechangelogrec.changelog_data_id
           ->  Index Scan using rhn_pkg_clr_cld_uq on rhnpackagechangelogrec  (cost=0.00..429560.29 rows=4432935 width=8) (actual time=0.106..14740.702 rows=5178070 loops=1)
                 Output: rhnpackagechangelogrec.changelog_data_id
 Total runtime: 27707.343 ms
(9 rows)
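For anyone curious which knobs made the difference: the relevant settings can be inspected from psql. This is just a sketch of what to look at; work_mem is the one most likely to have flipped the plan from an on-disk sort to the hashed subplan above:

```sql
-- Memory settings that influence whether the planner can hash the
-- subquery result in memory instead of sorting it on disk:
SHOW work_mem;              -- per sort/hash operation; too low forces on-disk sorts
SHOW effective_cache_size;  -- planner's estimate of available OS cache
SHOW shared_buffers;        -- PostgreSQL's own buffer cache
```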

Now, I hope the tuned database won't affect the spacewalk services...

Thanks again for your help!

Cheers,

Gerald

On 08/06/15 20:19, Paul Robert Marino wrote:
Well, I'm running PostgreSQL 9.x on all of my database servers, so my
plan looks very different from yours. For instance, my servers don't
need to do all of those sorts, so it's not a viable comparison. That
said, the numbers are nothing like yours; my servers chomp through that
query reasonably quickly.
Also, I've seen PostgreSQL 8.4 handle more complex operations on 1 TB
tables much more efficiently than how your box is handling things.

I suspect your issue is that work_mem is set too low to handle the
dataset without resorting to a temporary directory to swap the data in
and out of memory. Additionally, in PostgreSQL 9.x effective_cache_size
auto-tunes to 1/3 of your RAM, which is fine for most users, but in 8.x
you still need to set it statically in the config.
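For 8.4 that means setting these statically in postgresql.conf. A hedged sketch only; the right values depend entirely on your RAM and workload (the example numbers below assume a dedicated box with roughly 8 GB of RAM):

```ini
# postgresql.conf -- example values for a dedicated ~8 GB machine; adjust to taste
work_mem = 64MB                 # allocated per sort/hash operation, so keep it modest
effective_cache_size = 6GB      # roughly what the OS page cache can hold
shared_buffers = 2GB            # commonly ~25% of RAM on a dedicated server
```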
Also, load can be deceptive on modern systems; I rarely use it as a
benchmark any more. The interesting thing to know is what sar says
about your system during those queries. Furthermore, which scheduler
are you using for the disks? If you are using CFQ, which is the default
on Red Hat, your I/O performance can be weighted somewhat against
PostgreSQL in favor of other applications, because PostgreSQL eats so
much I/O. I always advise the deadline scheduler for physical servers
and noop for virtual ones.
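Checking and switching the scheduler is straightforward. A sketch (sda is just an example device name; substitute your own):

```shell
# The kernel exposes the scheduler per block device; the bracketed entry
# is the active one, e.g. "noop deadline [cfq]". This helper extracts it:
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\(.*\)\].*/\1/p'
}

# On a live system you would read the sysfs file directly:
#   cat /sys/block/sda/queue/scheduler
#   echo deadline > /sys/block/sda/queue/scheduler   # as root; physical host

active_scheduler "noop deadline [cfq]"   # prints: cfq
```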

The final question is: could it by any chance have been a really long
time since that query last completed successfully? In that case it
probably has a lot of cleanup to take care of.
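If the tables haven't been cleaned up in a long time, a manual vacuum before the first run might help. A sketch against the two tables involved:

```sql
-- Reclaim dead rows and refresh planner statistics:
VACUUM ANALYZE rhnPackageChangeLogRec;
VACUUM ANALYZE rhnPackageChangeLogData;
```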


On Mon, Jun 8, 2015 at 1:33 PM, Gerald Vogt <v...@spamcop.net> wrote:
vacuum analyze is the first thing I do when I see problems on a
postgresql database...

The database is running on the spacewalk server, but it doesn't show
much load at any time, nor is it slow on the disks.

The estimates from explain verbose don't look too promising to me.

What do you get for

explain verbose DELETE FROM rhnPackageChangeLogData WHERE id NOT IN (
SELECT DISTINCT changelog_data_id FROM rhnPackageChangeLogRec );

??

Thanks,

Gerald

On 08.06.15 19:05, Paul Robert Marino wrote:
Looks like your database isn't tuned correctly, or you desperately need
to run a vacuum analyze on it.
Check your disks too.
This is most likely a problem with your database server's
configuration, maintenance, or hardware.


On Mon, Jun 8, 2015 at 11:07 AM, Gerald Vogt <v...@spamcop.net> wrote:
Does anyone have an idea what I could do about this?

Does anyone see something similar?

Thanks,

Gerald
On 28.05.15 06:57, Gerald Vogt wrote:
Eventually cleanup-data-bunch finished after 8:40 h. Definitely not
suitable to run daily.

I don't quite understand why the query runs with two sequential scans
and does not use the indexes.

Does anyone have an idea how to optimize that? Does it always take this
long? I am using CentOS 6.6, Spacewalk 2.3, PostgreSQL 8.4.20.
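On 8.4, a NOT IN (subquery) is not converted into an anti-join, which is part of why the plan below falls back to materializing and re-checking the subquery result. If anyone wants to experiment, an equivalent NOT EXISTS form usually plans much better. This is a sketch only, untested against the Spacewalk schema, so verify with EXPLAIN and check the row counts before trusting it:

```sql
-- Equivalent rewrite of the cleanup DELETE using NOT EXISTS, which the
-- planner can execute as an anti-join instead of a repeated subquery check.
DELETE FROM rhnPackageChangeLogData d
WHERE NOT EXISTS (
    SELECT 1
    FROM rhnPackageChangeLogRec r
    WHERE r.changelog_data_id = d.id
);
```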

For now, I'll schedule that task monthly...

Thanks,

Gerald

On 27.05.15 13:39, Gerald Vogt wrote:
Hi!

It seems cleanup-data-bunch takes a very long time.

I noticed that it was in state "INTERRUPTED" and had not run for a
couple of months. I applied the changes suggested in

https://www.redhat.com/archives/spacewalk-list/2015-May/msg00091.html

and now cleanup-data-bunch does run again. It is, however, very slow:
it has been running for 2:15 h now and is still going.

I can see with pg_stat_activity that it is running this query:

DELETE FROM rhnPackageChangeLogData
          WHERE id NOT IN ( SELECT DISTINCT changelog_data_id FROM rhnPackageChangeLogRec )

which seems to keep the database busy for a while:

spacewalk=# explain verbose DELETE FROM rhnPackageChangeLogData WHERE id NOT IN ( SELECT DISTINCT changelog_data_id FROM rhnPackageChangeLogRec );
                                                 QUERY PLAN
-----------------------------------------------------------------------------------------------------------
  Seq Scan on rhnpackagechangelogdata  (cost=803563.75..215621093.82 rows=201534 width=6)
    Output:
    Filter: (NOT (SubPlan 1))
    SubPlan 1
      ->  Materialize  (cost=803563.75..804467.22 rows=64947 width=8)
            Output: rhnpackagechangelogrec.changelog_data_id
            ->  Unique  (cost=781080.12..803244.80 rows=64947 width=8)
                  Output: rhnpackagechangelogrec.changelog_data_id
                  ->  Sort  (cost=781080.12..792162.46 rows=4432935 width=8)
                        Output: rhnpackagechangelogrec.changelog_data_id
                        Sort Key: rhnpackagechangelogrec.changelog_data_id
                        ->  Seq Scan on rhnpackagechangelogrec  (cost=0.00..109859.35 rows=4432935 width=8)
                              Output: rhnpackagechangelogrec.changelog_data_id
(13 rows)

spacewalk=# select count(*) from rhnpackagechangelogrec;
   count
---------
  5112795
(1 row)

spacewalk=# select count(*) from rhnpackagechangelogdata;
  count
--------
  401000
(1 row)


If it is taking this long, I guess there is no point scheduling it daily.

Any thoughts?

Thanks,

Gerald


_______________________________________________
Spacewalk-list mailing list
Spacewalk-list@redhat.com
https://www.redhat.com/mailman/listinfo/spacewalk-list

