Re: autovac issue with large number of tables

2020-12-08 Thread Kasahara Tatsuhito
On Wed, Dec 9, 2020 at 12:01 AM Fujii Masao wrote:
Re: autovac issue with large number of tables

2020-12-08 Thread Fujii Masao




On 2020/12/04 12:21, Kasahara Tatsuhito wrote:

On Wed, Sep 2, 2020 at 2:10 AM Kasahara Tatsuhito wrote:

I wonder if we could have table_recheck_autovac do two probes of the stats
data.  First probe the existing stats data, and if it shows the table to
be already vacuumed, return immediately.  If not, *then* force a stats
re-read, and check a second time.

Does the above mean that the second and subsequent table_recheck_autovac()
calls will be improved to first check using the previously refreshed
statistics?  I think that would certainly work.

If that's correct, I'll try to create a patch for the PoC.


I still don't know how to reproduce Jim's troubles, but I was able to
reproduce what was probably a very similar problem.

This problem seems more likely to occur when you have a large number of
tables, i.e., a large amount of stats data, and many small tables need
VACUUM at the same time.

So I followed Tom's advice and created a patch for the PoC.
The patch sets a flag in table_recheck_autovac() so that, when the recheck
done right after a stats refresh finds that VACUUM (or ANALYZE) has already
been done by another worker, the next recheck reuses the existing stats
instead of refreshing them again.  If a table still requires VACUUM after
the refresh, the stats are refreshed again instead of reusing the existing
statistics.

I did a simple test with HEAD and HEAD + this PoC patch.
The tests were run in two cases.
(I changed a few configuration parameters; see the attached scripts.)

1. Normal VACUUM case
 - SET autovacuum = off
 - CREATE tables with 100 rows each
 - DELETE 90 rows from each table
 - SET autovacuum = on and restart PostgreSQL
 - Measure the time it takes for all tables to be VACUUMed
   (a minimal SQL sketch of this setup is shown below)
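
Not the attached scripts themselves, but a minimal SQL sketch of what this
setup might look like, assuming illustrative table names t_1 .. t_1000 and a
single integer column:

  -- Run while autovacuum = off: create the tables and delete 90 of 100 rows.
  DO $$
  BEGIN
      FOR i IN 1..1000 LOOP
          EXECUTE format('CREATE TABLE t_%s (id int)', i);
          EXECUTE format('INSERT INTO t_%s SELECT g FROM generate_series(1, 100) g', i);
          EXECUTE format('DELETE FROM t_%s WHERE id <= 90', i);
      END LOOP;
  END $$;

  -- Then re-enable autovacuum and restart before measuring.
  ALTER SYSTEM SET autovacuum = on;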

2. Anti-wraparound VACUUM case
 - CREATE blank tables
 - SELECT from all of these tables (to generate stats entries)
 - SET autovacuum_freeze_max_age to a low value and restart PostgreSQL
 - Consume a lot of XIDs by using txid_current()
 - Measure the time it takes for all tables to be VACUUMed
   (a rough sketch of this setup is shown below)
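
A rough sketch of this second setup, again with assumed names and a
deliberately low, illustrative threshold rather than the values used in the
attached scripts:

  -- Blank tables, each read once so that stats entries exist for them.
  DO $$
  BEGIN
      FOR i IN 1..1000 LOOP
          EXECUTE format('CREATE TABLE t_%s (id int)', i);
          EXECUTE format('SELECT * FROM t_%s', i);  -- result discarded; generates stats
      END LOOP;
  END $$;

  -- Make anti-wraparound vacuum trigger early, then restart PostgreSQL.
  ALTER SYSTEM SET autovacuum_freeze_max_age = 200000;

  -- txid_current() allocates a new XID, but only one per transaction, so it
  -- must be run from many separate transactions (e.g. a client-side loop or
  -- pgbench) to consume XIDs quickly:
  SELECT txid_current();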

For each test case, the following results were obtained by changing the
autovacuum_max_workers parameter to 1, 2, 3 (default), 5, and 10, and by
changing the number of tables to 1000, 5000, 10000, and 20000.

Due to the poor VM environment (2 vCPU / 4 GB), the results are a little
unstable, but I think they are enough to show the trend.
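
Both cases end by measuring how long it takes until every table has been
autovacuumed. One illustrative way to detect completion for freshly created
test tables (an assumption, not necessarily how the attached scripts do it):

  -- When this count reaches zero, autovacuum has processed every test table.
  SELECT count(*) AS not_yet_vacuumed
    FROM pg_stat_user_tables
   WHERE last_autovacuum IS NULL;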

===
[1. Normal VACUUM case]
tables:1000
 autovacuum_max_workers 1:   (HEAD) 20 sec VS (with patch)  20 sec
 autovacuum_max_workers 2:   (HEAD) 18 sec VS (with patch)  16 sec
 autovacuum_max_workers 3:   (HEAD) 18 sec VS (with patch)  16 sec
 autovacuum_max_workers 5:   (HEAD) 19 sec VS (with patch)  17 sec
 autovacuum_max_workers 10:  (HEAD) 19 sec VS (with patch)  17 sec

tables:5000
 autovacuum_max_workers 1:   (HEAD) 77 sec VS (with patch)  78 sec
 autovacuum_max_workers 2:   (HEAD) 61 sec VS (with patch)  43 sec
 autovacuum_max_workers 3:   (HEAD) 38 sec VS (with patch)  38 sec
 autovacuum_max_workers 5:   (HEAD) 45 sec VS (with patch)  37 sec
 autovacuum_max_workers 10:  (HEAD) 43 sec VS (with patch)  35 sec

tables:10000
 autovacuum_max_workers 1:   (HEAD) 152 sec VS (with patch)  153 sec
 autovacuum_max_workers 2:   (HEAD) 119 sec VS (with patch)   98 sec
 autovacuum_max_workers 3:   (HEAD)  87 sec VS (with patch)   78 sec
 autovacuum_max_workers 5:   (HEAD) 100 sec VS (with patch)   66 sec
 autovacuum_max_workers 10:  (HEAD)  97 sec VS (with patch)   56 sec

tables:20000
 autovacuum_max_workers 1:   (HEAD) 338 sec VS (with patch)  339 sec
 autovacuum_max_workers 2:   (HEAD) 231 sec VS (with patch)  229 sec
 autovacuum_max_workers 3:   (HEAD) 220 sec VS (with patch)  191 sec
 autovacuum_max_workers 5:   (HEAD) 234 sec VS (with patch)  147 sec
 autovacuum_max_workers 10:  (HEAD) 320 sec VS (with patch)  113 sec

Re: autovac issue with large number of tables

2020-12-03 Thread Kasahara Tatsuhito
Hi,

On Thu, Dec 3, 2020 at 9:09 PM Fujii Masao wrote:

Re: autovac issue with large number of tables

2020-12-03 Thread Fujii Masao



On 2020/12/03 11:46, Kasahara Tatsuhito wrote:

Re: autovac issue with large number of tables

2020-12-02 Thread Kasahara Tatsuhito
On Wed, Dec 2, 2020 at 7:11 PM Masahiko Sawada wrote:

Re: autovac issue with large number of tables

2020-12-02 Thread Masahiko Sawada
On Wed, Dec 2, 2020 at 3:33 PM Fujii Masao wrote:

Re: autovac issue with large number of tables

2020-12-02 Thread Kasahara Tatsuhito
Hi

On Wed, Dec 2, 2020 at 3:33 PM Fujii Masao wrote:

Re: autovac issue with large number of tables

2020-12-01 Thread Fujii Masao




On 2020/12/02 12:53, Masahiko Sawada wrote:

Re: autovac issue with large number of tables

2020-12-01 Thread Masahiko Sawada
On Tue, Dec 1, 2020 at 5:31 PM Masahiko Sawada wrote:

Re: autovac issue with large number of tables

2020-12-01 Thread Masahiko Sawada
On Tue, Dec 1, 2020 at 4:32 PM Fujii Masao wrote:

Re: autovac issue with large number of tables

2020-11-30 Thread Fujii Masao




On 2020/12/01 16:23, Masahiko Sawada wrote:

Re: autovac issue with large number of tables

2020-11-30 Thread Masahiko Sawada
On Tue, Dec 1, 2020 at 1:48 PM Kasahara Tatsuhito wrote:

Re: autovac issue with large number of tables

2020-11-30 Thread Kasahara Tatsuhito
Hi,

On Mon, Nov 30, 2020 at 8:59 PM Fujii Masao wrote:

Re: autovac issue with large number of tables

2020-11-30 Thread Fujii Masao




On 2020/11/30 10:43, Masahiko Sawada wrote:

On Wed, Sep 2, 2020 at 2:10 AM Kasahara Tatsuhito wrote:

I wonder if we could have table_recheck_autovac do two probes of the stats
data.  First probe the existing stats data, and if it shows the table to
be already vacuumed, return immediately.  If not, *then* force a stats
re-read, and check a second time.

Does the above mean that the second and subsequent table_recheck_autovac()
calls will be improved to first check using the previously refreshed
statistics?  I think that would certainly work.

If that's correct, I'll try to create a patch for the PoC.


I still don't know how to reproduce Jim's troubles, but I was able to
reproduce what was probably a very similar problem.

This problem seems more likely to occur when you have a large number of
tables, i.e., a large amount of stats data, and many small tables need
VACUUM at the same time.

So I followed Tom's advice and created a patch for the PoC.
The patch sets a flag in table_recheck_autovac() so that, when the recheck
done right after a stats refresh finds that VACUUM (or ANALYZE) has already
been done by another worker, the next recheck reuses the existing stats
instead of refreshing them again.  If a table still requires VACUUM after
the refresh, the stats are refreshed again instead of reusing the existing
statistics.

I did a simple test with HEAD and HEAD + this PoC patch.
The tests were run in two cases.
(I changed a few configuration parameters; see the attached scripts.)

1. Normal VACUUM case
 - SET autovacuum = off
 - CREATE tables with 100 rows each
 - DELETE 90 rows from each table
 - SET autovacuum = on and restart PostgreSQL
 - Measure the time it takes for all tables to be VACUUMed

2. Anti-wraparound VACUUM case
 - CREATE blank tables
 - SELECT from all of these tables (to generate stats entries)
 - SET autovacuum_freeze_max_age to a low value and restart PostgreSQL
 - Consume a lot of XIDs by using txid_current()
 - Measure the time it takes for all tables to be VACUUMed

For each test case, the following results were obtained by changing the
autovacuum_max_workers parameter to 1, 2, 3 (default), 5, and 10, and by
changing the number of tables to 1000, 5000, 10000, and 20000.

Due to the poor VM environment (2 vCPU / 4 GB), the results are a little
unstable, but I think they are enough to show the trend.

===
[1.Normal VACUUM case]
tables:1000
 autovacuum_max_workers 1:   (HEAD) 20 sec VS (with patch)  20 sec
 autovacuum_max_workers 2:   (HEAD) 18 sec VS (with patch)  16 sec
 autovacuum_max_workers 3:   (HEAD) 18 sec VS (with patch)  16 sec
 autovacuum_max_workers 5:   (HEAD) 19 sec VS (with patch)  17 sec
 autovacuum_max_workers 10:  (HEAD) 19 sec VS (with patch)  17 sec

tables:5000
 autovacuum_max_workers 1:   (HEAD) 77 sec VS (with patch)  78 sec
 autovacuum_max_workers 2:   (HEAD) 61 sec VS (with patch)  43 sec
 autovacuum_max_workers 3:   (HEAD) 38 sec VS (with patch)  38 sec
 autovacuum_max_workers 5:   (HEAD) 45 sec VS (with patch)  37 sec
 autovacuum_max_workers 10:  (HEAD) 43 sec VS (with patch)  35 sec

tables:1
 autovacuum_max_workers 1:   (HEAD) 152 sec VS (with patch)  153 sec
 autovacuum_max_workers 2:   (HEAD) 119 sec VS (with patch)   98 sec
 autovacuum_max_workers 3:   (HEAD)  87 sec VS (with patch)   78 sec
 autovacuum_max_workers 5:   (HEAD) 100 sec VS (with patch)   66 sec
 autovacuum_max_workers 10:  (HEAD)  97 sec VS (with patch)   56 sec

tables:2
 autovacuum_max_workers 1:   (HEAD) 338 sec VS (with patch)  339 sec
 autovacuum_max_workers 2:   (HEAD) 231 sec VS (with patch)  229 sec
 autovacuum_max_workers 3:   (HEAD) 220 sec VS (with patch)  191 sec
 autovacuum_max_workers 5:   (HEAD) 234 sec VS (with patch)  147 sec
 autovacuum_max_workers 10:  (HEAD) 320 sec VS (with patch)  113 sec

[2.Anti-wraparound VACUUM case]
tables:1000
 autovacuum_max_workers 1:   (HEAD) 19 sec VS (with patch) 18 sec
 autovacuum_max_workers 2:   (HEAD) 14 sec VS (with patch) 15 sec
 autovacuum_max_workers 3:   (HEAD) 14 sec VS (with patch) 14 sec
 autovacuum_max_workers 5:   (HEAD) 14 sec VS (with patch) 16 sec
 autovacuum_max_workers 10:  (HEAD) 16 sec VS (with patch) 14 sec

tables:5000
 autovacuum_max_workers 1:   (HEAD) 69 sec VS (with patch) 69 sec
 autovacuum_max_workers 2:   (HEAD) 66 sec VS (with patch) 47 sec

Re: autovac issue with large number of tables

2020-11-29 Thread Masahiko Sawada
On Sun, Nov 29, 2020 at 10:34 PM Kasahara Tatsuhito
 wrote:
>
> Hi, Thanks for you comments.
>
> On Fri, Nov 27, 2020 at 9:51 PM Fujii Masao  
> wrote:
> >
> >
> >
> > On 2020/11/27 18:38, Kasahara Tatsuhito wrote:
> > > Hi,
> > >
> > > On Fri, Nov 27, 2020 at 1:43 AM Fujii Masao  
> > > wrote:
> > >>
> > >>
> > >>
> > >> On 2020/11/26 10:41, Kasahara Tatsuhito wrote:
> > >>> On Wed, Nov 25, 2020 at 8:46 PM Masahiko Sawada  
> > >>> wrote:
> > 
> >  On Wed, Nov 25, 2020 at 4:18 PM Kasahara Tatsuhito
> >   wrote:
> > >
> > > Hi,
> > >
> > > On Wed, Nov 25, 2020 at 2:17 PM Masahiko Sawada 
> > >  wrote:
> > >>
> > >> On Fri, Sep 4, 2020 at 7:50 PM Kasahara Tatsuhito
> > >>  wrote:
> > >>>
> > >>> Hi,
> > >>>
> > >>> On Wed, Sep 2, 2020 at 2:10 AM Kasahara Tatsuhito
> > >>>  wrote:
> > > I wonder if we could have table_recheck_autovac do two probes of 
> > > the stats
> > > data.  First probe the existing stats data, and if it shows the 
> > > table to
> > > be already vacuumed, return immediately.  If not, *then* force a 
> > > stats
> > > re-read, and check a second time.
> >  Does the above mean that the second and subsequent 
> >  table_recheck_autovac()
> >  will be improved to first check using the previous refreshed 
> >  statistics?
> >  I think that certainly works.
> > 
> >  If that's correct, I'll try to create a patch for the PoC
> > >>>
> > >>> I still don't know how to reproduce Jim's troubles, but I was able 
> > >>> to reproduce
> > >>> what was probably a very similar problem.
> > >>>
> > >>> This problem seems to be more likely to occur in cases where you 
> > >>> have
> > >>> a large number of tables,
> > >>> i.e., a large amount of stats, and many small tables need VACUUM at
> > >>> the same time.
> > >>>
> > >>> So I followed Tom's advice and created a patch for the PoC.
> > >>> This patch will enable a flag in the table_recheck_autovac function 
> > >>> to use
> > >>> the existing stats next time if VACUUM (or ANALYZE) has already 
> > >>> been done
> > >>> by another worker on the check after the stats have been updated.
> > >>> If the tables continue to require VACUUM after the refresh, then a 
> > >>> refresh
> > >>> will be required instead of using the existing statistics.
> > >>>
> > >>> I did simple test with HEAD and HEAD + this PoC patch.
> > >>> The tests were conducted in two cases.
> > >>> (I changed few configurations. see attached scripts)
> > >>>
> > >>> 1. Normal VACUUM case
> > >>> - SET autovacuum = off
> > >>> - CREATE tables with 100 rows
> > >>> - DELETE 90 rows for each tables
> > >>> - SET autovacuum = on and restart PostgreSQL
> > >>> - Measure the time it takes for all tables to be VACUUMed
> > >>>
> > >>> 2. Anti wrap round VACUUM case
> > >>> - CREATE brank tables
> > >>> - SELECT all of these tables (for generate stats)
> > >>> - SET autovacuum_freeze_max_age to low values and restart 
> > >>> PostgreSQL
> > >>> - Consumes a lot of XIDs by using txid_curent()
> > >>> - Measure the time it takes for all tables to be VACUUMed
> > >>>
> > >>> For each test case, the following results were obtained by changing
> > >>> autovacuum_max_workers parameters to 1, 2, 3(def) 5 and 10.
> > >>> Also changing num of tables to 1000, 5000, 1 and 2.
> > >>>
> > >>> Due to the poor VM environment (2 VCPU/4 GB), the results are a 
> > >>> little unstable,
> > >>> but I think it's enough to ask for a trend.
> > >>>
> > >>> ===
> > >>> [1.Normal VACUUM case]
> > >>>tables:1000
> > >>> autovacuum_max_workers 1:   (HEAD) 20 sec VS (with patch)  20 
> > >>> sec
> > >>> autovacuum_max_workers 2:   (HEAD) 18 sec VS (with patch)  16 
> > >>> sec
> > >>> autovacuum_max_workers 3:   (HEAD) 18 sec VS (with patch)  16 
> > >>> sec
> > >>> autovacuum_max_workers 5:   (HEAD) 19 sec VS (with patch)  17 
> > >>> sec
> > >>> autovacuum_max_workers 10:  (HEAD) 19 sec VS (with patch)  17 
> > >>> sec
> > >>>
> > >>>tables:5000
> > >>> autovacuum_max_workers 1:   (HEAD) 77 sec VS (with patch)  78 
> > >>> sec
> > >>> autovacuum_max_workers 2:   (HEAD) 61 sec VS (with patch)  43 
> > >>> sec
> > >>> autovacuum_max_workers 3:   (HEAD) 38 sec VS (with patch)  38 
> > >>> sec
> > >>> autovacuum_max_workers 5:   (HEAD) 45 sec VS (with patch)  37 
> > >>> sec
> > >>> autovacuum_max_workers 10:  (HEAD) 43 sec VS (with patch)  35 
> > >>> sec
> > >>>
> > >>>tables:1
>

Re: autovac issue with large number of tables

2020-11-29 Thread Kasahara Tatsuhito
Hi, thanks for your comments.

On Fri, Nov 27, 2020 at 9:51 PM Fujii Masao  wrote:
>
>
>
> On 2020/11/27 18:38, Kasahara Tatsuhito wrote:
> > Hi,
> >
> > On Fri, Nov 27, 2020 at 1:43 AM Fujii Masao  
> > wrote:
> >>
> >>
> >>
> >> On 2020/11/26 10:41, Kasahara Tatsuhito wrote:
> >>> On Wed, Nov 25, 2020 at 8:46 PM Masahiko Sawada  
> >>> wrote:
> 
>  On Wed, Nov 25, 2020 at 4:18 PM Kasahara Tatsuhito
>   wrote:
> >
> > Hi,
> >
> > On Wed, Nov 25, 2020 at 2:17 PM Masahiko Sawada  
> > wrote:
> >>
> >> On Fri, Sep 4, 2020 at 7:50 PM Kasahara Tatsuhito
> >>  wrote:
> >>>
> >>> Hi,
> >>>
> >>> On Wed, Sep 2, 2020 at 2:10 AM Kasahara Tatsuhito
> >>>  wrote:
> > I wonder if we could have table_recheck_autovac do two probes of 
> > the stats
> > data.  First probe the existing stats data, and if it shows the 
> > table to
> > be already vacuumed, return immediately.  If not, *then* force a 
> > stats
> > re-read, and check a second time.
>  Does the above mean that the second and subsequent 
>  table_recheck_autovac()
>  will be improved to first check using the previous refreshed 
>  statistics?
>  I think that certainly works.
> 
>  If that's correct, I'll try to create a patch for the PoC
> >>>
> >>> I still don't know how to reproduce Jim's troubles, but I was able to 
> >>> reproduce
> >>> what was probably a very similar problem.
> >>>
> >>> This problem seems to be more likely to occur in cases where you have
> >>> a large number of tables,
> >>> i.e., a large amount of stats, and many small tables need VACUUM at
> >>> the same time.
> >>>
> >>> So I followed Tom's advice and created a patch for the PoC.
> >>> This patch will enable a flag in the table_recheck_autovac function 
> >>> to use
> >>> the existing stats next time if VACUUM (or ANALYZE) has already been 
> >>> done
> >>> by another worker on the check after the stats have been updated.
> >>> If the tables continue to require VACUUM after the refresh, then a 
> >>> refresh
> >>> will be required instead of using the existing statistics.
> >>>
> >>> I did simple test with HEAD and HEAD + this PoC patch.
> >>> The tests were conducted in two cases.
> >>> (I changed few configurations. see attached scripts)
> >>>
> >>> 1. Normal VACUUM case
> >>> - SET autovacuum = off
> >>> - CREATE tables with 100 rows
> >>> - DELETE 90 rows for each tables
> >>> - SET autovacuum = on and restart PostgreSQL
> >>> - Measure the time it takes for all tables to be VACUUMed
> >>>
> >>> 2. Anti wrap round VACUUM case
> >>> - CREATE brank tables
> >>> - SELECT all of these tables (for generate stats)
> >>> - SET autovacuum_freeze_max_age to low values and restart 
> >>> PostgreSQL
> >>> - Consumes a lot of XIDs by using txid_curent()
> >>> - Measure the time it takes for all tables to be VACUUMed
> >>>
> >>> For each test case, the following results were obtained by changing
> >>> autovacuum_max_workers parameters to 1, 2, 3(def) 5 and 10.
> >>> Also changing num of tables to 1000, 5000, 1 and 2.
> >>>
> >>> Due to the poor VM environment (2 VCPU/4 GB), the results are a 
> >>> little unstable,
> >>> but I think it's enough to ask for a trend.
> >>>
> >>> ===
> >>> [1.Normal VACUUM case]
> >>>tables:1000
> >>> autovacuum_max_workers 1:   (HEAD) 20 sec VS (with patch)  20 sec
> >>> autovacuum_max_workers 2:   (HEAD) 18 sec VS (with patch)  16 sec
> >>> autovacuum_max_workers 3:   (HEAD) 18 sec VS (with patch)  16 sec
> >>> autovacuum_max_workers 5:   (HEAD) 19 sec VS (with patch)  17 sec
> >>> autovacuum_max_workers 10:  (HEAD) 19 sec VS (with patch)  17 sec
> >>>
> >>>tables:5000
> >>> autovacuum_max_workers 1:   (HEAD) 77 sec VS (with patch)  78 sec
> >>> autovacuum_max_workers 2:   (HEAD) 61 sec VS (with patch)  43 sec
> >>> autovacuum_max_workers 3:   (HEAD) 38 sec VS (with patch)  38 sec
> >>> autovacuum_max_workers 5:   (HEAD) 45 sec VS (with patch)  37 sec
> >>> autovacuum_max_workers 10:  (HEAD) 43 sec VS (with patch)  35 sec
> >>>
> >>>tables:1
> >>> autovacuum_max_workers 1:   (HEAD) 152 sec VS (with patch)  153 
> >>> sec
> >>> autovacuum_max_workers 2:   (HEAD) 119 sec VS (with patch)   98 
> >>> sec
> >>> autovacuum_max_workers 3:   (HEAD)  87 sec VS (with patch)   78 
> >>> sec
> >>> autovacuum_max_workers 5:   (HEAD) 100 sec VS (with patch)   66 
> >>> sec
> >>> autovacuum_max_workers 10:  (HEAD)  97 sec V

Re: autovac issue with large number of tables

2020-11-27 Thread Fujii Masao




On 2020/11/27 18:38, Kasahara Tatsuhito wrote:

Hi,

On Fri, Nov 27, 2020 at 1:43 AM Fujii Masao  wrote:




On 2020/11/26 10:41, Kasahara Tatsuhito wrote:

On Wed, Nov 25, 2020 at 8:46 PM Masahiko Sawada  wrote:


On Wed, Nov 25, 2020 at 4:18 PM Kasahara Tatsuhito
 wrote:


Hi,

On Wed, Nov 25, 2020 at 2:17 PM Masahiko Sawada  wrote:


On Fri, Sep 4, 2020 at 7:50 PM Kasahara Tatsuhito
 wrote:


Hi,

On Wed, Sep 2, 2020 at 2:10 AM Kasahara Tatsuhito
 wrote:

I wonder if we could have table_recheck_autovac do two probes of the stats
data.  First probe the existing stats data, and if it shows the table to
be already vacuumed, return immediately.  If not, *then* force a stats
re-read, and check a second time.

Does the above mean that the second and subsequent table_recheck_autovac()
will be improved to first check using the previous refreshed statistics?
I think that certainly works.

If that's correct, I'll try to create a patch for the PoC


I still don't know how to reproduce Jim's troubles, but I was able to reproduce
what was probably a very similar problem.

This problem seems more likely to occur when you have a large number of
tables (i.e., a large amount of stats data) and many small tables need
VACUUM at the same time.

So I followed Tom's advice and created a patch for the PoC.
The patch adds a flag to the table_recheck_autovac function: if the check
made right after the stats have been refreshed finds that VACUUM (or ANALYZE)
has already been done by another worker, the existing stats are reused for
the next checks. If a table still requires VACUUM after such a refresh, the
stats are refreshed again instead of reusing the existing statistics.

I did a simple test with HEAD and HEAD + this PoC patch.
The tests were conducted for two cases.
(I changed a few configuration parameters; see the attached scripts.)

1. Normal VACUUM case
- SET autovacuum = off
- CREATE tables with 100 rows each
- DELETE 90 rows from each table
- SET autovacuum = on and restart PostgreSQL
- Measure the time it takes for all tables to be VACUUMed

2. Anti-wraparound VACUUM case
- CREATE blank tables
- SELECT from all of these tables (to generate stats entries)
- SET autovacuum_freeze_max_age to a low value and restart PostgreSQL
- Consume a lot of XIDs by using txid_current()
- Measure the time it takes for all tables to be VACUUMed

For each test case, the following results were obtained by changing
the autovacuum_max_workers parameter to 1, 2, 3 (default), 5 and 10.
Also changing the number of tables to 1000, 5000, 10000 and 20000.

Due to the poor VM environment (2 vCPU / 4 GB), the results are a little
unstable, but I think they are enough to show the trend.

===
[1.Normal VACUUM case]
   tables:1000
autovacuum_max_workers 1:   (HEAD) 20 sec VS (with patch)  20 sec
autovacuum_max_workers 2:   (HEAD) 18 sec VS (with patch)  16 sec
autovacuum_max_workers 3:   (HEAD) 18 sec VS (with patch)  16 sec
autovacuum_max_workers 5:   (HEAD) 19 sec VS (with patch)  17 sec
autovacuum_max_workers 10:  (HEAD) 19 sec VS (with patch)  17 sec

   tables:5000
autovacuum_max_workers 1:   (HEAD) 77 sec VS (with patch)  78 sec
autovacuum_max_workers 2:   (HEAD) 61 sec VS (with patch)  43 sec
autovacuum_max_workers 3:   (HEAD) 38 sec VS (with patch)  38 sec
autovacuum_max_workers 5:   (HEAD) 45 sec VS (with patch)  37 sec
autovacuum_max_workers 10:  (HEAD) 43 sec VS (with patch)  35 sec

   tables:10000
autovacuum_max_workers 1:   (HEAD) 152 sec VS (with patch)  153 sec
autovacuum_max_workers 2:   (HEAD) 119 sec VS (with patch)   98 sec
autovacuum_max_workers 3:   (HEAD)  87 sec VS (with patch)   78 sec
autovacuum_max_workers 5:   (HEAD) 100 sec VS (with patch)   66 sec
autovacuum_max_workers 10:  (HEAD)  97 sec VS (with patch)   56 sec

   tables:20000
autovacuum_max_workers 1:   (HEAD) 338 sec VS (with patch)  339 sec
autovacuum_max_workers 2:   (HEAD) 231 sec VS (with patch)  229 sec
autovacuum_max_workers 3:   (HEAD) 220 sec VS (with patch)  191 sec
autovacuum_max_workers 5:   (HEAD) 234 sec VS (with patch)  147 sec
autovacuum_max_workers 10:  (HEAD) 320 sec VS (with patch)  113 sec

[2.Anti-wraparound VACUUM case]
   tables:1000
autovacuum_max_workers 1:   (HEAD) 19 sec VS (with patch) 18 sec
autovacuum_max_workers 2:   (HEAD) 14 sec VS (with patch) 15 sec
autovacuum_max_workers 3:   (HEAD) 14 sec VS (with patch) 14 sec
autovacuum_max_workers 5:   (HEAD) 14 sec VS (with patch) 16 sec
autovacuum_max_workers 10:  (HEAD) 16 sec VS (with patch) 14 sec

   tables:5000
autovacuum_max_workers 1:   (HEAD) 69 sec VS (with patch) 69 sec
autovacuum_max_workers 2:   (HEAD) 66 sec VS (with patch) 47 sec
autovacuum_max_workers 3:   (HEAD) 59 sec VS (with patch) 37 sec
autovacuum_max_workers 5:   (HEAD) 39 sec VS (with patch) 28 sec
autovacuum_max_workers 10:  (HEAD) 39 sec VS (with patch) 29 sec

Re: autovac issue with large number of tables

2020-11-27 Thread Kasahara Tatsuhito
On Fri, Nov 27, 2020 at 5:22 PM Masahiko Sawada  wrote:
>
> On Fri, Nov 27, 2020 at 1:43 AM Fujii Masao  
> wrote:
> >
> >
> >
> > On 2020/11/26 10:41, Kasahara Tatsuhito wrote:
> > > On Wed, Nov 25, 2020 at 8:46 PM Masahiko Sawada  
> > > wrote:
> > >>
> > >> On Wed, Nov 25, 2020 at 4:18 PM Kasahara Tatsuhito
> > >>  wrote:
> > >>>
> > >>> Hi,
> > >>>
> > >>> On Wed, Nov 25, 2020 at 2:17 PM Masahiko Sawada  
> > >>> wrote:
> > 
> >  On Fri, Sep 4, 2020 at 7:50 PM Kasahara Tatsuhito
> >   wrote:
> > >
> > > Hi,
> > >
> > > On Wed, Sep 2, 2020 at 2:10 AM Kasahara Tatsuhito
> > >  wrote:
> > >>> I wonder if we could have table_recheck_autovac do two probes of 
> > >>> the stats
> > >>> data.  First probe the existing stats data, and if it shows the 
> > >>> table to
> > >>> be already vacuumed, return immediately.  If not, *then* force a 
> > >>> stats
> > >>> re-read, and check a second time.
> > >> Does the above mean that the second and subsequent 
> > >> table_recheck_autovac()
> > >> will be improved to first check using the previous refreshed 
> > >> statistics?
> > >> I think that certainly works.
> > >>
> > >> If that's correct, I'll try to create a patch for the PoC
> > >
> > > I still don't know how to reproduce Jim's troubles, but I was able to 
> > > reproduce
> > > what was probably a very similar problem.
> > >
> > > This problem seems to be more likely to occur in cases where you have
> > > a large number of tables,
> > > i.e., a large amount of stats, and many small tables need VACUUM at
> > > the same time.
> > >
> > > So I followed Tom's advice and created a patch for the PoC.
> > > This patch will enable a flag in the table_recheck_autovac function 
> > > to use
> > > the existing stats next time if VACUUM (or ANALYZE) has already been 
> > > done
> > > by another worker on the check after the stats have been updated.
> > > If the tables continue to require VACUUM after the refresh, then a 
> > > refresh
> > > will be required instead of using the existing statistics.
> > >
> > > I did simple test with HEAD and HEAD + this PoC patch.
> > > The tests were conducted in two cases.
> > > (I changed few configurations. see attached scripts)
> > >
> > > 1. Normal VACUUM case
> > >- SET autovacuum = off
> > >- CREATE tables with 100 rows
> > >- DELETE 90 rows for each tables
> > >- SET autovacuum = on and restart PostgreSQL
> > >- Measure the time it takes for all tables to be VACUUMed
> > >
> > > 2. Anti wrap round VACUUM case
> > >- CREATE brank tables
> > >- SELECT all of these tables (for generate stats)
> > >- SET autovacuum_freeze_max_age to low values and restart 
> > > PostgreSQL
> > >- Consumes a lot of XIDs by using txid_curent()
> > >- Measure the time it takes for all tables to be VACUUMed
> > >
> > > For each test case, the following results were obtained by changing
> > > autovacuum_max_workers parameters to 1, 2, 3(def) 5 and 10.
> > > Also changing num of tables to 1000, 5000, 1 and 2.
> > >
> > > Due to the poor VM environment (2 VCPU/4 GB), the results are a 
> > > little unstable,
> > > but I think it's enough to ask for a trend.
> > >
> > > ===
> > > [1.Normal VACUUM case]
> > >   tables:1000
> > >autovacuum_max_workers 1:   (HEAD) 20 sec VS (with patch)  20 sec
> > >autovacuum_max_workers 2:   (HEAD) 18 sec VS (with patch)  16 sec
> > >autovacuum_max_workers 3:   (HEAD) 18 sec VS (with patch)  16 sec
> > >autovacuum_max_workers 5:   (HEAD) 19 sec VS (with patch)  17 sec
> > >autovacuum_max_workers 10:  (HEAD) 19 sec VS (with patch)  17 sec
> > >
> > >   tables:5000
> > >autovacuum_max_workers 1:   (HEAD) 77 sec VS (with patch)  78 sec
> > >autovacuum_max_workers 2:   (HEAD) 61 sec VS (with patch)  43 sec
> > >autovacuum_max_workers 3:   (HEAD) 38 sec VS (with patch)  38 sec
> > >autovacuum_max_workers 5:   (HEAD) 45 sec VS (with patch)  37 sec
> > >autovacuum_max_workers 10:  (HEAD) 43 sec VS (with patch)  35 sec
> > >
> > >   tables:1
> > >autovacuum_max_workers 1:   (HEAD) 152 sec VS (with patch)  153 sec
> > >autovacuum_max_workers 2:   (HEAD) 119 sec VS (with patch)   98 sec
> > >autovacuum_max_workers 3:   (HEAD)  87 sec VS (with patch)   78 sec
> > >autovacuum_max_workers 5:   (HEAD) 100 sec VS (with patch)   66 sec
> > >autovacuum_max_workers 10:  (HEAD)  97 sec VS (with patch)   56 sec
> > >
> > >   tables:2
> > >autovacuum_max_workers 1:   (HEAD) 338 sec VS (with patch)  339 sec
> > >autovacuum_max_wor

Re: autovac issue with large number of tables

2020-11-27 Thread Kasahara Tatsuhito
Hi,

On Fri, Nov 27, 2020 at 1:43 AM Fujii Masao  wrote:
>
>
>
> On 2020/11/26 10:41, Kasahara Tatsuhito wrote:
> > On Wed, Nov 25, 2020 at 8:46 PM Masahiko Sawada  
> > wrote:
> >>
> >> On Wed, Nov 25, 2020 at 4:18 PM Kasahara Tatsuhito
> >>  wrote:
> >>>
> >>> Hi,
> >>>
> >>> On Wed, Nov 25, 2020 at 2:17 PM Masahiko Sawada  
> >>> wrote:
> 
>  On Fri, Sep 4, 2020 at 7:50 PM Kasahara Tatsuhito
>   wrote:
> >
> > Hi,
> >
> > On Wed, Sep 2, 2020 at 2:10 AM Kasahara Tatsuhito
> >  wrote:
> >>> I wonder if we could have table_recheck_autovac do two probes of the 
> >>> stats
> >>> data.  First probe the existing stats data, and if it shows the table 
> >>> to
> >>> be already vacuumed, return immediately.  If not, *then* force a stats
> >>> re-read, and check a second time.
> >> Does the above mean that the second and subsequent 
> >> table_recheck_autovac()
> >> will be improved to first check using the previous refreshed 
> >> statistics?
> >> I think that certainly works.
> >>
> >> If that's correct, I'll try to create a patch for the PoC
> >
> > I still don't know how to reproduce Jim's troubles, but I was able to 
> > reproduce
> > what was probably a very similar problem.
> >
> > This problem seems to be more likely to occur in cases where you have
> > a large number of tables,
> > i.e., a large amount of stats, and many small tables need VACUUM at
> > the same time.
> >
> > So I followed Tom's advice and created a patch for the PoC.
> > This patch will enable a flag in the table_recheck_autovac function to 
> > use
> > the existing stats next time if VACUUM (or ANALYZE) has already been 
> > done
> > by another worker on the check after the stats have been updated.
> > If the tables continue to require VACUUM after the refresh, then a 
> > refresh
> > will be required instead of using the existing statistics.
> >
> > I did simple test with HEAD and HEAD + this PoC patch.
> > The tests were conducted in two cases.
> > (I changed few configurations. see attached scripts)
> >
> > 1. Normal VACUUM case
> >- SET autovacuum = off
> >- CREATE tables with 100 rows
> >- DELETE 90 rows for each tables
> >- SET autovacuum = on and restart PostgreSQL
> >- Measure the time it takes for all tables to be VACUUMed
> >
> > 2. Anti wrap round VACUUM case
> >- CREATE brank tables
> >- SELECT all of these tables (for generate stats)
> >- SET autovacuum_freeze_max_age to low values and restart PostgreSQL
> >- Consumes a lot of XIDs by using txid_curent()
> >- Measure the time it takes for all tables to be VACUUMed
> >
> > For each test case, the following results were obtained by changing
> > autovacuum_max_workers parameters to 1, 2, 3(def) 5 and 10.
> > Also changing num of tables to 1000, 5000, 1 and 2.
> >
> > Due to the poor VM environment (2 VCPU/4 GB), the results are a little 
> > unstable,
> > but I think it's enough to ask for a trend.
> >
> > ===
> > [1.Normal VACUUM case]
> >   tables:1000
> >autovacuum_max_workers 1:   (HEAD) 20 sec VS (with patch)  20 sec
> >autovacuum_max_workers 2:   (HEAD) 18 sec VS (with patch)  16 sec
> >autovacuum_max_workers 3:   (HEAD) 18 sec VS (with patch)  16 sec
> >autovacuum_max_workers 5:   (HEAD) 19 sec VS (with patch)  17 sec
> >autovacuum_max_workers 10:  (HEAD) 19 sec VS (with patch)  17 sec
> >
> >   tables:5000
> >autovacuum_max_workers 1:   (HEAD) 77 sec VS (with patch)  78 sec
> >autovacuum_max_workers 2:   (HEAD) 61 sec VS (with patch)  43 sec
> >autovacuum_max_workers 3:   (HEAD) 38 sec VS (with patch)  38 sec
> >autovacuum_max_workers 5:   (HEAD) 45 sec VS (with patch)  37 sec
> >autovacuum_max_workers 10:  (HEAD) 43 sec VS (with patch)  35 sec
> >
> >   tables:1
> >autovacuum_max_workers 1:   (HEAD) 152 sec VS (with patch)  153 sec
> >autovacuum_max_workers 2:   (HEAD) 119 sec VS (with patch)   98 sec
> >autovacuum_max_workers 3:   (HEAD)  87 sec VS (with patch)   78 sec
> >autovacuum_max_workers 5:   (HEAD) 100 sec VS (with patch)   66 sec
> >autovacuum_max_workers 10:  (HEAD)  97 sec VS (with patch)   56 sec
> >
> >   tables:2
> >autovacuum_max_workers 1:   (HEAD) 338 sec VS (with patch)  339 sec
> >autovacuum_max_workers 2:   (HEAD) 231 sec VS (with patch)  229 sec
> >autovacuum_max_workers 3:   (HEAD) 220 sec VS (with patch)  191 sec
> >autovacuum_max_workers 5:   (HEAD) 234 sec VS (with patch)  147 sec
> >autovacuum_max_workers 10:  (HEAD) 320 sec VS (with patch)  113 sec
> >
>

Re: autovac issue with large number of tables

2020-11-27 Thread Masahiko Sawada
On Fri, Nov 27, 2020 at 1:43 AM Fujii Masao  wrote:
>
>
>
> On 2020/11/26 10:41, Kasahara Tatsuhito wrote:
> > On Wed, Nov 25, 2020 at 8:46 PM Masahiko Sawada  
> > wrote:
> >>
> >> On Wed, Nov 25, 2020 at 4:18 PM Kasahara Tatsuhito
> >>  wrote:
> >>>
> >>> Hi,
> >>>
> >>> On Wed, Nov 25, 2020 at 2:17 PM Masahiko Sawada  
> >>> wrote:
> 
>  On Fri, Sep 4, 2020 at 7:50 PM Kasahara Tatsuhito
>   wrote:
> >
> > Hi,
> >
> > On Wed, Sep 2, 2020 at 2:10 AM Kasahara Tatsuhito
> >  wrote:
> >>> I wonder if we could have table_recheck_autovac do two probes of the 
> >>> stats
> >>> data.  First probe the existing stats data, and if it shows the table 
> >>> to
> >>> be already vacuumed, return immediately.  If not, *then* force a stats
> >>> re-read, and check a second time.
> >> Does the above mean that the second and subsequent 
> >> table_recheck_autovac()
> >> will be improved to first check using the previous refreshed 
> >> statistics?
> >> I think that certainly works.
> >>
> >> If that's correct, I'll try to create a patch for the PoC
> >
> > I still don't know how to reproduce Jim's troubles, but I was able to 
> > reproduce
> > what was probably a very similar problem.
> >
> > This problem seems to be more likely to occur in cases where you have
> > a large number of tables,
> > i.e., a large amount of stats, and many small tables need VACUUM at
> > the same time.
> >
> > So I followed Tom's advice and created a patch for the PoC.
> > This patch will enable a flag in the table_recheck_autovac function to 
> > use
> > the existing stats next time if VACUUM (or ANALYZE) has already been 
> > done
> > by another worker on the check after the stats have been updated.
> > If the tables continue to require VACUUM after the refresh, then a 
> > refresh
> > will be required instead of using the existing statistics.
> >
> > I did simple test with HEAD and HEAD + this PoC patch.
> > The tests were conducted in two cases.
> > (I changed few configurations. see attached scripts)
> >
> > 1. Normal VACUUM case
> >- SET autovacuum = off
> >- CREATE tables with 100 rows
> >- DELETE 90 rows for each tables
> >- SET autovacuum = on and restart PostgreSQL
> >- Measure the time it takes for all tables to be VACUUMed
> >
> > 2. Anti wrap round VACUUM case
> >- CREATE brank tables
> >- SELECT all of these tables (for generate stats)
> >- SET autovacuum_freeze_max_age to low values and restart PostgreSQL
> >- Consumes a lot of XIDs by using txid_curent()
> >- Measure the time it takes for all tables to be VACUUMed
> >
> > For each test case, the following results were obtained by changing
> > autovacuum_max_workers parameters to 1, 2, 3(def) 5 and 10.
> > Also changing num of tables to 1000, 5000, 1 and 2.
> >
> > Due to the poor VM environment (2 VCPU/4 GB), the results are a little 
> > unstable,
> > but I think it's enough to ask for a trend.
> >
> > ===
> > [1.Normal VACUUM case]
> >   tables:1000
> >autovacuum_max_workers 1:   (HEAD) 20 sec VS (with patch)  20 sec
> >autovacuum_max_workers 2:   (HEAD) 18 sec VS (with patch)  16 sec
> >autovacuum_max_workers 3:   (HEAD) 18 sec VS (with patch)  16 sec
> >autovacuum_max_workers 5:   (HEAD) 19 sec VS (with patch)  17 sec
> >autovacuum_max_workers 10:  (HEAD) 19 sec VS (with patch)  17 sec
> >
> >   tables:5000
> >autovacuum_max_workers 1:   (HEAD) 77 sec VS (with patch)  78 sec
> >autovacuum_max_workers 2:   (HEAD) 61 sec VS (with patch)  43 sec
> >autovacuum_max_workers 3:   (HEAD) 38 sec VS (with patch)  38 sec
> >autovacuum_max_workers 5:   (HEAD) 45 sec VS (with patch)  37 sec
> >autovacuum_max_workers 10:  (HEAD) 43 sec VS (with patch)  35 sec
> >
> >   tables:1
> >autovacuum_max_workers 1:   (HEAD) 152 sec VS (with patch)  153 sec
> >autovacuum_max_workers 2:   (HEAD) 119 sec VS (with patch)   98 sec
> >autovacuum_max_workers 3:   (HEAD)  87 sec VS (with patch)   78 sec
> >autovacuum_max_workers 5:   (HEAD) 100 sec VS (with patch)   66 sec
> >autovacuum_max_workers 10:  (HEAD)  97 sec VS (with patch)   56 sec
> >
> >   tables:2
> >autovacuum_max_workers 1:   (HEAD) 338 sec VS (with patch)  339 sec
> >autovacuum_max_workers 2:   (HEAD) 231 sec VS (with patch)  229 sec
> >autovacuum_max_workers 3:   (HEAD) 220 sec VS (with patch)  191 sec
> >autovacuum_max_workers 5:   (HEAD) 234 sec VS (with patch)  147 sec
> >autovacuum_max_workers 10:  (HEAD) 320 sec VS (with patch)  113 sec
> >
> 

Re: autovac issue with large number of tables

2020-11-26 Thread Fujii Masao




On 2020/11/26 10:41, Kasahara Tatsuhito wrote:

On Wed, Nov 25, 2020 at 8:46 PM Masahiko Sawada  wrote:


On Wed, Nov 25, 2020 at 4:18 PM Kasahara Tatsuhito
 wrote:


Hi,

On Wed, Nov 25, 2020 at 2:17 PM Masahiko Sawada  wrote:


On Fri, Sep 4, 2020 at 7:50 PM Kasahara Tatsuhito
 wrote:


Hi,

On Wed, Sep 2, 2020 at 2:10 AM Kasahara Tatsuhito
 wrote:

I wonder if we could have table_recheck_autovac do two probes of the stats
data.  First probe the existing stats data, and if it shows the table to
be already vacuumed, return immediately.  If not, *then* force a stats
re-read, and check a second time.

Does the above mean that the second and subsequent table_recheck_autovac()
will be improved to first check using the previous refreshed statistics?
I think that certainly works.

If that's correct, I'll try to create a patch for the PoC


I still don't know how to reproduce Jim's troubles, but I was able to reproduce
what was probably a very similar problem.

This problem seems more likely to occur when you have a large number of
tables (i.e., a large amount of stats data) and many small tables need
VACUUM at the same time.

So I followed Tom's advice and created a patch for the PoC.
The patch adds a flag to the table_recheck_autovac function: if the check
made right after the stats have been refreshed finds that VACUUM (or ANALYZE)
has already been done by another worker, the existing stats are reused for
the next checks. If a table still requires VACUUM after such a refresh, the
stats are refreshed again instead of reusing the existing statistics.

I did a simple test with HEAD and HEAD + this PoC patch.
The tests were conducted for two cases.
(I changed a few configuration parameters; see the attached scripts.)

1. Normal VACUUM case
   - SET autovacuum = off
   - CREATE tables with 100 rows each
   - DELETE 90 rows from each table
   - SET autovacuum = on and restart PostgreSQL
   - Measure the time it takes for all tables to be VACUUMed

2. Anti-wraparound VACUUM case
   - CREATE blank tables
   - SELECT from all of these tables (to generate stats entries)
   - SET autovacuum_freeze_max_age to a low value and restart PostgreSQL
   - Consume a lot of XIDs by using txid_current()
   - Measure the time it takes for all tables to be VACUUMed

For each test case, the following results were obtained by changing
the autovacuum_max_workers parameter to 1, 2, 3 (default), 5 and 10.
Also changing the number of tables to 1000, 5000, 10000 and 20000.

Due to the poor VM environment (2 vCPU / 4 GB), the results are a little
unstable, but I think they are enough to show the trend.

===
[1.Normal VACUUM case]
  tables:1000
   autovacuum_max_workers 1:   (HEAD) 20 sec VS (with patch)  20 sec
   autovacuum_max_workers 2:   (HEAD) 18 sec VS (with patch)  16 sec
   autovacuum_max_workers 3:   (HEAD) 18 sec VS (with patch)  16 sec
   autovacuum_max_workers 5:   (HEAD) 19 sec VS (with patch)  17 sec
   autovacuum_max_workers 10:  (HEAD) 19 sec VS (with patch)  17 sec

  tables:5000
   autovacuum_max_workers 1:   (HEAD) 77 sec VS (with patch)  78 sec
   autovacuum_max_workers 2:   (HEAD) 61 sec VS (with patch)  43 sec
   autovacuum_max_workers 3:   (HEAD) 38 sec VS (with patch)  38 sec
   autovacuum_max_workers 5:   (HEAD) 45 sec VS (with patch)  37 sec
   autovacuum_max_workers 10:  (HEAD) 43 sec VS (with patch)  35 sec

  tables:10000
   autovacuum_max_workers 1:   (HEAD) 152 sec VS (with patch)  153 sec
   autovacuum_max_workers 2:   (HEAD) 119 sec VS (with patch)   98 sec
   autovacuum_max_workers 3:   (HEAD)  87 sec VS (with patch)   78 sec
   autovacuum_max_workers 5:   (HEAD) 100 sec VS (with patch)   66 sec
   autovacuum_max_workers 10:  (HEAD)  97 sec VS (with patch)   56 sec

  tables:20000
   autovacuum_max_workers 1:   (HEAD) 338 sec VS (with patch)  339 sec
   autovacuum_max_workers 2:   (HEAD) 231 sec VS (with patch)  229 sec
   autovacuum_max_workers 3:   (HEAD) 220 sec VS (with patch)  191 sec
   autovacuum_max_workers 5:   (HEAD) 234 sec VS (with patch)  147 sec
   autovacuum_max_workers 10:  (HEAD) 320 sec VS (with patch)  113 sec

[2.Anti-wraparound VACUUM case]
  tables:1000
   autovacuum_max_workers 1:   (HEAD) 19 sec VS (with patch) 18 sec
   autovacuum_max_workers 2:   (HEAD) 14 sec VS (with patch) 15 sec
   autovacuum_max_workers 3:   (HEAD) 14 sec VS (with patch) 14 sec
   autovacuum_max_workers 5:   (HEAD) 14 sec VS (with patch) 16 sec
   autovacuum_max_workers 10:  (HEAD) 16 sec VS (with patch) 14 sec

  tables:5000
   autovacuum_max_workers 1:   (HEAD) 69 sec VS (with patch) 69 sec
   autovacuum_max_workers 2:   (HEAD) 66 sec VS (with patch) 47 sec
   autovacuum_max_workers 3:   (HEAD) 59 sec VS (with patch) 37 sec
   autovacuum_max_workers 5:   (HEAD) 39 sec VS (with patch) 28 sec
   autovacuum_max_workers 10:  (HEAD) 39 sec VS (with patch) 29 sec

  tables:10000
   autovacuum_max_workers 1:   (HEAD) 139 sec VS (with patch) 138 sec
   autovacuum_max_workers 2:   (HEAD) 130 sec VS (with patch)  86 sec

Re: autovac issue with large number of tables

2020-11-25 Thread Kasahara Tatsuhito
On Wed, Nov 25, 2020 at 8:46 PM Masahiko Sawada  wrote:
>
> On Wed, Nov 25, 2020 at 4:18 PM Kasahara Tatsuhito
>  wrote:
> >
> > Hi,
> >
> > On Wed, Nov 25, 2020 at 2:17 PM Masahiko Sawada  
> > wrote:
> > >
> > > On Fri, Sep 4, 2020 at 7:50 PM Kasahara Tatsuhito
> > >  wrote:
> > > >
> > > > Hi,
> > > >
> > > > On Wed, Sep 2, 2020 at 2:10 AM Kasahara Tatsuhito
> > > >  wrote:
> > > > > > I wonder if we could have table_recheck_autovac do two probes of 
> > > > > > the stats
> > > > > > data.  First probe the existing stats data, and if it shows the 
> > > > > > table to
> > > > > > be already vacuumed, return immediately.  If not, *then* force a 
> > > > > > stats
> > > > > > re-read, and check a second time.
> > > > > Does the above mean that the second and subsequent 
> > > > > table_recheck_autovac()
> > > > > will be improved to first check using the previous refreshed 
> > > > > statistics?
> > > > > I think that certainly works.
> > > > >
> > > > > If that's correct, I'll try to create a patch for the PoC
> > > >
> > > > I still don't know how to reproduce Jim's troubles, but I was able to 
> > > > reproduce
> > > > what was probably a very similar problem.
> > > >
> > > > This problem seems to be more likely to occur in cases where you have
> > > > a large number of tables,
> > > > i.e., a large amount of stats, and many small tables need VACUUM at
> > > > the same time.
> > > >
> > > > So I followed Tom's advice and created a patch for the PoC.
> > > > This patch will enable a flag in the table_recheck_autovac function to 
> > > > use
> > > > the existing stats next time if VACUUM (or ANALYZE) has already been 
> > > > done
> > > > by another worker on the check after the stats have been updated.
> > > > If the tables continue to require VACUUM after the refresh, then a 
> > > > refresh
> > > > will be required instead of using the existing statistics.
> > > >
> > > > I did simple test with HEAD and HEAD + this PoC patch.
> > > > The tests were conducted in two cases.
> > > > (I changed few configurations. see attached scripts)
> > > >
> > > > 1. Normal VACUUM case
> > > >   - SET autovacuum = off
> > > >   - CREATE tables with 100 rows
> > > >   - DELETE 90 rows for each tables
> > > >   - SET autovacuum = on and restart PostgreSQL
> > > >   - Measure the time it takes for all tables to be VACUUMed
> > > >
> > > > 2. Anti wrap round VACUUM case
> > > >   - CREATE brank tables
> > > >   - SELECT all of these tables (for generate stats)
> > > >   - SET autovacuum_freeze_max_age to low values and restart PostgreSQL
> > > >   - Consumes a lot of XIDs by using txid_curent()
> > > >   - Measure the time it takes for all tables to be VACUUMed
> > > >
> > > > For each test case, the following results were obtained by changing
> > > > autovacuum_max_workers parameters to 1, 2, 3(def) 5 and 10.
> > > > Also changing num of tables to 1000, 5000, 1 and 2.
> > > >
> > > > Due to the poor VM environment (2 VCPU/4 GB), the results are a little 
> > > > unstable,
> > > > but I think it's enough to ask for a trend.
> > > >
> > > > ===
> > > > [1.Normal VACUUM case]
> > > >  tables:1000
> > > >   autovacuum_max_workers 1:   (HEAD) 20 sec VS (with patch)  20 sec
> > > >   autovacuum_max_workers 2:   (HEAD) 18 sec VS (with patch)  16 sec
> > > >   autovacuum_max_workers 3:   (HEAD) 18 sec VS (with patch)  16 sec
> > > >   autovacuum_max_workers 5:   (HEAD) 19 sec VS (with patch)  17 sec
> > > >   autovacuum_max_workers 10:  (HEAD) 19 sec VS (with patch)  17 sec
> > > >
> > > >  tables:5000
> > > >   autovacuum_max_workers 1:   (HEAD) 77 sec VS (with patch)  78 sec
> > > >   autovacuum_max_workers 2:   (HEAD) 61 sec VS (with patch)  43 sec
> > > >   autovacuum_max_workers 3:   (HEAD) 38 sec VS (with patch)  38 sec
> > > >   autovacuum_max_workers 5:   (HEAD) 45 sec VS (with patch)  37 sec
> > > >   autovacuum_max_workers 10:  (HEAD) 43 sec VS (with patch)  35 sec
> > > >
> > > >  tables:1
> > > >   autovacuum_max_workers 1:   (HEAD) 152 sec VS (with patch)  153 sec
> > > >   autovacuum_max_workers 2:   (HEAD) 119 sec VS (with patch)   98 sec
> > > >   autovacuum_max_workers 3:   (HEAD)  87 sec VS (with patch)   78 sec
> > > >   autovacuum_max_workers 5:   (HEAD) 100 sec VS (with patch)   66 sec
> > > >   autovacuum_max_workers 10:  (HEAD)  97 sec VS (with patch)   56 sec
> > > >
> > > >  tables:2
> > > >   autovacuum_max_workers 1:   (HEAD) 338 sec VS (with patch)  339 sec
> > > >   autovacuum_max_workers 2:   (HEAD) 231 sec VS (with patch)  229 sec
> > > >   autovacuum_max_workers 3:   (HEAD) 220 sec VS (with patch)  191 sec
> > > >   autovacuum_max_workers 5:   (HEAD) 234 sec VS (with patch)  147 sec
> > > >   autovacuum_max_workers 10:  (HEAD) 320 sec VS (with patch)  113 sec
> > > >
> > > > [2.Anti wrap round VACUUM case]
> > > >  tables:1000
> > > >   autovacuum_max_workers 1:   (HEAD) 19 sec VS (with patch) 18 sec
> > > >   

Re: autovac issue with large number of tables

2020-11-25 Thread Masahiko Sawada
On Wed, Nov 25, 2020 at 4:18 PM Kasahara Tatsuhito
 wrote:
>
> Hi,
>
> On Wed, Nov 25, 2020 at 2:17 PM Masahiko Sawada  wrote:
> >
> > On Fri, Sep 4, 2020 at 7:50 PM Kasahara Tatsuhito
> >  wrote:
> > >
> > > Hi,
> > >
> > > On Wed, Sep 2, 2020 at 2:10 AM Kasahara Tatsuhito
> > >  wrote:
> > > > > I wonder if we could have table_recheck_autovac do two probes of the 
> > > > > stats
> > > > > data.  First probe the existing stats data, and if it shows the table 
> > > > > to
> > > > > be already vacuumed, return immediately.  If not, *then* force a stats
> > > > > re-read, and check a second time.
> > > > Does the above mean that the second and subsequent 
> > > > table_recheck_autovac()
> > > > will be improved to first check using the previous refreshed statistics?
> > > > I think that certainly works.
> > > >
> > > > If that's correct, I'll try to create a patch for the PoC
> > >
> > > I still don't know how to reproduce Jim's troubles, but I was able to 
> > > reproduce
> > > what was probably a very similar problem.
> > >
> > > This problem seems to be more likely to occur in cases where you have
> > > a large number of tables,
> > > i.e., a large amount of stats, and many small tables need VACUUM at
> > > the same time.
> > >
> > > So I followed Tom's advice and created a patch for the PoC.
> > > This patch will enable a flag in the table_recheck_autovac function to use
> > > the existing stats next time if VACUUM (or ANALYZE) has already been done
> > > by another worker on the check after the stats have been updated.
> > > If the tables continue to require VACUUM after the refresh, then a refresh
> > > will be required instead of using the existing statistics.
> > >
> > > I did simple test with HEAD and HEAD + this PoC patch.
> > > The tests were conducted in two cases.
> > > (I changed few configurations. see attached scripts)
> > >
> > > 1. Normal VACUUM case
> > >   - SET autovacuum = off
> > >   - CREATE tables with 100 rows
> > >   - DELETE 90 rows for each tables
> > >   - SET autovacuum = on and restart PostgreSQL
> > >   - Measure the time it takes for all tables to be VACUUMed
> > >
> > > 2. Anti wrap round VACUUM case
> > >   - CREATE brank tables
> > >   - SELECT all of these tables (for generate stats)
> > >   - SET autovacuum_freeze_max_age to low values and restart PostgreSQL
> > >   - Consumes a lot of XIDs by using txid_curent()
> > >   - Measure the time it takes for all tables to be VACUUMed
> > >
> > > For each test case, the following results were obtained by changing
> > > autovacuum_max_workers parameters to 1, 2, 3(def) 5 and 10.
> > > Also changing num of tables to 1000, 5000, 1 and 2.
> > >
> > > Due to the poor VM environment (2 VCPU/4 GB), the results are a little 
> > > unstable,
> > > but I think it's enough to ask for a trend.
> > >
> > > ===
> > > [1.Normal VACUUM case]
> > >  tables:1000
> > >   autovacuum_max_workers 1:   (HEAD) 20 sec VS (with patch)  20 sec
> > >   autovacuum_max_workers 2:   (HEAD) 18 sec VS (with patch)  16 sec
> > >   autovacuum_max_workers 3:   (HEAD) 18 sec VS (with patch)  16 sec
> > >   autovacuum_max_workers 5:   (HEAD) 19 sec VS (with patch)  17 sec
> > >   autovacuum_max_workers 10:  (HEAD) 19 sec VS (with patch)  17 sec
> > >
> > >  tables:5000
> > >   autovacuum_max_workers 1:   (HEAD) 77 sec VS (with patch)  78 sec
> > >   autovacuum_max_workers 2:   (HEAD) 61 sec VS (with patch)  43 sec
> > >   autovacuum_max_workers 3:   (HEAD) 38 sec VS (with patch)  38 sec
> > >   autovacuum_max_workers 5:   (HEAD) 45 sec VS (with patch)  37 sec
> > >   autovacuum_max_workers 10:  (HEAD) 43 sec VS (with patch)  35 sec
> > >
> > >  tables:1
> > >   autovacuum_max_workers 1:   (HEAD) 152 sec VS (with patch)  153 sec
> > >   autovacuum_max_workers 2:   (HEAD) 119 sec VS (with patch)   98 sec
> > >   autovacuum_max_workers 3:   (HEAD)  87 sec VS (with patch)   78 sec
> > >   autovacuum_max_workers 5:   (HEAD) 100 sec VS (with patch)   66 sec
> > >   autovacuum_max_workers 10:  (HEAD)  97 sec VS (with patch)   56 sec
> > >
> > >  tables:2
> > >   autovacuum_max_workers 1:   (HEAD) 338 sec VS (with patch)  339 sec
> > >   autovacuum_max_workers 2:   (HEAD) 231 sec VS (with patch)  229 sec
> > >   autovacuum_max_workers 3:   (HEAD) 220 sec VS (with patch)  191 sec
> > >   autovacuum_max_workers 5:   (HEAD) 234 sec VS (with patch)  147 sec
> > >   autovacuum_max_workers 10:  (HEAD) 320 sec VS (with patch)  113 sec
> > >
> > > [2.Anti wrap round VACUUM case]
> > >  tables:1000
> > >   autovacuum_max_workers 1:   (HEAD) 19 sec VS (with patch) 18 sec
> > >   autovacuum_max_workers 2:   (HEAD) 14 sec VS (with patch) 15 sec
> > >   autovacuum_max_workers 3:   (HEAD) 14 sec VS (with patch) 14 sec
> > >   autovacuum_max_workers 5:   (HEAD) 14 sec VS (with patch) 16 sec
> > >   autovacuum_max_workers 10:  (HEAD) 16 sec VS (with patch) 14 sec
> > >
> > >  tables:5000
> > >   aut

Re: autovac issue with large number of tables

2020-11-24 Thread Kasahara Tatsuhito
Hi,

On Wed, Nov 25, 2020 at 2:17 PM Masahiko Sawada  wrote:
>
> On Fri, Sep 4, 2020 at 7:50 PM Kasahara Tatsuhito
>  wrote:
> >
> > Hi,
> >
> > On Wed, Sep 2, 2020 at 2:10 AM Kasahara Tatsuhito
> >  wrote:
> > > > I wonder if we could have table_recheck_autovac do two probes of the 
> > > > stats
> > > > data.  First probe the existing stats data, and if it shows the table to
> > > > be already vacuumed, return immediately.  If not, *then* force a stats
> > > > re-read, and check a second time.
> > > Does the above mean that the second and subsequent table_recheck_autovac()
> > > will be improved to first check using the previous refreshed statistics?
> > > I think that certainly works.
> > >
> > > If that's correct, I'll try to create a patch for the PoC
> >
> > I still don't know how to reproduce Jim's troubles, but I was able to 
> > reproduce
> > what was probably a very similar problem.
> >
> > This problem seems to be more likely to occur in cases where you have
> > a large number of tables,
> > i.e., a large amount of stats, and many small tables need VACUUM at
> > the same time.
> >
> > So I followed Tom's advice and created a patch for the PoC.
> > This patch will enable a flag in the table_recheck_autovac function to use
> > the existing stats next time if VACUUM (or ANALYZE) has already been done
> > by another worker on the check after the stats have been updated.
> > If the tables continue to require VACUUM after the refresh, then a refresh
> > will be required instead of using the existing statistics.
> >
> > I did simple test with HEAD and HEAD + this PoC patch.
> > The tests were conducted in two cases.
> > (I changed few configurations. see attached scripts)
> >
> > 1. Normal VACUUM case
> >   - SET autovacuum = off
> >   - CREATE tables with 100 rows
> >   - DELETE 90 rows for each tables
> >   - SET autovacuum = on and restart PostgreSQL
> >   - Measure the time it takes for all tables to be VACUUMed
> >
> > 2. Anti wrap round VACUUM case
> >   - CREATE brank tables
> >   - SELECT all of these tables (for generate stats)
> >   - SET autovacuum_freeze_max_age to low values and restart PostgreSQL
> >   - Consumes a lot of XIDs by using txid_curent()
> >   - Measure the time it takes for all tables to be VACUUMed
> >
> > For each test case, the following results were obtained by changing
> > autovacuum_max_workers parameters to 1, 2, 3(def) 5 and 10.
> > Also changing num of tables to 1000, 5000, 1 and 2.
> >
> > Due to the poor VM environment (2 VCPU/4 GB), the results are a little 
> > unstable,
> > but I think it's enough to ask for a trend.
> >
> > ===
> > [1.Normal VACUUM case]
> >  tables:1000
> >   autovacuum_max_workers 1:   (HEAD) 20 sec VS (with patch)  20 sec
> >   autovacuum_max_workers 2:   (HEAD) 18 sec VS (with patch)  16 sec
> >   autovacuum_max_workers 3:   (HEAD) 18 sec VS (with patch)  16 sec
> >   autovacuum_max_workers 5:   (HEAD) 19 sec VS (with patch)  17 sec
> >   autovacuum_max_workers 10:  (HEAD) 19 sec VS (with patch)  17 sec
> >
> >  tables:5000
> >   autovacuum_max_workers 1:   (HEAD) 77 sec VS (with patch)  78 sec
> >   autovacuum_max_workers 2:   (HEAD) 61 sec VS (with patch)  43 sec
> >   autovacuum_max_workers 3:   (HEAD) 38 sec VS (with patch)  38 sec
> >   autovacuum_max_workers 5:   (HEAD) 45 sec VS (with patch)  37 sec
> >   autovacuum_max_workers 10:  (HEAD) 43 sec VS (with patch)  35 sec
> >
> >  tables:1
> >   autovacuum_max_workers 1:   (HEAD) 152 sec VS (with patch)  153 sec
> >   autovacuum_max_workers 2:   (HEAD) 119 sec VS (with patch)   98 sec
> >   autovacuum_max_workers 3:   (HEAD)  87 sec VS (with patch)   78 sec
> >   autovacuum_max_workers 5:   (HEAD) 100 sec VS (with patch)   66 sec
> >   autovacuum_max_workers 10:  (HEAD)  97 sec VS (with patch)   56 sec
> >
> >  tables:2
> >   autovacuum_max_workers 1:   (HEAD) 338 sec VS (with patch)  339 sec
> >   autovacuum_max_workers 2:   (HEAD) 231 sec VS (with patch)  229 sec
> >   autovacuum_max_workers 3:   (HEAD) 220 sec VS (with patch)  191 sec
> >   autovacuum_max_workers 5:   (HEAD) 234 sec VS (with patch)  147 sec
> >   autovacuum_max_workers 10:  (HEAD) 320 sec VS (with patch)  113 sec
> >
> > [2.Anti wrap round VACUUM case]
> >  tables:1000
> >   autovacuum_max_workers 1:   (HEAD) 19 sec VS (with patch) 18 sec
> >   autovacuum_max_workers 2:   (HEAD) 14 sec VS (with patch) 15 sec
> >   autovacuum_max_workers 3:   (HEAD) 14 sec VS (with patch) 14 sec
> >   autovacuum_max_workers 5:   (HEAD) 14 sec VS (with patch) 16 sec
> >   autovacuum_max_workers 10:  (HEAD) 16 sec VS (with patch) 14 sec
> >
> >  tables:5000
> >   autovacuum_max_workers 1:   (HEAD) 69 sec VS (with patch) 69 sec
> >   autovacuum_max_workers 2:   (HEAD) 66 sec VS (with patch) 47 sec
> >   autovacuum_max_workers 3:   (HEAD) 59 sec VS (with patch) 37 sec
> >   autovacuum_max_workers 5:   (HEAD) 39 sec VS (with patch) 28 sec
> >   autovac

Re: autovac issue with large number of tables

2020-11-24 Thread Masahiko Sawada
On Fri, Sep 4, 2020 at 7:50 PM Kasahara Tatsuhito
 wrote:
>
> Hi,
>
> On Wed, Sep 2, 2020 at 2:10 AM Kasahara Tatsuhito
>  wrote:
> > > I wonder if we could have table_recheck_autovac do two probes of the stats
> > > data.  First probe the existing stats data, and if it shows the table to
> > > be already vacuumed, return immediately.  If not, *then* force a stats
> > > re-read, and check a second time.
> > Does the above mean that the second and subsequent table_recheck_autovac()
> > will be improved to first check using the previous refreshed statistics?
> > I think that certainly works.
> >
> > If that's correct, I'll try to create a patch for the PoC
>
> I still don't know how to reproduce Jim's troubles, but I was able to 
> reproduce
> what was probably a very similar problem.
>
> This problem seems to be more likely to occur in cases where you have
> a large number of tables,
> i.e., a large amount of stats, and many small tables need VACUUM at
> the same time.
>
> So I followed Tom's advice and created a patch for the PoC.
> This patch will enable a flag in the table_recheck_autovac function to use
> the existing stats next time if VACUUM (or ANALYZE) has already been done
> by another worker on the check after the stats have been updated.
> If the tables continue to require VACUUM after the refresh, then a refresh
> will be required instead of using the existing statistics.
>
> I did simple test with HEAD and HEAD + this PoC patch.
> The tests were conducted in two cases.
> (I changed few configurations. see attached scripts)
>
> 1. Normal VACUUM case
>   - SET autovacuum = off
>   - CREATE tables with 100 rows
>   - DELETE 90 rows for each tables
>   - SET autovacuum = on and restart PostgreSQL
>   - Measure the time it takes for all tables to be VACUUMed
>
> 2. Anti wrap round VACUUM case
>   - CREATE brank tables
>   - SELECT all of these tables (for generate stats)
>   - SET autovacuum_freeze_max_age to low values and restart PostgreSQL
>   - Consumes a lot of XIDs by using txid_curent()
>   - Measure the time it takes for all tables to be VACUUMed
>
> For each test case, the following results were obtained by changing
> autovacuum_max_workers parameters to 1, 2, 3(def) 5 and 10.
> Also changing num of tables to 1000, 5000, 1 and 2.
>
> Due to the poor VM environment (2 VCPU/4 GB), the results are a little 
> unstable,
> but I think it's enough to ask for a trend.
>
> ===
> [1.Normal VACUUM case]
>  tables:1000
>   autovacuum_max_workers 1:   (HEAD) 20 sec VS (with patch)  20 sec
>   autovacuum_max_workers 2:   (HEAD) 18 sec VS (with patch)  16 sec
>   autovacuum_max_workers 3:   (HEAD) 18 sec VS (with patch)  16 sec
>   autovacuum_max_workers 5:   (HEAD) 19 sec VS (with patch)  17 sec
>   autovacuum_max_workers 10:  (HEAD) 19 sec VS (with patch)  17 sec
>
>  tables:5000
>   autovacuum_max_workers 1:   (HEAD) 77 sec VS (with patch)  78 sec
>   autovacuum_max_workers 2:   (HEAD) 61 sec VS (with patch)  43 sec
>   autovacuum_max_workers 3:   (HEAD) 38 sec VS (with patch)  38 sec
>   autovacuum_max_workers 5:   (HEAD) 45 sec VS (with patch)  37 sec
>   autovacuum_max_workers 10:  (HEAD) 43 sec VS (with patch)  35 sec
>
>  tables:1
>   autovacuum_max_workers 1:   (HEAD) 152 sec VS (with patch)  153 sec
>   autovacuum_max_workers 2:   (HEAD) 119 sec VS (with patch)   98 sec
>   autovacuum_max_workers 3:   (HEAD)  87 sec VS (with patch)   78 sec
>   autovacuum_max_workers 5:   (HEAD) 100 sec VS (with patch)   66 sec
>   autovacuum_max_workers 10:  (HEAD)  97 sec VS (with patch)   56 sec
>
>  tables:2
>   autovacuum_max_workers 1:   (HEAD) 338 sec VS (with patch)  339 sec
>   autovacuum_max_workers 2:   (HEAD) 231 sec VS (with patch)  229 sec
>   autovacuum_max_workers 3:   (HEAD) 220 sec VS (with patch)  191 sec
>   autovacuum_max_workers 5:   (HEAD) 234 sec VS (with patch)  147 sec
>   autovacuum_max_workers 10:  (HEAD) 320 sec VS (with patch)  113 sec
>
> [2.Anti wrap round VACUUM case]
>  tables:1000
>   autovacuum_max_workers 1:   (HEAD) 19 sec VS (with patch) 18 sec
>   autovacuum_max_workers 2:   (HEAD) 14 sec VS (with patch) 15 sec
>   autovacuum_max_workers 3:   (HEAD) 14 sec VS (with patch) 14 sec
>   autovacuum_max_workers 5:   (HEAD) 14 sec VS (with patch) 16 sec
>   autovacuum_max_workers 10:  (HEAD) 16 sec VS (with patch) 14 sec
>
>  tables:5000
>   autovacuum_max_workers 1:   (HEAD) 69 sec VS (with patch) 69 sec
>   autovacuum_max_workers 2:   (HEAD) 66 sec VS (with patch) 47 sec
>   autovacuum_max_workers 3:   (HEAD) 59 sec VS (with patch) 37 sec
>   autovacuum_max_workers 5:   (HEAD) 39 sec VS (with patch) 28 sec
>   autovacuum_max_workers 10:  (HEAD) 39 sec VS (with patch) 29 sec
>
>  tables:1
>   autovacuum_max_workers 1:   (HEAD) 139 sec VS (with patch) 138 sec
>   autovacuum_max_workers 2:   (HEAD) 130 sec VS (with patch)  86 sec
>   autovacuum_max_workers 3:   (HEAD) 120 sec VS (with patc

Re: autovac issue with large number of tables

2020-09-10 Thread Kasahara Tatsuhito
Therefore, we expect this patch [1] to be committed for its original
purpose, as well as to improve autovacuum from v14 onwards.

Hi,

On Fri, Sep 4, 2020 at 7:50 PM Kasahara Tatsuhito
 wrote:
>
> Hi,
>
> On Wed, Sep 2, 2020 at 2:10 AM Kasahara Tatsuhito
>  wrote:
> > > I wonder if we could have table_recheck_autovac do two probes of the stats
> > > data.  First probe the existing stats data, and if it shows the table to
> > > be already vacuumed, return immediately.  If not, *then* force a stats
> > > re-read, and check a second time.
> > Does the above mean that the second and subsequent table_recheck_autovac()
> > will be improved to first check using the previous refreshed statistics?
> > I think that certainly works.
> >
> > If that's correct, I'll try to create a patch for the PoC
>
> I still don't know how to reproduce Jim's troubles, but I was able to 
> reproduce
> what was probably a very similar problem.
>
> This problem seems to be more likely to occur in cases where you have
> a large number of tables,
> i.e., a large amount of stats, and many small tables need VACUUM at
> the same time.
>
> So I followed Tom's advice and created a patch for the PoC.
> This patch will enable a flag in the table_recheck_autovac function to use
> the existing stats next time if VACUUM (or ANALYZE) has already been done
> by another worker on the check after the stats have been updated.
> If the tables continue to require VACUUM after the refresh, then a refresh
> will be required instead of using the existing statistics.
>
> I did simple test with HEAD and HEAD + this PoC patch.
> The tests were conducted in two cases.
> (I changed few configurations. see attached scripts)
>
> 1. Normal VACUUM case
>   - SET autovacuum = off
>   - CREATE tables with 100 rows
>   - DELETE 90 rows for each tables
>   - SET autovacuum = on and restart PostgreSQL
>   - Measure the time it takes for all tables to be VACUUMed
>
> 2. Anti wrap round VACUUM case
>   - CREATE brank tables
>   - SELECT all of these tables (for generate stats)
>   - SET autovacuum_freeze_max_age to low values and restart PostgreSQL
>   - Consumes a lot of XIDs by using txid_curent()
>   - Measure the time it takes for all tables to be VACUUMed
>
> For each test case, the following results were obtained by changing
> autovacuum_max_workers parameters to 1, 2, 3(def) 5 and 10.
> Also changing num of tables to 1000, 5000, 1 and 2.
>
> Due to the poor VM environment (2 VCPU/4 GB), the results are a little 
> unstable,
> but I think it's enough to ask for a trend.
>
> ===
> [1.Normal VACUUM case]
>  tables:1000
>   autovacuum_max_workers 1:   (HEAD) 20 sec VS (with patch)  20 sec
>   autovacuum_max_workers 2:   (HEAD) 18 sec VS (with patch)  16 sec
>   autovacuum_max_workers 3:   (HEAD) 18 sec VS (with patch)  16 sec
>   autovacuum_max_workers 5:   (HEAD) 19 sec VS (with patch)  17 sec
>   autovacuum_max_workers 10:  (HEAD) 19 sec VS (with patch)  17 sec
>
>  tables:5000
>   autovacuum_max_workers 1:   (HEAD) 77 sec VS (with patch)  78 sec
>   autovacuum_max_workers 2:   (HEAD) 61 sec VS (with patch)  43 sec
>   autovacuum_max_workers 3:   (HEAD) 38 sec VS (with patch)  38 sec
>   autovacuum_max_workers 5:   (HEAD) 45 sec VS (with patch)  37 sec
>   autovacuum_max_workers 10:  (HEAD) 43 sec VS (with patch)  35 sec
>
>  tables:1
>   autovacuum_max_workers 1:   (HEAD) 152 sec VS (with patch)  153 sec
>   autovacuum_max_workers 2:   (HEAD) 119 sec VS (with patch)   98 sec
>   autovacuum_max_workers 3:   (HEAD)  87 sec VS (with patch)   78 sec
>   autovacuum_max_workers 5:   (HEAD) 100 sec VS (with patch)   66 sec
>   autovacuum_max_workers 10:  (HEAD)  97 sec VS (with patch)   56 sec
>
>  tables:2
>   autovacuum_max_workers 1:   (HEAD) 338 sec VS (with patch)  339 sec
>   autovacuum_max_workers 2:   (HEAD) 231 sec VS (with patch)  229 sec
>   autovacuum_max_workers 3:   (HEAD) 220 sec VS (with patch)  191 sec
>   autovacuum_max_workers 5:   (HEAD) 234 sec VS (with patch)  147 sec
>   autovacuum_max_workers 10:  (HEAD) 320 sec VS (with patch)  113 sec
>
> [2.Anti-wraparound VACUUM case]
>  tables:1000
>   autovacuum_max_workers 1:   (HEAD) 19 sec VS (with patch) 18 sec
>   autovacuum_max_workers 2:   (HEAD) 14 sec VS (with patch) 15 sec
>   autovacuum_max_workers 3:   (HEAD) 14 sec VS (with patch) 14 sec
>   autovacuum_max_workers 5:   (HEAD) 14 sec VS (with patch) 16 sec
>   autovacuum_max_workers 10:  (HEAD) 16 sec VS (with patch) 14 sec
>
>  tables:5000
>   autovacuum_max_workers 1:   (HEAD) 69 sec VS (with patch) 69 sec
>   autovacuum_max_workers 2:   (HEAD) 66 sec VS (with patch) 47 sec
>   autovacuum_max_workers 3:   (HEAD) 59 sec VS (with patch) 37 sec
>   autovacuum_max_workers 5:   (HEAD) 39 sec VS (with patch) 28 sec
>   autovacuum_max_workers 10:  (HEAD) 39 sec VS (with patch) 29 sec
>
>  tables:10000
>   autovacuum_max_workers 1:   (HEAD) 139 sec VS (with patch) 138 sec

Re: autovac issue with large number of tables

2020-09-04 Thread Kasahara Tatsuhito
Hi,

On Wed, Sep 2, 2020 at 2:10 AM Kasahara Tatsuhito
 wrote:
> > I wonder if we could have table_recheck_autovac do two probes of the stats
> > data.  First probe the existing stats data, and if it shows the table to
> > be already vacuumed, return immediately.  If not, *then* force a stats
> > re-read, and check a second time.
> Does the above mean that the second and subsequent table_recheck_autovac()
> will be improved to first check using the previously refreshed statistics?
> I think that certainly works.
>
> If that's correct, I'll try to create a patch for the PoC

I still don't know how to reproduce Jim's troubles, but I was able to reproduce
what was probably a very similar problem.

This problem seems to be more likely to occur in cases where you have
a large number of tables,
i.e., a large amount of stats, and many small tables need VACUUM at
the same time.

So I followed Tom's advice and created a patch for the PoC.
This patch adds a flag to the table_recheck_autovac function: if a check made
just after refreshing the stats finds that VACUUM (or ANALYZE) has already
been done by another worker, the next check uses the existing stats instead
of refreshing them again. If a table still requires VACUUM after a refresh,
the flag is reset, so subsequent checks refresh the stats again rather than
using the existing ones.
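
A rough sketch of the control flow (this is only an illustration, not the
PoC patch itself; recheck_needs_vacuum() is a hypothetical stand-in for the
usual relation_needs_vacanalyze() threshold checks):

/* Illustrative sketch only, not the actual patch. */
static bool use_existing_stats = false;	/* was the last stats refresh wasted? */

static bool recheck_needs_vacuum(PgStat_StatTabEntry *tabentry);	/* hypothetical helper */

static bool
recheck_table_for_autovacuum(Oid relid)
{
	PgStat_StatTabEntry *tabentry;

	if (use_existing_stats)
	{
		/* First probe: trust the stats snapshot we already have. */
		tabentry = pgstat_fetch_stat_tabentry(relid);
		if (tabentry == NULL || !recheck_needs_vacuum(tabentry))
			return false;		/* another worker already handled it */
	}

	/* Second probe: force a re-read of the stats data and check again. */
	autovac_refresh_stats();
	tabentry = pgstat_fetch_stat_tabentry(relid);

	if (tabentry != NULL && recheck_needs_vacuum(tabentry))
	{
		use_existing_stats = false;	/* refresh was useful; keep refreshing */
		return true;
	}

	use_existing_stats = true;	/* refresh was wasted; reuse stats next time */
	return false;
}

With many small tables that other workers have just finished, most rechecks
then return on the first probe and never pay for a stats file reload.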

I did a simple test with HEAD and HEAD + this PoC patch.
The tests were conducted for two cases.
(I changed a few configurations; see the attached scripts.)

1. Normal VACUUM case
  - SET autovacuum = off
  - CREATE tables with 100 rows
  - DELETE 90 rows from each table
  - SET autovacuum = on and restart PostgreSQL
  - Measure the time it takes for all tables to be VACUUMed

2. Anti-wraparound VACUUM case
  - CREATE blank tables
  - SELECT all of these tables (to generate stats)
  - SET autovacuum_freeze_max_age to a low value and restart PostgreSQL
  - Consume a lot of XIDs using txid_current()
  - Measure the time it takes for all tables to be VACUUMed

For each test case, the following results were obtained by changing the
autovacuum_max_workers parameter to 1, 2, 3 (default), 5 and 10.
The number of tables was also varied: 1000, 5000, 10000 and 20000.

Due to the poor VM environment (2 vCPU / 4 GB), the results are a little
unstable, but I think they are enough to show a trend.

===
[1.Normal VACUUM case]
 tables:1000
  autovacuum_max_workers 1:   (HEAD) 20 sec VS (with patch)  20 sec
  autovacuum_max_workers 2:   (HEAD) 18 sec VS (with patch)  16 sec
  autovacuum_max_workers 3:   (HEAD) 18 sec VS (with patch)  16 sec
  autovacuum_max_workers 5:   (HEAD) 19 sec VS (with patch)  17 sec
  autovacuum_max_workers 10:  (HEAD) 19 sec VS (with patch)  17 sec

 tables:5000
  autovacuum_max_workers 1:   (HEAD) 77 sec VS (with patch)  78 sec
  autovacuum_max_workers 2:   (HEAD) 61 sec VS (with patch)  43 sec
  autovacuum_max_workers 3:   (HEAD) 38 sec VS (with patch)  38 sec
  autovacuum_max_workers 5:   (HEAD) 45 sec VS (with patch)  37 sec
  autovacuum_max_workers 10:  (HEAD) 43 sec VS (with patch)  35 sec

 tables:10000
  autovacuum_max_workers 1:   (HEAD) 152 sec VS (with patch)  153 sec
  autovacuum_max_workers 2:   (HEAD) 119 sec VS (with patch)   98 sec
  autovacuum_max_workers 3:   (HEAD)  87 sec VS (with patch)   78 sec
  autovacuum_max_workers 5:   (HEAD) 100 sec VS (with patch)   66 sec
  autovacuum_max_workers 10:  (HEAD)  97 sec VS (with patch)   56 sec

 tables:20000
  autovacuum_max_workers 1:   (HEAD) 338 sec VS (with patch)  339 sec
  autovacuum_max_workers 2:   (HEAD) 231 sec VS (with patch)  229 sec
  autovacuum_max_workers 3:   (HEAD) 220 sec VS (with patch)  191 sec
  autovacuum_max_workers 5:   (HEAD) 234 sec VS (with patch)  147 sec
  autovacuum_max_workers 10:  (HEAD) 320 sec VS (with patch)  113 sec

[2.Anti-wraparound VACUUM case]
 tables:1000
  autovacuum_max_workers 1:   (HEAD) 19 sec VS (with patch) 18 sec
  autovacuum_max_workers 2:   (HEAD) 14 sec VS (with patch) 15 sec
  autovacuum_max_workers 3:   (HEAD) 14 sec VS (with patch) 14 sec
  autovacuum_max_workers 5:   (HEAD) 14 sec VS (with patch) 16 sec
  autovacuum_max_workers 10:  (HEAD) 16 sec VS (with patch) 14 sec

 tables:5000
  autovacuum_max_workers 1:   (HEAD) 69 sec VS (with patch) 69 sec
  autovacuum_max_workers 2:   (HEAD) 66 sec VS (with patch) 47 sec
  autovacuum_max_workers 3:   (HEAD) 59 sec VS (with patch) 37 sec
  autovacuum_max_workers 5:   (HEAD) 39 sec VS (with patch) 28 sec
  autovacuum_max_workers 10:  (HEAD) 39 sec VS (with patch) 29 sec

 tables:10000
  autovacuum_max_workers 1:   (HEAD) 139 sec VS (with patch) 138 sec
  autovacuum_max_workers 2:   (HEAD) 130 sec VS (with patch)  86 sec
  autovacuum_max_workers 3:   (HEAD) 120 sec VS (with patch)  68 sec
  autovacuum_max_workers 5:   (HEAD)  96 sec VS (with patch)  41 sec
  autovacuum_max_workers 10:  (HEAD)  90 sec VS (with patch)  39 sec

 tables:20000
  autovacuum_max_workers 1:   (HEAD) 313 sec VS (with patch) 331 sec
  autovacuum_max_worker

Re: autovac issue with large number of tables

2020-09-01 Thread Kasahara Tatsuhito
Hi,

On Wed, Aug 12, 2020 at 2:46 AM Tom Lane  wrote:
> So I think Kasahara-san's point is that the shared memory stats collector
> might wipe out those costs, depending on how it's implemented.  (I've not
> looked at that patch in a long time either, so I don't know how much it'd
> cut the reader-side costs.  But maybe it'd be substantial.)
Thanks for your clarification, that's what I wanted to say.
Sorry for the lack of explanation.

> I think the real issue here is autovac_refresh_stats's insistence that it
> shouldn't throttle pgstats re-reads in workers.
I agree with that.

> I wonder if we could have table_recheck_autovac do two probes of the stats
> data.  First probe the existing stats data, and if it shows the table to
> be already vacuumed, return immediately.  If not, *then* force a stats
> re-read, and check a second time.
Does the above mean that the second and subsequent table_recheck_autovac()
will be improved to first check using the previously refreshed statistics?
I think that certainly works.

If that's correct, I'll try to create a patch for the PoC.

Best regards,

-- 
Tatsuhito Kasahara
kasahara.tatsuhito _at_ gmail.com




Re: autovac issue with large number of tables

2020-08-11 Thread Tom Lane
Jim Nasby  writes:
> Without reading the 100+ emails or the 260k patch, I'm guessing that it 
> won't help because the problem I observed was spending most of its time in
>    42.62% postgres  [.] hash_search_with_hash_value
> I don't see how moving things to shared memory would help that at all.

So I'm a bit mystified as to why that would show up as the primary cost.
It looks to me like we force a re-read of the pgstats data each time
through table_recheck_autovac(), and it seems like the costs associated
with that would swamp everything else in the case you're worried about.

I suspect that the bulk of the hash_search_with_hash_value costs are
HASH_ENTER calls caused by repopulating the pgstats hash table, rather
than the single read probe that table_recheck_autovac itself will do.
It's still surprising that that would dominate the other costs of reading
the data, but maybe those costs just aren't as well localized in the code.
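
To illustrate, each forced re-read rebuilds the per-table hash roughly like
this (a simplified illustration only, not the actual pgstat.c code; the real
reload logic also handles format markers, per-database entries and function
stats):

static void
reload_table_stats(FILE *fpin, HTAB *tabhash)
{
	PgStat_StatTabEntry tabbuf;
	PgStat_StatTabEntry *tabentry;
	bool		found;

	while (fread(&tabbuf, 1, sizeof(tabbuf), fpin) == sizeof(tabbuf))
	{
		/* one HASH_ENTER per table, i.e. one hash_search_with_hash_value */
		tabentry = (PgStat_StatTabEntry *) hash_search(tabhash,
													   &tabbuf.tableid,
													   HASH_ENTER,
													   &found);
		memcpy(tabentry, &tabbuf, sizeof(tabbuf));
	}
}

With tens of thousands of tables, that insertion loop runs once per
table_recheck_autovac() call, which would explain the profile.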

So I think Kasahara-san's point is that the shared memory stats collector
might wipe out those costs, depending on how it's implemented.  (I've not
looked at that patch in a long time either, so I don't know how much it'd
cut the reader-side costs.  But maybe it'd be substantial.)

In the meantime, though, do we want to do something else to alleviate
the issue?  I realize you only described your patch as a PoC, but I
can't say I like it much:

* Giving up after we've wasted 1000 pgstats re-reads seems like locking
the barn door only after the horse is well across the state line.

* I'm not convinced that the business with skipping N entries at a time
buys anything.  You'd have to make pretty strong assumptions about the
workers all processing tables at about the same rate to believe it will
help.  In the worst case, it might lead to all the workers ignoring the
same table(s).

I think the real issue here is autovac_refresh_stats's insistence that it
shouldn't throttle pgstats re-reads in workers.  I see the point about not
wanting to repeat vacuum work on the basis of stale data, but still ...
I wonder if we could have table_recheck_autovac do two probes of the stats
data.  First probe the existing stats data, and if it shows the table to
be already vacuumed, return immediately.  If not, *then* force a stats
re-read, and check a second time.
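
For reference, the current autovac_refresh_stats() logic is roughly the
following (paraphrased, not an exact copy of autovacuum.c): only the launcher
throttles re-reads via STATS_READ_DELAY, while a worker always discards its
snapshot.

static void
autovac_refresh_stats(void)
{
	if (IsAutoVacuumLauncherProcess())
	{
		static TimestampTz last_read = 0;
		TimestampTz current_time = GetCurrentTimestamp();

		/* the launcher re-reads at most once per STATS_READ_DELAY msec ... */
		if (!TimestampDifferenceExceeds(last_read, current_time,
										STATS_READ_DELAY))
			return;
		last_read = current_time;
	}

	/* ... but a worker always falls through and forces a fresh read */
	pgstat_clear_snapshot();
}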

BTW, can you provide a test script that reproduces the problem you're
looking at?  The rest of us are kind of guessing at what's happening.

regards, tom lane




Re: autovac issue with large number of tables

2020-07-30 Thread Kasahara Tatsuhito
Hi,

On Tue, Jul 28, 2020 at 3:49 AM Jim Nasby  wrote:
> I'm in favor of trying to improve scheduling (especially allowing users
> to control how things are scheduled), but that's a far more invasive
> patch. I'd like to get something like this patch in without waiting on a
> significantly larger effort.

BTW, have you tried the patch suggested in the thread below?

https://www.postgresql.org/message-id/20180629.173418.190173462.horiguchi.kyotaro%40lab.ntt.co.jp

The above is a suggestion to manage statistics in shared memory rather
than in a file, and I think this feature may mitigate your problem.
That feature has yet another performance challenge of its own, but it
might be worth a try.
The patch will also require a great deal of effort to get into
PostgreSQL core, but I'm curious to see how well it works for this
problem.

Best regards,

-- 
Tatsuhito Kasahara
kasahara.tatsuhito _at_ gmail.com




Re: autovac issue with large number of tables

2020-07-26 Thread Masahiko Sawada
On Mon, 27 Jul 2020 at 06:43, Nasby, Jim  wrote:
>
> A database with a very large number of tables eligible for autovacuum can 
> result in autovacuum workers “stuck” in a tight loop of 
> table_recheck_autovac() constantly reporting nothing to do on the table. This 
> is because a database with a very large number of tables means it takes a 
> while to search the statistics hash to verify that the table still needs to 
> be processed[1]. If a worker spends some time processing a table, when it’s 
> done it can spend a significant amount of time rechecking each table that it 
> identified at launch (I’ve seen a worker in this state for over an hour). A 
> simple work-around in this scenario is to kill the worker; the launcher will 
> quickly fire up a new worker on the same database, and that worker will build 
> a new list of tables.
>
>
>
> That’s not a complete solution though… if the database contains a large 
> number of very small tables you can end up in a state where 1 or 2 workers are 
> busy chugging through those small tables so quickly that any additional 
> workers spend all their time in table_recheck_autovac(), because that takes 
> long enough that the additional workers are never able to “leapfrog” the 
> workers that are doing useful work.
>

As another solution, I've been considering adding a queue of table
OIDs that need to be vacuumed/analyzed in shared memory (e.g. on
DSA). Since all autovacuum workers running on the same database can
see a consistent queue, the issue explained above won't happen, and
it would probably also make it easier to implement prioritization of
tables to be vacuumed, which is sometimes discussed on pgsql-hackers.
I guess it might be worth discussing this idea as well.
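
Ignoring the DSA and resizing details, a minimal sketch of the idea (the
names are illustrative):

typedef struct AutoVacuumWorkQueue
{
	slock_t		mutex;
	int			head;			/* next slot to be claimed by a worker */
	int			tail;			/* one past the last enqueued slot */
	Oid			relids[FLEXIBLE_ARRAY_MEMBER];
} AutoVacuumWorkQueue;

/*
 * Claim the next table to vacuum/analyze, or return InvalidOid once the
 * queue is drained.  Because every worker pops from the same shared queue,
 * no two workers ever recheck or vacuum the same table.
 */
static Oid
autovac_queue_next(AutoVacuumWorkQueue *queue)
{
	Oid			relid = InvalidOid;

	SpinLockAcquire(&queue->mutex);
	if (queue->head < queue->tail)
		relid = queue->relids[queue->head++];
	SpinLockRelease(&queue->mutex);

	return relid;
}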

Regards,

-- 
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services