Hi,
I created an index on an 11devel database while sampling pg_stat_activity with a 
little tool. The tool catches a row if state = 'active'. Collected rows are 
aggregated and sorted by activity percentage.
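
As a rough illustration of the sampling and aggregation described above (this 
is not the tool's actual code; the function names and the row layout are 
assumptions made for the example), the idea can be sketched in Python like this:

```python
from collections import Counter


def is_caught(row):
    """Sampling rule described above: keep a row only if state = 'active'."""
    return row.get("state") == "active"


def aggregate(snapshots, columns):
    """Aggregate pg_stat_activity snapshots into activity percentages.

    `snapshots` is a list of samples; each sample is a list of dicts, one per
    pg_stat_activity row (hypothetical layout). For every distinct combination
    of `columns`, count the samples in which it was caught, then express that
    count as a percentage of all samples, highest first.
    """
    hits = Counter()
    for snapshot in snapshots:
        # A combination counts at most once per sample.
        seen = {
            tuple(row.get(c) for c in columns)
            for row in snapshot
            if is_caught(row)
        }
        hits.update(seen)

    total = len(snapshots)
    report = [(100.0 * n / total, key) for key, n in hits.items()]
    report.sort(key=lambda item: item[0], reverse=True)
    return report
```

Calling aggregate(snapshots, ["pid", "wait_event"]) would give one line per 
(pid, wait_event) pair, which is roughly the shape of the outputs shown further 
down.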

Test environment:

select version();
                                                                      version
----------------------------------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 11devel (Debian 11~~devel~20180227.2330-1~420.git51057fe.pgdg+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 7.3.0-5) 7.3.0, 64-bit
(1 row)

Time: 0.762 ms

create table t1(c1 bigint, c2 double precision, c3 text);
CREATE TABLE

insert into t1 select generate_series(1,100000000,1), random(), md5(random()::text);
INSERT 0 100000000

With a select (select max(c1) from t1 group by c2;), I get this kind of output:

./t -d 20 -o "pid, backend_type, query, wait_event_type, wait_event"
traqueur 2.04.00 - performance tool for PostgreSQL 9.3 => 11
INFORMATION, no connection parameters provided, connecting to traqueur database ...
INFORMATION, connected to traqueur database
INFORMATION, PostgreSQL version : 110000
INFORMATION, sql preparation ...
INFORMATION, sql execution ...
 busy_pc | distinct_exe |  pid  |  backend_type   |                query                | wait_event_type |  wait_event
---------+--------------+-------+-----------------+-------------------------------------+-----------------+--------------
      76 | 1 / 151      | 10065 | parallel worker | select max(c1) from t1 group by c2; | IO              | DataFileRead
      73 | 1 / 146      |  8262 | client backend  | select max(c1) from t1 group by c2; | IO              | DataFileRead
      72 | 1 / 144      | 10066 | parallel worker | select max(c1) from t1 group by c2; | IO              | DataFileRead
      26 | 1 / 53       | 10066 | parallel worker | select max(c1) from t1 group by c2; |                 |
      26 | 1 / 51       |  8262 | client backend  | select max(c1) from t1 group by c2; |                 |
      24 | 1 / 47       | 10065 | parallel worker | select max(c1) from t1 group by c2; |                 |
       2 | 1 / 3        | 10066 | parallel worker | select max(c1) from t1 group by c2; | IO              | BufFileWrite
       2 | 1 / 3        |  8262 | client backend  | select max(c1) from t1 group by c2; | IO              | BufFileWrite
       1 | 1 / 2        | 10065 | parallel worker | select max(c1) from t1 group by c2; | IO              | BufFileWrite



With an index creation (create index t1_i1 on t1(c1, c2);), I get this kind of 
output:

./t -d 20 -o "pid, backend_type, query, wait_event_type, wait_event"
traqueur 2.04.00 - performance tool for PostgreSQL 9.3 => 11
INFORMATION, no connection parameters provided, connecting to traqueur database ...
INFORMATION, connected to traqueur database
INFORMATION, PostgreSQL version : 110000
INFORMATION, sql preparation ...
INFORMATION, sql execution ...
 busy_pc | distinct_exe | pid  |  backend_type  |               query               | wait_event_type |  wait_event
---------+--------------+------+----------------+-----------------------------------+-----------------+--------------
      68 | 1 / 136      | 8262 | client backend | create index t1_i1 on t1(c1, c2); | IO              | DataFileRead
      26 | 1 / 53       | 8262 | client backend | create index t1_i1 on t1(c1, c2); |                 |
       6 | 1 / 11       | 8262 | client backend | create index t1_i1 on t1(c1, c2); | IO              | BufFileWrite
(3 rows)


No parallel worker shows up. At least one parallel worker was active, though: I 
could see it with a direct query on pg_stat_activity or with ps -ef:

...
postgres  8262  8230  7 08:54 ?        00:22:46 postgres: 11/main: postgres postgres [local] CREATE INDEX
...
postgres  9833  8230 23 14:17 ?        00:00:33 postgres: 11/main: parallel worker for PID 8262
...

The tool only catches the activity of the client backend because the state 
column of pg_stat_activity is null for the parallel workers in this case. I 
added an option to filter on "(state = 'active' or wait_event is not null)". 
It's not 100% accurate, though: I miss the activity of parallel workers that 
are not waiting, and it's harder to know who helps whom since query is also null.
I can imagine various workarounds, but 11 is still in development, so maybe the 
state and query columns of pg_stat_activity will be filled for the parallel 
workers, even for an index creation?
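
To make the limitation of the relaxed filter concrete, here is a minimal Python 
sketch of that rule. The function name and the sample rows are hypothetical; 
only pid 9833, pid 8262 and the null state/query values come from the 
observations above.

```python
def is_caught_relaxed(row):
    """Relaxed rule: state = 'active' OR wait_event IS NOT NULL."""
    return row.get("state") == "active" or row.get("wait_event") is not None


# Hypothetical rows, shaped like what is described above for CREATE INDEX.
leader_on_cpu = {"pid": 8262, "state": "active",
                 "query": "create index t1_i1 on t1(c1, c2);",
                 "wait_event": None}
worker_waiting = {"pid": 9833, "state": None, "query": None,
                  "wait_event": "DataFileRead"}
worker_on_cpu = {"pid": 9833, "state": None, "query": None,
                 "wait_event": None}

for row in (leader_on_cpu, worker_waiting, worker_on_cpu):
    print(row["pid"], row["wait_event"], is_caught_relaxed(row))
```

The waiting worker is caught, but the same worker is missed as soon as it is 
running on CPU, because neither state nor wait_event is set at that moment.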

Best regards
Phil









