Hi,
I created an index on a 11devel base while sampling pg_stat_activity with a
little tool. Tool catches a line if state = active. Collected rows are
aggregated and sorted by activity percentage.
Test environment :
select version();
version
----------------------------------------------------------------------------------------------------------------------------------------------------
PostgreSQL 11devel (Debian 11~~devel~20180227.2330-1~420.git51057fe.pgdg+1) on
x86_64-pc-linux-gnu, compiled by gcc (Debian 7.3.0-5) 7.3.0, 64-bit
(1 ligne)
Temps : 0,762 ms
create table t1(c1 bigint, c2 double precision, c3 text);
CREATE TABLE
insert into t1 select generate_series(1,100000000,1), random(),
md5(random()::text) ;
INSERT 0 100000000
With a select (select max(c1) from t1 group by c2;) I have this kind of output :
./t -d 20 -o "pid, backend_type, query, wait_event_type, wait_event"
traqueur 2.04.00 - performance tool for PostgreSQL 9.3 => 11
INFORMATION, no connection parameters provided, connecting to traqueur database
...
INFORMATION, connected to traqueur database
INFORMATION, PostgreSQL version : 110000
INFORMATION, sql preparation ...
INFORMATION, sql execution ...
busy_pc | distinct_exe | pid | backend_type | query
| wait_event_type | wait_event
---------+--------------+-------+-----------------+-------------------------------------+-----------------+--------------
76 | 1 / 151 | 10065 | parallel worker | select max(c1) from t1
group by c2; | IO | DataFileRead
73 | 1 / 146 | 8262 | client backend | select max(c1) from t1
group by c2; | IO | DataFileRead
72 | 1 / 144 | 10066 | parallel worker | select max(c1) from t1
group by c2; | IO | DataFileRead
26 | 1 / 53 | 10066 | parallel worker | select max(c1) from t1
group by c2; | |
26 | 1 / 51 | 8262 | client backend | select max(c1) from t1
group by c2; | |
24 | 1 / 47 | 10065 | parallel worker | select max(c1) from t1
group by c2; | |
2 | 1 / 3 | 10066 | parallel worker | select max(c1) from t1
group by c2; | IO | BufFileWrite
2 | 1 / 3 | 8262 | client backend | select max(c1) from t1
group by c2; | IO | BufFileWrite
1 | 1 / 2 | 10065 | parallel worker | select max(c1) from t1
group by c2; | IO | BufFileWrite
With an index creation (create index t1_i1 on t1(c1, c2);) I have this kind of
output :
./t -d 20 -o "pid, backend_type, query, wait_event_type, wait_event"
traqueur 2.04.00 - performance tool for PostgreSQL 9.3 => 11
INFORMATION, no connection parameters provided, connecting to traqueur database
...
INFORMATION, connected to traqueur database
INFORMATION, PostgreSQL version : 110000
INFORMATION, sql preparation ...
INFORMATION, sql execution ...
busy_pc | distinct_exe | pid | backend_type | query
| wait_event_type | wait_event
---------+--------------+------+----------------+-----------------------------------+-----------------+--------------
68 | 1 / 136 | 8262 | client backend | create index t1_i1 on t1(c1,
c2); | IO | DataFileRead
26 | 1 / 53 | 8262 | client backend | create index t1_i1 on t1(c1,
c2); | |
6 | 1 / 11 | 8262 | client backend | create index t1_i1 on t1(c1,
c2); | IO | BufFileWrite
(3 rows)
No parallel worker. At least one parallel worker was active though, I could see
its work with a direct query on pg_stat_activity or a ps -ef :
...
postgres 8262 8230 7 08:54 ? 00:22:46 postgres: 11/main: postgres
postgres [local] CREATE INDEX
...
postgres 9833 8230 23 14:17 ? 00:00:33 postgres: 11/main: parallel
worker for PID 8262
...
Tool only catches activity of the client backend cause column state of
pg_stat_activity is null for the parallel workers in this case. I added an
option to do a "(state = 'active' or wait_event_is not null)" It's not 100%
accurate though : I miss the activity of the parallel workers which is not
waiting and it’s more difficult to know who helps whom since query is also null.
I can imagine various workarounds but 11 is in devel and maybe columns active &
query of pg_stat_activity will be filled for the parallel workers even for an
index creation ?
Best regards
Phil