Re: LWlock:LockManager waits

2024-04-08 Thread Laurenz Albe
On Tue, 2024-04-09 at 11:07 +0800, James Pang wrote:
>    we found sometimes , with many sessions running same query "select ..." at 
> the same time, saw many sessions waiting on "LockManager".  for example, 
> pg_stat_activity show.  It's a production server, so no enable trace_lwlocks 
> flag. could you direct me what's the possible reason and how to reduce this 
> "lockmanager" lock?  all the sql statement are "select " ,no DML.
> 
>    time                                             wait_event           
> count(pid) 
> 2024-04-08 09:00:06.043996+00 | DataFileRead  |    42
>  2024-04-08 09:00:06.043996+00 |               |    15
>  2024-04-08 09:00:06.043996+00 | LockManager   |    31
>  2024-04-08 09:00:06.043996+00 | BufferMapping |    46
>  2024-04-08 09:00:07.114015+00 | LockManager   |    43
>  2024-04-08 09:00:07.114015+00 | DataFileRead  |    28
>  2024-04-08 09:00:07.114015+00 | ClientRead    |    11
>  2024-04-08 09:00:07.114015+00 |               |    11

That's quite obvious: too many connections cause internal contention in the 
database.

Reduce the number of connections by using a reasonably sized connection pool.

Yours,
Laurenz Albe




Re: maintenance_work_mem impact?

2024-03-19 Thread Laurenz Albe
On Tue, 2024-03-19 at 16:05 +, Adithya Kumaranchath wrote:
> I have a table approx. 20GB.
> 
> CREATE UNIQUE INDEX testindex_v1 ON testtable1 (test_index); 
> 
> My observations:
> maintenance_work_mem = 2G
> max_parallel_workers = '16'
> 
> The create index completes in 20 minutes.
> 
> When I change this:
> maintenance_work_mem = 16G
> max_parallel_workers = '16'
> 
> It completes in 9 minutes. So I can see that I can gain performance by 
> changing this number.
> 
> So it is faster but the question I have is it safe to set it to such a high 
> number?
> I am aware that only one of these operations can be executed at a time by a 
> database
> session, and an installation normally doesn't have many of them running 
> concurrently,
> so it's safe to set this value significantly larger. I have 128GB memory.
>    1. Any advice or thoughts?
>    2. Is there any other parameter that can accelerate index creation? 

It is safe as long as you have enough free memory on the machine.

You can verify with tools like "free" on Linux (look for "available" memory).
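For example, you could raise the limit only in the session that builds the index
(a sketch; the exact values depend on your machine, and "max_parallel_maintenance_workers"
is the setting that actually caps the parallel workers used by CREATE INDEX):

  SET maintenance_work_mem = '16GB';
  SET max_parallel_maintenance_workers = 8;   -- still capped by max_parallel_workers

  CREATE UNIQUE INDEX testindex_v1 ON testtable1 (test_index);

  RESET maintenance_work_mem;
  RESET max_parallel_maintenance_workers;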

Yours,
Laurenz Albe




Re: Optimizing count(), but Explain estimates wildly off

2024-03-01 Thread Laurenz Albe
On Thu, 2024-02-29 at 17:15 -0600, Chema wrote:
> No major changes after doing Analyze, and also Vacuum Analyze.

Indeed.

This caught my attention:

> ->  Parallel Seq Scan on pricescope_items  (cost=0.00..1027794.01 
> rows=3277101 width=522) (actual time=0.753..41654.507 rows=2621681 loops=3)

Why does it take over 41 seconds to read a table with less than
3 million rows?  Are the rows so large?  Is the table bloated?
What is the size of the table as measured with pg_relation_size()
and pg_table_size()?
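For example (using the table name from your plan):

  SELECT pg_size_pretty(pg_relation_size('pricescope_items')) AS main_fork_only,
         pg_size_pretty(pg_table_size('pricescope_items'))    AS including_toast;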

Yours,
Laurenz Albe




Re: Table Partitioning and Indexes Performance Questions

2024-02-29 Thread Laurenz Albe
On Thu, 2024-02-29 at 11:42 -0500, David Kelly wrote:
> I was told that partitioned table indexed must always start with the 
> partition key columns.

That's not true.

Only unique indexes (as used by primary key and unique constraints) must
contain the partitioning key (but they don't have to start with it).
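A minimal sketch with made-up names: the partitioning column only has to appear
somewhere in the unique index, not in the leading position:

  CREATE TABLE measurements (
     id bigint GENERATED ALWAYS AS IDENTITY,
     measured_on date NOT NULL,
     value double precision
  ) PARTITION BY RANGE (measured_on);

  -- allowed: the partitioning column is included, but not first
  ALTER TABLE measurements ADD PRIMARY KEY (id, measured_on);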


> Any other performance considerations when it comes to partitioned table 
> indexing?
> Specifically, partitioning by range where the range is a single value.

Not particularly - selecting from a partitioned table is like selecting
from a UNION ALL of all partitions, except that sometimes PostgreSQL
can forgo scanning some of the partitions.
If you use very many partitions, the overhead for query planning and
execution can become noticeable.

Yours,
Laurenz Albe




Re: Optimizing count(), but Explain estimates wildly off

2024-02-26 Thread Laurenz Albe
On Mon, 2024-02-26 at 18:25 -0600, Chema wrote:
> I'm trying to optimize simple queries on two tables (tenders & items) with a 
> couple
> million records.  Besides the resulting records, the app also displays the 
> count of
> total results.  Doing count() takes as much time as the other query (which 
> can be
> 30+ secs), so it's an obvious target for optimization.
> 
> Reading around, seems many people are still using this 2005 snippet to obtain 
> the
> row count estimate from Explain:

I recommend using FORMAT JSON and extracting the top row count from that.  It is
simpler and less error-prone.
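A sketch of that approach (the helper function is only an illustration, not a
canned API; adjust as needed):

  CREATE FUNCTION count_estimate(query text) RETURNS bigint
  LANGUAGE plpgsql AS
  $$DECLARE
     plan jsonb;
  BEGIN
     EXECUTE 'EXPLAIN (FORMAT JSON) ' || query INTO plan;
     RETURN (plan -> 0 -> 'Plan' ->> 'Plan Rows')::bigint;
  END;$$;

  SELECT count_estimate('SELECT * FROM tenders');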

> Is this still the current best practice?  Any tips to increase precision?
> Currently it can estimate the actual number of rows for over or under a 
> million,
> as seen on the sample queries (1,955,297 instead of 1,001,200; 162,080 instead
> of 1,292,010).

Looking at the samples you provided, I get the impression that the statistics 
for
the table are quite outdated.  That will affect the estimates.  Try running 
ANALYZE
and see if that improves the estimates.

Yours,
Laurenz Albe




Re: sql statement not using all primary key values and poor performance

2024-02-23 Thread Laurenz Albe
On Fri, 2024-02-23 at 18:21 +0800, James Pang wrote:
> it's a third-party vendor application, not easy to change their code.

Then the application is broken, and you should make the vendor fix it.

> is it possible to   1) in Postgresql JDBC driver connection, set
> plan_cache_mode=force_custom_plan    or 2) some other parameters can 
> workaround this issue?

You can set "prepareThreshold" to 0 to keep the JDBC driver from using
prepared statements in PostgreSQL.  I am not sure if that is enough to
fix the problem.

Yours,
Laurenz Albe




Re: sql statement not using all primary key values and poor performance

2024-02-23 Thread Laurenz Albe
On Fri, 2024-02-23 at 15:20 +0800, James Pang wrote:
>    we have a SQL from Postgresql JDBC,  primary is based on 
> (bigint,varchar2,bigint),
> but from sql plan, it convert to ::numeric so the plan just use one "varchar"
> key column and use the other 2 bigint keys as filters. what's the cause about 
> that ? 
> 
>                             Table "test.xx"
>       Column      |              Type              | Collation | Nullable | 
> Default
> --++---+--+-
>   xxxid           | bigint                         |           | not null |
>  paramname        | character varying(512)         |           | not null |
>  paramvalue       | character varying(1536)        |           |          |
>   sssid           | bigint                         |           | not null |
>  createtime       | timestamp(0) without time zone |           |          |
>  lastmodifiedtime | timestamp(0) without time zone |           |          |
>  mmmuuid          | character varying(32)          |           |          |
> Indexes:
>       "pk_xx" PRIMARY KEY, btree (xxxid, paramname, sssid)
>       "idx_xx_mmmuuid" btree (sssid, mmmuuid, paramname)
> 
> SET extra_float_digits = 3
> 
> duration: 7086.014 ms  plan:
>         Query Text: SELECT XXXFID, PARAMNAME, PARAMVALUE, SSSID, CREATETIME, 
> LASTMODIFIEDTIME, MMMUUID FROM test.XX WHERE  ( ( XXXID = $1  ) )  AND  ( 
> ( PARAMNAME = $2  ) )  AND  ( ( SSSID = $3  ) )
>         Index Scan using pk_xx on test.xx  (cost=0.57..2065259.09 
> rows=1 width=86) (actual time=7086.010..7086.011 rows=0 loops=1)
>           Output: confid, paramname, paramvalue, sssid, createtime, 
> lastmodifiedtime, mmmuuid
>           Index Cond: ((xx.paramname)::text = 'cdkkif'::text)   <<< 
> just use only one key instead all primary keys.
>           Filter: (((xx.xxxid)::numeric = '18174044'::numeric) AND 
> ((xx.sssid)::numeric = '253352'::numeric))    <<< it's bigint but 
> converted to numeric 
>           Buffers: shared read=1063470
>           I/O Timings: read=4402.029
> 
> it's from JDBC, we saw this JDBC driver try to set extra_float_digits = 3 
> before
> running the SQL ,does that make planner to convert bigint to numeric ?

Setting "extra_float_digits" is just something the JDBC driver does so as to
not lose precision with "real" and "double precision" values on old versions
of PostgreSQL.

The problem is that you bind the query parameters with the wrong data types.
Don't use "setBigDecimal()", but "setLong()" if you want to bind a "bigint".
An alternative is "setObject()" with "targetSqlType" set to "Types.BIGINT".
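On the server side, correctly typed parameters look like this (a sketch using the
column names you posted; "q" is just a placeholder name).  With the parameters
declared as bigint, the index condition can use all three primary key columns:

  PREPARE q(bigint, text, bigint) AS
     SELECT xxxid, paramname, paramvalue, sssid, createtime, lastmodifiedtime, mmmuuid
       FROM test.xx
      WHERE xxxid = $1 AND paramname = $2 AND sssid = $3;

  EXPLAIN EXECUTE q(18174044, 'cdkkif', 253352);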

Yours,
Laurenz Albe




Re: "not related" code blocks for removal of dead rows when using vacuum and this kills the performance

2024-02-19 Thread Laurenz Albe
On Tue, 2024-02-20 at 05:46 +, Lars Aksel Opsahl wrote:
> If this is expected behavior it means that any user on the database that 
> writes
> a long running sql that does not even insert any data can kill performance for
> any other user in the database.

Yes, that is the case.  A long running query will hold a snapshot, and no data
visible in that snapshot can be deleted.

That can cause bloat, which can impact performance.

> So applications like QGIS who seems to keep open connections for a while can
> then also kill the performance for any other user in the data.

No, that is not a problem.  Keeping *connections* open is a good thing. It is
keeping data modifying transactions, cursors or long-running queries open
that constitutes a problem.
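You can see which sessions are currently holding back cleanup, for example with
the following query ("backend_xmin" shows the snapshot horizon a session holds):

  SELECT pid, state, xact_start, backend_xmin, left(query, 60) AS query
    FROM pg_stat_activity
   WHERE backend_xmin IS NOT NULL
   ORDER BY xact_start;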

Yours,
Laurenz Albe




Re: "not related" code blocks for removal of dead rows when using vacuum and this kills the performance

2024-02-19 Thread Laurenz Albe
On Mon, 2024-02-19 at 16:14 +, Lars Aksel Opsahl wrote:
> Then we start testing VACUUM and very simple SQL testing in another window.
> 
> We can now show we have performance of "3343.794 ms" and not "0.123 ms", which
> is what we get when we are able to remove dead rows and run a new analyze.
> 
> The problem is that as long as the master code is active, we cannot remove
> alle dead rows and that what seems to be killing the performance.
> 
> With active I mean in hanging on pg_sleep and remember that this master has
> not created the test table or inserted any data in this test table it self.
> 
> Is the expected behavior ?

It is not entirely clear what you are doing, but it seems like you are holding
a database transaction open, and yes, then it is expected behavior that
VACUUM cannot clean up dead rows in the table.

Make sure that your database transactions are short.
Don't use table or row locks to synchronize application threads.
What you could use to synchronize your application threads are advisory locks,
they are not tied to a database transaction.
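A minimal sketch of advisory locks (the key 42 is arbitrary; pick your own
numbering scheme for the resources you want to synchronize):

  -- thread A: wait for the lock, do the work, release it
  SELECT pg_advisory_lock(42);
  -- ... synchronized work ...
  SELECT pg_advisory_unlock(42);

  -- thread B: don't wait, just try
  SELECT pg_try_advisory_lock(42);   -- returns true if the lock was acquired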

Yours,
Laurenz Albe




Re: PostgreSQL doesn't use index-only scan if there is an expression in index

2024-02-15 Thread Laurenz Albe
On Thu, 2024-02-15 at 17:37 +0300, Pavel Kulakov wrote:
> Hello,
>  
> PostgreSQL doesn't use 'Index Only Scan' if there is an expression in index.
>  
> The documentation says that PostgreSQL's planner considers a query to be 
> potentially
> executable by index-only scan only when all columns needed by the query are 
> available from the index. 
> I think an example on 
> https://www.postgresql.org/docs/16/indexes-index-only-scans.html :
>  
> SELECT f(x) FROM tab WHERE f(x) < 1;
>  
> is a bit confusing. Even the following query does not use 'Index Only Scan'
>  
> SELECT 1 FROM tab WHERE f(x) < 1;
>  
> Demonstration:
> ---
> drop table if exists test;
>  
> create table test(s text);
> create index ix_test_upper on test (upper(s));
> create index ix_test_normal on test (s);
>  
> insert into test (s)
> select 'Item' || t.i
> from pg_catalog.generate_series(1, 10, 1) t(i);
>  
> analyze verbose "test";
>  
> explain select 1 from test where s = 'Item123';
> explain select 1 from test where upper(s) = upper('Item123');
> --
> Query plan 1:
> Index Only Scan using ix_test_normal on test  (cost=0.42..8.44 rows=1 width=4)
>   Index Cond: (s = 'Item123'::text)
>   
> Query plan 2 (SHOULD BE 'Index Only Scan'):
> Index Scan using ix_test_upper on test  (cost=0.42..8.44 rows=1 width=4)
>   Index Cond: (upper(s) = 'ITEM123'::text)
>    
>  
> If I add 's' as included column to ix_test_upper the plan does use 'Index 
> Only Scan'.
> That looks strange to me: there is no 's' in SELECT-clause, only in 
> WHERE-clause in
> the form of 'upper(s)' and this is why ix_test_upper is choosen by the 
> planner.

You need to create the index like this:

  CREATE INDEX ix_test_upper ON test (upper(s)) INCLUDE (s);

See https://www.postgresql.org/docs/current/indexes-index-only-scans.html:

 "In principle, index-only scans can be used with expression indexes.
  For example, given an index on f(x) where x is a table column, it
  should be possible to execute

SELECT f(x) FROM tab WHERE f(x) < 1;

  as an index-only scan; and this is very attractive if f() is an
  expensive-to-compute function. However, PostgreSQL's planner is currently
  not very smart about such cases. It considers a query to be potentially
  executable by index-only scan only when all columns needed by the query
  are available from the index. In this example, x is not needed except in
  the context f(x), but the planner does not notice that and concludes that
  an index-only scan is not possible. If an index-only scan seems sufficiently
  worthwhile, this can be worked around by adding x as an included column,
  for example

CREATE INDEX tab_f_x ON tab (f(x)) INCLUDE (x);"

Yours,
Laurenz Albe




Re: huge SubtransSLRU and SubtransBuffer wait_event

2024-02-02 Thread Laurenz Albe
On Fri, 2024-02-02 at 02:04 -0800, Nikolay Samokhvalov wrote:
> On Thu, Feb 1, 2024 at 04:42 Laurenz Albe  wrote:
> > Today, the only feasible solution is not to create more than 64 
> > subtransactions
> > (savepoints or PL/pgSQL EXCEPTION clauses) per transaction.
> 
> I think 64+ nesting level is quite rare

It doesn't have to be 64 *nested* subtransactions.  This is enough:

CREATE TABLE tab (x integer);

DO
$$DECLARE
   i integer;
BEGIN
   FOR i IN 1..70 LOOP
      BEGIN
         INSERT INTO tab VALUES (i);
      EXCEPTION
         WHEN unique_violation THEN
            NULL; -- ignore
      END;
   END LOOP;
END;$$;

Yours,
Laurenz Albe




Re: huge SubtransSLRU and SubtransBuffer wait_event

2024-02-01 Thread Laurenz Albe
On Thu, 2024-02-01 at 11:50 +, James Pang (chaolpan) wrote:
> We have a Postgresqlv14.8 server, client use Postgresql JDBC connections, 
> today,
> our server see a lot of  “SubtransBuffer” and “SubtransSLRU” wait_event.
> Could you help direct me what’s the possible cause and how to resolve this 
> waits ?

Today, the only feasible solution is not to create more than 64 subtransactions
(savepoints or PL/pgSQL EXCEPTION clauses) per transaction.

Don't use extensions or the JDBC driver option to simulate statement level 
rollback,
that is the road to hell.

Yours,
Laurenz Albe




Re: Weird performance differences between cloud vendors

2024-02-01 Thread Laurenz Albe
On Thu, 2024-02-01 at 10:23 +0100, Dirk Krautschick wrote:
> I have run a test with pgbench against two cloud vendors (settings, 
> parameters almost the same).
> Both Postgres (or whatever they do internally when they call it as Postgres 
> offering, NOT Aurora or so :-) )
> 
> I have got a strange result that cloud vendor 1 is performing almost 
> everywhere better in matter of
> read and write but except in the init phase of pgbench it took almost double 
> the time.

Nobody except those vendors could tell you for certain, but perhaps on the one
system the initial data load is fast, because you have not yet exceeded your 
I/O quota,
and then I/O is throttled.

Yours,
Laurenz Albe




Re: Slow query in table where many rows were deleted. VACUUM FULL fixes it

2024-01-30 Thread Laurenz Albe
On Tue, 2024-01-30 at 11:40 +0200, Pavlos Kallis wrote:
> I have the following table:
> 
> CREATE TABLE IF NOT EXISTS public.shortened_url
> (
>     id character varying(12) COLLATE pg_catalog."default" NOT NULL,
>     created_at timestamp without time zone,
>     expires_at timestamp without time zone,
>     url text COLLATE pg_catalog."default" NOT NULL,
>     CONSTRAINT shortened_url_pkey PRIMARY KEY (id)
> )
> 
> The table contains only the following index on PRIMARY KEY:
> 
> CREATE UNIQUE INDEX IF NOT EXISTS shortened_url_pkey
>     ON public.shortened_url USING btree
>     (id COLLATE pg_catalog."default" ASC NULLS LAST)
>     TABLESPACE pg_default;
> 
> This table has approximately 5 million rows of expired URLs (expires_at < 
> now()), and 5 thousand rows of non-expired URLs (expires_at > now())
> 
> I deleted all expired URLs with this query:
> 
> DELETE FROM shortened_url WHERE expires_at < now().
> 
> Then, I tried to query the table for expired URLs:
> 
> SELECT * FROM shortened_url WHERE expires_at < now();
> 
> This query was very slow. It took around 1-2 minutes to run, while it had to 
> fetch only 5000 rows (the non-expired URLs, since the other ones were 
> deleted).
> 
> After that, I tried to run VACUUM ANALYZE and REINDEX to the table.
> The query was still slow.
> 
> Finally, I ran VACUUM FULL and re-executed the query. Only then, it started 
> running fast (1-2 seconds).
> 
> Do you have observed a similar behavior with VACUUM ANALYZE / VACUUM FULL and 
> why this can happen? 
> Is this because data is compacted after VACUUM FULL and sequential disk reads 
> are faster? 
> Shouldn't VACUUM ANALYZE reclaim the disk space and make the query run fast?
> Is this because RDS might do some magic? Is it something I am missing?

There are too many unknowns here.  Please enable "track_io_timing" and send us
the output of EXPLAIN (ANALYZE, BUFFERS) for the slow statements.

One theory could be that there was a long running transaction or something else
that prevented VACUUM from cleaning up.  For that, the output of
"VACUUM (VERBOSE) shortened_url" would be interesting.

> Additional details
> PostgreSQL version: 14.7 on db.t3.micro RDS
> PG configuration: Default of RDS

We can only speak about real PostgreSQL...

Yours,
Laurenz Albe




Re: Performance down with JDBC 42

2023-11-04 Thread Laurenz Albe
On Sat, 2023-11-04 at 19:08 +, Abraham, Danny wrote:
> Asking for help with a JDBC related issue.
> Environment: Linux 7.9 PG 14.9 , very busy PG Server.
> 
> A big query - 3 unions and about 10 joins runs :
> - 70ms on psql , DBeaver with JDBC 42  and  in our Server using old JDBC 9.2
> - 2500 ms in our Server using new JDBC 42 driver. ( and  this is running many 
> times) 
> 
> Question: Is there a structured way to identify optimization setup ( Planner 
> Method s ) changes?
> Are there any known changes specific to JDBC 42. 

What I would do is enable auto_explain and look at the execution plan
when the statement is run by the JDBC driver.  Then you can compare the
execution plans and spot the difference.
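A sketch for a test session (for the application itself you would normally load
the module via "session_preload_libraries" or "shared_preload_libraries" instead;
LOAD and these settings require superuser privileges):

  LOAD 'auto_explain';
  SET auto_explain.log_min_duration = 0;   -- log the plan of every statement
  SET auto_explain.log_analyze = on;       -- include actual timings and row counts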

Yours,
Laurenz Albe




Re: GIN JSONB path index is not always used

2023-10-17 Thread Laurenz Albe
On Tue, 2023-10-17 at 15:48 +0200, Tomasz Szymański wrote:
> Hello! We have an issue with database planner choosing really expensive 
> sequence scan instead of an index scan in some cases.

To analyze that, we'd need the output from EXPLAIN (ANALYZE, BUFFERS) SELECT ...
Plain text format please, no JSON.

Yours,
Laurenz Albe




Re: Dirty reads on index scan,

2023-09-22 Thread Laurenz Albe
On Fri, 2023-09-22 at 10:35 +0200, Koen De Groote wrote:
> On Thu, Sep 21, 2023 at 9:30 PM Laurenz Albe  wrote:
> > On Thu, 2023-09-21 at 17:05 +0200, Koen De Groote wrote:
> > > I'm doing the following query:
> > > select * from my_table where hasbeenchecked = true and hasbeenverified = 
> > > true and insert_timestamp <= '2023-09-01 00:00:00.000' limit 1000;
> > > 
> > > The date is an example, it is the format that is used in the query.
> > > 
> > > The table has 81M rows. Is 50GB in size. And the index is 34MB
> > > 
> > > The index is as follows:
> > > btree (insert_timestamp DESC) WHERE hasbeenchecked = true AND 
> > > hasbeenverified = true
> > > 
> > > I'm seeing a slow query first, then a fast one, and if I move the date, a 
> > > slow query again.
> > > 
> > > What I'm seeing is:
> > > Attempt 1:
> > > Hit: 5171(40MB)
> > > Read: 16571(130MB)
> > > Dirtied: 3940(31MB)
> > > 
> > > Attempt 2:
> > > Hit: 21745 (170MB)
> > > Read: Nothing
> > > Dirtied: Nothing.
> > > 
> > > It's slow once, then consistently fast, and then slow again if I move the 
> > > date around.
> > > And by slow I mean: around 60 seconds. And fast is below 1 second.
> > 
> > That's normal behavior: after the first execution, the data are cached, so 
> > the query
> > becomes much faster.
> > 
> > Dirtying pages happens because the first reader has to set hint bits, which 
> > is an extra
> > chore.  You can avoid that if you VACUUM the table before you query it.
>
> So, if I want to speed up the query, apart from trying to vacuum it 
> beforehand, I suspect
> I've hit the limit of what this query can do?
>
> Because, the table is just going to keep growing. And it's a usually a query 
> that runs one
> time per day, so it's a cold run each time.
>
> Is this just going to get slower and slower and there's nothing that can be 
> done about it?

Essentially yes.

If the table does not have too many columns, or you can be more selective than 
"SELECT *",
you could use an index-only scan with an index like

  CREATE INDEX ON my_table (insert_timestamp)
     INCLUDE (/* all the columns in the SELECT list */)
     WHERE hasbeenchecked AND hasbeenverified;

  VACUUM my_table;

You need to configure autovacuum so that it vacuums the table often enough if 
you want
an efficient index-only scan.

If that is not feasible, you can gain speed by clustering the table.  For that, 
you need
a different index:

  CREATE INDEX cluster_idx ON my_table (hasbeenchecked, hasbeenverified, insert_timestamp);

  CLUSTER my_table USING cluster_idx;  -- attention: rewrites the table

That should speed up the query considerably, because it will have to read way 
fewer pages
from disk.  However, CLUSTER is not without problems.  Look at the 
documentation for the
caveats.

Yours,
Laurenz Albe




Re: Dirty reads on index scan,

2023-09-21 Thread Laurenz Albe
On Thu, 2023-09-21 at 17:05 +0200, Koen De Groote wrote:
> I'm doing the following query:
> select * from my_table where hasbeenchecked = true and hasbeenverified = true 
> and insert_timestamp <= '2023-09-01 00:00:00.000' limit 1000;
> 
> The date is an example, it is the format that is used in the query.
> 
> The table has 81M rows. Is 50GB in size. And the index is 34MB
> 
> The index is as follows:
> btree (insert_timestamp DESC) WHERE hasbeenchecked = true AND hasbeenverified 
> = true
> 
> I'm seeing a slow query first, then a fast one, and if I move the date, a 
> slow query again.
> 
> What I'm seeing is:
> Attempt 1:
> Hit: 5171(40MB)
> Read: 16571(130MB)
> Dirtied: 3940(31MB)
> 
> Attempt 2:
> Hit: 21745 (170MB)
> Read: Nothing
> Dirtied: Nothing.
> 
> It's slow once, then consistently fast, and then slow again if I move the 
> date around.
> And by slow I mean: around 60 seconds. And fast is below 1 second.

That's normal behavior: after the first execution, the data are cached, so the 
query
becomes much faster.

Dirtying pages happens because the first reader has to set hint bits, which is 
an extra
chore.  You can avoid that if you VACUUM the table before you query it.

Yours,
Laurenz Albe




Re: Fwd: Planning time is time-consuming

2023-09-11 Thread Laurenz Albe
On Mon, 2023-09-11 at 12:57 +0800, Mikhail Balayan wrote:
> Thanks for the idea. I was surprised to find that this is not the way it 
> works and the planning time remains the same.

To benefit from the speed gains of a prepared statement, you'd have to execute 
it
at least seven times.  If a generic plan is used (which should happen), you will
see $1 instead of the literal argument in the execution plan.

Prepared statements are probably your best bet.
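For illustration, a simplified sketch (a single-table stand-in for your join; the
real statement would be prepared the same way):

  PREPARE bench_q(text) AS
     SELECT id, tenant_id
       FROM test_db_bench_1
      WHERE tenant_id = $1
      LIMIT 1;

  -- run it repeatedly; once a generic plan is chosen, EXPLAIN shows $1
  EXPLAIN EXECUTE bench_q('4c79c1c5-21ae-45a0-8734-75d67abd0330');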

Yours,
Laurenz Albe




Re: Planning time is time-consuming

2023-09-10 Thread Laurenz Albe
On Fri, 2023-09-08 at 18:51 +0800, Mikhail Balayan wrote:
> I have three tables:
>     - test_db_bench_1
>     - test_db_bench_tenants
>     - test_db_bench_tenant_closure
> 
> And the query to join them:
> SELECT "test_db_bench_1"."id" id, "test_db_bench_1"."tenant_id"
>   FROM "test_db_bench_1"
>   JOIN "test_db_bench_tenants" AS "tenants_child" ON (("tenants_child"."uuid" 
> = "test_db_bench_1"."tenant_id") 
>                                                  AND 
> ("tenants_child"."is_deleted" != true))
>   JOIN "test_db_bench_tenant_closure" AS "tenants_closure" ON 
> (("tenants_closure"."child_id" = "tenants_child"."id")
>                                                           AND 
> ("tenants_closure"."barrier" <= 0))
>   JOIN "test_db_bench_tenants" AS "tenants_parent" ON (("tenants_parent"."id" 
> = "tenants_closure"."parent_id")
>                                                   AND 
> ("tenants_parent"."uuid" IN ('4c79c1c5-21ae-45a0-8734-75d67abd0330'))
>                                                   AND 
> ("tenants_parent"."is_deleted" != true))
>  LIMIT 1
> 
> 
> With following execution plan:
> 
>                                                     QUERY PLAN
> ----------------------------------------------------------------------------------------------------
>  Limit  (cost=1.56..1.92 rows=1 width=44) (actual time=0.010..0.011 rows=0 
> loops=1)
>    ->  Nested Loop  (cost=1.56..162.42 rows=438 width=44) (actual 
> time=0.009..0.009 rows=0 loops=1)
>          ->  Nested Loop  (cost=1.13..50.27 rows=7 width=36) (actual 
> time=0.008..0.009 rows=0 loops=1)
>                ->  Nested Loop  (cost=0.84..48.09 rows=7 width=8) (actual 
> time=0.008..0.009 rows=0 loops=1)
>                      ->  Index Scan using test_db_bench_tenants_uuid on 
> test_db_bench_tenants tenants_parent  (cost=0.41..2.63 rows=1 width=8) 
> (actual time=0.008..0.008 rows=0 loops=1)
>                            Index Cond: ((uuid)::text = 
> '4c79c1c5-21ae-45a0-8734-75d67abd0330'::text)
>                            Filter: (NOT is_deleted)
>                      ->  Index Scan using test_db_bench_tenant_closure_pkey 
> on test_db_bench_tenant_closure tenants_closure  (cost=0.42..45.06 rows=40 
> width=16) (never executed)
>                            Index Cond: (parent_id = tenants_parent.id)
>                            Filter: (barrier <= 0)
>                ->  Index Scan using test_db_bench_tenants_pkey on 
> test_db_bench_tenants tenants_child  (cost=0.29..0.31 rows=1 width=44) (never 
> executed)
>                      Index Cond: (id = tenants_closure.child_id)
>                      Filter: (NOT is_deleted)
>          ->  Index Scan using test_db_bench_1_idx_tenant_id_3 on 
> acronis_db_bench_heavy  (cost=0.43..14.66 rows=136 width=44) (never executed)
>                Index Cond: ((tenant_id)::text = (tenants_child.uuid)::text)
>  Planning Time: 0.732 ms
>  Execution Time: 0.039 ms
> 
> 
> Where the planning time gets in the way as it takes an order of magnitude 
> more time than the actual execution.
> 
> Is there a possibility to reduce this time? And, in general, to understand 
> why planning takes so much time.

You could try to VACUUM the involved tables; indexes with many entries pointing 
to dead tuples
can cause a long planning time.

Also, there are quite a lot of indexes on "test_db_bench_1".  On a test 
database, drop some
indexes and see if that makes a difference.

Finally, check if "default_statistics_target" is set to a high value, or if the 
"Stats target"
for some column in the "\d+ tablename" output is set higher than 100.
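For example (the column name is only an illustration; SET STATISTICS -1 reverts a
per-column setting to the default):

  SHOW default_statistics_target;

  -- reset an inflated per-column target, then refresh the statistics
  ALTER TABLE test_db_bench_1 ALTER COLUMN tenant_id SET STATISTICS -1;
  ANALYZE test_db_bench_1;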

Yours,
Laurenz Albe




Re: Table copy with SERIALIZABLE is incredibly slow

2023-07-30 Thread Laurenz Albe
On Sun, 2023-07-30 at 23:00 -0600, peter plachta wrote:
> Background is we're trying a pg_repack-like functionality to compact a 
> 500Gb/145Gb
> index (x2) table from which we deleted 80% rows. Offline is not an option. 
> The table
> has a moderate (let's say 100QPS) I/D workload running.
> 
> The typical procedure for this type of thing is basically CDC:
> 
> 1. create 'log' table/create trigger
> 2. under SERIALIZABLE: select * from current_table insert into new_table
> 
> What we're finding is that for the 1st 30 mins the rate is 10Gb/s, then it 
> drops to
> 1Mb/s and stays there and 22 hours later the copy is still going and now 
> the log
> table is huge so we know the replay will also take a very long time.
> 
> ===
> 
> Q: what are some ways in which we could optimize the copy?
> 
> Btw this is Postgres 9.6
> 
> (we tried unlogged table (that did nothing), we tried creating indexes after
> (that helped), we're experimenting with RRI)

Why are you doing this the hard way, when pg_squeeze or pg_repack could do it?

You definitely should not be using PostgreSQL 9.6 at this time.

Yours,
Laurenz Albe




Re: TOAST Fields serialisation/deserialization performance

2023-07-26 Thread Laurenz Albe
On Wed, 2023-07-26 at 18:15 +0530, Piyush Katariya wrote:
> I have a few queries regarding the TOAST Fields serialisation/deserialization 
> performance.
> 
> The use case i am trying to solve here is to have millions of partitions and 
> aggregate the data in array field.
> 
> I wish to know if i declare certain column in table as "array of UDT/JSONB" 
> and enable
> either lz4 or zstd compression on it, does appending or prepending to that 
> array or even
> changing the intermediate fields of UDT/JSONB objects. in that array has a 
> runtime cost
> of full array data de-serialization every single time. If i perform any 
> UPDATE operation
> on its elements or add/remove new elements from any position, does PG 
> rewrites the new
> version of the column value regardless of its size.

Updating even a small part of a large JSONB value requires that the entire 
thing is
read and written, causing a lot of data churn.

This is inefficient, and you shouldn't use large JSONB values if you plan to do 
that.

If the data have a regular structure, use a regular relational data model.
Otherwise, one idea might be to split the JSONB in several parts and store each
of those parts in a different table row.  That would reduce the impact.
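A rough sketch of that idea, with made-up names (each part of the document lives
in its own row, so an update only rewrites that comparatively small row):

  CREATE TABLE doc_part (
     doc_id   bigint NOT NULL,
     part_key text   NOT NULL,
     part     jsonb  NOT NULL,
     PRIMARY KEY (doc_id, part_key)
  );

  -- touching one part leaves the other, larger parts alone
  UPDATE doc_part
     SET part = jsonb_set(part, '{counter}', '42')
   WHERE doc_id = 1 AND part_key = 'stats';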

Yours,
Laurenz Albe




Re: Index on (fixed size) bytea value

2023-06-19 Thread Laurenz Albe
On Tue, 2023-06-20 at 08:13 +0200, Les wrote:
> I'm aware of the TOAST, and how it works. I was referring to it ("I think 
> that it should
> be as large as possible, without hitting the toast. ") I have designed a 
> separate "block"
> table specifically to avoid storing binary data in the TOAST. So my plan is 
> not going to
> involve out-of-band storage.
> 
> Just to make this very clear: a record in the block table would store a 
> block, not the
> whole file. My question is to finding the optimal block size (without hitting 
> the toast),
> and finding the optimal hash algorithm for block de-duplication.

Then you would ALTER the column and SET STORAGE MAIN, so that it does not ever 
use TOAST.

The size limit for a row would then be 8kB minus page header minus row header, 
which
should be somewhere in the vicinity of 8140 bytes.

If you want your block size to be a power of two, the limit would be 4kB, which 
would waste
almost half your storage space.
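A sketch with made-up table and column names:

  CREATE TABLE block (
     file_id bigint  NOT NULL,
     seq     integer NOT NULL,
     data    bytea   NOT NULL,
     PRIMARY KEY (file_id, seq)
  );

  -- keep the column in the main heap page, compressed if necessary,
  -- rather than moving it out of line into the TOAST table
  ALTER TABLE block ALTER COLUMN data SET STORAGE MAIN;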

Yours,
Laurenz Albe




Re: thousands of CachedPlan entry per backend

2023-06-01 Thread Laurenz Albe
On Thu, 2023-06-01 at 08:50 +, James Pang (chaolpan) wrote:
> we found thousands of cached plan , since JDBC driver only allow max 256 
> cached
> prepared statements, how backend cache so many sql plans. If we have one 
> function,
> when application call that function will make backend to cache  every SQL 
> statement
> plan in that function too?   and for table triggers, have similar caching 
> behavior ?

Yes, as long as the functions are written in PL/pgSQL.
It only affects static SQL, that is, nothing that is run with EXECUTE.
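A minimal sketch with a made-up table: the first query gets a cached plan in the
backend, the second one is planned on every call:

  CREATE FUNCTION get_cnt(p_id bigint) RETURNS bigint
  LANGUAGE plpgsql AS
  $$DECLARE
     result bigint;
  BEGIN
     -- static SQL: the plan is cached per backend
     SELECT count(*) INTO result FROM mytab WHERE id = p_id;

     -- dynamic SQL: planned anew on each execution, nothing is cached
     EXECUTE 'SELECT count(*) FROM mytab WHERE id = $1' INTO result USING p_id;

     RETURN result;
  END;$$;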

Yours,
Laurenz Albe




Re: thousands of CachedPlan entry per backend

2023-05-31 Thread Laurenz Albe
On Thu, 2023-06-01 at 03:36 +, James Pang (chaolpan) wrote:
> PG V14.8-1 , client using Postgresql JDBC driver we found 40MB process memory 
> per
> backend, from Operating system and memorycontext dump “Grand total:”, both 
> mached.
> But from details, we found almost of entry belong to  “CacheMemoryContext”,
> from this line  CacheMemoryContext: 8737352 total in 42 blocks; 1021944 free 
> (215 chunks); 7715408 used,
> but there are thousands of lines of it’s child, the sum of blocks much more 
> than “8737352” total in 42 blocks
> 
> Our application use Postgresql JDBC driver with default 
> parameters(maxprepared statement 256),
> there are many triggers, functions in this database, and a few functions run 
> sql by an extension
> pg_background.  We have thousands of connections and have big concern why 
> have thousands of entrys
> of cached SQL ?  that will consume huge memory ,  anyway to limit the cached 
> plan entry to save memory
> consumption?  Or it looks like an abnormal behavior or bug to see so many 
> cached plan lines.

If you have thousands of connections, that's your problem.  You need effective 
connection pooling.
Then 40MB per backend won't be a problem at all.  Having thousands of 
connections will cause
other, worse, problems for you.

See for example 
https://www.cybertec-postgresql.com/en/tuning-max_connections-in-postgresql/

If you want to use functions, but don't want to benefit from plan caching, you 
can set
the configuration parameter "plan_cache_mode" to "force_custom_plan".
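For example, without touching the application (role and database names are
placeholders):

  -- for one session:
  SET plan_cache_mode = force_custom_plan;

  -- persistently, for everything connecting as a role or to a database:
  ALTER ROLE app_user SET plan_cache_mode = force_custom_plan;
  ALTER DATABASE app_db SET plan_cache_mode = force_custom_plan;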

Yours,
Laurenz Albe




Re: Performance issues in query with multiple joins

2023-04-30 Thread Laurenz Albe
On Fri, 2023-04-28 at 15:19 +0300, Παρασκευη Πασσαρη wrote:
> We are facing a performance issue with the following query. Executing this 
> query takes about 20 seconds.
> (the database version is 14.1)

The execution plan seems to be incomplete.

Yours,
Laurenz Albe




Re: multicolumn partitioning help

2023-03-15 Thread Laurenz Albe
On Tue, 2023-03-14 at 19:33 -0400, James Robertson wrote:
> Laurenz your solution is how I thought I would work around my (lack of) 
> understanding
> of partitioning. (nested partitions).
> I was hesitant because I didn't know what sort of performance problems I 
> would create for myself.
> 
> [...] more performance [...]

If you are thinking of subpartitioning primarily in terms of boosting 
performance,
you should know that you only get performance benefits from partitioning with
very special queries that effectively have to be designed together with the
partitioning strategy.  Other statements typically become somewhat slower
through partitioning.

So it is really impossible to discuss performance benefits without knowing
the exact query.  It may be best if you build a play database with realistic 
amounts
of test data and use EXPLAIN and EXPLAIN (ANALYZE) to see the effects that
partitioning has on your queries.

Yours,
Laurenz Albe




Re: multicolumn partitioning help

2023-03-14 Thread Laurenz Albe
On Sun, 2023-03-12 at 13:59 -0400, James Robertson wrote:
> I am having issues with multicolumn partitioning. For reference I am using 
> the following link as my guide:
> https://www.postgresql.org/docs/devel/sql-createtable.html
> 
> To demonstrate my problem, I created a simple table called humans. I want to 
> partition by the year
> of the human birth and then the first character of the hash. So for each year 
> I'll have year*16 partitions. (hex)
> 
> CREATE TABLE humans (
>     hash bytea,
>     fname text,
>     dob date
>     )PARTITION BY RANGE (EXTRACT(YEAR FROM dob),substring(hash::text, 1, 1));
>     
> Reading the documentation:   "When creating a range partition, the lower 
> bound specified with
> FROM is an inclusive bound, whereas the upper bound specified with TO is an 
> exclusive bound".
> 
> However I can't insert any of the following after the first one, because it 
> says it overlaps.
> Do I need to do anything different when defining multi-column partitions?
> 
> 
> This works:
> CREATE TABLE humans_1968_0 PARTITION OF humans FOR VALUES FROM (1968, '0') TO 
> (1969, '1');
> 
> 
> These fail: 
> CREATE TABLE humans_1968_1 PARTITION OF humans FOR VALUES FROM (1968, '1') TO 
> (1969, '2');

Justin has explained what the problem is, let me supply a solution.

I think you want subpartitioning, like

  CREATE TABLE humans (
     hash bytea,
     fname text,
     dob date
  ) PARTITION BY LIST (EXTRACT (YEAR FROM dob));

  CREATE TABLE humans_2002
     PARTITION OF humans FOR VALUES IN (2002)
     PARTITION BY HASH (hash);

  CREATE TABLE humans_2002_0
     PARTITION OF humans_2002 FOR VALUES WITH (MODULUS 26, REMAINDER 0);

  [...]

  CREATE TABLE humans_2002_25
     PARTITION OF humans_2002 FOR VALUES WITH (MODULUS 26, REMAINDER 25);

and so on for the other years.

Yours,
Laurenz Albe




Re: Performance of UPDATE operation

2023-02-13 Thread Laurenz Albe
On Mon, 2023-02-13 at 16:09 +0100, Mkrtchyan, Tigran wrote:
> Typically we expect that UPDATE is a slow operation in PostgreSQL, however,
> there are cases where it's hard to understand why. In particular, I have a 
> table like
> 
> ```
> CREATE SEQUENCE t_inodes_inumber_seq
>     START WITH 1
>     INCREMENT BY 1
>     NO MINVALUE
>     NO MAXVALUE
>     CACHE 1;
> 
> 
> CREATE TABLE t_inodes (
>     inumber bigint PRIMARY KEY,
>     icrtime timestamp with time zone NOT NULL,
>     igeneration bigint NOT NULL
> );
> ```
> 
> and a transaction that inserts and update an entry in that table:
> 
> ```
> BEGIN;
> INSERT INTO t_inodes (inumber, icrtime, igeneration)
>    VALUES (nextval('t_inodes_inumber_seq'), now(), 0) RETURNING inumber \gset
> 
> UPDATE t_inodes SET igeneration = igeneration + 1 where  inumber = :inumber;
> END;
> ```
> 
> The pgbench shows the following result:
> 
> ```
> $ pgbench -h localhost -n -r  -f update.sql -t 1 -c 64 -j 64 testdb
> pgbench (15.0 (Debian 15.0-1.pgdg110+1))
> transaction type: update.sql
> scaling factor: 1
> query mode: simple
> number of clients: 64
> number of threads: 64
> maximum number of tries: 1
> number of transactions per client: 1
> number of transactions actually processed: 64/64
> number of failed transactions: 0 (0.000%)
> latency average = 11.559 ms
> initial connection time = 86.038 ms
> tps = 5536.736898 (without initial connection time)
> statement latencies in milliseconds and failures:
>  0.524   0  BEGIN;
>  0.819   0  INSERT INTO t_inodes (inumber, icrtime, 
> igeneration)
>  0.962   0  UPDATE t_inodes SET igeneration = igeneration + 1 
> where  inumber = :inumber;
>  9.203   0  END;
> ```
> 
> My naive expectation will be that updating the newly inserted record should 
> cost nothing... Are there ways
> to make it less expensive?

Updating a newly inserted row is about as expensive as inserting the row in the 
first place.

You can reduce the overall impact somewhat by creating the table with a 
"fillfactor" below
100, in your case 90 would probably be enough.  That won't speed up the UPDATE 
itself, but
it should greatly reduce the need for VACUUM.
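For example, reusing your DDL:

  CREATE TABLE t_inodes (
      inumber bigint PRIMARY KEY,
      icrtime timestamp with time zone NOT NULL,
      igeneration bigint NOT NULL
  ) WITH (fillfactor = 90);

  -- for an already existing table (only affects newly written pages):
  ALTER TABLE t_inodes SET (fillfactor = 90);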

Yours,
Laurenz Albe




Re: max_wal_senders

2023-02-08 Thread Laurenz Albe
On Wed, 2023-02-08 at 18:07 -0500, Rick Otten wrote:
> I've been thinking about the max_wal_senders parameter lately and wondering 
> if there
> is any harm in setting it too high.

No, there isn't, except that if you end up having too many *actual* WAL 
senders, it
will cause load.  A high limit is no problem as such.

> The documentation mentions an orphaned connection slot that may take a while 
> to time out.
> How can I tell if I have any of those?  I was looking for a `pg_wal_slots` 
> table
> similar to the `pg_replication_slots` table, but don't see anything obvious 
> in the catalog.

The view is "pg_stat_replication", but you won't see there if an entry is
abandoned before PostgreSQL does and terminates it.  You can set 
"tcp_keepalived_idle"
low enough so that the kernel will detect broken connections early on.
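For example (the numbers are only an illustration; the keepalive settings are in
seconds):

  SELECT pid, application_name, client_addr, state, sync_state
    FROM pg_stat_replication;

  ALTER SYSTEM SET tcp_keepalives_idle = 60;
  ALTER SYSTEM SET tcp_keepalives_interval = 10;
  ALTER SYSTEM SET tcp_keepalives_count = 3;
  SELECT pg_reload_conf();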

Yours,
Laurenz Albe




Re: Domain check taking place unnecessarily?

2023-02-08 Thread Laurenz Albe
On Wed, 2023-02-08 at 18:01 +, Mark Hills wrote:
> I've ruled out waiting on a lock; nothing is reported with 
> log_lock_waits=on. This is a test database with exclusive access (2.5 
> million rows):
> 
> This is PostgreSQL 14.5 on Alpine Linux. Thanks.
> 
> CREATE DOMAIN hash AS text
>     CHECK (VALUE ~ E'^[a-zA-Z0-9]{8,32}$');
>  
> devstats=> ALTER TABLE invite ADD COLUMN test text;
> ALTER TABLE
> Time: 8.988 ms
>  
> devstats=> ALTER TABLE invite ADD COLUMN test hash;
> ALTER TABLE
> Time: 30923.380 ms (00:30.923)
>  
> devstats=> ALTER TABLE invite ADD COLUMN test hash DEFAULT NULL;
> ALTER TABLE
> Time: 30344.272 ms (00:30.344)
>  
> devstats=> ALTER TABLE invite ADD COLUMN test hash DEFAULT '123abc123'::hash;
> ALTER TABLE
> Time: 67439.232 ms (01:07.439)

It takes 30 seconds to scan the table and determine that all existing rows
satisfy the constraint.

The last example is slower, because there is actually a non-NULL value to check.

If that were not a domain, but a normal check constraint, you could first add
the constraint as NOT VALID and later run ALTER TABLE ... VALIDATE CONSTRAINT 
...,
which takes a while too, but does not lock the table quite that much.
But I don't think there is a way to do that with a domain.
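For comparison, the check-constraint route would look like this (reusing the
regular expression from your domain; the constraint name is made up):

  ALTER TABLE invite ADD COLUMN test text;

  ALTER TABLE invite
     ADD CONSTRAINT test_is_hash
     CHECK (test ~ E'^[a-zA-Z0-9]{8,32}$') NOT VALID;   -- no full-table scan yet

  ALTER TABLE invite VALIDATE CONSTRAINT test_is_hash;  -- scans, but with a weaker lock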

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com




Re: time sorted UUIDs

2022-12-15 Thread Laurenz Albe
On Thu, 2022-12-15 at 10:56 +1300, Tim Jones wrote:
> could someone please comment on this article 
> https://vladmihalcea.com/uuid-database-primary-key/
> specifically re the comments (copied below) in regards to a Postgres database.
> 
> ...
> But, using a random UUID as a database table Primary Key is a bad idea for 
> multiple reasons.
> First, the UUID is huge. Every single record will need 16 bytes for the 
> database identifier,
> and this impacts all associated Foreign Key columns as well.
> Second, the Primary Key column usually has an associated B+Tree index to 
> speed up lookups or
> joins, and B+Tree indexes store data in sorted order.
> However, indexing random values using B+Tree causes a lot of problems:
>  * Index pages will have a very low fill factor because the values come 
> randomly. So, a page
>of 8kB will end up storing just a few elements, therefore wasting a lot of 
> space, both
>on the disk and in the database memory, as index pages could be cached in 
> the Buffer Pool.
>  * Because the B+Tree index needs to rebalance itself in order to maintain 
> its equidistant
>tree structure, the random key values will cause more index page splits 
> and merges as
>there is no pre-determined order of filling the tree structure.

I'd say that is quite accurate.

Yours,
Laurenz Albe




Re: Milions of views - performance, stability

2022-09-16 Thread Laurenz Albe
On Sat, 2022-09-17 at 01:05 +0200, Hubert Rutkowski wrote:
> Hello! I have written python program to benchmark view efficiency, because in 
> our platform
> they have a role to play and we noticed the performance is less than expected.

If your platform plans to use millions of views, you should revise your design. 
 As you
see, that is not going to fly.  And no, I don't consider that a bug.

> Basically, benchmark creates table:
> 
> CREATE TABLE IF NOT EXISTS foobar ( id int, text varchar(40) );
> 
> for i in range(1200300):
> INSERT INTO foobar (id, text) VALUES ({i}, 'some string');
> CREATE VIEW foobar_{i} as select * from foobar where id={i};
> 
> Couldn't be any simpler. 
> [general slowness]
> 
> What's even stranger is dropping performance: DROP TABLE foobar CASCADE;. 
> First of all, had to
> increase locks to allow it to finish, otherwise it was quickly bailing 
> because of "too little shared memory".
> alter system set  max_locks_per_transaction=4;
> 
> But even after that, it took almost 7 hours and crashed:
> 
> 2022-09-13 23:16:31.113 UTC [1] LOG:  server process (PID 404) was terminated 
> by signal 9: Killed
> 
> After updating Postgres to 14.5, it crashed in a bit different way:
> 
> 2022-09-15 19:20:26.000 UTC [67] LOG:  checkpoints are occurring too 
> frequently (23 seconds apart)
> 2022-09-15 19:20:26.000 UTC [67] HINT:  Consider increasing the configuration 
> parameter "max_wal_size".
> 2022-09-15 19:20:39.058 UTC [1] LOG:  server process (PID 223) was terminated 
> by signal 9: Killed
> 2022-09-15 19:20:39.058 UTC [1] DETAIL:  Failed process was running: drop 
> table foobar cascade;
> 
> Wihout the views, table can be dropped in 20ms. 

You misconfigured your operating system and didn't disable memory overcommit, 
so you got killed
by the OOM killer.  Basically, the operation ran out of memory.

Yours,
Laurenz Albe




Re: Select on partitioned table is very slow

2022-08-25 Thread Laurenz Albe
On Thu, 2022-08-25 at 11:10 +0200, hubert depesz lubaczewski wrote:
> Hi,
> 
> On Thu, Aug 25, 2022 at 10:49:51AM +0200, Jose Osinde wrote:
> > select logical_identifier, version_id, lastproduct
> >    from test_product_ui_partition.product_ui pui
> >     where pui.mission_id='urn:esa:psa:context:investigation:mission.em16'
> >   and 
> > pui.logical_identifier='urn:esa:psa:em16_tgo_frd:data_raw:frd_raw_sc_n_20220729t00-20220729t235959'
> 
> > EXPLAIN ANALYZE FROM PGADMIN
> > 
> > Index Scan using product_ui_em16_logical_identifier_idx on
> > product_ui_em16 pui  (cost=0.69..19.75 rows=7 width=112) (actual
> > time=0.133..0.134 rows=1 loops=1)
> > [...]  Index Cond: (logical_identifier =
> > 'urn:esa:psa:em16_tgo_frd:data_raw:frd_raw_sc_n_20220729t00-20220729t235959'::citext)"
> > [...]  Filter: (mission_id =
> > 'urn:esa:psa:context:investigation:mission.em16'::citext)"
> > Planning Time: 0.237 ms
> > Execution Time: 0.149 ms
> 
> I really wish you didn't butcher explains like this, but we can work
> with it.
> 
> Please note that the condition for filter is:
> 
> mission_id = 'urn:esa:psa:context:investigation:mission.em16'::citext
> 
> Specifically, column mission_id (which is partition key) is compared
> with some value that is in citext type - same as column.
> This means that pg can take this value, compare with partitioning
> schema, and pick one partition.
> 
> Now look at the explain from java:
> 
> >   Filter: (((mission_id)::text =
> > 'urn:esa:psa:context:investigation:mission.em16'::text) AND
> 
> The rest is irrelevant.
> 
> The important part is that java sent query that doesn't compare value of
> column mission_id with some value, but rather compares *cast* of the
> column.
> 
> Since it's not column value, then partitioning can't check what's going
> on (cast can just as well make it totally different value), and it also
> can't really use index on mission_id.
> 
> Why it happens - no idea, sorry, I don't grok java.
> 
> But you should be able to test/work on fix with simple, non-partitioned
> table, just make there citext column, and try searching for value in it,
> and check explain from the search. If it will cast column - it's no
> good.
> 
> Sorry I can't tell you what to fix, but perhaps this will be enough for
> you to find solution.

Quite so.

You are probably using a prepared statement in JDBC.

You probably have to use explicit type casts, like:

select logical_identifier, version_id, lastproduct 
   from test_product_ui_partition.product_ui pui 
where pui.mission_id = ? :: citext
  and pui.logical_identifier = ? :: citext

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com




Re: Oracle_FDW table performance issue

2022-07-11 Thread Laurenz Albe
On Mon, 2022-07-11 at 17:38 +0530, aditya desai wrote:
> I have one Oracle fdw table which is giving performance issue when joined
> local temp table gives performance issue.
> 
> select * from oracle_fdw_table where transaction_id in ( select 
> transaction_id from temp_table)
>   54 seconds. Seeing HASH SEMI JOIN  in EXPLAIN PLAN. temp_table has only 
> 74 records.
> 
> select * from from oracle_fdw_table where transaction_id in ( 
> 1,2,3,.,75)--- 23ms.
> 
> Could you please help me understand this drastic behaviour change?

The first query joins a local table with a remote Oracle table.  The only way 
for
such a join to avoid fetching the whole Oracle table would be to have the 
foreign scan
on the inner side of a nested loop join.  But that would incur many round trips 
to Oracle
and is therefore perhaps not a great plan either.

In the second case, the whole IN list is shipped to the remote side.

In short, the queries are quite different, and I don't think it is possible to 
get
the first query to perform as well as the second.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com




Re: Need help on Query Tunning and Not using the Index Scan

2022-05-20 Thread Laurenz Albe
On Fri, 2022-05-20 at 07:37 +, Kumar, Mukesh wrote:
> We are facing an issue in running the query which takes at least 30 sec to 
> run in PostgreSQL.
>  
> We have tried to create the indexes and done the maintenance and still that 
> query is taking same time.
>  
> Below are the explain plan for the query.
>  
> https://explain.depesz.com/s/sPo2#html
>  
> We have noticed that maximum time it is takin is do a Seq Scan on Table 
> ps_delay_statistic which consist of approx. 35344812 records .
>  
> Can anyone please help on the above issue.

The problem is probably here:

->  GroupAggregate  (cost=0.57..18153.25 rows=2052 width=23) (actual 
time=13.764..13.765 rows=1 loops=1)
  Group Key: ds_1.fleet_object_number_f
  ->  Index Scan using ndx_delay_stat_equipment on ps_delay_statistic ds_1  
(cost=0.57..18050.67 rows=16412 width=23) (actual time=0.026..10.991 rows=18180 
loops=1)
Index Cond: (fleet_object_number_f = 
(COALESCE(NULLIF('40001000277313'::text, ''::text)))::numeric)
Filter: (activity_code_f IS NOT NULL)

which comes from this subquery:

SELECT max(dp1.daily_production_id) prodId
 FROM ps_daily_production_v dp1
WHERE dp1.fleet_object_number = cast(coalesce(nullif 
(cast(40001000277313 AS varchar), ''), NULL) AS numeric)
  AND dp1.activity_code IS NOT NULL
GROUP BY dp1.fleet_object_number

Remove the superfluous GROUP BY clause that confuses the optimizer.
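That is, the subquery becomes simply:

  SELECT max(dp1.daily_production_id) AS prodid
    FROM ps_daily_production_v dp1
   WHERE dp1.fleet_object_number = cast(coalesce(nullif(cast(40001000277313 AS varchar), ''), NULL) AS numeric)
     AND dp1.activity_code IS NOT NULL;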

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com




Re: Selecting RAM and CPU based on max_connections

2022-05-20 Thread Laurenz Albe
On Fri, 2022-05-20 at 12:15 +0200, Andreas Kretschmer wrote:
> On 20 May 2022 10:27:50 CEST, aditya desai  wrote:
> > One of our applications needs 3000 max_connections to the database.
> > Connection pooler like pgbouncer or pgpool is not certified within the
> > organization yet. So they are looking for setting up high configuration
> > Hardware with CPU and Memory. Can someone advise how much memory and CPU
> > they will need if they want max_conenction value=3000.
> 
> Pgbouncer would be the best solution. CPU: number of concurrent connections.
> RAM: shared_buffer + max_connections * work_mem + maintenance_mem + operating 
> system + ...

Right.  And then hope and pray that a) the database doesn't get overloaded
and b) you don't hit any of the database-internal bottlenecks caused by many
connections.

I also got the feeling that the Linux kernel's memory accounting somehow lags.
I have seen cases where every snapshot of "pg_stat_activity" I took showed
only a few active connections (but each time different ones), but the
amount of allocated memory exceeded what the currently active sessions could
consume.  I may have made a mistake, and I have no reproducer, but I would
be curious to know if there is an explanation for that.
(I am aware that "top" shows shared buffers multiple times).

Yours,
Laurenz Albe




Re: DB connection issue suggestions

2022-05-10 Thread Laurenz Albe
On Wed, 2022-05-11 at 00:59 +0800, Sudhir Guna wrote:
> We have recently upgraded Postgresql 9.4 standalone server to Postgresql 11.2 
> with High Availability (2 servers : Master and Standby).
> 
> While trying to test using ETL applications and reports, we observe that the 
> ETL jobs fails with below error,
> 
> 2022/05/06 16:27:36 - Error occurred while trying to connect to the database
> 2022/05/06 16:27:36 - Error connecting to database: (using class 
> org.postgresql.Driver)
> 2022/05/06 16:27:36 - FATAL: Sorry, too many clients already
> 
> We have increased the max_connections = 1000 in postgresql.conf file.
> 
> It worked ok for a day and later we get the same error message.
> 
> Please help to advise on any additional settings required. The prior 
> Postgresql 9.4 had the default max_connections = 100and the applications 
> worked fine.

Some application that uses the database has a connection leak: it opens new 
connections
without closing old ones.  Examine "pg_stat_activity" to find out which 
application is
at fault, and then go and fix that application.
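For example, group the sessions to see who is hoarding connections:

  SELECT datname, usename, application_name, client_addr, state, count(*)
    FROM pg_stat_activity
   GROUP BY datname, usename, application_name, client_addr, state
   ORDER BY count(*) DESC;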

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com




Re: Performance for SQL queries on Azure PostgreSQL PaaS instance

2022-04-13 Thread Laurenz Albe
On Tue, 2022-04-12 at 09:10 +, Kumar, Mukesh wrote:
> We have recently done the migration from Oracle Database Version 12C to Azure
> PostgreSQL PaaS instance version 11.4 and most of the application 
> functionality
> testing has been over and tested successfully 
>  
> However, there is 1 process at application level which is taking approx. 10 
> mins
> in PostgreSQL and in oracle it is taking only 3 mins.
>  
> After investigating further we identified that process which is executed from
> application end contains 500 to 600 no of short SQL queries into the database.
> We tried to run the few queries individually on database and they are taking
> less than sec in Postgres Database to execute, and we noticed that in Oracle
> taking half of the time as is taking in PostgreSQL. for ex . in oracle same
> select statement is taking 300 millisecond and in PostgreSQL it is taking
> approx. 600 millisecond which over increases the execution of the process.
>  
> Oracle Database are hosted on ON- Prem DC with dedicated application server on
> OnPrem and same for PostgreSQL.

How can a database hosted with Microsoft be on your premises?

Apart from all other things, compare the network latency.  If a single request
results in 500 database queries, you will be paying 1000 times the network
latency per request.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: View taking time to show records

2022-03-25 Thread Laurenz Albe
On Fri, 2022-03-25 at 14:07 +, Kumar, Mukesh wrote:

> > [recommendation to fix the estimate]
> >
> > Perhaps it is also enough to blindly disable nested loop joins for the 
> > whole query,
> > rather than doing the right thing and fixing the estimates:
> >
> > BEGIN;
> > SET LOCAL enable_nestloop = off;
> > SELECT ...;
> > COMMIT;
> 
> Thanks for the below suggestion , When I ran the query with the parameter , 
> it is taking only 1 sec.
> 
> So could you please let me know if I can put this parameter to OFF . at 
> database and it will not
> create any issues to queries running in database.

That will very likely cause problems in your database, because sometimes a 
nested loop join
is by far the most efficient way to run a query.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: View taking time to show records

2022-03-25 Thread Laurenz Albe
On Thu, 2022-03-24 at 15:59 +, Kumar, Mukesh wrote:
> We have recently migrated from Oracle to PostgreSQL on version 11.4 on azure 
> postgres PaaS instance.
>  
> There is 1 query which is taking approx. 10 secs in Oracle and when we ran 
> the same query it is taking approx. 1 min
>  
> Can anyone suggest to improve the query as from application end 1 min time is 
> not accepted by client.
>  
> Please find the query and explain analyze report from below link
>  
> https://explain.depesz.com/s/RLJn#stats

I would split the query in two parts: the one from line 3 to line 49 of your 
execution plan,
and the rest.  The problem is the bad estimate of that first part, so execute 
only that, write
the result to a temporary table and ANALYZE that.  Then execute the rest of the 
query using that
temporary table.

Perhaps it is also enough to blindly disable nested loop joins for the whole 
query, rather than
doing the right thing and fixing the estimates:

BEGIN;
SET LOCAL enable_nestloop = off;
SELECT ...;
COMMIT;

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: Terribly slow query with very good plan?

2022-02-04 Thread Laurenz Albe
On Fri, 2022-02-04 at 10:11 +0100, Les wrote:
> EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS)
> 
> select id, title,
>  (exists (select f2.id from media.oo_file f2 where f2.relpath like f.relpath 
> || '%')) as has_file
> from media.oo_folder f where f.parent_id is null
> 
> QUERY PLAN
> ----------------------------------------------------------------------------------------------------
> Index Scan using oo_folder_idx_parent on media.oo_folder f  (cost=0.42..488.02 rows=20 width=26) (actual time=713.419..25414.969 rows=45 loops=1)
>   Output: f.id, f.title, (SubPlan 1)
>   Index Cond: (f.parent_id IS NULL)
>   Buffers: shared hit=7014170
>   SubPlan 1
>     ->  Index Only Scan using oo_file_idx_relpath on media.oo_file f2  (cost=0.55..108499.27 rows=5381 width=0) (actual time=564.756..564.756 rows=0 loops=45)
>           Filter: (f2.relpath ~~ (f.relpath || '%'::text))
>           Rows Removed by Filter: 792025
>           Heap Fetches: 768960
>           Buffers: shared hit=7014130
> Planning Time: 0.361 ms
> Execution Time: 25415.088 ms
> 
> It also returns 45 rows, but in 25 seconds which is unacceptable. 

You should create an index that supports LIKE; for example

CREATE INDEX ON media.oo_file (relpath COLLATE "C");

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: WAL files keep piling up

2021-12-23 Thread Laurenz Albe
On Wed, 2021-12-22 at 20:26 +0100, Zbigniew Kostrzewa wrote:
> Yes, I am aware 9.6 is pretty old, soon I will be replacing it with 11.x. 
> Thanks.

v11 is old as well.  I suggest v14.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: pg_trgm word_similarity query does not use index for input strings longer than 8 characters

2021-12-06 Thread Laurenz Albe
On Tue, 2021-11-30 at 22:38 +0100, pgsql-performa...@jhacker.de wrote:
> ## Setup Information
> Hardware: Intel i5-8250U, 8GB RAM, encrypted SSD, no RAID
> [...]
>
> Configuration:
> The config file was not changed.
> [...]
>
> ## Test Case
> [...]
> CREATE EXTENSION pg_trgm;
> 
> CREATE TABLE song (
>  artist  varchar(20),
>  title   varchar(20)
> );
> 
> INSERT INTO song (artist, title)
> SELECT 'artist','title'
> FROM generate_series(1,1);
> 
> CREATE INDEX artist_trgm ON song USING GIN (artist gin_trgm_ops);
> CREATE INDEX title_trgm ON song USING GIN (title gin_trgm_ops);
> 
> -- Tips from https://wiki.postgresql.org/wiki/Slow_Query_Questions
> ANALYZE;
> VACUUM;
> REINDEX TABLE song;
> 
> \set query '12345678'
> 
> -- This query is slow
> EXPLAIN ANALYZE
> SELECT song.artist, song.title
> FROM song
> WHERE (song.artist %> :'query' OR song.title %> :'query')
> ;
> 
> set enable_seqscan=off;
> 
> -- This query is fast
> EXPLAIN ANALYZE
> SELECT song.artist, song.title
> FROM song
> WHERE (song.artist %> :'query' OR song.title %> :'query')
> ;

The table is quite small; with a bigger table, the test would be more 
meaningful.

Since you have SSDs, you should tune "random_page_cost = 1.1".
This makes the planner prefer index scans, and it leads to the index scan
being chosen in your case.
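
For example (followed by a configuration reload):

  ALTER SYSTEM SET random_page_cost = 1.1;
  SELECT pg_reload_conf();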

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: Lock contention high

2021-10-13 Thread Laurenz Albe
On Thu, 2021-10-14 at 11:33 +0530, Ashkil Dighin wrote:
> Captured the concurrent session with Netsat and pg-stat-actvity. Is the 
> procedure the right way to capture concurrent sesssions in postgresql?
> 
> Select pg_stat_activity 

[some two dozen sessions]

That doesn't look like you would get into trouble just from the
sheer number of sessions, so it must be something else.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: Lock contention high

2021-10-12 Thread Laurenz Albe
On Tue, 2021-10-12 at 13:05 +0530, Ashkil Dighin wrote:
> Perf data for 24vu(TPC-C)
> 
> 
>       18.99%  postgres  postgres            [.] LWLockAcquire
>      7.09%  postgres  postgres            [.] _bt_compare
>      8.66%  postgres  postgres            [.] LWLockRelease
>      2.28%  postgres  postgres            [.] GetSnapshotData
>      2.25%  postgres  postgres            [.] hash_search_with_hash_value
>      2.11%  postgres  postgres            [.] XLogInsertRecord
>      1.98%  postgres  postgres            [.] PinBuffer
> 
> 1.Is there a way to tune the lock contention ?

How many concurrent sesions are you running?

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: Better, consistent instrumentation for postgreSQL using a similar API as Oracle

2021-10-10 Thread Laurenz Albe
On Mon, 2021-10-11 at 00:09 +0200, Franck Pachot wrote:
> And people will dislike it because it mentions Oracle.

I don't think so.
While "Oracle has it" is not a good enough reason for a feature, it
is certainly no counter-indication.

> Or people will dislike it because they think this should be reserved to 
> commercial forks.

That is conceivable, but I think most vendors would prefer to have
that in standard PostgreSQL rather than having to maintain it on
their own.

> Or because it may introduce too much dependency on the OS.

That is possible.  But I think gettimeofday(2) is portable enough.

Yours,
Laurenz Albe





Re: Troubleshooting a long running delete statement

2021-10-06 Thread Laurenz Albe
On Wed, 2021-10-06 at 18:00 +, Dirschel, Steve wrote:
> Are there other statistics in Postgres I may have looked at to methodically 
> come to the conclusion that the problem was the missing index on the parent 
> FK column?

You could use the query from my article to find the missing indexes:
https://www.cybertec-postgresql.com/en/index-your-foreign-key/

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: Better, consistent instrumentation for postgreSQL using a similar API as Oracle

2021-10-05 Thread Laurenz Albe
On Mon, 2021-10-04 at 21:51 -0400, Mladen Gogala wrote:
> 
> On 10/4/21 02:34, Laurenz Albe wrote:
> > On Fri, 2021-10-01 at 15:06 -0500, Jeff Holt wrote:
> > > TLDR; If I spend the time necessary to instrument the many functions that 
> > > are the equivalent
> > > of the Oracle counterparts, would anyone pull those changes and use them?
> > > Specifically, for those who know Oracle, I'm talking about implementing:
> > > 1. The portion of the ALTER SESSION that enables extended SQL trace
> > > 2. Most of the DBMS_MONITOR and DBMS_APPLICATION_INFO packages
> > > 3. Instrument the thousand or so functions that are the equivalent of 
> > > those found in Oracle's V$EVENT_NAME
> > > 4. Dynamic performance view V$DIAG_INFO
> >
> > Anything that improves user experience in that respect is welcome, but 
> > consider
> > that each database has different approaches to solve the same problems.
> > 
> > Before you go to the length of implementing a lot of stuff, check in with
> > the -hackers list and discuss your ideas.
> > 
> > Please be a lot more specific than in this e-mail.  While it is certainly
> > fine to sketch your ambitious vision, focus on one specific thing you can
> > imagine implementing and come up with a design for that.
> > 
> > Note that "Oracle has it" is not a good enough reason for a PostgreSQL
> > feature.  We think we can do better than they do (at least in many 
> > respects).
> > Also, don't assume that everyone on the -hackers list will be familiar with
> > certain PostgreSQL features.
> > 
> > One thing that you should keep in mind is that Oracle has to provide
> > different
> > features in that area because they are not open source.  In PostgreSQL, I 
> > can
> > simply read the code or attach a debugger to a backend, and when it comes to
> > profiling, "perf" works pretty well.  So there is less need for these 
> > things.
> > 
> > I don't want to discourage you, but contributing to PostgreSQL can be a 
> > lengthy
> > and tedious process.  On the upside, things that make it into core are 
> > usually
> > fairly mature.
> > 
> 
> Laurenz, you are obviously not aware who are you talking to. Let me 
> introduce you: Cary Millsap and Jeff Holt are authors of the "Optimizing 
> Oracle for Performance", one of the most influential books in the entire 
> realm of  Oracle literature.

I have never heard of Jeff Holt, but then there are a lot of wonderful
and smart people I have never heard of.  I tend to be respectful in
my conversation, regardless if I know the other person or not.

> Haughty lectures about "Oracle has it" not being good enough could 
> hardly be more out of place here.

I have no idea how you arrive at the conclusion that I was delivering
a haughty lecture.  Somebody asked if PostgreSQL would consider applying
patches he is ready to write, somebody who seems not to be familiar
with the way PostgreSQL development works, so I tried to give helpful
pointers.

> To put it as politely as is possible in this case, shut your pie hole.

I think you have just disqualified yourself from taking part in this
conversation.  I recommend that you don't embarrass Jeff Holt by trying
to champion him.

Yours,
Laurenz Albe





Re: Better, consistent instrumentation for postgreSQL using a similar API as Oracle

2021-10-03 Thread Laurenz Albe
On Fri, 2021-10-01 at 15:06 -0500, Jeff Holt wrote:
> TLDR; If I spend the time necessary to instrument the many functions that are 
> the equivalent
> of the Oracle counterparts, would anyone pull those changes and use them?
> Specifically, for those who know Oracle, I'm talking about implementing:
>    1. The portion of the ALTER SESSION that enables extended SQL trace
>    2. Most of the DBMS_MONITOR and DBMS_APPLICATION_INFO packages
>    3. Instrument the thousand or so functions that are the equivalent of 
> those found in Oracle's V$EVENT_NAME
>    4. Dynamic performance view V$DIAG_INFO
> For the last 35 years, I've made my living helping people solve Oracle 
> performance problems by looking at it
> 
[...]
> Now looking closely at postgreSQL, I see an opportunity to more quickly 
> implement Oracle's current feature list.

Anything that improves user experience in that respect is welcome, but consider
that each database has different approaches to solve the same problems.

Before you go to the length of implementing a lot of stuff, check in with
the -hackers list and discuss your ideas.

Please be a lot more specific than in this e-mail.  While it is certainly
fine to sketch your ambitious vision, focus on one specific thing you can
imagine implementing and come up with a design for that.

Note that "Oracle has it" is not a good enough reason for a PostgreSQL
feature.  We think we can do better than they do (at least in many respects).
Also, don't assume that everyone on the -hackers list will be familiar with
certain PostgreSQL features.

One thing that you should keep in mind is that Oracle has to provide different
features in that area because they are not open source.  In PostgreSQL, I can
simply read the code or attach a debugger to a backend, and when it comes to
profiling, "perf" works pretty well.  So there is less need for these things.

I don't want to discourage you, but contributing to PostgreSQL can be a lengthy
and tedious process.  On the upside, things that make it into core are usually
fairly mature.

Yours,
Laurenz Albe





Re: Query going to all paritions

2021-10-01 Thread Laurenz Albe
On Fri, 2021-10-01 at 12:58 +0530, aditya desai wrote:
> Hi,
> Query on  one of our partitioned tables which is range partitioned on 
> "run"date" column is going to all partitions despite having run_date in WHERE 
> clause. "enable_parition_pruning" is also on. I am
> unable to generate a query plan as the query never runs fully even waiting 
> for say half an hour.
> 
> We have composite indexes on run_date,status. Do I need to create an index on 
> run_date only?

You need to share the query and probably the table definition.  EXPLAIN output
(without ANALYZE) will also help.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: Postgres chooses slow query plan from time to time

2021-09-14 Thread Laurenz Albe
On Tue, 2021-09-14 at 10:55 +0300, Kristjan Mustkivi wrote:
> 2021-09-14 06:55:33 UTC, pid=12345  db=mydb, usr=myuser, client=ip,
> app=PostgreSQL JDBC Driver, line=55 LOG:  duration: 5934.165 ms  plan:
>   Query Text: SELECT *   FROM myschema.mytable pbh WHERE
> pbh.product_code = $1   AND pbh.cage_player_id = $2   AND
> pbh.cage_code = $3   AND balance_type = $4   AND pbh.modified_time <
> $5 ORDER BY pbh.modified_time DESC FETCH FIRST 1 ROWS ONLY
>   Limit  (cost=0.70..6.27 rows=1 width=66) (actual time=5934.154..5934.155 
> rows=1 loops=1)
>     Buffers: shared hit=7623 read=18217
>     ->  Index Scan Backward using mytable_idx2 on mytable pbh 
> (cost=0.70..21639.94 rows=3885 width=66) (actual time=5934.153..5934.153 
> rows=1 loops=1)
>   Index Cond: ((cage_code = $3) AND (cage_player_id = $2) AND 
> (modified_time < $5))

If it scanned the index for 6 seconds before finding the first result,
I'd suspect one of the following:

- the index is terribly bloated

- there were lots of deleted rows, and the index entries were marked as "dead"

- something locked the table for a long time

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: pg_restore schema dump to schema with different name

2021-08-24 Thread Laurenz Albe
On Mon, 2021-08-23 at 17:54 +, Nagaraj Raj wrote:
> Wouldn’t be easy if we have option to_schema ?

Sure, but it wouldn't be easy to implement that.
It would have to be a part of "pg_dump".

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: pg_restore schema dump to schema with different name

2021-08-23 Thread Laurenz Albe
On Mon, 2021-08-23 at 09:44 +, Nagaraj Raj wrote:
> I know I can alter schema name after restoring but the problem is the name 
> already exist and I don't want to touch that existing schema.
> The dump type is "custom".
> 
> So effectively I want something like.
> pg_dump -U postgres --schema "source_schema" --format "c" --create --file 
> "source_schema.bak" my_db
> pg_restore -U postgres --exit-on-error --dbname "my_db"  --destination-schema 
> "destination_schema"

The only way to do that is to create a new database, import the data there,
rename the schema and dump again.

Then import that dump into the target database.
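
A sketch of that workflow with a scratch database (the names are only examples):

  createdb -U postgres scratch_db
  pg_restore -U postgres --dbname scratch_db source_schema.bak
  psql -U postgres -d scratch_db -c 'ALTER SCHEMA source_schema RENAME TO destination_schema'
  pg_dump -U postgres --schema destination_schema --format c --file destination_schema.bak scratch_db
  pg_restore -U postgres --exit-on-error --dbname my_db destination_schema.bak
  dropdb -U postgres scratch_db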

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: Performance benchmark of PG

2021-07-19 Thread Laurenz Albe
On Mon, 2021-07-19 at 15:39 +0530, Manish Lad wrote:
> We are planning to migrate Oracle exadata database to postgresql and db size 
> ranges from 1 tb to 60 TB. 
> 
> Will the PG support this with the performance matching to that of exadata 
> applince? 
> If anyone could point me in the right direction where i xan get the 
> benchmarking done
>  for these two databases either on prime or any cloud would be great. 

You won't find any trustworthy benchmarks anywhere, because Oracle expressedly
forbids publishing of benchmark results in its license, unless Oracle has given
its permission.

The question cannot be answered, because performance depends on your workload,
configuration, software and hardware.  Perhaps PostgreSQL will be faster, 
perhaps not.

Test and see.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: Linear slow-down while inserting into a table with an ON INSERT trigger ?

2021-07-18 Thread Laurenz Albe
On Sun, 2021-07-18 at 09:36 +0200, Tobias Gierke wrote:
> Thank you for the detailed explanation ! Just one more question: I've 
> did an experiment and reduced the fillfactor on the table updated by the 
> trigger to 50%, hoping  the HOT feature would kick in and each 
> subsequent INSERT would clean up the "HOT chain" of the previous INSERT 
> ... but execution times did not change at all compared to 100% 
> fillfactor, why is this ? Does the HOT feature only work if a different 
> backend accesses the table concurrently ?

No, but until the transaction is done, the tuples cannot be removed,
no matter if they are HOT or not.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: temporary file log lines

2021-07-12 Thread Laurenz Albe
On Thu, 2021-07-08 at 17:22 -0400, MichaelDBA wrote:
> I got a question about PG log lines with temporary file info like this:
> 
> case 1: log line with no contextual info
> 2021-07-07 20:28:15 UTC:10.100.11.95(50274):myapp@mydb:[35200]:LOG: 
> temporary file: path "base/pgsql_tmp/pgsql_tmp35200.0", size 389390336
> 
> case 2: log line with contextual info
> 2021-07-07 20:56:18 UTC:172.16.193.118(56080):myapp@mydb:[22418]:LOG: 
> temporary file: path "base/pgsql_tmp/pgsql_tmp22418.0", size 1048576000
> 2021-07-07 20:56:18 
> UTC:172.16.193.118(56080):myapp@mydb:[22418]:CONTEXT:  PL/pgSQL function 
> memory.f_memory_usage(boolean) line 13 at RETURN QUERY
> 
> There are at least 2 cases where stuff can spill over to disk:
> * queries that don't fit in work_mem, and
> * temporary tables that don't fit in temp_buffers
> 
> Question, if log_temp_files is turned on (=0), then how can you tell 
> from where the temporary log line comes from?
> I see a pattern where work_mem spill overs have a CONTEXT line that 
> immediately follows the LOG LINE with keyword, temporary. See case 2 above.
> 
> For other LOG lines with keyword, temporary, there is no such pattern. 
> Could those be the ones caused by temp_buffer spill overs to disk?  case 
> 1 above.
> 
> I really want to tune temp_buffers, but I would like to be able to 
> detect when temporary tables are spilling over to disk, so that I can 
> increase temp_buffers.
> 
> Any help would be appreciated.

I am not sure if you can distinguish those two cases from the log.

What I would do is identify the problematic query and run it with
EXPLAIN (ANALYZE, BUFFERS).  Then you should see which part of the query
creates the temporary files.

If it is a statement in a function called from your top level query,
auto_explain with the correct parameters can get you that output for
those statements too.
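
For example, something like this in postgresql.conf (the threshold is just an example value):

  session_preload_libraries = 'auto_explain'
  auto_explain.log_min_duration = '1s'
  auto_explain.log_analyze = on
  auto_explain.log_buffers = on
  auto_explain.log_nested_statements = on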

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: overcommit_ratio setting

2021-06-15 Thread Laurenz Albe
On Tue, 2021-06-15 at 08:34 +0800, Yi Sun wrote:
> > overcommit_ratio < (RAM - swap) / RAM * 100
> > 
> > Here, RAM is the RAM available to PostgreSQL.
> 
> Thank you for your reply
> 
> 1. Our env RAM are 4GB, 8 GB, 16 GB... as below url suggestion, could we 
> configure swap as below?
> https://opensource.com/article/18/9/swap-space-linux-systems
> 
> RAM         swap
> 
> 2GB – 8GB   = RAM
> >8GB        8GB

I wouldn't change the swap space to fit overcommit_ratio, but
the other way around.
With a properly configured PostgreSQL, you won't need a lot of swap space.

> 2. If the RAM is 4GB and 8GB, the formula  (RAM - swap) / RAM * 100 result 
> will become to 0,
>how could we configure overcommit_ratio please?

You have to use floating point arithmetic.

The result will only be 0 if RAM = swap.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: overcommit_ratio setting

2021-06-14 Thread Laurenz Albe
On Mon, 2021-06-14 at 18:16 +0800, Yi Sun wrote:
> After checking doc, only mentioned vm.overcommit_memory=2, but didn't 
> mentioned vm.overcommit_ratio recommended value
> https://www.postgresql.org/docs/11/kernel-resources.html#LINUX-MEMORY-OVERCOMMIT
> some articles mentioned that 80 or 90 configuration in their env
> So is it OK just to configure  vm.overcommit_ratio to 90 please?

It depends on the size of RAM and swap space:

overcommit_ratio < (RAM - swap) / RAM * 100

Here, RAM is the RAM available to PostgreSQL.
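
As a worked example (the numbers are only an illustration): with 16 GB of RAM available
to PostgreSQL and 4 GB of swap, (16 - 4) / 16 * 100 = 75, so in /etc/sysctl.conf:

  vm.overcommit_memory = 2
  vm.overcommit_ratio = 75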

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: 15x slower PreparedStatement vs raw query

2021-05-04 Thread Laurenz Albe
On Tue, 2021-05-04 at 13:59 +, Alex wrote:
> "Powerful general purpose relational database" but not smart... 

Too smart can easily become slow...

> I propose a feature to use information from previously executed queries to 
> adjust the query plan time accordingly.
> Reusing the same generic plan may and will lead to very long execution times.

AI can go wrong too, and I personally would be worried that such cases
are very hard to debug...

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: INSERTS waiting with wait_event is "transactionid"

2021-04-09 Thread Laurenz Albe
On Thu, 2021-04-08 at 20:14 +, Nagaraj Raj wrote:
> We are trying to load data around 1Bil records into one table with INSERT 
> statements
>  (not able to use COPY command) and they are been waiting for a lock and the 
> wait_event
>  is "transactionid", I didn't find any information in the documents. Queries 
> have been
>  waiting for hours.

That means that your statement is stuck behind a row lock.

Row locks are stored on the table row itself and contain the transaction ID.
So the process has to wait until the transaction goes away, which is implemented
as waiting for a lock on the transaction ID.

There must be a long running transaction that locks a row that is needed for
the INSERT.  It could be a row in a different table that is referenced by a
foreign key.

Make that long running transaction go away.  Transactions should never last 
that long.
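
Something like this can help you find it (a sketch; pg_blocking_pids() exists from 9.6 on):

  SELECT pid, state, xact_start, query
  FROM pg_stat_activity
  WHERE xact_start < current_timestamp - INTERVAL '5 minutes'
  ORDER BY xact_start;

  SELECT pg_blocking_pids(12345);   -- 12345 is a placeholder for one of the waiting PIDs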

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: Substitute for synonym in Oracle after migration to postgres

2021-04-06 Thread Laurenz Albe
On Tue, 2021-04-06 at 13:22 +0530, aditya desai wrote:
> We have to access data from one schema to another. We have created
>  a view for this but performance is not good.

The performance of a view that is just a simple SELECT from a table
in a different schema will be just as good as using that table
directly.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: SQL performance issue (postgresql chooses a bad plan when a better one is available)

2021-03-22 Thread Laurenz Albe
On Mon, 2021-03-22 at 08:10 -0500, Chris Stephens wrote:
> The following SQL takes ~25 seconds to run. I'm relatively new to postgres
>  but the execution plan (https://explain.depesz.com/s/N4oR) looks like it's
>  materializing the entire EXISTS subquery for each row returned by the rest
>  of the query before probing for plate_384_id existence. postgres is
>  choosing sequential scans on sample_plate_384 and test_result when suitable,
>  efficient indexes exist. a re-written query produces a much better plan
>  (https://explain.depesz.com/s/zXJ6). Executing the EXISTS portion of the
>  query with an explicit PLATE_384_ID yields the execution plan we want as
>  well (https://explain.depesz.com/s/3QAK). unnesting the EXISTS and adding
>  a DISTINCT on the result also yields a better plan.

Great!  Then use one of the rewritten queries.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: wide table, many many partitions, poor query performance

2021-03-15 Thread Laurenz Albe
On Mon, 2021-03-15 at 10:53 -0600, S Bob wrote:
> We have a client that is running PostgreSQL 12, they have a table with 
> 212 columns and 723 partitions
> 
> It seems the planning time is consumed by generating 723 sub plans
> 
> I suspect it's due to the fact that they are using hash based 
> partitioning, example:
> 
> CREATE TABLE rental_transaction_hash_p723 PARTITION OF 
> rental_transaction FOR VALUES WITH (MODULUS 723, REMAINDER 723);
> 
> Based on a strategy like this, queries will ALWAYS scan all partitions 
> unless a hash value is specified as part of the query, correct? I 
> suspect this is the issue... looking for confirmation, or feedback if 
> i'm off base

That is correct.

The only use I can see in hash partitioning is to put the partitions
on different storage devices in order to spread I/O - kind of striping
on the database level.

Unless you can benefit from that, your queries will become slower.

Yours,
Laurenz Albe





Re: Slow query performance inside a transaction on a clean database

2021-03-08 Thread Laurenz Albe
On Fri, 2021-03-05 at 17:55 +, val.jane...@gmail.com wrote:
> I have a SELECT query that uses a long chain of CTEs (6) and is executed
> repeatedly as part of the transaction (with different parameters). It is
> executed quickly most of the time, but sometimes becomes very slow. I
> managed to consistently reproduce the issue by executing a transaction
> containing this query on an empty database. The query is fast for the first
> 150-170 inserted resources, but ~50% of the executions afterwards take 5.6s
> instead of 1.4ms. Additionally it only becomes slow if resources are
> inserted in a random order, if I insert resources sorted by
> `start_date_time` column the query is always fast.
> 
> Here are the results of EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS, FORMAT
> JSON):
> - Slow: https://explain.depesz.com/s/e4Fo
> - Fast: https://explain.depesz.com/s/7HFJ

If your transaction modifies the data significantly (which it does if the
table is empty before you start), you should throw in an ANALYZE on the
affected tables occasionally.

Normally, autovacuum takes care of that, but it cannot see your data
until the transaction is committed.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: Autovacuum not functioning for large tables but it is working for few other small tables.

2021-02-19 Thread Laurenz Albe
On Fri, 2021-02-19 at 10:51 +, M Tarkeshwar Rao wrote:
> Please find the Vacuum(verbose) output. Can you please suggest what is the 
> reason?
> How can we avoid these scenarios?
> 
> The customer tried to run the VACUUM(verbose) last night, but it was running
>  continuously for 5 hours without any visible progress. So they had to abort 
> it
>  as it was going to exhaust their maintenance window.
> 
> db_Server14=# VACUUM (VERBOSE) audittraillogentry;
> INFO:  vacuuming "mmsuper.audittraillogentry"
> INFO:  scanned index "audittraillogentry_pkey" to remove 11184539 row versions
> DETAIL:  CPU 25.24s/49.11u sec elapsed 81.33 sec
> INFO:  scanned index "audit_intime_index" to remove 11184539 row versions
> DETAIL:  CPU 23.27s/59.28u sec elapsed 88.63 sec
> INFO:  scanned index "audit_outtime_index" to remove 11184539 row versions
> DETAIL:  CPU 27.02s/55.10u sec elapsed 92.04 sec
> INFO:  scanned index "audit_sourceid_index" to remove 11184539 row versions
> DETAIL:  CPU 110.81s/72.29u sec elapsed 260.71 sec
> [and so on, the same 6 indexes are repeatedly scanned]

PostgreSQL performs VACUUM in batches of "maintenance_work_mem" size
of tuple identifiers.  If that parameter is small, the indexes have
to be scanned often.

Try increasing "maintenance_work_mem" to 1GB (if you have enough RAM),
that will make it faster.
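
For a manual run you can set it just for that session, for example:

  SET maintenance_work_mem = '1GB';
  VACUUM (VERBOSE) audittraillogentry;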

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: High COMMIT times

2021-01-08 Thread Laurenz Albe
On Thu, 2021-01-07 at 10:49 -0700, Craig Jackson wrote:
> We had a similar situation recently and saw high commit times that were caused
>  by having unindexed foreign key columns when deleting data with large tables 
> involved.
> You might check to see if any new foreign key constraints have been added
>  recently or if any foreign key indexes may have inadvertently been removed.
>  Indexing the foreign keys resolved our issue. 

Were these deferred foreign key constraints?

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: High COMMIT times

2021-01-06 Thread Laurenz Albe
On Wed, 2021-01-06 at 10:19 -0600, Don Seiler wrote:
> This week we've noticed that we're starting to see spikes where COMMITs are 
> taking much longer than usual.
>  Sometimes, quite a few seconds to finish.
>
> This is PostgreSQL 12.4 on Ubuntu 18.04, all running in MS Azure (*not* 
> managed by them).

Unless you are using WITH HOLD cursors on large result sets, this is very likely
I/O overload.  Use tools like "sar", "vmstat" and "iostat" to monitor your I/O 
load.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: "Required checkpoints occurs too frequently"

2020-12-11 Thread Laurenz Albe
On Fri, 2020-12-11 at 13:42 +0530, Atul Kumar wrote:
> We are getting this alert frequently "Required checkpoints occurs too
> frequently" on postgres version 11.8
> 
> The RAM of the server is 16 GB.
> 
> and we have already set the max_wal_size= 4096 MB
> min_wal_size= 192 MB.

You should increase "max_wal_size" even more.
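
For example (the value is only a starting point, size it to the WAL volume between checkpoints):

  ALTER SYSTEM SET max_wal_size = '16GB';
  SELECT pg_reload_conf();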

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: How to prioritise walsender reading from pg_wal over WAL writes?

2020-11-18 Thread Laurenz Albe
On Wed, 2020-11-18 at 12:15 +, Alexey Bashtanov wrote:
> My reply did not get threaded appropriately.
> My original question was here:
> https://www.postgresql.org/message-id/a74d5732-60fd-d18b-05fd-7b2b97099f19%40imap.cc
> I'd like to prioritize walsender for replication not to lag too much.
> Otherwise, when I have load spikes on master, standby lags, sometimes by 
> hundreds of gigabytes.

I would first determine where the bottleneck is.

Is it really the walsender, or is it on the network or in the standby server's 
replay?

Check the difference between "sent_lsn", "replay_lsn" from 
"pg_stat_replication" and
pg_current_wal_lsn() on the primary.
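
For example, on the primary (a sketch; needs v10 or later for these column names):

  SELECT application_name,
         pg_wal_lsn_diff(pg_current_wal_lsn(), sent_lsn)   AS send_lag_bytes,
         pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes
  FROM pg_stat_replication;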

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: How to prioritise walsender reading from pg_wal over WAL writes?

2020-11-18 Thread Laurenz Albe
On Wed, 2020-11-18 at 11:54 +, Alexey Bashtanov wrote:
> If it helps, here's the details of the hardware config.
> The controller is AVAGO MegaRAID SAS 9361-4i,
> the SSDs are INTEL SSDSC2KG960G8 (configured as a raid1).
> Current scheduler used is deadline.
> Currently XFS is mounted without nobarriers, but I'm going to set that when 
> there's high load next time to see how it affects throughput.
> 
> Properties of the array
> VD6 Properties :
> ==
> Strip Size = 256 KB
> Number of Blocks = 1874329600
> VD has Emulated PD = Yes
> Span Depth = 1
> Number of Drives Per Span = 2
> Write Cache(initial setting) = WriteBack
> Disk Cache Policy = Disk's Default
> Encryption = None
> Data Protection = Disabled
> Active Operations = None
> Exposed to OS = Yes
> Creation Date = 28-10-2020
> Creation Time = 05:17:16 PM
> Emulation type = default
> Cachebypass size = Cachebypass-64k
> Cachebypass Mode = Cachebypass Intelligent
> Is LD Ready for OS Requests = Yes
> SCSI NAA Id = 600605b00af03650272c641ca30c3196

Why??

The WAL buffers have the most recent information, so that would result in
unnecessary delay and I/O.

You'd have to hack the code, but I wonder what leads you to this interesting 
requirement.

Yours,
Laurenz Albe
-- 
+43-670-6056265
CYBERTEC PostgreSQL International GmbH
Gröhrmühlgasse 26, A-2700 Wiener Neustadt
Web: https://www.cybertec-postgresql.com





Re: Partition pruning with joins

2020-11-03 Thread Laurenz Albe
On Tue, 2020-11-03 at 13:20 +, Ehrenreich, Sigrid wrote:
> I would like to join a partitioned table and have the joined columns in the 
> where clause to be used for partition pruning.
> From some readings in the internet, I conclude that this was not possible in 
> v12. I hoped for the
>  “improvements in partition pruning” in v13, but it seems to me, that it is 
> still not possible, or is it and I am missing something here?
> 
> My testcase:
> create table fact (part_key integer) partition by range (part_key);
> create table fact_100 partition of fact for values from (1) to (101);
> create table fact_200 partition of fact for values from (101) to (201);
> 
> insert into fact (part_key) select floor(random()*100+1) from 
> generate_series(1,1);
> insert into fact (part_key) select floor(random()*100+101) from 
> generate_series(1,1);
> 
> create table dim as (select distinct part_key from fact);
> create unique index on dim (part_key);
> 
> analyze fact;
> analyze dim;
> 
> -- Statement
> explain SELECT
> count(*)
> FROM
> dim INNER JOIN fact ON (dim.part_key=fact.part_key)
> WHERE dim.part_key >= 110 and dim.part_key <= 160;
> 
> Plan shows me, that all partitions are scanned:
> Aggregate  (cost=461.00..461.01 rows=1 width=8)
>   ->  Hash Join  (cost=4.64..448.25 rows=5100 width=0)
> Hash Cond: (fact.part_key = dim.part_key)
> ->  Append  (cost=0.00..390.00 rows=2 width=4)
>   ->  Seq Scan on fact_100 fact_1  (cost=0.00..145.00 rows=1 
> width=4)  ⇐ unnecessarily scanned
>   ->  Seq Scan on fact_200 fact_2  (cost=0.00..145.00 rows=1 
> width=4)
> ->  Hash  (cost=4.00..4.00 rows=51 width=4)
>   ->  Seq Scan on dim  (cost=0.00..4.00 rows=51 width=4)
> Filter: ((part_key >= 110) AND (part_key <= 160))

One thing you could try is to partition "dim" just like "fact" and
set "enable_partitionwise_join = on".

I didn't test it, but that might do the trick.
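
A sketch of that on your test case (untested; "dim" is dropped and rebuilt as a partitioned table):

  DROP TABLE dim;
  CREATE TABLE dim (part_key integer) PARTITION BY RANGE (part_key);
  CREATE TABLE dim_100 PARTITION OF dim FOR VALUES FROM (1) TO (101);
  CREATE TABLE dim_200 PARTITION OF dim FOR VALUES FROM (101) TO (201);
  INSERT INTO dim SELECT DISTINCT part_key FROM fact;
  CREATE UNIQUE INDEX ON dim (part_key);
  ANALYZE dim;

  SET enable_partitionwise_join = on;

  EXPLAIN SELECT count(*)
  FROM dim INNER JOIN fact ON (dim.part_key = fact.part_key)
  WHERE dim.part_key >= 110 AND dim.part_key <= 160;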

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: CPU Consuming query. Sequential scan despite indexing.

2020-10-16 Thread Laurenz Albe
On Thu, 2020-10-15 at 20:34 +0530, aditya desai wrote:
> Below query always shows up on top in the CPU matrix. Also despite having 
> indexes it does sequential scans
> (probably because WHERE condition satisfies almost all of the data from 
> table). This query
> runs on the default landing page in application and needs to fetch records in 
> less that 100 ms
>  without consuming too much CPU.
> 
>  Any opinions? Table is very huge and due to referential identity and 
> business requirements we could not
>  implement partitioning as well.
> 
> There is index on (countrycode,facilitycode,jobstartdatetime)
> 
> explain (analyze,buffers) with JobCount as ( select jobstatuscode,count(1) 
> stat_count from job j
>  where 1=1 and j.countrycode = 'TH'
> and j.facilitycode in 
> ('THNPM1','THPRK1','THCNT1','THSPN1','THKRI1','THPKN1','THSBI1','THUTG1','THLRI1','THSRI1','THSUR1','THSKM1')
>  and ((j.jobstartdatetime  between '2020-08-01 00:00:00' and '2020-09-30 
> 00:00:00' ) or j.jobstartdatetime IS NULL )  group by j.jobstatuscode)
>  select js.jobstatuscode,COALESCE(stat_count,0) stat_count from JobCount jc 
> right outer join jobstatus js on jc.jobstatuscode=js.jobstatuscode;
> 
>   QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------------------
>  Hash Right Join  (cost=98845.93..98846.10 rows=10 width=12) (actual time=1314.809..1314.849 rows=10 loops=1)
>    ->  Parallel Seq Scan on job j  (cost=0.00..96837.93 rows=200963 width=4) (actual time=13.010..1144.434 rows=163200 loops=3)
>          Filter: (((countrycode)::text = 'TH'::text) AND (((jobstartdatetime >= '2020-08-01 00:00:00'::timestamp without time zone) AND (jobstartdatetime <= '2020-09-30 00:00:00'::timestamp without time zone)) OR (jobstartdatetime IS NULL)) AND ((facilitycode)::text = ANY ('{THNPM1,THPRK1,THCNT1,THSPN1,THKRI1,THPKN1,THSBI1,THUTG1,THLRI1,THSRI1,THSUR1,THSKM1}'::text[])))
>          Rows Removed by Filter: 449035
>          Buffers: shared hit=60086 read=11834
>          I/O Timings: read=59.194
> 

You should rewrite the subquery as a UNION to avoid the OR:

  ... WHERE j.countrycode = 'TH'
and j.facilitycode in 
('THNPM1','THPRK1','THCNT1','THSPN1','THKRI1','THPKN1','THSBI1','THUTG1','THLRI1','THSRI1','THSUR1','THSKM1')
and j.jobstartdatetime between '2020-08-01 00:00:00' and '2020-09-30 
00:00:00'

and

  ... WHERE j.countrycode = 'TH'
and j.facilitycode in 
('THNPM1','THPRK1','THCNT1','THSPN1','THKRI1','THPKN1','THSBI1','THUTG1','THLRI1','THSRI1','THSUR1','THSKM1')
and j.jobstartdatetime IS NULL
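
Putting that together, a sketch of the rewritten query (UNION ALL is safe here, because the two branches are mutually exclusive):

  with JobCount as (
     select j.jobstatuscode, count(*) stat_count
     from (
        select jobstatuscode from job j
        where j.countrycode = 'TH'
          and j.facilitycode in ('THNPM1','THPRK1','THCNT1','THSPN1','THKRI1','THPKN1','THSBI1','THUTG1','THLRI1','THSRI1','THSUR1','THSKM1')
          and j.jobstartdatetime between '2020-08-01 00:00:00' and '2020-09-30 00:00:00'
        union all
        select jobstatuscode from job j
        where j.countrycode = 'TH'
          and j.facilitycode in ('THNPM1','THPRK1','THCNT1','THSPN1','THKRI1','THPKN1','THSBI1','THUTG1','THLRI1','THSRI1','THSUR1','THSKM1')
          and j.jobstartdatetime is null
     ) j
     group by j.jobstatuscode
  )
  select js.jobstatuscode, coalesce(jc.stat_count, 0) stat_count
  from JobCount jc
  right outer join jobstatus js on jc.jobstatuscode = js.jobstatuscode;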

These indexes could speed up the resulting query:

  CREATE INDEX ON job (countrycode, facilitycode);
  CREATE INDEX ON job (countrycode, jobstartdatetime);
  CREATE INDEX ON job (countrycode, facilitycode) WHERE jobstartdatetime IS NULL;

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: Poor Performance running Django unit tests after upgrading from 10.6

2020-10-14 Thread Laurenz Albe
On Thu, 2020-10-15 at 01:21 -0400, Roger Hunwicks wrote:
> We have a large Django application running against a Postgresql database.
> 
> When we run the test suite using Postgresql 10.7 in a Docker container
> we consistently get:
> 
> Ran 1166 tests in 1291.855s
> 
> When we first started running the same test suite against Postgresql
> 12.4 we got:
> 
> Ran 1166 tests in 8502.030s
> 
> I think that this reduction in performance is caused by the lack of
> accurate statistics [...]
> 
> We have since managed to get the performance of the test run using
> 12.4 back to approximately the normal range by customizing the
> Postgresql parameters. `seq_page_cost=0.1` and `random_page_cost=0.11`
> seem to be key, but we are also setting `shared_buffers`, etc. and all
> the other typical parameters. With Postgresql 10.7 we weren't setting
> anything and performance was fine using just the defaults, given the
> tiny data volumes.
> 
> However, even though we have similar performance for 12.4 for most
> test runs, it remains very variable. About 30% of the time we get
> something like:
> 
> I think we have narrowed down the problem to a single, very complex,
> materialized view using CTEs; the unit tests create the test data and
> then refresh the materialized view before executing the actual test
> code.
> 
> Database logging using autoexplain shows things like:
> 
> db_1 | 2020-10-14 10:27:59.692 UTC [255] LOG:  duration:
> 4134.625 ms  plan:
> db_1 | Query Text: REFRESH MATERIALIZED VIEW
> price_marketpricefacts_materialized
> db_1 | Merge Join
> (cost=14141048331504.30..9635143213364288.00 rows=116618175994107184
> width=3302) (actual time=4134.245..4134.403 rows=36 loops=1)
> 
> For comparison, the equivalent query on 10.7 has:
> 
> db_1 | 2020-10-15 03:28:58.382 UTC [163] LOG:  duration:
> 10.500 ms  plan:
> db_1 | Query Text: REFRESH MATERIALIZED VIEW
> price_marketpricefacts_materialized
> db_1 | Hash Left Join  (cost=467650.55..508612.80
> rows=199494 width=3302) (actual time=10.281..10.341 rows=40 loops=1)
> 
> I can get performance almost identical to 10.7 by altering the unit
> tests so that in each test that refreshes the materialized view prior
> to executing the query, we execute `ANALYZE;` prior to refreshing the
> view.
> 
> Is it worth us trying to debug the plan for situations with low row
> counts and poor statistics? Or is this use case not really covered:
> the general advice is obviously to make sure that statistics are up to
> date before troubleshooting performance problems. On the other hand,
> it is not easy for us to make sure that we run analyze inside the
> transaction in each unit test; it also seems a bit wasteful.
> 
> Opinions and advice gratefully received.

Yes, the query plan for the query that defines the materialized view
is the interesting data point.  Run an EXPLAIN (ANALYZE, BUFFERS) on
that query.

If your statistics are off because the data have just been imported a
second ago, run an explicit ANALYZE on the affected tables after import.

If your statistics are off because they are not calculated often enough,
consider lowering "autovacuum_analyze_scale_factor".
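
The per-table form of the latter would be something like this (table name and value are only an example):

  ALTER TABLE some_big_table SET (autovacuum_analyze_scale_factor = 0.01);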

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: Too many waits on extension of relation

2020-10-05 Thread Laurenz Albe
On Mon, 2020-10-05 at 10:32 +0530, avinash varma wrote:
> Can someone please guide me how to improve/reduce these wait events.
> 
> Postgres Version:9.5
> 
> LOG: process 3718 still waiting for ExclusiveLock on extension of relation 
> 266775 of database 196511 after 1000.057 ms
> Detail: Process holding the lock: 6423. Wait queue: 3718, 4600, 2670, 4046.
> Context: SQL statement "INSERT INTO 
> cms_c207c1e2_0ce7_422c_aafb_77d43f61e563.cms_item [...]

Process 6423 is holding a lock on the table into which you'd like to INSERT
that blocks several other sessions.

Make sure that the transaction in this database session ends, e.g. by

  SELECT pg_cancel_backend(6423);

Either there is a session that did not close its transaction (coding bug),
or a database statement ran inordinately long.
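
To see which of the two it is, check what that session is doing, for example:

  SELECT pid, state, xact_start, query_start, query
  FROM pg_stat_activity
  WHERE pid = 6423;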

Yours,
Laurenz Albe
-- 
+43-670-6056265
CYBERTEC PostgreSQL International GmbH
Gröhrmühlgasse 26, A-2700 Wiener Neustadt
Web: https://www.cybertec-postgresql.com





Re: CPU hogged by concurrent SELECT..FOR UPDATE SKIP LOCKED

2020-08-18 Thread Laurenz Albe
On Tue, 2020-08-18 at 19:52 -0400, Jim Jarvie wrote:
> I have a system which implements a message queue with the basic pattern that 
> a process selects a group of,
>  for example 250, rows for processing via SELECT .. LIMIT 250 FOR UPDATE SKIP 
> LOCKED.
> 
> When there are a small number of concurrent connections to process the queue, 
> this seems to work as
>  expected and connections quickly obtain a unique block of 250 rows for 
> processing.
> However, as I scale up the number of concurrent connections, I see a spike in 
> CPU (to 100% across 80 cores)
>  when the SELECT FOR UPDATE SKIP LOCKED executes and the select processes 
> wait for multiple minutes
>  (10-20 minutes) before completing.  My use case requires around 256 
> concurrent processors for the queue
>  but I've been unable to scale beyond 128 without everything grinding to a 
> halt.
> 
> The queue table itself fits in RAM (with 2M hugepages) and during the wait, 
> all the performance counters
>  drop to almost 0 - no disk read or write (semi-expected due to the table 
> fitting in memory) with 100%
>  buffer hit rate in pg_top and row read around 100/s which is much smaller 
> than expected.
> 
> After processes complete the select and the number of waiting selects starts 
> to fall, CPU load falls and
>  then suddenly the remaining processes all complete within a few seconds and 
> things perform normally until
>  the next time there are a group of SELECT  FOR UPDATE statements which bunch 
> together and things then repeat.
> 
> I found that performing extremely frequent vacuum analyze (every 30 minutes) 
> helps a small amount but
>  this is not that helpful so problems are still very apparent.
> 
> I've exhausted all the performance tuning and analysis results I can find 
> that seem even a little bit
>  relevant but cannot get this cracked.
> 
> Is anyone on the list able to help with suggestions of what I can do to track 
> why this CPU hogging happens
>  as this does seem to be the root of the problem?

You should

- check with "pgstattuple" if the table is bloated.

- use "perf" to see where the CPU time is spent.

- look at "pg_stat_activity" for wait events (unlikely if the CPU is busy).
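
For the first point, a minimal sketch (the table name is a placeholder for your queue table):

  CREATE EXTENSION IF NOT EXISTS pgstattuple;
  SELECT * FROM pgstattuple('my_queue_table');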

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: Unclamped row estimates whith OR-ed subplans

2020-06-19 Thread Laurenz Albe
On Fri, 2020-06-19 at 17:12 +0200, Benjamin Coutu wrote:
> please consider the following SQL query:
> 
> SELECT * FROM "transactions" WHERE 
> "account" IN (SELECT "ID" FROM "accounts" WHERE "name" ~~* '%test%') 
> OR
> "contract" IN (SELECT "ID" FROM "contracts" WHERE "name" ~~* '%test%')
> 
> This yields the following plan on Postgres 11:
> 
> Seq Scan on transactions  (cost=67.21..171458.03 rows=1301316 width=1206)
>   Filter: ((hashed SubPlan 1) OR (hashed SubPlan 2))
>   SubPlan 1
> ->  Bitmap Heap Scan on accounts  (cost=33.36..61.16 rows=46 width=4)
>   Recheck Cond: ((name)::text ~~* '%test%'::text)
>   ->  Bitmap Index Scan on s_accounts  (cost=0.00..33.35 rows=46 
> width=0)
> Index Cond: ((name)::text ~~* '%test%'::text)
>   SubPlan 2
> ->  Seq Scan on contracts  (cost=0.00..5.93 rows=5 width=4)
>   Filter: ((name)::text ~~* '%test%'::text)
> 
> So the where clause of this query has just two subplans OR-ed together, one 
> is estimated to yield 46 rows and one is estimated to yield 5 rows.
> I'd expect the total rows for the seqscan to be estimated at 46 then, 
> following the logic that rows_seqscan = max(rows_subplan1, rows_subplan2). As 
> you can see, the optimizer estimates a whopping
> 1301316 rows instead.
> 
> I am absolutely aware that those are hashed sub plans below a seqscan and 
> that Postgres therefore has to scan all tuples of the table. But the problem 
> is that upper nodes (which are excluded from
> this example for simplicity) think they will receive 1301316 rows from the 
> seqscan, when in fact they will probably only see a hand full, which the 
> planner could have (easily?) deduced by taking the
> greater of the two subplan row estimates.
> 
> What am I missing, or is this perhaps a shortfall of the planner?

The subplans are executed *for each row* found in "transactions",
and the estimate on the subplans is *per execution*.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: simple query running for ever

2020-06-17 Thread Laurenz Albe
On Tue, 2020-06-16 at 20:35 +, Nagaraj Raj wrote:
> I wrote a simple query, and it is taking too long, not sure what is wrong in 
> it, even its not giving EXPLAIN ANALYZE.
> 
> select T0."physical_address_sid", T0."individual_entity_proxy_id", 
> T2."infrrd_hh_rank_nbr"
> from "cms_prospects".PROSPECT T0
> --inner join "sas_prs_tmp".DEDUPE3583E3F18 T1 on 
> T0."individual_entity_proxy_id" = T1."individual_entity_proxy_id"
> left join "cms_prospects".INDIVIDUAL_DEMOGRAPHIC T2 on 
> T0."individual_entity_proxy_id" = T2."individual_entity_proxy_id" 
> 
> 
> "Merge Left Join  (cost=55.96..18147747.08 rows=213620928 width=20)"
> "  Merge Cond: (t0.individual_entity_proxy_id = 
> t2.individual_entity_proxy_id)"
> "  ->  Index Scan using pk_prospect on prospect t0  (cost=0.57..10831606.89 
> rows=213620928 width=16)"
> "  ->  Index Only Scan using indxp_individual_demo_infrrd_hh_rank_nbr on 
> individual_demographic t2  (cost=0.57..5013756.93 rows=260652064 width=12)"
> 
> 
> 
> Any suggestions or help would be highly appreciated. 

The only potential improvement I can see is to strive for an
"index only scan" on "prospect".

For that, you'd have to add an INCLUDE clause to "pk_prospect"
so that "physical_address_sid" and "individual_entity_proxy_id"
are included and VACUUM the table.
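
A sketch of the idea; since rebuilding the primary key is more involved, a separate covering
index gives you the same effect (the index name is just an example, INCLUDE needs v11 or later):

  CREATE INDEX CONCURRENTLY prospect_proxy_id_covering_idx
      ON cms_prospects.prospect (individual_entity_proxy_id)
      INCLUDE (physical_address_sid);
  VACUUM cms_prospects.prospect;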

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: Request to help on Query improvement suggestion.

2020-05-24 Thread Laurenz Albe
On Fri, 2020-05-22 at 16:15 +0530, devchef2020 d wrote:
> PostgreSQL : 9.5.15

> Created Indexes on column parental_path:
> =
> 
> CREATE INDEX cable_pair_parental_path_idx
>   ON SCHEMA.TABLE_NAME
>   USING btree
>   (md5(parental_path) COLLATE pg_catalog."default");
>   
> CREATE INDEX cable_pair_parental_path_idx_fulltext
>   ON SCHEMA.TABLE_NAME
>   USING gist
>   (parental_path COLLATE pg_catalog."default");

> SELECT seq_no + 1 FROM SCHEMA.TABLE_NAME WHERE (parental_path LIKE 
> '%,sheath--' || cable_seq_id || ',%' OR parental_path LIKE 'sheath--' || 
> cable_seq_id || ',%' OR parental_path LIKE '%,sheath--' ||
> cable_seq_id OR parental_path = 'sheath--' || cable_seq_id) ORDER BY seq_no 
> DESC LIMIT 1;
> 
> Explain Plan:
> =
> 
> Limit  (cost=108111.60..108111.61 rows=1 width=4) (actual time=4597.605..4597.605 rows=0 loops=1)
>   Output: ((seq_no + 1)), seq_no
>   Buffers: shared hit=2967 read=69606 dirtied=1
>   ->  Sort  (cost=108111.60..108113.09 rows=595 width=4) (actual time=4597.603..4597.603 rows=0 loops=1)
>         Output: ((seq_no + 1)), seq_no
>         Sort Key: TABLE_NAME.seq_no DESC
>         Sort Method: quicksort  Memory: 25kB
>         Buffers: shared hit=2967 read=69606 dirtied=1
>         ->  Seq Scan on SCHEMA.TABLE_NAME  (cost=0.00..108108.63 rows=595 width=4) (actual time=4597.595..4597.595 rows=0 loops=1)
>               Output: (seq_no + 1), seq_no
>               Filter: ((TABLE_NAME.parental_path ~~ '%,sheath--64690,%'::text) OR (TABLE_NAME.parental_path ~~ 'sheath--64690,%'::text) OR (TABLE_NAME.parental_path ~~ '%,sheath--64690'::text) OR (TABLE_NAME.parental_path = 'sheath--64690'::text))
>               Rows Removed by Filter: 1930188
>               Buffers: shared hit=2967 read=69606 dirtied=1

An index on an expression can only be used if the expression is exactly the 
same as on one
side of an operator in a WHERE condition.

So your only chance with that query is to hope for a bitmap OR with an index on 
"parental path".

Two things to try:

1)  CREATE INDEX ON table_name (parental_path text_pattern_ops);

2)  CREATE EXTENSION pg_trgm;
CREATE INDEX ON table_name USING GIN (parental_path gin_trgm_ops);

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: OOM Killer kills PostgreSQL

2020-05-20 Thread Laurenz Albe
On Wed, 2020-05-20 at 09:30 +0200, Piotr Włodarczyk wrote:
> We met unexpected PostgreSQL shutdown. After a little investigation
> we've discovered that problem is in OOM killer which kills our PostgreSQL.
> Unfortunately we can't find query on DB causing this problem. Log is as below:

Is there nothing in the PostgreSQL log?

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: pg_attribute, pg_class, pg_depend grow huge in count and size with multiple tenants.

2020-05-07 Thread Laurenz Albe
On Fri, 2020-05-08 at 03:47 -0300, Avinash Kumar wrote:
> > Just set "autovacuum_max_workers" higher.
> 
> No, that wouldn't help. If you just increase autovacuum_max_workers, the 
> total cost limit of
> autovacuum_vacuum_cost_limit (or vacuum_cost_limit) is shared by so many 
> workers and it
> further delays autovacuum per each worker. Instead you need to increase 
> autovacuum_vacuum_cost_limit
> as well when you increase the number of workers.

True, I should have mentioned that.

> But, if you do that and also increase workers, well, you would easily reach 
> the limitations
> of the disk. I am not sure it is anywhere advised to have 20 
> autovacuum_max_workers unless
> i have a disk with lots of IOPS and with very tiny tables across all the 
> databases.

Sure, if you have a high database load, you will at some point exceed the 
limits of
the machine, which is not surprising.  What I am trying to say is that you have 
to ramp
up the resources for autovacuum together with increasing the overall workload.
You should consider autovacuum as part of that workload.
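
As a rough sketch, that could look like this in postgresql.conf (the numbers are only an
illustration and have to be sized to your I/O capacity):

  autovacuum_max_workers = 10
  autovacuum_vacuum_cost_limit = 2000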

If your machine cannot cope with the workload any more, you have to scale, which
is easily done by adding more machines if you have many databases.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: pg_attribute, pg_class, pg_depend grow huge in count and size with multiple tenants.

2020-05-07 Thread Laurenz Albe
On Thu, 2020-05-07 at 18:17 -0300, Avinash Kumar wrote:
> > The nice thing about separate databases is that it is easy to scale
> > horizontally.
> 
> Agreed. But, how about autovacuum ? Workers shift from DB to DB and 500 
> clusters
> means you may have to have a lot of manual vacuuming in place as well.

Just set "autovacuum_max_workers" higher.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: NUMA settings

2020-05-05 Thread Laurenz Albe
On Tue, 2020-05-05 at 10:11 +0200, Marc Rechté wrote:
> > I think that "vm.swappiness" should be 0.
> > PostgreSQL does its own memory management, any swapping by the kernel
> > would go against that.
> > 
> > Yours,
> > Laurenz Albe
> > 
> As said in the post, we wish the idle backends to be swapped out if 
> necessary. Therefore lowering swappiness would produce the opposite 
> effect: swapping out Linux file cache rather than backends memory.

I see.  Sorry for not paying attention.

An idle backend consumes only a few MB of RAM, though.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: Please help! Query jumps from 1s -> 4m

2020-05-05 Thread Laurenz Albe
On Mon, 2020-05-04 at 20:12 +0100, James Thompson wrote:
> The change is abrupt, on the 10th execution (but I hadn't spotted it was 
> always after the
> same number of executions until your suggestion - thanks for pointing me in 
> that direction).
> 
> I don't see any custom configuration on our end that changes the threshold 
> for this from 5->10.
> Debugging the query call I also see that PgConnection has the 
> prepareThreshold set to 5.
> 
> Additionally, the execution plans for the 10th + following queries look fine, 
> they have the
> same structure as if I run the query manually. It's not that the query plan 
> switches,
> it seems as though the same query plan is just > 200X slower than usual.
> 
> As for the heap fetches -> as far as I can tell, on both occasions the 
> fetches are relatively
> low and shouldn't account for minutes of execution (even if one is lower than 
> the other).
> Looking through one days logs I do find cases with lower heap fetches too, 
> for example as
> below which has 1977 fetches instead of the previous 6940 but took approx the 
> same time:
> ->  Index Only Scan using table1_typea_include_uniqueid_col16_idx on table1 
> table1alias1  (cost=0.56..17.25 rows=1 width=60) (actual 
> time=56.858..120893.874 rows=67000 loops=1)
> Index Cond: (col20 = $2005)
> Filter: (((col3 = $2004) OR (col3 IS NULL)) AND ((col8)::text = ANY 
> ((ARRAY[$1004, ..., $2003])::text[])))
> Rows Removed by Filter: 2662793
> Heap Fetches: 1977
> Buffers: shared hit=84574 read=3522
> 
> Would you agree the statement threshold / heap fetches seems unlikely to be 
> causing this? Any other thoughts?

It does sound suspiciously like custom plans vs. generic plan.

If you are using JDBC, then the cut-off of 10 would make sense:
the JDBC driver uses (server) prepared statements only after the
fifth execution, and the prepared statement will use a generic plan
only after the fifth execution.

It would be good to see the execution plan from the third, seventh
and thirteenth execution.  You could use "auto_explain" for that.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: NUMA settings

2020-05-05 Thread Laurenz Albe
On Tue, 2020-05-05 at 07:56 +0200, Marc Rechté wrote:
> Thanks for answers. Further readings make me think that we should *not* 
> start postgres with numactl --interleave=all: this may have counter 
> productive effect on backends anon memory (heap, stack). IMHO, what is 
> important is to use Huge Pages for shared buffers: they are allocated 
> (reserved) by the kernel at boot time and spread evenly on all nodes. On 
> top of that they never swap.
> 
> My (temp) conclusions are following:
> vm.zone_reclaim_mode = 0
> kernel.numa_balancing = 0 (still not sure with that choice)
> wm.swappiness = 60 (default)
> start postgres as usual (no numactl)

Thanks for sharing your insights.

I think that "vm.swappiness" should be 0.
PostgreSQL does its own memory management, any swapping by the kernel
would go against that.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: NUMA settings

2020-04-29 Thread Laurenz Albe
On Wed, 2020-04-29 at 08:54 +0200, Marc Rechté wrote:
> I am trying to figure out the recommended settings for a PG dedicated 
> machine regarding NUMA.
> 
> I assume that the shared buffers are using Huge Phages only. Please 
> correct if I am wrong:
> 
> 1) postgres is started with numactl --interleave=all, in order to spread 
> memory pages evenly on nodes.
> 2) wm.swappiness is left to the default 60 value, because Huge Pages 
> never swap, and we wish the idle backend to be swapped out if necessary.
> 3) vm.zone_reclaim_mode = 0. I am not sure it is the right choice.
> 4) kernel.numa_balancing = 1. Only if it is confirmed that it will not 
> affect postgres, because started with the interleave policy.

I am not the top expert on this, but as far as I can tell:

- Disabling NUMA is good if you want to run a single database cluster
  on the machine that should use all resources.

  If you want to run several clusters that share the resources, leaving
  NUMA support enabled might be the better thing to do.

- If you can, disable NUMA in the BIOS, on as low a level as possible.

- I think "kernel.numa_balancing" should be 0.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: Postgres not using index on views

2020-04-06 Thread Laurenz Albe
On Mon, 2020-04-06 at 14:19 +, Rick Vincent wrote:
> I am seeing a performance problem with postgresql v 11.7 on views, and I am 
> wondering if
> anyone can tell me why or has any suggestion.

Your account is somewhat confused - too many questions rolled into one
rant, I would say.

There are two points that may clear up the case:

- If you have no WHERE clause, a sequential scan of the table is usually
  the best way to do it.  The exception is an index-only scan if the index
  contains all that is required, but in PostgreSQL you need a recently
  VACUUMed table for that (see the sketch below).

- The expensive part in your view is the "extractValueJS" function.
  Try to tune that for better performance.
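
To make the first point concrete, with invented names (a covering index plus
a reasonably fresh VACUUM is what makes an index-only scan possible):

  CREATE INDEX ON t (k) INCLUDE (v);  -- covers queries that read only k and v
  VACUUM (ANALYZE) t;                 -- keeps the visibility map current

Whether that applies here depends on which columns the view actually
references; it is only a sketch.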

If any of your problems are not explained by that, please say so.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: Best way to delete big amount of records from big table

2020-03-27 Thread Laurenz Albe
On Fri, 2020-03-27 at 16:15 +0100, Ekaterina Amez wrote:
> > You should define primary and foreign keys if you can, but I guess
> > I don't have to tell you that.
> 
> Excuse me if this is a silly question but I've read (or understood) that it's 
> better
> to remove constraints to improve delete performance... this is related to 
> indexes only? or also to PK-FK?

I meant, add constraints *after* you are done cleaning up.
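
Once you are done, that would look something like this (the column names are
invented, since only the table structure was described):

  ALTER TABLE master_history ADD PRIMARY KEY (master_hist_id);
  ALTER TABLE detail_history ADD PRIMARY KEY (detail_hist_id);
  ALTER TABLE detail_history
     ADD FOREIGN KEY (master_hist_id) REFERENCES master_history (master_hist_id);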

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: Best way to delete big amount of records from big table

2020-03-27 Thread Laurenz Albe
On Fri, 2020-03-27 at 15:13 +0100, Ekaterina Amez wrote:
> I'm trying to clean up a database with millions of records of 
> useless-but-don't-remove-just-in-case data. [...]
> 
> But also I'm cleaning tables with 150million records where I'm going to 
> remove 60% of existing data and after a few tests I'm not sure what's 
> the best approach as all seem to take similar time to run. These tables 
> are grouped in 4 tables group with master, detail, master_history, 
> detail_history structure. None of the tables have primary key nor 
> foreign key or any constraint but the sequence used for what should be 
> the PK column, though this column is not defined as PK.

You should define primary and foreign keys if you can, but I guess
I don't have to tell you that.

> I've decided to delete from the last one in chunks (10 days of data per 
> chunk but it coud be any other quantity) so I've created a function.  
> I've tested it with indexes (in master_hist for filtering data and in 
> detail_hist for the fk and pk), without indexes, after analyzing table, 
> and no matter what I always end up with more or less the same execution 
> time. I can afford the time it's getting to run but I'd like to know if 
> it's there a better way to do this.

There is no need to delete in batches unless you have a need to keep
transactions short (danger of deadlock because the data are still being
modified, or because you cannot afford to block autovacuum for that long).

If you can drop the indexes while you do it (downtime), go for it.
Perhaps there is a way to use partial indexes that exclude all the
data that you have to delete, then work could go on as normal.
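
Something along these lines, with invented names (the WHERE clause has to
match whatever criterion decides which rows survive the cleanup):

  CREATE INDEX detail_hist_keep_idx
     ON detail_history (master_hist_id)
     WHERE log_date >= DATE '2015-01-01';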

> I'm testing on version 9.2 BUT 
> production server is 8.4 (legacy application, supposed to be in at least 
> 9.2 but recently discovered it was 8.4, planning upgrade but not now). 
> Config parameters are default ones.

Now that is a seriously bad idea.  You should test on the same version
that you have running in production.  And you should insist on an upgrade.
People who insist on running ancient software often insist on ancient
hardware as well, and both are a good way to get data corruption.
If the system blows up, they are going to blame you.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: LDAP with TLS is taking more time in Postgresql 11.5

2020-02-25 Thread Laurenz Albe
On Tue, 2020-02-25 at 01:20 +0530, Mani Sankar wrote:
> We have recently upgraded our postgres servers from version 9.4 to version 
> 11.5. Post upgrade, we are seeing a delay in authentication. 
> 
> Issue is when we are using ldaptls=1 the authentication takes 1 second or 
> greater than that. But if I disable ldaptls it's getting authenticated within 
> milliseconds.
> 
> But in 9.4, even if I enable ldaptls, it gets authenticated within 
> milliseconds. Any idea why we are facing this issue?

I would use a packet sniffer like Wireshark to examine the message flow and see 
where the time is spent.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: tablespace to benefit from ssd ?

2020-02-19 Thread Laurenz Albe
On Wed, 2020-02-19 at 05:42 +0100, Nicolas PARIS wrote:
> I have both hdd and ssd disk on the postgres server. The cluster is
> right now created on the hdd only. I am considering using a tablespace
> to put some highly used postgres object on the ssd disk. Of course the
> ssd is small compared to the hdd, and I need to choose carefully what
> objects are stored on that side.
> 
> I am wondering what kind of object (indexes, data) would benefit from
> ssd. The database primary/foreign keys are highly used and there is
> almost no sequential scan. However the server has a large amount of ram
> memory and I suspect all of those indexes are already cached in ram.
> 
> I have read that tablespaces introduce overhead of maintenance and
> introduce complication for replication. But on the other hand I have
> this ssd disk ready for something.
> 
> Any recommendation?

Put "pg_stat_statements" into "shared_preload_libraries" and restart the server.

Set "track_io_timing" to on.

Let your workload run for at least a day.

Install the "pg_stat_statements" extension and run

  SELECT blk_read_time, query
  FROM pg_stat_statements
  ORDER BY blk_read_time DESC LIMIT 20;

That will give you the 20 queries that spent the most time on read I/O.

Examine those queries with EXPLAIN (ANALYZE, BUFFERS) and see which tables or
indexes cause the I/O.

Then you have a list of candidates for the fast tablespace.
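
From there, moving a candidate to the SSD is straightforward (path and names
are made up here):

  CREATE TABLESPACE fast_ssd LOCATION '/ssd/pg_tblspc';
  ALTER INDEX big_table_pkey SET TABLESPACE fast_ssd;
  ALTER TABLE big_table SET TABLESPACE fast_ssd;   -- moving a whole table works too

Keep in mind the replication caveat you mentioned: the tablespace directory
has to exist on any standby servers as well.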

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: SubtransControlLock and performance problems

2020-02-18 Thread Laurenz Albe
On Mon, 2020-02-17 at 19:41 +0100, Pavel Stehule wrote:
> I tested 
> 
> CREATE OR REPLACE FUNCTION public.fx(integer)
>  RETURNS void
>  LANGUAGE plpgsql
> AS $function$
> begin
>   for i in 1..$1 loop
>   begin
> insert into foo values(i);
> exception when others then
>   raise notice 'yyy';
> end;
> end loop;
> end;
> $function$
> 
> and I don't see any significant difference between numbers less than 64 and 
> higher

Did you have several concurrent sessions accessing the rows that others created?

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: SubtransControlLock and performance problems

2020-02-17 Thread Laurenz Albe
On Mon, 2020-02-17 at 15:03 +, Lars Aksel Opsahl wrote:
> I have tested in branch ( 
> https://github.com/larsop/resolve-overlap-and-gap/tree/add_postgis_topology_using_func)
> where I only have functions and no procedures and I still have the same 
> problem with subtransaction locks. 
> 
> Can I based on this assume that the problem is only related to exceptions  ?

No, it is related to BEGIN ... EXCEPTION ... END blocks, no matter if
an exception is thrown or not.

As soon as execution enters such a block, a subtransaction is started.

> Does this mean that if have 32 threads running in parallel and I get 2 
> exceptions in each thread I have reached a state where I will get contention ?

No, it means that if you enter a block with an EXCEPTION clause more
than 64 times in a single transaction, performance will drop.

> Is there any way to increase from 64 to a much higher level when compiling 
> the code?

Yes, you can increase PGPROC_MAX_CACHED_SUBXIDS in src/include/storage/proc.h

> Basically what I do here is that I catch exceptions when I get them and try 
> to solve the problem in an alternative way.

Either use shorter transactions, or start fewer subtransactions.
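
If, for example, an exception handler only exists to swallow duplicate-key
errors (a common pattern), it can be replaced without any subtransaction at
all - a sketch that may or may not match your PostGIS code:

  -- instead of:  BEGIN INSERT ... EXCEPTION WHEN unique_violation THEN ... END;
  INSERT INTO foo VALUES (i) ON CONFLICT DO NOTHING;

Where the EXCEPTION block is really needed, handling several rows per block
at least reduces the number of subtransactions, at the price of retrying a
whole batch when one row fails.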

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: SubtransControlLock and performance problems

2020-02-17 Thread Laurenz Albe
Lars Aksel Opsahl wrote:
> What happens is that after some minutes the CPU can fall to maybe 20% usage 
> and most of
> the threads are blocked by SubtransControlLock, and when the number 
> SubtransControlLock
> goes down the CPU load increases again. The jobs usually go through without 
> any errors,
> but they take too long because of the SubtransControlLock blocks.

That's typically a sign that you are using more than 64 subtransactions per 
transaction.

That could either be SAVEPOINT SQL statements or PL/pgSQL code with blocks
containing the EXCEPTION clause.

The data structure in shared memory that holds information for each session
can cache 64 subtransactions; beyond that, it has to access "pg_subtrans" to get
the required information, which leads to contention.

Often the problem is caused by a misguided attempt to wrap every single
statement in a subtransaction to emulate the behavior of other database
systems, for example with the "autosave = always" option of the JDBC driver.

The solution is to use fewer subtransactions per transaction.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: Partial index creation always scans the entire table

2020-02-15 Thread Laurenz Albe
On Sat, 2020-02-15 at 19:04 +0800, MingJu Wu wrote:
> When creating partial indexes, can postgres utilize another index for 
> figuring which rows
> should be included in the partial index, without performing a full table scan?

No; it has to be a full sequential scan.

> My scenario is that I have a table with 50M rows that are categorized into 
> 10K categories.
> I need to create a partial index for each category. I have created a index on 
> the category
> column, hoping that postgres can use this information when creating the 
> partial indexes.
> However, postgres always performs full table scan.

There is your problem.

You don't need a partial index per category, you need a single index that 
*contains* the category.
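
In other words (the names are invented), instead of

  CREATE INDEX ON items (other_col) WHERE category_id = 1;
  -- ... repeated 10000 times, once per category ...

a single index serves all categories:

  CREATE INDEX ON items (category_id, other_col);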

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: Writing 1100 rows per second

2020-02-05 Thread Laurenz Albe
On Wed, 2020-02-05 at 12:03 -0500, Arya F wrote:
> I'm looking to write about 1100 rows per second to tables up to 100 million 
> rows. I'm trying to
> come up with a design that I can do all the writes to a database with no 
> indexes. When having
> indexes the write performance slows down dramatically after the table gets 
> bigger than 30 million rows.
> 
> I was thinking of having a server dedicated for all the writes and have 
> another server for reads
> that has indexes and use logical replication to update the read only server.
> 
> Would that work? Or any recommendations how I can achieve good performance 
> for a lot of writes?

Logical replication wouldn't make a difference, because with many indexes, 
replay of the
inserts would be slow as well, and replication would lag more and more.

No matter what you do, there will be no magic way to have your tables indexed 
and
have fast inserts at the same time.

One idea I can come up with is a table that is partitioned by a column that 
appears
in a selective search condition, but has no indexes, so that you 
always get
away with a sequential scan of a single partition.
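
A rough sketch of that layout, with made-up names and assuming a version
with hash partitioning (v11 or later); "customer_id" stands in for whatever
selective column your searches use:

  CREATE TABLE readings (
     customer_id bigint NOT NULL,
     recorded_at timestamptz NOT NULL,
     payload     jsonb
  ) PARTITION BY HASH (customer_id);

  CREATE TABLE readings_p0 PARTITION OF readings
     FOR VALUES WITH (MODULUS 32, REMAINDER 0);
  -- ... and so on for the remaining 31 partitions

Searches with "customer_id = ..." are then pruned to a single partition, and
inserts stay fast because there are no indexes to maintain.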

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: Query optimization advice for beginners

2020-01-28 Thread Laurenz Albe
On Mon, 2020-01-27 at 13:15 +, Kemal Ortanca wrote:
> There is a query that runs slowly when I look at the logs of the database. 
> When I check the
> resources of the system, there is no problem with the resources, but this query 
> is running slowly.
> There is no "Seq Scan" in the queries, so the tables are already indexed. But 
> I did not
> fully understand if the indexes were made correctly. When I analyze the query 
> result on
> explain.depesz, it seems that the query is taking too long. 
> 
> How should I fix the query below? How should I read the output of 
> explain.depesz? 
> 
> https://explain.depesz.com/s/G4vq

Normally you focus on where the time is spent and the mis-estimates.

The mis-estimates are notable, but this time they are not the reason for a
wrong choice of join strategy: even though there are overestimates,
a nested loop join is chosen.

The time is spent in the 16979 executions of the outer subquery,
particularly in the inner subquery.

Because the query uses correlated subqueries, PostgreSQL has to execute
these conditions in the fashion of a nested loop, that is, the subquery
is executed for every row found.

If you manage to rewrite the query so that it uses (outer) joins instead
of correlated subqueries, the optimizer can use different strategies
that may be more efficient.
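
Schematically, such a rewrite looks like this (the names are placeholders,
since the real query is only visible on explain.depesz):

  -- correlated form: the subquery runs once per outer row
  SELECT o.id,
         (SELECT max(i.amount) FROM item i WHERE i.order_id = o.id)
  FROM ord o;

  -- join form: the aggregate is computed once, then joined
  SELECT o.id, i.max_amount
  FROM ord o
     LEFT JOIN (SELECT order_id, max(amount) AS max_amount
                FROM item
                GROUP BY order_id) i ON i.order_id = o.id;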

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: Bad query plan decision when using multiple column index - postgresql uses only first column then filters

2020-01-16 Thread Laurenz Albe
On Thu, 2020-01-16 at 19:18 +0200, Cosmin Prund wrote:
> Indexes:
> "PK_LucrareBugetDate" PRIMARY KEY, btree ("LucrareBugetVersiuneId", 
> "LucrareBugetDateId")
> "IX_LucrareBugetDate_LucrareBugetVersiuneId_LucrareBugetDateId" btree 
> ("LucrareBugetVersiuneId", "LucrareBugetDateId")
> Foreign-key constraints:
> "FK_LucrareBugetDate_LucrareBugetVersiune_LucrareBugetVersiuneId" FOREIGN 
> KEY ("LucrareBugetVersiuneId") REFERENCES 
> "LucrareBugetVersiune"("LucrareBugetVersiuneId") ON DELETE CASCADE
> 
> EXPLAIN (ANALYZE, BUFFERS) select "IdRand", "IdColoana", "Valoare" from 
> "LucrareBugetDate" where ("LucrareBugetVersiuneId" = 92) and 
> ("LucrareBugetDateId" in (10,11));
>   
> QUERY PLAN
> ---
>  Index Scan using 
> "IX_LucrareBugetDate_LucrareBugetVersiuneId_LucrareBugetDateId" on 
> "LucrareBugetDate"  (cost=0.56..2.37 rows=1 width=13) (actual 
> time=0.096..978.398 rows=2 loops=1)
>Index Cond: ("LucrareBugetVersiuneId" = 92)
>Filter: ("LucrareBugetDateId" = ANY ('{10,11}'::integer[]))
>Rows Removed by Filter: 1869178
>Buffers: shared hit=161178
>  Planning time: 0.699 ms
>  Execution time: 978.433 ms

Well, what should the poor thing do?
There is no index on "LucrareBugetDateId".

Rather, you have two indexes on ("LucrareBugetVersiuneId", 
"LucrareBugetDateId"),
one of which should be dropped.

Try with an index on "LucrareBugetDateId".
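
Concretely (CREATE INDEX CONCURRENTLY may be preferable on a busy system):

  DROP INDEX "IX_LucrareBugetDate_LucrareBugetVersiuneId_LucrareBugetDateId";
  CREATE INDEX ON "LucrareBugetDate" ("LucrareBugetDateId");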

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: How to set parallel_tuple_cost

2019-12-20 Thread Laurenz Albe
On Fri, 2019-12-20 at 13:03 -0500, Jeff Janes wrote:
> The docs for parallel_tuple_cost are quite terse, as the reference section of 
> the docs usually are:
> 
> "Sets the planner's estimate of the cost of transferring one tuple from a 
> parallel worker process to another process. The default is 0.1."
> 
> Usually you can find more extensive discussion of such settings in informal 
> resources like blog posts or mailing lists,
> but Googling the name I don't find much for this setting.  Is there good 
> information out there somewhere?
> 
> If you were to take the doc description literally, then the default value seems 
> much too high, as it doesn't take
> 10x the (default) cpu_tuple_cost to transfer a tuple up from a parallel 
> worker.  On the other hand, you probably
> don't want a query which consumes 8x the CPU resources just to finish only 5% 
> faster (on an otherwise idle server with 8 CPUs).
> Maybe this Amdahl factor is what inspired the high default value?

Hmm.  The parameter was introduced into the discussion here:
https://www.postgresql.org/message-id/CAA4eK1L0dk9D3hARoAb84v2pGvUw4B5YoS4x18ORQREwR%2B1VCg%40mail.gmail.com
and while the name was changed from "cpu_tuple_comm_cost" to 
"parallel_tuple_cost"
later, the default value seems not to have been the subject of discussion.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: Consecutive Query Executions with Increasing Execution Time

2019-12-18 Thread Laurenz Albe
On Tue, 2019-12-17 at 11:11 -0500, Jeff Janes wrote:
> On Tue, Dec 17, 2019 at 8:08 AM Laurenz Albe  wrote:
> > On Mon, 2019-12-16 at 15:50 -0500, Tom Lane wrote:
> > > Peter Geoghegan  writes:
> > > > Why do the first and the twentieth executions of the query have almost
> > > > identical "buffers shared/read" numbers? That seems odd.
> > > 
> > > It's repeat execution of the same query, so that doesn't seem odd to me.
> > 
> > Really?  Shouldn't the blocks be in shared buffers after a couple
> > of executions?
> 
> If it is doing a seq scan (I don't know if it is) they intentionally use a
> small ring buffer, so they evict their own recently used blocks, rather
> than evicting other people's blocks.  So these blocks won't build up in
> shared_buffers very rapidly just on the basis of repeated seq scans.

Sure, but according to the execution plans it is doing a Parallel Index Only 
Scan.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com




