On 5/5/25 11:00, Mladen Marinović wrote:


On Mon, May 5, 2025 at 11:24 AM Achilleas Mantzios <a.mantz...@cloud.gatewaynet.com> wrote:


    On 5/5/25 09:52, Mladen Marinović wrote:
    Hi,

    We recently migrated our production instances from PG11 to PG17.
    While doing so we upgraded our pgBouncer instances from 1.12 to
    1.24. As everything worked on the test servers we pushed this to
    production a few weeks ago. We did not notice any problems until
    a few days ago (but the problems were here from the start). The
    main manifestation of the problems is a service that runs a fixed
    query to get a backlog of unprocessed data (limited to a 1000
    rows). When testing the query using pgAdmin connected directly to
    the database we get a result in cca. 20 seconds. The same query
    runs for 2 hours when using pgBouncer to connect to the same
    database.


    That's a huge jump, I hope you guys did extensive testing of your
    app. In which language is your app written? If java, then define
    prepareThreshold=0 in your jdbc and set max_prepared_statements =
    0 in pgbouncer.

Mainly python, but the problem was noticed in a java service.
Prepare treshold was already set to 0. We changed the max_prepared_statements to 0 from the default (200) but no change was noticed.

    How about search paths ? any difference on those between the two
    runs ? Do you set search_path in pgbouncer ? what is "cca." btw ?


    The more interesting part is that when we issue an explain of the
    same query we get different plans. We did this a few seconds
    apart so there should be no difference in collected statistics.
    We ruled out prepared statements, as we suspected the generic
    plan might be the problem, but it is not. Is there any pgBouncer
    or PG17 parameter that might be the cause of this?


    Does this spawn any connections (such as dblink) ? are there
    limits per user/db pool_size in pgbouncer ?

No additional connection nor dbling. Just plain SQL (CTE, SELECT, INSERT, UPDATE, DELETE,...) There are limits, but they are not hit. The query just uses a different plan and runs slower because of that.

    Pgbouncer, in contrast to its old friend PgPool-II is completely
    passive, just passes through SQL to the server as fast as possible
    as it can. But I am sure you know that. Good luck, keep us posted!

Yes, that is what puzzles me.

What is the pgbouncer's timeout in the server connections ?

How about "idle in transaction" ? do you get any of those? What's the isolation level ?

How about the user ? is this the same user doing pgadmin queries VS via the app ?

Can you identify the user under which the problem is manifested and :

ALTER user "unlucky_user" SET log_statement = 'all';

ALTER user "unlucky_user" SET log_min_duration_statement = 0; -- to help you debug the prepared statements .. just in case , and other stuff not printed by log_statement = all.

Regards,
Mladen Marinović

Reply via email to