Re: proposal: Allocate work_mem From Pool

2022-07-12 Thread Joseph D Wagner

Before I try to answer that, I need to know how the scheduler works.



As I understand the term used, there is no scheduler inside Postgres
for user connections -- they're handled by the OS kernel.


Then, I'm probably using the wrong term. Right now, I have
max_worker_processes set to 16. What happens when query #17
wants some work done? What do you call the thing that handles
that? What is its algorithm for allocating work to the processes?
Or am I completely misunderstanding the role worker processes
play in execution?
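For example, if all 16 worker processes are busy, would query #17 end
up executing something like this (made-up, heavily trimmed EXPLAIN
ANALYZE output, just to check my understanding)?

  Gather  (actual time=... rows=... loops=1)
    Workers Planned: 4
    Workers Launched: 0
    ->  Parallel Seq Scan on some_big_table (actual time=... rows=... loops=1)

i.e. the query still runs, and the leader backend just does the work
itself when no worker slots are free?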

Joseph Wagner




RE: proposal: Allocate work_mem From Pool

2022-07-12 Thread Joseph D Wagner
>> I think it would be better if work_mem was allocated from a pool
>> of memory

> I think this has been proposed before, and the issue/objection
> with this idea is probably that query plans will be inconsistent,
> and end up being sub-optimal.

> work_mem is considered at planning time, but I think you only
> consider its application at execution time.  A query that was
> planned with the configured work_mem but can't obtain the expected
> amount at execution time might perform poorly.  Maybe it should be
> replanned with a lower work_mem, but that would lose the
> arms-length relationship between planner and executor.

> Should an expensive query wait a bit to try to get more
> work_mem? What do you do if 3 expensive queries are all
> waiting?

Before I try to answer that, I need to know how the scheduler works.

Let's say there's a max of 8 worker processes, and 12 queries trying to run.
When does query #9 run? After the first of 1-8 completes, simple FIFO?
Or something else?

Also, how long does a query hold a worker process?  All the way to
completion?  Or does it perform some unit of work and then rotate to
another query?
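
For concreteness, what I'm picturing is something like this
(hypothetical numbers, just to make the question specific):

  max_worker_processes = 8
  max_parallel_workers = 8
  max_parallel_workers_per_gather = 2

with 12 parallel-capable queries all arriving at about the same time.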

Joseph D Wagner

P.S.  If there's a link to all this somewhere, please let me know.
Parsing through years of email archives is not always user friendly or
helpful.





proposal: Allocate work_mem From Pool

2022-07-10 Thread Joseph D Wagner
I'm new here, so forgive me if this is a bad idea or reflects my lack of
knowledge about how to optimize PostgreSQL.

I find PostgreSQL to be great with a large number of small transactions,
which covers most use cases.  However, my experience has not been so great
on the opposite end -- a small number of large transactions, i.e. Big Data.

I had to increase work_mem to 3GB to stop my queries from spilling to disk.
However, that's risky because it's 3GB per operation, not per
query/connection; it could easily spiral out of control.
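To spell out the risk (a sketch; the real multiplier depends on the
shape of each plan):

  -- set per session today; not a server-wide cap
  SET work_mem = '3GB';
  -- a single plan with, say, two hash joins and a sort can use
  -- roughly 3 x 3GB, and each parallel worker gets its own
  -- work_mem allowance on top of that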

I think it would be better if work_mem was allocated from a pool of memory
as needed and returned to the pool when no longer needed.  The pool could
optionally be allocated from huge pages.  It would give large and mixed
workloads the flexibility to grab more memory as needed without spilling
to disk, while being more deterministic about the maximum amount that
will be used.
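
To make that concrete, I'm imagining something along these lines (the
names are invented, not existing settings):

  # hypothetical GUCs, only to illustrate the shape of the idea
  work_mem_pool_size = 24GB   # cap on all work_mem-style allocations combined
  work_mem = 3GB              # still the per-operation request
  # an operation that can't get its full request from the pool would
  # spill to disk (or receive a smaller grant) rather than overcommit the box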

Thoughts?

Thank you for your time.

Joseph D. Wagner

My specifics (rough postgresql.conf equivalents below):
 - 64 GB box
 - 16 GB shared buffers, although queries only use about 12 GB of that
 - 16 GB effective cache size
 - 2-3 GB used by the OS and apps
 - the rest is available for PostgreSQL queries/connections/whatever as needed
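
In postgresql.conf terms, I believe that corresponds roughly to
(parameter names assumed from the descriptions above):

  shared_buffers = 16GB
  effective_cache_size = 16GB
  work_mem = 3GB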