On 31.10.2018 22:07, Darafei "Komяpa" Praliaskouski wrote:
Hi,

I've tried porting some of PostGIS algorithms to utilize multiple cores via OpenMP to return faster.

Question is, what's the best policy to allocate cores so we can play nice with rest of postgres?

What I'd like to see is some function that I can call and get a number of threads I'm allowed to run, that will also advise rest of postgres to not use them, and a function to return the cores back (or do it automatically at the end of query). Is there an infrastructure for that?

I do not completely understand which PostGIS algorithms you are going to make parallel.
So maybe you should clarify that first.
There are currently three options for parallel execution of a single query in Postgres:

1. Use the existing Postgres parallel capabilities. For example, if there is some expensive function f() which you want to execute concurrently, then you do not need to do anything: parallel seq scan will do it for you. You can configure an arbitrary number of parallel workers and so control the level of concurrency. The restrictions of the current Postgres parallel query implementation are:
- parallel workers are started for each query
- it is necessary to serialize and pass a lot of things from the coordinator to the parallel workers
- in the case of seqscan, workers will compete for pages to scan, so the effective number of workers should be < 10, while the most powerful modern servers have hundreds of CPU cores.

2. Implement your own parallel plan nodes using the existing Postgres parallel infrastructure. This approach has the best chance of being committed to Postgres core. But the disadvantages are mostly the same as in 1). Exchanging data between different processes is much more complex and expensive than accessing common memory from threads. Most likely you will have to use the shared message queue and dynamic shared memory, which were implemented in Postgres specifically for interaction between parallel workers.

3. Use multithreading to provide concurrent execution of your particular algorithm (spawn threads within the backend). You should be very careful with this approach, because Postgres code is not thread safe. So you should not try to execute any subplan in a thread or call any Postgres functions (unless you are 100% sure that they are thread safe). This approach may be easy to implement and may provide better performance than 1). But please note its limitations. I have used this approach in my IMCS extension (In-Memory-Columnar-Store).
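
To illustrate option 3, here is a minimal sketch of a thread-safe kernel. It is not taken from IMCS or PostGIS: the function name sum_squared_lengths, its signature, and the nthreads parameter are hypothetical. The assumption is that the backend first copies the coordinates into plain C arrays, then calls the kernel, and only afterwards goes back to calling Postgres functions. Inside the OpenMP region there are no palloc, elog or other backend calls.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical kernel: sum of squared lengths of the segments of a polyline.
 * It reads only plain C arrays prepared by the caller, so no Postgres
 * function is called from inside the parallel region.
 */
double
sum_squared_lengths(const double *x, const double *y,
                    size_t npoints, int nthreads)
{
    double total = 0.0;
    long   nseg = (npoints > 1) ? (long) (npoints - 1) : 0;
    long   i;

    assert(npoints == 0 || (x != NULL && y != NULL));

    /* nthreads is the caller's thread budget (an assumption of this sketch) */
    if (nthreads < 1)
        nthreads = 1;

    /* Each thread accumulates a private partial sum; OpenMP combines them. */
    #pragma omp parallel for reduction(+:total) num_threads(nthreads)
    for (i = 0; i < nseg; i++)
    {
        double dx = x[i + 1] - x[i];
        double dy = y[i + 1] - y[i];

        total += dx * dx + dy * dy;
    }

    return total;
}
```

If the file is compiled without OpenMP support, the pragma is ignored and the loop simply runs in a single thread, so the result is the same. How to choose nthreads so that it plays nicely with the rest of Postgres is exactly the open question; this sketch just takes it as a parameter.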

You can look at pg_strom extension as an example of providing parallel query execution (in this case using parallel capabilities of video cards).

--

Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

