Hi all,
I've collected some interesting results in my experiments that I can't
explain, and I'd appreciate your help.
I'm running PostgreSQL 9.0 on a quad-core machine with a two-level on-chip
cache hierarchy. PostgreSQL has a large, warmed-up buffer cache, so no disk
I/O occurs during the experiments (i.e. the buffer cache hit rate is 100%
for every query). I pin each query's backend process to its own core. The
queries are simple read-only SELECTs, and the join operator is a nested
loop (without materialize).
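In case the exact mechanism matters: the pinning and timing is done roughly
along the lines of the sketch below (simplified; the query text, connection
string, and core numbers are placeholders, and taskset on the backend PID
would do the same job as sched_setaffinity).

    # Sketch: run N copies of the same read-only query concurrently,
    # each backend pinned to its own core, and time every run (Linux only).
    import os, time
    from multiprocessing import Process
    import psycopg2

    QUERY = "SELECT count(*) FROM t1 JOIN t2 ON t1.id = t2.t1_id"  # placeholder

    def run_pinned(core):
        conn = psycopg2.connect("dbname=test")       # placeholder DSN
        cur = conn.cursor()
        cur.execute("SELECT pg_backend_pid()")
        backend_pid = cur.fetchone()[0]
        # Pin the server backend and this client to the same core
        # (needs permission on the backend's PID, e.g. run as the postgres user).
        os.sched_setaffinity(backend_pid, {core})
        os.sched_setaffinity(0, {core})
        start = time.time()
        cur.execute(QUERY)
        cur.fetchall()
        print("core %d: %.1f s" % (core, time.time() - start))
        conn.close()

    if __name__ == "__main__":
        n_instances = 4                              # 1..4 in my runs
        procs = [Process(target=run_pinned, args=(c,))
                 for c in range(n_instances)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()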
When I pin a single query to one core, its execution time is 111 seconds;
this is my base case. When I then fire two instances of the same query
concurrently, each pinned to a different core, each instance takes 132
seconds. The trend continues with three instances (164 seconds each) and
four instances (201 seconds each). What I was expecting was (at least) a
near-linear improvement in throughput. I tried several different queries
and saw the same trend every time.
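To make the scaling explicit, here is the throughput implied by those
timings relative to the single-instance case:

    # Throughput implied by the measured per-instance execution times.
    times = {1: 111, 2: 132, 3: 164, 4: 201}   # instances -> seconds/query
    base = 1.0 / times[1]                      # single-instance queries/s
    for n in sorted(times):
        qps = n / float(times[n])
        print("%d instance(s): %.4f q/s = %.2fx base (ideal %dx)"
              % (n, qps, qps / base, n))
    # Prints 1.00x, 1.68x, 2.03x and 2.21x instead of 1x, 2x, 3x, 4x.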
I'm wondering why the execution time of each individual query increases as
I add more concurrent instances of it.
Btw, I don't think on-chip cache hit/miss rates explain it, since L2 cache
misses decrease as expected. I'm not an expert in PostgreSQL internals;
maybe there is lock contention (spinlocks?) even though the queries are
read-only. Anyway, all ideas are welcome.
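For what it's worth, the sketch below is roughly how I would watch for
heavyweight lock waits from a separate session while the instances run
(pg_stat_activity in 9.0 still has the 'waiting' and 'current_query'
columns). Spinlock or LWLock contention would not show up there, so I
suppose profiling the backends would be needed for that.

    # Poll pg_stat_activity for backends blocked on heavyweight locks.
    import time
    import psycopg2

    conn = psycopg2.connect("dbname=test")   # placeholder DSN
    conn.autocommit = True                   # fresh stats snapshot per query
    cur = conn.cursor()
    for _ in range(30):                      # poll for ~30 seconds
        cur.execute("SELECT procpid, current_query FROM pg_stat_activity "
                    "WHERE waiting AND current_query NOT LIKE '<IDLE>%'")
        for pid, query in cur.fetchall():
            print("backend %d waiting on a lock: %s" % (pid, query))
        time.sleep(1)
    conn.close()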
Thanks in advance,
Regards,
Umut