On Thu, Jun 26, 2025 at 2:48 AM James Pang wrote:
> we faced this issue 3 times this week, each time lasting only 2 seconds, so
> it is not easy to run perf during peak business time to capture it; anyway, I will
> try. Before that, I want to understand if the "os page cache" or "pg buffer
> cache" can contribute to the wait_event time "extend" and "DataFileRead"
Thanks for your explanation. From a PostgreSQL perspective, is it possible
to see the bgwriter or checkpointer blocking backend processes from reading/writing,
or vice versa?
Thanks,
James
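
(Not part of the original mail, just a rough sketch of one way to look at this in
SQL: join waiting backends to pg_blocking_pids() and report the backend_type of the
blockers. Note that pg_blocking_pids() only covers heavyweight locks such as the
relation extension lock, not lightweight-lock or I/O waits, so buffer-level
contention from the checkpointer or bgwriter will not show up here; it assumes
PostgreSQL 10+ for the backend_type column.)

    -- list waiting backends together with whoever blocks them
    SELECT w.pid,
           w.wait_event_type,
           w.wait_event,
           b.pid          AS blocking_pid,
           b.backend_type AS blocking_backend_type
    FROM pg_stat_activity AS w
    JOIN LATERAL unnest(pg_blocking_pids(w.pid)) AS blocker(pid) ON true
    JOIN pg_stat_activity AS b ON b.pid = blocker.pid
    WHERE w.wait_event_type = 'Lock';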
Frits Hoogland wrote on Thu, Jun 26, 2025 at 4:07 PM:
> Postgres lives as a process in linux, and keeps its own cache, and tries
Postgres lives as a process in Linux, and keeps its own cache, and tries to use
that as much as possible for data. This is Postgres shared buffers, commonly
called the buffer cache.
For WAL, sessions write to the WAL buffer (separate from the Postgres buffer
cache), and need to write to disk u
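
(For orientation only, not text from the mail: the two buffer pools described above
can be inspected via pg_settings; this is just a lookup of the current sizes using
the standard parameter names.)

    SELECT name, setting, unit
    FROM pg_settings
    WHERE name IN ('shared_buffers', 'wal_buffers');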
in addition to "DataFileRead", we actually have more sessions waiting on
"extend", and we have log_lock_waits enabled, for example:
2025-06-24 18:00:11.368 :[1865315]:[4-1]:mbsLOG: process 1865315 still
waiting for ExclusiveLock on extension of relation 14658239 of database
16384 after 1000.161 ms
2025-06-
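
(An aside, my sketch rather than something from the mail: while this is happening,
the sessions stuck on the relation extension lock can also be listed live from
pg_locks, where that lock type is reported as 'extend'.)

    SELECT l.pid,
           l.relation::regclass      AS relation,
           a.wait_event_type,
           a.wait_event,
           now() - a.query_start     AS query_runtime
    FROM pg_locks l
    JOIN pg_stat_activity a ON a.pid = l.pid
    WHERE l.locktype = 'extend'
      AND NOT l.granted;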
we faced this issue 3 times this week, each time lasting only 2 seconds, so it
is not easy to run perf during peak business time to capture it; anyway, I will
try. Before that, I want to understand if the "os page cache" or "pg buffer
cache" can contribute to the wait_event time "extend" and "DataFileRead",
or
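
(Tangentially, a sketch of mine and not from the mail: whether reads are served
from the Postgres buffer cache at all can be estimated from the per-table hit
ratio in pg_statio_user_tables; the reads that miss it, counted in
heap_blks_read, are the ones that surface as "DataFileRead" and may or may not
be absorbed by the OS page cache.)

    SELECT relname,
           heap_blks_read,
           heap_blks_hit,
           round(100.0 * heap_blks_hit
                 / nullif(heap_blks_hit + heap_blks_read, 0), 2) AS hit_pct
    FROM pg_statio_user_tables
    ORDER BY heap_blks_read DESC
    LIMIT 10;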
Okay. So it's a situation that is reproducible.
And like was mentioned, the system time (percentage) is very high.
Is this a physical machine, or a virtual machine?
The next thing to do is to use perf to record about 20 seconds or so during a
period of time when you see this behavior (perf record -
Thanks. A summary of the issue: no connection storm (fork)
either, just suddenly many sessions waiting on "extend" and "DataFileRead";
it lasts 2 seconds. This server has 64 vCPUs and has been running there a long
time without issue; only last weekend we patched from 14.8 to 14.14. We checked
with Infr
> On Thu, 2025-06-26 at 10:32 +0800, James Pang wrote:
>> thanks for your suggestions, we have iowait from the sar command too, copied here;
>> checking with the infra team, we did not find abnormal IO activity either.
>> 02:00:01 PM     CPU     %usr    %nice     %sys  %iowait     %irq    %soft   %steal   %guest   %gnice    %idle
On Thu, 2025-06-26 at 10:32 +0800, James Pang wrote:
> thanks for your suggestions, we have iowait from the sar command too, copied here;
> checking with the infra team, we did not find abnormal IO activity either.
> 02:00:01 PM     CPU     %usr    %nice     %sys  %iowait     %irq    %soft   %steal   %guest   %gnice    %idle
thanks for your suggestions, we have iowait from the sar command too, copied here;
checking with the infra team, we did not find abnormal IO activity either.
02:00:01 PM     CPU     %usr    %nice     %sys  %iowait     %irq    %soft   %steal   %guest   %gnice    %idle
02:00:03 PM     all    15.92     0.00    43.02     0.65     0.7
> On 25 Jun 2025, at 07:59, Laurenz Albe wrote:
>
> On Wed, 2025-06-25 at 11:15 +0800, James Pang wrote:
>> pgv14, RHEL8, xfs: we suddenly see tens of sessions waiting on "DataFileRead"
>> and "extend"; it lasts about 2 seconds (based on a pg_stat_activity query), during
>> the waiting time
transparent_hugepage=never on our prod servers; %iowait is low (0.x-1.x%),
read/write iops are < 2k, and read/write wait is 0.x ms. We did not find other
abnormal entries in the OS logs either. Yes, we are discussing with our
application team to reduce concurrency. More questions about DataFileRead
and extend
On Wed, 2025-06-25 at 11:15 +0800, James Pang wrote:
> pgv14, RHEL8, xfs: we suddenly see tens of sessions waiting on "DataFileRead"
> and "extend"; it lasts about 2 seconds (based on a pg_stat_activity query), during
> the waiting time, "%sys" cpu increased to 80%, but from "iostat", no high
pgv14, RHEL8, xfs: we suddenly see tens of sessions waiting on
"DataFileRead" and "extend"; it lasts about 2 seconds (based on a
pg_stat_activity query). During the waiting time, "%sys" cpu increased to
80%, but "iostat" showed neither high iops nor increased io read/write
latency.
many sess
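
(For reference only, my sketch and not text from the thread: a pg_stat_activity
query of the kind mentioned above can be as simple as counting active sessions per
wait event; run repeatedly during the 2-second window, it shows how many backends
sit in "DataFileRead" and "extend".)

    SELECT wait_event_type,
           wait_event,
           count(*) AS sessions
    FROM pg_stat_activity
    WHERE state = 'active'
    GROUP BY wait_event_type, wait_event
    ORDER BY sessions DESC;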