Re: [GENERAL] time series data
On Mon, Oct 2, 2017 at 10:27 AM, Khalil Khamlichi wrote: > we have records like this > > ccdb1=# select user_name, agent_status, event_time from cc_events ; > > user_name | agent_status | event_time > ---+--+- > user1 | ready| 2017-01-01 10:00:00 > user1 | talking | 2017-01-01 10:02:00 > user1 | after_call | 2017-01-01 10:08:00 > user1 | ready| 2017-01-01 10:10:00 > user1 | talking | 2017-01-01 10:12:00 > user1 | after_call | 2017-01-01 10:15:00 > user1 | paused | 2017-01-01 10:17:00 > user1 | ready| 2017-01-01 10:25:00 > user1 | talking | 2017-01-01 10:26:00 > (9 rows) > > > so user1 was READY at 2017-01-01 10:00:00 then he received a call > that he attended at 2017-01-01 10:02:00 and so on ... > so user1 was ready for 2 minutes, then he was talking for 6 minutes > then he was in after_call (doing after call work) for 2 minutes and > this is the kind of information we want to query. > > my solution so far that I came with, is in my table I have 1 more > field : end_time > so when an event comes in and before the insert I do : > update cc_events set end_time = current_timestamp where user_name = > 'user_of_event' and end_time is null; > > then I insert new event leaving the end_time as null so that next > event will update it and so on. > > its working fine, I have the start and end times for each event, its > not too painful to query (sum(end-start) while grouping by user_name, > agent_status), but its one more update on the table and also limited > in what you can query about, > > I know this must be a common problem in every software that deals with > events, so I suppose something is already built-in in postgres to deal > with it. Khalil, changing your schema is one solution with certain benefits - but it's definitely not necessary when you have the power of PostgreSQL at your fingertips. You can solve your problem without changing anything at all. :) All you need is a window function: https://www.postgresql.org/docs/9.6/static/tutorial-window.html Here's an example which I just now tested on 9.6.3, 9.5.7, 9.4.12 and 9.3.17 (all the latest minors currently supported on RDS). You can try these queries on your own system; they should work anywhere. First, I added a second user to your data set to make sure we were handling that case correctly. == create table cc_events (user_name varchar(8), agent_status varchar(20), event_time timestamp); insert into cc_events values ('user1', 'ready', '2017-01-01 10:00:00'), ('user1', 'talking', '2017-01-01 10:02:00'), ('user2', 'ready', '2017-01-01 10:04:00'), ('user2', 'talking', '2017-01-01 10:05:00'), ('user1', 'after_call', '2017-01-01 10:07:00'), ('user1', 'ready', '2017-01-01 10:08:00'), ('user1', 'talking', '2017-01-01 10:10:00'), ('user1', 'after_call', '2017-01-01 10:15:00'), ('user2', 'after_call', '2017-01-01 10:18:00'), ('user1', 'paused', '2017-01-01 10:20:00'), ('user2', 'paused', '2017-01-01 10:21:00'); select * from cc_events order by user_name, event_time; == here's a basic window function in action: == select user_name, agent_status, event_time, lead(event_time) over (partition by user_name order by event_time) next_event_time from cc_events order by event_time; user_name | agent_status | event_time | next_event_time ---+--+-+- user1 | ready| 2017-01-01 10:00:00 | 2017-01-01 10:02:00 user1 | talking | 2017-01-01 10:02:00 | 2017-01-01 10:07:00 user2 | ready| 2017-01-01 10:04:00 | 2017-01-01 10:05:00 user2 | talking | 2017-01-01 10:05:00 | 2017-01-01 10:18:00 user1 | after_call | 2017-01-01 10:07:00 | 2017-01-01 10:08:00 user1 | ready| 2017-01-01 10:08:00 | 2017-01-01 10:10:00 user1 | talking | 2017-01-01 10:10:00 | 2017-01-01 10:15:00 user1 | after_call | 2017-01-01 10:15:00 | 2017-01-01 10:20:00 user2 | after_call | 2017-01-01 10:18:00 | 2017-01-01 10:21:00 user1 | paused | 2017-01-01 10:20:00 | user2 | paused | 2017-01-01 10:21:00 | == and now we just add one more column which does the subtraction to calculate the duration: == select user_name, agent_status, event_time, lead(event_time) over (partition by user_name order by event_time) next_event_time, (lead(event_time) over (partition by user_name order by event_time)) - event_time as duration from cc_events order by event_time; user_name | agent_status | event_time | next_event_time | duration ---+--+-+-+-- user1 | ready| 2017-01-01 10:00:00 | 2017-01-01 10:02:00 | 00:02:00 user1 | talking | 2017-01-01 10:02:00 | 2017-01-01 10:07:00 | 00:05:00 user2 | ready| 2017-01-01 10:04:00 | 2017-01-01 10:05:00 |
Re: [GENERAL] BDR, wal sender, high system cpu, mutex_lock_common
Hi Craig, Anyway, this OS is guess OS in vmware (vsphere). Thank for your response and help. On Tue, Oct 3, 2017 at 8:49 PM, Craig Ringer wrote: > > > Can you get stacks please? > > Use -g > # Events: 2K cpu-clock # # Overhead Command Shared ObjectSymbol # . # 86.96% postgres [kernel.kallsyms] [k] __mutex_lock_common.isra.5 | --- __mutex_lock_common.isra.5 read 2.85% postgres [kernel.kallsyms] [k] do_raw_spin_lock | --- do_raw_spin_lock | |--90.48%-- read | |--8.33%-- recv | --1.19%-- write 2.44% postgres [kernel.kallsyms] [k] mutex_unlock | --- mutex_unlock read 2.03% postgres [kernel.kallsyms] [k] arch_local_irq_restore | --- arch_local_irq_restore read 1.32% postgres postgres [.] ValidXLogRecord | --- ValidXLogRecord 1.25% postgres [kernel.kallsyms] [k] mutex_lock | --- mutex_lock read -- regards ujang jaenudin | DBA Consultant (Freelancer) http://ora62.wordpress.com http://id.linkedin.com/pub/ujang-jaenudin/12/64/bab
Re: [GENERAL] BDR, wal sender, high system cpu, mutex_lock_common
On 3 October 2017 at 19:45, milist ujang wrote: > Hi all, > > I've an environment 9.4 + bdr: > PostgreSQL 9.4.4 > You're on a pretty old postgres-bdr. Update. You're missing a lot of fixes from mainline. > This is consolidation databases, in this machine there are around 250+ wal > sender processes. > Not a great use case for BDR. Consider pglogical. > > finally get which processes (wal senders) that are using mutexes: > > perf top -e task-clock -p 55382 > > Can you get stacks please? Use -g -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
[GENERAL] BDR, wal sender, high system cpu, mutex_lock_common
Hi all, I've an environment 9.4 + bdr: PostgreSQL 9.4.4 on x86_64-unknown-linux-gnu, compiled by gcc (Debian 4.7.2-5) 4.7.2, 64-bit kernel version: 3.2.0-4-amd64 #1 SMP Debian 3.2.65-1 x86_64 GNU/Linux This is consolidation databases, in this machine there are around 250+ wal sender processes. top output revealed high system cpu: %Cpu(s): 1.4 us, 49.7 sy, 0.0 ni, 48.8 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st profiling cpu with perf: perf top -e cpu-clock Events: 142K cpu-clock 82.37% [kernel][k] __mutex_lock_common.isra.5 4.49% [kernel][k] do_raw_spin_lock 2.23% [kernel][k] mutex_lock 2.16% [kernel][k] mutex_unlock 2.12% [kernel][k] arch_local_irq_restore 1.73% postgres[.] ValidXLogRecord 0.87% [kernel][k] __mutex_unlock_slowpath 0.78% [kernel][k] arch_local_irq_enable 0.63% [kernel][k] sys_recvfrom finally get which processes (wal senders) that are using mutexes: perf top -e task-clock -p 55382 Events: 697 task-clock 88.08% [kernel] [k] __mutex_lock_common.isra.5 3.27% [kernel] [k] do_raw_spin_lock 2.34% [kernel] [k] arch_local_irq_restore 2.10% postgres [.] ValidXLogRecord 1.87% [kernel] [k] mutex_unlock 1.87% [kernel] [k] mutex_lock 0.47% [kernel] [k] sys_recvfrom strace output of wal sender process: % time seconds usecs/call callserrors syscall -- --- --- - - 98.301.030072 5213063201463 read 1.690.017686 0201464201464 recvfrom 0.010.000110 0 806 lseek 0.000.43 0 474 468 rt_sigreturn 0.000.00 0 6 open 0.000.00 0 6 close -- --- --- - - 100.001.047911415819403395 total strace detail, majority read from pipe and recvfrom from socket (but most of them are EAGAIN): read(15, "~\320\5\0\1\0\0\0\0@\235\1\360\16\0\0\334\26\0\0\0\0\0\0\365\27\0\0\0\0\0\0"..., 8192) = 8192 <0.25> read(6, 0x7fffdd837b3f, 1) = -1 EAGAIN (Resource temporarily unavailable) <0.000116> recvfrom(10, 0x7fffdd837b17, 1, 0, 0, 0) = -1 EAGAIN (Resource temporarily unavailable) <0.49> ls -l /proc/62388/fd/15 lr-x-- 1 postgres postgres 64 Oct 1 08:39 /proc/62388/fd/15 -> /data/pg_xlog/00010EF00061 ls -l /proc/62388/fd/6 lr-x-- 1 postgres postgres 64 Oct 1 08:39 /proc/62388/fd/6 -> pipe:[1090892506] ls -l /proc/62388/fd/10 lrwx-- 1 postgres postgres 64 Oct 1 08:39 /proc/62388/fd/10 -> socket:[1096584060] I wonder, is there kernel version has better handling mutexes? Or is it the expected behavior? Sorry for cross-posting, I have posted the same on pgsql-performance too -- regards ujang jaenudin | DBA Consultant (Freelancer) http://ora62.wordpress.com http://id.linkedin.com/pub/ujang-jaenudin/12/64/bab
Re: [GENERAL] Checkpoint write time - anything unusual?
pinker wrote: > I've just run pgBadger on my pg logs and wonder if those checkpoint > statistics is something I should worry about or not? > The highest write time is about 47 minutes but I'm not sure if that's > checkpoint_completion_target*checkpoint_target value or real time between > sending the command to write and getting response? > If that's the first one - then I assume everything is ok because: > checkpoint_timeout = 8min > checkpoint_completion_target = 0.9 > > but I'm not sure which value it is... I think that is fine. if checkpoint_completion_target is 0.9, spending 0.9 * 3600 = 3240 seconds per hour doing checkpoints would be normal. The whole point of this parameter is to spread checkpoints across a longer time to avoid I/O spikes. Yours, Laurenz Albe -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general