Re: [GENERAL] time series data

2017-10-03 Thread Schneider
On Mon, Oct 2, 2017 at 10:27 AM, Khalil Khamlichi
 wrote:
> we have records like this
>
> ccdb1=# select user_name, agent_status, event_time from cc_events ;
>
>  user_name | agent_status | event_time
> ---+--+-
>  user1 | ready| 2017-01-01 10:00:00
>  user1 | talking  | 2017-01-01 10:02:00
>  user1 | after_call   | 2017-01-01 10:08:00
>  user1 | ready| 2017-01-01 10:10:00
>  user1 | talking  | 2017-01-01 10:12:00
>  user1 | after_call   | 2017-01-01 10:15:00
>  user1 | paused   | 2017-01-01 10:17:00
>  user1 | ready| 2017-01-01 10:25:00
>  user1 | talking  | 2017-01-01 10:26:00
> (9 rows)
>
>
> so user1 was READY at  2017-01-01 10:00:00 then he received a call
> that he attended at 2017-01-01 10:02:00 and so on ...
> so user1 was ready for 2 minutes, then he was talking for 6 minutes
> then he was in after_call (doing after call work)  for 2 minutes and
> this is the kind of information we want to query.
>
> my solution so far that I came with, is in my table I have 1 more
> field : end_time
> so when an event comes in and before the insert I do :
> update cc_events set end_time = current_timestamp  where user_name =
> 'user_of_event' and end_time is null;
>
> then I insert new event leaving the end_time as null so that next
> event will update it and so on.
>
> its working fine, I have the start and end times for each event, its
> not too painful to query (sum(end-start) while grouping by user_name,
> agent_status), but its one more update on the table and also limited
> in what you can query about,
>
> I know this must be a common problem in every software that deals with
> events, so I suppose something is already built-in in postgres to deal
> with it.

Khalil, changing your schema is one solution with certain benefits -
but it's definitely not necessary when you have the power of
PostgreSQL at your fingertips.  You can solve your problem without
changing anything at all.  :)

All you need is a window function:
https://www.postgresql.org/docs/9.6/static/tutorial-window.html

Here's an example which I just now tested on 9.6.3, 9.5.7, 9.4.12 and
9.3.17 (all the latest minors currently supported on RDS).  You can
try these queries on your own system; they should work anywhere.

First, I added a second user to your data set to make sure we were
handling that case correctly.

==
create table cc_events (user_name varchar(8), agent_status
varchar(20), event_time timestamp);

insert into cc_events values
('user1', 'ready', '2017-01-01 10:00:00'),
('user1', 'talking', '2017-01-01 10:02:00'),
('user2', 'ready', '2017-01-01 10:04:00'),
('user2', 'talking', '2017-01-01 10:05:00'),
('user1', 'after_call', '2017-01-01 10:07:00'),
('user1', 'ready', '2017-01-01 10:08:00'),
('user1', 'talking', '2017-01-01 10:10:00'),
('user1', 'after_call', '2017-01-01 10:15:00'),
('user2', 'after_call', '2017-01-01 10:18:00'),
('user1', 'paused', '2017-01-01 10:20:00'),
('user2', 'paused', '2017-01-01 10:21:00');

select * from cc_events order by user_name, event_time;

==
here's a basic window function in action:

==
select user_name, agent_status, event_time,
  lead(event_time) over (partition by user_name order by event_time)
next_event_time
from cc_events order by event_time;

 user_name | agent_status | event_time  |   next_event_time
---+--+-+-
 user1 | ready| 2017-01-01 10:00:00 | 2017-01-01 10:02:00
 user1 | talking  | 2017-01-01 10:02:00 | 2017-01-01 10:07:00
 user2 | ready| 2017-01-01 10:04:00 | 2017-01-01 10:05:00
 user2 | talking  | 2017-01-01 10:05:00 | 2017-01-01 10:18:00
 user1 | after_call   | 2017-01-01 10:07:00 | 2017-01-01 10:08:00
 user1 | ready| 2017-01-01 10:08:00 | 2017-01-01 10:10:00
 user1 | talking  | 2017-01-01 10:10:00 | 2017-01-01 10:15:00
 user1 | after_call   | 2017-01-01 10:15:00 | 2017-01-01 10:20:00
 user2 | after_call   | 2017-01-01 10:18:00 | 2017-01-01 10:21:00
 user1 | paused   | 2017-01-01 10:20:00 |
 user2 | paused   | 2017-01-01 10:21:00 |

==
and now we just add one more column which does the subtraction to
calculate the duration:

==
select user_name, agent_status, event_time,
  lead(event_time) over (partition by user_name order by event_time)
next_event_time,
  (lead(event_time) over (partition by user_name order by event_time))
- event_time as duration
from cc_events order by event_time;

user_name | agent_status | event_time  |   next_event_time   | duration
---+--+-+-+--
 user1 | ready| 2017-01-01 10:00:00 | 2017-01-01 10:02:00 | 00:02:00
 user1 | talking  | 2017-01-01 10:02:00 | 2017-01-01 10:07:00 | 00:05:00
 user2 | ready| 2017-01-01 10:04:00 | 2017-01-01 10:05:00 |

Re: [GENERAL] BDR, wal sender, high system cpu, mutex_lock_common

2017-10-03 Thread milist ujang
Hi Craig,

Anyway, this OS is guess OS in vmware (vsphere).
Thank for your response and help.

On Tue, Oct 3, 2017 at 8:49 PM, Craig Ringer  wrote:
>
>
> Can you get stacks please?
>
> Use -g
>

# Events: 2K cpu-clock
#
# Overhead   Command  Shared ObjectSymbol
#     .  
#
86.96%  postgres  [kernel.kallsyms]  [k] __mutex_lock_common.isra.5
|
--- __mutex_lock_common.isra.5
read

 2.85%  postgres  [kernel.kallsyms]  [k] do_raw_spin_lock
|
--- do_raw_spin_lock
   |
   |--90.48%-- read
   |
   |--8.33%-- recv
   |
--1.19%-- write

 2.44%  postgres  [kernel.kallsyms]  [k] mutex_unlock
|
--- mutex_unlock
read

 2.03%  postgres  [kernel.kallsyms]  [k] arch_local_irq_restore
|
--- arch_local_irq_restore
read

 1.32%  postgres  postgres   [.] ValidXLogRecord
|
--- ValidXLogRecord

 1.25%  postgres  [kernel.kallsyms]  [k] mutex_lock
|
--- mutex_lock
read


-- 
regards

ujang jaenudin | DBA Consultant (Freelancer)
http://ora62.wordpress.com
http://id.linkedin.com/pub/ujang-jaenudin/12/64/bab


Re: [GENERAL] BDR, wal sender, high system cpu, mutex_lock_common

2017-10-03 Thread Craig Ringer
On 3 October 2017 at 19:45, milist ujang  wrote:

> Hi all,
>
> I've an environment 9.4 + bdr:
> PostgreSQL 9.4.4
>

You're on a pretty old  postgres-bdr. Update. You're missing a lot of fixes
from mainline.


> This is consolidation databases, in this machine there are around 250+ wal
> sender processes.
>

Not a great use case for BDR.

Consider pglogical.


>
> finally get which processes (wal senders) that are using mutexes:
>
> perf top -e task-clock -p 55382
>
>
Can you get stacks please?

Use -g

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


[GENERAL] BDR, wal sender, high system cpu, mutex_lock_common

2017-10-03 Thread milist ujang
Hi all,

I've an environment 9.4 + bdr:
PostgreSQL 9.4.4 on x86_64-unknown-linux-gnu, compiled by gcc (Debian
4.7.2-5) 4.7.2, 64-bit

kernel version:
3.2.0-4-amd64 #1 SMP Debian 3.2.65-1 x86_64 GNU/Linux

This is consolidation databases, in this machine there are around 250+ wal
sender processes.

top output revealed high system cpu:
%Cpu(s):  1.4 us, 49.7 sy,  0.0 ni, 48.8 id,  0.0 wa,  0.0 hi,  0.0 si,
0.0 st

profiling cpu with perf:

perf top -e cpu-clock

Events: 142K cpu-clock
 82.37%  [kernel][k] __mutex_lock_common.isra.5
  4.49%  [kernel][k] do_raw_spin_lock
  2.23%  [kernel][k] mutex_lock
  2.16%  [kernel][k] mutex_unlock
  2.12%  [kernel][k] arch_local_irq_restore
  1.73%  postgres[.] ValidXLogRecord
  0.87%  [kernel][k] __mutex_unlock_slowpath
  0.78%  [kernel][k] arch_local_irq_enable
  0.63%  [kernel][k] sys_recvfrom


finally get which processes (wal senders) that are using mutexes:

perf top -e task-clock -p 55382

Events: 697  task-clock
 88.08%  [kernel]  [k] __mutex_lock_common.isra.5
  3.27%  [kernel]  [k] do_raw_spin_lock
  2.34%  [kernel]  [k] arch_local_irq_restore
  2.10%  postgres  [.] ValidXLogRecord
  1.87%  [kernel]  [k] mutex_unlock
  1.87%  [kernel]  [k] mutex_lock
  0.47%  [kernel]  [k] sys_recvfrom

strace output of wal sender process:

% time seconds  usecs/call callserrors syscall
-- --- --- - - 
 98.301.030072   5213063201463 read
  1.690.017686   0201464201464 recvfrom
  0.010.000110   0   806   lseek
  0.000.43   0   474   468 rt_sigreturn
  0.000.00   0 6   open
  0.000.00   0 6   close
-- --- --- - - 
100.001.047911415819403395 total


strace detail, majority read from pipe and recvfrom from socket (but most
of them are EAGAIN):

read(15, 
"~\320\5\0\1\0\0\0\0@\235\1\360\16\0\0\334\26\0\0\0\0\0\0\365\27\0\0\0\0\0\0"...,
8192) = 8192 <0.25>
read(6, 0x7fffdd837b3f, 1) = -1 EAGAIN (Resource temporarily unavailable)
<0.000116>
recvfrom(10, 0x7fffdd837b17, 1, 0, 0, 0) = -1 EAGAIN (Resource temporarily
unavailable) <0.49>

ls -l /proc/62388/fd/15
lr-x-- 1 postgres postgres 64 Oct  1 08:39 /proc/62388/fd/15 ->
/data/pg_xlog/00010EF00061

ls -l /proc/62388/fd/6
lr-x-- 1 postgres postgres 64 Oct  1 08:39 /proc/62388/fd/6 ->
pipe:[1090892506]

ls -l /proc/62388/fd/10
lrwx-- 1 postgres postgres 64 Oct  1 08:39 /proc/62388/fd/10 ->
socket:[1096584060]

I wonder, is there kernel version has better handling mutexes?
Or is it the expected behavior?

Sorry for cross-posting, I have posted the same on pgsql-performance too


-- 
regards

ujang jaenudin | DBA Consultant (Freelancer)
http://ora62.wordpress.com
http://id.linkedin.com/pub/ujang-jaenudin/12/64/bab


Re: [GENERAL] Checkpoint write time - anything unusual?

2017-10-03 Thread Laurenz Albe
pinker wrote:
> I've just run pgBadger on my pg logs and wonder if those checkpoint
> statistics is something I should worry about or not?
> The highest write time is about 47 minutes but I'm not sure if that's
> checkpoint_completion_target*checkpoint_target value or real time between
> sending the command to write and getting response?
> If that's the first one - then I assume everything is ok because:
>  checkpoint_timeout = 8min
>  checkpoint_completion_target = 0.9
> 
> but I'm not sure which value it is...

I think that is fine.

if checkpoint_completion_target is 0.9, spending 0.9 * 3600 = 3240
seconds
per hour doing checkpoints would be normal.

The whole point of this parameter is to spread checkpoints across
a longer time to avoid I/O spikes.

Yours,
Laurenz Albe


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general