On Sun, Oct 1, 2017 at 4:17 AM, Khalil Khamlichi <khamlichi.kha...@gmail.com > wrote:
> Hi everyone, > > I have a data stream of a call center application coming in to postgres > in this format : > > user_name, user_status, event_time > > 'user1', 'ready', '2017-01-01 10:00:00' > 'user1', 'talking', '2017-01-01 10:02:00' > 'user1', 'after_call', '2017-01-01 10:07:00' > 'user1', 'ready', '2017-01-01 10:08:00' > 'user1', 'talking', '2017-01-01 10:10:00' > 'user1', 'after_call', '2017-01-01 10:15:00' > 'user1', 'paused', '2017-01-01 10:20:00' > ... > ... > > so as you see each new insert of an "event" is in fact the start_time of > that event and also the end_time of the previous one so should be used to > calculate the duration of this previous one. > > What is the best way to get user_status statistics like total duration, > frequency, avg ...etc , does any body have an experience with this sort of > data streams ? > > > Thanks in advance. > Just a suggestion, but here is what I would do. First, create your tables similar to as follows CREATE TABLE status ( call_status varchar(10) NOT NULL, CONSTRAINT status_pk PRIMARY KEY (call_status) ); INSERT INTO status (call_status) VALUES ('ready'), ('talking'), ('after_call'); CREATE TABLE user_sessions ( username name NOT NULL, session_id bigint NOT NULL, call_status varchar(10) NOT NULL, call_time timestamp NOT NULL, CONSTRAINT user_sessions_pk PRIMARY KEY (username, session_id,call_status), CONSTRAINT user_sessions_fk_status FOREIGN KEY (call_status) REFERENCES status(call_status) ); Next, you will need to generate a unique session_id for each user, but only for when call_status is 'ready'. So probably a table of the form: CREATE TABLE current_session ( username name NOT NULL, session_id serial NOT NULL, CONSTRAINT current_session_pk PRIMARY KEY (username) ); Then all you need to do is: 1. Update current_session and get the new session_id each time a user connects (call_status = 'ready'. Probably best to use a BEFORE trigger to do this, but you will need to code it yourself. 2. You can then do SELECT username, age ( (SELECT call_time FROM current_session WHERE call_status = 'talking'), ( SELECT call_time FROM current_session WHERE call_status = 'after_call') ) as duration FROM user_sessions WHERE username = '*actual_user_name*' AND session_id = *actual_session_id*; You can use similar queries for avg and frequency. -- *Melvin Davidson* I reserve the right to fantasize. Whether or not you wish to share my fantasy is entirely up to you.