[GENERAL] How can I group all children by their parent ?
How can I group all children by their parent?

 id | email          | parent_id
----+----------------+-----------
  1 | t...@test.com  | nil
  2 | te...@test.com | 1
  3 | email          | 1
  4 | email          | 2
  5 | email          | nil
  6 | email          | 3

Regards,
Arup Rakshit
Re: [GENERAL] How can I group all children by their parent ?
On 2014-07-17, at 09:08, Arup Rakshit <arupraks...@rocketmail.com> wrote:
> How can I group all children by their parent?
>
>  id | email          | parent_id
> ----+----------------+-----------
>   1 | t...@test.com  | nil
>   2 | te...@test.com | 1
>   3 | email          | 1
>   4 | email          | 2
>   5 | email          | nil
>   6 | email          | 3

ORDER BY parent_id, id ?

François Beausoleil

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] How can I group all children by their parent ?
On 17/07/2014 15:08, Arup Rakshit wrote:
> How can I group all children by their parent?
>
>  id | email          | parent_id
> ----+----------------+-----------
>   1 | t...@test.com  | nil
>   2 | te...@test.com | 1
>   3 | email          | 1
>   4 | email          | 2
>   5 | email          | nil
>   6 | email          | 3

Did you mean

SELECT array_agg(id), array_agg(email), parent_id FROM ... GROUP BY parent_id

 id    | email                   | parent_id
-------+-------------------------+-----------
 [1,5] | [t...@test.com,email]   | nil
 [2,3] | [te...@test.com,email]  | 1
 [4]   | [email]                 | 2
 [6]   | [email]                 | 3

or

SELECT id, email, parent_id FROM ... ORDER BY parent_id

 id | email          | parent_id
----+----------------+-----------
  1 | t...@test.com  | nil
  5 | email          | nil
  2 | te...@test.com | 1
  3 | email          | 1
  4 | email          | 2
  6 | email          | 3

Regards,
Mathieu Pujol
Re: [GENERAL] How can I group all children by their parent ?
> ORDER BY parent_id, id ?
>
> François Beausoleil

But why ORDER BY parent_id? I thought I needed to GROUP BY, to get something like:

 parent | child | email
--------+-------+----------------
      1 |     2 | te...@test.com
        |     3 | email
Re: [GENERAL] How can I group all children by their parent ?
Arup Rakshit wrote:
>> ORDER BY parent_id, id ?
>>
>> François Beausoleil
>
> parent_id .. But why order_by.. I thought I need to group by
> parent child email 1 2 test1@ 3 email

The word "group" as you have used it can mean either:

1. Generate a single record for each parent, with all children combined into that record; or
2. Physically order the output so that, while each child still has its own record, all children of the same parent are listed consecutively.

The first requires GROUP BY; the second requires ORDER BY.

It would help if you could show an example of what you want the output to look like.

David J.
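The two readings of "group" that David describes can be sketched side by side. A minimal illustration using SQLite from Python (an assumption for demonstration only: SQLite's group_concat stands in for PostgreSQL's array_agg, and the table name "people" is invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER, email TEXT, parent_id INTEGER)")
rows = [(1, 't...@test.com', None), (2, 'te...@test.com', 1), (3, 'email', 1),
        (4, 'email', 2), (5, 'email', None), (6, 'email', 3)]
conn.executemany("INSERT INTO people VALUES (?, ?, ?)", rows)

# Reading 1: one record per parent, children combined (GROUP BY)
grouped = conn.execute(
    "SELECT parent_id, group_concat(id) FROM people GROUP BY parent_id"
).fetchall()

# Reading 2: each child keeps its own record, but children of the same
# parent are listed consecutively (ORDER BY)
ordered = conn.execute(
    "SELECT id, email, parent_id FROM people ORDER BY parent_id, id"
).fetchall()
```

With GROUP BY the six rows collapse to four (one per parent, NULL included); with ORDER BY all six rows survive, merely resequenced.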
Re: [GENERAL] How can I group all children by their parent ?
On Thu, 17 Jul 2014 21:28:14 +0800, Arup Rakshit <arupraks...@rocketmail.com> wrote:
>> ORDER BY parent_id, id ?
>
> parent_id .. But why order_by.. I thought I need to group by

Perhaps you need to tell us what your problem is instead of your solution. What exactly are you trying to accomplish here? Don't describe it in database terms. Tell us what the real-world situation is and what result you want to see. For example: "I have a list of people related to each other in a parent/child relationship to any level (e.g. parents, grandparents, etc.) and I want to list everyone who is a descendant of anyone in the database." Try for that level of detail and you will probably get more useful answers.

Also, see http://www.catb.org/~esr/faqs/smart-questions.html for some useful hints on asking questions.

--
D'Arcy J.M. Cain  da...@druid.net   | Democracy is three wolves
http://www.druid.net/darcy/         | and a sheep voting on
+1 416 788 2246 (DoD#0082) (eNTP)   | what's for dinner.
IM: da...@vex.net, VoIP: sip:da...@druid.net
Re: [GENERAL] Why pg_toast table not get auto vacuumed?
On 07/16/2014 12:05 PM, jlliu wrote:
> Hi,
> PostgreSQL version: 9.1.16. Linux: RHEL6.
> After a heavy traffic run, a huge pg_toast table is seen. Its size is ~3G. There also exist other pg_toast tables of much smaller size, for example ~100M. The problem is that the huge pg_toast table never gets auto vacuumed, while the other small pg_toast tables do, as shown in the pg logs. If we run traffic again, the huge pg_toast table grows again.

In psql, see if \d+ on the parent table shows autovacuum has been disabled for that table.

> We have another system with the same setup, but we only run light traffic on it. The pg_toast table for the same parent table also exists there, but it only has a size of ~50M and it gets auto vacuumed from time to time, as shown in the pg logs. Why does the huge pg_toast table in the first case not get auto vacuumed? Is there anything we can do?

A start would be to show what your autovacuum settings in postgresql.conf are:
http://www.postgresql.org/docs/9.3/interactive/runtime-config-autovacuum.html

Also the information from pg_stat_all_tables for that table:
http://www.postgresql.org/docs/9.1/static/monitoring-stats.html

> Thanks, JL

--
Adrian Klaver
adrian.kla...@aklaver.com
Re: [GENERAL] Windows Installation User account - Correct database for us
> H2 or SQLite can be very reasonable choices for in-app embedded databases, so long as your app can operate within their constraints on concurrency and data size.

Firefox uses SQLite, and the places.sqlite database is notorious for getting corrupted every five minutes. LibreOffice/OpenOffice Base uses H2, and there are plenty of reports from users about data loss due to database corruption. Personally I'd *never* use an embedded database for anything.

Sincerely,
Wolfgang
[GENERAL] Checkpoint_segments optimal value
Hello All,

I had a database load issue today, and when I was checking the logs I saw this message. I'm running PostgreSQL 8.3:

db::ts:2014-07-17 09:38:21 PDT::LOG: checkpoints are occurring too frequently (26 seconds apart)
db::ts:2014-07-17 09:38:21 PDT::HINT: Consider increasing the configuration parameter "checkpoint_segments".
db::ts:2014-07-17 09:38:21 PDT::LOG: checkpoint starting: xlog

Please suggest the optimal checkpoint_segments value I should set.

My current values are:

checkpoint_segments = 32             # in logfile segments, min 1, 16MB each
#checkpoint_timeout = 5min           # range 30s-1h
#checkpoint_completion_target = 0.5  # checkpoint target duration, 0.0 - 1.0
#checkpoint_warning = 30s            # 0 is off

Current memory values:

# - Memory -
effective_cache_size = 8GB
shared_buffers = 1024MB              # min 128kB or max_connections*16kB
                                     # (change requires restart)
temp_buffers = 64MB                  # min 800kB
#max_prepared_transactions = 5       # can be 0 or more
                                     # (change requires restart)
work_mem = 512MB                     # min 64kB
maintenance_work_mem = 2048MB        # min 1MB !!! increased to 1GB 10/1/2010 mrostron (to assist autovacuum workers)
#max_stack_depth = 2MB               # min 100kB

# - Free Space Map -
max_fsm_pages = 809600               # min max_fsm_relations*16, 6 bytes each
                                     # (change requires restart)
synchronous_commit = off
#full_page_writes = on               # recover from partial page writes
#wal_buffers = 64kB                  # min 32kB
                                     # (change requires restart)
#wal_writer_delay = 200ms            # 1-10000 milliseconds
#commit_delay = 0                    # range 0-100000, in microseconds
#commit_siblings = 5                 # range 1-1000
Re: [GENERAL] Checkpoint_segments optimal value
On Thu, 17 Jul 2014 10:44:32 -0700, Prabhjot Sheena <prabhjot.she...@rivalwatch.com> wrote:
> Hello All
> i had a database load issue today and when i was checking logs i saw this message. I m running postgresql 8.3
>
> db::ts:2014-07-17 09:38:21 PDT::LOG: checkpoints are occurring too frequently (26 seconds apart)
> db::ts:2014-07-17 09:38:21 PDT::HINT: Consider increasing the configuration parameter checkpoint_segments.
> db::ts:2014-07-17 09:38:21 PDT::LOG: checkpoint starting: xlog

Is this the first time you've had the load problem? How often does it happen? How often is that message in the logs?

The downside (as described in the docs) is that increasing the value will cause PG to take longer to recover in the event of a crash. That, and you need enough disk space to hold the extra segments. If the warning isn't happening too often, I would try increasing it only a little and see if it helps. If it's not enough, you can then increase it some more. Various sources around the Internet suggest that you don't want to go much larger than 256 for this (if only because it's uncommon to do so and is probably indicative of other tuning that you need to do). Unfortunately, you need to restart PG for the change to take effect, so you have to balance experimenting with your tuning against how often you can get away with a server restart.

If it's just that one time that you got that message, then you might want to double checkpoint_segments to 64. Anything more than that seems unjustified, unless you're seeing the problem a lot more often than your email suggests.

If the problem is happening frequently, you're probably way better off organizing an upgrade to PG 9.3 than fiddling with a lot of tuning. The number of performance improvements from 8.3 to 9.3 is quite significant.
> [quoted configuration settings snipped]

--
Potentialtech
wmo...@potentialtech.com
Re: [GENERAL] Checkpoint_segments optimal value
Potentialtech <wmo...@potentialtech.com> writes:
> If the warning isn't happening too often, I would try increasing it only a little and see if it helps. [...] Unfortunatley, you need to restart PG for the change to take effect, so you have to balance experimenting with your tuning against how often you can get away with a server restart.

Huh? You don't need a restart, just a reload (SIGHUP) to change that.

regards, tom lane
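Since the change only needs a reload, the conf edit itself is the only manual step, and it can be scripted. A hypothetical helper (the function name and approach are illustrative, not a standard tool) that rewrites the checkpoint_segments line in postgresql.conf text before a `pg_ctl reload`:

```python
import re

def set_checkpoint_segments(conf_text, value):
    """Return postgresql.conf text with checkpoint_segments set to `value`,
    replacing an existing (possibly commented-out) setting, or appending
    a new line if the setting is absent."""
    pattern = re.compile(r'^\s*#?\s*checkpoint_segments\s*=.*$', re.MULTILINE)
    line = 'checkpoint_segments = %d    # 16MB each' % value
    if pattern.search(conf_text):
        # Replace only the first occurrence; later duplicates would win
        # in PostgreSQL, but real conf files rarely have them.
        return pattern.sub(line, conf_text, count=1)
    return conf_text.rstrip('\n') + '\n' + line + '\n'

conf = ("checkpoint_segments = 32   # in logfile segments, min 1, 16MB each\n"
        "#checkpoint_timeout = 5min\n")
updated = set_checkpoint_segments(conf, 64)
```

After writing the updated text back, a `pg_ctl reload` (or `SELECT pg_reload_conf()`) picks it up without a restart.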
Re: [GENERAL] Checkpoint_segments optimal value
Thanks, I have changed that to 64 and reloaded it.

When I had the load issue today, there was this exact same query that hit the db 50 to 60 times from different machines in 3 to 4 minutes, was taking a long time to execute, and was holding up the database. I recreated an index and it started performing better. My question is: why is it not fetching the result from memory, since it's the same query that runs again and again?

This is the actual query I'm talking about:

SELECT account.id, account.organization_id, run.application_id,
       work_unit.script, work_unit.id, work_unit.start_time,
       run.id, work_unit.priority
FROM work_unit, run, account
WHERE work_unit.status = 3
  AND work_unit.run_id = run.id
  AND work_unit.type != 1
  AND run.status = 1
  AND run.account_id = account.id;

Please suggest if I can do something to fix this.

On Thu, Jul 17, 2014 at 11:06 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Huh? You don't need a restart, just a reload (SIGHUP) to change that.
Re: [GENERAL] Checkpoint_segments optimal value
On Thu, 17 Jul 2014 14:06:28 -0400, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Huh? You don't need a restart, just a reload (SIGHUP) to change that.

Really? Well, thanks for correcting me on that.

--
Potentialtech
wmo...@potentialtech.com
Re: [GENERAL] Checkpoint_segments optimal value
On Thu, 17 Jul 2014 11:28:04 -0700, Prabhjot Sheena <prabhjot.she...@rivalwatch.com> wrote:
> Thanks i have changed that to 64 and reloaded it. When i had load issue today there was this exact same query that hits the db like 50 to 60 times from different machines in 3 to 4 minutes and was taking long time to execute and was holding up the database. i did recreate an index and it started performing better. My question is why it is not fetching the result from the memory since its the same query that runs again and again.
>
> Pls suggest if i can do something to fix this

Provide a lot more information if you want anyone on the list to be able to help, such as EXPLAIN output while the problem is happening, and some information about the makeup of the tables (column types, indexes, number of rows).

Guessing, based on the little information you've provided: it's likely that you have something else going on at the same time that you're not aware of, and this particular query is only a symptom. I'm saying that because SELECTs don't generally create any WAL traffic, so there were probably some INSERT/UPDATE/DELETE statements running at the same time that pushed those 3 tables out of memory and/or saturated disk activity to the point that accessing everything became slow for a short while, and it's just those queries that you noticed.

Are you making the mistake where you set log_min_duration_statement to 1s and only worry about queries that take longer than 1s? Because I've seen (on multiple occasions) cases where many thousands of queries, each running less than 1s, were the actual cause of the problem. pgBadger is particularly helpful in tracking down situations like that.
--
Potentialtech
wmo...@potentialtech.com
[GENERAL] What is lock type transactionid?
Hi,

I have been facing lock contention in my PostgreSQL 9.1 DB. When I query the pg_locks table, I find a lock of type "transactionid". Could someone please tell me what it means?

Thanks.
Re: [GENERAL] What is lock type transactionid?
On Thu, Jul 17, 2014 at 3:34 PM, AI Rumman <rumman...@gmail.com> wrote:
> Hi,
> I have been facing lock contention in my Postgresql 9.1 DB. And when I am querying in the pg_locks table I found a lock type with transactionid. Could someone please tell me what it means?
> Thanks.

From http://www.postgresql.org/docs/9.3/static/view-pg-locks.html :

"Every transaction holds an exclusive lock on its virtual transaction ID for its entire duration. If a permanent ID is assigned to the transaction (which normally happens only if the transaction changes the state of the database), it also holds an exclusive lock on its permanent transaction ID until it ends. When one transaction finds it necessary to wait specifically for another transaction, it does so by attempting to acquire share lock on the other transaction ID (either virtual or permanent ID depending on the situation). That will succeed only when the other transaction terminates and releases its locks."

I believe that describes what you're seeing.

--
Douglas J Hunley (doug.hun...@gmail.com)
Re: [GENERAL] What is lock type transactionid?
I am experiencing lock contention on one single UPDATE statement at a certain time each day. This is a small table to UPDATE. My suspicion is that we are hitting it for one specific ID. Could you please let me know how I can identify the tuple? I got a log like the following:

        blocker_target         | blocker_pid | blocker_mode  | depth |            target             | pid  |     mode      |    seq
-------------------------------+-------------+---------------+-------+-------------------------------+------+---------------+-----------
 (tuple,475999,662775,1988,6,) |        3557 | ExclusiveLock |     1 | (tuple,475999,662775,1988,6,) | 3543 | ExclusiveLock | 3557,3543
 (tuple,475999,662775,1988,6,) |        3557 | ExclusiveLock |     1 | (tuple,475999,662775,1988,6,) | 7387 | ExclusiveLock | 3557,7387

Any ideas?

Thanks.

On Thu, Jul 17, 2014 at 12:40 PM, Douglas J Hunley <doug.hun...@gmail.com> wrote:
> I believe that describes what you're seeing
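For what it's worth, the page and tuple numbers inside that tuple-lock target can be turned into a ctid, which lets you fetch the contested row directly (e.g. SELECT * FROM your_table WHERE ctid = '(1988,6)', and SELECT relname FROM pg_class WHERE oid = 662775 to find the table). A small parser for the log format above; the field layout (locktype, database OID, relation OID, page, tuple) is my reading of that output, so treat it as an assumption:

```python
def lock_target_to_ctid(target):
    """Split a '(tuple,db_oid,rel_oid,page,tuple_no,)' lock target into
    the relation OID and a ctid string usable in a WHERE clause.
    Field order is assumed from the blocker log shown above."""
    fields = target.strip('()').split(',')
    rel_oid = int(fields[2])                   # relation OID -> pg_class.oid
    ctid = '(%s,%s)' % (fields[3], fields[4])  # (page, tuple) -> ctid
    return rel_oid, ctid

rel_oid, ctid = lock_target_to_ctid('(tuple,475999,662775,1988,6,)')
```

Note that a ctid is only stable until the row is updated or the table is vacuumed, so this identifies the row version involved in the contention at that moment, not a permanent key.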
Re: [GENERAL] What is lock type transactionid?
On Thu, Jul 17, 2014 at 12:54 PM, AI Rumman <rumman...@gmail.com> wrote:
> I am experiencing lock contention on one single UPDATE statement at a certain time in whole day. This is a small table to UPDATE. My suspect is we are facing it for one specific ID. Could you please let me know how can I identify the tuple.

Have you tried the lock monitoring queries on http://wiki.postgresql.org/wiki/Lock_Monitoring yet, by chance?

--
Douglas J Hunley (doug.hun...@gmail.com)
[GENERAL] Detect streaming replication failure
For reference: https://wiki.postgresql.org/wiki/Streaming_Replication

Assume a master/slave streaming replication configuration on PostgreSQL 9.2. Assume that the master has been chugging away, but the slave PG service has been offline for a while and the WAL archive has advanced enough that the slave cannot catch up.

When I start the slave PG instance, pg launches and runs but doesn't update. It also doesn't seem to throw any errors. The only outward sign that I can see that anything is wrong is that pg_last_xlog_replay_location() doesn't update. I can look in /var/lib/pgsql/9.2/data/pg_log/postgresql-Thu.csv and see errors there, e.g.:

2014-07-17 22:38:23.851 UTC,,,21310,,53c8505f.533e,2,,2014-07-17 22:38:23 UTC,,0,FATAL,XX000,could not receive data from WAL stream: FATAL: requested WAL segment 000700050071 has already been removed

Is that the only way to detect this condition? I guess I'm looking for something like "select * from pg_is_replicating_ok();" returning 1 on the slave. At the moment, it appears that I can either parse the log file, or check whether pg_last_xact_replay_timestamp() is more than some acceptable threshold of minutes in the past.

http://www.postgresql.org/docs/9.2/static/functions-admin.html

Thanks,
Ben
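Absent a built-in pg_is_replicating_ok(), the timestamp approach can be wrapped in a small monitoring check. A sketch of the threshold logic only (the function name and the five-minute default are arbitrary choices for the example; last_replay would come from running SELECT pg_last_xact_replay_timestamp() on the standby):

```python
from datetime import datetime, timedelta

def replication_ok(last_replay, now=None, threshold=timedelta(minutes=5)):
    """Treat the standby as healthy if its last replayed transaction
    is within `threshold` of now. A None last_replay means nothing has
    ever been replayed, which we count as unhealthy."""
    if last_replay is None:
        return False
    if now is None:
        now = datetime.utcnow()
    return (now - last_replay) <= threshold

t0 = datetime(2014, 7, 17, 22, 38, 0)
healthy = replication_ok(t0, now=t0 + timedelta(minutes=2))
stale = replication_ok(t0, now=t0 + timedelta(minutes=30))
```

One caveat with this check: on an idle master, pg_last_xact_replay_timestamp() stops advancing even though replication is fine, so the threshold has to be generous enough to tolerate quiet periods.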
Re: [GENERAL] What is lock type transactionid?
Yes. But as we are using bind variables, we are not able to get the ID of the tuple.

On Thu, Jul 17, 2014 at 2:08 PM, Douglas J Hunley <doug.hun...@gmail.com> wrote:
> Have you tried the lock monitoring queries on http://wiki.postgresql.org/wiki/Lock_Monitoring yet by chance?
>
> --
> Douglas J Hunley (doug.hun...@gmail.com)
Re: [GENERAL] Checkpoint_segments optimal value
Here is the explain plan. The query time went back up to 2 to 3 minutes from 3 seconds in just 2 hrs. Can anyone suggest something on how to fix this, or why this is happening?

explain SELECT account.id, account.organization_id, run.application_id,
       work_unit.script, work_unit.id, work_unit.start_time,
       run.id, work_unit.priority
FROM work_unit, run, account
WHERE work_unit.status = 3 AND work_unit.run_id = run.id
  AND work_unit.type != 1 AND run.status = 1
  AND run.account_id = account.id;

                                       QUERY PLAN
-----------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..24.84 rows=1 width=64)
   Join Filter: (run.id = work_unit.run_id)
   ->  Nested Loop  (cost=0.00..16.55 rows=1 width=16)
         ->  Index Scan using run_status_1 on run  (cost=0.00..8.27 rows=1 width=12)
               Index Cond: (status = 1)
         ->  Index Scan using account_pkey on account  (cost=0.00..8.27 rows=1 width=8)
               Index Cond: (account.id = run.account_id)
   ->  Index Scan using idx_work_unit_1_status_3_new on work_unit  (cost=0.00..8.28 rows=1 width=52)
         Index Cond: (work_unit.status = 3)

Table structure:

\d work_unit
            Table "public.work_unit"
   Column   |            Type             | Modifiers
------------+-----------------------------+-----------
 id         | integer                     | not null
 run_id     | integer                     |
 status     | integer                     |
 script     | character varying           |
 parameters | character varying           |
 input      | character varying           |
 start_time | timestamp without time zone |
 stop_time  | timestamp without time zone |
 priority   | integer                     |
 type       | integer                     |
 lineitems  | integer                     |
 outputs    | integer                     |
 sub_type   | integer                     |
 site_code  | text                        |
Indexes:
    "work_unit_pkey" PRIMARY KEY, btree (id)
    "idx_work_unit_1_partial" btree (run_id, start_time) WHERE status = 3
    "idx_work_unit_1_run_id" btree (run_id)
    "idx_work_unit_1_script" btree (script)
    "idx_work_unit_1_site_code" btree (site_code)
    "idx_work_unit_1_starttime" btree (start_time)
    "idx_work_unit_1_status_3_new" btree (status, type) WHERE status = 3 AND type <> 1
    "idx_work_unit_1_status_5" btree (status) WHERE status = 4
    "idx_work_unit_1_status_part_new" btree (status) WHERE status 4
    "idx_work_unit_1_stop_time" btree (stop_time)

\d run
               Table "public.run"
       Column        |           Type           | Modifiers
---------------------+--------------------------+---------------
 id                  | integer                  | not null
 job_id              | integer                  |
 start_time          | timestamp with time zone | not null
 status              | integer                  | not null
 job_name            | character varying        |
 account_id          | integer                  |
 application_id      | integer                  |
 postproc_script     | character varying        |
 upload_url          | character varying        |
 complete_time       | timestamp with time zone |
 email               | character varying        |
 size                | integer                  |
 errors              | integer                  |
 raw_count           | integer                  |
 munge               | integer                  |
 job_details         | character varying        |
 user_settings       | character varying        |
 run_type            | integer                  |
 stop_after          | timestamp with time zone |
 total_work_time     | integer                  |
 flags               | integer                  |
 cluster_info_id     | integer                  |
 skiplineitems_count | integer                  |
 last_update_ts      | timestamp with time zone |
 result_exists       | boolean                  | default false
 abort_type          | integer                  |
Indexes:
    "run_pkey" PRIMARY KEY, btree (id)
    "idx_run_acc_stat_comp" btree (account_id, status, complete_time)
    "idx_run_app" btree (application_id)
    "idx_run_complete_time_2" btree (complete_time)
    "idx_run_job_name" btree (job_name)
    "run_application_account_idx" btree (application_id, account_id, status)
    "run_job" btree (job_id)
    "run_result_exists" btree (result_exists)
    "run_start_time" btree (start_time)
    "run_status" btree (status)
    "run_status_1" btree (status) WHERE status = 1
    "run_status_part_idx" btree (status) WHERE status 3
Triggers:
    run_upd_ts BEFORE INSERT OR UPDATE ON run FOR EACH ROW EXECUTE PROCEDURE set_last_update_ts()
    tr_proc_update_job_summary AFTER UPDATE ON run FOR EACH ROW EXECUTE PROCEDURE tr_proc_update_job_summary()

On Thu, Jul 17, 2014 at 12:00 PM, Potentialtech <wmo...@potentialtech.com> wrote:
> On Thu, 17 Jul 2014 11:28:04
Re: [GENERAL] Checkpoint_segments optimal value
Even though the explain plan suggests it should be fast, it's taking a long, long time.

On Thu, Jul 17, 2014 at 4:26 PM, Prabhjot Sheena <prabhjot.she...@rivalwatch.com> wrote:
> Here is the explain plan. The query time went back up to 2 to 3 minutes from 3 seconds in just 2 hrs. Can anyone suggest something on how to fix this or why this is happening?
> [query, plan, and table definitions quoted in previous message]
Re: [GENERAL] Checkpoint_segments optimal value
On 7/17/2014 4:26 PM, Prabhjot Sheena wrote:
> Here is the explain plan. The query time went back up to 2 to 3 minutes from 3 seconds in just 2 hrs. Can anyone suggest something on how to fix this or why this is happening?
>
> explain SELECT account.id, account.organization_id, run.application_id, work_unit.script, work_unit.id, work_unit.start_time, run.id, work_unit.priority FROM work_unit, run, account WHERE work_unit.status = 3 AND work_unit.run_id = run.id AND work_unit.type != 1 AND run.status = 1 AND run.account_id = account.id;

You need to use EXPLAIN ANALYZE to get accurate data. Run it when the query is fast, and again when the query is slow, and paste both outputs here.

Also, you can paste them into http://explain.depesz.com and that will give you a nice analysis of the timing data included in the EXPLAIN ANALYZE output.

--
john r pierce                                      37N 122W
somewhere on the middle of the left coast
Re: [GENERAL] Checkpoint_segments optimal value
On 7/17/2014 11:28 AM, Prabhjot Sheena wrote:
> SELECT account.id, account.organization_id, run.application_id, work_unit.script, work_unit.id, work_unit.start_time, run.id, work_unit.priority FROM work_unit, run, account WHERE work_unit.status = 3 AND work_unit.run_id = run.id AND work_unit.type != 1 AND run.status = 1 AND run.account_id = account.id

That query is an implied 3-way join. It would be clearer to write it like:

SELECT account.id, account.organization_id, run.application_id,
       work_unit.script, work_unit.id, work_unit.start_time,
       run.id, work_unit.priority
FROM work_unit
JOIN run ON work_unit.run_id = run.id
JOIN account ON run.account_id = account.id
WHERE work_unit.status = 3
  AND work_unit.type != 1
  AND run.status = 1;

Postgres would (or at least should) treat this exactly the same, but it's clearer what the query is doing when it's written this way.

--
john r pierce                                      37N 122W
somewhere on the middle of the left coast
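To illustrate that the implicit and explicit JOIN spellings return the same rows, here is a toy reproduction runnable with Python's built-in sqlite3 (the schema is cut down to just the join keys and filter columns, and the sample data is invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account (id INTEGER, organization_id INTEGER);
CREATE TABLE run (id INTEGER, account_id INTEGER, status INTEGER);
CREATE TABLE work_unit (id INTEGER, run_id INTEGER, status INTEGER, type INTEGER);
INSERT INTO account VALUES (10, 100);
INSERT INTO run VALUES (20, 10, 1), (21, 10, 2);      -- run 21 fails status=1
INSERT INTO work_unit VALUES (30, 20, 3, 0),          -- qualifies
                             (31, 20, 3, 1),          -- excluded: type = 1
                             (32, 21, 3, 0);          -- excluded: run status
""")

implicit = conn.execute("""
    SELECT account.id, run.id, work_unit.id
    FROM work_unit, run, account
    WHERE work_unit.status = 3 AND work_unit.run_id = run.id
      AND work_unit.type != 1 AND run.status = 1
      AND run.account_id = account.id
""").fetchall()

explicit = conn.execute("""
    SELECT account.id, run.id, work_unit.id
    FROM work_unit
    JOIN run ON work_unit.run_id = run.id
    JOIN account ON run.account_id = account.id
    WHERE work_unit.status = 3 AND work_unit.type != 1 AND run.status = 1
""").fetchall()
```

Both queries select the same single surviving row; the difference is purely readability, since inner-join conditions in the WHERE clause and in ON clauses are semantically equivalent.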
Re: [GENERAL] Checkpoint_segments optimal value
i just did explain analyze and currently the database is running slow because of the query:

explain ANALYZE SELECT account.id, account.organization_id, run.application_id, work_unit.script, work_unit.id, work_unit.start_time, run.id, work_unit.priority FROM work_unit, run, account WHERE work_unit.status = 3 AND work_unit.run_id = run.id AND work_unit.type != 1 AND run.status = 1 AND run.account_id = account.id;

                                  QUERY PLAN
------------------------------------------------------------------------------
Nested Loop (cost=0.00..24.84 rows=1 width=64) (actual time=71.824..69729.467 rows=1820 loops=1)
  -> Nested Loop (cost=0.00..16.56 rows=1 width=60) (actual time=71.760..69628.874 rows=1820 loops=1)
       Join Filter: (work_unit.run_id = run.id)
       -> Index Scan using idx_work_unit_1_status_3_new on work_unit (cost=0.00..8.28 rows=1 width=52) (actual time=0.067..154.364 rows=1820 loops=1)
            Index Cond: (status = 3)
       -> Index Scan using run_status_1 on run (cost=0.00..8.27 rows=1 width=12) (actual time=0.081..34.338 rows=3138 loops=1820)
            Index Cond: (run.status = 1)
  -> Index Scan using account_pkey on account (cost=0.00..8.27 rows=1 width=8) (actual time=0.044..0.046 rows=1 loops=1820)
       Index Cond: (account.id = run.account_id)
Total runtime: 69732.893 ms

On Thu, Jul 17, 2014 at 4:45 PM, John R Pierce pie...@hogranch.com wrote:
> you need to use EXPLAIN ANALYZE to get accurate data. run it when the query is fast, and again when the query is slow, paste both outputs here. also, you can paste them to http://explain.depesz.com and that will give you a nice analysis of the timing data included in the EXPLAIN ANALYZE output.
> -- john r pierce 37N 122W somewhere on the middle of the left coast
Re: [GENERAL] Checkpoint_segments optimal value
On 7/17/2014 5:01 PM, Prabhjot Sheena wrote:
> i just did explain analyze and currently database is running slow coz of the query
> explain ANALYZE SELECT account.id, account.organization_id, run.application_id, work_unit.script, work_unit.id, work_unit.start_time, run.id, work_unit.priority FROM work_unit, run, account WHERE work_unit.status = 3 AND work_unit.run_id = run.id AND work_unit.type != 1 AND run.status = 1 AND run.account_id = account.id;

what does the overall workload on the system look like when this is so slow? taking over a minute to do 1820 iterations of a 3138 row index scan seems way out of bounds for anything other than a very heavily overloaded server. is this running on a virtual machine where there are other virtual servers contending for the same resources?

btw, here's your analyze formatted pretty: http://explain.depesz.com/s/XRI

-- john r pierce 37N 122W somewhere on the middle of the left coast
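To see what else the server is doing while the query is slow, a quick look at pg_stat_activity can help; this is a sketch assuming PostgreSQL 9.2 or later (where the view has the state and query columns):

```sql
-- Show non-idle sessions, longest-running first, while the slow query executes.
SELECT pid,
       state,
       now() - query_start AS runtime,
       query
FROM pg_stat_activity
WHERE state IS DISTINCT FROM 'idle'
ORDER BY runtime DESC NULLS LAST;
```

A pile of long-running concurrent queries here would point at contention rather than a planner problem.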
Re: [GENERAL] Windows Installation User account - Correct database for us
On 07/17/2014 07:30 AM, Wolfgang Keller wrote:
>> H2 or SQLite can be very reasonable choices for in-app embedded databases, so long as your app can operate within their constraints on concurrency and data size.
>
> Firefox uses SQLite and the places.sqlite database is notorious for getting corrupted every five minutes.

I have never seen that on any of the Firefox installs I have done. Maybe I am just lucky :)

> Libreoffice/Openoffice Base uses H2 and there are plenty of reports by users about data losses due to database corruption.

Actually I thought it bundled HSQLDB and it was optional whether you used it or not as it supported multiple DB backends. In any case, my experience has been that Base has issues independent of the backend.

> Personally I'd *never* use an embedded database for anything.

So I'm guessing you do not use any devices other than full blown servers :)

> Sincerely,
> Wolfgang

-- Adrian Klaver adrian.kla...@aklaver.com

-- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Checkpoint_segments optimal value
The overall load of the system is good. It is a VM, but the other database is running normal and not doing much activity.

vmstat 5 10
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
0 2 22048 296804 5016 1082936000 1 300 5 1 75 18 1 9 1 22048 255184 5024 1086888400 060 8955 10518 33 3 41 15 8 11 3 22048 197664 5024 1092594400 0 0 7866 8767 37 2 44 12 6 12 1 22048 14 5048 1097662000 010 9181 10632 39 3 42 11 6 7 1 22048 148068 5076 1097010400 133 7456 7715 34 2 46 8 10 11 3 22048 87408 5076 1102783200 0 0 9846 12631 31 4 38 14 14 8 1 22048 48380 5084 1106745600 035 7887 8902 33 3 43 11 10

On Thu, Jul 17, 2014 at 5:10 PM, John R Pierce pie...@hogranch.com wrote:
> what does the overall workload on the system look like when this is so slow? taking over a minute to do 1820 iterations of a 3138 row index scan seems way out of bounds for anything other than a very heavily overloaded server. is this running on a virtual machine where there are other virtual servers contending for the same resources?
> btw, here's your analyze formatted pretty: http://explain.depesz.com/s/XRI
> -- john r pierce 37N 122W somewhere on the middle of the left coast
Re: [GENERAL] Checkpoint_segments optimal value
On 7/17/2014 5:36 PM, Prabhjot Sheena wrote:
> The overall load of system is good. It is a vm but the other database is running normal and not doing much activity

other database? is that on the same postgresql cluster? or on a separate instance of the postgres server on the same VM? or on a separate VM on the same host?

what does `iostat -x 5` show? let it run a bunch of samples, then run this slow query and get more samples. pay attention to the disk drive(s) the database is stored on. high %utilization or await times would be a bad thing.

-- john r pierce 37N 122W somewhere on the middle of the left coast
Re: [GENERAL] Detect streaming replication failure
you can run select * from pg_stat_replication on the master to check all the slave stats.

On Fri, Jul 18, 2014 at 6:50 AM, Lists li...@benjamindsmith.com wrote:
> For reference: https://wiki.postgresql.org/wiki/Streaming_Replication
>
> Assume a master - slave streaming replication configuration, PostgreSQL 9.2. Assume that the master has been chugging away, but the slave PG service has been offline for a while and the WAL archive has advanced enough that the slave cannot catch up. When I start the slave PG instance, pg launches and runs but doesn't update. It also doesn't seem to throw any errors. The only outward sign that I can see that anything is wrong is that pg_last_xlog_replay_location() doesn't update. I can look in /var/lib/pgsql/9.2/data/pg_log/postgresql-Thu.csv and see errors there EG:
>
> 2014-07-17 22:38:23.851 UTC,,,21310,,53c8505f.533e,2,,2014-07-17 22:38:23 UTC,,0,FATAL,XX000,could not receive data from WAL stream: FATAL: requested WAL segment 000700050071 has already been removed
>
> Is that the only way to detect this condition? I guess I'm looking for something like select * from pg_is_replicating_ok(); 1 on the slave. At the moment, it appears that I can either parse the log file, or look for pg_last_xact_replay_timestamp() more than an acceptable threshold of minutes in the past.
>
> http://www.postgresql.org/docs/9.2/static/functions-admin.html
>
> Thanks, Ben
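A sketch of the threshold check described in the question, run on the slave; the 5-minute cutoff and the replication_ok alias are illustrative, not built-ins:

```sql
-- On the slave: false if nothing has been replayed within the last 5 minutes
-- (COALESCE handles the case where nothing has ever been replayed).
SELECT COALESCE(pg_last_xact_replay_timestamp()
                  >= now() - interval '5 minutes',
                false) AS replication_ok;

-- On the master: one row per connected standby (PostgreSQL 9.2 column names);
-- a standby that has dropped off streaming simply has no row here.
SELECT client_addr, state, sent_location, replay_location
FROM pg_stat_replication;
```

Note that the timestamp check alone cannot distinguish "broken" from "no write activity on the master", so the pg_stat_replication check on the master is the more direct signal.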
[GENERAL] Watching Views
Dear List, I am interested in replicating views of my data in real time to a frontend visualizer. I've looked around, and it seems that most applications in this direction write some ad-hoc json-formatter that spits out exactly the columns it is interested in. I want something more like Cubes[1], where a user (or at least, some javascript) can say I am interested in this slice of the world, and then get updates to that slice, but unlike Cubes it must be near-real-time: I want to hook events, not just redownload tables. In principle, I am looking for some way to say ``` CREATE VIEW view13131 AS select (id, name, bank_account) from actors where age 22; WATCH view13131; ``` and get output to stdout like ``` INSERT view13131 VALUES (241, Mortimer, 131.09); ... INSERT view13131 VALUES (427, Schezwan, 95.89); UPDATE view13131 SET bank_account = 1017.12 WHERE id = 427; DELETE FROM view13131 WHERE id = 92; ... ``` (and then I would stick a SQL-to-JSON proxy in the middle to make it more web-happy, and handle disconnects with a corresponding UNWATCH operation) I am stumped about the best way to go about this. Is there any extension that does this specific task for postgres? CouchDB seems to have this implemented[2](!) but there are a lot of reasons I am wary of going that route. dat[3] is specifically for real-time replication and versioning for data sets; it's super new and shakey, but promising. 
I spent a while trying to implement the replication protocol[4] before I found a tip[5] which pointed out that the confusing, opaque, binary data I was getting (like b'x00\x98\x08\x00\x00\x00\x00\n\x00\x1f\x00\x10@bid\x00\x98\x08\x00\x00\x00\x00\n\x00\x1e\x00\x18@dicks\x00\x00\x00\x98\x08\x00\x00\x00\x00\x00\x00\x00\x00\n\x00\x10\x00\x18@event_2_2\x00\x00\x00\x15@\x00\x00\x00\x00\n\x00\x08\x00\x18@event_2_1\x00\x00\x00\x15@\x00\x00\x00\x00\n\x00\x01\x00\x18@event_2_0\x00\x00\x00\x15@\x00\x00\x00\x00\t\x00\x14\x00 @event_2_2_txid_idx\x00\x00\x15@\x00\x00\x00\x00\t\x00\x0f\x00\x18@event_2_2\x00\x00\x00\x15@\x00\x00\x00\x00\t\x00\x0e\x00 @event_2_1_txid_idx\x00\x00\x15@') is--I believe, please correct me if I'm wrong--a verbatim copy of postgres's internal data structures. Will it pay off to reverse-engineer and reimplement these data structures in javascript? The tipster seemed to think not, but that was 6 years ago. Also, this solution doesn't give me the ability to slice data, though I could hack it with some kind of ridiculous proxy database setup. I discovered Skytools[6]'s Londiste, which will replicate only specific tables and seems very close to what I want, but it seems like it has a lot of administrative overhead and is targeted at postgres-to-postgres log shipping. Does anyone know if I can hook it somewhere in the middle in order to extract the CREATE, UPDATE and DELETE events? My last option that I am considering is writing code myself which sets and unsets Postgres triggers corresponding to each WATCH statement. I could implement this as PL/pgSQL or on Python+SQLAlchemy. This seems like it might end up fragile, so if I do end up going this route, I would appreciate any tidbits and gotchas you might have to share. If I ALTER VIEW will the triggers all fire appropriately? Can I even set triggers on views? Thanks in advance -Nick Guenther 4B Stats/CS University of Waterloo [1] Cubes http://cubes.databrewery.org/ [2] CouchDB. Filtered Replication. 
http://couchdb.readthedocs.org/en/latest/replication/protocol.html#filter-replication
[3] dat https://github.com/maxogden/dat
[4] Postgres Streaming Replication Protocol http://www.postgresql.org/docs/current/static/protocol-replication.html / Guide http://guide.couchdb.org/draft/notifications.html#continuous
[5] Erik Jones, Re: reading WAL files in python http://www.postgresql.org/message-id/ce398e79-ffef-4219-9606-f5d28ad2e...@myemma.com
[6] SkyTools http://wiki.postgresql.org/wiki/Skytools
Re: [GENERAL] Watching Views
Nick Guenther wrote Dear List, I am interested in replicating views of my data in real time to a frontend visualizer. I've looked around, and it seems that most applications in this direction write some ad-hoc json-formatter that spits out exactly the columns it is interested in. I want something more like Cubes[1], where a user (or at least, some javascript) can say I am interested in this slice of the world, and then get updates to that slice, but unlike Cubes it must be near-real-time: I want to hook events, not just redownload tables. In principle, I am looking for some way to say ``` CREATE VIEW view13131 AS select (id, name, bank_account) from actors where age 22; WATCH view13131; ``` and get output to stdout like ``` INSERT view13131 VALUES (241, Mortimer, 131.09); ... INSERT view13131 VALUES (427, Schezwan, 95.89); UPDATE view13131 SET bank_account = 1017.12 WHERE id = 427; DELETE FROM view13131 WHERE id = 92; ... ``` 9.4 - http://www.postgresql.org/docs/9.4/static/logicaldecoding.html Though I doubt your requirement to obtain only a subset of data is something that can be accommodated; especially in SQL form. And, yes, you can create triggers on views. http://www.postgresql.org/docs/9.4/static/sql-createtrigger.html But assuming your view is meant to be dynamic, covering only the subset of data you wish to watch, no one is going to be using your view to actually Insert/Update/Delete against the underlying table(s) so it will not do you any good to add triggers to it. You probably need to create some kind of materialized view and add a trigger to the relevant source table to maintain that view on an ongoing basis. Then remove the trigger (and optionally the materialized view) when you no longer care to watch. This: http://www.postgresql.org/docs/9.4/static/sql-creatematerializedview.html or roll your own. 
You can also use the full power of whatever programming languages you can install onto the server (e.g., pl/perl, pl/python, pl/sh) to link up with the outside world from inside one of those triggers... David J.
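As a sketch of the roll-your-own trigger approach, one common pattern is to pair a row trigger with LISTEN/NOTIFY so a proxy process can forward changes to clients; the actors table, the watch_actors channel name, and the payload format here are all hypothetical:

```sql
-- Forward each change on the watched table as a NOTIFY payload.
-- A client that has issued LISTEN watch_actors receives e.g. 'UPDATE:427'
-- and can re-query the relevant row to build its JSON event.
CREATE OR REPLACE FUNCTION notify_actors_change() RETURNS trigger AS $$
DECLARE
    rid text;
BEGIN
    IF TG_OP = 'DELETE' THEN
        rid := OLD.id::text;   -- NEW is not available in DELETE triggers
    ELSE
        rid := NEW.id::text;
    END IF;
    PERFORM pg_notify('watch_actors', TG_OP || ':' || rid);
    RETURN NULL;  -- return value is ignored for AFTER triggers
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER actors_watch
    AFTER INSERT OR UPDATE OR DELETE ON actors
    FOR EACH ROW EXECUTE PROCEDURE notify_actors_change();
```

Dropping the trigger when the client disconnects gives you the corresponding UNWATCH operation; note the NOTIFY payload only carries the key, so filtering down to the watched "slice" still has to happen in the proxy or in a re-query.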