Re: [PERFORM] Queries intermittently slow

2016-01-07 Thread Scott Rankin
On 1/6/16, 11:14 PM, "Jim Nasby"  wrote:


>On 1/6/16 12:01 PM, Scott Rankin wrote:
>> I guess we’re back to lock contention?
>
>Is there by chance an anti-wraparound vacuum happening on that table?
>
>Actually, for that matter... if autovacuum is hitting that table it's
>locking could be causing problems, and it won't release it's locks until
>the deadlock detector kicks in.
>--
>Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
>Experts in Analytics, Data Architecture and PostgreSQL
>Data in Trouble? Get it in Treble! http://BlueTreble.com

Hi Jim, Tom,

Thanks both of you for your help, by the way.

I lowered the deadlock_timeout this morning to 100ms and ran for a little 
while, but I did not see any lock waits even when the statements ran slowly.

Jim, to your point, I enabled auto vacuum logging 
(log_autovacuum_min_duration=0) but no vacuums are taking place even when I see 
slow queries.

Any other ideas?

Thanks,
Scott
This email message contains information that Motus, LLC considers confidential 
and/or proprietary, or may later designate as confidential and proprietary. It 
is intended only for use of the individual or entity named above and should not 
be forwarded to any other persons or entities without the express consent of 
Motus, LLC, nor should it be used for any purpose other than in the course of 
any potential or actual business relationship with Motus, LLC. If the reader of 
this message is not the intended recipient, or the employee or agent 
responsible to deliver it to the intended recipient, you are hereby notified 
that any dissemination, distribution, or copying of this communication is 
strictly prohibited. If you have received this communication in error, please 
notify sender immediately and destroy the original message.

Internal Revenue Service regulations require that certain types of written 
advice include a disclaimer. To the extent the preceding message contains 
advice relating to a Federal tax issue, unless expressly stated otherwise the 
advice is not intended or written to be used, and it cannot be used by the 
recipient or any other taxpayer, for the purpose of avoiding Federal tax 
penalties, and was not written to support the promotion or marketing of any 
transaction or matter discussed herein.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Queries intermittently slow

2016-01-07 Thread Tom Lane
Scott Rankin  writes:
> Any other ideas?

If there's no lock waits, and the tables are static, it's difficult to
arrive at any other conclusion but that the process is simply losing the
CPU for long intervals.  What else is going on on this box?  Have you
made any attempt to correlate the slow queries with spikes in load
average, swap activity, etc?

regards, tom lane


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Queries intermittently slow

2016-01-07 Thread Scott Rankin

On 1/7/16, 10:19 AM, "Tom Lane"  wrote:

>Scott Rankin  writes:
>> Any other ideas?
>
>If there's no lock waits, and the tables are static, it's difficult to
>arrive at any other conclusion but that the process is simply losing the
>CPU for long intervals.  What else is going on on this box?  Have you
>made any attempt to correlate the slow queries with spikes in load
>average, swap activity, etc?
>
>regards, tom lane

This box is a dedicated postgresql box, the load is running around 2-3 (on a 
24-core box).  No swapping.

I’ll keep digging!

Thanks,
Scott
This email message contains information that Motus, LLC considers confidential 
and/or proprietary, or may later designate as confidential and proprietary. It 
is intended only for use of the individual or entity named above and should not 
be forwarded to any other persons or entities without the express consent of 
Motus, LLC, nor should it be used for any purpose other than in the course of 
any potential or actual business relationship with Motus, LLC. If the reader of 
this message is not the intended recipient, or the employee or agent 
responsible to deliver it to the intended recipient, you are hereby notified 
that any dissemination, distribution, or copying of this communication is 
strictly prohibited. If you have received this communication in error, please 
notify sender immediately and destroy the original message.

Internal Revenue Service regulations require that certain types of written 
advice include a disclaimer. To the extent the preceding message contains 
advice relating to a Federal tax issue, unless expressly stated otherwise the 
advice is not intended or written to be used, and it cannot be used by the 
recipient or any other taxpayer, for the purpose of avoiding Federal tax 
penalties, and was not written to support the promotion or marketing of any 
transaction or matter discussed herein.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Queries intermittently slow

2016-01-07 Thread Andres Freund
On 2016-01-07 13:34:51 -0500, Tom Lane wrote:
> It's fairly well established that the implementation of transparent
> huge pages in Linux kernels from the 2.6-or-so era sucks, and you're
> best off turning it off if you care about consistency of performance.

I think the feature wasn't introduced in original 2.6 kernels (3.2 or
so?), but red hat had backported it to their 2.6.32 kernel.


> I am not sure whether modern kernels have improved this area.

I think the problem has largely been solved around 3.16. Around 4.1 I
could still  reproduce problems, but the regressions were only in the
sub 10% range in my test workload.


Andres


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Queries intermittently slow

2016-01-07 Thread Tom Lane
Rick Otten  writes:
> On Thu, Jan 7, 2016 at 12:51 PM, Scott Rankin  wrote:
>> Winner!  Both of those settings were set to always, and as soon as I
>> turned them off, the query times smoothed right out.

> Is this generally true for all PostgreSQL systems on Linux, or only for
> specific use cases?

It's fairly well established that the implementation of transparent
huge pages in Linux kernels from the 2.6-or-so era sucks, and you're
best off turning it off if you care about consistency of performance.
I am not sure whether modern kernels have improved this area.

I think you can get an idea of how big a problem you have by noting
the accumulated runtime of the khugepaged daemon.

(BTW, it would likely be a good thing to collect some current wisdom
in this area and add it to section 17.4 of our docs.)

regards, tom lane


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Queries intermittently slow

2016-01-07 Thread Jim Nasby

On 1/7/16 1:47 PM, Andres Freund wrote:

On 2016-01-07 13:34:51 -0500, Tom Lane wrote:

It's fairly well established that the implementation of transparent
huge pages in Linux kernels from the 2.6-or-so era sucks, and you're
best off turning it off if you care about consistency of performance.


I think the feature wasn't introduced in original 2.6 kernels (3.2 or
so?), but red hat had backported it to their 2.6.32 kernel.



I am not sure whether modern kernels have improved this area.


I think the problem has largely been solved around 3.16. Around 4.1 I
could still  reproduce problems, but the regressions were only in the
sub 10% range in my test workload.


BTW, looks like Scott blogged about this along with some nice graphs: 
https://sdwr98.wordpress.com/2016/01/07/transparent-huge-pages-or-why-dbas-are-great/

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Queries intermittently slow

2016-01-06 Thread Scott Rankin
On 1/6/16, 10:38 AM, "Tom Lane"  wrote:

>
>A possible theory is that the slow cases represent times when the desired
>page is not in cache, but you'd have to have a seriously overloaded disk
>subsystem for a disk fetch to take hundreds of ms.  Unless maybe this is
>running on some cloud service with totally unspecified I/O bandwidth?

This intrigues me. We are running on a, shall we say, less than name-brand 
cloud provider at the moment (transitioning to AWS later this month).  Is there 
a reasonably straightforward way of confirming this hypothesis? We have had 
many performance issues with this vendor in the past, so I wouldn’t be 
surprised.

>
>> I will look at changing the deadlock_timeout, but that might have to wait 
>> for the weekend since this is a production system.
>
>You needn't restart the server for that, just edit postgresql.conf and
>SIGHUP the postmaster.

Yep, we just try to limit any database changes until off-hours unless it’s an 
emergency.

>
>regards, tom lane

Thanks,
Scott
This email message contains information that Motus, LLC considers confidential 
and/or proprietary, or may later designate as confidential and proprietary. It 
is intended only for use of the individual or entity named above and should not 
be forwarded to any other persons or entities without the express consent of 
Motus, LLC, nor should it be used for any purpose other than in the course of 
any potential or actual business relationship with Motus, LLC. If the reader of 
this message is not the intended recipient, or the employee or agent 
responsible to deliver it to the intended recipient, you are hereby notified 
that any dissemination, distribution, or copying of this communication is 
strictly prohibited. If you have received this communication in error, please 
notify sender immediately and destroy the original message.

Internal Revenue Service regulations require that certain types of written 
advice include a disclaimer. To the extent the preceding message contains 
advice relating to a Federal tax issue, unless expressly stated otherwise the 
advice is not intended or written to be used, and it cannot be used by the 
recipient or any other taxpayer, for the purpose of avoiding Federal tax 
penalties, and was not written to support the promotion or marketing of any 
transaction or matter discussed herein.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Queries intermittently slow

2016-01-06 Thread Tom Lane
Scott Rankin  writes:
> Thanks for the update. The query in question is a pretty simple one - it 
> joins 3 tables, all of which are static - they don’t have any writes being 
> done against them.  They have very few rows, and the query plan for them 
> indicates that they are all sequential scans.  When doing an EXPLAIN ANALYZE, 
> the delay is not consistently on one table, it can vary between the three 
> tables involved in the query.

No writes at all?  That's pretty odd then.  All the likely explanations
involve autovacuum or other forms of deferred maintenance.

A possible theory is that the slow cases represent times when the desired
page is not in cache, but you'd have to have a seriously overloaded disk
subsystem for a disk fetch to take hundreds of ms.  Unless maybe this is
running on some cloud service with totally unspecified I/O bandwidth?

> I will look at changing the deadlock_timeout, but that might have to wait for 
> the weekend since this is a production system.

You needn't restart the server for that, just edit postgresql.conf and
SIGHUP the postmaster.

regards, tom lane


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Queries intermittently slow

2016-01-06 Thread Scott Rankin
On 1/6/16, 10:19 AM, "Tom Lane"  wrote:


>Scott Rankin  writes:
>> I’m trying to track down why some queries on my database system are 
>> intermittently much slower than usual.  I have some queries that run, on 
>> average, 2-3ms, and they run at a rate of about 10-20 queries/second.  
>> However, every 3-5 seconds, one of the queries will be 500-100ms.  This is 
>> making the average query time turn out to be closer to 20ms, with a very 
>> large standard deviation.
>
>> This happens to a number of otherwise very fast queries, and I’m trying to 
>> trace the reason.  I’ve turned on lock logging and checkpoint logging, and 
>> this behavior happens whether or not a checkpoint is occurring.  There are 
>> no lock waits happening in the system either.
>
>I doubt you've proved that --- log_lock_waits will only report on waits
>longer than deadlock_timeout, which you don't appear to have changed from
>its default of 1 sec.  If you're trying to capture events that last a few
>hundred msec, you're going to need to reduce deadlock_timeout to maybe
>100ms.
>
>It would help to know more about what the queries are, too.  The cause
>might be something like GIN index pending-list cleanup but we can't tell
>on the basis of this much info.
>
>regards, tom lane

Hi Tom,

Thanks for the update. The query in question is a pretty simple one - it joins 
3 tables, all of which are static - they don’t have any writes being done 
against them.  They have very few rows, and the query plan for them indicates 
that they are all sequential scans.  When doing an EXPLAIN ANALYZE, the delay 
is not consistently on one table, it can vary between the three tables involved 
in the query.

Here is the query plan for a fast run:

"Nested Loop  (cost=0.00..4.20 rows=1 width=185) (actual time=0.069..0.069 
rows=0 loops=1)"
"  Join Filter: (be.branding_id = b.branding_id)"
"  Rows Removed by Join Filter: 1"
"  ->  Nested Loop  (cost=0.00..3.19 rows=1 width=189) (actual 
time=0.040..0.057 rows=1 loops=1)"
"Join Filter: (s.setting_id = be.setting_id)"
"Rows Removed by Join Filter: 41"
"->  Seq Scan on branding_extension be  (cost=0.00..1.00 rows=1 
width=8) (actual time=0.008..0.008 rows=1 loops=1)"
"->  Seq Scan on setting s  (cost=0.00..2.04 rows=42 width=185) (actual 
time=0.004..0.018 rows=42 loops=1)"
"  ->  Seq Scan on branding b  (cost=0.00..1.01 rows=1 width=4) (actual 
time=0.006..0.008 rows=1 loops=1)"
"Filter: ((name)::text = 'crs'::text)"
"Rows Removed by Filter: 1"
"Total runtime: 0.150 ms"

And for a slow one:


"Nested Loop  (cost=0.00..4.20 rows=1 width=185) (actual time=383.862..383.862 
rows=0 loops=1)"
"  Join Filter: (be.branding_id = b.branding_id)"
"  Rows Removed by Join Filter: 1"
"  ->  Nested Loop  (cost=0.00..3.19 rows=1 width=189) (actual 
time=383.815..383.849 rows=1 loops=1)"
"Join Filter: (s.setting_id = be.setting_id)"
"Rows Removed by Join Filter: 41"
"->  Seq Scan on branding_extension be  (cost=0.00..1.00 rows=1 
width=8) (actual time=383.776..383.776 rows=1 loops=1)"
"->  Seq Scan on setting s  (cost=0.00..2.04 rows=42 width=185) (actual 
time=0.005..0.037 rows=42 loops=1)"
"  ->  Seq Scan on branding b  (cost=0.00..1.01 rows=1 width=4) (actual 
time=0.007..0.009 rows=1 loops=1)"
"Filter: ((name)::text = 'crs'::text)"
"Rows Removed by Filter: 1"
"Total runtime: 383.920 ms"

I will look at changing the deadlock_timeout, but that might have to wait for 
the weekend since this is a production system.


Thanks,
Scott
This email message contains information that Motus, LLC considers confidential 
and/or proprietary, or may later designate as confidential and proprietary. It 
is intended only for use of the individual or entity named above and should not 
be forwarded to any other persons or entities without the express consent of 
Motus, LLC, nor should it be used for any purpose other than in the course of 
any potential or actual business relationship with Motus, LLC. If the reader of 
this message is not the intended recipient, or the employee or agent 
responsible to deliver it to the intended recipient, you are hereby notified 
that any dissemination, distribution, or copying of this communication is 
strictly prohibited. If you have received this communication in error, please 
notify sender immediately and destroy the original message.

Internal Revenue Service regulations require that certain types of written 
advice include a disclaimer. To the extent the preceding message contains 
advice relating to a Federal tax issue, unless expressly stated otherwise the 
advice is not intended or written to be used, and it cannot be used by the 
recipient or any other taxpayer, for the purpose of avoiding Federal tax 
penalties, and was not written to support the promotion or marketing of any 
transaction or matter discussed herein.

-- 
Sent via pgsql-performance mailing list 

[PERFORM] Queries intermittently slow

2016-01-06 Thread Scott Rankin
Hi all,

I’m trying to track down why some queries on my database system are 
intermittently much slower than usual.  I have some queries that run, on 
average, 2-3ms, and they run at a rate of about 10-20 queries/second.  However, 
every 3-5 seconds, one of the queries will be 500-100ms.  This is making the 
average query time turn out to be closer to 20ms, with a very large standard 
deviation.

This happens to a number of otherwise very fast queries, and I’m trying to 
trace the reason.  I’ve turned on lock logging and checkpoint logging, and this 
behavior happens whether or not a checkpoint is occurring.  There are no lock 
waits happening in the system either.

Here’s my info:

- PostgreSQL 9.2.9 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.7 
20120313 (Red Hat 4.4.7-4), 64-bit
- Installed from Postgresql Yum repo
- Changed Settings:

"application_name";"pgAdmin III - Query Tool";"client"
"archive_command";"/usr/crsinc/bin/wal_processor.sh %f %p";"configuration file"
"archive_mode";"on";"configuration file"
"autovacuum_analyze_scale_factor";"0.1";"configuration file"
"autovacuum_analyze_threshold";"500";"configuration file"
"autovacuum_naptime";"10min";"configuration file"
"autovacuum_vacuum_cost_delay";"50ms";"configuration file"
"autovacuum_vacuum_cost_limit";"250";"configuration file"
"autovacuum_vacuum_scale_factor";"0.1";"configuration file"
"autovacuum_vacuum_threshold";"1000";"configuration file"
"bgwriter_delay";"100ms";"configuration file"
"bgwriter_lru_maxpages";"1000";"configuration file"
"bytea_output";"escape";"session"
"checkpoint_completion_target";"0.7";"configuration file"
"checkpoint_segments";"128";"configuration file"
"checkpoint_timeout";"30min";"configuration file"
"checkpoint_warning";"895s";"configuration file"
"client_encoding";"UNICODE";"session"
"client_min_messages";"notice";"session"
"cpu_tuple_cost";"0.001";"configuration file"
"DateStyle";"ISO, MDY";"session"
"default_text_search_config";"pg_catalog.english";"configuration file"
"effective_cache_size";"94GB";"configuration file"
"effective_io_concurrency";"100";"configuration file"
"hot_standby";"off";"configuration file"
"hot_standby_feedback";"off";"configuration file"
"lc_messages";"en_US.UTF-8";"configuration file"
"lc_monetary";"en_US.UTF-8";"configuration file"
"lc_numeric";"en_US.UTF-8";"configuration file"
"lc_time";"en_US.UTF-8";"configuration file"
"listen_addresses";"*";"configuration file"
"log_checkpoints";"on";"configuration file"
"log_destination";"stderr";"configuration file"
"log_directory";"pg_log";"configuration file"
"log_filename";"postgresql-%a.log";"configuration file"
"log_line_prefix";"< user=%u db=%d host=%h time=%t pid=%p 
xid=%x>";"configuration file"
"log_lock_waits";"on";"configuration file"
"log_min_duration_statement";"100ms";"configuration file"
"log_rotation_age";"1d";"configuration file"
"log_rotation_size";"0";"configuration file"
"log_temp_files";"0";"configuration file"
"log_truncate_on_rotation";"on";"configuration file"
"logging_collector";"on";"configuration file"
"maintenance_work_mem";"6047MB";"configuration file"
"max_connections";"1300";"configuration file"
"max_stack_depth";"2MB";"environment variable"
"max_wal_senders";"4";"configuration file"
"port";"5432";"command line"
"random_page_cost";"1.9";"configuration file"
"shared_buffers";"8GB";"configuration file"
"ssl";"on";"configuration file"
"ssl_renegotiation_limit";"0";"configuration file"
"standard_conforming_strings";"off";"configuration file"
"superuser_reserved_connections";"10";"configuration file"
"TimeZone";"US/Pacific";"configuration file"
"wal_keep_segments";"64";"configuration file"
"wal_level";"hot_standby";"configuration file"
"work_mem";"188MB";"configuration file"

OS: CentOS release 6.7 (Linux databasep1.crsinc.com 2.6.32-504.16.2.el6.x86_64 
#1 SMP Wed Apr 22 06:48:29 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux). Our server 
has 24 cores and 128 GB of RAM.

This happens via psql, JDBC, PDO connections.

Thanks in advance for any help!
Scott

This email message contains information that Motus, LLC considers confidential 
and/or proprietary, or may later designate as confidential and proprietary. It 
is intended only for use of the individual or entity named above and should not 
be forwarded to any other persons or entities without the express consent of 
Motus, LLC, nor should it be used for any purpose other than in the course of 
any potential or actual business relationship with Motus, LLC. If the reader of 
this message is not the intended recipient, or the employee or agent 
responsible to deliver it to the intended recipient, you are hereby notified 
that any dissemination, distribution, or copying of this communication is 
strictly prohibited. If you have received this communication in error, please 
notify sender immediately and destroy the original message.

Internal Revenue Service regulations require that certain types of written 
advice include a disclaimer. To the extent the preceding 

Re: [PERFORM] Queries intermittently slow

2016-01-06 Thread Tom Lane
Scott Rankin  writes:
> I’m trying to track down why some queries on my database system are 
> intermittently much slower than usual.  I have some queries that run, on 
> average, 2-3ms, and they run at a rate of about 10-20 queries/second.  
> However, every 3-5 seconds, one of the queries will be 500-100ms.  This is 
> making the average query time turn out to be closer to 20ms, with a very 
> large standard deviation.

> This happens to a number of otherwise very fast queries, and I’m trying to 
> trace the reason.  I’ve turned on lock logging and checkpoint logging, and 
> this behavior happens whether or not a checkpoint is occurring.  There are no 
> lock waits happening in the system either.

I doubt you've proved that --- log_lock_waits will only report on waits
longer than deadlock_timeout, which you don't appear to have changed from
its default of 1 sec.  If you're trying to capture events that last a few
hundred msec, you're going to need to reduce deadlock_timeout to maybe
100ms.

It would help to know more about what the queries are, too.  The cause
might be something like GIN index pending-list cleanup but we can't tell
on the basis of this much info.

regards, tom lane


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Queries intermittently slow

2016-01-06 Thread Tom Lane
Scott Rankin  writes:
> On 1/6/16, 10:38 AM, "Tom Lane"  wrote:
>> A possible theory is that the slow cases represent times when the desired
>> page is not in cache, but you'd have to have a seriously overloaded disk
>> subsystem for a disk fetch to take hundreds of ms.  Unless maybe this is
>> running on some cloud service with totally unspecified I/O bandwidth?

> This intrigues me. We are running on a, shall we say, less than name-brand 
> cloud provider at the moment (transitioning to AWS later this month).  Is 
> there a reasonably straightforward way of confirming this hypothesis? We have 
> had many performance issues with this vendor in the past, so I wouldn’t be 
> surprised.

Hm, well, given that you are able to capture instances of the behavior
in EXPLAIN ANALYZE, I'd suggest trying EXPLAIN (ANALYZE,BUFFERS).
That will tell you the number of pages it found in shared buffers vs.
having to read them.  Now, a "read" just means we had to ask the kernel,
not necessarily that the page came all the way from disk; if it's in
the kernel's disk cache that won't be very much slower than a shared-
buffers hit.  Still, if the slowdowns are reliably seen only when a read
occurred, I'd say that's good evidence.

regards, tom lane


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Queries intermittently slow

2016-01-06 Thread Scott Rankin
>
>So what seems more likely after more thought is that the pages are staying
>in our shared buffer arena just fine, but the kernel is randomly choosing
>to swap out parts of the arena, and the delays correspond to swap-in
>waits.  (There would still have to be a mighty crummy disk subsystem
>underlying things for swap-in to take so long, but this is a more
>plausible theory for exactly what's invoking the disk read.)
>
>Postgres can't directly see when this is happening, but you could try
>watching "iostat 1" and noticing whether swap-in events seem to be
>correlated with the slow queries.
>
>If this is the problem, then the answer is to reduce the pressure on
>system memory so that swap-outs are less likely.  You might find that
>a smaller shared_buffer arena is a good thing (so that all of it stays
>"hot" and unswappable from the kernel's perspective).  Or reduce the
>number of active backend processes.
>
>regards, tom lane

One change to my setup that my server guy pointed out - this database server is 
*not* a cloud server, but in fact a physical Dell box with SSDs.  So unless 
something is majorly wrong with our disk setup, disk IO should not be a factor.

I also checked and the box is not swapping at all.

I guess we’re back to lock contention?

Thanks,
Scott
This email message contains information that Motus, LLC considers confidential 
and/or proprietary, or may later designate as confidential and proprietary. It 
is intended only for use of the individual or entity named above and should not 
be forwarded to any other persons or entities without the express consent of 
Motus, LLC, nor should it be used for any purpose other than in the course of 
any potential or actual business relationship with Motus, LLC. If the reader of 
this message is not the intended recipient, or the employee or agent 
responsible to deliver it to the intended recipient, you are hereby notified 
that any dissemination, distribution, or copying of this communication is 
strictly prohibited. If you have received this communication in error, please 
notify sender immediately and destroy the original message.

Internal Revenue Service regulations require that certain types of written 
advice include a disclaimer. To the extent the preceding message contains 
advice relating to a Federal tax issue, unless expressly stated otherwise the 
advice is not intended or written to be used, and it cannot be used by the 
recipient or any other taxpayer, for the purpose of avoiding Federal tax 
penalties, and was not written to support the promotion or marketing of any 
transaction or matter discussed herein.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Queries intermittently slow

2016-01-06 Thread Tom Lane
I wrote:
> A possible theory is that the slow cases represent times when the desired
> page is not in cache, but you'd have to have a seriously overloaded disk
> subsystem for a disk fetch to take hundreds of ms.  Unless maybe this is
> running on some cloud service with totally unspecified I/O bandwidth?

BTW, a glaring flaw in that theory is that if this query is touching only
about four pages worth of data, and you are running it ten times a second,
how in heck would that data ever fall out of shared buffer cache at all?
Your working set across your whole DB would have to be enormously more
than your 8GB shared_buffers setting for that to possibly happen.

So what seems more likely after more thought is that the pages are staying
in our shared buffer arena just fine, but the kernel is randomly choosing
to swap out parts of the arena, and the delays correspond to swap-in
waits.  (There would still have to be a mighty crummy disk subsystem
underlying things for swap-in to take so long, but this is a more
plausible theory for exactly what's invoking the disk read.)

Postgres can't directly see when this is happening, but you could try
watching "iostat 1" and noticing whether swap-in events seem to be
correlated with the slow queries.

If this is the problem, then the answer is to reduce the pressure on
system memory so that swap-outs are less likely.  You might find that
a smaller shared_buffer arena is a good thing (so that all of it stays
"hot" and unswappable from the kernel's perspective).  Or reduce the
number of active backend processes.

regards, tom lane


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Queries intermittently slow

2016-01-06 Thread Scott Rankin
>
>Hm, well, given that you are able to capture instances of the behavior
>in EXPLAIN ANALYZE, I'd suggest trying EXPLAIN (ANALYZE,BUFFERS).
>That will tell you the number of pages it found in shared buffers vs.
>having to read them.  Now, a "read" just means we had to ask the kernel,
>not necessarily that the page came all the way from disk; if it's in
>the kernel's disk cache that won't be very much slower than a shared-
>buffers hit.  Still, if the slowdowns are reliably seen only when a read
>occurred, I'd say that's good evidence.

Well, that was a good thought - but:

Nested Loop  (cost=0.00..4.20 rows=1 width=185) (actual time=567.113..567.113 
rows=0 loops=1)
  Join Filter: (be.branding_id = b.branding_id)
  Rows Removed by Join Filter: 1
  Buffers: shared hit=4
  ->  Nested Loop  (cost=0.00..3.19 rows=1 width=189) (actual 
time=567.081..567.092 rows=1 loops=1)
Join Filter: (s.setting_id = be.setting_id)
Rows Removed by Join Filter: 41
Buffers: shared hit=3
->  Seq Scan on branding_extension be  (cost=0.00..1.00 rows=1 width=8) 
(actual time=0.011..0.012 rows=1 loops=1)
  Buffers: shared hit=1
->  Seq Scan on setting s  (cost=0.00..2.04 rows=42 width=185) (actual 
time=567.041..567.049 rows=42 loops=1)
  Buffers: shared hit=2
  ->  Seq Scan on branding b  (cost=0.00..1.01 rows=1 width=4) (actual 
time=0.007..0.009 rows=1 loops=1)
Filter: ((name)::text = 'crs'::text)
Rows Removed by Filter: 1
Buffers: shared hit=1
Total runtime: 567.185 ms


I ran it several times and there were no reads, just shared hits, even on slow 
executions.

It looks like it didn’t go out to the kernel for the data. I guess next steps 
would be to adjust the deadlock_timeout? If I’m reading the docs correctly, 
this doesn’t change how Postgres determines a deadlock, it just makes it check 
for them sooner?

Thanks,
Scott
This email message contains information that Motus, LLC considers confidential 
and/or proprietary, or may later designate as confidential and proprietary. It 
is intended only for use of the individual or entity named above and should not 
be forwarded to any other persons or entities without the express consent of 
Motus, LLC, nor should it be used for any purpose other than in the course of 
any potential or actual business relationship with Motus, LLC. If the reader of 
this message is not the intended recipient, or the employee or agent 
responsible to deliver it to the intended recipient, you are hereby notified 
that any dissemination, distribution, or copying of this communication is 
strictly prohibited. If you have received this communication in error, please 
notify sender immediately and destroy the original message.

Internal Revenue Service regulations require that certain types of written 
advice include a disclaimer. To the extent the preceding message contains 
advice relating to a Federal tax issue, unless expressly stated otherwise the 
advice is not intended or written to be used, and it cannot be used by the 
recipient or any other taxpayer, for the purpose of avoiding Federal tax 
penalties, and was not written to support the promotion or marketing of any 
transaction or matter discussed herein.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Queries intermittently slow

2016-01-06 Thread Jim Nasby

On 1/6/16 12:01 PM, Scott Rankin wrote:

I guess we’re back to lock contention?


Is there by chance an anti-wraparound vacuum happening on that table?

Actually, for that matter... if autovacuum is hitting that table it's 
locking could be causing problems, and it won't release it's locks until 
the deadlock detector kicks in.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance