Re: [PERFORM] Slow execution of SET ROLE, SET search_path and RESET ROLE
On 2017-11-07 18:48:14 +0100, Ulf Lohbrügge wrote:
> Hi,
>
> 2017-11-07 16:11 GMT+01:00 Andres Freund <and...@anarazel.de>:
>> Hi,
>>
>> On 2017-11-07 11:11:36 +0100, Ulf Lohbrügge wrote:
>>> I'm using PostgreSQL 9.5.9 on Debian and experience slow execution of
>>> some basic SET statements.
>>>
>>> I created about 1600 roles and use that setup for a multi tenancy
>>> application:
>>
>> Hm. How often do you drop/create these roles? How many other
>> roles/groups is one role a member of?
>
> I create between 10-40 roles per day.

Could you VACUUM (VERBOSE, FREEZE) that table and report the output? Do you ever delete roles?

>> Can you manually reproduce the problem? What times do you get if you
>> manually run the statement?
>
> Unfortunately not. Every time I manually execute "SET ROLE ..." the
> statement is pretty fast. I created a simple SQL file that contains the
> following statements:
>
> --snip--
> SET ROLE tenant382;
> SET ROLE tenant1337;
> SET ROLE tenant2;
> -- repeat the lines above 100k times
> --snap--
>
> When I execute those statements via 'time psql < set-roles.sql', the call
> lasts 138.7 seconds. So 300k "SET ROLE" statements result in 0.46ms per
> call on average.

And most of that is going to be roundtrip time. Hm. Could it be that you're just seeing the delays when pgbouncer establishes new pooling connections, and you're attributing that to SET ROLE in your app?

Greetings,

Andres Freund

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
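The ad-hoc benchmark described above can be reproduced with a short shell script. This is a sketch, not the original poster's exact file: the tenant role names and the repetition count are taken from the quoted snippet, and the final timing step requires a running cluster where those roles exist, so it is left as a comment.

```shell
# Sketch: generate 100k repetitions of the three SET ROLE statements
# (300k statements total), as described in the quoted message.
for i in $(seq 1 100000); do
    printf 'SET ROLE tenant382;\nSET ROLE tenant1337;\nSET ROLE tenant2;\n'
done > set-roles.sql

wc -l < set-roles.sql    # 300000 statements, one per line

# Against a real cluster (the tenant roles must exist), the timing
# step from the message would be:
# time psql -d mydb -f set-roles.sql >/dev/null
```

Dividing the wall-clock time by the statement count, as done above (138.7 s / 300k = 0.46 ms), measures mostly client/server round trips, which is exactly Andres' point.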
Re: [PERFORM] Slow execution of SET ROLE, SET search_path and RESET ROLE
Hi,

On 2017-11-07 11:11:36 +0100, Ulf Lohbrügge wrote:
> I'm using PostgreSQL 9.5.9 on Debian and experience slow execution of some
> basic SET statements.
>
> I created about 1600 roles and use that setup for a multi tenancy
> application:

Hm. How often do you drop/create these roles? How many other roles/groups is one role a member of?

> My application solely uses the role 'admin' to connect to the database.
> When performing sql statements for a specific tenant (e.g. tenant1337), a
> connection with user 'admin' is established and the following commands are
> executed:
>
> SET ROLE 1337;
> SET search_path = tenant1337;
>
> Then the application uses that connection to perform various statements in
> the database.

Just to be sure: You realize bad application code could escape from that, right?

> My application is a web service that approximately executes some hundred
> statements per second.
>
> I set "log_min_duration_statement = 200ms" and I get about 2k to 4k lines
> per day with statements like "SET ROLE", "SET search_path ..." and "RESET
> ROLE":
>
> --snip--
> 2017-11-07 09:44:30 CET [27760]: [3-1] user=admin,db=mydb LOG: duration:
> 901.591 ms execute : SET ROLE "tenant762"
> 2017-11-07 09:44:30 CET [27659]: [4-1] user=admin,db=mydb LOG: duration:
> 1803.971 ms execute : SET ROLE "tenant392"

That is weird.

> Besides those peaks in statement duration, my application performs well
> (i.e. has acceptable response times) most of the time.
>
> Is there anything I can do to improve performance here?
> Any help is greatly appreciated!

Can you manually reproduce the problem? What times do you get if you manually run the statement?

Greetings,

Andres Freund
Re: [PERFORM] query of partitioned object doesnt use index in qa
On September 15, 2017 1:42:23 PM PDT, Tom Lane wrote:
> One thing you could consider doing about this is creating an index
> on (body ->> 'SID'::text), which would prompt ANALYZE to gather
> statistics about that expression. Even if the index weren't actually
> used in the plan, this might improve the estimates and the resulting
> planning choices enough to make it worth maintaining such an index.

I'm wondering if we should extend the new CREATE STATISTICS framework to be able to do that without requiring an index. I.e. allow expressions, and add a new type of stats that just corresponds to what normal columns have. We could even create that implicitly for expression indexes, but allow dropping it if the overhead isn't worth it.

Andres

--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
Re: [PERFORM] Questionaire: Common WAL write rates on busy servers.
On 2017-04-28 01:29:14 +0200, Tomas Vondra wrote:
> I can confirm this observation. I bought the Intel 750 NVMe SSD last year;
> the device has 1GB DDR3 cache on it (power-loss protected) and can do
> ~1GB/s of sustained O_DIRECT sequential writes. But when running pgbench,
> I can't push more than ~300MB/s of WAL to it, no matter what I do, because
> of WALWriteLock.

Hm, interesting. Even if you up wal_buffers to 128MB, use synchronous_commit = off, and play with wal_writer_delay/flush_after?

- Andres
Re: [PERFORM] Questionaire: Common WAL write rates on busy servers.
On 2017-04-27 10:29:48 -0700, Joshua D. Drake wrote:
> On 04/27/2017 09:34 AM, Andres Freund wrote:
>> On 2017-04-27 09:31:34 -0700, Joshua D. Drake wrote:
>>> On 04/27/2017 08:59 AM, Andres Freund wrote:
>>>
>>> I would agree it isn't yet a widespread issue.
>>
>> I'm not yet sure about that actually. I suspect a large percentage of
>> people with such workloads aren't lingering much on the lists.
>
> That would probably be true. I was thinking of it more as "most new users
> are in the cloud", and it is going to be rare that a cloud user is going
> to be able to hit that level of writes (at least not without spending
> LOTS of money).

You can get pretty decent NVMe SSD drives from several cloud providers these days, without them immediately bankrupting you. Sure, it's instance storage, but with a decent replication and archival setup, that's not necessarily an issue.

It's not that hard to get to the point where postgres can't keep up with storage, at least for some workloads.

- Andres
Re: [PERFORM] Questionaire: Common WAL write rates on busy servers.
On 2017-04-27 09:31:34 -0700, Joshua D. Drake wrote:
> On 04/27/2017 08:59 AM, Andres Freund wrote:
>> Ok, based on the, few, answers I've got so far, my experience is indeed
>> skewed. A number of the PG users I interacted with over the last couple
>> of years had WAL write rates somewhere in the range of 500MB/s to
>> 2.2GB/s (the max I've seen). At that point WAL insertion became a major
>> bottleneck, even if storage was more than fast enough to keep up. To
>> address these we'd need some changes, but the feedback so far suggests
>> that it's not yet a widespread issue...
>
> I would agree it isn't yet a widespread issue.

I'm not yet sure about that actually. I suspect a large percentage of people with such workloads aren't lingering much on the lists.

> The only people that are likely going to see this are going to be on bare
> metal. We should definitely plan on that issue for, say, 11.

"plan on that issue" - heh. We're talking about major engineering projects here ;)

> I do have a question though: where you have seen this issue, is it with
> synchronous_commit on or off?

Both. Whether that matters or not really depends on the workload. If you have bulk writes, it doesn't really matter much.

- Andres
Re: [PERFORM] Questionaire: Common WAL write rates on busy servers.
Hi,

On 2017-04-24 21:17:43 -0700, Andres Freund wrote:
> I've lately seen more and more installations where the generation of
> write-ahead log (WAL) is one of the primary bottlenecks. I'm curious
> whether that's primarily a "sampling error" of mine, or whether that's
> indeed more common.
>
> The primary reason I'm curious is that I'm pondering a few potential
> optimizations, and would like to have some guidance which are more and
> which are less important.
>
> Questions (answer as many as you can comfortably answer):
> - How many MB/s, segments/s do you see on busier servers?
> - What generates the bulk of WAL on your servers (9.5+ can use
>   pg_xlogdump --stats to compute that)?
> - Are you seeing WAL writes being a bottleneck?
> - What kind of backup methods are you using and is the WAL volume a
>   problem?
> - What kind of replication are you using and is the WAL volume a
>   problem?
> - What are your settings for wal_compression, max_wal_size (9.5+) /
>   checkpoint_segments (< 9.5), checkpoint_timeout and wal_buffers?
> - Could you quickly describe your workload?

Ok, based on the, few, answers I've got so far, my experience is indeed skewed. A number of the PG users I interacted with over the last couple of years had WAL write rates somewhere in the range of 500MB/s to 2.2GB/s (the max I've seen). At that point WAL insertion became a major bottleneck, even if storage was more than fast enough to keep up. To address these we'd need some changes, but the feedback so far suggests that it's not yet a widespread issue...

- Andres
[PERFORM] Questionaire: Common WAL write rates on busy servers.
Hi,

I've lately seen more and more installations where the generation of write-ahead log (WAL) is one of the primary bottlenecks. I'm curious whether that's primarily a "sampling error" of mine, or whether that's indeed more common.

The primary reason I'm curious is that I'm pondering a few potential optimizations, and would like to have some guidance which are more and which are less important.

Questions (answer as many as you can comfortably answer):
- How many MB/s, segments/s do you see on busier servers?
- What generates the bulk of WAL on your servers (9.5+ can use
  pg_xlogdump --stats to compute that)?
- Are you seeing WAL writes being a bottleneck?
- What kind of backup methods are you using and is the WAL volume a
  problem?
- What kind of replication are you using and is the WAL volume a
  problem?
- What are your settings for wal_compression, max_wal_size (9.5+) /
  checkpoint_segments (< 9.5), checkpoint_timeout and wal_buffers?
- Could you quickly describe your workload?

Feel free to add any information you think is pertinent ;)

Greetings,

Andres Freund
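For the first question, a segments/s figure converts to MB/s directly, since a WAL segment is 16MB by default. A quick sketch of the arithmetic (the 32 segments/s input is a made-up figure for illustration):

```shell
# WAL segments are 16MB each by default (fixed at build/initdb time).
SEGMENTS_PER_SEC=32                       # hypothetical observed rate
echo $(( SEGMENTS_PER_SEC * 16 )) MB/s    # prints: 512 MB/s
```

The same conversion works in reverse when judging whether a reported MB/s number implies an unusually high segment turnover.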
Re: [PERFORM] pgsql-performance issue
Hi,

On 2016-08-20 08:38:43 +0000, debasis.mohar...@ipathsolutions.co.in wrote:
> I have a PostgreSQL 9.5 instance running on a Windows 8 machine with 4GB
> of RAM. This server is mainly used for inserting/updating large amounts
> of data via copy/insert/update commands, and seldom for running select
> queries.
>
> Here are the relevant configuration parameters I changed:
>
> max_connections = 100
> shared_buffers = 512MB
> effective_cache_size = 3GB
> work_mem = 12233kB
> maintenance_work_mem = 256MB
> min_wal_size = 1GB
> max_wal_size = 2GB
> checkpoint_completion_target = 0.7
> wal_buffers = 16MB
> default_statistics_target = 100
>
> After setting these in postgresql.conf, I run a select query that fetches
> a large number of records (29000). It takes 10.3 seconds in PostgreSQL,
> but the same query takes 2 seconds in MSSQL.
>
> So my question is how to improve the performance in PostgreSQL.

Please provide the output of

  EXPLAIN (ANALYZE, BUFFERS) yourquery;

and your query. Then we'll possibly be able to help you - atm we don't have enough information.

Regards,

Andres
Re: [HACKERS] [PERFORM] 9.4 -> 9.5 regression with queries through pgbouncer on RHEL 6
On 2016-07-04 16:30:51 +0300, Vladimir Borodin wrote:
>> On 13 June 2016, at 21:58, Vladimir Borodin <r...@simply.name> wrote:
>>
>>> On 13 June 2016, at 00:51, Andres Freund <and...@anarazel.de> wrote:
>>>
>>> Hi Vladimir,
>>>
>>> Thanks for these reports.
>>>
>>> On 2016-06-13 00:42:19 +0300, Vladimir Borodin wrote:
>>>> perf report -g -i pg9?_all.data >/tmp/pg9?_perf_report.txt
>>>
>>> Any chance you could redo the reports with --no-children
>>> --call-graph=fractal added? The mode that includes child overheads
>>> unfortunately makes the output hard to interpret/compare.
>>
>> Of course. Not sure if that is important, but I upgraded perf for that
>> (because the --no-children option was introduced in ~3.16), so perf
>> record and perf report were done with different perf versions.
>>
>> Also I've done the same test on the same host (RHEL 6) but with a 4.6
>> kernel/perf and writing perf data to /dev/shm so as not to lose events.
>> The perf report output is also attached, but the important thing is
>> that the regression is not as significant:

FWIW, you can instead use -F 300 or something to reduce the sampling frequency.

>> root@pgload05g ~ # uname -r
>> 4.6.0-1.el6.elrepo.x86_64
>> root@pgload05g ~ # cat /proc/sys/kernel/sched_autogroup_enabled
>> 1
>> root@pgload05g ~ # /tmp/run.sh
>> RHEL 6  9.4  71634  0.893
>> RHEL 6  9.5  54005  1.185
>> RHEL 6  9.6  65550  0.976
>> root@pgload05g ~ # echo 0 >/proc/sys/kernel/sched_autogroup_enabled
>> root@pgload05g ~ # /tmp/run.sh
>> RHEL 6  9.4  73041  0.876
>> RHEL 6  9.5  60105  1.065
>> RHEL 6  9.6  67984  0.941
>> root@pgload05g ~ #
>
> Andres, is there any chance that you would find time to look at those
> results? Are they actually useful?

I don't really see anything suspicious in the profile. This looks more like a kernel scheduler issue than a postgres bottleneck. It seems that somehow using nonblocking IO (started in 9.5) causes scheduling issues when pgbouncer is also local.

Could you do

  perf stat -ddd -a sleep 10

or something during both runs? I suspect that the context switch ratios will be quite different.

Andres
Re: [HACKERS] [PERFORM] 9.4 -> 9.5 regression with queries through pgbouncer on RHEL 6
On 2016-06-13 21:58:30 +0300, Vladimir Borodin wrote:
>> On 13 June 2016, at 00:51, Andres Freund <and...@anarazel.de> wrote:
>>
>> Hi Vladimir,
>>
>> Thanks for these reports.
>>
>> On 2016-06-13 00:42:19 +0300, Vladimir Borodin wrote:
>>> perf report -g -i pg9?_all.data >/tmp/pg9?_perf_report.txt
>>
>> Any chance you could redo the reports with --no-children
>> --call-graph=fractal added? The mode that includes child overheads
>> unfortunately makes the output hard to interpret/compare.
>
> Of course. Not sure if that is important, but I upgraded perf for that
> (because the --no-children option was introduced in ~3.16), so perf
> record and perf report were done with different perf versions.
>
> Also I've done the same test on the same host (RHEL 6) but with a 4.6
> kernel/perf and writing perf data to /dev/shm so as not to lose events.
> The perf report output is also attached, but the important thing is that
> the regression is not as significant:
>
> root@pgload05g ~ # uname -r
> 4.6.0-1.el6.elrepo.x86_64
> root@pgload05g ~ # cat /proc/sys/kernel/sched_autogroup_enabled
> 1
> root@pgload05g ~ # /tmp/run.sh
> RHEL 6  9.4  71634  0.893
> RHEL 6  9.5  54005  1.185
> RHEL 6  9.6  65550  0.976
> root@pgload05g ~ # echo 0 >/proc/sys/kernel/sched_autogroup_enabled
> root@pgload05g ~ # /tmp/run.sh
> RHEL 6  9.4  73041  0.876
> RHEL 6  9.5  60105  1.065
> RHEL 6  9.6  67984  0.941
> root@pgload05g ~ #

Hm. Have you measured how large the slowdown is if you connect via TCP to pgbouncer, but have pgbouncer connect to postgres via unix sockets?

Andres
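A minimal sketch of the suggested topology - pgbouncer listening on TCP while talking to postgres over a unix socket. The database name, socket directory, port, and the trust auth setting are all assumptions for illustration; a real setup needs proper auth and paths:

```shell
# Hypothetical pgbouncer.ini: clients connect via TCP; pgbouncer
# connects to the backend via unix socket (in pgbouncer, a host value
# starting with '/' is treated as a unix socket directory).
cat > /tmp/pgbouncer-unix.ini <<'EOF'
[databases]
mydb = host=/var/run/postgresql dbname=mydb

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = trust
EOF

# Then (assuming pgbouncer is installed):
# pgbouncer -d /tmp/pgbouncer-unix.ini
```

Comparing pgbench through this setup against an all-TCP setup would isolate which hop the regression is sensitive to.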
Re: [PERFORM] 9.4 -> 9.5 regression with queries through pgbouncer on RHEL 6
Hi Vladimir,

Thanks for these reports.

On 2016-06-13 00:42:19 +0300, Vladimir Borodin wrote:
> perf report -g -i pg9?_all.data >/tmp/pg9?_perf_report.txt

Any chance you could redo the reports with --no-children --call-graph=fractal added? The mode that includes child overheads unfortunately makes the output hard to interpret/compare.

> The results from pg9?_perf_report.txt are attached. Note that in all
> cases some events were lost, i.e.:
>
> root@pgload05g ~ # perf report -g -i pg94_all.data >/tmp/pg94_perf_report.txt
> Failed to open [vsyscall], continuing without symbols
> Warning:
> Processed 537137 events and lost 7846 chunks!

You can reduce the overhead by reducing the sampling frequency, e.g. by specifying -F 300.

Greetings,

Andres Freund
Re: [PERFORM] 9.4 -> 9.5 regression with queries through pgbouncer on RHEL 6
Hi,

On 2016-06-02 14:18:26 +0300, Антон Бушмелев wrote:
> UP. Repeated the tests on a local VM.. the results are discouraging:
>
> OS        PG     TPS         AVG latency
> Centos 7  9.5.3  23.711023   168.421
> Centos 7  9.5.3  26.609271   150.188
> Centos 7  9.5.3  25.220044   158.416
> Centos 7  9.5.3  25.598977   156.047
> Centos 7  9.4.8  278.572191  14.077
> Centos 7  9.4.8  247.237755  16.177
> Centos 7  9.4.8  240.007524  16.276
> Centos 7  9.4.8  237.862238  16.596

Could you provide profiles on 9.4 and 9.5? Which kernel did you have enabled? Is /proc/sys/kernel/sched_autogroup_enabled 1 or 0?

Regards,

Andres
Re: [PERFORM] Queries intermittently slow
On 2016-01-07 13:34:51 -0500, Tom Lane wrote:
> It's fairly well established that the implementation of transparent
> huge pages in Linux kernels from the 2.6-or-so era sucks, and you're
> best off turning it off if you care about consistency of performance.

I think the feature wasn't introduced in the original 2.6 kernels (3.2 or so?), but Red Hat had backported it to their 2.6.32 kernel.

> I am not sure whether modern kernels have improved this area.

I think the problem has largely been solved around 3.16. Around 4.1 I could still reproduce problems, but the regressions were only in the sub-10% range in my test workload.

Andres
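For anyone wanting to check this on their own systems, a sketch follows. The path shown is the common sysfs location; some older Red Hat kernels use /sys/kernel/mm/redhat_transparent_hugepage instead, and changing the setting requires root:

```shell
# Inspect the current THP setting; the bracketed value is the active one.
THP=/sys/kernel/mm/transparent_hugepage/enabled
if [ -r "$THP" ]; then
    cat "$THP"          # e.g. "always madvise [never]"
else
    echo "THP interface not present on this system"
fi

# To disable (as root; typically also done from an init script so it
# survives reboots):
# echo never > /sys/kernel/mm/transparent_hugepage/enabled
```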
Re: [PERFORM] Proposal for unlogged tables
On 2016-01-04 19:12:22 +0200, Mark Zealey wrote:
> If there was a command to flush a specific unlogged table to disk, it
> would work around all these issues, no? Perhaps if you marked the table
> as read only at the same time, it would flush it to disk and ensure no
> more data could be written to it, e.g. ALTER TABLE ... SET READ ONLY on
> an unlogged table would flush + not truncate after crash. In our case
> this would be great as we want to use these as permanent tables for
> speed; but after an initial data dump we don't change the data again, so
> we could just do this at the end of the import process.

It's more complex than that; even unmodified tables need to be processed by vacuum every now and then (xid wraparound handling). It's probably possible to work around such things, but it's not a line or ten.

Andres
Re: [PERFORM] Proposal for unlogged tables
On 2016-01-04 16:38:40 +0200, Mark Zealey wrote:
> I don't know how the internals work, but unlogged tables are definitely
> flushed to disk and persist through normal server restarts. It is just,
> according to the docs, that if the server ever has an unclean shutdown
> the tables are truncated even if they have not been updated in a year. I
> can't understand why it has to be like this, and it seems that it would
> be much nicer to not automatically truncate if it doesn't have to.

Pages containing data of unlogged tables aren't ever flushed to disk unless
a) a shutdown checkpoint is performed,
b) a buffer containing data from an unlogged table is used for something else, or
c) the database being copied is the source of a CREATE DATABASE ... TEMPLATE.

Hence, if there's an unclean shutdown, there's absolutely no guarantee about the on-disk state of unlogged tables. Even if they haven't been modified in ages - there could have been many, many dirty pages in shared buffers when crashing.

Always flushing dirty pages of unlogged tables at checkpoint would greatly increase the overhead for memory-resident, write-heavy workloads that use unlogged tables.

Andres
Re: [PERFORM] One long transaction or multiple short transactions?
On 2015-10-17 10:26:01 -0500, Jim Nasby wrote:
> Except inserts *do* take a lot of locks, just not user-level locks.
> Operations like finding a page to insert into, seeing if that page is in
> shared buffers, loading the page into shared buffers, modifying a shared
> buffer, getting the relation extension lock if you need to add a new
> page. Then there's a whole pile of additional locking you could be
> looking at for inserting into any indexes.
>
> Now, most of the locks I described up there are transaction-aware

Missing *not*?
Re: [PERFORM] Having some problems with concurrent COPY commands
On 2015-10-13 07:14:01 -0700, Shaun Thomas wrote:
> On Tue, Oct 13, 2015 at 2:32 AM, Heikki Linnakangas wrote:
>> 80% of the CPU time is spent in the b-tree comparison function.
>
> In the logs, my duration per COPY command increases from about 1400ms
> for one process to about 3800ms when I have four running concurrently.
> That's really my only data point, unfortunately. Strace isn't super
> helpful because it just says 70-ish% of the time is wait4, but that's
> not significantly different from the results using one process.

Please run a profile. Compile postgres with

  CFLAGS='-O2 -fno-omit-frame-pointer'

as an argument to configure. That'll allow us to get a hierarchical profile.

Then first do a plain cpu profile:

  perf record -g -a sleep 5
  perf report > somefile

Then let's look at lock contention:

  perf record -g -a -e syscalls:sys_enter_semop sleep 5
  perf report > somefile

and send the results.

Thanks,

Andres
Re: [PERFORM] Having some problems with concurrent COPY commands
Hi,

On 2015-10-12 13:17:53 -0500, Shaun Thomas wrote:
> It would appear I'm running into whatever issue the xloginsert_slots
> patch tried to address, but not much discussion exists afterwards.

That patch is merged; it's just that the number of slots is hardcoded. You can recompile postgres with a different value by changing

  #define NUM_XLOGINSERT_LOCKS  8

in xlog.c. A restart is enough afterwards.

> Is there any other way to get normal parallel COPY performance, or is
> that just currently impossible?
>
> I also know 9.5 underwent a lot of locking improvements, so it might
> not be relevant. I just haven't gotten a chance to repeat my tests
> with 9.5 just yet.

Hard to say anything substantive without further information. Any chance you could provide profiles of such a run? If yes, I can help you with instructions. I'm just too lazy to write them up if not.

Greetings,

Andres Freund
Re: [PERFORM] Multi processor server overloads occationally with system process while running postgresql-9.4
On 2015-10-03 01:39:33 -0700, ajaykbs wrote:
> It is observed that at some times during moderate load the CPU usage
> goes up to 400% and the users are not able to complete the queries in
> expected time. But the load is contributed by some system process only.
> The average number of connections is normally 50.

This email is nearly impossible to read. But it sounds a bit like you need to disable transparent hugepages and/or zone_reclaim_mode.

Greetings,

Andres Freund
Re: [PERFORM] Hmmm... why does CPU-intensive pl/pgsql code parallelise so badly when queries parallelise fine? Anyone else seen this?
On 2015-07-08 23:38:38 -0400, Tom Lane wrote:
> and...@anarazel.de (Andres Freund) writes:
>> On 2015-07-08 15:38:24 -0700, Craig James wrote:
>>> From my admittedly naive point of view, it's hard to see why any of
>>> this matters. I have functions that do purely CPU-intensive
>>> mathematical calculations ... you could imagine something like
>>> is_prime(N) that determines if N is a prime number. I have eight
>>> clients that connect to eight backends. Each client issues an SQL
>>> command like, select is_prime(N) where N is a simple number.
>>
>> I mostly replied to Merlin's general point (additionally in the context
>> of plpgsql). But I have a hard time seeing that postgres would be the
>> bottleneck for an is_prime() function (or something with similar
>> characteristics) that's written in C where the average runtime is more
>> than, say, a couple thousand cycles. I'd like to see a profile of that.
>
> But that was not the case that Graeme was complaining about.

No, Craig was complaining about that case...

> One of my Salesforce colleagues has been looking into ways that we could
> decide to skip the per-statement snapshot acquisition even in volatile
> functions, if we could be sure that a particular statement isn't going
> to do anything that would need a snapshot.

Yea, I actually commented about that on IRC as well. I was thinking about actually continuing to get a snapshot, but marking it as 'complete on usage', i.e. only calling GetSnapshotData() when the snapshot is used to decide about visibility. We probably can't do that in the toplevel visibility case because it'll probably have noticeable semantic effects, but ISTM it should be doable for the volatile function using SPI case.
Re: [PERFORM] Hmmm... why does pl/pgsql code parallelise so badly when queries parallelise fine? Anyone else seen this?
On 2015-07-09 10:30:35 +0000, Graeme B. Bell wrote:
>> Well, that requires reviewing the source code of the run script and
>> such.
>
> No, of course it doesn't. It appears that you didn't look at the repo or
> read my previous mail before you wrote this.

FFS, I *ran* some of the tests and reported on results. With you in CC. What I mean is that I don't just run random code from some random github repository.

> I do not wish to antagonise you either, so please go and look at the
> repo before you write the next reply.

Over and out.
Re: [PERFORM] Hmmm... why does pl/pgsql code parallelise so badly when queries parallelise fine? Anyone else seen this?
On 2015-07-08 11:13:04 +0000, Graeme B. Bell wrote:
> I'm guessing you are maybe pressed for time at the moment, because I
> already clearly included this on the last email, as well as the links to
> the alternative benchmarks with the same problem I referred to in both
> of my last emails, which are also trivial to drop into pgbench
> (cut/paste).

You realize that you want something here, not Merlin, right?

> e.g. did you see these parts of my previous email:
>
> "To clear up the issue I built a little test harness around your comment
> below. http://github.com/gbb/t"

Well, that requires reviewing the source code of the run script and such.

I think we shouldn't discuss this on two threads (-performance, -bugs); that makes it hard to follow. Given Tom's more detailed answer, I think the -bugs thread already contains more pertinent information.
Re: [PERFORM] Hmmm... why does CPU-intensive pl/pgsql code parallelise so badly when queries parallelise fine? Anyone else seen this?
On 2015-07-08 13:46:53 -0500, Merlin Moncure wrote:
> On Wed, Jul 8, 2015 at 12:48 PM, Craig James <cja...@emolecules.com> wrote:
>> On Tue, Jul 7, 2015 at 10:31 PM, Joshua D. Drake <j...@commandprompt.com> wrote:
>>> Using Apache Fast-CGI, you are going to fork a process for each
>>> instance of the function being executed and that in turn will use all
>>> CPUs up to the max available resource. With PostgreSQL, that isn't
>>> going to happen unless you are running (at least) 8 functions across 8
>>> connections.
>>
>> Well, right, which is why I mentioned "even with dozens of clients".
>> Shouldn't that scale to at least all of the CPUs in use if the function
>> is CPU intensive (which it is)?
>
> only in the absence of inter-process locking and cache line bouncing.

And additionally memory bandwidth (shared between everything, even in the NUMA case), cross-socket/bus bandwidth (absolutely performance critical in multi-socket configurations), and cache capacity (shared between cores, and sometimes even sockets!).
Re: [PERFORM] Hmmm... why does CPU-intensive pl/pgsql code parallelise so badly when queries parallelise fine? Anyone else seen this?
On 2015-07-08 15:38:24 -0700, Craig James wrote:
> From my admittedly naive point of view, it's hard to see why any of this
> matters. I have functions that do purely CPU-intensive mathematical
> calculations ... you could imagine something like is_prime(N) that
> determines if N is a prime number. I have eight clients that connect to
> eight backends. Each client issues an SQL command like, select
> is_prime(N) where N is a simple number.

I mostly replied to Merlin's general point (additionally in the context of plpgsql). But I have a hard time seeing that postgres would be the bottleneck for an is_prime() function (or something with similar characteristics) that's written in C where the average runtime is more than, say, a couple thousand cycles. I'd like to see a profile of that.

Andres
Re: [PERFORM] 9.5alpha1 vs 9.4
Hi,

On 2015-07-05 13:10:51 +0200, Mkrtchyan, Tigran wrote:
> today I have updated my test system to 9.5alpha1. Most of the operations
> are ok, except delete. I get ~1000 times slower!
>
> 255.88 | 566.11 | 452 | DELETE FROM t_inodes WHERE ipnfsid=$1 AND inlink = ?

That certainly should not be the case. Could you show the query plan for this statement in both versions? Any chance that there's a parameter type mismatch for $1?

Greetings,

Andres Freund
Re: [PERFORM] PostgreSQL disk fragmentation causes performance problems on Windows
On 2015-05-21 11:54:40 -0700, Josh Berkus wrote: This has been talked about as a feature, but would require major work on PostgreSQL to make it possible. You'd be looking at several months of effort by a really good hacker, and then a whole bunch of performance testing. If you have the budget for this, then please let's talk about it because right now nobody is working on it. I think this is overestimating the required effort quite a bit. While not trivial, it's also not that complex to make this work. Greetings, Andres Freund
Re: [PERFORM] PostgreSQL disk fragmentation causes performance problems on Windows
On 2015-04-29 10:06:39 +0200, Andres Freund wrote: Hi, On 2015-04-23 19:47:06 +0000, Jan Gunnar Dyrset wrote: I am using PostgreSQL to log data in my application. A number of rows are added periodically, but there are no updates or deletes. There are several applications that log to different databases. This causes terrible disk fragmentation which again causes performance degradation when retrieving data from the databases. The table files are getting more than 5 fragments over time (max table size about 1 GB). The problem seems to be that PostgreSQL grows the database with only the room it needs for the new data each time it is added. Because several applications are adding data to different databases, the additions are never contiguous. Which OS and filesystem is this done on? Because many halfway modern systems, like e.g. ext4 and xfs, implement this in the background as 'delayed allocation'. Oh, it's in the subject. Stupid me, sorry for that. I'd consider testing how much better this behaves under a different operating system, as a shorter term relief. Greetings, Andres Freund
Re: [PERFORM] PostgreSQL disk fragmentation causes performance problems on Windows
Hi, On 2015-04-23 19:47:06 +0000, Jan Gunnar Dyrset wrote: I am using PostgreSQL to log data in my application. A number of rows are added periodically, but there are no updates or deletes. There are several applications that log to different databases. This causes terrible disk fragmentation which again causes performance degradation when retrieving data from the databases. The table files are getting more than 5 fragments over time (max table size about 1 GB). The problem seems to be that PostgreSQL grows the database with only the room it needs for the new data each time it is added. Because several applications are adding data to different databases, the additions are never contiguous. Which OS and filesystem is this done on? Because many halfway modern systems, like e.g. ext4 and xfs, implement this in the background as 'delayed allocation'. Is it possible that e.g. you're checkpointing very frequently - which includes fsyncing dirty files - and that that causes delayed allocation not to work? How often did you checkpoint? How did you measure the fragmentation? Using filefrag? If so, could you perhaps send its output? I think that preallocating lumps of a given, configurable size, say 4 MB, for the tables would remove this problem. The max number of fragments on a 1 GB file would then be 250, which is no problem. Is this possible to configure in PostgreSQL? If not, how difficult is it to implement in the database? It's not impossible, but there are complexities because a) extension happens under a sometimes contended lock, and doing more there will have possible negative scalability implications. We need to restructure the logging first to make that more realistic. b) postgres also tries to truncate files, and we need to make sure that happens only in the right circumstances. Greetings, Andres Freund
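The fragment arithmetic in the report can be checked quickly (illustrative only; real fragmentation depends on the filesystem's allocator, and the mail's "250" is a round figure for what works out to 256):

```python
def worst_case_fragments(table_bytes: int, extent_bytes: int) -> int:
    """Upper bound on file fragments if every allocation lands in a
    different place on disk (one fragment per extent)."""
    return -(-table_bytes // extent_bytes)  # ceiling division

GB = 1024 ** 3
MB = 1024 ** 2
BLOCK = 8192  # postgres default block size; tables grow block-at-a-time

# Growing 8 kB at a time, a 1 GB table can in the worst case end up with
# one fragment per block; 4 MB preallocation would cap it at 256.
print(worst_case_fragments(GB, BLOCK))   # 131072
print(worst_case_fragments(GB, 4 * MB))  # 256
```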
Re: [PERFORM] MusicBrainz postgres performance issues
On 2015-03-15 13:08:13 +0100, Robert Kaye wrote: On Mar 15, 2015, at 12:41 PM, Andreas Kretschmer akretsch...@spamfence.net wrote: just a wild guess: raid-controller BBU faulty We don’t have a BBU in this server, but at least we have redundant power supplies. In any case, how would a faulty battery possibly cause this? Many controllers disable write-back caching when the battery is dead. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [PERFORM] MusicBrainz postgres performance issues
On 2015-03-15 13:07:25 +0100, Robert Kaye wrote: On Mar 15, 2015, at 12:13 PM, Josh Krupka jkru...@gmail.com wrote: It sounds like you've hit the postgres basics, what about some of the linux check list items? what does free -m show on your db server? total used free shared buffers cached Mem: 48295 31673 16622 0 5 12670 -/+ buffers/cache: 18997 29298 Swap: 22852 2382 20470 Could you post /proc/meminfo instead? That gives a fair bit more information. Also: * What hardware is this running on? * Why do you need 500 connections (that are nearly all used) when you have a pgbouncer in front of the database? That's not going to be efficient. * Do you have any data tracking the state connections are in? I.e. whether they're idle or not? The connections graph you linked doesn't give that information. * You're apparently not graphing CPU usage. How busy are the CPUs? How much time is spent in the kernel (i.e. system)? * Consider installing perf (linux-utils-$something) and doing a systemwide profile. 3.2 isn't the greatest kernel around, efficiency wise. At some point you might want to upgrade to something newer. I've seen remarkable differences around this. You really should upgrade postgres to a newer major version one of these days. Especially 9.2 can give you a remarkable improvement in performance with many connections in a read mostly workload. Greetings, Andres Freund
Re: [PERFORM] MusicBrainz postgres performance issues
On 2015-03-15 11:09:34 -0600, Scott Marlowe wrote: shared_mem of 12G is almost always too large. I'd drop it down to ~1G or so. I think that's outdated wisdom, i.e. not generally true. I've now seen a significant number of systems where a larger shared_buffers can help quite massively. The primary case where it can, in my experience, go bad is write-mostly databases where every buffer acquisition has to write out dirty data while holding locks. Especially during relation extension that's bad. A new enough kernel, a sane filesystem (i.e. not ext3) and sane checkpoint configuration take care of most of the other disadvantages. Greetings, Andres Freund
Re: [PERFORM] MusicBrainz postgres performance issues
On 2015-03-15 20:42:51 +0300, Ilya Kosmodemiansky wrote: On Sun, Mar 15, 2015 at 8:20 PM, Andres Freund and...@2ndquadrant.com wrote: On 2015-03-15 11:09:34 -0600, Scott Marlowe wrote: shared_mem of 12G is almost always too large. I'd drop it down to ~1G or so. I think that's outdated wisdom, i.e. not generally true. Quite agreed. With the note that a properly configured controller with BBU is needed. That imo doesn't really have anything to do with it. The primary benefit of a BBU with writeback caching is accelerating (near-)synchronous writes. Like the WAL. But, besides influencing the default for wal_buffers, a larger shared_buffers doesn't change the amount of synchronous writes. Greetings, Andres Freund
Re: [PERFORM] MusicBrainz postgres performance issues
(please quote properly) On 2015-03-15 19:55:23 -0400, mich...@sqlexec.com wrote: Why is 500 connections insane. We got 32 CPU with 96GB and 3000 max connections, and we are doing fine, even when hitting our max concurrent connection peaks around 4500. At a previous site, we were using 2000 max connections on 24 CPU and 64GB RAM, with about 1500 max concurrent connections. So I wouldn't be too hasty in saying more than 500 is asking for trouble. Just as long as you got your kernel resources set high enough to sustain it (SHMMAX, SHMALL, SEMMNI, and ulimits), and RAM for work_mem. It may work acceptably in some scenarios, but it can lead to significant problems. Several things in postgres scale linearly (from the algorithmic point of view; often CPU characteristics like cache sizes make it worse) with max_connections, most notably acquiring a snapshot. It usually works ok enough if you don't have a high number of queries per second, but if you do, you can run into horrible contention problems. Absurdly enough that matters *more* on bigger machines with several sockets. It's especially bad on 4+ socket systems. The other aspect is that such a high number of full connections usually just isn't helpful for throughput. Not even the most massive NUMA systems (~256 hardware threads is the realistic max atm IIRC) can process 4.5k queries at the same time. It'll often be much more efficient if all connections above a certain number aren't allocated a full postgres backend, with all its overhead, but use a much more lightweight pooler connection. Greetings, Andres Freund
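The snapshot-acquisition point can be sketched with a toy model (pure illustration, not postgres code; GetSnapshotData() is the real counterpart):

```python
import random

def take_snapshot(proc_xids):
    """Toy model of MVCC snapshot acquisition: scan every backend slot
    and collect the in-progress xids.  The cost is O(len(proc_xids)),
    i.e. O(max_connections), no matter how many slots are actually busy."""
    in_progress = []
    for xid in proc_xids:       # one pass over *all* slots
        if xid is not None:     # slot belongs to a running transaction
            in_progress.append(xid)
    return sorted(in_progress)

random.seed(1)
# 3000 slots, only 50 active: the scan still touches all 3000, and every
# snapshot-taking query pays that cost.
slots = [None] * 3000
for i in random.sample(range(3000), 50):
    slots[i] = 1000 + i
snap = take_snapshot(slots)
print(len(slots), len(snap))
```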
Re: [PERFORM] MusicBrainz postgres performance issues
On 2015-03-15 20:54:51 +0300, Ilya Kosmodemiansky wrote: On Sun, Mar 15, 2015 at 8:46 PM, Andres Freund and...@2ndquadrant.com wrote: That imo doesn't really have anything to do with it. The primary benefit of a BBU with writeback caching is accelerating (near-)synchronous writes. Like the WAL. My point was, that having no proper raid controller (today bbu surely needed for the controller to be a proper one) + heavy writes of any kind, it is absolutely impossible to live with large shared_buffers and without io problems. And my point is that you're mostly wrong. What a raid controller's writeback cache usefully accelerates is synchronous writes. I.e. writes that the application waits for. Usually raid controllers don't have much chance to reorder the queued writes (i.e. turning neighboring writes into one larger sequential write). What they do excel at is making synchronous writes to disk return faster because the data is only written to the controller's memory, not to actual storage. They're also good at avoiding actual writes to disk when the *same* page is written to multiple times in a short amount of time. In postgres, writes for data that goes through shared_buffers are usually asynchronous. We write them to the OS's page cache when a page is needed for other contents, is undirtied by the bgwriter, or written out during a checkpoint; but do *not* wait for the write to hit the disk. The controller's write back cache doesn't hugely help here, because it doesn't make *that* much of a difference whether the dirty data stays in the kernel's page cache or in the controller. In contrast to that, writes to the WAL are often more or less synchronous. We actually wait (via fdatasync()/fsync() syscalls) for writes to hit disk in a bunch of scenarios, most commonly when committing a transaction.
Unless synchronous_commit = off, every COMMIT in a transaction that wrote data implies an fdatasync() of a file in pg_xlog (well, we do optimize that in some conditions, but let's leave that out for now). Additionally, if there are many smaller/interleaved transactions, we will write()/fdatasync() out the same 8kb WAL page repeatedly. Every time a transaction commits (and some other things), the page that commit record is on will be flushed. As the WAL records for insertions, updates, deletes, commits are frequently much smaller than 8kb, that will often happen 20-100 times for the same page in OLTP scenarios with narrow rows. That's why synchronous_commit = off can be such a huge win for OLTP write workloads without a writeback cache - synchronous writes are turned into asynchronous writes, and repetitive writes to the same page are avoided. It also explains why synchronous_commit = off has much less of an effect for bulk write workloads: As there are no synchronous disk writes due to WAL flushes at commit time (there's only very few commits), synchronous commit doesn't have much of an effect. That said, there's a couple reasons why you're not completely wrong: Historically, when using ext3 with data=ordered and some other filesystems, synchronous writes to one file forced *all* other previously dirtied data to also be flushed. That means that if you have pg_xlog and the normal relation files on the same filesystem, the synchronous writes to the WAL will not only have to write out the new WAL (often not that much data), but also all the other dirty data. The OS will often be unable to do efficient write combining in that case, because a) there's not yet that much data there, b) postgres doesn't order writes during checkpoints. That means that WAL writes will suddenly have to write out much more data = COMMITs are slow. That's where the suggestion to keep pg_xlog on a separate partition largely comes from.
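The "20-100 times for the same page" figure follows from simple arithmetic; here is a back-of-envelope sketch (the 150-byte combined record size is an assumption for illustration, not a measured value):

```python
WAL_PAGE = 8192     # WAL block size (default XLOG_BLCKSZ)
record_size = 150   # assumed rough size of a narrow-row change + commit
                    # record pair; real sizes vary with the row

# With synchronous_commit = on, every commit flushes the page its commit
# record lands on, so the same 8 kB WAL page can get fdatasync()ed once
# per commit that fits on it:
flushes_per_page = WAL_PAGE // record_size
print(flushes_per_page)
```

With smaller records the count approaches the upper end of the quoted 20-100 range; with fatter rows it drops toward the lower end.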
Writes going through shared_buffers are sometimes indirectly turned into synchronous writes (from the OS's perspective at least, which means they'll be done at a higher priority). That happens when the checkpointer fsync()s all the files at the end of a checkpoint. When things are going well and checkpoints are executed infrequently and completely guided by time (i.e. triggered solely by checkpoint_timeout, and not checkpoint_segments) that's usually not too bad. You'll see a relatively small latency spike for transactions. Unfortunately the ext* filesystems have an implementation problem here, which can make this problem much worse: The way writes are prioritized during an fsync() can stall out concurrent synchronous reads/writes pretty badly. That's much less of a problem with e.g. xfs. Which is why I'd atm not suggest ext4 for write intensive applications. The other situation where this can lead to big problems is if your checkpoints aren't scheduled by time (you can recognize that by enabling log_checkpoints and checking a) that time is the trigger, b) they're actually happening in an interval consistent with checkpoint_timeout). If the relation files are not written out in a smoothed out fashion (configured
Re: [PERFORM] MusicBrainz postgres performance issues
On 2015-03-15 12:25:07 -0600, Scott Marlowe wrote: Here's the problem with a large shared_buffers on a machine that's getting pushed into swap. It starts to swap BUFFERs. Once buffers start getting swapped you're not just losing performance, that huge shared_buffers is now working against you because what you THINK are buffers in RAM to make things faster are in fact blocks on a hard drive being swapped in and out during reads. It's the exact opposite of fast. :) IMNSHO that's tackling things from the wrong end. If 12GB of shared buffers drive your 48GB dedicated OLTP postgres server into swapping out actively used pages, the problem isn't the 12GB of shared buffers, but that you require so much memory for other things. That needs to be fixed. But! We haven't even established that swapping is an actual problem here. The ~2GB of swapped out memory could just as well be the java raid controller management monstrosity or something similar. Those pages won't ever be used and thus can better be used to buffer IO. You can check what's actually swapped out using: grep ^VmSwap /proc/[0-9]*/status|grep -v '0 kB' For swapping to be actually harmful you need to have pages that are regularly swapped in. vmstat will tell. In a concurrent OLTP workload (~450 established connections do suggest that) with a fair amount of data keeping the hot data set in shared_buffers can significantly reduce problems. Constantly searching for victim buffers isn't a nice thing, and that will happen if your most frequently used data doesn't fit into s_b. On the other hand, if your data set is so large that even the hottest part doesn't fit into memory (perhaps because there's no hottest part as there's no locality at all), a smaller shared buffers can make things more efficient, because the search for replacement buffers is cheaper with a smaller shared buffers setting. 
Greetings, Andres Freund
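The grep one-liner above can be mirrored in a small helper, shown here against a made-up /proc/<pid>/status sample (the field format matches proc(5); the process values are invented):

```python
import re

def vmswap_kb(status_text: str):
    """Extract the VmSwap value (in kB) from a /proc/<pid>/status dump,
    mirroring: grep ^VmSwap /proc/[0-9]*/status | grep -v '0 kB'"""
    m = re.search(r"^VmSwap:\s+(\d+) kB", status_text, re.MULTILINE)
    return int(m.group(1)) if m else None

# Hypothetical status dump for a process that has ~200 MB swapped out --
# e.g. the "java raid controller management monstrosity" in the mail.
sample = "Name:\tjava\nVmRSS:\t  812344 kB\nVmSwap:\t  204800 kB\n"
print(vmswap_kb(sample))
```

As the mail stresses, a non-zero VmSwap is only harmful if those pages are regularly swapped back in; vmstat's si/so columns tell.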
Re: [PERFORM] Performance bug in prepared statement binding in 9.2?
On 2014-11-10 10:48:24 -0800, Josh Berkus wrote: On 12/31/2013 09:55 AM, Tom Lane wrote: Josh Berkus j...@agliodbs.com writes: Tom, There's an abbreviated version of this argument in the comments in my proposed patch at http://www.postgresql.org/message-id/11927.1384199...@sss.pgh.pa.us What I'm hoping will happen next is that the complainants will hot-patch that and see if it fixes their problems. We can't really determine what to do without that information. Unfortunately, the original reporter of this issue will not be available for testing for 2-3 weeks, and I haven't been able to devise a synthetic test which clearly shows the issue. Ping? I've been waiting on committing that patch pending some real-world testing. It'd be nice to resolve this question before we ship 9.3.3, which I'm supposing will be sometime in January ... Did this patch ever make it in? Or did it hang waiting for verification? src/tools/git_changelog is your friend. Author: Tom Lane t...@sss.pgh.pa.us Branch: master Release: REL9_4_BR [fccebe421] 2014-02-25 16:04:06 -0500 Branch: REL9_3_STABLE Release: REL9_3_4 [4162a55c7] 2014-02-25 16:04:09 -0500 Branch: REL9_2_STABLE Release: REL9_2_8 [00283cae1] 2014-02-25 16:04:12 -0500 Branch: REL9_1_STABLE Release: REL9_1_13 [3e2db4c80] 2014-02-25 16:04:16 -0500 Branch: REL9_0_STABLE Release: REL9_0_17 [1e0fb6a2c] 2014-02-25 16:04:20 -0500 Use SnapshotDirty rather than an active snapshot to probe index endpoints. Greetings, Andres Freund
Re: [PERFORM] Postgres slave not catching up (on 9.2)
Hi, On 2014-11-08 13:11:25 +0000, Ruben Domingo Gaspar Aparicio wrote: Hello, I have built up a hot-standby between a master running 9.2.4 and a slave running 9.2.9. I did the initial copy using pg_basebackup. My recovery.conf looks like: standby_mode = 'on' primary_conninfo = 'host=p port= user=replicator' trigger_file = 'failover.now' restore_command = 'test -f /ORA/dbs02/PUPDBTST/archive/%f && ln -fns /ORA/dbs02/PUPDBTST/archive/%f %p' archive_cleanup_command = '/usr/local/pgsql/postgresql-9.2.9/bin/pg_archivecleanup /ORA/dbs03/PUPDBTST/data %r' The slave (I don't have control on the master) is using 2 NFS file systems, one for WALs and another one for the data, on Netapp controllers: dbnasg401-12a:/vol/dodpupdbtst02 on /ORA/dbs02/PUPDBTST type nfs (rw,remount,hard,nointr,rsize=32768,wsize=32768,tcp,actimeo=0,timeo=600) dbnasg403-12a:/vol/dodpupdbtst03 on /ORA/dbs03/PUPDBTST type nfs (rw,remount,hard,nointr,rsize=32768,wsize=32768,tcp,actimeo=0,timeo=600) The streaming is working perfectly: /usr/local/pgsql/postgresql-9.2.9/bin/psql -U admin -h X -p X -d puppetdb -c select pid,usesysid, usename, application_name, client_addr, state, sent_location,write_location,replay_location,sync_state from pg_stat_replication; pid | usesysid | usename | application_name | client_addr | state | sent_location | write_location | replay_location | sync_state +--++--+-+---+---++-+ 117659 |16384 | replicator | walreceiver | 10.16.7.137 | streaming | AA74/DD630978 | AA74/DD630978 | A977/F84F0BE0 | async (1 row) But the lag is increasing constantly; it looks like the replay cannot cope: I have a couple of questions: 1) Is the standby actually used for querying? Is it possible that replay frequently conflicts with active queries? As you don't have hot_standby_feedback enabled that seems quite possible. 2) Is the startup process on the standby CPU or IO bound?
3) Does the workload involve loads of temporary tables or generally transactions locking lots of tables exclusively in one transaction? Greetings, Andres Freund
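To quantify the lag from the pg_stat_replication numbers quoted above, the text LSNs can be converted to absolute byte positions (high and low 32-bit halves, both hex; on newer servers pg_xlog_location_diff() does this for you). A minimal sketch:

```python
def lsn_to_bytes(lsn: str) -> int:
    """Convert a PostgreSQL LSN like 'AA74/DD630978' to an absolute
    byte position: 'high/low', each a 32-bit hex number."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)

sent   = lsn_to_bytes("AA74/DD630978")  # sent_location from the mail
replay = lsn_to_bytes("A977/F84F0BE0")  # replay_location
lag_bytes = sent - replay
print(lag_bytes / 1024**3)  # replay lag in GiB -- substantial here
```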
Re: [PERFORM] Turn off Hyperthreading! WAS: 60 core performance with 9.3
On 2014-08-21 14:02:26 -0700, Josh Berkus wrote: On 08/20/2014 07:40 PM, Bruce Momjian wrote: Not sure how you can make such a blanket statement when so many people have tested and shown the benefits of hyper-threading. Actually, I don't know that anyone has posted the benefits of HT. Link? There's definitely cases where it can help. But it's highly workload *and* hardware dependent. OS is RHEL with 2.6.32-431.3.1.el6.x86_64. I've emailed a kernel hacker who works at Intel for comment; for one thing, I'm wondering if the older kernel version is a problem for a system like this. I'm not sure if it has been backported by redhat, but there definitely have been significant improvements in SMT aware scheduling after vanilla 2.6.32. Greetings, Andres Freund
Re: [PERFORM] query against pg_locks leads to large memory alloc
On 2014-08-18 14:36:52 -0700, Dave Owens wrote: On Mon, Aug 18, 2014 at 2:21 PM, Matheus de Oliveira matioli.math...@gmail.com wrote: Do you really need such large values? What is your max_connections value? max_connections = 450 ...we have found that we run out of shared memory when max_pred_locks_per_transaction is less than 30k. What was the precise error message when that happened? Greetings, Andres Freund
Re: [PERFORM] 60 core performance with 9.3
On 2014-07-11 12:40:15 +1200, Mark Kirkwood wrote: On 01/07/14 22:13, Andres Freund wrote: On 2014-07-01 21:48:35 +1200, Mark Kirkwood wrote: - cherry picking the last 5 commits into 9.4 branch and building a package from that and retesting: Clients | 9.4 tps 60 cores (rwlock) +-- 6 | 70189 12 | 128894 24 | 233542 48 | 422754 96 | 590796 192 | 630672 Wow - that is more like it! Andres that is some nice work, we definitely owe you some beers for that :-) I am aware that I need to retest with an unpatched 9.4 src - as it is not clear from this data how much is due to Andres's patches and how much to the steady stream of 9.4 development. I'll post an update on that later, but figured this was interesting enough to note for now. Cool. That's what I like (and expect) to see :). I don't think unpatched 9.4 will show significantly different results than 9.3, but it'd be good to validate that. If you do so, could you post the results in the -hackers thread I just CCed you on? That'll help the work to get into 9.5. So we seem to have nailed read only performance. Going back and revisiting read write performance finds: Postgres 9.4 beta rwlock patch pgbench scale = 2000 max_connections = 200; shared_buffers = 10GB; maintenance_work_mem = 1GB; effective_io_concurrency = 10; wal_buffers = 32MB; checkpoint_segments = 192; checkpoint_completion_target = 0.8; clients | tps (32 cores) | tps -++- 6| 8313 | 8175 12 | 11012 | 14409 24 | 16151 | 17191 48 | 21153 | 23122 96 | 21977 | 22308 192 | 22917 | 23109 On that scale - that's bigger than shared_buffers IIRC - I'd not expect the patch to make much of a difference. 
kernel.sched_autogroup_enabled=0 kernel.sched_migration_cost_ns=500 net.core.somaxconn=1024 /sys/kernel/mm/transparent_hugepage/enabled [never] Full report http://paste.ubuntu.com/886/ # 8.82% postgres [kernel.kallsyms][k] _raw_spin_lock_irqsave | --- _raw_spin_lock_irqsave | |--75.69%-- pagevec_lru_move_fn | __lru_cache_add | lru_cache_add | putback_lru_page | migrate_pages | migrate_misplaced_page | do_numa_page | handle_mm_fault | __do_page_fault | do_page_fault | page_fault So, the majority of the time is spent in numa page migration. Can you disable numa_balancing? I'm not sure if your kernel version does that at runtime or whether you need to reboot. The kernel.numa_balancing sysctl might work. Otherwise you probably need to boot with numa_balancing=0. It'd also be worthwhile to test this with numactl --interleave. Greetings, Andres Freund
Re: [PERFORM] Postgres Replaying WAL slowly
On 2014-07-01 15:20:37 -0400, Tom Lane wrote: Jeff Frost j...@pgexperts.com writes: On Jun 30, 2014, at 4:04 PM, Tom Lane t...@sss.pgh.pa.us wrote: Did you check whether the locks were all on temp tables of the ON COMMIT DROP persuasion? And indeed it did catch up overnight and the lag increased shortly after a correlating spike in AccessExclusiveLocks that were generated by temp table creation with on commit drop. OK, so we have a pretty clear idea of where the problem is now. It seems like there are three, not mutually exclusive, ways we might address this: 1. Local revisions inside StandbyReleaseLocks to make it perform better in the presence of many locks. This would only be likely to improve matters much if there's a fixable O(N^2) algorithmic issue; but there might well be one. It sounded like Andres had taken a preliminary look at #1 and found a possible avenue for improvement, which I'd encourage him to pursue. I don't have the resources to do this right now, but yes, I think we can relatively easily get rid of the O(num_locks * num_subtransactions) behaviour. 2. Avoid WAL-logging AccessExclusiveLocks associated with temp tables, on the grounds that no standby should be touching them. I'm not entirely sure that that argument is bulletproof though; in particular, even though a standby couldn't access the table's data, it's possible that it would be interested in seeing consistent catalog entries. Hm. We definitely perform checks surprisingly late for those. It's possible to do SELECT * FROM pg_temp_nn.whatever; without an error if there's no rows or if the rest of the plan doesn't access that table. The check prohibiting access is only in bufmgr.c... So yea, I don't think we can do this for at least 9.4. And there it'll still be hard. 3. Avoid WAL-logging AccessExclusiveLocks associated with new-in-transaction tables, temp or not, on the grounds that no standby could even see such tables until they're committed.
We could go a bit further and not take out any locks on a new-in-transaction table in the first place, on the grounds that other transactions on the master can't see 'em either. For both #2 and the conservative version of #3, the main implementation problem would be whether the lock WAL-logging code has cheap access to the necessary information. I suspect it doesn't. Not trivially. It's logged directly in LockAcquireExtended(). We could add the information into locktags as there's unused fields for relation locktags, but brrr. The radical version of #3 might be pretty easy to do, at least to the extent of removing locks taken out during CREATE TABLE. I suspect there are some assertions or other consistency checks that would get unhappy if we manipulate relations without locks, though, so those would have to be taught about the exception. Also, we sometimes forget new-in-transaction status during relcache flush events; it's not clear if that would be a problem for this. I think that hole is actually pluggable in newer releases - at least there's no code around that assumes rd_createSubid now is persistent, even across cache resets. But I think more importantly it's probably quite possible to hit a similar problem without ON COMMIT DROP relations. Say DISCARD TEMP inside a transaction (with several subxacts) or so? So we probably really should fix the bad scaling. Greetings, Andres Freund
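The O(num_locks * num_subtransactions) behaviour discussed here can be illustrated with a toy model (illustrative Python only, not the actual StandbyReleaseLocks code; the comparison counts just make the scaling visible):

```python
def release_locks_quadratic(held_locks, released_xids):
    """Naive pattern: for every xid being released, rescan the whole
    standby lock list -- O(num_locks * num_subtransactions)."""
    comparisons = 0
    remaining = list(held_locks)
    for xid in released_xids:
        kept = []
        for lock_xid, relid in remaining:
            comparisons += 1
            if lock_xid != xid:
                kept.append((lock_xid, relid))
        remaining = kept
    return remaining, comparisons

def release_locks_hashed(held_locks, released_xids):
    """Same result via set membership: one pass, O(num_locks)."""
    comparisons = 0
    released = set(released_xids)
    kept = []
    for lock_xid, relid in held_locks:
        comparisons += 1
        if lock_xid not in released:
            kept.append((lock_xid, relid))
    return kept, comparisons

locks = [(xid, 1000 + xid) for xid in range(500)]  # 500 standby locks
subxacts = list(range(0, 500, 2))                  # 250 releasing subxacts
slow = release_locks_quadratic(locks, subxacts)
fast = release_locks_hashed(locks, subxacts)
print(slow[1], fast[1])  # the naive variant does far more comparisons
```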
Re: [PERFORM] 60 core performance with 9.3
On 2014-07-01 21:48:35 +1200, Mark Kirkwood wrote: On 27/06/14 21:19, Andres Freund wrote: On 2014-06-27 14:28:20 +1200, Mark Kirkwood wrote: My feeling is spinlock or similar, 'perf top' shows kernel find_busiest_group kernel _raw_spin_lock as the top time users. Those don't tell that much by themselves, could you do a hierarchical profile? I.e. perf record -ga? That'll at least give the callers for kernel level stuff. For more information compile postgres with -fno-omit-frame-pointer. Unfortunately this did not help - had lots of unknown symbols from postgres in the profile - I'm guessing the Ubuntu postgresql-9.3 package needs either the -dev package or to be rebuilt with the enable profile option (debug and no-omit-frame-pointer seem to be there already). You need to install the -dbg package. My bet is you'll see s_lock high in the profile, called mainly from the procarray and buffer mapping lwlocks. Test: pgbench Options: scale 500 read only Os: Ubuntu 14.04 Pg: 9.3.4 Pg Options: max_connections = 200 Just as an experiment I'd suggest increasing max_connections by one and two and quickly retesting - there's some cacheline alignment issues that aren't fixed yet that happen to vanish with some max_connections settings. shared_buffers = 10GB maintenance_work_mem = 1GB effective_io_concurrency = 10 wal_buffers = 32MB checkpoint_segments = 192 checkpoint_completion_target = 0.8 Results Clients | 9.3 tps 32 cores | 9.3 tps 60 cores +--+- 6 | 70400 | 71028 12 | 98918 | 129140 24 | 230345 | 240631 48 | 324042 | 409510 96 | 346929 | 120464 192 | 312621 | 92663 So we have anti scaling with 60 cores as we increase the client connections. Ouch! A level of urgency led to trying out Andres's 'rwlock' 9.4 branch [1] - cherry picking the last 5 commits into 9.4 branch and building a package from that and retesting: Clients | 9.4 tps 60 cores (rwlock) +-- 6 | 70189 12 | 128894 24 | 233542 48 | 422754 96 | 590796 192 | 630672 Wow - that is more like it! 
Andres that is some nice work, we definitely owe you some beers for that :-) I am aware that I need to retest with an unpatched 9.4 src - as it is not clear from this data how much is due to Andres's patches and how much to the steady stream of 9.4 development. I'll post an update on that later, but figured this was interesting enough to note for now. Cool. That's what I like (and expect) to see :). I don't think unpatched 9.4 will show significantly different results than 9.3, but it'd be good to validate that. If you do so, could you post the results in the -hackers thread I just CCed you on? That'll help the work to get into 9.5. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Postgres Replaying WAL slowly
On 2014-06-30 19:14:24 +0300, Heikki Linnakangas wrote: On 06/30/2014 05:46 PM, Soni M wrote: Here's what 'perf top' said on the streaming replica:

Samples: 26K of event 'cpu-clock', Event count (approx.): 19781
95.97% postgres [.] 0x002210f3

Ok, so it's stuck doing something... Can you get a build with debug symbols installed, so that we could see the function name? My guess is it's a spinlock, probably xlogctl->info_lck via RecoveryInProgress(). Unfortunately inline assembler doesn't always seem to show up correctly in profiles... What worked for me was to build with -fno-omit-frame-pointer - that normally shows the callers, even if it can't generate a proper symbol name. Soni: Do you use Hot Standby? Are there connections active while you have that problem? Any other processes with high cpu load? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [PERFORM] Postgres Replaying WAL slowly
On 2014-06-30 11:34:52 -0700, Jeff Frost wrote: On Jun 30, 2014, at 10:29 AM, Soni M diptat...@gmail.com wrote: It is 96.62% postgres [.] StandbyReleaseLocks as Jeff said. It runs for quite a long time, more than 5 minutes I think. I also use hot standby. We have 4 streaming replicas; some of them have active connections, some do not. This issue has lasted more than 4 days. On one of the standbys, the postgres process above is the only process consuming a high cpu load. Compiling with -fno-omit-frame-pointer doesn't yield much more info: You'd need to do perf record -ga instead of perf record -a to see additional information. But: 76.24% postgres [.] StandbyReleaseLocks already is quite helpful. What are you doing on that system? Is there anything requiring large amounts of access exclusive locks on the primary? Possibly large amounts of temporary relations? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [PERFORM] Postgres Replaying WAL slowly
On 2014-06-30 12:57:56 -0700, Jeff Frost wrote: On Jun 30, 2014, at 12:54 PM, Matheus de Oliveira matioli.math...@gmail.com wrote: On Mon, Jun 30, 2014 at 4:42 PM, Jeff Frost j...@pgexperts.com wrote: And if you go fishing in pg_class for any of the oids, you don't find anything: That is probably because you are connected to the wrong database. Once you connect to the database of interest, you don't even need to query pg_class, just cast the relation attribute to regclass:

SELECT relation::regclass, ...
FROM pg_locks
WHERE database = (SELECT oid FROM pg_database WHERE datname = current_database());

Yah, i thought about that too, but verified I am in the correct DB. Just for clarity's sake: So these are probably relations created in uncommitted transactions. Possibly ON COMMIT DROP temp tables? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [PERFORM] Postgres Replaying WAL slowly
On 2014-06-30 19:04:20 -0400, Tom Lane wrote: Jeff Frost j...@pgexperts.com writes: So it seems like we have a candidate explanation. I'm a bit surprised that StandbyReleaseLocks would get this slow if there are only a dozen AccessExclusiveLocks in place at any one time, though. Perhaps that was a low point and there are often many more? Since we turned on the monitoring for that, we had a peak of 13,550 AccessExclusiveLocks. Any chance the workload also uses lots of subtransactions? Ah ... that's more like a number I can believe something would have trouble coping with. Did you see a noticeable slowdown with this? Now that we've seen that number, of course it's possible there was an even higher peak occurring when you saw the trouble. Perhaps there's an O(N^2) behavior in StandbyReleaseLocks, or maybe it just takes awhile to handle that many locks. I don't think there's a O(n^2) in StandbyReleaseLocks() itself, but in combination with StandbyReleaseLockTree() it looks possibly bad. The latter will call StandbyReleaseLocks() for every xid/subxid, and each of the StandbyReleaseLocks() will then trawl the entire RecoveryLockList... It'd probably be better to implement ReleaseLocksTree() by sorting the subxid list and bsearch that while iterating RecoveryLockList. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
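[Editor's sketch] The sort-and-bsearch idea proposed above for ReleaseLocksTree() can be illustrated with a toy model. The sketch below is hedged Python, not the actual StandbyReleaseLockTree()/StandbyReleaseLocks() C code; the lock-list representation and function names are invented, but it shows why one scan per xid/subxid is O(m*n) while a single pass with a binary search over sorted subxids is O((m+n) log m).

```python
from bisect import bisect_left

def release_locks_naive(recovery_lock_list, xids):
    # Models StandbyReleaseLockTree() calling StandbyReleaseLocks() once per
    # xid/subxid, each call trawling the entire RecoveryLockList: O(m * n).
    remaining = list(recovery_lock_list)
    for xid in xids:
        remaining = [entry for entry in remaining if entry[0] != xid]
    return remaining

def release_locks_sorted(recovery_lock_list, xids):
    # The suggested alternative: sort the subxid list once, then make a single
    # pass over the lock list, binary-searching each entry's xid.
    sorted_xids = sorted(xids)
    def held(xid):
        i = bisect_left(sorted_xids, xid)
        return i < len(sorted_xids) and sorted_xids[i] == xid
    return [entry for entry in recovery_lock_list if not held(entry[0])]
```

With 13,550 AccessExclusiveLocks and many subtransactions, the difference between the two shapes is exactly the kind of slowdown reported in this thread.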
Re: [PERFORM] 60 core performance with 9.3
On 2014-06-27 14:28:20 +1200, Mark Kirkwood wrote: My feeling is spinlock or similar, 'perf top' shows kernel find_busiest_group kernel _raw_spin_lock as the top time users. Those don't tell that much by themselves, could you do a hierarchical profile? I.e. perf record -ga? That'll at least give the callers for kernel level stuff. For more information compile postgres with -fno-omit-frame-pointer. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Sudden crazy high CPU usage
On 2014-03-31 19:16:58 +0200, Niels Kristian Schjødt wrote: Yes, I could install "perf", though I'm not familiar with it. What would I do? :-) As root:

perf record -a sleep 5
perf report > my-nice-perf-report.txt

And then send the my-nice-perf-report.txt file. Locally it's much nicer to see the output using perf report without redirecting into a file; you'll get an interactive UI. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [PERFORM] Parallel Select query performance and shared buffers
On 2013-12-05 11:15:20 +0200, Metin Doslu wrote: - When we increased NUM_BUFFER_PARTITIONS to 1024, this problem is disappeared for 8 core machines and come back with 16 core machines on Amazon EC2. Would it be related with PostgreSQL locking mechanism? If we build with -DLWLOCK_STATS to print locking stats from PostgreSQL, we see tons of contention with default value of NUM_BUFFER_PARTITIONS which is 16: Is your workload bigger than RAM? I think a good bit of the contention you're seeing in that listing is populating shared_buffers - and might actually vanish once you're halfway cached. From what I've seen so far the bigger problem than contention in the lwlocks itself, is the spinlock protecting the lwlocks... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Parallel Select query performance and shared buffers
On 2013-12-05 11:33:29 +0200, Metin Doslu wrote: Is your workload bigger than RAM? RAM is bigger than workload (more than a couple of times). I think a good bit of the contention you're seeing in that listing is populating shared_buffers - and might actually vanish once you're halfway cached. From what I've seen so far the bigger problem than contention in the lwlocks itself, is the spinlock protecting the lwlocks... Could you clarify a bit what you mean by "halfway cached"? Well, your stats showed a) fairly low lock counts overall b) a high percentage of exclusive locks. a) indicates the system wasn't running long. b) tells me there were lots of changes to the buffer mapping - which basically only happens if a buffer is placed in or removed from shared_buffers. If your shared_buffers is big enough to contain most of the data you shouldn't see many exclusive locks in comparison to the number of shared locks. And "spinlock protecting the lwlocks"? Every LWLock has an internal spinlock to protect its state. So whenever somebody does a LWLockAcquire()/Release(), even if only in shared mode, we currently acquire that spinlock, manipulate the LWLock's state, and release the spinlock again. In lots of workloads that internal spinlock is the contention point, not the length over which the lwlock is held. Especially when they are mostly held in shared mode. Makes sense? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
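[Editor's sketch] The "internal spinlock" point above is easiest to see with a toy model. This is hedged Python, not PostgreSQL's lwlock.c: a mutex stands in for the spinlock, and the class and method names are invented. The thing to notice is that even a *shared* acquire must take the same internal lock to bump the shared-holder count, so under concurrency every acquisition - shared or exclusive - serializes on that one spinlock's cacheline.

```python
import threading

class ToyLWLock:
    """Toy model of a pre-9.5-style LWLock: all state changes, including
    shared acquisition, go through one internal spinlock (here a mutex)."""
    def __init__(self):
        self._spin = threading.Lock()   # stand-in for the internal spinlock
        self.exclusive = False
        self.shared_count = 0

    def acquire_shared(self):
        while True:
            with self._spin:            # every shared acquire hits this lock
                if not self.exclusive:
                    self.shared_count += 1
                    return

    def release_shared(self):
        with self._spin:                # ...and so does every release
            self.shared_count -= 1

    def acquire_exclusive(self):
        while True:
            with self._spin:
                if not self.exclusive and self.shared_count == 0:
                    self.exclusive = True
                    return

    def release_exclusive(self):
        with self._spin:
            self.exclusive = False
```

Even when the lwlock is "only" held shared for a few instructions, the acquire/release traffic on `_spin` is what contends - which is what the profile's s_lock time reflects.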
Re: [PERFORM] [HACKERS] Parallel Select query performance and shared buffers
On 2013-12-05 17:46:44 +0200, Metin Doslu wrote: I tried your patches on next link. As you suspect I didn't see any improvements. I tested it on PostgreSQL 9.2 Stable. You tested the correct branch, right? Which commit does git rev-parse HEAD show? But generally, as long as your profile hides all the important information behind the hypervisor's cost, you're going to have a hard time analyzing the problems. You really should try to reproduce the problems on native hardware (as similar to the host hardware as possible), to get accurate data. On CPU bound workloads that information is often transportable to the virtual world. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] [HACKERS] Parallel Select query performance and shared buffers
On 2013-12-04 14:27:10 -0200, Claudio Freire wrote: On Wed, Dec 4, 2013 at 9:19 AM, Metin Doslu me...@citusdata.com wrote: Here are the results of vmstat 1 while running 8 parallel TPC-H Simple (#6) queries: Although there is no need for I/O, wa fluctuates between 0 and 1.

procs -----------memory----------- --swap-- ---io--- ---system---- ------cpu------
 r  b swpd     free  buff    cache  si  so  bi  bo    in     cs us sy  id wa st
 0  0    0 30093568 84892 38723896   0   0   0   0    22     14  0  0 100  0  0
 8  1    0 30043056 84892 38723896   0   0   0   0 27080  52708 16 14  70  0  0
 8  1    0 30006600 84892 38723896   0   0   0   0 44952 118286 43 44  12  1  0
 8  0    0 29986264 84900 38723896   0   0   0  20 28043  95934 49 42   8  1  0
 7  0    0 29991976 84900 38723896   0   0   0   0  8308  73641 52 42   6  0  0
 0  0    0 30091828 84900 38723896   0   0   0   0  3996  30978 23 24  53  0  0
 0  0    0 30091968 84900 38723896   0   0   0   0    17     23  0  0 100  0  0

Notice the huge %sy. My bet is on transparent hugepage defragmentation. Alternatively it's scheduler overhead, due to superfluous context switches around the buffer mapping locks. I'd strongly suggest doing a "perf record -g -a <wait a bit, ctrl-c>; perf report" run to check what's eating up the time. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [PERFORM] [HACKERS] Parallel Select query performance and shared buffers
On 2013-12-04 18:43:35 +0200, Metin Doslu wrote: I'd strongly suggest doing a "perf record -g -a <wait a bit, ctrl-c>; perf report" run to check what's eating up the time. Here is one example:

+ 38.87% swapper  [kernel.kallsyms] [k] hypercall_page
+  9.32% postgres [kernel.kallsyms] [k] hypercall_page
+  6.80% postgres [kernel.kallsyms] [k] xen_set_pte_at

All that time is spent in your virtualization solution. One thing to try is to look on the host system, sometimes profiles there can be more meaningful. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [PERFORM] [HACKERS] Parallel Select query performance and shared buffers
On 2013-12-04 16:00:40 -0200, Claudio Freire wrote: On Wed, Dec 4, 2013 at 1:54 PM, Andres Freund and...@2ndquadrant.com wrote: All that time is spent in your virtualization solution. One thing to try is to look on the host system, sometimes profiles there can be more meaningful. You cannot profile the host on EC2. Didn't follow the thread from the start. So, this is EC2? Have you checked, with a recent enough version of top or whatever, how much time is reported as stolen? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] [HACKERS] Parallel Select query performance and shared buffers
On 2013-12-04 20:19:55 +0200, Metin Doslu wrote: - When we increased NUM_BUFFER_PARTITIONS to 1024, this problem is disappeared for 8 core machines and come back with 16 core machines on Amazon EC2. Would it be related with PostgreSQL locking mechanism? You could try my lwlock-scalability improvement patches - for some workloads here, the improvements have been rather noticeable. Which version are you testing? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Planner performance extremely affected by an hanging transaction (20-30 times)?
On 2013-09-27 13:57:02 -0700, Josh Berkus wrote: Andres, Jeff, As far as I can tell, the only downside of doing that is that, since hint bits might be set later, it is possible some dirty pages will get written unhinted and then re-dirtied by the hint bit setting, when more aggressive setting would have only one combined dirty write instead. But that seems rather hypothetical, and if it really is a problem we should probably tackle it directly rather than by barring other optimizations. I am - as evidenced - too tired to think about this properly, but I think you might be right here. Any thoughts on a fix for this we could get into 9.2.5? I don't see much chance to apply anything like this in a backbranch. Changing IO patterns in a noticeable way in a minor release is just asking for trouble. Also, this really isn't going to fix the issue discussed here - this was just about the additional ProcArrayLock contention. I don't think it would change anything dramatical in your case. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Cpu usage 100% on slave. s_lock problem.
On 2013-08-27 12:17:55 -0500, Merlin Moncure wrote: On Tue, Aug 27, 2013 at 10:55 AM, Andres Freund and...@2ndquadrant.com wrote: On 2013-08-27 09:57:38 -0500, Merlin Moncure wrote:

+ bool
+ RecoveryMightBeInProgress(void)
+ {
+     /*
+      * We check shared state each time only until we leave recovery mode. We
+      * can't re-enter recovery, so there's no need to keep checking after the
+      * shared variable has once been seen false.
+      */
+     if (!LocalRecoveryInProgress)
+         return false;
+     else
+     {
+         /* use volatile pointer to prevent code rearrangement */
+         volatile XLogCtlData *xlogctl = XLogCtl;
+
+         /* Intentionally query xlogctl without spinlocking! */
+         LocalRecoveryInProgress = xlogctl->SharedRecoveryInProgress;
+
+         return LocalRecoveryInProgress;
+     }
+ }

I don't think it's acceptable to *set* LocalRecoveryInProgress here. That should only be done in the normal routine. Quite right -- that was a major error -- you could bypass the initialization call to the xlog with some bad luck. I've seen this in profiles since, so I'd appreciate pushing this forward. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [PERFORM] Planner performance extremely affected by an hanging transaction (20-30 times)?
On 2013-09-25 00:06:06 -0700, Jeff Janes wrote: On 09/20/2013 03:01 PM, Jeff Janes wrote: 3) Even worse, asking if a given transaction has finished yet can be a serious point of system-wide contention, because it takes the ProcArrayLock, once per row which needs to be checked. So you have 20 processes all fighting over the ProcArrayLock, each doing so 1000 times per query. That should be gone in master, we don't use SnapshotNow anymore which had those TransactionIdIsInProgress() calls you're probably referring to. The lookups discussed in this thread now use the statement's snapshot. And all those have their own copy of the currently running transactions. Why do we need a procarraylock for this? Seems like the solution would be not to take a lock at all; the information on transaction commit is in the clog, after all. More clog accesses would hardly improve the situation. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Planner performance extremely affected by an hanging transaction (20-30 times)?
On 2013-09-25 11:17:51 -0700, Jeff Janes wrote: On Wed, Sep 25, 2013 at 10:53 AM, Andres Freund and...@2ndquadrant.com wrote: On 2013-09-25 00:06:06 -0700, Jeff Janes wrote: On 09/20/2013 03:01 PM, Jeff Janes wrote: 3) Even worse, asking if a given transaction has finished yet can be a serious point of system-wide contention, because it takes the ProcArrayLock, once per row which needs to be checked. So you have 20 processes all fighting over the ProcArrayLock, each doing so 1000 times per query. That should be gone in master, we don't use SnapshotNow anymore which had those TransactionIdIsInProgress() calls you're probably referring to. The lookups discussed in this thread now use the statement's snapshot. And all those have their own copy of the currently running transactions. See HeapTupleSatisfiesMVCC, near line 943 of tqual.c:

else if (TransactionIdIsInProgress(HeapTupleHeaderGetXmin(tuple)))
    return false;
else if (TransactionIdDidCommit(HeapTupleHeaderGetXmin(tuple)))
    SetHintBits(tuple, buffer, HEAP_XMIN_COMMITTED, HeapTupleHeaderGetXmin(tuple));

Hm, sorry, misremembered things a bit there. If we guarded that check by moving up line 961 to before 943:

if (XidInMVCCSnapshot(HeapTupleHeaderGetXmin(tuple), snapshot))
    return false;    /* treat as still in progress */

Then we could avoid the contention, as that check only refers to local memory. That wouldn't be correct afaics - the current TransactionIdIsInProgress callsite is only reached when no HEAP_XMIN_COMMITTED was set. So you would have to duplicate it. As far as I can tell, the only downside of doing that is that, since hint bits might be set later, it is possible some dirty pages will get written unhinted and then re-dirtied by the hint bit setting, when more aggressive setting would have only one combined dirty write instead. But that seems rather hypothetical, and if it really is a problem we should probably tackle it directly rather than by barring other optimizations. 
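[Editor's sketch] The reordering being discussed can be modeled in miniature. This hedged Python sketch is not the real tqual.c visibility code - plain sets stand in for the snapshot, the ProcArray, and the clog, and the function name is invented - but it shows why checking the backend-local snapshot first avoids shared-memory contention for the common case.

```python
def xmin_passes(xmin, snapshot_xip, running_xids, committed_xids):
    # Step 1: XidInMVCCSnapshot() -- consults only backend-local memory,
    # so no lock is needed.
    if xmin in snapshot_xip:
        return False              # treat as still in progress for this snapshot
    # Step 2: TransactionIdIsInProgress() -- shared state, can contend on
    # ProcArrayLock; after the reordering it is only reached for xids
    # outside the local snapshot.
    if xmin in running_xids:
        return False
    # Step 3: TransactionIdDidCommit() -- clog lookup; on success the real
    # code would also set the HEAP_XMIN_COMMITTED hint bit.
    return xmin in committed_xids
```

The caveat raised above still applies: in the real code the in-progress check only runs when HEAP_XMIN_COMMITTED is unset, so a correct patch has to duplicate rather than simply move the snapshot check.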
I am - as evidenced - too tired to think about this properly, but I think you might be right here. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Cpu usage 100% on slave. s_lock problem.
Hi, On 2013-09-17 17:55:01 +0600, Дмитрий Дегтярёв wrote: We have not been able to reproduce this problem on a test server, and we do not dare to use this patch on production servers. In the course of studying the problem we have identified that many queries execute several times slower on the slave. On the master, heap_hot_search_buffer executes 100 cycles for a query; on the slave, the same query with the same plan executes 2000 cycles in heap_hot_search_buffer. Also, we were able to reproduce the problem on the master and detect s_lock contention during the slow queries. What you describe is normally an indication that you have too many longrunning transactions around preventing hot pruning from working. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
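[Editor's sketch] The link between long-running transactions, HOT pruning, and the 100-vs-2000 cycle counts can be shown with a toy model. This is hedged Python, not heap_hot_search_buffer itself; the function name is invented. The point is only the shape of the cost: when pruning works, dead tuple versions are removed and the HOT chain is short; when a long-running transaction holds back the prune horizon, every retained dead version must be walked before the live one is found.

```python
def hot_search_cycles(chain_live_flags):
    # chain_live_flags: one bool per tuple version in a HOT chain, in chain
    # order; True marks the live version. Returns the number of versions
    # visited before the live one is reached.
    steps = 0
    for live in chain_live_flags:
        steps += 1
        if live:
            return steps
    return steps  # no live version found: whole chain walked
```

With pruning working, `hot_search_cycles([True])` is 1; with 1999 unprunable dead versions ahead of the live one, it is 2000 - the same 20x difference reported between master and slave.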
Re: [PERFORM] Cpu usage 100% on slave. s_lock problem.
On 2013-09-17 08:18:54 -0500, Merlin Moncure wrote: On Tue, Sep 17, 2013 at 6:59 AM, Andres Freund and...@2ndquadrant.com wrote: Hi, On 2013-09-17 17:55:01 +0600, Дмитрий Дегтярёв wrote: We have not been able to reproduce this problem on a test server, and we do not dare to use this patch on production servers. In the course of studying the problem we have identified that many queries execute several times slower on the slave. On the master, heap_hot_search_buffer executes 100 cycles for a query; on the slave, the same query with the same plan executes 2000 cycles in heap_hot_search_buffer. Also, we were able to reproduce the problem on the master and detect s_lock contention during the slow queries. What you describe is normally an indication that you have too many longrunning transactions around preventing hot pruning from working. Do you think it's worth submitting the lock avoidance patch for formal review? You mean the bufmgr.c thing? Generally I think that that code needs a good bit of scalability work - there's a whole thread about it somewhere. But TBH the theories you've voiced about the issues you've seen haven't convinced me so far. If you can manage to prove it has a benefit in some case that's reproducible - why not go ahead? Quick question: Do you happen to have pg_locks output from back then around? We've recently found servers going into somewhat similar slowdowns because they exhausted the fastpath locks which made lwlocks far more expensive and made s_lock go up very high in the profile. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [PERFORM] Cpu usage 100% on slave. s_lock problem.
On 2013-09-17 08:32:30 -0500, Merlin Moncure wrote: On Tue, Sep 17, 2013 at 8:24 AM, Andres Freund and...@2ndquadrant.com wrote: On 2013-09-17 08:18:54 -0500, Merlin Moncure wrote: Do you think it's worth submitting the lock avoidance patch for formal review? You mean the bufmgr.c thing? Generally I think that that code needs a good bit of scalability work - there's a whole thread about it somewhere. But TBH the theories you've voiced about the issues you've seen haven't convinced me so far. Er, no (but I share your skepticism -- my challenge right now is to demonstrate measurable benefit which so far I've been unable to do). I was talking about the patch on *this* thread which bypasses the s_lock in RecoveryInProgress() :-). Ah, yes. Sorry, confused issues ;). Yes, I think that'd make sense. Quick question: Do you happen to have pg_locks output from back then around? We've recently found servers going into somewhat similar slowdowns because they exhausted the fastpath locks which made lwlocks far more expensive and made s_lock go up very high in the profile. I do. Unfortunately I don't have profile info. Not sure how useful it is -- I'll send it off-list. Great. The primary thing I'd like to know is whether there are lots of non-fastpath locks... If you ever get into the situation I mistakenly referred to again, I'd strongly suggest recompiling postgres with -fno-omit-frame-pointer. That makes hierarchical profiles actually useful which can help tremendously with diagnosing issues like this... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
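[Editor's sketch] The fastpath-exhaustion mechanism mentioned above can be modeled roughly. This hedged Python sketch is not PostgreSQL's lock.c; the class names are invented, and the behavior is simplified. The grounded part: since 9.2, each backend has a small fixed array of fastpath slots (FP_LOCK_SLOTS_PER_BACKEND, 16 in the real code) for weak relation locks; once those are used up, further lock acquisitions must go through the shared lock table, whose partition lwlocks then show up as s_lock contention in profiles.

```python
FASTPATH_SLOTS_PER_BACKEND = 16  # FP_LOCK_SLOTS_PER_BACKEND in PostgreSQL

class Backend:
    def __init__(self):
        self.fastpath_slots = set()

    def lock_relation(self, rel_oid, shared_lock_table):
        # A weak relation lock takes a per-backend fastpath slot while one is
        # free; after the 16 slots fill up, every further acquisition falls
        # back to the shared lock table (partitioned, lwlock-protected).
        if len(self.fastpath_slots) < FASTPATH_SLOTS_PER_BACKEND:
            self.fastpath_slots.add(rel_oid)
            return "fastpath"
        shared_lock_table.add((id(self), rel_oid))
        return "shared"
```

A query touching many relations (lots of partitions, say) blows past the 16 slots quickly, which is why checking pg_locks for a high proportion of non-fastpath locks is a good diagnostic.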
Re: [PERFORM] Cpu usage 100% on slave. s_lock problem.
On 2013-09-17 08:40:23 -0500, Merlin Moncure wrote: If you ever get into the situation I mistakenly referred to again, I'd strongly suggest recompling postgres with -fno-omit-frame-pointer. That makes hierarchical profiles actually useful which can help tremendously with diagnosing issues like this... We may get an opportunity to do that. I'm curious enough about the THP compaction issues that Kevin mentioned to to maybe consider cranking buffers again. If I do that, it will be with strict instructions to the site operators to catch a profile before taking further action. The THP issues should be very clearly diagnosable because a good part of the time will be spent in the kernel. Lots of spinlocking there, but the function names are easily discernible from pg's code. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Intermittent hangs with 9.2
On 2013-09-11 07:43:35 -0500, Merlin Moncure wrote: I've been seeing a strange issue with our Postgres install for about a year now, and I was hoping someone might be able to help point me at the cause. At what seem like fairly random intervals Postgres will become unresponsive to the 3 application nodes it services. These periods tend to last for 10 - 15 minutes before everything rights itself and the system goes back to normal. During these periods the server will report a spike in the outbound bandwidth (from about 1 Mb/s to about 5 Mb/s most recently), a huge spike in context switches / interrupts (normal peaks are around 2k/8k respectively, and during these periods they've gone to 15k/22k), and a load average of 100+. CPU usage stays relatively low, but it's all system time reported; user time goes to zero. It doesn't seem to be disk related since we're running with a shared_buffers setting of 24G, which will fit just about our entire database into memory, and the IO transactions reported by the server, as well as the disk reads reported by Postgres, stay consistently low. We've recently started tracking how long statements take to execute, and we're seeing some really odd numbers. A simple delete by primary key, for example, from a table that contains about 280,000 rows, reportedly took 18h59m46.900s. An update by primary key in that same table was reported as 7d 17h 58m 30.415s. That table is frequently accessed, but obviously those numbers don't seem reasonable at all. Some other changes we've made to postgresql.conf:

synchronous_commit = off
maintenance_work_mem = 1GB
wal_level = hot_standby
wal_buffers = 16MB
max_wal_senders = 10
wal_keep_segments = 5000
checkpoint_segments = 128
checkpoint_timeout = 30min
checkpoint_completion_target = 0.9
max_connections = 500

The server is a Dell Poweredge R900 with 4 Xeon E7430 processors, 48GB of RAM, running CentOS 6.3. 
So far we've tried disabling Transparent Huge Pages after I found a number of resources online that indicated similar interrupt/context switch issues, but it hasn't resolved the problem. I managed to catch it happening once and run a perf which showed:

+ 41.40% 48154 postmaster 0x347ba9  f 0x347ba9
+  9.55% 10956 postmaster 0x2dc820  f set_config_option
+  8.64%  9946 postmaster 0x5a3d4   f writeListPage
+  5.75%  6609 postmaster 0x5a2b0   f ginHeapTupleFastCollect
+  2.68%  3084 postmaster 0x192483  f build_implied_join_equality
+  2.61%  2990 postmaster 0x187a55  f build_paths_for_OR
+  1.86%  2131 postmaster 0x794aa   f get_collation_oid
+  1.56%  1822 postmaster 0x5a67e   f ginHeapTupleFastInsert
+  1.53%  1766 postmaster 0x1929bc  f distribute_qual_to_rels
+  1.33%  1558 postmaster 0x249671  f cmp_numerics

I'm not sure what 0x347ba9 represents, or why it's an address rather than a method name. Try converting it to something more meaningful with addr2line, that often has more success. That's about the sum of it. Any help would be greatly appreciated and if you want any more information about our setup, please feel free to ask. Reducing shared_buffers to around 2GB will probably make the problem go away. That profile doesn't really look like how one of the problems you are referring to would look. Based on the profile I'd guess it's possible that you're seeing problems with GIN's fastupdate mechanism. Try ALTER INDEX whatever SET (FASTUPDATE = OFF); VACUUM whatever's_table for all gin indexes. It's curious that set_config_option is so high in the profile... Any chance you could recompile postgres with -fno-omit-frame-pointer in CFLAGS? That would allow you to use perf -g. The performance price of that usually is below 1% for postgres. 
Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Performance bug in prepared statement binding in 9.2?
On 2013-09-11 15:06:23 -0400, Andrew Dunstan wrote: On 09/11/2013 02:35 PM, Josh Berkus wrote: All, We've confirmed that this issue is caused by having long-running idle transactions on the server. When we disabled their queueing system (which prodiced hour-long idle txns), the progressive slowness went away. Why that should affect 9.X far more strongly than 8.4, I'm not sure about. Does that mean that 8.4 was unsafe, or that this is something which *could* be fixed in later versions? I'm also confused as to why this would affect BIND time rather than EXECUTE time. One thing that this made me wonder is why we don't have transaction_timeout, or maybe transaction_idle_timeout. Because it's harder than it sounds, at least if you want to support idle-in-transactions. Note that we do not support pg_cancel_backend() for those yet... Also, I think it might lead to papering over actual issues with applications leaving transactions open. I don't really see a valid reason for an application needing cancelling of long idle transactions. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Performance bug in prepared statement binding in 9.2?
On 2013-09-11 11:35:45 -0700, Josh Berkus wrote: All, We've confirmed that this issue is caused by having long-running idle transactions on the server. When we disabled their queueing system (which produced hour-long idle txns), the progressive slowness went away. Why that should affect 9.X far more strongly than 8.4, I'm not sure about. Does that mean that 8.4 was unsafe, or that this is something which *could* be fixed in later versions? The explanation is in http://archives.postgresql.org/message-id/20130910132133.GJ1024477%40alap2.anarazel.de The referenced commit introduced a planner feature. Funnily enough, you seem to have been the trigger for its introduction ;) I'm also confused as to why this would affect BIND time rather than EXECUTE time. Because we're doing the histogram checks during planning and not during execution. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Performance bug in prepared statement binding in 9.2?
On 2013-09-09 20:38:09 -0400, Andrew Dunstan wrote: On 08/01/2013 03:20 PM, Jeff Janes wrote: On Thu, Aug 1, 2013 at 10:58 AM, Josh Berkus j...@agliodbs.com wrote: Amit, All: So we just retested this on 9.3b2. The performance is the same as 9.1 and 9.2; that is, progressively worse as the test cycles go on, and unacceptably slow compared to 8.4. Some issue introduced in 9.1 is causing BINDs to get progressively slower as the PARSEs BINDs get run repeatedly. Per earlier on this thread, that can bloat to 200X time required for a BIND, and it's definitely PostgreSQL-side. I'm trying to produce a test case which doesn't involve the user's application. However, hints on other things to analyze would be keen. Does it seem to be all CPU time (it is hard to imagine what else it would be, but...) Could you use oprofile or perf or gprof to get a profile of the backend during a run? That should quickly narrow it down to which C function has the problem. Did you test 9.0 as well? This has been tested back to 9.0. What we have found is that the problem disappears if the database has come in via dump/restore, but is present if it is the result of pg_upgrade. There are some long-running transactions also running alongside this - we are currently planning a test where those are not present. We're also looking at constructing a self-contained test case. Here is some perf output from the bad case: + 14.67% postgres [.] heap_hot_search_buffer + 11.45% postgres [.] LWLockAcquire + 8.39% postgres [.] LWLockRelease + 6.60% postgres [.] _bt_checkkeys + 6.39% postgres [.] PinBuffer + 5.96% postgres [.] hash_search_with_hash_value + 5.43% postgres [.] hash_any + 5.14% postgres [.] UnpinBuffer + 3.43% postgres [.] ReadBuffer_common + 2.34% postgres [.] index_fetch_heap + 2.04% postgres [.] heap_page_prune_opt + 2.00% libc-2.15.so [.] 0x8041b + 1.94% postgres [.] _bt_next + 1.83% postgres [.] btgettuple + 1.76% postgres [.] index_getnext_tid + 1.70% postgres [.] 
LockBuffer + 1.54% postgres [.] ReadBufferExtended + 1.25% postgres [.] FunctionCall2Coll + 1.14% postgres [.] HeapTupleSatisfiesNow + 1.09% postgres [.] ReleaseAndReadBuffer + 0.94% postgres [.] ResourceOwnerForgetBuffer + 0.81% postgres [.] _bt_saveitem + 0.80% postgres [.] _bt_readpage + 0.79% [kernel.kallsyms] [k] 0x81170861 + 0.64% postgres [.] CheckForSerializableConflictOut + 0.60% postgres [.] ResourceOwnerEnlargeBuffers + 0.59% postgres [.] BufTableLookup After a second look at this, I very tentatively guess that you'll see get_actual_variable_range() as the entry point here. Which would explain why you're seeing this during PARSE. But there still is the question why we never actually seem to prune... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Performance bug in prepared statement binding in 9.2?
Hi, On 2013-09-09 20:38:09 -0400, Andrew Dunstan wrote: On 08/01/2013 03:20 PM, Jeff Janes wrote: On Thu, Aug 1, 2013 at 10:58 AM, Josh Berkus j...@agliodbs.com wrote: Amit, All: So we just retested this on 9.3b2. The performance is the same as 9.1 and 9.2; that is, progressively worse as the test cycles go on, and unacceptably slow compared to 8.4. Some issue introduced in 9.1 is causing BINDs to get progressively slower as the PARSEs BINDs get run repeatedly. Per earlier on this thread, that can bloat to 200X time required for a BIND, and it's definitely PostgreSQL-side. I'm trying to produce a test case which doesn't involve the user's application. However, hints on other things to analyze would be keen. Does it seem to be all CPU time (it is hard to imagine what else it would be, but...) Could you use oprofile or perf or gprof to get a profile of the backend during a run? That should quickly narrow it down to which C function has the problem. Did you test 9.0 as well? This has been tested back to 9.0. What we have found is that the problem disappears if the database has come in via dump/restore, but is present if it is the result of pg_upgrade. There are some long-running transactions also running alongside this - we are currently planning a test where those are not present. We're also looking at constructing a self-contained test case. Here is some perf output from the bad case: + 14.67% postgres [.] heap_hot_search_buffer + 11.45% postgres [.] LWLockAcquire + 8.39% postgres [.] LWLockRelease + 6.60% postgres [.] _bt_checkkeys + 6.39% postgres [.] PinBuffer + 5.96% postgres [.] hash_search_with_hash_value + 5.43% postgres [.] hash_any + 5.14% postgres [.] UnpinBuffer + 3.43% postgres [.] ReadBuffer_common + 2.34% postgres [.] index_fetch_heap + 2.04% postgres [.] heap_page_prune_opt A backtrace for this would be useful. Alternatively you could recompile postgres using -fno-omit-frame-pointer in CFLAGS and use perf record -g. 
Any chance you have older prepared xacts, older sessions or something like that around? I'd expect heap_prune* to be present in workloads that spend significant time in heap_hot_search_buffer... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Performance bug in prepared statement binding in 9.2?
On 2013-09-10 08:45:33 -0400, Andrew Dunstan wrote: On 09/10/2013 08:20 AM, Andres Freund wrote: A backtrace for this would be useful. Alternatively you could recompile postgres using -fno-omit-frame-pointer in CFLAGS and use perf record -g. It's using a custom build, so this should be doable. Great. Any chance you have older prepared xacts, older sessions or something like that around? I'd expect heap_prune* to be present in workloads that spend significant time in heap_hot_search_buffer... Not sure about prepared transactions. There are certainly probably old prepared statements around, and long-running transactions alongside this one. Ok, long-running transactions will do the trick. I quickly checked, and doing an index lookup for min/max histogram lookups was added *after* 8.4, which would explain why you're not seeing the issue there (c.f. 40608e7f949fb7e4025c0ddd5be01939adc79eec). It getting slower and slower during a test run would be explained by the additional tuple versions amassing which cannot be marked dead because of older transactions around. I guess those are also part of the test? If I interpret things correctly you're using serializable? I guess there is no chance to use repeatable read instead? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Performance bug in prepared statement binding in 9.2?
On 2013-09-10 15:21:33 +0200, Andres Freund wrote: If I interpret things correctly you're using serializable? I guess there is no chance to use repeatable read instead? Err, that wouldn't help much. Read committed. That lets PGXACT->xmin advance these days and thus might help to reduce the impact of the long-running transactions. Otherwise you will have to shorten those... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Cpu usage 100% on slave. s_lock problem.
On 2013-08-27 09:57:38 -0500, Merlin Moncure wrote:

+ bool
+ RecoveryMightBeInProgress(void)
+ {
+     /*
+      * We check shared state each time only until we leave recovery mode. We
+      * can't re-enter recovery, so there's no need to keep checking after the
+      * shared variable has once been seen false.
+      */
+     if (!LocalRecoveryInProgress)
+         return false;
+     else
+     {
+         /* use volatile pointer to prevent code rearrangement */
+         volatile XLogCtlData *xlogctl = XLogCtl;
+
+         /* Intentionally query xlogctl without spinlocking! */
+         LocalRecoveryInProgress = xlogctl->SharedRecoveryInProgress;
+
+         return LocalRecoveryInProgress;
+     }
+ }

I don't think it's acceptable to *set* LocalRecoveryInProgress here. That should only be done in the normal routine. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Partitions not Working as Expected
On 2013-06-27 14:42:26 -0400, Tom Lane wrote: Shaun Thomas stho...@optionshouse.com writes: On 06/27/2013 12:42 PM, Dave Johansen wrote: Or what about something like DATE_TRUNC('DAY', now())? Or would that run into the same optimization/planner problems as CURRENT_DATE? Same issue. This seems to work, though I'm not entirely sure of the implications: UPDATE pg_proc SET provolatile = 'i' WHERE proname = 'date_in'; That will break things: CURRENT_DATE will then be equivalent to just writing today's date as a literal. It's conceivable that it wouldn't break any scenario that you personally care about, if you never use CURRENT_DATE in any view, rule, column default expression, or cached plan; but it seems mighty risky from here. I don't see any very good solution to your problem within the current approach to partitioning, which is basically theorem-proving. That proof engine has no concept of time passing, let alone the sort of detailed knowledge of the semantics of this particular function that would allow it to conclude: if CURRENT_DATE > '2013-06-20' is true now, it will always be so in the future as well. Couldn't we at least significantly improve on the status quo by detecting we're currently planning a query that's only going to be executed once (because it's directly executed or because we're planning a one-time plan for specific parameters) and inline stable functions before doing the theorem proving? Maybe I am missing something here? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Setting vacuum_freeze_min_age really low
On 2013-05-13 13:21:54 -0400, Robert Haas wrote: On Sun, May 12, 2013 at 8:50 AM, Andres Freund and...@2ndquadrant.com wrote: [ a response that I entirely agree with ] +1 to all that. It's maybe worth noting that it's probably fairly uncommon for vacuum to read a page and not dirty it, because if the page is all-visible, we won't read it. But only if 50(?)+ pages are marked all-visible in one go, otherwise AFAIR we won't skip them. And we don't skip them at all during full table vacuums. And if it's not all-visible, and there's nothing else interesting to do with it, we'll probably make it all-visible, which will dirty it. It can happen, if for example we vacuum a page with no dead tuples while the inserting transaction is still running, or committed but not yet all-visible. Of course, in those cases we won't be able to freeze, either. IIRC the actual values below which we freeze are always computed relative to GetOldestXmin() (and have to be, otherwise rows would suddenly appear visible). In many, many environments that's lagging behind quite a bit. Long-running user transactions, pg_dump, hot_standby_feedback, vacuum_defer_cleanup_age... Also, even if the *whole* page isn't all-visible because e.g. there just was another row inserted, we still freeze individual rows. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Setting vacuum_freeze_min_age really low
Hi Josh, On 2013-05-11 16:28:32 -0700, Josh Berkus wrote: That, and Tom's concern about forensics, which I understand to be the larger sticking point. I don't buy the idea that we should cause regular recurring performance issues for all of our users in order to aid diagnosing the kind of issues which happen 1% of the time to 2% of our users. Well. For one, you haven't proven that the changed setting actually improves performance. So the comparison isn't really valid. We will still need full table vacuums to be able to change relfrozenxids. Also, the small percentages are the cases where the shit really hit the fan. Making sure you have at least some chance of a) diagnosing the issue b) recovering data is a pretty good thing. So, if the table's age is less than vacuum_freeze_table_age, we'll only scan pages not already marked all-visible. Regardless of vfma, we probably won't freeze much. Right, but the pages which were dirtied *anyway* will get frozen. I think you're missing the fact that we don't necessarily dirty pages just because vacuum visits them. In a mostly-insert workload it's not uncommon that vacuum doesn't change anything. In many scenarios the first time vacuum visits a page it cannot yet be marked all-visible, so we will visit it again soon after anyway. And after that there will be regular full table vacuums. It will also often enough lead to a page being frozen repeatedly, which causes unnecessary IO and WAL traffic. If a page contains tuples from several transactions it's not unlikely that some tuples are older and some are newer than vfma. That scenario isn't unlikely because of two scenarios: Nobody has yet explained to me where this extra WAL and IO traffic would come from. vfma only takes effect if the page is being vacuumed *anyway*. There's multiple points here: a) we don't necessarily write/dirty anything if vacuum doesn't find anything to do b) freezing tuples requires a xlog_heap_freeze wal record to be emitted. 
If we don't freeze, we don't need to emit it. And if the page is being vacuumed anyway, the page is being rewritten anyway, and it doesn't matter how many changes we make on that page, except as far as CPU time is concerned. As far as IO is concerned, an 8K page is an 8K page. No? Sure, *if* we write out the page, it doesn't matter at all whether we changed one byte or all of them. Unless it also requires extra xlog records to be emitted. The only time I can imagine this resulting in extra IO is if vacuum is regularly visiting pages which don't have any other work to do, but do have tuples which could be frozen if vfma was lowered. I would tend to think that this would be a tiny minority of pages, but testing may be the only way to answer that. INSERT-only workloads produce exactly that. When a page contains freezable items, as determined by freeze_min_age, and we are doing a full table scan we won't skip buffers that we can't lock for cleanup. Instead we will wait and then lock them for cleanup. So I think this would rather noticeably impact the speed of vacuum (since it waits more often) and concurrency (since we lock more buffers than before, even if they are actively used). Well, that behavior sounds like something we should maybe fix, regardless of whether we're lowering the default vfma or not. Well, that's easier said than done ;) I wonder if we couldn't do the actual freezing - not the dead tuple deletion - without a cleanup lock, just with an exclusive lock? I think I have said that before, but anyway: I think as long as we need to regularly walk the whole relation for correctness there isn't much hope of getting this into an acceptable state. If we would track the oldest xid in a page in a 'freeze map' we could make much of this more efficient and way more scalable to bigger data volumes. 
Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Setting vacuum_freeze_min_age really low
Hi, On 2013-03-25 13:31:17 -0700, Josh Berkus wrote: In the past, setting vacuum_freeze_min_age (vfma) really low (say to 1 or 5) would have caused lots of extra writing work due to dirtying extra pages for freezing. This has been our stated reason to keep vfma high, despite the obvious advantage of freezing tuples while they're still in the cache. With the visibility map, though, vfma should only be dirtying pages which vacuum is already visiting because there's dirty tuples on the page. That is, pages which vacuum will probably dirty anyway, freezing or not. (This is assuming one has applied the 9.2.3 update.) Given that, it seems like the cost of lowering vfma *should* be marginal. The only extra work done by a lower vfma should be: 1. extra cpu time to put in the frozen XIDs on vacuumed pages, and 2. dirtying the minority of pages which vacuum decided to scan, but not write to. It will also often enough lead to a page being frozen repeatedly, which causes unnecessary IO and WAL traffic. If a page contains tuples from several transactions it's not unlikely that some tuples are older and some are newer than vfma. That scenario isn't unlikely because of two scenarios: - INSERT/UPDATE reusing space on older pages where tuples have been deleted. - When a backend extends a relation, that page is *not* known to have free space to other backends. Until vacuum comes along for the first time only this backend will use its space. Given that busy clusters frequently burn loads of xids per second it is not uncommon to have a wide range of xids on such a page. And are there other costs I'm not thinking of? I think (but am not 100% sure right now) it would have another rather big cost: When a page contains freezable items, as determined by freeze_min_age, and we are doing a full table scan we won't skip buffers that we can't lock for cleanup. Instead we will wait and then lock them for cleanup. 
So I think this would rather noticeably impact the speed of vacuum (since it waits more often) and concurrency (since we lock more buffers than before, even if they are actively used). Makes sense? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
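As a practical aside to the freezing discussion above: how far each table lags on freezing can be checked from the catalogs. A minimal sketch, using only stock pg_class columns:

```sql
-- Tables with the oldest relfrozenxid, i.e. those closest to needing
-- a full-table (anti-wraparound) vacuum.
SELECT relname, age(relfrozenxid) AS xid_age
FROM pg_class
WHERE relkind = 'r'
ORDER BY age(relfrozenxid) DESC
LIMIT 20;
```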
Re: [PERFORM] Setting vacuum_freeze_min_age really low
On 2013-05-09 12:09:04 -0400, Robert Haas wrote: On Mon, Mar 25, 2013 at 4:31 PM, Josh Berkus j...@agliodbs.com wrote: In the past, setting vacuum_freeze_min_age (vfma) really low (say to 1 or 5) would have caused lots of extra writing work due to dirtying extra pages for freezing. This has been our stated reason to keep vfma high, despite the obvious advantage of freezing tuples while they're still in the cache. That, and Tom's concern about forensics, which I understand to be the larger sticking point. FWIW I found having sensible xmin/xmax repeatedly really useful for debugging problems. Most problems don't get noticed within minutes, so losing evidence that fast is bad. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] 9.2.1 index-only scans : abnormal heap fetches after VACUUM FULL
On 2012-11-29 17:20:01 +0530, Pavan Deolasee wrote: On Thu, Nov 29, 2012 at 5:03 PM, Guillaume Cottenceau g...@mnc.ch wrote: Hello, I am toying around with 9.2.1, trying to measure/determine how index-only scans can improve our performance. A small script which is attached to this mail shows that as long as the table has been VACUUM FULL'd, there is an unusually high amount of heap fetches. It is strange that the visibilitymap_test predicate fails in these situations, is the visibility map somehow trashed in this situation? It should not, or at least the documentation[1] should state it (my understanding is that vacuum full does *more* than vacuum, but nothing less) (note to usual anti vacuum full trollers: I know you hate vacuum full). I don't find it very surprising given that VACUUM FULL is now implemented as a CLUSTER command which rewrites the entire heap, thus invalidating all the visibility map info whatsoever. Me neither. Now whether CLUSTER or VACUUM FULL can recreate the visibility map with all bits set to visible, that's an entirely different question. I don't think it can, but then I haven't thought through this completely. It can't set everything to visible as it also copies RECENTLY_DEAD tuples and tuples which are not yet visible to other transactions, but it should be relatively easy to keep enough information about whether it can set the current page to all-visible. At least for the data in the main relation, the toast tables are a different matter. Just tracking whether the page in rewriteheap.c's state->rs_buffer contains only tuples that are clearly visible according to the xmin horizon seems to be enough. The current effect of resetting the VM has the disadvantage of making the next autovacuum basically a full table vacuum without any benefits... 
Greetings, Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] 9.2.1 index-only scans : abnormal heap fetches after VACUUM FULL
On 2012-11-29 17:59:39 +0530, Pavan Deolasee wrote: On Thu, Nov 29, 2012 at 5:42 PM, Andres Freund and...@2ndquadrant.com wrote: On 2012-11-29 17:20:01 +0530, Pavan Deolasee wrote: Now can CLUSTER or VACUUM FULL recreate the visibility map with all bits set to visible, thats an entirely different question. I don't think it can, but then I haven't thought through this completely. It can't set everything to visible as it also copies RECENTLY_DEAD tuples and tuples which are not yet visible to other transactions, but it should be relatively easy to keep enough information about whether it can set the current page to all visible. Yeah, that looks fairly easy to have. Thinking about it more, now that we have the ability to skip WAL for the case when a table is created and populated in the same transaction, we could also set the visibility map bits for such a table (if we are not doing that already). That should be fairly safe too. I don't think the latter would be safe at all. Every repeatable read transaction that started before the table creation would see that table's content based on the visibilitymap instead of seeing it as empty. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Poor performance using CTE
On 2012-11-21 10:21:16 -0500, Andrew Dunstan wrote: On 11/21/2012 09:59 AM, Tom Lane wrote: Andrew Dunstan and...@dunslane.net writes: If we're going to do it can we please come up with something more intuitive and much, much more documented than OFFSET 0? And if/when we do this we'll need to have big red warnings all over the release notes, since a lot of people I know will need to do some extensive remediation before moving to such a release. The probability that we would actually *remove* that behavior of OFFSET 0 is not distinguishable from zero. I'm not terribly excited about having an alternate syntax to specify an optimization fence, but even if we do create such a thing, there's no need to break the historical usage. I wasn't talking about removing it. My point was that if the optimization fence around CTEs is removed a lot of people will need to rework apps where they have used them for that purpose. And I continue to think that spelling it OFFSET 0 is horribly obscure. +1 WITH foo AS (SELECT ...) (barrier=on|off)? 9.3 introduces the syntax, defaulting to on 9.4 switches the default to off. Greetings, Andres Freund -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
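For readers unfamiliar with the idiom being debated: OFFSET 0 in a subquery is the historical spelling of an optimization fence, which keeps the planner from flattening the subquery into the outer query. A sketch with made-up table and column names:

```sql
-- Without the fence, the planner may pull the subquery up and
-- evaluate the predicates in whatever order it estimates is cheapest:
SELECT * FROM (SELECT * FROM t WHERE expensive_fn(x)) s WHERE s.y = 42;

-- With OFFSET 0, the subquery is planned as written:
SELECT * FROM (SELECT * FROM t WHERE expensive_fn(x) OFFSET 0) s WHERE s.y = 42;
```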
Re: [PERFORM] Poor performance using CTE
On 2012-11-21 13:16:25 -0300, Claudio Freire wrote: On Wed, Nov 21, 2012 at 12:32 PM, Andres Freund and...@anarazel.de wrote: +1 WITH foo AS (SELECT ...) (barrier=on|off)? 9.3 introduces the syntax, defaulting to on 9.4 switches the default to off. Why syntax? What about a guc? collapse_cte_limit? Because there are very good reasons to want the current behaviour. A guc is a global either/or so I don't see it helping much. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Poor performance using CTE
On 2012-11-21 13:32:45 -0300, Claudio Freire wrote: On Wed, Nov 21, 2012 at 1:24 PM, Andres Freund and...@2ndquadrant.com wrote: On 2012-11-21 13:16:25 -0300, Claudio Freire wrote: On Wed, Nov 21, 2012 at 12:32 PM, Andres Freund and...@anarazel.de wrote: +1 WITH foo AS (SELECT ...) (barrier=on|off)? 9.3 introduces the syntax, defaulting to on 9.4 switches the default to off. Why syntax? What about a guc? collapse_cte_limit? Because there are very good reasons to want the current behaviour. A guc is a global either/or so I don't see it helping much. set collapse_cte_limit=8; with blah as (blah) select blah; Not global at all. Not very manageable though. And it doesn't help if you need both in one query, which isn't actually that unlikely. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Increasing work_mem and shared_buffers on Postgres 9.2 significantly slows down queries
On 2012-10-30 14:08:56 -0500, Petr Praus wrote: select count(*) from contest c left outer join contestparticipant cp on c.id=cp.contestId left outer join teammember tm on tm.contestparticipantid=cp.id left outer join staffmember sm on cp.id=sm.contestparticipantid left outer join person p on p.id=cp.personid left outer join personinfo pi on pi.id=cp.personinfoid where pi.lastname like '%b%' or pi.firstname like '%a%'; Btw, not really related to the question, but the way you use left joins here doesn't really make sense and does lead to inferior plans. As you restrict on 'pi', the rightmost table in a chain of left joins, there is no point in all those left joins. I would guess the overall plan is better if you use straight joins. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
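A sketch of the rewrite being suggested: because the WHERE clause on pi discards NULL-extended rows, the joins on the path to pi behave as inner joins anyway and can be written as such, giving the planner more freedom to reorder (the tm/sm/p joins are kept as left joins here in case their matches really are optional):

```sql
select count(*)
from contest c
join contestparticipant cp on c.id = cp.contestid
left outer join teammember tm on tm.contestparticipantid = cp.id
left outer join staffmember sm on cp.id = sm.contestparticipantid
left outer join person p on p.id = cp.personid
join personinfo pi on pi.id = cp.personinfoid
where pi.lastname like '%b%' or pi.firstname like '%a%';
```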
Re: [PERFORM] how to avoid deadlock on masive update with multiples delete
On Friday, October 05, 2012 05:31:43 PM Tom Lane wrote: Maciek Sakrejda m.sakre...@gmail.com writes: Presumably something like this?: maciek=# CREATE TABLE test AS SELECT g, random() FROM generate_series(1,1000) g; CREATE maciek=# EXPLAIN DELETE FROM test USING (SELECT g FROM test ORDER BY ctid) x where x.g = test.g; There's no guarantee that the planner won't re-sort the rows coming from the sub-select, unfortunately. More often than not you can prevent the planner from doing that by putting an OFFSET 0 in the query. Not 100%, but better than nothing. We really need ORDER BY for DML. Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] how to avoid deadlock on masive update with multiples delete
On Friday, October 05, 2012 05:46:05 PM Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: On Friday, October 05, 2012 05:31:43 PM Tom Lane wrote: There's no guarantee that the planner won't re-sort the rows coming from the sub-select, unfortunately. More often than not you can prevent the planner from doing that by putting an OFFSET 0 in the query. Not 100%, but better than nothing. No, that will accomplish exactly nothing. The ORDER BY is already an optimization fence. Yea, sorry. I was thinking of a related problem/solution. We really need ORDER BY for DML. Meh. That's outside the SQL standard (not only outside the letter of the standard, but foreign to its very conceptual model) and I don't think the problem really comes up that often. Back when I mostly did consulting/development on client code it came up about once a week. I might have a warped view though, because that's the kind of thing you would ask a consultant about... Having said all that, are we sure this is even a deletion-order problem? I was wondering about deadlocks from foreign key references, for instance. Absolutely not sure, no. Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
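In the absence of ORDER BY for DML, a common workaround for this class of deadlock is to lock the target rows in a deterministic order first and only then modify them. A sketch, reusing the hypothetical test table from the thread (the predicate is made up):

```sql
BEGIN;
-- Take the row locks in a stable order, so two concurrent sessions
-- cannot acquire them in conflicting orders.
SELECT g FROM test WHERE g % 10 = 0 ORDER BY g FOR UPDATE;
-- Now the deletes cannot deadlock against another session doing the same.
DELETE FROM test WHERE g % 10 = 0;
COMMIT;
```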
Re: [PERFORM] Postgres becoming slow, only full vacuum fixes it
On Monday, September 24, 2012 02:21:09 PM Julien Cigar wrote: 5) synchronous_commit = off should only be used if you have a battery-backed write cache. Huh? Are you possibly confusing this with full_page_writes? Greetings, Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Postgres becoming slow, only full vacuum fixes it
On Monday, September 24, 2012 02:53:59 PM Julien Cigar wrote: On 09/24/2012 14:34, Andres Freund wrote: On Monday, September 24, 2012 02:21:09 PM Julien Cigar wrote: 5) synchronous_commit = off should only be used if you have a battery-backed write cache. Huh? Are you possibly confusing this with full_page_writes? indeed...! sorry for that (note that you still have a (very) small chance of losing data with synchronous_commit = off if your server crashes between two commit chunks) Sure, you have a chance of losing the last few transactions, but you won't corrupt anything. That's the entire point of the setting ;) Greetings, Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [PERFORM] transactions start time
Hi, On Wednesday, July 25, 2012 04:56:20 PM Tom Lane wrote: Aleksei Arefjev aleksei.aref...@nordicgaming.com writes: On 24 July 2012 20:21, Richard Huxton d...@archonet.com wrote: I'm not sure if I'm reading this right, but are there more than 48 million BEGINs that took 0s each (presumably rounded down) and then a handful taking about 0.8s? I'm wondering exactly where/how the duration was measured. If it was at a client, maybe the apparent delay had something to do with network glitches? It seems suspicious that all the outliers are around 0.8s. It would be useful to look to see if there's any comparable pattern for statements other than BEGIN. As Richard says, a BEGIN by itself ought to take negligible time. He also asked earlier on the IRC channel, and I got the idea that the problem could be explained by pgbouncer in transaction pooling mode waiting for a free backend connection. Aleksei confirmed that they use pgbouncer in that configuration, so that might be it. Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [PERFORM] Massive I/O spikes during checkpoint
On Tuesday, July 10, 2012 08:14:00 AM Maxim Boguk wrote: On Tue, Jul 10, 2012 at 4:03 PM, David Kerr d...@mr-paradox.net wrote: On Jul 9, 2012, at 10:51 PM, Maxim Boguk wrote: But what appears to be happening is that all of the data is being written out at the end of the checkpoint. This happens at every checkpoint while the system is under load. I get the feeling that this isn't the correct behavior and I've done something wrong. It's not the actual checkpoint. It's the fsync after the checkpoint which creates the write spikes hurting the server. You should set the sysctls vm.dirty_background_bytes and vm.dirty_bytes to reasonably low values. So use bla_bytes instead of bla_ratio? Yes, because on a 256GB server echo 10 > /proc/sys/vm/dirty_ratio is equivalent to 26GB dirty_bytes, and echo 5 > /proc/sys/vm/dirty_background_ratio is equivalent to 13GB dirty_background_bytes. Those are really huge values. So the kernel doesn't start writing any pages out in the background before it has at least 13GB of dirty pages in kernel memory. And at the end of the checkpoint the kernel tries to flush all dirty pages to disk. That's not entirely true. The kernel will also write out pages which haven't been written to for dirty_expire_centisecs. But yes, adjusting dirty_* is definitely a good idea. Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
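The dirty_*_bytes tuning suggested above could look like this. The absolute values are illustrative only, not recommendations from the thread; sensible settings depend on the RAID cache and disk throughput:

```shell
# Start background writeback at ~256MB of dirty pages and force
# synchronous writeback at ~1GB, instead of percentage-of-RAM defaults
# that grow to tens of GB on large-memory machines.
sysctl -w vm.dirty_background_bytes=268435456   # 256MB
sysctl -w vm.dirty_bytes=1073741824             # 1GB

# To persist across reboots, the equivalent /etc/sysctl.conf lines:
#   vm.dirty_background_bytes = 268435456
#   vm.dirty_bytes = 1073741824
```

Note that setting the *_bytes variants automatically disables the corresponding *_ratio settings, and vice versa.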
Re: [PERFORM] Massive I/O spikes during checkpoint
On Tuesday, July 10, 2012 03:36:35 PM Jeff Janes wrote: On Tue, Jul 10, 2012 at 5:44 AM, Andres Freund and...@2ndquadrant.com wrote: On Tuesday, July 10, 2012 08:14:00 AM Maxim Boguk wrote: So the kernel doesn't start writing any pages out in the background before it has at least 13GB of dirty pages in kernel memory. And at the end of the checkpoint the kernel tries to flush all dirty pages to disk. That's not entirely true. The kernel will also write out pages which haven't been written to for dirty_expire_centisecs. There seem to be many situations in which it totally fails to do that. Totally as in dirty pages sitting around without any IO activity? Or just not aggressive enough? Currently it's a bit hard to speculate about it all without specifying the kernel, because there have been massive rewrites of all that stuff in several kernel versions in the last two years... Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [PERFORM] PostgreSQL db, 30 tables with number of rows 100 (not huge) - the fastest way to clean each non-empty table and reset unique identifier column of empty ones.
On Friday, July 06, 2012 01:38:56 PM Daniel Farina wrote: Well, I don't know of a mechanism besides slow file system truncation time that would explain why DELETE would be significantly faster. There is no filesystem truncation happening. The heap and the indexes get mapped into a new file. Otherwise rollback would be pretty hard to implement. I guess the biggest cost in a bigger cluster is dropping the buffers that were formerly mapped to that relation (DropRelFileNodeBuffers). Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
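For the use case in the subject line (resetting many small test tables), the trade-off above can be sketched as follows. Table and sequence names are hypothetical, and which variant wins depends on table size and shared_buffers, so measure both:

```sql
-- DELETE + sequence reset: can be cheaper for tiny tables, because
-- TRUNCATE maps the heap and indexes to new files and scans
-- shared_buffers to drop the old pages (DropRelFileNodeBuffers).
DELETE FROM some_small_table;
ALTER SEQUENCE some_small_table_id_seq RESTART WITH 1;

-- TRUNCATE does both in one statement and also reclaims disk space,
-- which usually wins for larger tables:
TRUNCATE some_small_table RESTART IDENTITY;
```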
Re: [PERFORM] Maximum number of sequences that can be created
On Tuesday, May 15, 2012 08:29:11 AM Віталій Тимчишин wrote: 2012/5/13 Robert Klemme shortcut...@googlemail.com On Sun, May 13, 2012 at 10:12 AM, Віталій Тимчишин tiv...@gmail.com wrote: 2012/5/11 Robert Klemme shortcut...@googlemail.com On the contrary: what would be the /advantage/ of being able to create millions of sequences? What's the use case? We are using sequences as statistics counters - they produce almost no performance impact and we can tolerate its non-transactional nature. I can imagine someone who wants to have a sequence per user or other relation row. I can almost see the point. But my natural choice in that case would be a table with two columns. Would that actually be so much less efficient? Of course you'd have fully transactional behavior and thus locking. We've had concurrency problems with the table solution (a counter that is updated by many concurrent queries), so we traded transactionality for speed. We are actually using this data to graph pretty graphs in nagios, so it's quite OK. But we have only ~10 sequences, not millions :) I would rather suggest going with a summing table if you need to do something like that: sequence_id | value 1 | 3434334 1 | 1 1 | -1 1 | 1 1 | 1 ... You then can get the current value with SELECT SUM(value) WHERE sequence_id = 1. For garbage collection you can delete those values and insert the newly summed up value again. That solution won't ever block if done right. Andres
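A minimal sketch of the summing table described above, with hypothetical table and column names (the garbage-collection step uses a data-modifying CTE, available from PostgreSQL 9.1 on):

```sql
-- Each increment is an INSERT, so concurrent updaters never contend
-- on a single counter row.
CREATE TABLE counter_deltas (
    sequence_id integer NOT NULL,
    value       bigint  NOT NULL
);
CREATE INDEX ON counter_deltas (sequence_id);

-- Increment counter 1, concurrently and without blocking:
INSERT INTO counter_deltas (sequence_id, value) VALUES (1, 1);

-- Read the current value:
SELECT COALESCE(SUM(value), 0) FROM counter_deltas
WHERE sequence_id = 1;

-- Periodic garbage collection: collapse all deltas into one row.
WITH deleted AS (
    DELETE FROM counter_deltas WHERE sequence_id = 1 RETURNING value
)
INSERT INTO counter_deltas (sequence_id, value)
SELECT 1, COALESCE(SUM(value), 0) FROM deleted;
```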
Re: [PERFORM] Fwd: [HACKERS] client performance v.s. server statistics
Hi, On Wednesday, February 15, 2012 11:19:00 AM Zhou Han wrote: I have tried a unix domain socket and the performance is similar to a TCP socket. It is a MIPS architecture, so memory copy to/from the kernel can occupy much time, and apparently using a unix domain socket is no different from TCP in terms of memory copies. But it is still unbelievable for the ten-fold gap between the client side statistics and the server side statistics. So I want to know what operations exactly are involved in the server side statistics in EXPLAIN ANALYZE. I may check the code later on when I get time. My guess is that the time difference you're seeing is actually the planning time. The timing shown at the end of EXPLAIN ANALYZE is just the execution, not the planning time. You can use \timing on in psql to let it display timing information that includes planning. What's the query? For the query itself, it was just for performance comparison. There are other index based queries, which are of course much faster, but still result in a similar ten-fold time gap between client side and server side statistics. I am thinking of a client interface that doesn't involve the kernel, is there such an option, or do I have to develop one from scratch? It's unlikely that's possible in a sensible amount of time. But I don't think that's your problem anyway. Andres
Re: [PERFORM] Fwd: [HACKERS] client performance v.s. server statistics
On Wednesday, February 15, 2012 12:33:13 PM Han Zhou wrote: Hi, To be more specific, I list my calculation here: The timing shown in psql may include: plan + execution + copying to the result set in the backend (does this step exist?) + transferring data to the client via socket. Correct. Then I want to know what the time shown in pg_stat_statements and EXPLAIN ANALYZE covers in terms of the above mentioned parts. And why is the gap almost 10 times (100 ms v.s. 1 second)? As a comparison, transferring the same amount of data over a unix domain socket should cost only a very small fraction of this (almost negligible), according to my other performance tests. Yea, you proved my quick theory wrong. And I don't think the plan time plays an important role here in EXPLAIN ANALYZE, because the command itself costs similar time to the Total runtime as shown in psql (timing on), which means the plan is too simple to take any significant part of the time in this case. Sounds like it. It would be interesting to see the time difference between: COPY (SELECT * FROM blub) TO '/tmp/somefile'; COPY (SELECT * FROM blub) TO '/tmp/somefile' BINARY; EXPLAIN ANALYZE SELECT * FROM blub; Andres
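The comparison Andres suggests can be run from psql like this (file paths are illustrative; COPY to a server-side file needs superuser, so \copy is shown as the unprivileged alternative):

```sql
\timing on

-- Query execution + writing the result server-side, text and binary:
COPY (SELECT * FROM blub) TO '/tmp/somefile';
COPY (SELECT * FROM blub) TO '/tmp/somefile' (FORMAT binary);

-- Server-side execution time only, no result transfer to the client:
EXPLAIN ANALYZE SELECT * FROM blub;

-- Client-side equivalent of the COPY, without superuser:
-- \copy (SELECT * FROM blub) TO '/tmp/somefile'
```

Comparing the COPY timings against the EXPLAIN ANALYZE timing separates result-set construction and transfer costs from pure execution.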
Re: [PERFORM] STRICT SQL functions never inline
On Tuesday, November 08, 2011 15:29:03 Josh Berkus wrote: Folks, After having some production issues, I did some testing and it seems that any SQL function declared STRICT will never inline. As a result, it won't work with either indexes (on the underlying predicate) or partitioning. This seems like a horrible gotcha for our users. At the very least I'd like to document it (in CREATE FUNCTION, presumably), but it would be better to fix it. Thoughts? I am all for documenting it somewhere. There were lots of people hit by it in the past - e.g. the postgis folks. It's not so easy to fix, though. The problem is that straight inlining would change the behaviour, because suddenly the expression might not return NULL anymore even though one of the parameters is NULL. Or it might even cause more problems because the content wasn't prepared to handle NULLs. It would be possible to inline a CASE WHEN $1 IS NULL OR $2 IS NULL THEN NULL ELSE orig_expression END, but that would be useful in far fewer cases, since it won't help much in most, and it actually might hurt performance in some. Andres
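A sketch of the gotcha, with hypothetical function and table names. The two functions have identical bodies; only the STRICT marker differs, and it is what blocks inlining and thereby the use of the expression index:

```sql
-- NOT inlined into calling queries because it is declared STRICT, so a
-- predicate using it cannot match an expression index on lower(name).
CREATE FUNCTION fold_strict(text) RETURNS text AS
$$ SELECT lower($1) $$ LANGUAGE sql IMMUTABLE STRICT;

-- Same body without STRICT: this one can be inlined, so
-- WHERE fold_inline(name) = 'x' becomes WHERE lower(name) = 'x'.
CREATE FUNCTION fold_inline(text) RETURNS text AS
$$ SELECT lower($1) $$ LANGUAGE sql IMMUTABLE;

CREATE TABLE people (name text);
CREATE INDEX ON people (lower(name));

EXPLAIN SELECT * FROM people WHERE fold_inline(name) = 'x';
-- can use the lower(name) index (given enough rows)
EXPLAIN SELECT * FROM people WHERE fold_strict(name) = 'x';
-- sequential scan, filtering through the opaque function call
```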
Re: [PERFORM] Guide to PG's capabilities for inlining, predicate hoisting, flattening, etc?
On Wednesday 02 Nov 2011 16:13:09 Robert Haas wrote: On Wed, Nov 2, 2011 at 10:38 AM, Tom Lane t...@sss.pgh.pa.us wrote: Jay Levitt jay.lev...@gmail.com writes: So you can see where I'm going. I know if I break everything into elegant, composable functions, it'll continue to perform poorly. If I write one big hairy query, it'll perform great but it will be difficult to maintain, and it will be inelegant and a kitten will die. My tools are CTEs, subqueries, aliases, SQL functions, PL/pgSQL functions, and views (and other tools?) What optimizations do each of those prevent? plpgsql functions are black boxes to the optimizer. If you can express your functions as single SQL commands, using SQL-language functions is usually a better bet than plpgsql. CTEs are also treated as optimization fences; this is not so much an optimizer limitation as to keep the semantics sane when the CTE contains a writable query. I wonder if we need to rethink, though. We've gotten a number of reports of problems that were caused by single-use CTEs not being equivalent - in terms of performance - to a non-CTE formulation of the same idea. It seems necessary for CTEs to behave this way when the subquery modifies data, and there are certainly situations where it could be desirable otherwise, but I'm starting to think that we shouldn't do it that way by default. Perhaps we could let people say something like WITH x AS FENCE (...) when they want the fencing behavior, and otherwise assume they don't (but give it to them anyway if there's a data-modifying operation in there). +1. I avoid writing CTEs in many cases where they would be very useful, just for that reason. I don't even think some future inlining necessarily has to be restricted to one-use cases only... +1 for making the fencing behaviour explicit as well. Currently there is no real explicit method to specify this that is guaranteed to be future proof (WITH, OFFSET 0)...
Andres
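The fencing behaviour under discussion can be observed directly (table name hypothetical; this reflects the planner of the era, where every CTE was a fence):

```sql
-- The CTE is an optimization fence: the predicate on id is NOT pushed
-- down into the scan of big_table, so the whole table may be read.
WITH x AS (SELECT * FROM big_table)
SELECT * FROM x WHERE id = 42;

-- The equivalent subquery is flattened, so the predicate can use an
-- index on big_table.id.
SELECT * FROM (SELECT * FROM big_table) AS x WHERE id = 42;

-- OFFSET 0 turns the subquery back into a fence.
SELECT * FROM (SELECT * FROM big_table OFFSET 0) AS x WHERE id = 42;
```

There is no `AS FENCE` keyword; it is only a proposal in this thread, which is why the CTE itself and `OFFSET 0` are the only (implicit) fencing mechanisms mentioned.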
Re: [PERFORM] Slow cursor
Hi, On Wednesday 26 Oct 2011 14:43:08 Cezariusz Marek wrote: Is there any known problem with slow cursors in PostgreSQL 8.4.5? I have the following query, which is slow (on my database it takes 11 seconds to execute), and probably should be rewritten, but that doesn't matter here. The problem is that in a cursor, each fetch takes much longer (even a few minutes!), while only the first one should be slow. Am I doing something wrong? Does the problem persist if you play around with cursor_tuple_fraction? Andres
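cursor_tuple_fraction can be changed per session; a sketch of what to try (query and cursor names hypothetical):

```sql
-- The default is 0.1: the planner optimizes cursors for retrieving
-- only the first ~10% of rows, which can pick a fast-start plan whose
-- later fetches are slow. 1.0 makes cursor plans match plain SELECTs.
SET cursor_tuple_fraction = 1.0;

BEGIN;
DECLARE c CURSOR FOR SELECT * FROM some_slow_query;
FETCH 100 FROM c;   -- subsequent fetches should now cost similar amounts
CLOSE c;
COMMIT;
```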
Re: [PERFORM] Rather large LA
On Monday 05 Sep 2011 22:23:32 Scott Marlowe wrote: On Mon, Sep 5, 2011 at 11:24 AM, Andres Freund and...@anarazel.de wrote: On Monday, September 05, 2011 14:57:43 Richard Shaw wrote: Autovacuum has been disabled and set to run manually via cron during a quiet period and fsync has recently been turned off to gauge any real world performance increase, there is battery backup on the raid card providing some level of resilience. That doesn't help you against a failure due to fsync() off as the BBU can only protect data that actually has been written to disk. Without fsync=on no guarantee about that exists. Further, if you've got a BBU cache on the RAID card the gains from fsync=off will be low / nonexistent. That's not necessarily true. If you have a mixed load of many small writes and some parallel huge writes (especially in combination with big indexes) fsync=off still can give you quite big performance increases. Even in the presence of synchronous_commit=off. Andres
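The two settings being contrasted in this subthread, as illustrative postgresql.conf fragments (fsync=off is shown only to mark the distinction; it is not a production setting):

```
# Safe latency reduction: recently committed transactions may be lost
# on a crash, but the cluster stays consistent. Combines fine with a
# BBU raid cache.
synchronous_commit = off

# Unsafe: skips fsync() entirely. A crash can corrupt the cluster even
# with a battery-backed cache, because the cache only protects writes
# that actually reached the controller.
#fsync = off
```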
Re: [PERFORM] Rather large LA
On Monday, September 05, 2011 14:57:43 Richard Shaw wrote: Autovacuum has been disabled and set to run manually via cron during a quiet period and fsync has recently been turned off to gauge any real world performance increase, there is battery backup on the raid card providing some level of resilience. That doesn't help you against a failure due to fsync() off as the BBU can only protect data that actually has been written to disk. Without fsync=on no guarantee about that exists. Andres
Re: [PERFORM] setting configuration values inside a stored proc
Hi, On Friday, May 13, 2011 01:10:19 AM Samuel Gendler wrote: I've got a stored proc that constructs some aggregation queries as strings and then executes them. I'd like to be able to increase work_mem before running those queries. If I set a new value for work_mem within the stored proc prior to executing my query string, will that actually have an impact on the query, or is work_mem basically a constant once the outer statement that calls the stored proc has begun? I'd just test, but it will take hours for me to grab a copy of production data and import it into a new db host for testing. I've already started that process, but I'm betting I'll have an answer by the time it completes. It's just the difference between modifying the application which calls the procs (and doing a full software release in order to do so, or else waiting a month for the next release) vs modifying the procs themselves, which requires only a db update. I would suggest doing ALTER FUNCTION blub(blarg) SET work_mem = '512MB'; Andres
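Both approaches can be sketched like this (the function name and signature are hypothetical):

```sql
-- Attach the setting to the function itself: work_mem is raised for
-- the duration of each call and restored afterwards. This needs only
-- a db update, no application change.
ALTER FUNCTION run_aggregations(date) SET work_mem = '512MB';

-- Alternative, inside the PL/pgSQL body before EXECUTE-ing the query
-- strings; the third argument 'true' makes the change local to the
-- current transaction:
--   PERFORM set_config('work_mem', '512MB', true);
```

Either way the new value does take effect for queries executed afterwards within the proc; work_mem is consulted at execution time, not frozen when the outer statement starts.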