[PERFORM] Re: 回复: [PERFORM] PG as in-memory db? How to warm up and re-populate buffers? How to read in all tuples into memory?

2012-02-28 Thread Stefan Keller
Hi Wales

2012/2/27 Wales Wang wormw...@yahoo.com wrote:
 There are many approaches for PostgreSQL in-memory.
 The quick and easy way is to make a slave pgsql run on a persistent RAM
 filesystem, where the slave is part of a master/slave replication cluster.

 The fstab entry and the script that make the RAM filesystem persistent are below:
 Setup:
 First, create a mountpoint for the disk :
 mkdir /mnt/ramdisk
 Secondly, add this line to /etc/fstab to mount the drive at boot time.
 tmpfs   /mnt/ramdisk tmpfs  defaults,size=65536M 0 0
 #!/bin/sh
 # /etc/init.d/ramdisk.sh
 #

 case "$1" in
   start)
     echo "Copying files to ramdisk"
     rsync -av /data/ramdisk-backup/ /mnt/ramdisk/
     echo "[`date +"%Y-%m-%d %H:%M"`] Ramdisk Synched from HD" >> /var/log/ramdisk_sync.log
     ;;
   sync)
     echo "Synching files from ramdisk to Harddisk"
     echo "[`date +"%Y-%m-%d %H:%M"`] Ramdisk Synched to HD" >> /var/log/ramdisk_sync.log
     rsync -av --delete --recursive --force /mnt/ramdisk/ /data/ramdisk-backup/
     ;;
   stop)
     echo "Synching logfiles from ramdisk to Harddisk"
     echo "[`date +"%Y-%m-%d %H:%M"`] Ramdisk Synched to HD" >> /var/log/ramdisk_sync.log
     rsync -av --delete --recursive --force /mnt/ramdisk/ /data/ramdisk-backup/
     ;;
   *)
     echo "Usage: /etc/init.d/ramdisk {start|stop|sync}"
     exit 1
     ;;
 esac
 exit 0

 You can run it at startup and shutdown, and from an hourly crontab entry.

 Wales Wang

Thank you for the tip.
Making a slave pgsql run on a persistent RAM filesystem is surely at least
a possibility which I'll try out.

But what I'm finally after is a solution where records don't get
pushed back to disk as soon as possible, but rather are held in memory
as long as possible, assuming that there is enough memory.
I suspect that currently there is quite some overhead because of that
(besides the disk-oriented structures).

-Stefan



[PERFORM] Re: [PERFORM] Re: 回复: [PERFORM] PG as in-memory db? How to warm up and re-populate buffers? How to read in all tuples into memory?

2012-02-28 Thread Claudio Freire
On Tue, Feb 28, 2012 at 5:30 AM, Stefan Keller sfkel...@gmail.com wrote:

 But what I'm finally after is a solution, where records don't get
 pushed back to disk a.s.a.p. but rather got hold in memory as long as
 possible assuming that there is enough memory.

fsync = off ?



[PERFORM] Re: [PERFORM] Re: [PERFORM] Re: 回复: [PERFORM] PG as in-memory db? How to warm up and re-populate buffers? How to read in all tuples into memory?

2012-02-28 Thread Tomas Vondra
On 28 Únor 2012, 14:08, Claudio Freire wrote:
 On Tue, Feb 28, 2012 at 5:30 AM, Stefan Keller sfkel...@gmail.com wrote:

 But what I'm finally after is a solution, where records don't get
 pushed back to disk a.s.a.p. but rather got hold in memory as long as
 possible assuming that there is enough memory.

 fsync = off ?

I don't think this is a viable idea, unless you don't care about the data.

Moreover, fsync=off does not mean not writing, and writing does not mean
removing from shared buffers. A page written/fsynced during a checkpoint
may stay in shared buffers.

AFAIK the pages are not removed from shared buffers without a reason. So a
dirty buffer is written to disk (because it needs to be, to preserve ACID) but
stays in shared buffers as clean (unless it was written by a backend,
which means there's not enough memory).
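One way to check whether that is what is actually happening is the
pg_buffercache contrib module; a rough sketch, following the query shown in
the pg_buffercache documentation:

  CREATE EXTENSION pg_buffercache;  -- 9.1+; on older releases install contrib/pg_buffercache by hand

  -- relations currently occupying the most space in shared_buffers
  SELECT c.relname, count(*) AS buffers
  FROM pg_buffercache b
  JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid)
   AND b.reldatabase IN (0, (SELECT oid FROM pg_database
                             WHERE datname = current_database()))
  GROUP BY c.relname
  ORDER BY 2 DESC
  LIMIT 10;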

Tomas




[PERFORM] Re: [PERFORM] Re: [PERFORM] Re: 回复: [PERFORM] PG as in-memory db? How to warm up and re-populate buffers? How to read in all tuples into memory?

2012-02-28 Thread Claudio Freire
On Tue, Feb 28, 2012 at 10:38 AM, Tomas Vondra t...@fuzzy.cz wrote:
 On 28 Únor 2012, 14:08, Claudio Freire wrote:
 On Tue, Feb 28, 2012 at 5:30 AM, Stefan Keller sfkel...@gmail.com wrote:

 But what I'm finally after is a solution, where records don't get
 pushed back to disk a.s.a.p. but rather got hold in memory as long as
 possible assuming that there is enough memory.

 fsync = off ?

 I don't think this is a viable idea, unless you don't care about the data.

Well, if you keep things in memory as long as possible (as per the
quoted message), then you don't care about the data anyway. There's no way
memory-only DBs can provide full ACID guarantees.

synchronous_commit=off, which is another option, goes halfway there without
sacrificing crash recovery.
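For illustration, a minimal per-session sketch; unlike fsync=off, a crash can
only lose the most recent commits, not corrupt the cluster:

  -- relax durability for this session only
  SET synchronous_commit = off;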

 Moreover, fsyn=off does not mean not writing and writing does not mean
 removing from shared buffers. A page written/fsynced during a checkpoint
 may stay in shared buffers.

The OS will write in the background (provided there's enough memory,
which was an assumption in the quoted message). It will not interfere
with other operations, so, in any case, writing or not, you get what
you want.

 AFAIK the pages are not removed from shared buffers without a reason. So a
 dirty buffer is written to a disk (because it needs to, to keep ACID) but
 stays in shared buffers as clean (unless it was written by a backend,
 which means there's not enough memory).

Just writing is not enough. ACID requires fsync. If you don't fsync
(be it with synchronous_commit=off or fsync=off), then it's no longer fully
ACID, because a crash at a bad moment can always make your data
non-persistent.

That's an unavoidable result of keeping things in memory.



[PERFORM] Re: [PERFORM] Re: [PERFORM] Re: [PERFORM] Re: 回复: [PERFORM] PG as in-memory db? How to warm up and re-populate buffers? How to read in all tuples into memory?

2012-02-28 Thread Tomas Vondra
On 28 Únor 2012, 14:52, Claudio Freire wrote:
 On Tue, Feb 28, 2012 at 10:38 AM, Tomas Vondra t...@fuzzy.cz wrote:
 On 28 Únor 2012, 14:08, Claudio Freire wrote:
 On Tue, Feb 28, 2012 at 5:30 AM, Stefan Keller sfkel...@gmail.com
 wrote:

 But what I'm finally after is a solution, where records don't get
 pushed back to disk a.s.a.p. but rather got hold in memory as long as
 possible assuming that there is enough memory.

 fsync = off ?

 I don't think this is a viable idea, unless you don't care about the
 data.

 Well, if you keep things in memory as long as possible (as per the
 quoted message), then you don't care about memory. There's no way
 memory-only DBs can provide ACID guarantees.

 synchronous_commit=off goes half way there without sacrificing crash
 recovery, which is another option.

 Moreover, fsyn=off does not mean not writing and writing does not
 mean
 removing from shared buffers. A page written/fsynced during a
 checkpoint
 may stay in shared buffers.

 The OS will write in the background (provided there's enough memory,
 which was an assumption on the quoted message). It will not interfere
 with other operations, so, in any case, writing or not, you get what
 you want.

 AFAIK the pages are not removed from shared buffers without a reason. So
 a
 dirty buffer is written to a disk (because it needs to, to keep ACID)
 but
 stays in shared buffers as clean (unless it was written by a backend,
 which means there's not enough memory).

 Just writing is not enough. ACID requires fsync. If you don't fsync
 (be it with synchronous_commit=off or fsync=off), then it's not full
 ACID already.
 Because a crash at a bad moment can always make your data nonpersistent.

I didn't say writing is sufficient for ACID, I said it's required, which
is kind of obvious because of the durability part.

 That's an unavoidable result of keeping things in memory.

Why? IIRC the OP was interested in keeping the data in memory for querying,
and the database is read-only after it's populated with data (once a
day). How does properly writing the transaction logs / data files
interfere with that?

I haven't investigated why exactly the data are not cached initially, but
none of the options that I can think of could be fixed by setting
fsync=off. That's something that influences writes (not read-only
database) and I don't think it influences how buffers are evicted from
shared buffers / page cache.

It might speed up the initial load of data, but that's not what the OP was
asking.

kind regards
Tomas




[PERFORM] Re: [PERFORM] Re: [PERFORM] Re: [PERFORM] Re: 回复: [PERFORM] PG as in-memory db? How to warm up and re-populate buffers? How to read in all tuples into memory?

2012-02-28 Thread Claudio Freire
On Tue, Feb 28, 2012 at 11:15 AM, Tomas Vondra t...@fuzzy.cz wrote:
 I haven't investigated why exactly the data are not cached initially, but
 none of the options that I can think of could be fixed by setting
 fsync=off. That's something that influences writes (not read-only
 database) and I don't think it influences how buffers are evicted from
 shared buffers / page cache.

 It might speed up the initial load of data, but that's not what the OP was
 asking.

It speeds up a lot more than the initial load of data.

Assume the database is read-only, but not the filesystem (i.e. it's
not a slave; in that case all this is moot since, as you said, there are no
writes on a slave). That is, assuming this is a read-only master,
read-only queries don't imply a read-only filesystem. Bookkeeping tasks
like updating catalogs, statistics tables, page cleanup, stuff like
that can still result in writes.

Writes that go through the WAL and then the filesystem.

With fsync=off, those writes happen in the background and are carried
out by the OS, effectively releasing postgres from having to wait on
them and, assuming there's enough RAM, merging repeated writes to the
same sectors into one operation in the end. For stats, bookkeeping, and
who knows what else, the merging would be quite effective. With enough
RAM to hold the entire DB, the merging would effectively keep
everything in RAM (in system buffers) until there's enough I/O
bandwidth to transparently push that to persistent storage.

In essence, what was asked for: keeping everything in RAM as much
as possible.

It *does* affect buffer eviction in the same way - it makes eviction
*very* quick, and re-population equally quick, if everything fits
into memory.



[PERFORM] Re: 回复: [PERFORM] PG as in-memory db? How to warm up and re-populate buffers? How to read in all tuples into memory?

2012-02-28 Thread Jeff Janes
On Tue, Feb 28, 2012 at 12:30 AM, Stefan Keller sfkel...@gmail.com wrote:

 Thank you for the tipp.
 Making slave pgsql run on persistent RAM filesystem is surely at least
 a possibility which I'll try out.

 But what I'm finally after is a solution, where records don't get
 pushed back to disk a.s.a.p. but rather got hold in memory as long as
 possible assuming that there is enough memory.

That is already the case.  There are two separate issues: when dirty
data is written to disk, and when clean data is dropped from memory.
The only connection between them is that dirty data can't just be
dropped; it must be written first.  But having written it, there is no
reason to immediately drop it.  When a checkpoint cleans data from
shared_buffers, that now-clean data remains in shared_buffers.  And at
the OS level, when an fsync forces dirty data out to disk, the
now-clean data generally remains in cache (although I've seen nfs
implementations where that was not the case).

It is hard to figure out what problem you are facing.  Is your data
not getting loaded into cache, or is it not staying there?

Cheers,

Jeff



[PERFORM] Re: [PERFORM] Re: [PERFORM] Re: [PERFORM] Re: [PERFORM] Re: 回复: [PERFORM] PG as in-memory db? How to warm up and re-populate buffers? How to read in all tuples into memory?

2012-02-28 Thread Tomas Vondra
On 28 Únor 2012, 15:24, Claudio Freire wrote:
 On Tue, Feb 28, 2012 at 11:15 AM, Tomas Vondra t...@fuzzy.cz wrote:
 I haven't investigated why exactly the data are not cached initially,
 but
 none of the options that I can think of could be fixed by setting
 fsync=off. That's something that influences writes (not read-only
 database) and I don't think it influences how buffers are evicted from
 shared buffers / page cache.

 It might speed up the initial load of data, but that's not what the OP
 was
 asking.

 It speeds a lot more than the initial load of data.

 Assuming the database is read-only, but not the filesystem (ie: it's
 not a slave, in which case all this is moot, as you said, there are no
 writes on a slave). That is, assuming this is a read-only master, then
 read-only queries don't mean read-only filesystem. Bookkeeping tasks
 like updating catalog dbs, statistics tables, page cleanup, stuff like
 that can actually result in writes.

 Writes that go through the WAL and then the filesystem.

I'm not sure what maintenance tasks you mean. Sure, there are tasks that
need to be performed after the load (stats, hint bits, updating system
catalogs etc.) but this may happen once right after the load and then
there's effectively zero write activity. Unless the database needs to
write temp files, but that contradicts the 'fits into RAM' assumption ...

 With fsync=off, those writes happen on the background, and are carried
 out by the OS. Effectively releasing postgres from having to wait on
 them, and, assuming there's enough RAM, merging repeated writes to the
 same sectors in one operation in the end. For stats, bookkeeping, and
 who knows what else, the merging would be quite effective. With enough
 RAM to hold the entire DB, the merging would effectively keep
 everything in RAM (in system buffers) until there's enough I/O
 bandwidth to transparently push that to persistent storage.

The writes are always carried out by the OS - except when dirty_ratio is
exceeded (but that's a different story) and WAL with direct I/O enabled.
The best way to allow merging the writes in shared buffers or page cache
is to set the checkpoint_segments / checkpoint_timeout high enough.
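Purely as an illustration (9.x-era parameter names; the values are not tuned
for any particular workload), that could look like this in postgresql.conf:

  checkpoint_segments = 64               # allow more WAL between checkpoints
  checkpoint_timeout = 30min             # checkpoint less often
  checkpoint_completion_target = 0.9     # spread checkpoint writes over the interval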

That way the transactions won't need to wait for writes to data files
(which is the part related to evictions of buffers from cache). And
read-only transactions won't need to wait at all because they don't need
to wait for fsync on WAL.

 In essence, what was required, to keep everything in RAM for as much
 as possible.

 It *does* in the same way affect buffer eviction - it makes eviction
 *very* quick, and re-population equally as quick, if everything fits
 into memory.

No it doesn't. Only a write caused by a backend (due to full
shared buffers) means immediate eviction. A simple write (caused by a
checkpoint) does not evict the page from shared buffers. Not even the
background writer evicts pages from shared buffers; it merely marks them
as 'clean' and leaves them there. And all those writes happen in the
background, so the clients don't need to wait for them to complete (except
for xlog checkpoints).

kind regards
Tomas




[PERFORM] Re: [PERFORM] Re: [PERFORM] Re: [PERFORM] Re: [PERFORM] Re: 回复: [PERFORM] PG as in-memory db? How to warm up and re-populate buffers? How to read in all tuples into memory?

2012-02-28 Thread Claudio Freire
On Tue, Feb 28, 2012 at 1:05 PM, Tomas Vondra t...@fuzzy.cz wrote:
 On 28 Únor 2012, 15:24, Claudio Freire wrote:
 It speeds a lot more than the initial load of data.

 Assuming the database is read-only, but not the filesystem (ie: it's
 not a slave, in which case all this is moot, as you said, there are no
 writes on a slave). That is, assuming this is a read-only master, then
 read-only queries don't mean read-only filesystem. Bookkeeping tasks
 like updating catalog dbs, statistics tables, page cleanup, stuff like
 that can actually result in writes.

 Writes that go through the WAL and then the filesystem.

 I'm not sure what maintenance tasks you mean. Sure, there are tasks that
 need to be performed after the load (stats, hint bits, updating system
 catalogs etc.) but this may happen once right after the load and then
 there's effectively zero write activity. Unless the database needs to
 write temp files, but that contradicts the 'fits into RAM' assumption ...

AFAIK, stats need to be constantly updated.
Not sure about the rest.

And yes, it's quite possible to require temp files even with a database
that fits in memory; all it takes are big OLAP-style queries and a small
enough work_mem.

 The writes are always carried out by the OS - except when dirty_ratio is
 exceeded (but that's a different story) and WAL with direct I/O enabled.
 The best way to allow merging the writes in shared buffers or page cache
 is to set the checkpoint_segments / checkpoint_timeout high enough.
 That way the transactions won't need to wait for writes to data files
 (which is the part related to evictions of buffers from cache). And
 read-only transactions won't need to wait at all because they don't need
 to wait for fsync on WAL.

Exactly

 In essence, what was required, to keep everything in RAM for as much
 as possible.

 It *does* in the same way affect buffer eviction - it makes eviction
 *very* quick, and re-population equally as quick, if everything fits
 into memory.

 No it doesn't. Only a write caused by a background process (due to full
 shared buffers) means immediate eviction. A simple write (caused by a
 checkpoint) does not evict the page from shared buffers. Not even a
 background writer evicts a page from shared buffers, it merely marks them
 as 'clean' and leaves them there. And all those writes happen on the
 background, so the clients don't need to wait for them to complete (except
 for xlog checkpoints).

So, we're saying the same thing.

With all that, and enough RAM, it already does what was requested.

Maybe it would help to tune the shared_buffers-to-OS-cache ratio, and
dirty_ratio, to allow a big portion of RAM to be used for write caching (if
there were enough writes, which I doubt), but, in essence, an
unmodified postgres installation with enough RAM to hold the whole DB
plus shared buffers should perform quite optimally.
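For the dirty_ratio part, the knobs in question are the Linux dirty-page
thresholds; purely illustrative values, not a recommendation:

  # /etc/sysctl.conf
  vm.dirty_background_ratio = 10   # start background writeback at 10% of RAM dirty
  vm.dirty_ratio = 40              # block writers only once 40% of RAM is dirty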



Re: [PERFORM] Very long deletion time on a 200 GB database

2012-02-28 Thread David Kerr

On 02/27/2012 12:08 AM, Reuven M. Lerner wrote:

Hi, everyone. I wanted to thank you again for your help on the huge
delete problem that I was experiencing.

After a lot of trial and error, we finally came to the conclusion that
deleting this much data in the time frame that they need, on
underpowered hardware that is shared with an application, with each test
iteration taking 5-9 hours to run (but needing to run in 2-3), is just
not going to happen. We tried many of the options that people helpfully
suggested here, but none of them gave us the performance that we needed.

(One of the developers kept asking me how it can possibly take so long
to delete 200 GB, when he can delete files of that size in much less
time. I had to explain to him that deleting rows from a database, is a
far more complicated task, and can't really be compared to deleting a
few files.)

In the end, it was agreed that we could execute the deletes over time,
deleting items in the background, or in parallel with the application's
work. After all, if the disk is filling up at the rate of 2 GB/day, then
so long as we delete 4 GB/day (which is pretty easy to do), we should be
fine. Adding RAM or another disk are simply out of the question, which
is really a shame for a database of this size.



Howdy,

I'm coming a little late to the thread, but I didn't see anyone propose
some tricks I've used in the past to overcome the slow-delete problem.


First - if you can drop your FKs, delete, and then re-create your FKs, you'll
find that you can delete an amazing amount of data very quickly.


second - if you can't do that, you can try a function that loops and
deletes a small amount at a time; this gets around the problem of deleting
more data than you can fit into memory. It's still slow, but just not
as slow. A rough sketch of such a function is below.
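In this sketch the table, column and cutoff are invented for illustration;
each call deletes one batch in its own transaction, so it can be invoked
repeatedly from cron or a shell loop until it returns 0:

  CREATE OR REPLACE FUNCTION delete_one_batch(batch_size integer) RETURNS bigint AS $$
  DECLARE
      n bigint;
  BEGIN
      DELETE FROM big_table
      WHERE id IN (SELECT id FROM big_table
                   WHERE created < now() - interval '90 days'
                   LIMIT batch_size);
      GET DIAGNOSTICS n = ROW_COUNT;
      RETURN n;   -- 0 means there is nothing left to delete
  END;
  $$ LANGUAGE plpgsql;

  -- e.g.  SELECT delete_one_batch(10000);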


third - don't delete; instead, create a new table containing only the rows
you want to keep and swap it in:

create table new_table as select * from old_table where <rows you want to keep>;
create indexes and constraints on new_table;
drop table old_table;
alter table new_table rename to old_table;

fourth - I think some folks mentioned this, but just for completeness, 
partition the table and make sure that your partition key is such that 
you can just drop an entire partition.


Hope that helps and wasn't redundant.

Dave



[PERFORM] problems with set_config, work_mem, maintenance_work_mem, and sorting

2012-02-28 Thread Jon Nelson
I happened to be looking in the PostgreSQL logs (8.4.10, x86_64,
ScientificLinux 6.1) and noticed that an app was doing some sorting
(group by, order by, index creation) that ended up on disk rather than
staying in memory.
So I enabled trace_sort and restarted the app.
What followed confused me.

I know that the app is setting the work_mem and maintenance_work_mem
to 1GB, at the start of the session, with the following calls:

select set_config('work_mem', '1GB', false);
select set_config('maintenance_work_mem', '1GB', false);

By timestamps, I know that these statements take place before the next
log items, generated by PostgreSQL (note: I also log the PID of the
backend and all of these are from the same PID):

LOG:  0: begin tuple sort: nkeys = 2, workMem = 1048576, randomAccess = f
LOG:  0: begin tuple sort: nkeys = 1, workMem = 1048576, randomAccess = f
LOG:  0: begin tuple sort: nkeys = 1, workMem = 1048576, randomAccess = f
LOG:  0: begin tuple sort: nkeys = 1, workMem = 1048576, randomAccess = f
LOG:  0: begin tuple sort: nkeys = 2, workMem = 1048576, randomAccess = f
^ these make sense

LOG:  0: begin tuple sort: nkeys = 2, workMem = 131072, randomAccess = f
LOG:  0: begin tuple sort: nkeys = 1, workMem = 131072, randomAccess = f
LOG:  0: begin tuple sort: nkeys = 1, workMem = 131072, randomAccess = f

^^ these do not (but 128MB is the globally-configured work_mem value)

LOG:  0: begin index sort: unique = t, workMem = 2097152, randomAccess = f
^ this kinda does (2GB is the globally-configured maintenance_work_mem value)

LOG:  0: begin index sort: unique = f, workMem = 131072, randomAccess = f
LOG:  0: begin tuple sort: nkeys = 2, workMem = 131072, randomAccess = f
..


The config shows 128MB for work_mem and 2GB for maintenance_work_mem.
Why does PostgreSQL /sometimes/ use the globally-configured values and
sometimes use the values that come from the connection?
Am I misunderstanding what 'session' variables are? I thought
that session (versus transaction) config items were set for /all/
transactions in a given backend, until changed or until that backend
terminates. Is that not so?

If I reconfigure the app to call out to set_config(item, value, True)
after each 'BEGIN' statement then workMem seems to be correct (at
least more of the time -- the process takes some time to run and I
haven't done an exhaustive check as yet).
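For reference, the third argument of set_config() is is_local: true makes the
setting last only until the end of the current transaction, while false makes
it last for the rest of the session, provided the transaction that issued it
commits. A quick sketch of the difference:

  SELECT set_config('work_mem', '1GB', false);   -- session-wide once the transaction commits

  BEGIN;
  SELECT set_config('work_mem', '1GB', true);    -- transaction-local, reverts at COMMIT/ROLLBACK
  COMMIT;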

-- 
Jon



Re: [PERFORM] problems with set_config, work_mem, maintenance_work_mem, and sorting

2012-02-28 Thread Tom Lane
Jon Nelson jnelson+pg...@jamponi.net writes:
 The config shows 128MB for work_mem and 2GB for maintenance_work_mem.
 Why does PostgreSQL /sometimes/ use the globally-configured values and
 sometimes use the values that come from the connection?

You sure those log entries are all from the same process?

regards, tom lane



Re: [PERFORM] problems with set_config, work_mem, maintenance_work_mem, and sorting

2012-02-28 Thread Jon Nelson
On Tue, Feb 28, 2012 at 1:28 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Jon Nelson jnelson+pg...@jamponi.net writes:
 The config shows 128MB for work_mem and 2GB for maintenance_work_mem.
 Why does PostgreSQL /sometimes/ use the globally-configured values and
 sometimes use the values that come from the connection?

 You sure those log entries are all from the same process?

If I am understanding this correctly, yes. They all share the same pid.
The logline format is:

log_line_prefix = '%t %d %u [%p]'

and I believe %p represents the pid, and also that a pid corresponds
to a backend. Therefore, same pid == same backend == same connection
== same session. Many transactions within a session.


-- 
Jon



Re: [PERFORM] problems with set_config, work_mem, maintenance_work_mem, and sorting

2012-02-28 Thread Tom Lane
Jon Nelson jnelson+pg...@jamponi.net writes:
 On Tue, Feb 28, 2012 at 1:28 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Jon Nelson jnelson+pg...@jamponi.net writes:
 Why does PostgreSQL /sometimes/ use the globally-configured values and
 sometimes use the values that come from the connection?

 You sure those log entries are all from the same process?

 If I am understanding this correctly, yes. They all share the same pid.

Hmph ... does seem a bit weird.  Can you turn on log_statements and
identify which operations aren't using the session values?

regards, tom lane



Re: [PERFORM] problems with set_config, work_mem, maintenance_work_mem, and sorting

2012-02-28 Thread Jon Nelson
On Tue, Feb 28, 2012 at 2:51 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Jon Nelson jnelson+pg...@jamponi.net writes:
 On Tue, Feb 28, 2012 at 1:28 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Jon Nelson jnelson+pg...@jamponi.net writes:
 Why does PostgreSQL /sometimes/ use the globally-configured values and
 sometimes use the values that come from the connection?

 You sure those log entries are all from the same process?

 If I am understanding this correctly, yes. They all share the same pid.

 Hmph ... does seem a bit weird.  Can you turn on log_statements and
 identify which operations aren't using the session values?

I had log_min_duration_statement = 1000.

An example:

LOG:  0: begin tuple sort: nkeys = 3, workMem = 131072, randomAccess = f
LOCATION:  tuplesort_begin_heap, tuplesort.c:573
STATEMENT:  INSERT INTO (new table) SELECT (bunch of stuff here) FROM
.. ORDER BY ...

and also some CREATE TABLE ... statements:

LOG:  0: begin index sort: unique = f, workMem = 131072, randomAccess = f
LOCATION:  tuplesort_begin_index_btree, tuplesort.c:642
STATEMENT:  CREATE TABLE tablename (LIKE some_other_tablename)

I also see this:

LOG:  0: begin tuple sort: nkeys = 2, workMem = 131072, randomAccess = f
LOCATION:  tuplesort_begin_heap, tuplesort.c:573
STATEMENT: SELECT bunch of stuff from system catalogs

which is the ORM library (SQLAlchemy) doing a reflection of the
table(s) involved.
The statement is from the same backend (pid) and takes place
chronologically *after* the following:

LOG:  0: begin tuple sort: nkeys = 2, workMem = 1048576, randomAccess = f
LOCATION:  tuplesort_begin_heap, tuplesort.c:573
STATEMENT: more reflection stuff

Is that useful?

If that's not enough, I can crank the logging up.
What would you like to see for 'log_statements' (if what I've provided
you above is not enough).

-- 
Jon



Re: [PERFORM] problems with set_config, work_mem, maintenance_work_mem, and sorting

2012-02-28 Thread Kääriäinen Anssi
Quoting Jon Nelson:

The config shows 128MB for work_mem and 2GB for maintenance_work_mem.
Why does PostgreSQL /sometimes/ use the globally-configured values and
sometimes use the values that come from the connection?
Am I wrong in misunderstanding what 'session' variables are? I thought
that session (versus transaction) config items were set for /all/
transactions in a given backend, until changed or until that backend
terminates. Is that not so?


Could it be that the transaction which does the set_config is rolled back? If
that is the case, the set_config is rolled back, too. However, if the
transaction commits, then the set_config should be in effect for the whole
session. It seems this is not documented at all for set_config, just for the
SET SQL command.

I think it would be nice to have a way to force the connection to use the
provided settings even if the transaction in which they are done is rolled
back, in a single statement if possible. Otherwise you might be forced to do a
transaction just to be sure the SET is actually in effect for the connection's
lifetime.

Django was bitten by this, for example; it is now fixed by using this:
https://github.com/django/django/blob/master/django/db/backends/postgresql_psycopg2/base.py#L189

 - Anssi


Re: [PERFORM] problems with set_config, work_mem, maintenance_work_mem, and sorting

2012-02-28 Thread Tom Lane
Jon Nelson jnelson+pg...@jamponi.net writes:
 On Tue, Feb 28, 2012 at 2:51 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Hmph ... does seem a bit weird.  Can you turn on log_statements and
 identify which operations aren't using the session values?

 I had log_min_duration_statement = 1000.

That's not really going to prove much, as you won't be able to see any
commands that might be setting or resetting the work_mem parameters.

 ... which is the ORM library (SQLAlchemy) doing a reflection of the
 table(s) involved.

Oh, there's an ORM involved?  I'll bet a nickel it's doing something
surprising, like not issuing your SET until much later than you thought.

regards, tom lane



Re: [PERFORM] PG as in-memory db? How to warm up and re-populate buffers? How to read in all tuples into memory?

2012-02-28 Thread Claudio Freire
On Tue, Feb 28, 2012 at 5:48 PM, Stefan Keller sfkel...@gmail.com wrote:
 P.S. And yes, the database is aka 'read-only' and truncated and
 re-populated from scratch every night. fsync is off so I don't care
 about ACID. After the indexes on name, hstore and geometry are
 generated I do a VACUUM FULL FREEZE. The current installation is a
 virtual machine with 4GB memory and the filesystem is read/write.
 The future machine will be a pizza box with 72GB memory.

I don't get this. Something's wrong.

In the OP, you say "There is enough main memory to hold all table
contents." I'm assuming that there you refer to your current system, with
4GB of memory.

So your data is less than 4GB, but then you'll be throwing a 72GB
server at it? It's either tremendous overkill, or your data simply isn't
less than 4GB.

It's quite possible the vacuum full is thrashing your disk cache due
to maintenance_work_mem. You can overcome this issue with the tar
trick, which is more easily performed as:

tar cf /dev/null $PG_DATA/base

tar will read all the table's contents and populate the OS cache. From
there to shared_buffers it should be very very quick. If it is true
that your data fits in 4GB, then that should fix it all. Beware,
whatever you allocate to shared buffers will be redundantly loaded
into RAM, first in shared buffers, then in the OS cache. So your data
has to fit in 4GB - shared buffers.
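A quick way to sanity-check that is to ask the database how big the table
really is, indexes included (osm_point is the table named elsewhere in this
thread):

  SELECT pg_size_pretty(pg_total_relation_size('osm_point'));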

I don't think query-based tricks will load everything into RAM,
because you will get sequential scans and not index scans - the
indices will remain uncached. If you forced an index scan, it would
have to read the whole index in random order (random I/O), and that
would be horribly slow. The best way is to tar the whole database into
/dev/null and be done with it.

Another option is to issue a simple vacuum after the vacuum full.
A simple vacuum will just scan the tables and indices, hopefully doing
nothing since the vacuum full will have cleaned everything already,
but loading everything both into the OS cache and into shared_buffers.



Re: [PERFORM] problems with set_config, work_mem, maintenance_work_mem, and sorting

2012-02-28 Thread Claudio Freire
On Tue, Feb 28, 2012 at 6:54 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 ... which is the ORM library (SQLAlchemy) doing a reflection of the
 table(s) involved.

 Oh, there's an ORM involved?  I'll bet a nickel it's doing something
 surprising, like not issuing your SET until much later than you thought.

I'd rather bet on an auto-rollback at some point within the
transaction that issued the set work_mem. SQLA tends to do that if,
for instance, an exception is raised within a transaction block (e.g. while
flushing).

You can issue the set work_mem in its own transaction, and commit it,
and in that way avoid that rollback.
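Something as simple as this, issued right after connecting, should do (a
sketch; the value is only an example):

  BEGIN;
  SET work_mem = '1GB';
  COMMIT;   -- once committed, a later rollback in the session cannot undo it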



Re: [PERFORM] problems with set_config, work_mem, maintenance_work_mem, and sorting

2012-02-28 Thread Jon Nelson
On Tue, Feb 28, 2012 at 4:46 PM, Claudio Freire klaussfre...@gmail.com wrote:
 On Tue, Feb 28, 2012 at 6:54 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 ... which is the ORM library (SQLAlchemy) doing a reflection of the
 table(s) involved.

 Oh, there's an ORM involved?  I'll bet a nickel it's doing something
 surprising, like not issuing your SET until much later than you thought.

 I'd rather go for an auto-rollback at some point within the
 transaction that issued the set work_mem. SQLA tends to do that if,
 for instance, an exception is risen within a transaction block (ie,
 flushing).

 You can issue the set work_mem in its own transaction, and commit it,
 and in that way avoid that rollback.

I cranked the logging /all/ the way up and isolated the server.
I suspect that your theory is correct.
I'll spend a bit more time investigating.


-- 
Jon



Re: [PERFORM] problems with set_config, work_mem, maintenance_work_mem, and sorting

2012-02-28 Thread Claudio Freire
On Tue, Feb 28, 2012 at 8:00 PM, Jon Nelson jnelson+pg...@jamponi.net wrote:
 I cranked the logging /all/ the way up and isolated the server.
 I suspect that your theory is correct.

Another option, depending on your SQLA version: when connections are
sent back to the pool, I seem to remember they were reset. That would
also reset the work_mem; you'd still see the same pid in the PG logs, but
it's not the same session.



Re: [PERFORM] PG as in-memory db? How to warm up and re-populate buffers? How to read in all tuples into memory?

2012-02-28 Thread Stefan Keller
2012/2/28 Claudio Freire klaussfre...@gmail.com:
 On Tue, Feb 28, 2012 at 5:48 PM, Stefan Keller sfkel...@gmail.com wrote:
 P.S. And yes, the database is aka 'read-only' and truncated and
 re-populated from scratch every night. fsync is off so I don't care
 about ACID. After the indexes on name, hstore and geometry are
 generated I do a VACUUM FULL FREEZE. The current installation is a
 virtual machine with 4GB memory and the filesystem is read/write.
 The future machine will be a pizza box with 72GB memory.

 I don't get this. Something's wrong.

 In the OP, you say There is enough main memory to hold all table
 contents.. I'm assuming, there you refer to your current system, with
 4GB memory.

Sorry for the confusion: I'm doing these tests on this machine with
one table (osm_point) and one country. This table has a size of 2.6GB
and 10 million tuples. The other machine has to deal with at least 5
tables in total and will hold more than one country, plus routing
etc.

-Stefan



Re: [PERFORM] problems with set_config, work_mem, maintenance_work_mem, and sorting

2012-02-28 Thread Ants Aasma
On Feb 29, 2012 1:44 AM, Claudio Freire klaussfre...@gmail.com wrote:
 Another option, depending on your SQLA version, when connections are
 sent back to the pool, I seem to remember they were reset. That would
 also reset the work_mem, you'd still see the same pid on PG logs, but
 it's not the same session.

Except that any open transactions are rolled back, no other reset is done.
The correct way to handle this would be to set the options and commit the
transaction in Pool connect or checkout events. The event choice depends on
whether application-scope or request-scope parameters are wanted.
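A sketch of what that could look like with the SQLAlchemy event API (0.7-era
names; the parameter and value are illustrative, not a drop-in recipe):

  from sqlalchemy import event
  from sqlalchemy.pool import Pool

  @event.listens_for(Pool, "connect")
  def set_session_options(dbapi_conn, conn_record):
      # runs once per new DBAPI connection, before it is handed out
      cur = dbapi_conn.cursor()
      cur.execute("SET work_mem = '1GB'")
      dbapi_conn.commit()   # commit so a later rollback cannot undo the SET
      cur.close()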

--
Ants Aasma