Re: [GENERAL] My Experiment of PG crash when dealing with huge amount of data

2013-09-06 Thread 高健
Hello:

Sorry to disturb you again.
Some of my friends told me about cgroups, so I tried that first.
I found that cgroups works for a task such as wget,
but it does not work for my postgres process.

[root@cent6 Desktop]# cat /etc/cgconfig.conf
#
#  Copyright IBM Corporation. 2007
#
#  Authors: Balbir Singh bal...@linux.vnet.ibm.com
#  This program is free software; you can redistribute it and/or modify it
#  under the terms of version 2.1 of the GNU Lesser General Public License
#  as published by the Free Software Foundation.
#
#  This program is distributed in the hope that it would be useful, but
#  WITHOUT ANY WARRANTY; without even the implied warranty of
#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
#
# See man cgconfig.conf for further details.
#
# By default, mount all controllers to /cgroup/controller

mount {
cpuset = /cgroup/cpuset;
cpu = /cgroup/cpu;
cpuacct = /cgroup/cpuacct;
memory = /cgroup/memory;
devices = /cgroup/devices;
freezer = /cgroup/freezer;
net_cls = /cgroup/net_cls;
blkio = /cgroup/blkio;
}

group test1 {
    perm {
        task {
            uid = postgres;
            gid = postgres;
        }

        admin {
            uid = root;
            gid = root;
        }
    }

    memory {
        memory.limit_in_bytes = 500M;
    }
}

[root@cent6 Desktop]#

[root@cent6 Desktop]# service cgconfig status
Running
[root@cent6 Desktop]#

When I start postgres and run the same SQL statement as before, it still
consumes too much memory, as if cgroups were not working.
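For reference, below is a minimal sketch of what could be checked here, assuming
the mount points from the cgconfig.conf above and the data directory /gao/data
seen later in this thread. Note that cgconfig.conf only creates the group and
sets its permissions; running processes still have to be placed into it, for
example with cgclassify or cgexec from libcgroup, or automatically via
/etc/cgrules.conf and the cgred service.

# 1) Which cgroup is the postmaster actually in? (PID taken from postmaster.pid)
cat /proc/$(head -1 /gao/data/postmaster.pid)/cgroup

# 2) Is anything attached to the test1 group at all?
cat /cgroup/memory/test1/tasks

# 3) Move the running postmaster into the group; backends forked afterwards
#    inherit it, but backends already running would need to be moved as well.
cgclassify -g memory:test1 $(head -1 /gao/data/postmaster.pid)

# 4) Or start the server inside the group from the beginning.
cgexec -g memory:test1 /usr/local/pgsql/bin/postgres -D /gao/data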

Best Regards


2013/9/3 高健 luckyjack...@gmail.com

 Thanks, I'll consider it carefully.

 Best Regards

 2013/9/3 Jeff Janes jeff.ja...@gmail.com

 On Sun, Sep 1, 2013 at 6:25 PM, 高健 luckyjack...@gmail.com wrote:
 To spare memory, you would want to use something like:
 
 insert into test01 select generate_series,
 repeat(chr(int4(random()*26)+65),1024) from
 generate_series(1,2457600);
 
  Thanks a lot!
 
  What I am worrying about is that:
  If data grows rapidly, maybe our customer will use too much memory ,


 The size of the data has little to do with it.  Take your example as
 an example.  The database could have been nearly empty before you
 started running that query.  A hostile or adventurous user can craft
 queries that will exhaust the server's memory without ever needing any
 particular amount of data in data_directory, except maybe in the temp
 tablespace.

 So it is a matter of what kind of users you have, not how much data
 you anticipate having on disk.

 The parts of PostgreSQL that might blow up memory based on ordinary
 disk-based tables are pretty well protected by shared_buffers,
 temp_buffers, work_mem, maintenance_work_mem, etc. already.  It is the
 things that don't directly map to data already on disk which are
 probably more vulnerable.

  Is
  ulimit  command a good idea for PG?

 I've used ulimit -v on a test server (which was intentionally used to
 test things to limits of destruction), and was happy with the results.
  It seemed like it would error out the offending process, or just the
 offending statement, in a graceful way; rather than having random
 processes other than the culprit be brutally killed by OOM, or having
 the machine just swap itself into uselessness.   I'd be reluctant to
 use it on production just on spec that something bad *might* happen
 without it, but if I started experiencing problems caused by a single
 rogue process using outrageous amounts of memory, that would be one of
 my first stops.

 Experimentally, shared memory does count against the -v limit, and the
 limit has to be set rather higher than shared_buffers, or else your
 database won't even start.

 Cheers,

 Jeff





Re: [GENERAL] My Experiment of PG crash when dealing with huge amount of data

2013-09-02 Thread Jeff Janes
On Sun, Sep 1, 2013 at 6:25 PM, 高健 luckyjack...@gmail.com wrote:
To spare memory, you would want to use something like:

insert into test01 select generate_series,
repeat(chr(int4(random()*26)+65),1024) from
generate_series(1,2457600);

 Thanks a lot!

 What I am worrying about is that:
 If data grows rapidly, maybe our customer will use too much memory ,


The size of the data has little to do with it.  Take your example as
an example.  The database could have been nearly empty before you
started running that query.  A hostile or adventurous user can craft
queries that will exhaust the server's memory without ever needing any
particular amount of data in data_directory, except maybe in the temp
tablespace.

So it is a matter of what kind of users you have, not how much data
you anticipate having on disk.

The parts of PostgreSQL that might blow up memory based on ordinary
disk-based tables are pretty well protected by shared_buffers,
temp_buffers, work_mem, maintenance_work_mem, etc. already.  It is the
things that don't directly map to data already on disk which are
probably more vulnerable.

 Is
 ulimit  command a good idea for PG?

I've used ulimit -v on a test server (which was intentionally used to
test things to limits of destruction), and was happy with the results.
 It seemed like it would error out the offending process, or just the
offending statement, in a graceful way; rather than having random
processes other than the culprit be brutally killed by OOM, or having
the machine just swap itself into uselessness.   I'd be reluctant to
use it on production just on spec that something bad *might* happen
without it, but if I started experiencing problems caused by a single
rogue process using outrageous amounts of memory, that would be one of
my first stops.

Experimentally, shared memory does count against the -v limit, and the
limit has to be set rather higher than shared_buffers, or else your
database won't even start.
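(A hedged illustration of the above, not a quote from this thread: the limit
would typically be set in the shell that starts the postmaster, since every
backend inherits it. bash's ulimit -v takes kilobytes; the figure below is
purely an example and has to leave room for shared_buffers plus normal
per-backend overhead, and the pg_ctl path is assumed from the install shown
elsewhere in the thread.)

ulimit -v 512000                                 # ~500MB per process, illustrative
/usr/local/pgsql/bin/pg_ctl -D /gao/data start   # backends inherit the limit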

Cheers,

Jeff




Re: [GENERAL] My Experiment of PG crash when dealing with huge amount of data

2013-09-02 Thread 高健
Thanks, I'll consider it carefully.

Best Regards

2013/9/3 Jeff Janes jeff.ja...@gmail.com

 On Sun, Sep 1, 2013 at 6:25 PM, 高健 luckyjack...@gmail.com wrote:
 To spare memory, you would want to use something like:
 
 insert into test01 select generate_series,
 repeat(chr(int4(random()*26)+65),1024) from
 generate_series(1,2457600);
 
  Thanks a lot!
 
  What I am worrying about is that:
  If data grows rapidly, maybe our customer will use too much memory ,


 The size of the data has little to do with it.  Take your example as
 an example.  The database could have been nearly empty before you
 started running that query.  A hostile or adventurous user can craft
 queries that will exhaust the server's memory without ever needing any
 particular amount of data in data_directory, except maybe in the temp
 tablespace.

 So it is a matter of what kind of users you have, not how much data
 you anticipate having on disk.

 The parts of PostgreSQL that might blow up memory based on ordinary
 disk-based tables are pretty well protected by shared_buffers,
 temp_buffers, work_mem, maintenance_work_mem, etc. already.  It is the
 things that don't directly map to data already on disk which are
 probably more vulnerable.

  Is
  ulimit  command a good idea for PG?

 I've used ulimit -v on a test server (which was intentionally used to
 test things to limits of destruction), and was happy with the results.
  It seemed like it would error out the offending process, or just the
 offending statement, in a graceful way; rather than having random
 processes other than the culprit be brutally killed by OOM, or having
 the machine just swap itself into uselessness.   I'd be reluctant to
 use it on production just on spec that something bad *might* happen
 without it, but if I started experiencing problems caused by a single
 rogue process using outrageous amounts of memory, that would be one of
 my first stops.

 Experimentally, shared memory does count against the -v limit, and the
 limit has to be set rather higher than shared_buffers, or else your
 database won't even start.

 Cheers,

 Jeff



Re: [GENERAL] My Experiment of PG crash when dealing with huge amount of data

2013-09-01 Thread 高健
To spare memory, you would want to use something like:

insert into test01 select generate_series,
repeat(chr(int4(random()*26)+65),1024) from
generate_series(1,2457600);

Thanks a lot!

What I am worrying about is this:
If the data grows rapidly, our customers may use too much memory. Is the
ulimit command a good idea for PG?

Best Regards



2013/9/1 Jeff Janes jeff.ja...@gmail.com

 On Fri, Aug 30, 2013 at 2:10 AM, 高健 luckyjack...@gmail.com wrote:
 
 
  postgres=# insert into test01 values(generate_series(1,2457600),repeat(
  chr(int4(random()*26)+65),1024));

 The construct values (srf1,srf2) will generate its entire result set
 in memory up front, it will not stream its results to the insert
 statement on the fly.

 To spare memory, you would want to use something like:

 insert into test01 select generate_series,
 repeat(chr(int4(random()*26)+65),1024) from
 generate_series(1,2457600);

 Cheers,

 Jeff



Re: [GENERAL] My Experiment of PG crash when dealing with huge amount of data

2013-09-01 Thread Tom Lane
高健 luckyjack...@gmail.com writes:
 If the data grows rapidly, our customers may use too much memory. Is the
 ulimit command a good idea for PG?

There's no received wisdom saying that it is.  There's a fairly widespread
consensus that disabling OOM kill can be a good idea, but I don't recall
that many people have tried setting specific ulimits on server processes.

Keep in mind that exceeding a ulimit would cause queries to fail outright
(whether the server was under much load or not), versus just getting
slower if the server starts to swap under too much load.  I can imagine
situations where that would be considered a good tradeoff, but it's hardly
right for everyone.

regards, tom lane




Re: [GENERAL] My Experiment of PG crash when dealing with huge amount of data

2013-08-31 Thread Jeff Janes
On Fri, Aug 30, 2013 at 2:10 AM, 高健 luckyjack...@gmail.com wrote:


 postgres=# insert into test01 values(generate_series(1,2457600),repeat(
 chr(int4(random()*26)+65),1024));

The construct values (srf1,srf2) will generate its entire result set
in memory up front, it will not stream its results to the insert
statement on the fly.

To spare memory, you would want to use something like:

insert into test01 select generate_series,
repeat(chr(int4(random()*26)+65),1024) from
generate_series(1,2457600);
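(A hedged way to confirm the difference from another shell on the server:
watch the resident size of the backend running the INSERT. The pgrep pattern
matches the process title shown in the ps output elsewhere in this thread;
adjust it to your own database and user names.)

watch -n 2 'ps -o pid,rss,vsz,cmd -p $(pgrep -n -f "postgres: postgres postgres")'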

Cheers,

Jeff




Re: [GENERAL] My Experiment of PG crash when dealing with huge amount of data

2013-08-30 Thread hxreno1

This is probably the operating system's OOM killer killing the PG process; check syslog.
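(A hedged one-liner for that check; on this CentOS 6 box the system log is
/var/log/messages, while Debian-family systems use /var/log/syslog.)

grep -i -E "out of memory|oom-killer" /var/log/messages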

On Fri 30 Aug 2013 05:10:42 PM CST, 高健 wrote:

Hello:

I have done the following experiment to test:

PG's behavior when dealing with data that is larger than the total memory
of the whole OS.

The result is:
PG says:

WARNING:  terminating connection because of crash of another server
process
DETAIL:  The postmaster has commanded this server process to roll back
the current transaction and exit, because another server process
exited abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and
repeat your command.
The connection to the server was lost. Attempting reset: Failed.
!
-

In log, I can see the following:
LOG:  background writer process (PID 3221) was terminated by signal 9:
Killed

But why did it need to kill the bgwriter?
And will PG use as much memory as it can to finish a SQL DML statement,
until it has consumed almost all resources and then crashes?

I used the default checkpoint_segments (3) and shared_buffers (32MB)
settings, and my machine has only 1024MB of memory.
The steps were as follows:
1) Create a table in which each tuple is more than 1KB:

postgres=# create table test01(id integer, val char(1024));

2) Insert 2,457,600 records into the table, for a total of more than
2400MB:

postgres=# insert into test01
values(generate_series(1,2457600),repeat(
chr(int4(random()*26)+65),1024));

It took a few minutes. While the SQL statement was running, I could see
the server process consuming 80% of the total memory of the OS.

--

[root@server ~]# ps aux | grep post

root      3180  0.0  0.0 105296   712 pts/1    S    16:31   0:00 su - postgres

postgres  3181  0.0  0.0  70304   676 pts/1    S+   16:31   0:00 -bash

postgres  3219  0.0  0.2 113644  2864 pts/1    S    16:32   0:00 /usr/local/pgsql/bin/postgres -D /gao/data

postgres  3221  0.4  3.0 113724 35252 ?        Ss   16:32   0:01 postgres: writer process

postgres  3222  0.2  0.1 113644  1616 ?        Ds   16:32   0:00 postgres: wal writer process

postgres  3223  0.0  0.0 114380  1148 ?        Ss   16:32   0:00 postgres: autovacuum launcher process

postgres  3224  0.0  0.0  73332   472 ?        Ss   16:32   0:00 postgres: stats collector process

root      3252  0.0  0.0 105296   712 pts/2    S    16:32   0:00 su - postgres

postgres  3253  0.0  0.0  70304   676 pts/2    S    16:32   0:00 -bash

postgres  3285  0.0  0.0  83488   740 pts/2    S+   16:32   0:00 ./psql

postgres  3286 14.8 80.2 2598332 924308 ?      Ds   16:32   0:35 postgres: postgres postgres [local] INSERT

root            0.0  0.0  65424   812 pts/3    S+   16:36   0:00 grep post



After a while, I found the following information in PG's log:


LOG:  autovacuum launcher started
LOG:  database system is ready to accept connections
LOG:  checkpoints are occurring too frequently (2 seconds apart)
HINT:  Consider increasing the configuration parameter checkpoint_segments.
LOG:  checkpoints are occurring too frequently (1 second apart)
HINT:  Consider increasing the configuration parameter checkpoint_segments.
LOG:  checkpoints are occurring too frequently (2 seconds apart)
HINT:  Consider increasing the configuration parameter checkpoint_segments.
LOG:  checkpoints are occurring too frequently (1 second apart)
HINT:  Consider increasing the configuration parameter checkpoint_segments.
LOG:  checkpoints are occurring too frequently (2 seconds apart)
HINT:  Consider increasing the configuration parameter checkpoint_segments.
LOG:  checkpoints are occurring too frequently (1 second apart)
HINT:  Consider increasing the configuration parameter checkpoint_segments.
LOG:  checkpoints are occurring too frequently (2 seconds apart)
HINT:  Consider increasing the configuration parameter checkpoint_segments.
LOG:  checkpoints are occurring too frequently (4 seconds apart)
HINT:  Consider increasing the configuration parameter checkpoint_segments.
LOG:  checkpoints are occurring too frequently (10 seconds apart)
HINT:  Consider increasing the configuration parameter checkpoint_segments.
LOG:  checkpoints are occurring too frequently (9 seconds apart)
HINT:  Consider increasing the configuration parameter checkpoint_segments.
LOG:  checkpoints are occurring too frequently (7 seconds apart)
HINT:  Consider increasing the configuration parameter checkpoint_segments.
LOG:  checkpoints are occurring too frequently (9 seconds apart)
HINT:  Consider increasing the configuration parameter checkpoint_segments.
LOG:  checkpoints are occurring too frequently (7 seconds apart)
HINT:  Consider increasing the configuration parameter checkpoint_segments.
LOG:  checkpoints are 

Re: [GENERAL] My Experiment of PG crash when dealing with huge amount of data

2013-08-30 Thread Michael Paquier
On Fri, Aug 30, 2013 at 6:10 PM, 高健 luckyjack...@gmail.com wrote:
 In log, I can see the following:
 LOG:  background writer process (PID 3221) was terminated by signal 9:
 Killed
Assuming that no users on your server manually killed this process, or
that no maintenance task you implemented did that, this looks like the
Linux OOM killer because of a memory overcommit. Have a look here for
more details:
http://www.postgresql.org/docs/current/static/kernel-resources.html#LINUX-MEMORY-OVERCOMMIT
So have a look at dmesg to confirm that, then you could use one of the
strategies described in the docs. Also, as you have been doing a bulk
INSERT, you should temporarily increase checkpoint_segments to reduce the
pressure on the background writer by reducing the number of checkpoints
happening. This will also make your data load faster.
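(A hedged sketch of both suggestions; the grep pattern, the example value 32,
and the pg_ctl path are illustrative only.)

# 1) Confirm the OOM killer fired (exact wording varies by kernel version):
dmesg | grep -i -E "out of memory|killed process"

# 2) Temporarily raise checkpoint_segments in postgresql.conf for the load,
#    e.g. checkpoint_segments = 32, then reload:
/usr/local/pgsql/bin/pg_ctl -D /gao/data reload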
-- 
Michael

