Re: [ADMIN] Large database help

2001-04-26 Thread Ragnar Kjørstad

On Mon, Apr 23, 2001 at 08:15:05PM -0400, [EMAIL PROTECTED] wrote:
 Then, please correct me if I'm wrong: I should be able to test your 
 hypothesis by creating a small DB (of say 2MB) and setting up at least a 
 dozen backends to tag it. If I get the same symptoms w/ respect to disk 
 activity/performance then we could say that the problem is not related to 
 shared memory/the amount of data/swapping.

I'm no expert, but I guess so.

 What log files are output from postgres? I was under the impression that 
 postmaster's stdout/stderr were the only output. I've been postmaster 
 xyz  log.txt'ing it. I'll check into this.

There is a transaction log named pg_log. I'm not sure exactly under what
conditions postgres writes to that file, and what - others will be able
to answer that.


-- 
Ragnar Kjørstad

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [ADMIN] Large database help

2001-04-23 Thread Ragnar Kjørstad

On Sun, Apr 22, 2001 at 06:52:26PM -0400, [EMAIL PROTECTED] wrote:
 I'm spawning 6 backends to query the data. top lists 6 postmaster processes 
 working, and therefore the idle time should hit 0% easily. Also, the hard 
 drive light goes nuts when I'm running this.
 
 Here is the pertinent information from top. To be clear, I'm NOT spawning a 
 new postmaster per chunk. These same six processes are alive and kicking 
 for over 4 minutes.
 
 I hope the formatting works ok.
 
PID USER PRI  NI  SIZE  RSS SHARE STAT  LIB %CPU %MEM   TIME COMMAND
   2379 mg13   0 14476  14M 12908 S   0  5.8  2.7   0:01 postmaster
   2380 mg10   0 14436  14M 12868 S   0  2.9  2.7   0:01 postmaster
   2381 mg10   0 13572  13M 12012 S   0  2.9  2.6   0:00 postmaster
   2377 mg10   0 13640  13M 12072 S   0  2.4  2.6   0:01 postmaster
   2378 mg11   0 14476  14M 12908 S   0  2.4  2.7   0:01 postmaster
   2376 mg 8   0 13556  13M 11984 S   0  1.9  2.6   0:00 postmaster

The processes should have ~ 400M shared memory, not ~ 10M, right?
stracing (or maybe just running with debugging enabled) should tell you
what went wrong. My first guess is that the OS is not able to provide a
single 400M block of memory, and postgres falls back to some lower
setting.

You could also verify this by writing a small program that just
allocates different sizes of shared memory, and see what the biggest
size you can allocate is.



-- 
Ragnar Kjørstad

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [ADMIN] Large database help

2001-04-23 Thread xbdelacour

Below are snippets from:

strace postmaster -i -D `pwd` -B 48000

A large snippet:
.
.
.
socket(PF_UNIX, SOCK_STREAM, 0) = 4
bind(4, {sun_family=AF_UNIX, sun_path=/tmp/.s.PGSQL.5432}, 20) = 0
listen(4, 128)  = 0
chmod(/tmp/.s.PGSQL.5432, 0777)   = 0
shmget(5432010, 144, IPC_CREAT|0x1c0|0600) = 2944
shmat(2944, 0, 0)   = 0x40014000
shmget(5432001, 400385024, IPC_CREAT|0x180|0600) = 2945
shmget(5432001, 400385024, 0)   = 2945
shmat(2945, 0, 0)   = 0x40176000
brk(0x81b7000)  = 0x81b7000
mmap(0, 192512, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 
0x57f4d000
brk(0x81c3000)  = 0x81c3000
mmap(0, 577536, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 
0x57f7c000
mmap(0, 3076096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 
0x58009000
brk(0x81cf000)  = 0x81cf000
semget(5432014, 0, 0)   = -1 ENOENT (No such file or directory)
semget(5432014, 16, IPC_CREAT|0x180|0600) = 1792
semctl(1792, 0, SETALL, 0xb750) = 0
semget(5432015, 0, 0)   = -1 ENOENT (No such file or directory)
semget(5432015, 16, IPC_CREAT|0x180|0600) = 1793
semctl(1793, 0, SETALL, 0xb750) = 0
shmget(5432007, 0, 0)   = -1 ENOENT (No such file or directory)
shmget(5432007, 66060, IPC_CREAT|0x180|0600) = 2946
shmat(2946, 0, 0)   = 0x582f8000
getpid()= 2909
open(/home/mg/pgsql/postmaster.pid, O_RDWR|O_CREAT|O_EXCL, 0600) = 5
write(5, 2909, 4) = 4
close(5)= 0
open(/home/mg/pgsql/postmaster.opts, O_RDWR|O_CREAT|O_TRUNC, 0600) = 5
.
.
.

A smaller snippet:

shmget(5432001, 400385024, IPC_CREAT|0x180|0600) = 2945
shmget(5432001, 400385024, 0)   = 2945
shmat(2945, 0, 0)   = 0x40176000

I'm no Unix expert, but this would seem to indicate that shmget is 
successfully allocating 400385024/1024/1024=381MB of shared memory. I don't 
know enough about how the postgres parent/child/shmem scheme works to know 
why this is working yet the children only register 12MB of shared memory 
under top.

I do need to get this to work however, and firing up a debugger is becoming 
my only option rather quickly..

-Xavier

At 06:15 PM 4/23/01 +0200, Ragnar Kjørstad wrote:
On Sun, Apr 22, 2001 at 06:52:26PM -0400, [EMAIL PROTECTED] wrote:
  I'm spawning 6 backends to query the data. top lists 6 postmaster 
 processes
  working, and therefore the idle time should hit 0% easily. Also, the hard
  drive light goes nuts when I'm running this.
 
  Here is the pertinent information from top. To be clear, I'm NOT 
 spawning a
  new postmaster per chunk. These same six processes are alive and kicking
  for over 4 minutes.
 
  I hope the formatting works ok.
 
 PID USER PRI  NI  SIZE  RSS SHARE STAT  LIB %CPU %MEM   TIME COMMAND
2379 mg13   0 14476  14M 12908 S   0  5.8  2.7   0:01 
 postmaster
2380 mg10   0 14436  14M 12868 S   0  2.9  2.7   0:01 
 postmaster
2381 mg10   0 13572  13M 12012 S   0  2.9  2.6   0:00 
 postmaster
2377 mg10   0 13640  13M 12072 S   0  2.4  2.6   0:01 
 postmaster
2378 mg11   0 14476  14M 12908 S   0  2.4  2.7   0:01 
 postmaster
2376 mg 8   0 13556  13M 11984 S   0  1.9  2.6   0:00 
 postmaster

The processes should have ~ 400M shared memory, not ~ 10M, right?
stracing (or maybe just running with debugging enabled) should tell you
what went wrong. My first guess is that the OS is not able to provide a
single 400M block of memory, and postgres falls back to some lower
setting.

You could also verify this by writing a small program that just
allocates different sizes of shared memory, and see what the biggest
size you can allocate is.



--
Ragnar Kjørstad


_
Do You Yahoo!?
Get your free @yahoo.com address at http://mail.yahoo.com


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [ADMIN] Large database help

2001-04-23 Thread xbdelacour

I'm trying to figure that out myself :-)

According to the strace info I sent in my last message, it is in fact 
creating a 381MB shmem block.. which makes no sense, I agree.

-Xavier

At 01:07 AM 4/23/01 -0400, Tom Lane wrote:
  27 processes: 24 sleeping, 3 running, 0 zombie, 0 stopped
  CPU states: 16.3% user,  3.8% system,  0.0% nice, 79.8% idle
  Mem:  517292K av, 508400K used,   8892K free,  9K shrd, 197224K buff
  Swap:  65988K av,  0K used,  65988K free160740K cached

These numbers don't add up.  If there's a 384M shared-memory block
in the system, how can there be 197M of kernel disk buffers (not to
mention the kernel and user programs)?

 regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


_
Do You Yahoo!?
Get your free @yahoo.com address at http://mail.yahoo.com


---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://www.postgresql.org/search.mpl



Re: [ADMIN] Large database help

2001-04-23 Thread Ragnar Kjørstad

On Mon, Apr 23, 2001 at 06:13:38PM -0400, [EMAIL PROTECTED] wrote:
 A smaller snippet:
 
 shmget(5432001, 400385024, IPC_CREAT|0x180|0600) = 2945
 shmget(5432001, 400385024, 0)   = 2945
 shmat(2945, 0, 0)   = 0x40176000
 
 I'm no Unix expert, but this would seem to indicate that shmget is 
 successfully allocating 400385024/1024/1024=381MB of shared memory. I don't 
 know enough about how the postgres parent/child/shmem scheme works to know 
 why this is working yet the children only register 12MB of shared memory 
 under top.

I believe you're right that it allocates the memory (succesfully).

The reason top only show 12 MB shared memory may be that the process
haven't actually used the whole segment yet. (linux only allocates the
memory when it's first written to)

I suppose it could be that postgres doesn't write to the memory-segment,
because it doesn't need it. For read access to the file, the OS wil
cache the data anyway, and there is no need to use process memory to
access the file-data. 

Maybe the whole shared-memory issue was a blind track - it should not be
related to disk activety in this case?

Note: some disk activety should be expected. Maybe postgresql updates
the log? Or at the very least it will update the atime timestamps for
the files everytime they're read. This shouldn't cause enough disk
activity to become a performance-problem, but if I remember your initial
post correctly, you indicated that one processor was fully saturated.

Maybe the problem is in fact related to locking and smp, and not related
to shared-memory and disk activity?



-- 
Ragnar Kjørstad

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [ADMIN] Large database help

2001-04-23 Thread Tom Lane

[EMAIL PROTECTED] writes:
 I'm no Unix expert, but this would seem to indicate that shmget is 
 successfully allocating 400385024/1024/1024=381MB of shared memory. I don't 
 know enough about how the postgres parent/child/shmem scheme works to know 
 why this is working yet the children only register 12MB of shared memory 
 under top.

On most of the systems I've worked on, top does not seem to count shmem
blocks that a process is attached to in the process' memory usage.  So
that doesn't prove much one way or the other.

I am wondering if your version of 'top' fails to count swapped-out shmem
segments against swap space, or something like that.  That'd be a tad
weird, but it seems very improbable that your machine is not swapping;
I just do not believe top's claim that no swapping is happening.

Anyway, the most direct experiment would be to reduce your -B request to
100MB or so and see how things change...

regards, tom lane

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [ADMIN] Large database help

2001-04-23 Thread xbdelacour

By my reading, the machine is definitely swapping, and not writing to a log 
file (unless its writing obscene amounts of data to the log, which 
presumably the default settings won't do).

postmaster -i -D /home/mg/pgsql -B 100

produces almost identical results in terms of performance and disk 
activity. top shows that each child only has 2.2MB shared instead of 12MB.

What other program/means do I have to tell if the machine is swapping? Can 
I get a reading as to the amount of data that is actually in physical 
memory within a process?

-Xavier

At 08:24 PM 4/23/01 -0400, Tom Lane wrote:
[EMAIL PROTECTED] writes:
  I'm no Unix expert, but this would seem to indicate that shmget is
  successfully allocating 400385024/1024/1024=381MB of shared memory. I 
 don't
  know enough about how the postgres parent/child/shmem scheme works to know
  why this is working yet the children only register 12MB of shared memory
  under top.

On most of the systems I've worked on, top does not seem to count shmem
blocks that a process is attached to in the process' memory usage.  So
that doesn't prove much one way or the other.

I am wondering if your version of 'top' fails to count swapped-out shmem
segments against swap space, or something like that.  That'd be a tad
weird, but it seems very improbable that your machine is not swapping;
I just do not believe top's claim that no swapping is happening.

Anyway, the most direct experiment would be to reduce your -B request to
100MB or so and see how things change...

 regards, tom lane


_
Do You Yahoo!?
Get your free @yahoo.com address at http://mail.yahoo.com


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [ADMIN] Large database help

2001-04-22 Thread Tom Lane

[EMAIL PROTECTED] writes:
 Hi everyone, I'm more or less new to PostgreSQL and am trying to setup a 
 rather large database for a data analysis application. Data is collected 
 and dropped into a single table, which will become ~20GB. Analysis happens 
 on a Windows client (over a network) that queries the data in chunks across 
 parallel connections. I'm running the DB on a dual gig p3 w/ 512 memory 
 under Redhat 6 (.0 I think).

 I am setting 'echo 402653184 /proc/sys/kernel/shmmax', which is being 
 reflected in top. I also specify '-B 48000' when starting postmaster.

Hm.  384M shared memory request on a 512M machine.  I'll bet that the
kernel is deciding you don't need all that stuff in RAM, and is swapping
out chunks of the shared memory region to make room for processes and
its own disk buffering activity.  Try a more reasonable -B setting, like
maybe a quarter of your physical RAM, max.  There's no percentage in -B
large enough to risk getting swapped.  Moreover, any physical RAM that
does happen to be free will be exploited by the kernel for disk
buffering at its level, so you aren't really saving any I/O by
increasing Postgres' internal buffering.

BTW, what Postgres version are you using?

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [ADMIN] Large database help

2001-04-22 Thread Ragnar Kjørstad

On Sun, Apr 22, 2001 at 05:12:20PM -0400, [EMAIL PROTECTED] wrote:
 My problem is this: during the query process the hard drive is being tagged 
 excessively, while the cpu's are idling at 50% (numbers from Linux command: 
 top), and this is bringing down the speed pretty dramatically since the 
 process is waiting on the hard disk. How do I get the database to be 
 completely resident in memory such that selects don't cause hdd activity? 
 How do I pin how exactly why the hard disk is being accessed?

50% idle on a 2 CPU system mean one CPU is busy and one is idle - as
expected if you run a single CPU-intensive job.

 I am setting 'echo 402653184 /proc/sys/kernel/shmmax', which is being 
 reflected in top. I also specify '-B 48000' when starting postmaster. My 
 test DB is only 86MB, so in theory the disk has no business being active 
 once the data is read into memory unless I perform a write operation.. What 
 am I missing?

What does top say?
How much memory do each of your postgres processes take, and how much of
it is shared?
Is there any free memory?


-- 
Ragnar Kjrstad

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [ADMIN] Large database help

2001-04-22 Thread xbdelacour

I'm spawning 6 backends to query the data. top lists 6 postmaster processes 
working, and therefore the idle time should hit 0% easily. Also, the hard 
drive light goes nuts when I'm running this.

Here is the pertinent information from top. To be clear, I'm NOT spawning a 
new postmaster per chunk. These same six processes are alive and kicking 
for over 4 minutes.

I hope the formatting works ok.

   PID USER PRI  NI  SIZE  RSS SHARE STAT  LIB %CPU %MEM   TIME COMMAND
  2379 mg13   0 14476  14M 12908 S   0  5.8  2.7   0:01 postmaster
  2380 mg10   0 14436  14M 12868 S   0  2.9  2.7   0:01 postmaster
  2381 mg10   0 13572  13M 12012 S   0  2.9  2.6   0:00 postmaster
  2377 mg10   0 13640  13M 12072 S   0  2.4  2.6   0:01 postmaster
  2378 mg11   0 14476  14M 12908 S   0  2.4  2.7   0:01 postmaster
  2376 mg 8   0 13556  13M 11984 S   0  1.9  2.6   0:00 postmaster

27 processes: 24 sleeping, 3 running, 0 zombie, 0 stopped
CPU states: 16.3% user,  3.8% system,  0.0% nice, 79.8% idle
Mem:  517292K av, 508400K used,   8892K free,  9K shrd, 197224K buff
Swap:  65988K av,  0K used,  65988K free160740K cached

-Xavier

At 12:46 AM 4/23/01 +0200, Ragnar Kjrstad wrote:
On Sun, Apr 22, 2001 at 05:12:20PM -0400, [EMAIL PROTECTED] wrote:
  My problem is this: during the query process the hard drive is being 
 tagged
  excessively, while the cpu's are idling at 50% (numbers from Linux 
 command:
  top), and this is bringing down the speed pretty dramatically since the
  process is waiting on the hard disk. How do I get the database to be
  completely resident in memory such that selects don't cause hdd activity?
  How do I pin how exactly why the hard disk is being accessed?

50% idle on a 2 CPU system mean one CPU is busy and one is idle - as
expected if you run a single CPU-intensive job.

  I am setting 'echo 402653184 /proc/sys/kernel/shmmax', which is being
  reflected in top. I also specify '-B 48000' when starting postmaster. My
  test DB is only 86MB, so in theory the disk has no business being active
  once the data is read into memory unless I perform a write operation.. 
 What
  am I missing?

What does top say?
How much memory do each of your postgres processes take, and how much of
it is shared?
Is there any free memory?


--
Ragnar Kjrstad


_
Do You Yahoo!?
Get your free @yahoo.com address at http://mail.yahoo.com


---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://www.postgresql.org/search.mpl



Re: [ADMIN] Large database help

2001-04-22 Thread Bruce Momjian

 [EMAIL PROTECTED] writes:
  Hi everyone, I'm more or less new to PostgreSQL and am trying to setup a 
  rather large database for a data analysis application. Data is collected 
  and dropped into a single table, which will become ~20GB. Analysis happens 
  on a Windows client (over a network) that queries the data in chunks across 
  parallel connections. I'm running the DB on a dual gig p3 w/ 512 memory 
  under Redhat 6 (.0 I think).
 
  I am setting 'echo 402653184 /proc/sys/kernel/shmmax', which is being 
  reflected in top. I also specify '-B 48000' when starting postmaster.
 
 Hm.  384M shared memory request on a 512M machine.  I'll bet that the
 kernel is deciding you don't need all that stuff in RAM, and is swapping
 out chunks of the shared memory region to make room for processes and
 its own disk buffering activity.  Try a more reasonable -B setting, like
 maybe a quarter of your physical RAM, max.  There's no percentage in -B
 large enough to risk getting swapped.  Moreover, any physical RAM that
 does happen to be free will be exploited by the kernel for disk
 buffering at its level, so you aren't really saving any I/O by
 increasing Postgres' internal buffering.

FYI, I will complete a PostgreSQL performance tuning article this week
that will appear on the PostgreSQL web site.  It will talk about the
problems if making -B too high, causing pages to be swapped out.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026

---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://www.postgresql.org/search.mpl



Re: [ADMIN] Large database help

2001-04-22 Thread Bruce Momjian

 27 processes: 24 sleeping, 3 running, 0 zombie, 0 stopped
 CPU states: 16.3% user,  3.8% system,  0.0% nice, 79.8% idle
 Mem:  517292K av, 508400K used,   8892K free,  9K shrd, 197224K buff
 Swap:  65988K av,  0K used,  65988K free160740K cached
 

I see zero swap used, meaning -B is not too large.  I was wrong.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [ADMIN] Large database help

2001-04-22 Thread Tom Lane

 27 processes: 24 sleeping, 3 running, 0 zombie, 0 stopped
 CPU states: 16.3% user,  3.8% system,  0.0% nice, 79.8% idle
 Mem:  517292K av, 508400K used,   8892K free,  9K shrd, 197224K buff
 Swap:  65988K av,  0K used,  65988K free160740K cached

These numbers don't add up.  If there's a 384M shared-memory block
in the system, how can there be 197M of kernel disk buffers (not to
mention the kernel and user programs)?

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]