Re: [GENERAL] Reducing memory usage of insert into select operations? [Solved]

2008-07-19 Thread Francisco Reyes

Martijn van Oosterhout wrote:

Can you make them not deferred?

How?


I found the issue.
I had the foreign key in the master table instead of the children.
Deleted RI from the master table and put it into the inherited partitions.
My whole 230 million rows merged in about an hour!
And I even had two of those running at the same time (one setup with 14
partitions per month and another with 5 partitions per month, to test the
difference in performance).
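
In SQL terms the change was roughly the following, using the disguised table
names posted further down in the thread (the constraint names and the child
partition name here are made up):

  -- drop the FK from the partitioned master table ...
  ALTER TABLE customer_transactions
      DROP CONSTRAINT customer_transactions_ids_fkey;

  -- ... and declare it on each inherited partition instead
  ALTER TABLE customer_transactions_200701_r01
      ADD CONSTRAINT customer_transactions_200701_r01_ids_fkey
      FOREIGN KEY (ids) REFERENCES customer_ids (ids);
  -- repeat the ADD CONSTRAINT for every partition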


It was so fast I even had to do a count(*) to make sure both actually 
merged.

That is 117K rows per second for rows that were about 33 bytes long.
That only comes down to about 3 MB/sec+overhead, but still 117K rows/sec 
is not too shabby.


In case it is of interest to anyone..
2 AMD dual core, 2GHz CPUs
12GB of RAM
shared_buffers 3GB
work_mem 64MB
checkpoint_segments 256
checkpoint_timeout 10 min
LSI controller with 128MB cache with BBU. Write cache enabled.


Many thanks to all that offered suggestions in the troubleshooting.



Re: [GENERAL] Reducing memory usage of insert into select operations? [Solved]

2008-07-19 Thread Alvaro Herrera
Francisco Reyes wrote:

 I had the foreign key in the master table instead of the children.
 Deleted RI from master table and put into the inherited partitions.
 My whole 230 million rows merged in about an hour!

Heh -- but are the FKs now checked?  Try inserting something that
violates the constraints and see if they are rejected.
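
For example, something along these lines (column values are placeholders; the
ids value just needs to be one that does not exist in customer_ids) should now
fail with a foreign key violation if the constraints really are in place:

  INSERT INTO customer_transactions (record_id, date, type, amount, ids, groupid)
  VALUES (1, '2007-01-15', 'X', 0, -1, 1);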

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [GENERAL] Reducing memory usage of insert into select operations? [Solved]

2008-07-19 Thread Francisco Reyes

Alvaro Herrera writes:


Heh -- but are the FKs now checked?  Try inserting something that
violates the constraints and see if they are rejected.


I knew it sounded too good to be true.
1- The trigger was not set in the master (i.e. nothing went to the children).
2- The master had no index and no RI, so it was a straight insert.

I corrected that (i.e. set the trigger in the master and RI in the children). It has
been running for 10 hours and has not finished.


The good news is that memory doesn't seem to be going up.
I will give it till tomorrow AM, and if it hasn't finished I will turn off the
foreign keys in the children. I already modified the scripts so I can easily
build/drop the foreign keys as needed.




Re: [GENERAL] Reducing memory usage of insert into select operations? [Solved]

2008-07-19 Thread Alvaro Herrera
Francisco Reyes wrote:

 I knew it sounded too good to be true.
 1- The trigger was not set in the master (i.e. nothing went to the children).
 2- The master had no index and no RI, so it was a straight insert.

 I corrected that (i.e. set the trigger in the master and RI in the children).
 It has been running for 10 hours and has not finished.

FWIW it tends to be faster to do the bulk load first and add the
indexes and constraints later.  (Though obviously you must be prepared
to cope with the failing rows, if any).  However, if you do this
INSERT/SELECT thing frequently, this is probably not very workable.
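
A rough sketch of that pattern (table and constraint names are illustrative;
re-adding the FK at the end validates all the loaded rows in one pass rather
than queueing a per-row trigger event for each insert):

  -- 1. drop the FK before loading
  ALTER TABLE customer_transactions_200701_r01
      DROP CONSTRAINT customer_transactions_200701_r01_ids_fkey;

  -- 2. run the big INSERT INTO ... SELECT here

  -- 3. re-add the FK; this validates the loaded rows in a single pass
  ALTER TABLE customer_transactions_200701_r01
      ADD CONSTRAINT customer_transactions_200701_r01_ids_fkey
      FOREIGN KEY (ids) REFERENCES customer_ids (ids);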

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [GENERAL] Reducing memory usage of insert into select operations?

2008-07-18 Thread Richard Huxton

Francisco Reyes wrote:

The OS triggered the out of memory killer (oom-killer).



The table I am selecting from has a few hundred million rows.
The table I am inserting into has partitions. I am benchmarking breaking 
up a large table into smaller partitions.


Is the partition split done with triggers or rules?

--
  Richard Huxton
  Archonet Ltd



Re: [GENERAL] Reducing memory usage of insert into select operations?

2008-07-18 Thread Douglas McNaught
On Fri, Jul 18, 2008 at 12:18 AM, Francisco Reyes
[EMAIL PROTECTED] wrote:
 Douglas McNaught writes:


 It does seem that reducing work_mem might help you, but others on this

 I reduced it from 256MB to 64MB. It seems it is helping.

You should also look at your memory overcommit settings (in
/proc/sys/vm).  You can set things up so that Postgres gets a malloc()
failure (which it is generally prepared to cope with cleanly) when the
system runs out of RAM, rather than having the OOM killer go off and
hit it with SIGKILL.  Overcommit is useful in some contexts (Java apps
tend to map a lot more memory than they actually use) but for a
dedicated database server you really don't ever want to have the OOM
killer triggered.
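
On Linux that usually means something like the following in /etc/sysctl.conf
(applied with sysctl -p); setting 2 turns on strict accounting, so an
over-sized malloc() fails immediately instead of the OOM killer firing later:

  vm.overcommit_memory = 2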

-Doug



Re: [GENERAL] Reducing memory usage of insert into select operations?

2008-07-18 Thread Francisco Reyes
On 9:53 am 07/18/08 Douglas McNaught [EMAIL PROTECTED] wrote:
 dedicated database server you really don't ever want to have the OOM
 killer triggered.

Found that yesterday (vm.overcommit_memory=2).
Agree that this is better than OOM. I still ran out of memory last night
and postgres just failed on the malloc(), which as you mentioned is better.

Reduced work_mem to 8MB and trying again.




Re: [GENERAL] Reducing memory usage of insert into select operations?

2008-07-18 Thread Francisco Reyes
On 8:13 am 07/18/08 Richard Huxton [EMAIL PROTECTED] wrote:
 Is the partition split done with triggers or rules?

I have a single trigger+function combo that dynamically computes which
partition the data has to go to.
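
The function builds the target table name dynamically; a simplified, static
version of the same idea (table and column names here are not the real ones)
looks roughly like this:

  CREATE OR REPLACE FUNCTION customer_transactions_insert_trigger()
  RETURNS trigger AS $$
  BEGIN
      -- route the row to the child that matches its month and region group
      IF NEW.date >= DATE '2007-01-01' AND NEW.date < DATE '2007-02-01'
         AND NEW.groupid = 1 THEN
          INSERT INTO customer_transactions_200701_r01 VALUES (NEW.*);
      ELSIF NEW.date >= DATE '2007-02-01' AND NEW.date < DATE '2007-03-01'
         AND NEW.groupid = 1 THEN
          INSERT INTO customer_transactions_200702_r01 VALUES (NEW.*);
      ELSE
          RAISE EXCEPTION 'no partition for date %, group %', NEW.date, NEW.groupid;
      END IF;
      RETURN NULL;  -- row was redirected to a child; do not insert into the master
  END;
  $$ LANGUAGE plpgsql;

  CREATE TRIGGER customer_transactions_insert
      BEFORE INSERT ON customer_transactions
      FOR EACH ROW EXECUTE PROCEDURE customer_transactions_insert_trigger();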




Re: [GENERAL] Reducing memory usage of insert into select operations?

2008-07-18 Thread Richard Huxton

Francisco Reyes wrote:

On 8:13 am 07/18/08 Richard Huxton [EMAIL PROTECTED] wrote:

Is the partition split done with triggers or rules?


I have a single trigger+function combo that dynamically computes which
partition the data has to go to.


I'm wondering whether it's memory usage either for the trigger itself, 
or for the function (pl/pgsql?). If you're doing something like:

  INSERT INTO partitioned_table SELECT * FROM big_table
then that's not only taking place within a single transaction, but 
within a single statement.


Without being a hacker, I'd say it's entirely plausible that PG might only
clean up trigger events at the end of the statement, meaning you would need
memory for 200 million+ queued trigger events.


Alternatively, it could be a memory-leak somewhere in the pl/pgsql or 
trigger code. Wouldn't have to be much to affect this particular case.


What happens if you do the insert/select in stages but all in one
transaction? Do you see PG's memory requirement stay constant or grow in
steps? That will show whether the memory is growing over the duration of
a statement or over the duration of a transaction.


BEGIN;
  INSERT ... SELECT ... WHERE id BETWEEN 0 AND 99
  INSERT ... SELECT ... WHERE id BETWEEN 100 AND 199
  ...
COMMIT;

--
  Richard Huxton
  Archonet Ltd



Re: [GENERAL] Reducing memory usage of insert into select operations?

2008-07-18 Thread Francisco Reyes
On 11:25 am 07/18/08 Richard Huxton [EMAIL PROTECTED] wrote:
 I'm wondering whether it's memory usage either for the trigger
 itself, or for the function (pl/pgsql?).

Good point.

 If you're doing something
 like: INSERT INTO partitioned_table SELECT * FROM big_table
 then that's not only taking place within a single transaction, but
 within a single statement.

Correct.
I have kept decreasing work_mem and that does not seem to help.

 Without being a hacker, I'd say it's entirely plausible that PG might
 clean up triggers at the end of a statement meaning you would need
 memory for 200million+ triggers.

Sure hope that is not the case.

 Alternatively, it could be a memory-leak somewhere in the pl/pgsql or
 trigger code. Wouldn't have to be much to affect this particular case.

Will post an strace.

 What happens if you do the insert/select in stages but all in one
 transaction?

Will test.
The data is about a year's worth. I will try to do one month at a
time, within a single transaction.

A single month finishes fine.

 Do you see PG's memory requirement stay constant or grow
 in steps. That will show whether the memory is growing over the
 duration of a statement or a transaction.

Right now, for the single statement/transaction (the one big process), memory
is growing slowly over time. It may be a leak. It seems to start growing
somewhere between the 1st and 2nd hour, and it always seems to fail
around 4 hours in.

I wrote a little process that shows the amount of free memory every 15
minutes..

I will post strace for the big process and then will try breaking the
process down by month, but within a single transaction and report that
later when I get some results.




Re: [GENERAL] Reducing memory usage of insert into select operations?

2008-07-18 Thread Martijn van Oosterhout
On Fri, Jul 18, 2008 at 10:40:02AM -0400, Francisco Reyes wrote:
 Found that yesterday (vm.overcommit_memory=2).
 Agree that this is better than OOM. I still ran out of memory last night
 and postgres just failed on the malloc(), which as you mentioned is better.
 
 Reduced work_mem to 8MB and trying again.

Perhaps you can try reducing shared_buffers, to see if that helps
more? 8MB is quite small for work_mem. More shared_buffers is not
necessarily better.

Also, how much swap are you running? Overcommit disabled while not
having any swap set up is a great way to ensure you run out of memory
quickly.

Have a nice day,
-- 
Martijn van Oosterhout   [EMAIL PROTECTED]   http://svana.org/kleptog/
 Please line up in a tree and maintain the heap invariant while 
 boarding. Thank you for flying nlogn airlines.




Re: [GENERAL] Reducing memory usage of insert into select operations?

2008-07-18 Thread Francisco Reyes
On 11:25 am 07/18/08 Richard Huxton [EMAIL PROTECTED] wrote:

Strace of the single/large process.
Again, all the query is doing is
insert into file select subquery

The strace is pretty much a repetition of the lines below.

semop(557057, 0x7fbfffdfb0, 1)  = 0
lseek(100, 0, SEEK_END) = 671719424
write(100, 
\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0..., 8192) = 
8192
lseek(508, 0, SEEK_END) = 55697408
write(508, 
\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0..., 8192) = 
8192
read(381, 
\0\0\0\0\224\21\0\225o\10\0\30\331c\0c\225%w(\0\0\0\0\0\0\0\0\0\5\0..., 
8192) = 8192
semop(557057, 0x7fbfffd1a0, 1)  = 0
lseek(100, 0, SEEK_END) = 671727616
write(100, 
\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0..., 8192) = 
8192
semop(557057, 0x7fbfffd1c0, 1)  = 0
semop(557057, 0x7fbfffd1a0, 1)  = 0
semop(557057, 0x7fbfffd1c0, 1)  = 0
read(381, 
w\317\21\0]9\0\177\246eA(\0\0\0\0\0\0\0\0\0\5\0\2\0\30\0.\v\0\0..., 8192) = 
8192
semop(557057, 0x7fbfffd1a0, 1)  = 0
lseek(512, 0, SEEK_END) = 48144384
write(512, 
\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0..., 8192) = 
8192
semop(557057, 0x7fbfffd1c0, 1)  = 0
lseek(100, 0, SEEK_END) = 671735808
write(100, 
\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0..., 8192) = 
8192
lseek(517, 0, SEEK_END) = 89309184
write(517, 
\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0..., 8192) = 
8192
semop(557057, 0x7fbfffd1c0, 1)  = 0
semop(557057, 0x7fbfffddd0, 1)  = 0
lseek(100, 0, SEEK_END) = 671744000
write(100, 
\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0..., 8192) = 
8192
read(381, 
\212\225\202(\0\0\0\0\0\0\0\0\0\5\0\2\0\30\\v\0\0\1\0\23\2\0\0\0\t..., 
8192) = 8192
lseek(510, 0, SEEK_END) = 29351936
write(510, 
\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0..., 8192) = 
8192
lseek(100, 0, SEEK_END) = 671752192
write(100, 
\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0..., 8192) = 
8192
semop(557057, 0x7fbfffddf0, 1)  = 0
read(381, 
\0\0\0\0\0\0\5\0\2\0\30\0001\v\0\0\0\0\23\2\0\0\0\30\0\4\20\0\302\326\0\0..., 
8192) = 8192
lseek(513, 0, SEEK_END) = 19316736
write(513, 
\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0..., 8192) = 
8192
lseek(100, 0, SEEK_END) = 671760384
write(100, 
\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0..., 8192) = 
8192
read(381, [EMAIL PROTECTED]..., 8192) = 8192
lseek(100, 0, SEEK_END) = 671768576
write(100, 
\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0..., 8192) = 
8192
lseek(518, 0, SEEK_END) = 55025664
write(518, 
\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0..., 8192) = 
8192
semop(557057, 0x7fbfffd1c0, 1)  = 0
semop(557057, 0x7fbfffd1c0, 1)  = 0
semop(557057, 0x7fbfffd1c0, 1)  = 0
lseek(100, 0, SEEK_END) = 671776768




Re: [GENERAL] Reducing memory usage of insert into select operations?

2008-07-18 Thread Alvaro Herrera
Francisco Reyes wrote:
 On 11:25 am 07/18/08 Richard Huxton [EMAIL PROTECTED] wrote:
 
 Strace of the single/large process.
 Again, all the query is doing is
 insert into file select subquery
 
 The strace is pretty much a repetition of the lines below.

Do you have long-running transactions?  (For example transactions that
have been idle for a long time).

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [GENERAL] Reducing memory usage of insert into select operations?

2008-07-18 Thread Francisco Reyes
On 12:03 pm 07/18/08 Martijn van Oosterhout [EMAIL PROTECTED] wrote:

 Perhaps you can try reducing the shared_buffers, to see if that helps
 more?

Will try.

 8MB is quite small for workmem. More shared_buffers is not
 necessarily better.

Ok, but from everything I had read, shared_buffers of 1/4 of RAM seemed like a
reasonable starting point. Will try reducing it to 2GB.

 Also, how much swap are you running?

Started out with 12GB (same as memory) and last night I added 24GB more.
I had 2 instances of inserts going, so each exhausted about 18GB of RAM!




Re: [GENERAL] Reducing memory usage of insert into select operations?

2008-07-18 Thread Francisco Reyes
On 12:23 pm 07/18/08 Alvaro Herrera [EMAIL PROTECTED] wrote:
 Do you have long-running transactions?  (For example transactions that
 have been idle for a long time).

No.
The two inserts I was running were the only processes. I even did a restart
to make sure there was absolutely nothing else running and to make sure all
my postgresql.conf settings were in.

Given that memory grows over time, I am beginning to wonder if it is some
type of memory leak.

Just installed the postgresql debug rpm, but not sure if it did anything;
strace doesn't look any different:

 read(81, 2\1\0\0\260~!\16\1\0\0\0\370\1\0\2\0 \4 
\0\0\0\0\300\237r\0\200\237r\0..., 8192) = 8192
write(191, 
Q=J\313\253]1\0\0\0\1\0007\33\4\0\2\0\2\t\30\0\3\302\204\0;a1OjG..., 8192) = 
8192
write(160, XQxbqQEx+yo=H\333o\2371\0\0\0\1\0.\33C\0\2\0\2\t\30\0...,
8192) = 8192
read(81, 2\1\0\0\320(\301\17\1\0\0\0\370\1\0\2\0 \4 
\0\0\0\0\300\237r\0\200\237r\0..., 8192) = 8192




Re: [GENERAL] Reducing memory usage of insert into select operations?

2008-07-18 Thread Tom Lane
Francisco Reyes [EMAIL PROTECTED] writes:
 Given that memory grows over time, I am beginning to wonder if it is some
 type of memory leak.

Are there any AFTER triggers (including foreign key constraints) on the
table being inserted into?  If so the list of pending trigger events
might be your problem.

If you can get Postgres to report an actual out-of-memory error (as
opposed to crashing from OOM kill) then it should dump a memory usage
map into the postmaster log.  Looking at that would be informative.

regards, tom lane



Re: [GENERAL] Reducing memory usage of insert into select operations?

2008-07-18 Thread Francisco Reyes
On 1:00 pm 07/18/08 Tom Lane [EMAIL PROTECTED] wrote:
 Are there any AFTER triggers (including foreign key constraints)

I have two foreign key constraints.

 the table being inserted into?  If so the list of pending trigger
 events might be your problem.

I guess I can try disabling the foreign keys, but that would be less than
ideal for production. This is an analytics environment, so all operations
are in bulk.

 If you can get Postgres to report an actual out-of-memory error (as
 opposed to crashing from OOM kill)

Disabled oom with vm.overcommit_memory=2.

then it should dump a memory usage
map into the postmaster log.  Looking at that would be informative.

Got it.
--
AfterTriggerEvents: 10553909248 total in 1268 blocks; 20432 free (6
chunks); 1055316 used
ExecutorState: 122880 total in 4 blocks; 68040 free (8 chunks); 54840
used
  Operator lookup cache: 24576 total in 2 blocks; 11888 free (5 chunks);
12688 used
  Operator class cache: 8192 total in 1 blocks; 1680 free (0 chunks); 6512
used
  MessageContext: 131072 total in 5 blocks; 50712 free (291 chunks); 80360
used
  smgr relation table: 24576 total in 2 blocks; 3584 free (4 chunks); 20992
used
  TransactionAbortContext: 32768 total in 1 blocks; 32736 free (0 chunks);
32 used
  Portal hash: 8192 total in 1 blocks; 1680 free (0 chunks); 6512 used
  PortalMemory: 8192 total in 1 blocks; 7888 free (0 chunks); 304 used
PortalHeapMemory: 1024 total in 1 blocks; 768 free (0 chunks); 256 used
  ExecutorState: 98784 total in 8 blocks; 24064 free (22 chunks); 74720
used
ExprContext: 8192 total in 1 blocks; 8016 free (0 chunks); 176 used
HashTableContext: 8192 total in 1 blocks; 8064 free (1 chunks); 128
used
  HashBatchContext: 532676656 total in 74 blocks; 1863936 free (5
chunks); 530812720 used
HashTableContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
  HashBatchContext: 415227952 total in 59 blocks; 6589744 free (5
chunks); 408638208 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 8192 total in 1 blocks; 8136 free (0 chunks); 56 used
Relcache by OID: 24576 total in 2 blocks; 8672 free (3 chunks); 15904 used
  CacheMemoryContext: 2390256 total in 22 blocks; 751904 free (2 chunks);
1638352 used
CachedPlan: 1024 total in 1 blocks; 336 free (0 chunks); 688 used
CachedPlanSource: 1024 total in 1 blocks; 80 free (0 chunks); 944 used
SPI Plan: 1024 total in 1 blocks; 808 free (0 chunks); 216 used
CachedPlan: 7168 total in 3 blocks; 3120 free (0 chunks); 4048 used
CachedPlanSource: 7168 total in 3 blocks; 1816 free (0 chunks); 5352 used
SPI Plan: 1024 total in 1 blocks; 784 free (0 chunks); 240 used
CachedPlan: 3072 total in 2 blocks; 792 free (0 chunks); 2280 used
CachedPlanSource: 7168 total in 3 blocks; 3600 free (0 chunks); 3568 used
SPI Plan: 1024 total in 1 blocks; 800 free (0 chunks); 224 used
pg_cast_source_target_index: 2048 total in 1 blocks; 608 free (0
chunks); 1440 used
pg_language_oid_index: 2048 total in 1 blocks; 704 free (0 chunks);
1344 used
pg_toast_2619_index: 2048 total in 1 blocks; 608 free (0 chunks); 1440
used
pg_amop_opr_fam_index: 2048 total in 1 blocks; 608 free (0 chunks);
1440 used
tcf_mnfoids_partid: 2048 total in 1 blocks; 752 free (0 chunks); 1296
used
tcf_mnfoids_pkey: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
cards_cardnum_key: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
cards_pkey: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
tcf_original_trans_partid_cardnum: 2048 total in 1 blocks; 656 free (0
chunks); 1392 used
tcf_original_trans_yearmo: 2048 total in 1 blocks; 752 free (0 chunks);
1296 used
pg_constraint_contypid_index: 2048 total in 1 blocks; 704 free (0
chunks); 1344 used
pg_constraint_conname_nsp_index: 2048 total in 1 blocks; 608 free (0
chunks); 1440 used
pg_operator_oprname_l_r_n_index: 2048 total in 1 blocks; 392 free (0
chunks); 1656 used
pg_proc_proname_args_nsp_index: 2048 total in 1 blocks; 584 free (0
chunks); 1464 used
pg_proc_oid_index: 2048 total in 1 blocks; 704 free (0 chunks); 1344 used
pg_shdepend_reference_index: 2048 total in 1 blocks; 608 free (0
chunks); 1440 used
pg_namespace_oid_index: 2048 total in 1 blocks; 704 free (0 chunks);
1344 used
pg_statistic_relid_att_index: 2048 total in 1 blocks; 608 free (0
chunks); 1440 used
pg_inherits_relid_seqno_index: 2048 total in 1 blocks; 608 free (0
chunks); 1440 used
pg_constraint_oid_index: 2048 total in 1 blocks; 704 free (0 chunks);
1344 used

Re: [GENERAL] Reducing memory usage of insert into select operations?

2008-07-18 Thread Tom Lane
Francisco Reyes [EMAIL PROTECTED] writes:
 On 1:00 pm 07/18/08 Tom Lane [EMAIL PROTECTED] wrote:
 If you can get Postgres to report an actual out-of-memory error (as
 opposed to crashing from OOM kill)
 then it should dump a memory usage
 map into the postmaster log.  Looking at that would be informative.

 Got it.

 AfterTriggerEvents: 10553909248 total in 1268 blocks; 20432 free (6
 chunks); 1055316 used

Well, that's definitely your problem ...

   HashBatchContext: 532676656 total in 74 blocks; 1863936 free (5
 chunks); 530812720 used

   HashBatchContext: 415227952 total in 59 blocks; 6589744 free (5
 chunks); 408638208 used

although these numbers seem way outta line too.  What did you say you
had work_mem set to?

regards, tom lane



Re: [GENERAL] Reducing memory usage of insert into select operations?

2008-07-18 Thread Francisco Reyes
On 3:55 pm 07/18/08 Tom Lane [EMAIL PROTECTED] wrote:
   AfterTriggerEvents: 10553909248 total in 1268 blocks; 20432 free (6
   chunks); 1055316 used

 Well, that's definitely your problem ...

So I need to remove the foreign key constraints?

 HashBatchContext: 415227952 total in 59 blocks; 6589744
   free (5 chunks); 408638208 used

 although these numbers seem way outta line too.  What did you say you
 had work_mem set to?

Initially, on the first crash, it was 256MB. I believe at the time of the
crash I got the dump for, it was down to 64MB or 8MB. I kept trying lower
values. I also tried reducing shared_buffers as someone suggested.

I will bump my shared_buffers back to 3GB and work_mem back to 64MB.




Re: [GENERAL] Reducing memory usage of insert into select operations?

2008-07-18 Thread Francisco Reyes
On 3:55 pm 07/18/08 Tom Lane [EMAIL PROTECTED] wrote:
   AfterTriggerEvents: 10553909248 total in 1268 blocks; 20432 free (6
   chunks); 1055316 used

 Well, that's definitely your problem ...

What is the overhead for each AfterTriggerEvent?

I guess I can write a program to process so many rows at a time, if I know
how much overhead each AfterTriggerEvent uses. I know 15 million at a time
worked fine, so I could do 5 or 10 million at a time.
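
For example, since the data is keyed by yearmo, a batch could simply be one
month per transaction, something like (the month values are illustrative):

  BEGIN;
  INSERT INTO customer_transactions
      (record_id, date, type, amount, ids, groupid)
  SELECT ca.record_id, coh.date, coh.type, coh.amount, coh.ids, ids.groupid
  FROM customer_original_historical coh
  JOIN cards ca ON ca.natural_key = coh.natural_key
  JOIN customer_ids ids ON ids.ids = coh.ids
  WHERE coh.yearmo = '200601';
  COMMIT;
  -- repeat for '200602', '200603', ... so each month's pending trigger
  -- events are checked, and their memory released, before the next batch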

When does the memory usage for those AfterTriggerEvents gets released? At
commit?




Re: [GENERAL] Reducing memory usage of insert into select operations?

2008-07-18 Thread Tom Lane
Francisco Reyes [EMAIL PROTECTED] writes:
 On 3:55 pm 07/18/08 Tom Lane [EMAIL PROTECTED] wrote:
 AfterTriggerEvents: 10553909248 total in 1268 blocks; 20432 free (6
 chunks); 1055316 used
 
 Well, that's definitely your problem ...

 So I need to remove the foreign constraints?

Either that or do the update in sections.  But working through umpteen
gig of pending trigger events would take forever anyway --- dropping
and re-adding the FK constraint is almost certainly a better way.

 HashBatchContext: 415227952 total in 59 blocks; 6589744
 free (5 chunks); 408638208 used
 
 although these numbers seem way outta line too.  What did you say you
 had work_mem set to?

 Initially on the first crash it was 256MB. I believe at the time of the
 crash I got the dump for it was down to 64MB or 8MB.

Something fishy about that.  The max size of a HashBatchContext should
be work_mem, more or less (the accounting isn't perfectly accurate
I think, but it's not off by an order of magnitude).

The only thing I can think of is that you had a huge number of rows with
all the same hash value, so that there wasn't any way to split the batch
into smaller sections.  What are the join keys exactly in this query,
and what can you tell us about their data distributions?

regards, tom lane



Re: [GENERAL] Reducing memory usage of insert into select operations?

2008-07-18 Thread Tom Lane
Francisco Reyes [EMAIL PROTECTED] writes:
 What is the overhead for each AfterTriggerEvent?

On a 64-bit machine it looks like they'd cost you about 80 bytes
each :-(.  A good deal of that is palloc overhead --- I wonder if
we should get rid of the separate-palloc-for-each-event design?

 When does the memory usage for those AfterTriggerEvents gets released? At
 commit?

Whenever the check is done; you'd have to read the rules about deferred
constraints ...
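
For reference, the relevant syntax looks like this (names illustrative); a
plain FK is checked at the end of each statement, while one declared
DEFERRABLE INITIALLY DEFERRED waits until COMMIT or an explicit SET CONSTRAINTS:

  ALTER TABLE customer_transactions_200701_r01
      ADD CONSTRAINT customer_transactions_200701_r01_ids_fkey
      FOREIGN KEY (ids) REFERENCES customer_ids (ids)
      DEFERRABLE INITIALLY DEFERRED;

  SET CONSTRAINTS ALL IMMEDIATE;   -- run any pending deferred checks now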

regards, tom lane



Re: [GENERAL] Reducing memory usage of insert into select operations?

2008-07-18 Thread Francisco Reyes
On 4:55 pm 07/18/08 Tom Lane [EMAIL PROTECTED] wrote:
 The only thing I can think of is that you had a huge number of rows
 with all the same hash value, so that there wasn't any way to split
 the batch into smaller sections.  What are the join keys exactly in
 this query, and what can you tell us about their data distributions?

I can't post the actual table or column names, so below are the actual select
and explain with all names changed.

insert into customer_transactions
 (record_id, date, type, amount, ids, groupid)
select
  ca.record_id, coh.date, coh.type, coh.amount, coh.ids, ids.groupid
from
customer_original_historical coh,
cards ca,
customer_ids ids
where
ca.natural_key = coh.natural_key
and ids.ids = coh.ids
and coh.yearmo < '200703';

 Hash Join  (cost=712213.57..27293913.33 rows=234402352 width=24)
   Hash Cond: (coh.id = ids.id)
   ->  Hash Join  (cost=551387.26..18799378.16 rows=234402352 width=22)
         Hash Cond: (coh.user_id = ca.user_id)
         ->  Seq Scan on customer_original_historical coh
               (cost=0.00..6702501.40 rows=234402352 width=47)
               Filter: (yearmo < '200703'::bpchar)
         ->  Hash  (cost=268355.67..268355.67 rows=14637567 width=32)
               ->  Seq Scan on cards ca
                     (cost=0.00..268355.67 rows=14637567 width=32)
   ->  Hash  (cost=77883.25..77883.25 rows=5055525 width=6)
         ->  Seq Scan on customer_ids ids
               (cost=0.00..77883.25 rows=5055525 width=6)

There was a single table, customer_original_historical, which was using a
natural key with a text field.

Most queries used customer_original_historical by itself or joined
against a single other table, which we should call area.

The new schema I am testing splits the one single table into 12 tables
per month.

In addition, I replaced the natural keys with a synthetic integer key.
I also replaced the area table with a customer_ids table which only has
two columns: the synthetic key for the historical table and a region.

In order to have 12 tables per month, I grouped all the regions into 12
groups. Queries are usually within a single region, so what I am trying to
benchmark is whether dividing 24 months of data into 24 sets of 12 region
groups will perform better than a single large table.
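
One such partition would look roughly like this (names, column choices and the
exact CHECK conditions are only illustrative of the scheme described above):

  CREATE TABLE customer_transactions_200701_r01 (
      CHECK (date >= DATE '2007-01-01' AND date < DATE '2007-02-01'),
      CHECK (groupid = 1)
  ) INHERITS (customer_transactions);

  CREATE INDEX customer_transactions_200701_r01_ids_idx
      ON customer_transactions_200701_r01 (ids);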

The distribution of the joins is:
There are about 1,000,000 unique natural keys. Each natural key has on
average 15 rows per month.
ids are the regions the natural_keys belong to. Figure tens of thousands of
natural_keys to an id.

Is that along the lines of what you were looking for?




[GENERAL] Reducing memory usage of insert into select operations?

2008-07-17 Thread Francisco Reyes

Redhat 4
postgresql 8.3.3
Memory: 12GB

While doing a couple of operations of the type
insert into table select from some other table

The OS triggered the out of memory killer (oom-killer).

After some research and trial/error I found it was the inserts.
I see one of the inserts is using up 12GB!

How can I reduce the usage?
Postgresql.conf settings:
shared_buffers = 3GB
temp_buffers = 64MB             # min 800kB
work_mem = 256MB                # min 64kB
maintenance_work_mem = 1GB


Reducing work_mem would help?
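
If it does, I could presumably lower it for just the loading session rather
than in postgresql.conf, along the lines of:

  SET work_mem = '64MB';
  -- then run the INSERT INTO ... SELECT in the same session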

The table I am selecting from has a few hundred million rows.
The table I am inserting into has partitions. I am benchmarking breaking up 
a large table into smaller partitions.




Re: [GENERAL] Reducing memory usage of insert into select operations?

2008-07-17 Thread Douglas McNaught
On Thu, Jul 17, 2008 at 7:21 PM, Francisco Reyes [EMAIL PROTECTED] wrote:
 Redhat 4
 postgresql 8.3.3
 Memory: 12GB

 While doing a couple of operations of the type
 insert into table select from some other table

 The OS triggered the out of memory killer (oom-killer).

Is this a 32-bit installation or 64-bit?  3GB of shared_buffers is way
too big for a 32-bit setup.

-Doug



Re: [GENERAL] Reducing memory usage of insert into select operations?

2008-07-17 Thread Douglas McNaught
On Thu, Jul 17, 2008 at 9:27 PM, Francisco Reyes [EMAIL PROTECTED] wrote:
 Douglas McNaught writes:

 Is this a 32-bit installation or 64-bit?  3GB of shared_buffers is way
 too big for a 32-bit setup.


 64-bit.
 The machine has 12GB of RAM so shared-buffers is about 1/3.
 Dedicated DB server.

Ahh, good.  Just wanted to answer the obvious question first.  Some
people set shared_buffers really high on 32-bit systems and then are
surprised when it doesn't work well.

It does seem that reducing work_mem might help you, but others on this
list are much more expert than I in diagnosing this sort of problem.
It would probably be helpful for you to post the EXPLAIN output from
your query, so they can see which part of the plan causes the large
memory usage.

-Doug



Re: [GENERAL] Reducing memory usage of insert into select operations?

2008-07-17 Thread Francisco Reyes

Douglas McNaught writes:



It does seem that reducing work_mem might help you, but others on this


I reduced it from 256MB to 64MB. It seems to be helping.
At 256MB the usage per DB connection instance was upwards of 12GB. At 64MB,
so far it is around 7GB. I just reduced it further to 32MB to see how that
works.




It would probably be helpful for you to post the EXPLAIN output from

Hash Join  (cost=712213.57..27293913.33 rows=234402352 width=24)
  Hash Cond: (coh.id = ids.id)
  ->  Hash Join  (cost=551387.26..18799378.16 rows=234402352 width=22)
        Hash Cond: (coh.user_id = ca.user_id)
        ->  Seq Scan on customer_original_historical coh
              (cost=0.00..6702501.40 rows=234402352 width=47)
              Filter: (yearmo < '200703'::bpchar)
        ->  Hash  (cost=268355.67..268355.67 rows=14637567 width=32)
              ->  Seq Scan on cards ca
                    (cost=0.00..268355.67 rows=14637567 width=32)
  ->  Hash  (cost=77883.25..77883.25 rows=5055525 width=6)
        ->  Seq Scan on customer_ids ids
              (cost=0.00..77883.25 rows=5055525 width=6)




Re: [GENERAL] Reducing memory usage of insert into select operations?

2008-07-17 Thread Francisco Reyes

Douglas McNaught writes:


Is this a 32-bit installation or 64-bit?  3GB of shared_buffers is way
too big for a 32-bit setup.



64-bit.
The machine has 12GB of RAM so shared-buffers is about 1/3.
Dedicated DB server.
