Re: [PERFORM] Insert Concurrency

2017-04-18 Thread ROBERT PRICE
Thanks everyone. I decided to have the SQS process changed to create CSV files
in an S3 bucket. Then we have a process that uses the COPY command to load
the data. The process is loading 500,000 records in around 4 minutes, which should
be good enough for now. I'm going to look at Citus to get up to speed on
Postgres partitioning for a future need.
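
The load itself boils down to one COPY per file, roughly along these lines
(table, columns and file path are made up for illustration):

-- hypothetical table and file path, for illustration only
COPY orders (order_id, customer_id, amount, created_at)
FROM '/data/s3_staging/orders_batch_0001.csv'
WITH (FORMAT csv, HEADER true);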



From: Scott Marlowe <scott.marl...@gmail.com>
Sent: Tuesday, April 18, 2017 3:41 PM
To: ROBERT PRICE
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Insert Concurrency

On Mon, Apr 17, 2017 at 8:55 PM, ROBERT PRICE <rprice...@hotmail.com> wrote:
> I come from an Oracle background and am porting an application to postgres.
> App has a table that will contain 100 million rows and has to be loaded by a
> process that reads messages off a SQS queue and makes web service calls to
> insert records one row at a time in a postgres RDS instance. I know slow by
> slow is not the ideal approach but I was wondering if postgres had
> partitioning or other ways to tune concurrent insert statements. Process
> will run 50 - 100 concurrent threads.

It's not uncommon to look for an Oracle solution while working with
another rdbms. Often what works in one engine doesn't work the same or
as well in another.

Is it possible for you to roll up some of these inserts into a single
transaction in some way? Even inserting ten rows at a time instead of
one at a time can make a big difference in your insert rate. Being
able to roll up 100 or more together even more so.

Another possibility is to insert them into a smaller staging table, then
have a process come along every so often, copy all the rows from it into
the main table, and then delete them or truncate the staging table (for
truncate you'll need to lock the table so you don't lose rows).

--
To understand recursion, one must first understand recursion.


Re: [PERFORM] Insert Concurrency

2017-04-18 Thread David McKelvie

>> To understand recursion, one must first understand recursion.

This makes no sense unless you also provide the base case.

David




Re: [PERFORM] Insert Concurrency

2017-04-18 Thread Scott Marlowe
On Mon, Apr 17, 2017 at 8:55 PM, ROBERT PRICE  wrote:
> I come from an Oracle background and am porting an application to postgres.
> App has a table that will contain 100 million rows and has to be loaded by a
> process that reads messages off a SQS queue and makes web service calls to
> insert records one row at a time in a postgres RDS instance. I know slow by
> slow is not the ideal approach but I was wondering if postgres had
> partitioning or other ways to tune concurrent insert statements. Process
> will run 50 - 100 concurrent threads.

It's not uncommon to look for an Oracle solution while working with
another rdbms. Often what works in one engine doesn't work the same or
as well in another.

Is it possible for you to roll up some of these inserts into a single
transaction in some way? Even inserting ten rows at a time instead of
one at a time can make a big difference in your insert rate. Being
able to roll up 100 or more together even more so.
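
For example, a minimal sketch (table and values are made up): one multi-row
statement in one transaction instead of many single-row ones:

BEGIN;
-- one INSERT carrying many rows; a few hundred rows per statement is fine
INSERT INTO messages (id, payload)
VALUES (1, 'first'),
       (2, 'second'),
       (3, 'third');
COMMIT;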

Another possibility is to insert them into a smaller staging table, then
have a process come along every so often, copy all the rows from it into
the main table, and then delete them or truncate the staging table (for
truncate you'll need to lock the table so you don't lose rows).
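
A sketch of that staging-table pattern (table names are made up):

BEGIN;
-- block new inserts so no rows slip in between the copy and the truncate
LOCK TABLE messages_staging IN EXCLUSIVE MODE;
INSERT INTO messages SELECT * FROM messages_staging;
TRUNCATE messages_staging;
COMMIT;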

-- 
To understand recursion, one must first understand recursion.




Re: [PERFORM] Insert Concurrency

2017-04-18 Thread Claudio Freire
On Tue, Apr 18, 2017 at 2:45 AM, Daniel Blanch Bataller
 wrote:
>
> But if you are going to move a record at a time you are going to be limited
> by the fastest transaction rate you can achieve, which is going to be a few
> hundred per second, and limited at the end by the disk hardware you have.
> Out of the box and on commodity hardware it can take you up to ten days to
> move 100M records.

RDS usually is not commodity hardware; most RDS instances will have
some form of SSD storage, so performance could be much higher than
what you'd get on your laptop.

I'd have to second David's advice: test with pgbench first. It can
quite accurately simulate your use case.




Re: [PERFORM] Insert Concurrency

2017-04-17 Thread Daniel Blanch Bataller
Yes, Postgres has partitioning:

https://www.postgresql.org/docs/9.6/static/ddl-partitioning.html 
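
In 9.6 that still means inheritance plus CHECK constraints; a minimal sketch
(table and column names are made up):

CREATE TABLE measurements (id bigint, recorded date, val int);

CREATE TABLE measurements_2017_04 (
    CHECK (recorded >= DATE '2017-04-01' AND recorded < DATE '2017-05-01')
) INHERITS (measurements);

-- rows are routed to the right child either by the application or by an
-- INSERT trigger on the parent table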


But this is not going to help much in the scenario you have. 

Postgres can ingest data very, very fast: 100M records in seconds to minutes,
faster than Oracle can serve it in many scenarios (all the ones I have tested).

Especially if you use the COPY command:

https://www.postgresql.org/docs/9.6/static/sql-copy.html 


and even faster if you use the unlogged feature 

https://www.postgresql.org/docs/9.6/static/sql-altertable.html 
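
For a one-off load that can simply be repeated if the server crashes, a sketch
(names are made up; SET LOGGED needs 9.5 or later):

CREATE UNLOGGED TABLE big_load (LIKE target_table INCLUDING DEFAULTS);
COPY big_load FROM '/tmp/data.csv' WITH (FORMAT csv);
-- unlogged data is lost on a crash, so only flip it once the load is complete
ALTER TABLE big_load SET LOGGED;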


You can tune Postgres to make it even faster, but it's not normally necessary;
the two pieces of advice above are more than enough. If I remember correctly,
you can move 100M records in ~2 minutes.

https://www.postgresql.org/docs/current/static/populate.html 
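
The usual knobs from that page, as a sketch (the values are placeholders, not
recommendations):

-- session-level settings for a bulk load
SET maintenance_work_mem = '1GB';   -- speeds up index builds after the load
SET synchronous_commit = off;
-- also: create indexes and FK constraints after the COPY, and ANALYZE when done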



But if you are going to move a record at a time you are going to be limited by
the fastest transaction rate you can achieve, which is going to be a few
hundred per second, and limited at the end by the disk hardware you have. Out
of the box and on commodity hardware it can take you up to ten days to move
100M records.

So, my recommendation is to find a way to batch record insertions using COPY;
the benefits you can achieve by tuning Postgres are going to be marginal
compared with COPY.

Regards

Daniel Blanch.
ww.translatetopostgres.com







> On 18 Apr 2017, at 04:55, ROBERT PRICE  wrote:
> 
> I come from an Oracle background and am porting an application to postgres. 
> App has a table that will contain 100 million rows and has to be loaded by a 
> process that reads messages off a SQS queue and makes web service calls to 
> insert records one row at a time in a postgres RDS instance. I know slow by 
> slow is not the ideal approach but I was wondering if postgres had 
> partitioning or other ways to tune concurrent insert statements. Process will 
> run 50 - 100 concurrent threads.



Re: [PERFORM] Insert Concurrency

2017-04-17 Thread David Rowley
On 18 April 2017 at 14:55, ROBERT PRICE  wrote:
> I come from an Oracle background and am porting an application to postgres.
> App has a table that will contain 100 million rows and has to be loaded by a
> process that reads messages off a SQS queue and makes web service calls to
> insert records one row at a time in a postgres RDS instance. I know slow by
> slow is not the ideal approach but I was wondering if postgres had
> partitioning or other ways to tune concurrent insert statements. Process
> will run 50 - 100 concurrent threads.

Have you tested the performance and found that it is insufficient for
your needs? Or do you just assume PostgreSQL suffers from the same
issue as Oracle in regard to INSERT contention on a single table?

You may like to look at pgbench [1] to test the performance if you've
not done so already.

[1] https://www.postgresql.org/docs/9.6/static/pgbench.html
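
A throwaway custom script for this kind of test can be as simple as the
following (table name is made up); feed it to pgbench with -f and the desired
client count:

-- insert_test.sql, e.g.: pgbench -n -c 50 -j 4 -T 60 -f insert_test.sql
-- assumes load_test has no unique constraint on these columns
INSERT INTO load_test (a, b, c) VALUES (1, 2, 'x');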

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




[PERFORM] Insert Concurrency

2017-04-17 Thread ROBERT PRICE
I come from an Oracle background and am porting an application to postgres. App 
has a table that will contain 100 million rows and has to be loaded by a 
process that reads messages off a SQS queue and makes web service calls to 
insert records one row at a time in a postgres RDS instance. I know slow by 
slow is not the ideal approach but I was wondering if postgres had partitioning 
or other ways to tune concurrent insert statements. Process will run 50 - 100 
concurrent threads.


Re: [PERFORM] insert performance

2016-01-24 Thread Jim Nasby

On 1/21/16 11:54 PM, Jinhua Luo wrote:

There is another problem.
When autovacuum recycles old pages, ReadBuffer_common()
does mdread() instead of mdextend().
The read is synchronous, while the write can be mostly asynchronous,
so the frequent reads are much worse than the write case.


Please don't top post.

AFAICT your analysis is correct, though I don't see what autovacuum has 
to do with anything. When we need a new block to put data on we'll 
either find one with free space in the free space map and use it, or 
we'll extend the relation.



Any help? Please.


There's been some discussion on ways to improve the performance of 
relation extension (iirc one of those was to not zero the new page), but 
ultimately you're at the mercy of the underlying OS and hardware.


If you have ideas for improving this you should speak up on -hackers, 
but before doing so you should read the archives about what's been 
proposed in the past.



2016-01-19 12:50 GMT+08:00 Jinhua Luo :

Hi,

I thought that with async commit enabled, the backend process would rarely
do file I/O. But in fact, it still does a lot of file I/O.

After inspecting the vfs probes using systemtap, and reading the
PostgreSQL source code, I found that a tight loop of inserts or updates
causes heavy file I/O on the data files (table, indexes) directly
from the backend process! And that I/O has nothing to do with shared
buffer dirty writes.

heap_insert() or heap_update() invokes
RelationGetBufferForTuple(), which in turn ends up calling
ReadBuffer_common():

1) Look up or allocate the buffer in shared buffers, which may cause a
dirty write (but in my case that's rare; maybe the shared buffers are big
enough and the checkpointer or bgwriter always clean them in time). If
the target buffer is found, skip the following steps.

2) If the relation needs to be extended (inserts or updates on a new table
normally fall into this case), write a zero-filled page to disk (to
reserve the file space? But most file systems support file holes or space
reservation, so maybe this part could be optimized?). This step holds the
relation extension lock exclusively, so if the write is slow, it slows
down all pending queries of the lock waiters.

3) Otherwise, read the page from disk.

The target buffer is locked exclusively until the insert or
update finishes. Note that the insert or update also involves xlog
inserts; with async commit enabled, the backend process does
not flush the xlog, but chances are that xlog buffer dirty writes
happen (although that's also rare in my case).

So I guess the real cause is the file I/O done while holding the lock. If an
I/O spike happens, it causes long query durations.

Am I correct? I look forward to any advice.

Thanks.

Regards,
Jinhua Luo



--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [PERFORM] insert performance

2016-01-21 Thread Jinhua Luo
Hi,

There is another problem.
When autovacuum recycles old pages, ReadBuffer_common()
does mdread() instead of mdextend().
The read is synchronous, while the write can be mostly asynchronous,
so the frequent reads are much worse than the write case.

Any help? Please.

Regards,
Jinhua Luo


2016-01-19 12:50 GMT+08:00 Jinhua Luo :
> Hi,
>
> I thought that with async commit enabled, the backend process would rarely
> do file I/O. But in fact, it still does a lot of file I/O.
>
> After inspecting the vfs probes using systemtap, and reading the
> PostgreSQL source code, I found that a tight loop of inserts or updates
> causes heavy file I/O on the data files (table, indexes) directly
> from the backend process! And that I/O has nothing to do with shared
> buffer dirty writes.
>
> heap_insert() or heap_update() invokes
> RelationGetBufferForTuple(), which in turn ends up calling
> ReadBuffer_common():
>
> 1) Look up or allocate the buffer in shared buffers, which may cause a
> dirty write (but in my case that's rare; maybe the shared buffers are big
> enough and the checkpointer or bgwriter always clean them in time). If
> the target buffer is found, skip the following steps.
>
> 2) If the relation needs to be extended (inserts or updates on a new table
> normally fall into this case), write a zero-filled page to disk (to
> reserve the file space? But most file systems support file holes or space
> reservation, so maybe this part could be optimized?). This step holds the
> relation extension lock exclusively, so if the write is slow, it slows
> down all pending queries of the lock waiters.
>
> 3) Otherwise, read the page from disk.
>
> The target buffer is locked exclusively until the insert or
> update finishes. Note that the insert or update also involves xlog
> inserts; with async commit enabled, the backend process does
> not flush the xlog, but chances are that xlog buffer dirty writes
> happen (although that's also rare in my case).
>
> So I guess the real cause is the file I/O done while holding the lock. If an
> I/O spike happens, it causes long query durations.
>
> Am I correct? I look forward to any advice.
>
> Thanks.
>
> Regards,
> Jinhua Luo




Re: [PERFORM] insert performance

2016-01-18 Thread Jinhua Luo
Hi,

I thought that with async commit enabled, the backend process would rarely
do file I/O. But in fact, it still does a lot of file I/O.

After inspecting the vfs probes using systemtap, and reading the
PostgreSQL source code, I found that a tight loop of inserts or updates
causes heavy file I/O on the data files (table, indexes) directly
from the backend process! And that I/O has nothing to do with shared
buffer dirty writes.

heap_insert() or heap_update() invokes
RelationGetBufferForTuple(), which in turn ends up calling
ReadBuffer_common():

1) Look up or allocate the buffer in shared buffers, which may cause a
dirty write (but in my case that's rare; maybe the shared buffers are big
enough and the checkpointer or bgwriter always clean them in time). If
the target buffer is found, skip the following steps.

2) If the relation needs to be extended (inserts or updates on a new table
normally fall into this case), write a zero-filled page to disk (to
reserve the file space? But most file systems support file holes or space
reservation, so maybe this part could be optimized?). This step holds the
relation extension lock exclusively, so if the write is slow, it slows
down all pending queries of the lock waiters.

3) Otherwise, read the page from disk.

The target buffer is locked exclusively until the insert or
update finishes. Note that the insert or update also involves xlog
inserts; with async commit enabled, the backend process does
not flush the xlog, but chances are that xlog buffer dirty writes
happen (although that's also rare in my case).

So I guess the real cause is the file I/O done while holding the lock. If an
I/O spike happens, it causes long query durations.

Am I correct? I look forward to any advice.

Thanks.

Regards,
Jinhua Luo




Re: [PERFORM] insert performance

2016-01-13 Thread Jinhua Luo
Hi,

I found that the insert performance is not related to the table schema.

In fact, I could reproduce the issue using a simple table:

create table test(k bigserial primary key, a int, b int, c text, d text);

test.sql:
insert into test(a, b, c, d) values(3438, 1231,
'o',
'kkk');


pgbench -r -N -n -c 4 -j 1 -T 600 -f test.sql

I also compiled and ran it on the latest 9.4 version; same issue.

Regards,
Jinhua Luo




2016-01-12 3:20 GMT+08:00 Jeff Janes :
> On Sat, Jan 9, 2016 at 9:57 PM, Jinhua Luo  wrote:
>>
>> To make a clean test env, I cloned a new table, removing the indexes (keeping
>> the primary key) and triggers, and used pgbench to test the insert statement
>> on its own.
>
> Can you share the pgbench command line, and the sql file you feed to
> it (and whatever is needed to set up the schema)?
>
>
> Thanks,
>
> Jeff




Re: [PERFORM] insert performance

2016-01-13 Thread Jinhua Luo
Hi All,

I found it's not related to file I/O.

I use systemtap to diagnose the postgres backend process.

The systemtap script is a modified version of sample-bt-off-cpu:

https://gist.github.com/kingluo/15b656998035cef193bc


Test process:

1) create a simple table:

---
create table test(k bigserial primary key, a int, b int, c text, d text);
---

2) test.sql:

---
insert into test(a, b, c, d) values(3438, 1231,
'o',
'kkk');
---

3) I run the pgbench, using 4 connections:

---
$ time pgbench -r -N -n -c 4 -j 1 -T 3600 -f test.sql
---

4) I ran the systemtap script for 3 minutes (here 987 is the pid
of one postgres backend process):

Note that the sum, max, avg, and min times are in microseconds.

---
# ./sample-bt-off-cpu -a -v -p 987 -t 180 > /tmp/postgres.bt
Pass 1: parsed user script and 110 library script(s) using
26916virt/22300res/4276shr/18740data kb, in 380usr/10sys/386real ms.
Pass 2: analyzed script: 18 probe(s), 13 function(s), 5 embed(s), 12
global(s) using 54708virt/51956res/5920shr/46532data kb, in
1870usr/360sys/2669real ms.
Pass 3: translated to C into
"/tmp/stapteVG3Q/stap_18d161acb3024931de917335496977c3_12813_src.c"
using 54708virt/52156res/6120shr/46532data kb, in
8680usr/250sys/24647real ms.
Pass 4: compiled C into
"stap_18d161acb3024931de917335496977c3_12813.ko" in
20450usr/610sys/43898real ms.
Pass 5: starting run.
WARNING: Tracing 987 (/opt/postgresql/bin/postgres)...
WARNING: Too many CFI instuctions
WARNING: Time's up. Quitting now...(it may take a while)
WARNING: query time, count=646253, sum=102991078, max=2474344, avg=159, min=83
WARNING: lock time, count=142, sum=3306500, max=1158862, avg=23285, min=16
WARNING: lwlock time, count=141272, sum=7260098, max=1383180, avg=51, min=1
WARNING: Number of errors: 0, skipped probes: 24
Pass 5: run completed in 10usr/110sys/180744real ms.
---

During those 3 minutes, postgres printed the logs below:

---
2016-01-13 23:27:21 CST [987-157] postgres@postgres LOG:  duration:
2474.304 ms  statement: insert into test(a, b, c, d) values(3438,
1231, 'o',
'kkk');
2016-01-13 23:27:48 CST [987-158] postgres@postgres LOG:  duration:
1383.359 ms  statement: insert into test(a, b, c, d) values(3438,
1231, 'o',
'kkk');
2016-01-13 23:28:33 CST [987-159] postgres@postgres LOG:  process 987
still waiting for ExclusiveLock on extension of relation 16386 of
database 12141 after 1000.212 ms
2016-01-13 23:28:33 CST [987-160] postgres@postgres DETAIL:  Process
holding the lock: 990. Wait queue: 988, 987, 989.
2016-01-13 23:28:33 CST [987-161] postgres@postgres STATEMENT:  insert
into test(a, b, c, d) values(3438, 1231,
'o',
'kkk');
2016-01-13 23:28:33 CST [987-162] postgres@postgres LOG:  process 987
acquired ExclusiveLock on extension of relation 16386 of database
12141 after 1158.818 ms
2016-01-13 23:28:33 CST [987-163] postgres@postgres STATEMENT:  insert
into test(a, b, c, d) values(3438, 1231,
'o',
'kkk');
2016-01-13 23:28:33 CST [987-164] postgres@postgres LOG:  duration:
1159.512 ms  statement: insert into test(a, b, c, d) values(3438,
1231, 'o',
'kkk');
2016-01-13 23:28:45 CST [987-165] postgres@postgres LOG:  duration:
.664 ms  statement: insert into test(a, b, c, d) values(3438,
1231, 'o',
'kkk');
---
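
As an aside, the relation OID in those "extension of relation 16386" lock
waits can be resolved to a table name with a quick query in the same database:

SELECT 16386::regclass;   -- here this should come back as the test table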

The final backtrace result is converted into a flame graph; see:
http://luajit.io/systemtap/pgsql/postgres_1.svg

Any advice?


Regards,
Jinhua Luo

2016-01-11 5:05 GMT+08:00 Jim Nasby :
> On 1/9/16 11:57 PM, Jinhua Luo wrote:
>>
>> But I do not understand why the process does so much I/O with async
>> commit. And it does not even show up in shared buffer flushing or lock
>> waiting. Where's the code path doing this I/O?
>
>
> I assume you're asking about all the IO to the heap table. That is most
> likely occurring as part of ReadBuffer(). As soon as you fill up shared
> buffers, BufferAlloc() is likely to end up with a dirty buffer, resulting in
> it calling FlushBuffer() (see src/backend/storage/buffer/bufmgr.c#1084).
>
> Note that that call is tracked by
> TRACE_POSTGRESQL_BUFFER_WRITE_DIRTY_START(), so I'd expect you to see it in
> the relevant systemtap stats.
> --
> Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
> Experts in Analytics, Data Architecture and PostgreSQL
> Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: [PERFORM] insert performance

2016-01-11 Thread Jeff Janes
On Sat, Jan 9, 2016 at 9:57 PM, Jinhua Luo  wrote:
>
> To make a clean test env, I cloned a new table, removing the indexes (keeping
> the primary key) and triggers, and used pgbench to test the insert statement
> on its own.

Can you share the pgbench command line, and the sql file you feed to
it (and whatever is needed to set up the schema)?


Thanks,

Jeff




Re: [PERFORM] insert performance

2016-01-10 Thread Jim Nasby

On 1/9/16 11:57 PM, Jinhua Luo wrote:

But I do not understand why the process does so much I/O with async
commit. And it does not even show up in shared buffer flushing or lock
waiting. Where's the code path doing this I/O?


I assume you're asking about all the IO to the heap table. That is most 
likely occurring as part of ReadBuffer(). As soon as you fill up shared 
buffers, BufferAlloc() is likely to end up with a dirty buffer, 
resulting in it calling FlushBuffer() (see 
src/backend/storage/buffer/bufmgr.c#1084).


Note that that call is tracked by 
TRACE_POSTGRESQL_BUFFER_WRITE_DIRTY_START(), so I'd expect you to see it 
in the relevant systemtap stats.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




[PERFORM] insert performance

2016-01-09 Thread Jinhua Luo
Hi All,

The database is PostgreSQL 9.3, running on Debian 7, with 8 CPU cores and
8096 MB of physical memory.

There is a big table with 70+ columns. It is written to constantly at about 700
rows/sec. It's not feasible to use COPY, because the data is not predefined
or provisioned; it's generated on demand by clients.


To make a clean test env, I cloned a new table, removing the indexes
(keeping the primary key) and triggers, and used pgbench to test the
insert statement on its own.


Here are some key items from postgresql.conf:

--

shared_buffers = 1024MB

work_mem = 32MB

maintenance_work_mem = 128MB

bgwriter_delay = 20ms

synchronous_commit = off

checkpoint_segments = 64

checkpoint_completion_target = 0.9

effective_cache_size = 4096MB

log_min_duration_statement = 1000

--

The problem:

The log shows that the duration of some inserts exceeds 1 second.

I used iotop to view the I/O activity of the pg backend worker process; it
shows that it writes several MB per second. That is the most confusing part. The
commit is async, so it should not perform and wait for the WAL writes, nor
flush shared buffers. I suspected the flush happens at shared buffer
allocation, so I modified the code to print pgBufferUsage.blk_write_time
along with the long-duration log line, but I found it's only a small fraction of
the total duration. I also added code to record the total time spent on lock
waiting within statement execution and print it, and it's also a small
fraction of the duration. I could not explain the result.

Then I used systemtap to check which files the process frequently writes out:
-
Tracing 20991 (/usr/lib/postgresql/9.3/bin/postgres)...
Please wait for 30 seconds.

=== Top 10 file writes ===
#1: 6004 times, 49184768 bytes writes in file 21986.44.
#2: 400 times, 3276800 bytes writes in file 21988.3.
#3: 12 times, 98304 bytes writes in file 57ED.
#4: 10 times, 81920 bytes writes in file 57F0.
#5: 10 times, 81920 bytes writes in file 57EE.
#6: 9 times, 73728 bytes writes in file 57F1.
#7: 9 times, 73728 bytes writes in file 57F3.
#8: 8 times, 65536 bytes writes in file 57EB.
#9: 8 times, 65536 bytes writes in file 57F2.
#10: 4 times, 32768 bytes writes in file 57EF.
-

The "21986.44" is the table data file, and the "21988.3" is the primary key
index, and the others are subtrans files.

Obviously, the process does a lot of I/O (at the vfs level), and I suspect the
duration spikes are due to I/O contention from time to time.

But I do not understand why the process does so much I/O with async
commit. And it does not even show up in shared buffer flushing or lock
waiting. Where's the code path doing this I/O?


Regards,
Jinhua Luo


Re: [PERFORM] Insert vs Update

2015-07-15 Thread Guillaume Lelarge
On 15 Jul 2015 at 11:16 PM, David G. Johnston david.g.johns...@gmail.com
wrote:

 On Wed, Jul 15, 2015 at 4:53 PM, Michael Nolan htf...@gmail.com wrote:

 On Wed, Jul 15, 2015 at 3:16 PM, Robert DiFalco robert.difa...@gmail.com
wrote:


 Thanks David, my example was a big simplification, but I appreciate
your guidance. The different event types have differing amounts of related
data. Query speed on this schema is not important, it's really the write
speed that matters. So I was just wondering given the INSERT or UPDATE
approach (with no indexed data being changed) if one is likely to be
substantially faster than the other.


 As I understand how ACID compliance is done, updating a record will
require updating any indexes for that record, even if the index keys are
not changing.  That's because any pending transactions still need to be
able to find the 'old' data, while new transactions need to be able to find
the 'new' data.  And ACID also means an update is essentially a
delete-and-insert.


 ​I might be a bit pedantic here but what you describe is a byproduct of
the specific​ implementation that PostgreSQL uses to affect Consistency
(the C in ACID) as opposed to a forgone outcome in being ACID compliant.

 http://www.postgresql.org/docs/9.4/static/mvcc-intro.html

 I'm out of my comfort zone here, but the HOT optimization is designed to
leverage the fact that an update to a row that does not affect indexed
values is able to leave the index alone; instead, during an index lookup, the
index points to the old tuple, the lookup notices that there is a chain
present, and walks that chain to find the currently active tuple.


That's true as long as the old and new tuples are stored in the same block.

 In short, if the only index is a PK an update of the row can avoid
touching that index.

 I mentioned that going from NULL to Not NULL may disrupt this but I'm
thinking I may have mis-spoken.

 Also, with separate tables the amount of data to write is going to be
less because you'd have fewer columns on the affected tables.

 While an update is a delete+insert, a delete is mostly just a bit-flip
action - at least mid-transaction.  Depending on volume, though, the
periodic impact of vacuuming may want to be taken into consideration.

-- 
Guillaume


Re: [PERFORM] Insert vs Update

2015-07-15 Thread David G. Johnston
On Wed, Jul 15, 2015 at 12:16 PM, Robert DiFalco robert.difa...@gmail.com
wrote:

 First off, I apologize if this question has been beaten to death. I've
  looked around for a simple answer and could not find one.

  Given a database that will not have its PKEY or indices modified, is it
  generally faster to INSERT or UPDATE data? And if there is a performance
  difference, is it substantial?

  I have a situation where I can easily do one or the other to the same
  effect. For example, I have a journaling schema with a limited number of
  states for an entry. Currently each state is its own table, so I just
  insert them as they occur. But I could easily have a single entry table
  where the row is updated with column information for states (after the
  entry's initial insertion).

 Not a big deal but since it's so easy for me to take either approach I was
 wondering if one was more efficient (for a large DB) than another.


​There is HOT (heap only tuple?) optimization that can occur if only
non-indexed data is altered.  I do not recall the specifics.

Dave
​


Re: [PERFORM] Insert vs Update

2015-07-15 Thread David G. Johnston
On Wednesday, July 15, 2015, Robert DiFalco robert.difa...@gmail.com
wrote:

 First off, I apologize if this question has been beaten to death. I've
  looked around for a simple answer and could not find one.

  Given a database that will not have its PKEY or indices modified, is it
  generally faster to INSERT or UPDATE data? And if there is a performance
  difference, is it substantial?


This seems odd.  If you have an option to update but choose to insert what
becomes of the other record?


Re: [PERFORM] Insert vs Update

2015-07-15 Thread Michael Nolan
I

On Wed, Jul 15, 2015 at 12:16 PM, Robert DiFalco robert.difa...@gmail.com
wrote:

 First off, I apologize if this question has been beaten to death. I've
  looked around for a simple answer and could not find one.

  Given a database that will not have its PKEY or indices modified, is it
  generally faster to INSERT or UPDATE data? And if there is a performance
  difference, is it substantial?

  I have a situation where I can easily do one or the other to the same
  effect. For example, I have a journaling schema with a limited number of
  states for an entry. Currently each state is its own table, so I just
  insert them as they occur. But I could easily have a single entry table
  where the row is updated with column information for states (after the
  entry's initial insertion).

 Not a big deal but since it's so easy for me to take either approach I was
 wondering if one was more efficient (for a large DB) than another.

 Thanks


If you think of an update as a delete-insert operation (glossing over the
fine points of what has to be done for ACID), it seems pretty clear that an
update involves more work than an insert.  Measuring that impact on
performance is probably a bit more challenging, because it's going to be
dependent on the specific table and the contents of the row, among other
things.
--
Mike Nolan
no...@tssi.com


Re: [PERFORM] Insert vs Update

2015-07-15 Thread Robert DiFalco
On Wed, Jul 15, 2015 at 11:15 AM, David G. Johnston 
david.g.johns...@gmail.com wrote:


 ​Yes, you are trying to choose between a bunch of one-to-one (optional)
 relationships versus adding additional columns to a table all of which can
 be null.

 ​I'd argue that neither option is normal (in the DB normalization sense).

 CREATE TABLE meal (meal_id bigserial)
 CREATE TABLE meal_even​t_type (meal_event_id bigserial)
 CREATE TABLE meal_event (meal_id bigint, meal_event_id bigint, occurred_at
 timestamptz)

  So now the decision is one of how to denormalize; materialized views are one
  of two ways to do so.  The specific solution would depend in part on the final
  application queries that you need to write.

  If you do want to model the de-normalized form, which I would likely be
  tempted to do given a fixed set of events that do not require additional
  related attributes, the way to do it would be to place the few event
  timestamps on the main table and UPDATE them to non-null.

 In the normal form you will likely find partial indexes to be quite useful.

 David J.
 ​


Thanks David, my example was a big simplification, but I appreciate your
guidance. The different event types have differing amounts of related data.
Query speed on this schema is not important, it's really the write speed
that matters. So I was just wondering given the INSERT or UPDATE approach
(with no indexed data being changed) if one is likely to be substantially
faster than the other.


Re: [PERFORM] Insert vs Update

2015-07-15 Thread David G. Johnston
On Wed, Jul 15, 2015 at 3:16 PM, Robert DiFalco robert.difa...@gmail.com
wrote:

 The different event types have differing amounts of related data.


​On this basis alone I would select the multiple-table version as my
baseline and only consider something different if the performance of this
was insufficient and I could prove that an alternative arrangement was more
performant.

A single optional date with meta-data embedded in the column name​

​is usually workable but if you then have a bunch of other columns with
name like:

preparation_date, preparation_col1, preparation_col2, consumed_col1,
consumed_col2, consumed_date
​

​I would find that to be undesirable.

You may be able to put Table Inheritance to good use here...

I do not know (but doubt) if HOT optimization works when going from NULL to
non-NULL since the former is stored in a bitmap while the latter occupies
normal relation space, and thus the update would likely end up writing an
entirely new record upon each event category recording.

David J.


Re: [PERFORM] Insert vs Update

2015-07-15 Thread Robert DiFalco
On Wed, Jul 15, 2015 at 12:32 PM, David G. Johnston 
david.g.johns...@gmail.com wrote:


 You may be able to put Table Inheritance to good use here...

 I do not know (but doubt) if HOT optimization works when going from NULL
 to non-NULL since the former is stored in a bitmap while the later occupies
 normal relation space and thus the update would likely end up writing an
 entirely new​ record upon each event category recording.

 David J.



Thanks!


Re: [PERFORM] Insert vs Update

2015-07-15 Thread Robert DiFalco
On Wed, Jul 15, 2015 at 10:33 AM, David G. Johnston 
david.g.johns...@gmail.com wrote:

 On Wednesday, July 15, 2015, Robert DiFalco robert.difa...@gmail.com
 wrote:

  First off, I apologize if this question has been beaten to death. I've
  looked around for a simple answer and could not find one.

  Given a database that will not have its PKEY or indices modified, is it
  generally faster to INSERT or UPDATE data? And if there is a performance
  difference, is it substantial?


 This seems odd.  If you have an option to update but choose to insert what
 becomes of the other record?



Consider the two pseudo-schemas, I'm just making this up for example
purposes:

SCHEMA A
=
meal(id SEQUENCE,user_id, started DEFAULT NOW())
meal_prepared(ref_meal_id, prepared DEFAULT NOW())
meal_abandoned(ref_meal_id, abandoned ...)
meal_consumed(ref_meal_id, consumed ...)
etc.

Then in response to different meal events you always have an insert.

aMealId = INSERT INTO meal(user_id) VALUES (aUserId);

When preparation starts:

INSERT INTO meal_prepared(ref_meal_id) VALUES (aMealId);

And so on for each event.

Compare that to this:

SCHEMA B
=
meal_event(id, started, prepared, abandoned, consumed, ...)

The start of the meal is an INSERT:

aMealId = INSERT INTO meal_event(user_id, started) VALUES (aUserId, NOW());

When preparation starts:

UPDATE meal_event SET prepared = NOW() WHERE id = aMealId;

And so on.

Basically the same data, in one case you always do inserts and add new
tables for new events. In the other case you only insert once and then
update for each state, then you add columns if you have new states.

As I said this is just an example. But in SCHEMA A you have only inserts,
lots of tables and in SCHEMA B you have a lot of updates and a lot of
possibly NULL columns if certain events don't occur.
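
Spelled out as runnable DDL, the two shapes would look roughly like this (the
column types are guesses, just to make the example concrete):

-- SCHEMA A: insert-only, one table per event type
CREATE TABLE meal (
    id      bigserial PRIMARY KEY,
    user_id bigint NOT NULL,
    started timestamptz NOT NULL DEFAULT now()
);
CREATE TABLE meal_prepared (
    ref_meal_id bigint NOT NULL REFERENCES meal (id),
    prepared    timestamptz NOT NULL DEFAULT now()
);
-- ... one table per additional event (abandoned, consumed, ...)

-- SCHEMA B: one wide row per meal, updated as each event occurs
CREATE TABLE meal_event (
    id        bigserial PRIMARY KEY,
    user_id   bigint NOT NULL,
    started   timestamptz NOT NULL DEFAULT now(),
    prepared  timestamptz,
    abandoned timestamptz,
    consumed  timestamptz
);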

Is that more clear?

R.


Re: [PERFORM] Insert vs Update

2015-07-15 Thread David G. Johnston
On Wed, Jul 15, 2015 at 1:56 PM, Robert DiFalco robert.difa...@gmail.com
wrote:



 On Wed, Jul 15, 2015 at 10:33 AM, David G. Johnston 
 david.g.johns...@gmail.com wrote:

 On Wednesday, July 15, 2015, Robert DiFalco robert.difa...@gmail.com
 wrote:

  First off, I apologize if this question has been beaten to death. I've
  looked around for a simple answer and could not find one.

  Given a database that will not have its PKEY or indices modified, is it
  generally faster to INSERT or UPDATE data? And if there is a performance
  difference, is it substantial?


 This seems odd.  If you have an option to update but choose to insert
 what becomes of the other record?



 Consider the two pseudo-schemas, I'm just making this up for example
 purposes:

 SCHEMA A
 =
 meal(id SEQUENCE,user_id, started DEFAULT NOW())
 meal_prepared(ref_meal_id, prepared DEFAULT NOW())
 meal_abandoned(ref_meal_id, abandoned ...)
 meal_consumed(ref_meal_id, consumed ...)
 etc.

 Then in response to different meal events you always have an insert.

 aMealId = INSERT INTO meal(user_id) VALUES (aUserId);

 When preparation starts:

 INSERT INTO meal_prepared(ref_meal_id) VALUES (aMealId);

 And so on for each event.

 Compare that to this:

 SCHEMA B
 =
 meal_event(id, started, prepared, abandoned, consumed, ...)

 The start of the meal is an INSERT:

 aMealId = INSERT INTO meal_event(user_id, started) VALUES (aUserId, NOW());

 When preparation starts:

 UPDATE meal_event SET prepared = NOW() WHERE id = aMealId;

 And so on.

 Basically the same data, in one case you always do inserts and add new
 tables for new events. In the other case you only insert once and then
 update for each state, then you add columns if you have new states.

 As I said this is just an example. But in SCHEMA A you have only inserts,
 lots of tables and in SCHEMA B you have a lot of updates and a lot of
 possibly NULL columns if certain events don't occur.

 Is that more clear?


​Yes, you are trying to choose between a bunch of one-to-one (optional)
relationships versus adding additional columns to a table all of which can
be null.

​I'd argue that neither option is normal (in the DB normalization sense).

CREATE TABLE meal (meal_id bigserial)
CREATE TABLE meal_even​t_type (meal_event_id bigserial)
CREATE TABLE meal_event (meal_id bigint, meal_event_id bigint, occurred_at
timestamptz)

So now the decision is one of how to denormalize; materialized views are one
of two ways to do so.  The specific solution would depend in part on the final
application queries that you need to write.

If you do want to model the de-normalized form, which I would likely be
tempted to do given a fixed set of events that do not require additional
related attributes, the way to do it would be to place the few event
timestamps on the main table and UPDATE them to non-null.

In the normal form you will likely find partial indexes to be quite useful.
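
For instance, a sketch against the meal_event table above (the event type id
is hypothetical):

-- index only the rows for one event type, e.g. 'consumed'
CREATE INDEX meal_event_consumed_idx
    ON meal_event (meal_id)
    WHERE meal_event_id = 3;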

David J.
​


Re: [PERFORM] Insert vs Update

2015-07-15 Thread David G. Johnston
On Wed, Jul 15, 2015 at 4:53 PM, Michael Nolan htf...@gmail.com wrote:

 On Wed, Jul 15, 2015 at 3:16 PM, Robert DiFalco robert.difa...@gmail.com
 wrote:


 Thanks David, my example was a big simplification, but I appreciate your
 guidance. The different event types have differing amounts of related data.
 Query speed on this schema is not important, it's really the write speed
 that matters. So I was just wondering given the INSERT or UPDATE approach
 (with no indexed data being changed) if one is likely to be substantially
 faster than the other.


 As I understand how ACID compliance is done, updating a record will
 require updating any indexes for that record, even if the index keys are
 not changing.  That's because any pending transactions still need to be
 able to find the 'old' data, while new transactions need to be able to find
 the 'new' data.  And ACID also means an update is essentially a
 delete-and-insert.


​I might be a bit pedantic here but what you describe is a byproduct of the
specific​ implementation that PostgreSQL uses to affect Consistency (the C
in ACID) as opposed to a forgone outcome in being ACID compliant.

http://www.postgresql.org/docs/9.4/static/mvcc-intro.html

I'm out of my comfort zone here, but the HOT optimization is designed to
leverage the fact that an update to a row that does not affect indexed
values is able to leave the index alone; instead, during an index lookup, the
index points to the old tuple, the lookup notices that there is a chain
present, and walks that chain to find the currently active tuple.

In short, if the only index is a PK an update of the row can avoid touching
that index.
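
Whether that is actually happening is easy to check from the statistics views,
e.g. (table name taken from the earlier example in this thread):

-- n_tup_hot_upd vs. n_tup_upd shows how many updates were HOT
SELECT n_tup_upd, n_tup_hot_upd
FROM pg_stat_user_tables
WHERE relname = 'meal_event';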

I mentioned that going from NULL to Not NULL may disrupt this but I'm
thinking I may have mis-spoken.

Also, with separate tables the amount of data to write is going to be less
because you'd have fewer columns on the affected tables.

While an update is a delete+insert, a delete is mostly just a bit-flip
action - at least mid-transaction.  Depending on volume, though, the
periodic impact of vacuuming may want to be taken into consideration.

David J.


Re: [PERFORM] Insert vs Update

2015-07-15 Thread Michael Nolan
On Wed, Jul 15, 2015 at 3:16 PM, Robert DiFalco robert.difa...@gmail.com
wrote:


 Thanks David, my example was a big simplification, but I appreciate your
 guidance. The different event types have differing amounts of related data.
 Query speed on this schema is not important, it's really the write speed
 that matters. So I was just wondering given the INSERT or UPDATE approach
 (with no indexed data being changed) if one is likely to be substantially
 faster than the other.


As I understand how ACID compliance is done, updating a record will require
updating any indexes for that record, even if the index keys are not
changing.  That's because any pending transactions still need to be able to
find the 'old' data, while new transactions need to be able to find the
'new' data.  And ACID also means an update is essentially a
delete-and-insert.
--
Mike Nolan


[PERFORM] Insert vs Update

2015-07-15 Thread Robert DiFalco
First off, I apologize if this question has been beaten to death. I've
looked around for a simple answer and could not find one.

Given a database that will not have its PKEY or indices modified, is it
generally faster to INSERT or UPDATE data? And if there is a performance
difference, is it substantial?

I have a situation where I can easily do one or the other to the same
effect. For example, I have a journaling schema with a limited number of
states for an entry. Currently each state is its own table, so I just
insert them as they occur. But I could easily have a single entry table
where the row is updated with column information for states (after the
entry's initial insertion).

Not a big deal but since it's so easy for me to take either approach I was
wondering if one was more efficient (for a large DB) than another.

Thanks!


Re: [PERFORM] Insert performance for large transaction with multiple COPY FROM

2013-01-18 Thread Horst Dehmer
Hey Jeff (and others)!

First of all: thanks for your detailed explanations and guidelines.


On 17.01.2013, at 18:12, Jeff Janes jeff.ja...@gmail.com wrote:

 So the theory is that the presence of idx_4 is causing the trigger to
 pick a poor plan (i.e. one using idx_4) while its absence removes that
 temptation?

Yes. And auto_explain confirms this for the first record
(obj_item_loc_obj_item_id_idx = idx_4 from my last mail):

2013-01-18 22:50:21 CET LOG:  duration: 0.021 ms  plan:
Query Text: SELECT * FROM obj_item_loc WHERE obj_item_loc.obj_item_id = 
NEW.obj_item_id AND obj_item_loc.loc_id = NEW.loc_id AND 
obj_item_loc.obj_item_loc_ix = NEW.obj_item_loc_ix
Index Scan using obj_item_loc_obj_item_id_idx on obj_item_loc  
(cost=0.00..8.27 rows=1 width=382)
  Index Cond: (obj_item_id = $15)
  Filter: ((loc_id = $16) AND (obj_item_loc_ix = $17))
2013-01-18 22:50:21 CET CONTEXT:  SQL statement SELECT * FROM obj_item_loc 
WHERE obj_item_loc.obj_item_id = NEW.obj_item_id AND obj_item_loc.loc_id = 
NEW.loc_id AND obj_item_loc.obj_item_loc_ix = NEW.obj_item_loc_ix
PL/pgSQL function obj_item_loc_before_insert() line 5 at SQL statement
COPY obj_item_loc, line 1: 1090301122417276
109030102241000151101090002410029720\N  \N  \N  \N  
\N  \N  \N  \N  \N  \N  \N  \N  \...

and for one of the last records:

2013-01-18 22:53:20 CET LOG:  duration: 16.088 ms  plan:
Query Text: SELECT * FROM obj_item_loc WHERE obj_item_loc.obj_item_id = 
NEW.obj_item_id AND obj_item_loc.loc_id = NEW.loc_id AND 
obj_item_loc.obj_item_loc_ix = NEW.obj_item_loc_ix
Index Scan using obj_item_loc_obj_item_id_idx on obj_item_loc  
(cost=0.00..8.27 rows=1 width=382)
  Index Cond: (obj_item_id = $15)
  Filter: ((loc_id = $16) AND (obj_item_loc_ix = $17))

I see a linear increase of the duration from 0.0x ms to over 16 ms (apart from
a few nasty outliers at about 22 ms). Although even at the end there are
still a few durations < 0.03 ms, it's mostly 15 ms and above.

 True disk reads are much more expensive, but given how few reads you
 have relative to hits, I now think that in aggregate the hits are more
 of a concern than the reads are.  In other words, you seem to be CPU
 bound, not IO bound.

Yes, definitely CPU bound, as top shows 99+% CPU utilization.

 Even more so I think (but not with much confidence) that most of your
 reads are actually coming from the OS cache and not from the disk.
 PG cannot distinguish true disk reads from OS cache reads.
 
 When was the last time you reset the stats?  That is, are your
 reported numbers accumulated over several loads, with some having idx4
 and some not?

I set up a fresh database before each test run. So the stats should be clean.

>  More formally, use auto_explain and set
 auto_explain.log_nested_statements to true.  I haven't verified this
 works with triggers, just going by the description I think it should.

Nice tip! Works for triggers as well.

 Your use case is a little unusual.  If you are bulk loading into an
 initially empty table, usually you would remove the trigger and add it
 after the load (with some kind of bulk operation to make up for
 whatever it was the trigger would have been doing).  On the other
 hand, if you are bulk loading into a live table and so can't drop
 the trigger, then the live table should have good-enough preexisting
 statistics to make the trigger choose a good plan.

My case is indeed unusual as for the whole model of 276 tables there will never 
be an update nor a delete on any row.
The model is rather short-lived, from a few hours to a few months. COPY FROM/TO 
are the only ways to get data into the database and back out. And in between 
there is lots of graph traversal and calculation of convex hulls. But the
lengthy transactions are by far not the common case.

Having said that, I'm no longer sure if a RDBMS is the right tool for the 
backend. Maybe indexing and storing with a plain full text search engine is. 
Dunno...

Thanks again!

--
Horst



Re: [PERFORM] Insert performance for large transaction with multiple COPY FROM

2013-01-17 Thread Jeff Janes
On Tue, Jan 15, 2013 at 3:44 PM, Horst Dehmer horst.deh...@gmail.com wrote:


  idx_4 together with a simple select in the table's on-insert trigger is
  slowing things down considerably.

So the theory is that the presence of idx_4 is causing the trigger to
pick a poor plan (i.e. one using idx_4) while its absence removes that
temptation?


 pg_statio comes up with same big numbers (reads = bad, hits = not so bad?):

True disk reads are much more expensive, but given how few reads you
have relative to hits, I now think that in aggregate the hits are more
of a concern than the reads are.  In other words, you seem to be CPU
bound, not IO bound.

Even more so I think (but not with much confidence) that most of your
reads are actually coming from the OS cache and not from the disk.
PG cannot distinguish true disk reads from OS cache reads.

When was the last time you reset the stats?  That is, are your
reported numbers accumulated over several loads, with some having idx4
and some not?

...

 Now I have some (more) questions:

 1. How do I know which index (if any) is chosen for a select statement
 inside a trigger during a bulk load transaction? (or for that matter: a
 series of recursive plpgsql functions)

Informally, reset your database to the state it was in before the
load, analyze it, and do the explain again before you do the load.

More formally, use auto_explain and set
auto_explain.log_nested_statements to true.  I haven't verified this
works with triggers, just going by the description I think it should.
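
Something along these lines in the session doing the load (these are the
stock auto_explain settings) should capture the nested plans:

LOAD 'auto_explain';
SET auto_explain.log_min_duration = 0;        -- log the plan of every statement
SET auto_explain.log_nested_statements = on;  -- include statements run inside functions/triggers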

 2. The query planner depends on stats collected by auto-vacuum/vacuum
 analyze, right? Does stats collecting also happen during a lengthy
 transaction?

My understanding is that a transaction will not dump its stats until
the commit, so the auto analyze will not occur *due to* the lengthy
transaction until after it is over.  But if the table was already due
for analyze anyway due to previous or concurrent shorter transactions,
the analyze will happen.  However, the lengthy transaction might not
see the results of the analyze (I'm not clear on the transaction
snapshot semantics of the statistics tables) and even if it did see
them, it might just be using cached plans and so would not change the
plan in the middle.

 3. Is it possible (or even advisable) to trigger vacuum analyze inside an
 ongoing transaction. Let's say load 10,000 rows of table A, analyze table A,
 insert the next 10,000 rows, analyze again, ...

You can't vacuum inside a transaction.  You can analyze, but I don't
know if it would be advisable.

Your use case is a little unusual.  If you are bulk loading into an
initially empty table, usually you would remove the trigger and add it
after the load (with some kind of bulk operation to make up for
whatever it was the trigger would have been doing).  On the other
hand, if you are bulk loading into a live table and so can't drop
the trigger, then the live table should have good-enough preexisting
statistics to make the trigger choose a good plan.

Cheers,

Jeff




Re: [PERFORM] Insert performance for large transaction with multiple COPY FROM

2013-01-15 Thread Horst Dehmer
After more testing I have gained some insights:

The foreign key constraints are NOT responsible for the low COPY FROM 
performance in my case. I forgot about the indexes which are created along with 
the FK constraints.

Besides the primary key

CONSTRAINT obj_item_loc_pkey PRIMARY KEY (obj_item_id, loc_id, obj_item_loc_ix),

the table OBJ_ITEM_LOC has four additional indexes (let's call them idx_1 
through idx_4)

CREATE INDEX idx_1 ON obj_item_loc USING btree (rec_id);
CREATE INDEX idx_2 ON obj_item_loc USING btree (loc_id);
CREATE INDEX idx_3 ON obj_item_loc USING btree (rptd_id);
CREATE INDEX idx_4 ON obj_item_loc USING btree (obj_item_id);

The indexes 2 to 4 are intended to speed up joins between OBJ_ITEM_LOC and
LOC (loc_id), RPTD (rptd_id) and OBJ_ITEM (obj_item) respectively (and I'm 
highly suspicious if this makes sense at all.)

idx_4 together with a simple select in the table's on-insert trigger is slowing
things down considerably.
With idx_4 and the trigger rates are

 44100 rows, 0:00:04.576,   9637 r/s: LOC
  2101 rows, 0:00:00.221,   9506 r/s: OBJ_ITEM
  2101 rows, 0:00:00.278,   7557 r/s: ORG
 94713 rows, 0:00:18.502,   5119 r/s: RPTD
 44100 rows, 0:03:03.437,240 r/s: OBJ_ITEM_LOC
imported 187115 record in 0:03:27.081 = 903 r/s

pg_statio comes up with same big numbers (reads = bad, hits = not so bad?):

   relname    | heap_blks_read | heap_blks_hit | idx_blks_read | idx_blks_hit
--------------+----------------+---------------+---------------+--------------
 obj_item_loc |   1262 |   9908013 |  1199 |  1682005
 rptd |   4434 |279022 |  1806 |  1270746
 org  | 38 |191559 |19 |   201071
 obj_item | 84 | 92476 |29 |   104134
 loc  |768 | 88902 |   597 |   352680
(5 rows)

Dropping idx_1, idx_2 and idx_3 at the same time has no significant impact. But 
take away idx_4 only:

 44100 rows, 0:00:04.558,   9675 r/s: LOC
  2101 rows, 0:00:00.220,   9593 r/s: OBJ_ITEM
  2101 rows, 0:00:00.275,   7640 r/s: ORG
 94713 rows, 0:00:18.407,   5145 r/s: RPTD
 44100 rows, 0:00:11.433,   3857 r/s: OBJ_ITEM_LOC
imported 187115 record in 0:00:34.938 = 5355 r/s

Hm, not bad. Now for the select statement in the on insert trigger:

SELECT *
FROM   obj_item_loc
WHERE  obj_item_loc.obj_item_id = NEW.obj_item_id
AND    obj_item_loc.loc_id = NEW.loc_id
AND    obj_item_loc.obj_item_loc_ix = NEW.obj_item_loc_ix
INTO   old;

Executing this query AFTER the bulk insert (and probably some auto-vacuuming) 
the query plan looks like this

explain analyze
select *
from   obj_item_loc
where  (obj_item_id, loc_id, obj_item_loc_ix) =
       (10903011224100014650, 10903010224100089226, 1090002410140894)

QUERY PLAN   
--
 Index Scan using obj_item_loc_loc_id_idx on obj_item_loc  
(cost=0.00..8.36 rows=1 width=329) 
(actual time=0.039..0.040 rows=1 loops=1)
   Index Cond: (loc_id = 10903010224100089226::numeric)
   Filter: ((obj_item_id = 10903011224100014650::numeric) AND 
(obj_item_loc_ix = 1090002410140894::numeric))
 Total runtime: 0.079 ms

After some head-scratching I realized that obj_item_id is referencing a
meager 2101 rows, which probably does not make for a good index candidate. So the
query plan makes some sense, I guess.

Now I have some (more) questions:

1. How do I know which index (if any) is chosen for a select statement inside a 
trigger during a bulk load transaction? (or for that matter: a series of 
recursive plpgsql functions)
2. The query planner depends on stats collected by auto-vacuum/vacuum analyze, 
right? Does stats collecting also happen during a lengthy transaction? 
3. Is it possible (or even advisable) to trigger vacuum analyze inside an 
ongoing transaction. Let's say load 10,000 rows of table A, analyze table A, 
insert the next 10,000 rows, analyze again, ...

I'm sorry if this is basic stuff I'm asking here, but especially point 2 is 
bothering me.

--
Kind regards
Horst Dehmer

On 12.01.2013, at 01:17, Jeff Janes jeff.ja...@gmail.com wrote:

 On Friday, January 11, 2013, Horst Dehmer wrote:
 
 Except - and that's the wall I'm hitting - for one table which yielded just 
 75 records/second.
 The main 'problem' seem to be the FK constraints. Dropping just them restored 
 insert performance for this table to 6k records/s.
 
 It sure sounds like you don't have enough RAM to hold the foreign-key table 
 data needed to check the constraints, so every insert needs one disk 
 revolution to fetch the data.
 
 If you drop the indexes and constraints one at a time until it speeds up, is 
 there a certain one that is the culprit? 
 
 You can look in pg_statio_user_tables to see what tables and indexes have 
 

Re: [PERFORM] Insert performance for large transaction with multiple COPY FROM

2013-01-12 Thread Horst Dehmer
Yes, the ids are something I don't like either.
They carry additional semantics, which I cannot make go away.
What are the chances that char(20) is more time-efficient than numeric(20)?
Disk space is no problem here.


On 12.01.2013, at 02:17, Claudio Freire klaussfre...@gmail.com wrote:

 On Fri, Jan 11, 2013 at 8:55 PM, Horst Dehmer horst.deh...@gmail.com wrote:
 Except - and that's the wall I'm hitting - for one table which yielded just
 75 records/second.
 The main 'problem' seem to be the FK constraints. Dropping just them
 restored insert performance for this table to 6k records/s. The table in
 question has a composite PK (3 columns), 3 foreign keys and a bunch of
 indexes (see table obj_item_loc at the end of the mail). Compared to the
 other 32 tables nothing unusual.
 I'd gladly supply more information if necessary.
 ...
 CREATE TABLE obj_item_loc
 (
  obj_item_id numeric(20,0) NOT NULL,
  loc_id numeric(20,0) NOT NULL,
  obj_item_loc_ix numeric(20,0) NOT NULL,
 
 That sounds a lot like a missing index on the target relations (or
 indices that are unusable).
 
 Those numeric ids look really unusual. Why not bigint? It's close to
 the same precision, but native, faster, more compact, and quite
 unambiguous when indices are involved. If the types don't match on
 both tables, it's quite likely indices won't be used when checking the
 FK, and that spells trouble.



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Insert performance for large transaction with multiple COPY FROM

2013-01-12 Thread Jeff Janes
On Fri, Jan 11, 2013 at 5:17 PM, Claudio Freire klaussfre...@gmail.com wrote:
 On Fri, Jan 11, 2013 at 8:55 PM, Horst Dehmer horst.deh...@gmail.com wrote:
 Except - and that's the wall I'm hitting - for one table which yielded just
 75 records/second.
 The main 'problem' seem to be the FK constraints. Dropping just them
 restored insert performance for this table to 6k records/s. The table in
 question has a composite PK (3 columns), 3 foreign keys and a bunch of
 indexes (see table obj_item_loc at the end of the mail). Compared to the
 other 32 tables nothing unusual.
 I'd gladly supply more information if necessary.
 ...
 CREATE TABLE obj_item_loc
 (
   obj_item_id numeric(20,0) NOT NULL,
   loc_id numeric(20,0) NOT NULL,
   obj_item_loc_ix numeric(20,0) NOT NULL,

 That sounds a lot like a missing index on the target relations (or
 indices that are unusable).

 Those numeric ids look really unusual. Why not bigint? It's close to
 the same precision, but native, faster, more compact, and quite
 unambiguous when indices are involved. If the types don't match on
 both tables, it's quite likely indices won't be used when checking the
 FK, and that spells trouble.

Will PG allow you to add a FK constraint where there is no usable
index on the referenced side?

I have failed to do so, but perhaps I am not being devious enough.

Cheers,

Jeff


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Insert performance for large transaction with multiple COPY FROM

2013-01-12 Thread Claudio Freire
On Sat, Jan 12, 2013 at 5:16 PM, Horst Dehmer horst.deh...@gmail.com wrote:
 Yes, the ids is something I don't like either.
 They carry additional semantics, which I cannot make go away.
 How are chances char(20) is more time efficient than numeric(20)?
 Disk space is no problem here.

What are the other tables like then?

The exact data types involved are at issue here, so it matters.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Insert performance for large transaction with multiple COPY FROM

2013-01-12 Thread Tom Lane
Jeff Janes jeff.ja...@gmail.com writes:
 Will PG allow you to add a FK constraint where there is no usable
 index on the referenced side?

It will not, because the referenced side must have a unique constraint,
ie an index.

The standard performance gotcha here is not having an index on the
referencing side.  But that only hurts when doing UPDATEs/DELETEs of
referenced-side keys, which as far as I gathered was not the OP's
scenario.
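
(For anyone who does hit that gotcha, the fix is simply an index on the
referencing column; a sketch against the table from this thread, with a
made-up index name:

CREATE INDEX obj_item_loc_rptd_id_idx ON obj_item_loc (rptd_id);

and likewise for any other referencing column that lacks one.)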

regards, tom lane


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Insert performance for large transaction with multiple COPY FROM

2013-01-12 Thread Horst Dehmer
The types referenced by the foreign keys are the same numeric(20). 
Since the complete schema (of about 300 tables) is generated, I will just try 
char(20) instead of numeric(20) in the next few days to see if it makes any 
difference. Which I somehow doubt. 

But first I'm following the lead of the tables/indexes iostats given by Jeff.


obj_item_loc references the following three tables and there should be no 
surprises.

CREATE UNLOGGED TABLE loc
(
  loc_id numeric(20,0) NOT NULL, 
...
  CONSTRAINT loc_pkey PRIMARY KEY (loc_id),
…
)

CREATE UNLOGGED TABLE obj_item
(
  obj_item_id numeric(20,0) NOT NULL, 
...
  CONSTRAINT obj_item_pkey PRIMARY KEY (obj_item_id),
…
)

CREATE UNLOGGED TABLE rptd
(
  rptd_id numeric(20,0) NOT NULL, 
...
  CONSTRAINT rptd_pkey PRIMARY KEY (rptd_id),
…
)


On 12.01.2013, at 23:18, Claudio Freire klaussfre...@gmail.com wrote:

 On Sat, Jan 12, 2013 at 5:16 PM, Horst Dehmer horst.deh...@gmail.com wrote:
 Yes, the ids is something I don't like either.
 They carry additional semantics, which I cannot make go away.
 How are chances char(20) is more time efficient than numeric(20)?
 Disk space is no problem here.
 
 What are the other tables like then?
 
 The exact data types involved are at issue here, so it matters.



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Insert performance for large transaction with multiple COPY FROM

2013-01-12 Thread Claudio Freire
On Sat, Jan 12, 2013 at 7:41 PM, Horst Dehmer horst.deh...@gmail.com wrote:
 Since the complete schema (of about 300 tables) is generated, I will just try 
 char(20) instead of numeric(20) in the next days to see if it makes any 
 difference. Which I somehow doubt.

I think that might just make it worse.

Well, maybe the others were right, and it's just that you're hitting
the disk on that particular table.

That, or it's all those CHECK constraints. Have you tried removing the
CHECK constraints? (They're a heapload of function calls.)


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Insert performance for large transaction with multiple COPY FROM

2013-01-11 Thread Horst Dehmer
Hi!

I see a massive performance drop when writing a large transaction. I'm writing 
data for 33 tables with COPY FROM directly from streams in Scala/Java. Across all 
tables there are 2.2M records, unevenly distributed from 1 record up to 315k 
records per table.

For comparison I ran a test where I used UNLOGGED tables, no PK/FK constraints, 
no other constraints or indexes, and no triggers for any table. The insert 
rate for this scenario is well above 105k records/second over all tables (which 
I think is really cool!).

Turning everything on (but still with UNLOGGED tables), i.e. PK/FK, additional 
indexes, some column check constraints and a trigger for each table which 
basically inserts one additional record into another table, the rates dropped, 
as expected, to around 6k to 7k records/second.

Except - and that's the wall I'm hitting - for one table, which yielded just 75 
records/second.
The main 'problem' seems to be the FK constraints. Dropping just those restored 
insert performance for this table to 6k records/s. The table in question has a 
composite PK (3 columns), 3 foreign keys and a bunch of indexes (see table 
obj_item_loc at the end of the mail). Compared to the other 32 tables, nothing 
unusual.
I'd gladly supply more information if necessary.

Dropping and recreating constraints/indexes is (for now) not a viable alternative, 
since I have to write such transactions into an already populated database.
What I'm trying to understand is which limit I'm hitting here. I need 
some advice on how to 'profile' this situation.

Configuration is more or less standard, except WAL settings (which should not 
be relevant here).
EnterpriseDB One Click installer.

Any hint is really appreciated. 
Thanks!

--
Horst Dehmer



PostgreSQL 9.2.1 on x86_64-apple-darwin, compiled by 
i686-apple-darwin11-llvm-gcc-4.2 (GCC) 4.2.1 (Based on Apple Inc. build 5658) 
(LLVM build 2336.9.00), 64-bit
OS X 10.8.2
Mid-2012 MacBook Pro 16 GB, 512 GB SSD

bytea_output;escape
checkpoint_completion_target;0.9
checkpoint_segments;32
client_encoding;UNICODE
client_min_messages;notice
lc_collate;en_US.UTF-8
lc_ctype;en_US.UTF-8
listen_addresses;*
log_checkpoints;on
log_destination;stderr
log_line_prefix;%t 
logging_collector;on
max_connections;100
max_stack_depth;2MB
port;5432
server_encoding;UTF8
shared_buffers;24MB
TimeZone;Europe/Vienna
wal_buffers;768kB

/etc/sysctl.conf
kern.sysv.shmmax=1610612736
kern.sysv.shmall=393216
kern.sysv.shmmin=1
kern.sysv.shmmni=256
kern.sysv.shmseg=64
kern.maxprocperuid=512
kern.maxproc=2048

CREATE TABLE obj_item_loc
(
  obj_item_id numeric(20,0) NOT NULL,
  loc_id numeric(20,0) NOT NULL,
  obj_item_loc_ix numeric(20,0) NOT NULL, 
  ver_acc_dim numeric(12,3), 
  horz_acc_dim numeric(12,3), 
  brng_angle numeric(7,4), 
  brng_acc_angle numeric(7,4), 
  brng_precision_code character varying(6), 
  incl_angle numeric(7,4), 
  incl_acc_angle numeric(7,4), 
  incl_precision_code character varying(6), 
  speed_rate numeric(8,4), 
  speed_acc_rate numeric(8,4), 
  speed_precision_code character varying(6), 
  meaning_code character varying(6), 
  rel_speed_code character varying(6), 
  rptd_id numeric(20,0) NOT NULL,
  creator_id numeric(20,0) NOT NULL, 
  update_seqnr numeric(15,0) NOT NULL, 
  rec_id bigint DEFAULT nextval('rec_seq'::regclass),
  CONSTRAINT obj_item_loc_pkey PRIMARY KEY (obj_item_id, loc_id, 
obj_item_loc_ix),
  CONSTRAINT obj_item_loc_4fbc75641175ef1757ca310dd34e34ee_fkey FOREIGN KEY 
(obj_item_id)
  REFERENCES obj_item (obj_item_id) MATCH SIMPLE
  ON UPDATE NO ACTION ON DELETE NO ACTION,
  CONSTRAINT obj_item_loc_7895d64f5557b1e382c36d41212a3696_fkey FOREIGN KEY 
(rptd_id)
  REFERENCES rptd (rptd_id) MATCH SIMPLE
  ON UPDATE NO ACTION ON DELETE NO ACTION,
  CONSTRAINT obj_item_loc_8d919243f69bcc599873caca07ac9888_fkey FOREIGN KEY 
(loc_id)
  REFERENCES loc (loc_id) MATCH SIMPLE
  ON UPDATE NO ACTION ON DELETE NO ACTION,
  CONSTRAINT obj_item_loc_brng_acc_angle_ra_check CHECK 
(br_check_305(brng_acc_angle = 0::numeric AND brng_acc_angle = 359., 
'obj_item_loc'::text, 'brng_acc_angle'::text, brng_acc_angle::text)),
  CONSTRAINT obj_item_loc_brng_angle_ra_check CHECK (br_check_305(brng_angle = 
0::numeric AND brng_angle = 359., 'obj_item_loc'::text, 
'brng_angle'::text, brng_angle::text)),
  CONSTRAINT obj_item_loc_brng_precision_code_check CHECK 
(br_check_305(brng_precision_code::text = ANY (ARRAY['1000MN'::text, 
'100MN'::text, '100SEC'::text, '10DEG'::text, '10MN'::text, '10SEC'::text, 
'DEGREE'::text, 'MIL'::text, 'MINUTE'::text, 'SECOND'::text]), 
'obj_item_loc'::text, 'brng_precision_code'::text, brng_precision_code::text)),
  CONSTRAINT obj_item_loc_incl_acc_angle_ra_check CHECK 
(br_check_305(incl_acc_angle = 0::numeric AND incl_acc_angle = 359., 
'obj_item_loc'::text, 'incl_acc_angle'::text, incl_acc_angle::text)),
  CONSTRAINT obj_item_loc_incl_angle_ra_check CHECK (br_check_305(incl_angle = 
0::numeric AND incl_angle = 359., 

Re: [PERFORM] Insert performance for large transaction with multiple COPY FROM

2013-01-11 Thread Claudio Freire
On Fri, Jan 11, 2013 at 8:55 PM, Horst Dehmer horst.deh...@gmail.com wrote:
 Except - and that's the wall I'm hitting - for one table which yielded just
 75 records/second.
 The main 'problem' seem to be the FK constraints. Dropping just them
 restored insert performance for this table to 6k records/s. The table in
 question has a composite PK (3 columns), 3 foreign keys and a bunch of
 indexes (see table obj_item_loc at the end of the mail). Compared to the
 other 32 tables nothing unusual.
 I'd gladly supply more information if necessary.
...
 CREATE TABLE obj_item_loc
 (
   obj_item_id numeric(20,0) NOT NULL,
   loc_id numeric(20,0) NOT NULL,
   obj_item_loc_ix numeric(20,0) NOT NULL,

That sounds a lot like a missing index on the target relations (or
indices that are unusable).

Those numeric ids look really unusual. Why not bigint? It's close to
the same precision, but native, faster, more compact, and quite
unambiguous when indices are involved. If the types don't match on
both tables, it's quite likely indices won't be used when checking the
FK, and that spells trouble.
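
A reduced sketch of what that change would look like, keeping only the
FK-relevant columns (not the full schema from the original mail):

CREATE TABLE obj_item (
    obj_item_id bigint PRIMARY KEY
);

CREATE TABLE obj_item_loc (
    obj_item_id     bigint NOT NULL REFERENCES obj_item (obj_item_id),
    loc_id          bigint NOT NULL,
    obj_item_loc_ix bigint NOT NULL,
    PRIMARY KEY (obj_item_id, loc_id, obj_item_loc_ix)
);

With identical native types on both sides, the FK check can use the referenced
table's primary key index directly.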


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] insert

2011-08-01 Thread Vitalii Tymchyshyn

Hello.

Please note that in a multitasking environment you may have problems with 
your code. Two connections may check whether a value is available and, if not 
(both got an empty select result), try to insert it. One will succeed, 
the other will fail if you have a unique constraint on the category name (and 
you'd better have one).


Please note that SELECT FOR UPDATE won't help you much, since it is a 
new record you are looking for, and the select doesn't return (and lock) it. I 
use LOCK TABLE tableName IN SHARE ROW EXCLUSIVE MODE in this case.


But then, if you have multiple lookup dictionaries, you need to ensure a 
strict order of locking or you will get deadlocks. As for me, I created a 
special application-side class to retrieve such values. If 
I can't find a value in the main connection with a simple select, I open a new 
connection, perform the table lock, and check if the value is in there. If it is 
not, I add the value and commit. This may produce orphaned dictionary 
entries (if the dictionary entry is committed and the main transaction is rolled 
back), but this is usually OK for dictionaries. At the same time I don't 
introduce hard locks into the main transaction and don't have to worry about 
deadlocks.
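
In SQL terms the extra connection does roughly this (a sketch, reusing the
category/name example from this thread):

BEGIN;
LOCK TABLE category IN SHARE ROW EXCLUSIVE MODE;
SELECT category_id FROM category WHERE name = 'books';
-- only if the SELECT returned no row:
INSERT INTO category (name) VALUES ('books');
COMMIT;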


Best regards, Vitalii Tymchyshyn


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] insert

2011-08-01 Thread Kevin Grittner
Vitalii Tymchyshyn tiv...@gmail.com wrote:
 
 Please note that in multitasking environment you may have problems
 with your code. Two connections may check if a is available and
 if not (and both got empty select result), try to insert. One
 will succeed, another will fail if you have a unique constraint on
 category name (and you'd better have one).
 
 Please note that select for update won't help you much, since this
 is new record you are looking for, and select don't return (and
 lock) it. I am using lock table tableName in SHARE ROW
 EXCLUSIVE mode in this case.
 
 But then, if you have multiple lookup dictinaries, you need to
 ensure strict order of locking or you will be getting deadlocks.
 As for me, I did create a special application-side class to
 retrieve such values. If I can't find a value in main connection
 with simple select, I open new connection, perform table lock,
 check if value is in there. If it is not, add the value and
 commit. This may produce orphaned dictionary entries (if
 dictionary entry is committed an main transaction is rolled back),
 but this is usually OK for dictionaries. At the same time I don't
 introduce hard locks into main transaction and don't have to worry
 about deadlocks.
 
It sounds like you might want to check out the new truly
serializable transactions in version 9.1.  If you can download the
latest beta version of it and test with
default_transaction_isolation = 'serializable' I would be interested
to hear your results.  Note that you can't have deadlocks, but you
can have other types of serialization failures, so your software
needs to be prepared to start a transaction over from the beginning
when the SQLSTATE of a failure is '40001'.
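
A sketch of the client-side pattern (the retry loop itself has to live in your
application code; the category table is just the example from this thread):

SET default_transaction_isolation = 'serializable';

BEGIN;
-- plain lookup-or-insert, no explicit table lock needed
SELECT category_id FROM category WHERE name = 'books';
INSERT INTO category (name) VALUES ('books');  -- only when the SELECT found nothing
COMMIT;  -- may fail with SQLSTATE '40001'; if so, retry the whole transaction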
 
The Wiki page which was used to document and organize the work is:
 
http://wiki.postgresql.org/wiki/Serializable
 
This is in a little bit of a funny state because not all of the
wording that was appropriate while the feature was under development
(e.g., future tense verbs) has been changed to something more
appropriate for a finished feature, but it should cover the
theoretical ground pretty well.  An overlapping document which was
initially based on parts of the Wiki page and has received more
recent attention is the README-SSI file here:
 
http://git.postgresql.org/gitweb/?p=postgresql.git;a=blob_plain;f=src/backend/storage/lmgr/README-SSI;hb=master
 
Some examples geared toward programmers and DBAs is at this Wiki
page:
 
http://wiki.postgresql.org/wiki/SSI
 
It could use a couple more examples and a bit of language cleanup,
but what is there is fairly sound.  The largest omission is that we
need to show more explicitly that serialization failures can occur
at times other than COMMIT.  (I got a little carried away trying to
show that there was no blocking and that the first committer
wins.)
 
-Kevin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] insert

2011-07-31 Thread Gavin Flower

On 30/07/11 08:14, Kevin Grittner wrote:

alan alan.mill...@gmail.com  wrote:


Can I write a BEFORE ROW trigger for the products table  to runs
on INSERT or UPDATE to
  1. insert a new category  return the new category_id  OR
  2.  return the existing category_id for the (to be inserted row)


What would you be using to match an existing category?  If this
accurately identifies a category, why not use it for the key to the
category table, rather than generating a synthetic key value?

-Kevin


Hi Alan,

This is the way I would define the tables (category first, since product 
references it). I think it conforms to your requirements, and the definitions 
look clearer.


I have the convention that the id of the table itself is not prefixed 
with the table name, but references to the id field of other tables are 
(e.g. category_id). This is not something you need to follow, but it 
helps to clearly identify what is a foreign key, and what is the current 
table's id!  Likewise, I think it is simpler to make the table names 
singular, but this again is a bit arbitrary.


I guess, even if you prefer my conventions, it is more important to 
follow the standards of the existing database!



CREATE TABLE category
(
    id      SERIAL PRIMARY KEY,
    name    VARCHAR(20) UNIQUE NOT NULL
);

CREATE TABLE product
(
    id          SERIAL PRIMARY KEY,
    category_id int REFERENCES category(id),
    name        VARCHAR(60) NOT NULL
);

Though for the primary key of the category table, it might be better to 
explicitly assign the key, then you have more control of the numbers used.


I would be a bit wary of automatically inserting a new category when the given 
category is not already there; you could end up with several 
variations of spelling for the same category!  I once saw a system with 
about 20 variations of spelling, and of the number of spaces between words, for 
the name of the same company!


Possibly your client GUI application could have a drop down list of 
available categories, and provision to enter new ones, but then this 
might be outside your control.



Cheers,
Gavin

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] insert

2011-07-29 Thread alan
I think I figured it out myself.
If anyone sees issues with this (simple) approach, please let me know.

I changed my table definitions to this:

CREATE SEQUENCE public.product_id_seq;
CREATE TABLE products (
    product_id INTEGER DEFAULT nextval('product_id_seq'::regclass) NOT NULL,
    name VARCHAR(60) NOT NULL,
    category SMALLINT NOT NULL,
    CONSTRAINT product_id PRIMARY KEY (product_id)
);
CREATE SEQUENCE public.category_id_seq;
CREATE TABLE category (
    category_id INTEGER DEFAULT nextval('category_id_seq'::regclass) NOT NULL,
    name VARCHAR(20) NOT NULL,
    CONSTRAINT category_id PRIMARY KEY (category_id)
);
ALTER TABLE products ADD CONSTRAINT category_products_fk
FOREIGN KEY (category)
REFERENCES category (category_id)
ON DELETE NO ACTION ON UPDATE CASCADE
;

Then created this function:

CREATE OR REPLACE FUNCTION getid(_table text,_pk text,_name text)
RETURNS integer AS $$
DECLARE _id integer;
BEGIN
  EXECUTE 'SELECT '
|| _pk
|| ' FROM '
|| _table::regclass
|| ' WHERE name'
|| ' = '
|| quote_literal(_name)
   INTO _id;

  IF _id > 0 THEN
return _id;
  ELSE
EXECUTE 'INSERT INTO '
 || _table
 || ' VALUES (DEFAULT,' || quote_literal(_name) || ')'
 || ' RETURNING ' || _pk
INTO _id;
return _id;
  END IF;
END;
$$
 LANGUAGE 'plpgsql' VOLATILE;

Now I can just insert into the products table via:

INSERT INTO products VALUES(DEFAULT,'Postgresql for
Dummies',getid('category','category_id','books'));

For example:

testdb=# select * from products;
 product_id | name | category
+--+--
(0 rows)

iims_test=# select * from category;
 category_id | name
-+--
(0 rows)

testdb=# insert into products values(DEFAULT,'Postgresql for
Dummies',getid('category','category_id','books'));
INSERT 0 1

testdb=# select * from
category;
 category_id | name
-+---
   1 | books

testdb=# select * from products;
 product_id |  name  | category
++--
  1 | Postgresql for Dummies |1

Updates to the category_id in the category table are also cascaded to the
products table.

testdb=# UPDATE category SET category_id = 2 WHERE category_id = 1;
UPDATE 1

testdb=# SELECT * FROM products;
 product_id |  name  | category
++--
  1 | Postgresql for Dummies |2


Alan

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] insert

2011-07-29 Thread alan
next question.

I have a products table with a 'category' column that I want to
maintain in a separate table.

CREATE TABLE products (
product_id INTEGER DEFAULT
nextval('product_id_seq'::regclass) NOT NULL,
name VARCHAR(60) NOT NULL,
category SMALLINT  NOT NULL,
CONSTRAINT product_id PRIMARY KEY (product_id)
);
CREATE TABLE category (
category_id INTEGER DEFAULT
nextval('category_id_seq'::regclass) NOT NULL,
name VARCHAR(20) NOT NULL,
CONSTRAINT category_id PRIMARY KEY (category_id)
);

Every product must have a category. Since many (but not all) products share the
same category, I only want one table with unique categories.

To do the insert into the products table I need to retrieve or insert
the category_id in the category table first.
Which means more code in my client app (something like: if ($cat_id =
get_cat_id($cat)) { ... } else { $cat_id = insert_cat($cat) }).

Can I write a BEFORE ROW trigger for the products table that runs on
INSERT or UPDATE to
 1. insert a new category and return the new category_id, OR
 2. return the existing category_id for the (to be inserted) row?

Alan
I want products.category to be a foreign key that points to the unique
category_id in the category table.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] insert

2011-07-29 Thread Kevin Grittner
alan alan.mill...@gmail.com wrote:
 
 Can I write a BEFORE ROW trigger for the products table  to runs
 on INSERT or UPDATE to
  1. insert a new category  return the new category_id  OR
  2.  return the existing category_id for the (to be inserted row)
 
What would you be using to match an existing category?  If this
accurately identifies a category, why not use it for the key to the
category table, rather than generating a synthetic key value?
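
In other words, something along these lines (a sketch only; the column sizes
are borrowed from your table definitions):

CREATE TABLE category (
    name varchar(20) PRIMARY KEY
);

CREATE TABLE products (
    product_id serial PRIMARY KEY,
    name       varchar(60) NOT NULL,
    category   varchar(20) NOT NULL REFERENCES category (name) ON UPDATE CASCADE
);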
 
-Kevin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] INSERT query times

2011-07-10 Thread sergio mayoral
Hi,

i am using libpq library and postgresql 8.4 for my linux application running
on ARM with 256 MB. I am just doing:

PQconnectdb();
PQexec(INSERT INTO table1 ); (0.009661 sec.)
PQexec(INSERT INTO table1 ); (0.004208 sec.)

PQexec(INSERT INTO table2 ); (0.007352 sec.)
PQexec(INSERT INTO table2 ); (0.002533 sec.)
PQexec(INSERT INTO table2 ); (0.002281 sec.)
PQexec(INSERT INTO table2 ); (0.002244 sec.)

PQexec(INSERT INTO table3 ); (0.006903 sec.)
PQexec(INSERT INTO table3 ); (0.002903 sec.)
PQfinnish();

I check the time for each PQexec with the gettimeofday function, and I always see
that the first INSERT for each table takes longer than the next ones.

This must be something in the parser stage, and since I am running the same
queries every time, I would like to know if there is a way to cache these
queries in order to speed up the first INSERTs.

Thanks in advance,

Sergio


Re: [PERFORM] INSERT query times

2011-07-10 Thread Pavel Stehule
Hello

a) look at the COPY statement and the COPY API protocol - it can be 100x
faster than INSERTs
http://www.postgresql.org/docs/8.3/static/libpq-copy.html

b) if you can't use COPY, then:

* use an outer transaction - BEGIN, INSERT, INSERT, ... COMMIT - if this is possible
* use a prepared statement
http://www.postgresql.org/docs/8.3/static/sql-prepare.html

If you cannot use an outer transaction, and you can replay the process if
there are problems, use asynchronous commit
http://www.postgresql.org/docs/8.3/static/wal-async-commit.html
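
For b), a rough sketch of what that looks like at the SQL level (the table and
column names here are placeholders, not from the original mail):

BEGIN;
PREPARE ins_table1 (integer, text) AS
    INSERT INTO table1 (id, payload) VALUES ($1, $2);
EXECUTE ins_table1(1, 'first row');
EXECUTE ins_table1(2, 'second row');
-- ... many more EXECUTEs ...
COMMIT;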

Regards

Pavel Stehule


2011/7/7 sergio mayoral smayo...@gmail.com:
 Hi,
 i am using libpq library and postgresql 8.4 for my linux application running
 on ARM with 256 MB. I am just doing:
 PQconnectdb();
 PQexec(INSERT INTO table1 ); (0.009661 sec.)
 PQexec(INSERT INTO table1 ); (0.004208 sec.)
 PQexec(INSERT INTO table2 ); (0.007352 sec.)
 PQexec(INSERT INTO table2 ); (0.002533 sec.)
 PQexec(INSERT INTO table2 ); (0.002281 sec.)
 PQexec(INSERT INTO table2 ); (0.002244 sec.)
 PQexec(INSERT INTO table3 ); (0.006903 sec.)
 PQexec(INSERT INTO table3 ); (0.002903 sec.)
 PQfinnish();
 I check the time for each PQexec with gettimeofday function and I always see
 that the first INSERT for each table needs longer than the next ones.
 this must be something with the parser stage and since i am doing every time
 the same queries, I would like to know if there is a way to cache this
 queries in order to speed up the first INSERTs.
 Thanks in advance,
 Sergio

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] INSERT query times

2011-07-10 Thread Tom Lane
sergio mayoral smayo...@gmail.com writes:
 i am using libpq library and postgresql 8.4 for my linux application running
 on ARM with 256 MB. I am just doing:

 PQconnectdb();
 PQexec(INSERT INTO table1 ); (0.009661 sec.)
 PQexec(INSERT INTO table1 ); (0.004208 sec.)

 PQexec(INSERT INTO table2 ); (0.007352 sec.)
 PQexec(INSERT INTO table2 ); (0.002533 sec.)
 PQexec(INSERT INTO table2 ); (0.002281 sec.)
 PQexec(INSERT INTO table2 ); (0.002244 sec.)

 PQexec(INSERT INTO table3 ); (0.006903 sec.)
 PQexec(INSERT INTO table3 ); (0.002903 sec.)
 PQfinnish();

 I check the time for each PQexec with gettimeofday function and I always see
 that the first INSERT for each table needs longer than the next ones.

The first few commands of *any* type on a new connection are going to
take longer than repeat versions of those commands, because the backend
needs to load up its internal caches.  Once it's cached information
about the tables, operators, etc that you are working with, it's a bit
faster.  This isn't specific to INSERT.

 this must be something with the parser stage and since i am doing every time
 the same queries, I would like to know if there is a way to cache this
 queries in order to speed up the first INSERTs.

The only fix is to hang onto your connections longer.  Consider a
connection pooler.

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Insert performance with composite index

2010-11-02 Thread hubert depesz lubaczewski
On Mon, Nov 01, 2010 at 02:57:56PM +0100, Cédric Villemain wrote:
    CONSTRAINT tableindex_pkey PRIMARY KEY (tableindex)
  )
  the index definition is
  CREATE INDEX PK_AT2
    ON ABC
    USING btree
    (event, tableindex)
  TABLESPACE sample;
 
 Indexing twice the same column is useless. (perhaps move your PK to
 the tablespace 'sample' is good too ?)

why do you say that?
these are not the same indexes and they serve different purposes.

Best regards,

depesz

-- 
Linkedin: http://www.linkedin.com/in/depesz  /  blog: http://www.depesz.com/
jid/gtalk: dep...@depesz.com / aim:depeszhdl / skype:depesz_hdl / gg:6749007

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Insert performance with composite index

2010-11-02 Thread Cédric Villemain
2010/11/2 hubert depesz lubaczewski dep...@depesz.com:
 On Mon, Nov 01, 2010 at 02:57:56PM +0100, Cédric Villemain wrote:
    CONSTRAINT tableindex_pkey PRIMARY KEY (tableindex)
  )
  the index definition is
  CREATE INDEX PK_AT2
    ON ABC
    USING btree
    (event, tableindex)
  TABLESPACE sample;

 Indexing twice the same column is useless. (perhaps move your PK to
 the tablespace 'sample' is good too ?)

 why do you say that?
 these are not the same indexes and they serve different purposes.

Given that tableindex is the PK column, I would really like to know the usage
pattern for having it indexed twice.


 Best regards,

 depesz

 --
 Linkedin: http://www.linkedin.com/in/depesz  /  blog: http://www.depesz.com/
 jid/gtalk: dep...@depesz.com / aim:depeszhdl / skype:depesz_hdl / gg:6749007




-- 
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Insert performance with composite index

2010-11-02 Thread hubert depesz lubaczewski
On Tue, Nov 02, 2010 at 12:04:42PM +0100, Cédric Villemain wrote:
 2010/11/2 hubert depesz lubaczewski dep...@depesz.com:
  On Mon, Nov 01, 2010 at 02:57:56PM +0100, Cédric Villemain wrote:
     CONSTRAINT tableindex_pkey PRIMARY KEY (tableindex)
   )
   the index definition is
   CREATE INDEX PK_AT2
     ON ABC
     USING btree
     (event, tableindex)
   TABLESPACE sample;
 
  Indexing twice the same column is useless. (perhaps move your PK to
  the tablespace 'sample' is good too ?)
 
  why do you say that?
  these are not the same indexes and they serve different purposes.
 
 Given that tableindex is the PK column, I really like to now the usage
 pattern for having it indexed twice.

select * from table where event = 123 order by tableindex desc limit 50;

Best regards,

depesz

-- 
Linkedin: http://www.linkedin.com/in/depesz  /  blog: http://www.depesz.com/
jid/gtalk: dep...@depesz.com / aim:depeszhdl / skype:depesz_hdl / gg:6749007

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Insert performance with composite index

2010-11-02 Thread Divakar Singh
Maybe a query that is filtering based on these 2 columns?

 Best Regards,
Divakar





From: Cédric Villemain cedric.villemain.deb...@gmail.com
To: dep...@depesz.com
Cc: Divakar Singh dpsma...@yahoo.com; pgsql-performance@postgresql.org
Sent: Tue, November 2, 2010 4:34:42 PM
Subject: Re: [PERFORM] Insert performance with composite index

2010/11/2 hubert depesz lubaczewski dep...@depesz.com:
 On Mon, Nov 01, 2010 at 02:57:56PM +0100, Cédric Villemain wrote:
CONSTRAINT tableindex_pkey PRIMARY KEY (tableindex)
  )
  the index definition is
  CREATE INDEX PK_AT2
ON ABC
USING btree
(event, tableindex)
  TABLESPACE sample;

 Indexing twice the same column is useless. (perhaps move your PK to
 the tablespace 'sample' is good too ?)

 why do you say that?
 these are not the same indexes and they serve different purposes.

Given that tableindex is the PK column, I really like to now the usage
pattern for having it indexed twice.


 Best regards,

 depesz

 --
 Linkedin: http://www.linkedin.com/in/depesz  /  blog: http://www.depesz.com/
 jid/gtalk: dep...@depesz.com / aim:depeszhdl / skype:depesz_hdl / gg:6749007




-- 
Cédric Villemain   2ndQuadrant
http://2ndQuadrant.fr/ PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance



  

Re: [PERFORM] Insert performance with composite index

2010-11-02 Thread Cédric Villemain
2010/11/2 hubert depesz lubaczewski dep...@depesz.com:
 On Tue, Nov 02, 2010 at 12:04:42PM +0100, Cédric Villemain wrote:
 2010/11/2 hubert depesz lubaczewski dep...@depesz.com:
  On Mon, Nov 01, 2010 at 02:57:56PM +0100, Cédric Villemain wrote:
     CONSTRAINT tableindex_pkey PRIMARY KEY (tableindex)
   )
   the index definition is
   CREATE INDEX PK_AT2
     ON ABC
     USING btree
     (event, tableindex)
   TABLESPACE sample;
 
  Indexing twice the same column is useless. (perhaps move your PK to
  the tablespace 'sample' is good too ?)
 
  why do you say that?
  these are not the same indexes and they serve different purposes.

 Given that tableindex is the PK column, I really like to now the usage
 pattern for having it indexed twice.

 select * from table where event = 123 order by tableindex desc limit 50;

Correct. Thanks Hubert.

-- 
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Insert performance with composite index

2010-11-01 Thread Divakar Singh
Hi,
I am trying to tune my libpq program for insert performance.
When I tried inserting 1M rows into a table with a Primary Key, it took almost 
62 seconds.
After adding a composite index of 2 columns, the performance degraded to 125 
seconds.
I am using COPY to insert all data in 1 transaction.

the table definition is 

CREATE TABLE ABC
(
  event integer,
  innodeid character varying(80),
  innodename character varying(80),
  sourceid character varying(300),
  intime timestamp(3) without time zone,
  outnodeid character varying(80),
  outnodename character varying(80),
  destinationid character varying(300),
  outtime timestamp(3) without time zone,
  bytes integer,
  cdrs integer,
  tableindex integer NOT NULL,
  noofsubfilesinfile integer,
  recordsequenceintegerlist character varying(1000),
  CONSTRAINT tableindex_pkey PRIMARY KEY (tableindex)
)

the index definition is 


CREATE INDEX PK_AT2
  ON ABC
  USING btree
  (event, tableindex)
TABLESPACE sample;

Any tips to increase the insert performance in this case?

It would also be helpful if someone could point me to a comprehensive libpq 
programming guide for PG 9.x. The online libpq docs are not very helpful for a 
newbie like me.


 Best Regards,
Divakar



  

Re: [PERFORM] Insert performance with composite index

2010-11-01 Thread Divakar Singh
Hi Marti,
Thanks for your tips. I will try those.
I am on Solaris Sparc 5.10

 Best Regards,
Divakar





From: Marti Raudsepp ma...@juffo.org
To: Divakar Singh dpsma...@yahoo.com
Cc: pgsql-performance@postgresql.org
Sent: Mon, November 1, 2010 6:23:17 PM
Subject: Re: [PERFORM] Insert performance with composite index

On Mon, Nov 1, 2010 at 14:49, Divakar Singh dpsma...@yahoo.com wrote:
 I am trying to tune my libpq program for insert performance.
 When I tried inserting 1M rows into a table with a Primary Key, it took
 almost 62 seconds.
 After adding a composite index of 2 columns, the performance degrades to 125
 seconds.

This sounds a lot like the bottleneck I was hitting. What Linux kernel
version are you running?

If it's 2.6.33 or later, see:
http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server#wal_sync_method_wal_buffers

http://archives.postgresql.org/pgsql-performance/2010-10/msg00602.php

Regards,
Marti



  

Re: [PERFORM] Insert performance with composite index

2010-11-01 Thread Marti Raudsepp
On Mon, Nov 1, 2010 at 14:49, Divakar Singh dpsma...@yahoo.com wrote:
 I am trying to tune my libpq program for insert performance.
 When I tried inserting 1M rows into a table with a Primary Key, it took
 almost 62 seconds.
 After adding a composite index of 2 columns, the performance degrades to 125
 seconds.

This sounds a lot like the bottleneck I was hitting. What Linux kernel
version are you running?

If it's 2.6.33 or later, see:
http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server#wal_sync_method_wal_buffers
http://archives.postgresql.org/pgsql-performance/2010-10/msg00602.php

Regards,
Marti

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Insert performance with composite index

2010-11-01 Thread Marti Raudsepp
On Mon, Nov 1, 2010 at 14:56, Divakar Singh dpsma...@yahoo.com wrote:
 Thanks for your tips. i will try those.
 I am on Solaris Sparc 5.10

Sorry, I assumed you were running Linux. But still it could be the
same problem as I had.

Be careful changing your wal_sync_method, as it has the potential to
corrupt your database. I have no experience with Solaris.

For what it's worth, Jignesh Shah recommends using
wal_sync_method=fsync on Solaris:
http://blogs.sun.com/jkshah/entry/postgresql_on_solaris_better_use
http://blogs.sun.com/jkshah/entry/postgresql_wal_sync_method_and

Regards,
Marti

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Insert performance with composite index

2010-11-01 Thread Cédric Villemain
2010/11/1 Divakar Singh dpsma...@yahoo.com:
 Hi,
 I am trying to tune my libpq program for insert performance.
 When I tried inserting 1M rows into a table with a Primary Key, it took
 almost 62 seconds.
 After adding a composite index of 2 columns, the performance degrades to 125
 seconds.
 I am using COPY to insert all data in 1 transaction.

 the table definition is

 CREATE TABLE ABC
 (
   event integer,
   innodeid character varying(80),
   innodename character varying(80),
   sourceid character varying(300),
   intime timestamp(3) without time zone,
   outnodeid character varying(80),
   outnodename character varying(80),
   destinationid character varying(300),
   outtime timestamp(3) without time zone,
   bytes integer,
   cdrs integer,
   tableindex integer NOT NULL,
   noofsubfilesinfile integer,
   recordsequenceintegerlist character varying(1000),
   CONSTRAINT tableindex_pkey PRIMARY KEY (tableindex)
 )

 the index definition is


 CREATE INDEX PK_AT2
   ON ABC
   USING btree
   (event, tableindex)
 TABLESPACE sample;

Indexing the same column twice is useless. (Perhaps moving your PK to
the tablespace 'sample' would be good too?)


 Any tip to increase the insert performance in this case?

If you create or truncate the table and then COPY into it, you should create
the index after the COPY.
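
Roughly like this (a sketch based on the definitions above):

DROP INDEX IF EXISTS pk_at2;
TRUNCATE abc;
COPY abc FROM STDIN;                  -- bulk load first
CREATE INDEX pk_at2 ON abc
    USING btree (event, tableindex)
    TABLESPACE sample;                -- build the secondary index afterwards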


 It would also be helpful if someone can send comprehensive libpq programming
 guide for PG 9.x. Online doc of libpq is not much helpful for a newbie like
 me.


 Best Regards,
 Divakar





-- 
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Insert performance with composite index

2010-11-01 Thread Andres Freund
Hi,

On Monday 01 November 2010 13:49:14 Divakar Singh wrote:
 When I tried inserting 1M rows into a table with a Primary Key, it took
 almost 62 seconds.
 After adding a composite index of 2 columns, the performance degrades to
 125 seconds.
 I am using COPY to insert all data in 1 transaction.
Without seeing your config it's hard to suggest anything here. Did you do basic 
tuning of your pg installation?

wal_buffers, shared_buffers, checkpoint_segments, maintenance_work_mem are 
likely most relevant for that specific case.
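
Illustrative postgresql.conf values only - a sketch of the kind of settings to
look at, not a recommendation tuned for your machine:

shared_buffers = 512MB          # the default is usually far too small
wal_buffers = 16MB
checkpoint_segments = 32        # fewer, larger checkpoints during the load
maintenance_work_mem = 256MB    # helps index builds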

Andres

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Insert performance with composite index

2010-11-01 Thread Andres Freund
On Monday 01 November 2010 15:08:10 Divakar Singh wrote:
 here are my parameters:
Which pg version is that?

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Insert performance with composite index

2010-11-01 Thread Divakar Singh
I am using 9.0.1

 Best Regards,
Divakar





From: Andres Freund and...@anarazel.de
To: Divakar Singh dpsma...@yahoo.com
Cc: pgsql-performance@postgresql.org
Sent: Mon, November 1, 2010 7:44:31 PM
Subject: Re: [PERFORM] Insert performance with composite index

On Monday 01 November 2010 15:08:10 Divakar Singh wrote:
 here are my parameters:
Which pg version is that?



  

Re: [PERFORM] Insert performance with composite index

2010-11-01 Thread Andres Freund
On Monday 01 November 2010 15:16:49 Divakar Singh wrote:
 I am using 9.0.1
Either that's not true or you cargo-culted loads of your config from a 
significantly older pg version.

Things like:

#bgwriter_delay = 200# 10-1 milliseconds between rounds
bgwriter_lru_percent = 0# 0-100% of LRU buffers scanned/round
#bgwriter_lru_maxpages = 5# 0-1000 buffers max written/round
#bgwriter_all_percent = 0.333# 0-100% of all buffers scanned/round
bgwriter_all_maxpages = 0# 0-1000 buffers max written/round

make me very suspicious.

As I said, I would check the variables I referenced in my first post...

Andres

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Insert performance with composite index

2010-11-01 Thread Divakar Singh
Do you mean these parameters have been removed starting with 9.x?
As I see on 
http://www.network-theory.co.uk/docs/postgresql/vol3/BackgroundWriter.html 
, these parameters were added starting from 8.0, right?


 Best Regards,
Divakar





From: Andres Freund and...@anarazel.de
To: Divakar Singh dpsma...@yahoo.com
Cc: pgsql-performance@postgresql.org
Sent: Mon, November 1, 2010 7:50:59 PM
Subject: Re: [PERFORM] Insert performance with composite index

On Monday 01 November 2010 15:16:49 Divakar Singh wrote:
 I am using 9.0.1
Either thats not true or you cargo culted loads of your config from a 
significantly older pg version.

Things like:

#bgwriter_delay = 200# 10-1 milliseconds between rounds
bgwriter_lru_percent = 0# 0-100% of LRU buffers scanned/round
#bgwriter_lru_maxpages = 5# 0-1000 buffers max written/round
#bgwriter_all_percent = 0.333# 0-100% of all buffers scanned/round
bgwriter_all_maxpages = 0# 0-1000 buffers max written/round

make me very suspicious.

As I said, I would check the variables I referenced in my first post...

Andres



  

Re: [PERFORM] Insert performance with composite index

2010-11-01 Thread Andres Freund
On Monday 01 November 2010 15:28:19 Divakar Singh wrote:
 Do you mean these parameters have been removed starting 9.X?
 As I see on 
 http://www.network-theory.co.uk/docs/postgresql/vol3/BackgroundWriter.html 
 ,these parameters were added starting from 8.0 right?
No, I mean that setting it to 0 is a bit of a strange value in many situations.

And you have comments like:
#max_fsm_pages = 2# min max_fsm_relations*16, 6 bytes each
#max_fsm_relations = 1000# min 100, ~70 bytes each

These reference config options that do not exist anymore. And you have 
shared_buffers = 81920,
which indicates that you started from 8.1/8.2 or so...

Andres

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Insert performance and multi-column index order

2009-06-30 Thread Bob Lunney

Greg,

Thanks for the mental prod!  Yes, the original data is more closely sorted by 
the timestamptz column, since they represent events coming into the collection 
system in real time.  As for the distribution of data values, it goes without 
saying the timestamptz value is monotonically increasing, with roughly 1300 
entries having the same timestamptz value.  The other three columns' values are 
essentially reference data, with 400 values for the varchar, 680 for the first 
text column, and 60 for the second text column.  The distribution is fairly 
even, with some small spikes but nothing significant.

The "duh" moment came for me when you pointed out the implicit sort order of 
the data.  After resorting the data into the new index column order the insert 
performance was largely restored.  I didn't monitor the process with vmstat, 
however - the end result is good enough for me.  I believe that the index 
maintenance of page splitting, etc., that you describe below was exactly the 
culprit, and that presorting the data solved that problem.  
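
(For anyone doing the same thing, the presort can also be done inside the
database; a sketch with made-up staging/target names and the four index
columns in their new order:

INSERT INTO target_table
SELECT *
FROM   staging_table
ORDER  BY col_varchar, col_text1, col_ts, col_text2;

Presorting the COPY input, as was done here, achieves the same effect.)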

I call it my "duh" moment since I've presorted data for Sybase and Oracle for 
exactly the same reason, but forgot to apply the lesson to PostgreSQL.

BTW, this is PG 8.2.1 and 8.3.7 running on SLES 10.3, although I don't think it 
matters.

Thanks for the help, Greg and Tom!

--- On Sat, 6/27/09, Greg Smith gsm...@gregsmith.com wrote:

 From: Greg Smith gsm...@gregsmith.com
 Subject: Re: [PERFORM] Insert performance and multi-column index order
 To: bob_lun...@yahoo.com
 Cc: pgsql-performance@postgresql.org
 Date: Saturday, June 27, 2009, 1:08 AM
 On Fri, 26 Jun 2009, bob_lun...@yahoo.com
 wrote:
 
  The original unique index was in the order
 (timestamptz, varchar, text, text) and most queries against
 it were slow.  I changed the index order to (varchar, text,
 timestamptz, text) and queries now fly, but loading data
 (via copy from stdin) in the table is 2-4 times slower.
 
 Is the input data closer to being sorted by the timestamptz
 field than the varchar field?  What you might be seeing
 is that the working set of index pages needed to keep
 building the varchar index are bigger or have more of a
 random access component to them as they spill in and out of
 the buffer cache.  Usually you can get a better idea
 what the difference is by comparing the output from vmstat
 while the two are loading.  More random read/write
 requests in the mix will increase the waiting for I/O
 percentage while not increasing the total amount
 read/written per second.
 
 --
 * Greg Smith gsm...@gregsmith.com
 http://www.gregsmith.com Baltimore, MD




-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Insert performance and multi-column index order

2009-06-26 Thread bob_lunney

I have a partitioned table with a multi-column unique index.  The table is 
partitioned on a timestamp with time zone column.  (I realize this has nothing 
to do with the unique index.)  The original unique index was in the order 
(timestamptz, varchar, text, text) and most queries against it were slow.  I 
changed the index order to (varchar, text, timestamptz, text) and queries now 
fly, but loading data (via copy from stdin) in the table is 2-4 times slower.  
The unique index is required during the load.  

The original index is in the same order as the table's columns (2,3,4,5), while 
the changed index is in column order (3,5,2,4).  I've tested this several times 
and the effect is repeatable.  It does not seem the column order in the table 
matters to the insert/index performance, just the column order in the index.

Why would changing the column order on a unique index cause data loading or 
index servicing to slow down?  Page splits in the b-tree, maybe?

Thanks in advance for any advice.





-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Insert performance and multi-column index order

2009-06-26 Thread Greg Smith

On Fri, 26 Jun 2009, bob_lun...@yahoo.com wrote:

The original unique index was in the order (timestamptz, varchar, text, 
text) and most queries against it were slow.  I changed the index order 
to (varchar, text, timestamptz, text) and queries now fly, but loading 
data (via copy from stdin) in the table is 2-4 times slower.


Is the input data closer to being sorted by the timestamptz field than the 
varchar field?  What you might be seeing is that the working set of index 
pages needed to keep building the varchar index are bigger or have more of 
a random access component to them as they spill in and out of the buffer 
cache.  Usually you can get a better idea what the difference is by 
comparing the output from vmstat while the two are loading.  More random 
read/write requests in the mix will increase the waiting for I/O 
percentage while not increasing the total amount read/written per second.


--
* Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD
--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] INSERT times - same storage space but more fields - much slower inserts

2009-04-15 Thread Matthew Wakeling

On Tue, 14 Apr 2009, Stephen Frost wrote:

What does your test harness currently look like, and what would you like
to see to test the binary-format COPY?  I'd be happy to write up the
code necessary to implement binary-format COPY for this.


If anyone needs this code in Java, we have a version at 
http://www.intermine.org/


Download source code: http://www.intermine.org/wiki/SVNCheckout

Javadoc: http://www.intermine.org/api/

The code is contained in the org.intermine.sql.writebatch package, in the 
intermine/objectstore/main/src/org/intermine/sql/writebatch directory in 
the source.


The public interface is org.intermine.sql.writebatch.Batch.

The Postgres-specific binary COPY code is in 
org.intermine.sql.writebatch.BatchWriterPostgresCopyImpl.


The implementation unfortunately relies on a very old modified version of 
the Postgres JDBC driver, which is in the intermine/objectstore/main/lib 
directory.


The code is released under the LGPL, and we would appreciate notification 
if it is used.


The code implements quite a sophisticated system for writing rows to 
database tables very quickly. It batches together multiple writes into 
COPY statements, and writes them in the background in another thread, 
while fully honouring flush calls. When it is using the database 
connection is well-defined. I hope someone can find it useful.


Matthew

--
-. .-.   .-. .-.   .-. .-.   .-. .-.   .-. .-.   .-. .-.   .-.
||X|||\ /|||X|||\ /|||X|||\ /|||X|||\ /|||X|||\ /|||X|||\ /|||
|/ \|||X|||/ \|||X|||/ \|||X|||/ \|||X|||/ \|||X|||/ \|||X|||/
'   `-' `-'   `-' `-'   `-' `-'   `-' `-'   `-' `-'   `-' `-'

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] INSERT times - same storage space but more fields - much slower inserts

2009-04-15 Thread Matthew Wakeling

On Wed, 15 Apr 2009, Matthew Wakeling wrote:
If anyone needs this code in Java, we have a version at 
http://www.intermine.org/


Download source code: http://www.intermine.org/wiki/SVNCheckout

Javadoc: http://www.intermine.org/api/


Sorry, that should be http://www.flymine.org/api/

Matthew

--
What goes up must come down. Ask any system administrator.

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] INSERT times - same storage space but more fields - much slower inserts

2009-04-14 Thread Craig Ringer
Hi

I've been doing some testing for the Bacula project, which uses
PostgreSQL as one of the databases in which it stores backup catalogs.

Insert times are critical in this environment, as the app may insert
millions of records a day.

I've been evaluating a schema change for Bacula that takes a field
that's currently stored as a gruesome-to-work-with base64-encoded
representation of a binary blob, and expands it into a set of integer
fields that can be searched, indexed, etc.

The table size of the expanded form is marginally smaller than for the
base64-encoded string version. However, INSERT times are *CONSIDERABLY*
greater for the version with more fields. It takes 1011 seconds to
insert the base64 version, vs 1290 seconds for the expanded-fields
version. That's a difference of 279 seconds, or 27%.

Despite that, the final table sizes are the same.

The SQL dump for the base64 version is 1734MB and the expanded one is
2189MB, about a 25% increase. Given that the final table sizes are the
same, is the slowdown likely to just be the cost of parsing the extra
SQL, converting the textual representations of the numbers, etc?

If I use tab-separated input and COPY, the original-format file is
1300MB and the expanded-structure format is 1618MB. The performance hit
on COPY-based insert is not as bad, at 161s vs 182s (13%), but still
quite significant.

Any ideas about what I might be able to do to improve the efficiency of
inserting records with many integer fields?


In case it's of interest, the base64 and expanded schema are:


CREATE TABLE file (
fileid bigint NOT NULL,
fileindex integer DEFAULT 0 NOT NULL,
jobid integer NOT NULL,
pathid integer NOT NULL,
filenameid integer NOT NULL,
markid integer DEFAULT 0 NOT NULL,
lstat text NOT NULL,
md5 text NOT NULL
);



CREATE TABLE file (
fileid bigint,
fileindex integer,
jobid integer,
pathid integer,
filenameid integer,
markid integer,
st_dev integer,
st_ino integer,
st_mod integer,
st_nlink integer,
st_uid integer,
st_gid integer,
st_rdev bigint,
st_size integer,
st_blksize integer,
st_blocks integer,
st_atime integer,
st_mtime integer,
st_ctime integer,
linkfi integer,
md5 text
);


( Yes, those are the fields of a `struct lstat' ).

--
Craig Ringer


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] INSERT times - same storage space but more fields - much slower inserts

2009-04-14 Thread Stephen Frost
Craig,

* Craig Ringer (cr...@postnewspapers.com.au) wrote:
 I've been doing some testing for the Bacula project, which uses
 PostgreSQL as one of the databases in which it stores backup catalogs.

We also use Bacula with a PostgreSQL backend.

 I've been evaluating a schema change for Bacula that takes a field
 that's currently stored as a gruesome-to-work-with base64-encoded
 representation of a binary blob, and expands it into a set of integer
 fields that can be searched, indexed, etc.

This would be extremely nice.

 The table size of the expanded form is marginally smaller than for the
 base64-encoded string version. However, INSERT times are *CONSIDERABLY*
 greater for the version with more fields. It takes 1011 seconds to
 insert the base64 version, vs 1290 seconds for the expanded-fields
 version. That's a difference of 279 seconds, or 27%.
 
 Despite that, the final table sizes are the same.
 
 If I use tab-separated input and COPY, the original-format file is
 1300MB and the expanded-structure format is 1618MB. The performance hit
 on COPY-based insert is not as bad, at 161s vs 182s (13%), but still
 quite significant.
 
 Any ideas about what I might be able to do to improve the efficiency of
 inserting records with many integer fields?

Bacula should be using COPY for the batch data loads, so hopefully won't
suffer too much from having the fields split out.  I think it would be
interesting to try doing PQexecPrepared with binary-format data instead
of using COPY though.  I'd be happy to help you implement a test setup
for doing that, if you'd like.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [PERFORM] INSERT times - same storage space but more fields - much slower inserts

2009-04-14 Thread Matthew Wakeling

On Tue, 14 Apr 2009, Stephen Frost wrote:

Bacula should be using COPY for the batch data loads, so hopefully won't
suffer too much from having the fields split out.  I think it would be
interesting to try doing PQexecPrepared with binary-format data instead
of using COPY though.  I'd be happy to help you implement a test setup
for doing that, if you'd like.


You can always do binary-format COPY.

Matthew

--
An ant doesn't have a lot of processing power available to it. I'm not trying
to be speciesist - I wouldn't want to detract you from such a wonderful
creature, but, well, there isn't a lot there, is there?
   -- Computer Science Lecturer

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] INSERT times - same storage space but more fields - much slower inserts

2009-04-14 Thread Stephen Frost
* Matthew Wakeling (matt...@flymine.org) wrote:
 On Tue, 14 Apr 2009, Stephen Frost wrote:
 Bacula should be using COPY for the batch data loads, so hopefully won't
 suffer too much from having the fields split out.  I think it would be
 interesting to try doing PQexecPrepared with binary-format data instead
 of using COPY though.  I'd be happy to help you implement a test setup
 for doing that, if you'd like.

 You can always do binary-format COPY.

I've never played with binary-format COPY actually.  I'd be happy to
help test that too though.

Thanks,

Stephen




Re: [PERFORM] INSERT times - same storage space but more fields - much slower inserts

2009-04-14 Thread Craig Ringer
Stephen Frost wrote:
 * Matthew Wakeling (matt...@flymine.org) wrote:
 On Tue, 14 Apr 2009, Stephen Frost wrote:
 Bacula should be using COPY for the batch data loads, so hopefully won't
 suffer too much from having the fields split out.  I think it would be
 interesting to try doing PQexecPrepared with binary-format data instead
 of using COPY though.  I'd be happy to help you implement a test setup
 for doing that, if you'd like.
 You can always do binary-format COPY.
 
 I've never played with binary-format COPY actually.  I'd be happy to
 help test that too though.

I'd have to check the source/a protocol dump to be sure, but I think
PQexecPrepared(...), while it takes binary arguments, actually sends
them over the wire in text form. PostgreSQL does have a binary protocol
as well, but it suffers from the same issues as binary-format COPY:

Unlike PQexecPrepared(...), binary-format COPY doesn't handle endian and
type size issues for you. You need to convert the data to the database
server's endianness and type sizes, but I don't think the PostgreSQL
protocol provides any way to find those out.

It's OK if we're connected via a UNIX socket (and thus are on the same
host), though I guess a sufficiently perverse individual could install a
32-bit bacula+libpq, and run a 64-bit PostgreSQL server, or even vice versa.

It should also be OK when connected to `localhost' (127.0.0.0/8) .

In other cases, binary-format COPY would be unsafe without some way to
determine remote endianness and sizeof(various types).

--
Craig Ringer

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] INSERT times - same storage space but more fields - much slower inserts

2009-04-14 Thread Tom Lane
Craig Ringer cr...@postnewspapers.com.au writes:
 Unlike PQexecPrepared(...), binary-format COPY doesn't handle endian and
 type size issues for you. You need to convert the data to the database
 server's endianness and type sizes, but I don't think the PostgreSQL
 protocol provides any way to find those out.

The on-the-wire binary format is much better specified than you think.
(The documentation of it sucks, however.)  It's big-endian in all cases
and the datatype sizes are well defined.

regards, tom lane
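
To make that concrete, the sketch below generates a binary COPY stream by hand
through the libpq copy interface: the fixed 19-byte header, a 16-bit field
count per tuple, a 32-bit length before each field, and a -1 trailer, all
big-endian, as documented for COPY ... BINARY. The table and column names are
invented for the example:

/* Sketch: pushing binary-format COPY data through libpq.
 * Hypothetical table: CREATE TABLE file_test (jobid integer, st_size bigint);
 * Build with something like: cc copy_binary.c -lpq
 */
#include <arpa/inet.h>          /* htons(), htonl() */
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <libpq-fe.h>

static char *put_i16(char *p, uint16_t v) { uint16_t n = htons(v); memcpy(p, &n, 2); return p + 2; }
static char *put_i32(char *p, uint32_t v) { uint32_t n = htonl(v); memcpy(p, &n, 4); return p + 4; }
static char *put_i64(char *p, uint64_t v)
{
    p = put_i32(p, (uint32_t) (v >> 32));       /* big-endian: high word first */
    return put_i32(p, (uint32_t) (v & 0xffffffffu));
}

int main(void)
{
    PGconn *conn = PQconnectdb("");             /* connection info from the environment */
    PGresult *res = PQexec(conn, "COPY file_test (jobid, st_size) FROM STDIN BINARY");
    if (PQresultStatus(res) != PGRES_COPY_IN) {
        fprintf(stderr, "COPY did not start: %s", PQerrorMessage(conn));
        return 1;
    }
    PQclear(res);

    char buf[256], *p = buf;

    /* 19-byte header: 11-byte signature, int32 flags = 0, int32 extension length = 0 */
    memcpy(p, "PGCOPY\n\377\r\n\0", 11);
    p += 11;
    p = put_i32(p, 0);
    p = put_i32(p, 0);

    /* Two example tuples: int16 field count, then int32 length + data per field */
    for (int row = 0; row < 2; row++) {
        p = put_i16(p, 2);                      /* two fields in this tuple */
        p = put_i32(p, 4);                      /* int4 value is 4 bytes    */
        p = put_i32(p, 1000 + row);             /* jobid                    */
        p = put_i32(p, 8);                      /* int8 value is 8 bytes    */
        p = put_i64(p, 123456789u + row);       /* st_size                  */
    }
    p = put_i16(p, (uint16_t) -1);              /* file trailer: -1 field count */

    if (PQputCopyData(conn, buf, (int) (p - buf)) != 1 || PQputCopyEnd(conn, NULL) != 1)
        fprintf(stderr, "sending copy data failed: %s", PQerrorMessage(conn));

    res = PQgetResult(conn);                    /* final status of the COPY command */
    if (PQresultStatus(res) != PGRES_COMMAND_OK)
        fprintf(stderr, "COPY failed: %s", PQerrorMessage(conn));
    PQclear(res);

    PQfinish(conn);
    return 0;
}

In a real loader the rows would come from Bacula's attribute spool rather than
a constant, and the buffer would be flushed with PQputCopyData every few
thousand rows.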

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] INSERT times - same storage space but more fields - much slower inserts

2009-04-14 Thread Stephen Frost
Craig,

* Craig Ringer (cr...@postnewspapers.com.au) wrote:
 In other cases, binary-format COPY would be unsafe without some way to
 determine remote endianness and sizeof(various types).

As Tom mentioned already, the binary protocol is actually pretty well
defined, and it's in network-byte-order, aka, big-endian.  The only
issue that I can think of off-hand that you need to know about the
server is if it's using 64-bit integers for date-times or if it's using
float.  That's a simple check to do, however, specifically with:

show integer_datetimes;
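
If the client is already using libpq, the same answer is available without a
query: the server reports integer_datetimes in its startup response, so a
small helper along these lines (assuming an open PGconn) is enough.

#include <string.h>
#include <libpq-fe.h>

/* Returns 1 if the server sends int64 timestamps over the binary protocol,
 * 0 if it was built with float8 timestamps.  PQparameterStatus reads a value
 * reported at connection startup, so there is no extra round trip. */
static int server_has_integer_datetimes(PGconn *conn)
{
    const char *val = PQparameterStatus(conn, "integer_datetimes");
    return val != NULL && strcmp(val, "on") == 0;
}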

It's also a lot cheaper to do the necessary byte-flipping to go from
whatever-endian to network-byte-order than to do the whole printf/atoi
conversion.  Handling timestamps takes a bit more magic, but you can just
pull the appropriate code/#defines from the server backend; I don't
think that's even an issue for this particular set.

What does your test harness currently look like, and what would you like
to see to test the binary-format COPY?  I'd be happy to write up the
code necessary to implement binary-format COPY for this.

Thanks,

Stephen




Re: [PERFORM] insert and Update slow after implementing slony.

2008-12-17 Thread David Rees
On Tue, Dec 16, 2008 at 8:03 PM, Nimesh Satam nimesh.z...@gmail.com wrote:
 We are trying to implement slony as a replication tool for one of our
 database. The Insert and updates have increased by approximately double
 making some of our important script slow.

What version of PostgreSQL are you running and on what type of hardware?

I suspect that moving pg_xlog onto a separate spindle and/or upgrading
your RAID controller to something with a BBU and configured in
write-back mode would get most of your performance back.

If you aren't running PostgreSQL 8.3, that might also help update
performance significantly as well.

-Dave

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] insert and Update slow after implementing slony.

2008-12-16 Thread Nimesh Satam
Hi,

We are trying to implement slony as a replication tool for one of our
database. The Insert and updates have increased by approximately double
making some of our important script slow.

The database in concern is a warehouse and we have added additional primary
key to support slony by using default vale as SERIAL for the primary key.

Can you any one let us know what we can do to speed up the queries and if it
is a good idea to use Slony for db warehouse?

Let me know if you need any further informatiom.

Regards,
Nimesh.


Re: [PERFORM] insert/update tps slow with indices on table 1M rows

2008-06-04 Thread Matthew Wakeling

On Tue, 3 Jun 2008, andrew klassen wrote:
Basically, I have a somewhat constant rate of inserts/updates that go 
into a work queue and then get passed to postgres.



The cpu load is not that high, i.e. plenty of idle cpu. I am running an older
version of freebsd and the iostat output is not very detailed.


If you're running a work queue architecture, that probably means you 
only have one thread doing all the updates/inserts? It might be worth 
going multi-threaded, and issuing inserts and updates through more than 
one connection. Postgres is designed pretty well to scale performance by 
the number of simultaneous connections.


Matthew
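
As a rough sketch of that idea in C with libpq (connection parameters are
taken from the environment; the work_queue table, worker count and batch size
are all placeholders):

/* Sketch: several insert workers, each with its own connection and its own
 * batched transactions.  Build with something like: cc workers.c -lpq -lpthread
 */
#include <pthread.h>
#include <stdio.h>
#include <libpq-fe.h>

#define WORKERS    4
#define BATCHES    10
#define BATCH_SIZE 300

static void *worker(void *arg)
{
    long id = (long) arg;
    /* One connection per thread: libpq is thread-safe as long as a PGconn
     * object is never shared between threads (assuming it was built with
     * thread safety, which is the default). */
    PGconn *conn = PQconnectdb("");
    if (PQstatus(conn) != CONNECTION_OK) {
        fprintf(stderr, "worker %ld: %s", id, PQerrorMessage(conn));
        return NULL;
    }

    char sql[256];
    for (int b = 0; b < BATCHES; b++) {
        PQclear(PQexec(conn, "BEGIN"));
        for (int i = 0; i < BATCH_SIZE; i++) {
            snprintf(sql, sizeof sql,
                     "INSERT INTO work_queue (worker, item) VALUES (%ld, %d)",
                     id, b * BATCH_SIZE + i);
            PGresult *res = PQexec(conn, sql);
            if (PQresultStatus(res) != PGRES_COMMAND_OK)
                fprintf(stderr, "worker %ld: %s", id, PQerrorMessage(conn));
            PQclear(res);
        }
        PQclear(PQexec(conn, "COMMIT"));
    }

    PQfinish(conn);
    return NULL;
}

int main(void)
{
    pthread_t threads[WORKERS];
    for (long i = 0; i < WORKERS; i++)
        pthread_create(&threads[i], NULL, worker, (void *) i);
    for (int i = 0; i < WORKERS; i++)
        pthread_join(threads[i], NULL);
    return 0;
}

The work queue itself can stay single-threaded; only the database-facing side
needs more than one connection, and each PGconn must belong to exactly one
thread.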

--
Contrary to popular belief, Unix is user friendly. It just happens to be
very selective about who its friends are. -- Kyle Hearn
--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] insert/update tps slow with indices on table 1M rows

2008-06-04 Thread andrew klassen
I am using multiple threads, but only one worker thread for inserts/updates to
this table. I don't mind trying to add multiple threads for this table, but my
guess is it would not help, because the overall tps rate is decreasing so
dramatically. Since the cpu time consumed by the corresponding postgres server
process for my thread is small, it does not seem to be the bottleneck. There
has to be a bottleneck somewhere else.
Do you agree, or is there some flaw in my reasoning?

- Original Message 
From: Matthew Wakeling [EMAIL PROTECTED]
To: andrew klassen [EMAIL PROTECTED]
Cc: pgsql-performance@postgresql.org
Sent: Wednesday, June 4, 2008 5:31:22 AM
Subject: Re: [PERFORM] insert/update tps slow with indices on table  1M rows

On Tue, 3 Jun 2008, andrew klassen wrote:
 Basically, I have a somewhat constant rate of inserts/updates that go 
 into a work queue and then get passed to postgres.

 The cpu load is not that high, i.e. plenty of idle cpu. I am running an older
 version of freebsd and the iostat output is not very detailed.

If you're running a work queue architecture, that probably means you 
only have one thread doing all the updates/inserts? It might be worth 
going multi-threaded, and issuing inserts and updates through more than 
one connection. Postgres is designed pretty well to scale performance by 
the number of simultaneous connections.

Matthew

-- 
Contrary to popular belief, Unix is user friendly. It just happens to be
very selective about who its friends are.                -- Kyle Hearn
-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance



  

Re: [PERFORM] insert/update tps slow with indices on table 1M rows

2008-06-04 Thread andrew klassen

I agree that the discs are probably very busy. I do have 2 disks but they are
for redundancy. Would it help to put the data, indexes and xlog on separate 
disk partitions? 
I'll try adding more threads to update the table as you suggest.

- Original Message 
From: Matthew Wakeling [EMAIL PROTECTED]
To: pgsql-performance@postgresql.org
Sent: Wednesday, June 4, 2008 10:10:38 AM
Subject: Re: [PERFORM] insert/update tps slow with indices on table  1M rows

On Wed, 4 Jun 2008, andrew klassen wrote:
 I am using multiple threads, but only one worker thread for insert/updated to 
 this table.
 I don't mind trying to add multiple threads for this table, but my guess is 
 it would not
 help because basically the overall tps rate is decreasing so dramatically. 
 Since
 the cpu time consumed by the corresponding postgres server process for my 
 thread is
 small it does not seem to be the bottleneck. There has to be a bottleneck 
 somewhere else.
 Do you agree or is there some flaw in my reasoning?

There is indeed a flaw in your reasoning - there may be very little CPU 
time consumed, but that just indicates that the discs are busy. Getting 
Postgres to do multiple things at once will cause a more efficient use of 
the disc subsystem, resulting in greater overall throughput. This is 
especially the case if you have multiple discs in your box.

Matthew

-- 
Contrary to popular belief, Unix is user friendly. It just happens to be
very selective about who its friends are.                -- Kyle Hearn
-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance



  

Re: [PERFORM] insert/update tps slow with indices on table 1M rows

2008-06-04 Thread James Mansion

Matthew Wakeling wrote:
If you're running a work queue architecture, that probably means you 
only have one thread doing all the updates/inserts? It might be worth 
going multi-threaded, and issuing inserts and updates through more 
than one connection. Postgres is designed pretty well to scale 
performance by the number of simultaneous connections.
That would explain a disappointing upper limit on insert rate, but not 
any sort of cliff for the rate.  Nor, really, any material slowdown, if 
the single thread implies that we're stuck on round trip latency as a 
material limiting factor.


James


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] insert/update tps slow with indices on table 1M rows

2008-06-04 Thread James Mansion

andrew klassen wrote:

I'll try adding more threads to update the table as you suggest.
You could try materially increasing the update batch size too.  As an
exercise you could see what the performance of COPY is by backing out the
data and reloading it from a suitable file.


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] insert/update tps slow with indices on table 1M rows

2008-06-04 Thread andrew klassen
I am using the c-library interface and for these particular transactions
I preload PREPARE statements. Then as I get requests, I issue a BEGIN, 
followed by at most 300 EXECUTES and then a COMMIT. That is the
general scenario. What value beyond 300 should I try? 
Also, how might COPY (which involves file I/O) improve the 
above scenario? 
Thanks.
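
On the COPY question: with the libpq copy interface there is no file I/O
involved at all; the client can stream rows straight from memory. A rough
sketch (hypothetical table and columns, assuming an open PGconn) of replacing
one BEGIN / 300 EXECUTEs / COMMIT batch with a single COPY:

#include <stdio.h>
#include <libpq-fe.h>

/* Send one batch of rows as tab-separated text through COPY ... FROM STDIN.
 * Returns 0 on success, -1 on failure. */
static int copy_batch(PGconn *conn, int batch_no, int nrows)
{
    PGresult *res = PQexec(conn, "COPY work_queue (batch, item) FROM STDIN");
    if (PQresultStatus(res) != PGRES_COPY_IN) {
        fprintf(stderr, "COPY did not start: %s", PQerrorMessage(conn));
        PQclear(res);
        return -1;
    }
    PQclear(res);

    char line[64];
    for (int i = 0; i < nrows; i++) {
        /* One text-format row: columns separated by tabs, terminated by newline. */
        int len = snprintf(line, sizeof line, "%d\t%d\n", batch_no, i);
        if (PQputCopyData(conn, line, len) != 1)
            return -1;
    }
    if (PQputCopyEnd(conn, NULL) != 1)
        return -1;

    res = PQgetResult(conn);                    /* status of the whole COPY */
    int ok = (PQresultStatus(res) == PGRES_COMMAND_OK);
    if (!ok)
        fprintf(stderr, "COPY failed: %s", PQerrorMessage(conn));
    PQclear(res);
    return ok ? 0 : -1;
}

Each COPY runs as its own transaction unless wrapped in an explicit
BEGIN/COMMIT, and text values still need escaping if they can contain tabs,
backslashes or newlines.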


- Original Message 
From: James Mansion [EMAIL PROTECTED]
To: andrew klassen [EMAIL PROTECTED]
Cc: pgsql-performance@postgresql.org
Sent: Wednesday, June 4, 2008 3:20:26 PM
Subject: Re: [PERFORM] insert/update tps slow with indices on table  1M rows

andrew klassen wrote:
 I'll try adding more threads to update the table as you suggest.
You could try materially increasing the update batch size too.  As an 
exercise you could
see what the performance of COPY is by backing out the data and 
reloading it from
a suitable file.


  

[PERFORM] insert/update tps slow with indices on table 1M rows

2008-06-03 Thread andrew klassen
Running postgres 8.2.5
 
I have a table that has 5 indices, no foreign keys or any 
dependency on any other table. If I delete the database and 
start entering entries, everything works very well until I get
to some point (let's say 1M rows). Basically, I have a somewhat
constant rate of inserts/updates that go into a work queue and then
get passed to postgres. The work queue starts filling up as the
responsiveness slows down. For example at 1.5M 
rows it takes 2 seconds for 300 inserts issued in one transaction. 
 
Prior to this point I had added regular VACUUM ANALYZE on 
the table and it did help.  I increased maintenance work memory to 
128M. I also set the fillfactor on the table indices to 50% (not sure
if that made any difference; I have to study the results more closely).
 
In an effort to figure out the bottleneck, I DROPed 4 of the indices 
on the table and the tps increased to over 1000. I don't really know 
which index removal gave the best performance improvement. I 
dropped 2 32-bit indices and 2 text indices, all of which use btree.
 
The cpu load is not that high, i.e. plenty of idle cpu. I am running an older
version of freebsd and the iostat output is not very detailed.
During this time, the number is low (< 10Mbs). The system has an 
LSI Logic MegaRAID controller with 2 disks.
 
Any ideas on how to find the bottleneck/decrease overhead of index usage. 
 
Thanks.


  

Re: [PERFORM] insert/update tps slow with indices on table 1M rows

2008-06-03 Thread PFC
On Wed, 04 Jun 2008 00:36:09 +0200, andrew klassen [EMAIL PROTECTED]  
wrote:



Running postgres 8.2.5
 
I have a table that has 5 indices, no foreign keys or any
dependency on any other table. If delete the database and
start entering entries, everything works very well until I get
to some point (let's say 1M rows). Basically, I have a somewhat
constant rate of inserts/updates that go into a work queue and then
get passed to postgres. The work queue starts filling up as the
responsiveness slows down. For example at 1.5M
rows it takes 2 seconds for 300 inserts issued in one transaction.
 
Prior to this point I had added regular VACUUM ANALYZE on
the table and it did help.  I increased maintenance work memory to
128M. I also set the fillfactor on the table indices to 50% (not sure
if that made any difference have to study results more closely). 
 
In an effort to figure out the bottleneck, I DROPed 4 of the indices
on the table and the tps increased to over 1000. I don't really know
which index removal gave the best performance improvement. I
dropped 2 32-bit indices and 2 text indices which all using btree.
 
The cpu load is not that high, i.e. plenty of idle cpu. I am running an  
older

version of freebsd and the iostat output is not very detailed.
During this time, the number is low (< 10Mbs). The system has an
LSI Logic MegaRAID controller with 2 disks.
 
Any ideas on how to find the bottleneck/decrease overhead of index usage.
 
Thanks.


	If you are filling an empty table it is generally faster to create the  
indexes after the data import.
	Of course if this is a live table or you need the indexes during the  
import, this is not an option.
	I find it generally faster to lightly preprocess the data and generate  
text files that I then import using COPY, then doing the rest of the  
processing in SQL.


How much RAM in the box ? size of the data & indexes ?

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] insert/update tps slow with indices on table 1M rows

2008-06-03 Thread andrew klassen
I am not currently using copy, but  I am using prepared statements  
for table insert/updates so the overhead for the actual data transfer 
should be pretty good. I am sending at most  300 inserts/updates 
per transaction, but that is just an arbitrary value. When the queue 
grows, I could easily send more per transaction. I  did experiment 
a little, but it did not seem to help significantly at the time.
 
The system has 4G total memory. Shared memory is locked by the OS,
i.e. not paged so I am only using shared_buffers=28MB.
 
The maximum data per row is 324 bytes assuming maximum expected length of two 
text fields. There are 5 total indices: 1 8-byte, 2 4-byte and 2 text fields. 
As mentioned all indices are btree.
 

 
- Original Message 
From: PFC [EMAIL PROTECTED]
To: andrew klassen [EMAIL PROTECTED]; pgsql-performance@postgresql.org
Sent: Tuesday, June 3, 2008 7:15:10 PM
Subject: Re: [PERFORM] insert/update tps slow with indices on table  1M rows

On Wed, 04 Jun 2008 00:36:09 +0200, andrew klassen [EMAIL PROTECTED]  
wrote:

 Running postgres 8.2.5
  
 I have a table that has 5 indices, no foreign keys or any
 dependency on any other table. If delete the database and
 start entering entries, everything works very well until I get
 to some point (let's say 1M rows). Basically, I have a somewhat
 constant rate of inserts/updates that go into a work queue and then
 get passed to postgres. The work queue starts filling up as the
 responsiveness slows down. For example at 1.5M
 rows it takes 2 seconds for 300 inserts issued in one transaction.
  
 Prior to this point I had added regular VACUUM ANALYZE on
 the table and it did help.  I increased maintenance work memory to
 128M. I also set the fillfactor on the table indices to 50% (not sure
 if that made any difference have to study results more closely). 
  
 In an effort to figure out the bottleneck, I DROPed 4 of the indices
 on the table and the tps increased to over 1000. I don't really know
 which index removal gave the best performance improvement. I
 dropped 2 32-bit indices and 2 text indices which all using btree.
  
 The cpu load is not that high, i.e. plenty of idle cpu. I am running an  
 older
 version of freebsd and the iostat output is not very detailed.
 During this time, the number is low (< 10Mbs). The system has an
 LSI Logic MegaRAID controller with 2 disks.
  
 Any ideas on how to find the bottleneck/decrease overhead of index usage.
  
 Thanks.

    If you are filling an empty table it is generally faster to create the  
indexes after the data import.
    Of course if this is a live table or you need the indexes during the  
import, this is not an option.
    I find it generally faster to lightly preprocess the data and generate  
text files that I then import using COPY, then doing the rest of the  
processing in SQL.

    How much RAM in the box ? size of the data & indexes ?

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance



  

[PERFORM] Insert time

2008-04-01 Thread Ioana Danes
Hi everyone,

I am running a test with 1 thread calling a stored
procedure in an endless loop. The stored procedure
inserts 1000 records in a table that does not have
indexes or constraints.
In the log file I see that the time to execute the
procedure sometimes jumps from 100 ms to 700 ms.
The auto-vacuum is turned off.
Can anyone give me some details about this?

Thanks a lot,

17221%2008-04-01 09:22:53 ESTLOG:  statement: select *
from testinsert(11001,1000)
17221%2008-04-01 09:22:53 ESTLOG:  duration: 111.654
ms
17223%2008-04-01 09:22:53 ESTLOG:  statement: select *
from testinsert(11001,1000)
17223%2008-04-01 09:22:54 ESTLOG:  duration: 710.426
ms




-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Insert Statements Hanging

2007-07-25 Thread Alan Hodgson
On Wednesday 25 July 2007 13:27, Pallav Kalva [EMAIL PROTECTED] 
wrote:
 I am hoping SELECT 1 FROM ONLY provisioning.account x WHERE
 accountid = $1 FOR UPDATE OF x is causing the problem. If that is the
 case why doesnt it show in the pg_stat_activity view ? or am I missing
 something here ? what would be the reason for insert statement to hang
 like that ?

It's waiting for a lock, probably on one of the tables that it references 
for foreign keys.

8.1 or later would have that happen a lot less; they altered the locking 
requirements for foreign key lookups.
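
A quick way to confirm that is to look for ungranted locks in pg_locks; the
query is the useful part and can just as well be run from psql (for row-level
waits the relation column may be empty, but the pid and lock mode still
identify the waiter):

/* Sketch: list locks that backends are currently waiting on.
 * Build with something like: cc show_waits.c -lpq
 */
#include <stdio.h>
#include <libpq-fe.h>

int main(void)
{
    PGconn *conn = PQconnectdb("");     /* connection info from the environment */
    PGresult *res = PQexec(conn,
        "SELECT l.pid, l.relation::regclass AS relation, l.mode "
        "FROM pg_locks l WHERE NOT l.granted");

    if (PQresultStatus(res) == PGRES_TUPLES_OK)
        for (int i = 0; i < PQntuples(res); i++)
            printf("pid %s waits for %s on %s\n",
                   PQgetvalue(res, i, 0), PQgetvalue(res, i, 2), PQgetvalue(res, i, 1));
    else
        fprintf(stderr, "query failed: %s", PQerrorMessage(conn));

    PQclear(res);
    PQfinish(conn);
    return 0;
}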

-- 
It is a besetting vice of democracies to substitute public opinion for
law. - James Fenimore Cooper 


---(end of broadcast)---
TIP 6: explain analyze is your friend


[PERFORM] Insert Statements Hanging

2007-07-25 Thread Pallav Kalva
Hi,

I am having problems with some of the insert statements in the prod
database. Our client application is trying to insert some of the
records and they are not going through; they are just hanging. They are
running in a transaction and somehow it is not telling us what it is
waiting on. Here is the output from pg_stat_activity:

select current_query from pg_stat_activity where current_query <> '<IDLE>'
order by query_start;

current_query
--
 insert into provisioning.accountnote (fkserviceinstanceid, fkaccountid,
fknoteid, accountnoteid) values ($1, $2, $3, $4)

As soon as I kill the process id for this insert statement I see these
statements in the postgres log file

 FATAL:  terminating connection due to administrator command
 CONTEXT:  SQL statement SELECT 1 FROM ONLY provisioning.account x
WHERE accountid = $1 FOR UPDATE OF x

I am hoping SELECT 1 FROM ONLY provisioning.account x WHERE
accountid = $1 FOR UPDATE OF x is causing the problem. If that is the
case, why doesn't it show in the pg_stat_activity view? Or am I missing
something here? What would be the reason for the insert statement to hang
like that?

Postgres version:  8.0.12, vacuums and analyze are done regularly.

Here are table structures

\d provisioning.accountnote
   Table provisioning.accountnote
   Column|  Type   |   Modifiers
-+-+---
 accountnoteid   | integer | not null default
nextval('provisioning.AccountNoteSeq'::text)
 fkaccountid | integer | not null
 fknoteid| integer | not null
 fkserviceinstanceid | integer |
Indexes:
pk_accountnote_accountnoteid PRIMARY KEY, btree (accountnoteid)
idx_accountnote_fkaccountid btree (fkaccountid)
idx_accountnote_fknoteid btree (fknoteid)
idx_accountnote_fkserviceinstanceid btree (fkserviceinstanceid)
Foreign-key constraints:
fk_accountnote_serviceinstance FOREIGN KEY (fkserviceinstanceid)
REFERENCES provisioning.serviceinstance(serviceinstanceid)
fk_accountnote_note FOREIGN KEY (fknoteid) REFERENCES
common.note(noteid)
fk_accountnote_account FOREIGN KEY (fkaccountid) REFERENCES
provisioning.account(accountid)



\d provisioning.account
  Table provisioning.account
  Column  |Type
|   Modifiers
--+-+
 accountid| integer | not null
default nextval('provisioning.AccountSeq'::text)
 createdate   | timestamp without time zone | not null
default ('now'::text)::timestamp(6) without time zone
 fkcontactid  | integer |
 login| text| not null
 password | text|
 fkserviceproviderid  | integer |
 serviceproviderreference | text|
Indexes:
pk_account_accountid PRIMARY KEY, btree (accountid)
idx_account_fkcontactid btree (fkcontactid)
idx_account_login btree (login)
idx_account_serviceproviderreference btree (serviceproviderreference)
Foreign-key constraints:
fk_account_serviceprovider FOREIGN KEY (fkserviceproviderid)
REFERENCES provisioning.serviceprovider(serviceproviderid)
fk_account_contact FOREIGN KEY (fkcontactid) REFERENCES
common.contact(contactid)

Thanks!
Pallav.


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] insert vs select into performance

2007-07-23 Thread Michael Stone

On Wed, Jul 18, 2007 at 09:13:14PM +0200, Thomas Finneid wrote:

Michael Stone wrote:

I don't understand how the insert you described is table to table?


SELECT INTO is table to table, so is INSERT INTO SELECT FROM.


I could have sworn that at least one of the examples you gave didn't 
have any select. Doesn't really matter.


Mike Stone

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] insert vs select into performance

2007-07-18 Thread PFC


It's the time to parse statements, plan, execute, roundtrips with  
the client, context switches, time for your client library to escape  
the data and encode it and for postgres to decode it, etc. In a word :  
OVERHEAD.


I know there is some overhead, but that much when running it batched...?


Well, yeah ;)

Unfortunately its not fast enough, it needs to be done in no more than  
1-2 seconds, ( and in production it will be maybe 20-50 columns of data,  
perhaps divided over 5-10 tables.)
Additionally it needs to scale to perhaps three times as many columns  
and perhaps 2 - 3 times as many rows in some situation within 1 seconds.
Further on it needs to allow for about 20 - 50 clients reading much of  
that data before the next batch of data arrives.


Wow. What is the application ?

	Test run on a desktop PC, Athlon 64 3200+, 2 IDE disks in RAID1 (pretty  
slow) :


test= CREATE TABLE test (a INT, b INT, c INT, d INT, e INT, f INT);
CREATE TABLE
Temps : 11,463 ms

test= INSERT INTO test SELECT 1,2,3,4,5,a FROM generate_series( 1, 100000 ) as a;

INSERT 0 100000
Temps : 721,579 ms

	OK, so you see, insert speed is pretty fast. With a better CPU and faster  
disks, you can get a lot more.


test= TRUNCATE TABLE test;
TRUNCATE TABLE
Temps : 30,010 ms

test= ALTER TABLE test ADD PRIMARY KEY (f);
INFO:  ALTER TABLE / ADD PRIMARY KEY créera un index implicite «test_pkey»  
pour la table «test»

ALTER TABLE
Temps : 100,577 ms

test= INSERT INTO test SELECT 1,2,3,4,5,a FROM generate_series( 1, 100000 ) as a;

INSERT 0 100000
Temps : 1915,928 ms

This includes the time to update the index.

test= DROP TABLE test;
DROP TABLE
Temps : 28,804 ms

test= CREATE TABLE test (a INT, b INT, c INT, d INT, e INT, f INT);
CREATE TABLE
Temps : 1,626 ms

test= CREATE OR REPLACE FUNCTION test_insert( )
RETURNS VOID
LANGUAGE plpgsql
AS
$$
DECLARE
_i INTEGER;
BEGIN
FOR _i IN 0..100000 LOOP
INSERT INTO test (a,b,c,d,e,f) VALUES (1,2,3,4,5, _i);
END LOOP;
END;
$$;
CREATE FUNCTION
Temps : 51,948 ms

test= SELECT test_insert();
 test_insert
-

(1 ligne)

Temps : 1885,382 ms

	Now you see, performing 100K individual inserts inside a plpgsql function  
is also fast.
	The postgres engine is pretty damn fast ; it's the communication overhead  
that you feel, especially switching between client and server processes.


Another example :

= INSERT INTO test (a,b,c,d,e,f) VALUES (... 100000 integer tuples)
INSERT 0 100000
Temps : 1836,458 ms

	VALUES is actually pretty fast. Here, there is no context switch,  
everything is done in 1 INSERT.


	However COPY is much faster because the parsing overhead and de-escaping  
of data is faster. COPY is optimized for throughput.


So, advice :

	For optimum throughput, have your application build chunks of data into  
text files and use COPY. Or if your client lib supports the copy  
interface, use it.
	You will need a fast disk system with xlog and data on separate disks,  
several CPU cores (1 insert thread will max out 1 core, use the others for  
selects), lots of RAM so index updates don't need to seek, and tuning of  
bgwriter and checkpoints to avoid load spikes.



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] insert vs select into performance

2007-07-18 Thread Michael Stone

On Tue, Jul 17, 2007 at 10:58:01PM +0200, Thomas Finneid wrote:
I am not sure I understand you correctly here, are you saying that 
SELECT INTO in 8.1 disables WAL logging and uses just a single fsync at 
the end? in that case it means that I could disable WAL as well and 
achieve the same performance, does it not?


Yes. The difference is that the select into optimization just means that 
if the system crashes the data you're inserting is invalid (and is 
properly cleaned up), and disabling the WAL means that if the system 
crashes everything is invalid (and can't be cleaned up). 


Mike Stone

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] insert vs select into performance

2007-07-18 Thread Michael Stone

On Tue, Jul 17, 2007 at 11:01:15PM +0200, Thomas Finneid wrote:

Arjen van der Meijden wrote:
Have you also tried the COPY-statement? Afaik select into is similar to 
what happens in there.


No, because it only works on file to db or vice versa not table to table.


I don't understand how the insert you described is table to table?

Mike Stone

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [PERFORM] insert vs select into performance

2007-07-18 Thread Thomas Finneid


PFC wrote:
Unfortunately its not fast enough, it needs to be done in no more than 
1-2 seconds, ( and in production it will be maybe 20-50 columns of 
data, perhaps divided over 5-10 tables.)
Additionally it needs to scale to perhaps three times as many columns 
and perhaps 2 - 3 times as many rows in some situation within 1 seconds.
Further on it needs to allow for about 20 - 50 clients reading much of 
that data before the next batch of data arrives.


Wow. What is the application ?


Geological surveys, where they perform realtime geo/hydro-phone shots of 
areas of the size of 10x10km every 3-15 seconds.




test= CREATE OR REPLACE FUNCTION test_insert( )
RETURNS VOID
LANGUAGE plpgsql
AS
$$
DECLARE
_i INTEGER;
BEGIN
FOR _i IN 0..100000 LOOP
INSERT INTO test (a,b,c,d,e,f) VALUES (1,2,3,4,5, _i);
END LOOP;
END;
$$;
CREATE FUNCTION
Temps : 51,948 ms

test= SELECT test_insert();
 test_insert
-

(1 ligne)

Temps : 1885,382 ms


I tested this one and it took 4 seconds, compared to the jdbc insert
which took 14 seconds, so it's a lot faster, but not as fast as the
SELECT INTO.


I also tested an INSERT INTO ... SELECT FROM, which took 1.8 seconds; now we
are starting to talk about real performance.



However COPY is much faster because the parsing overhead and 
de-escaping of data is faster. COPY is optimized for throughput.


So, advice :

For optimum throughput, have your application build chunks of data 
into text files and use COPY. Or if your client lib supports the copy 
interface, use it.


I did test COPY, i.e. the jdbc COPY patch for pg 8.1; it performs at
approx 1.8 seconds :) The test was done with text input. I am going to
test it with binary input, which I expect will increase the performance
by 20-50%.


All these tests have been performed on a laptop with a Kubuntu 6.10
version of pg 8.1 without any special pg performance tuning. So I expect
that compiling the latest pg, doing some tuning on it and testing it on
a representative server will give it an additional boost in performance.


The key here is that with an abundance of performance, I can experiment
with the solution in a completely different way than if I had any
artificial restrictions.


You will need a fast disk system with xlog and data on separate 
disks, several CPU cores (1 insert thread will max out 1 core, use the 
others for selects), lots of RAM so index updates don't need to seek, 
and tuning of bgwriter and checkpoints to avoid load spikes.


will have a look at it.

regards

thomas

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] insert vs select into performance

2007-07-18 Thread Thomas Finneid

Michael Glaesemann wrote:


As they're individual inserts, I think what you're seeing is overhead 
from calling this statement 100,000 times, not just on the server but 
also the overhead through JDBC. For comparison, try


CREATE TABLE ciu_data_type_copy (LIKE ciu_data_type);

INSERT INTO ciu_data_type_copy (id, loc_id, value3, value5, value8, 
value9, value10, value11)

SELECT id, loc_id, value3, value5, value8, value9, value10, value11
FROM ciu_data_type;

I think this would be more comparable to what you're seeing.


This is much faster than my previous solution, but I also tested two
other solutions:

- a stored function with array arguments, which performed 3 times better.
- jdbc with the COPY patch, which performed 8.4 times faster with text input;
I expect binary input to be even faster.


regards

thomas

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] insert vs select into performance

2007-07-18 Thread Thomas Finneid



Michael Stone wrote:

On Tue, Jul 17, 2007 at 11:01:15PM +0200, Thomas Finneid wrote:

Arjen van der Meijden wrote:
Have you also tried the COPY-statement? Afaik select into is similar 
to what happens in there.


No, because it only works on file to db or vice versa not table to table.


I don't understand how the insert you described is table to table?


SELECT INTO is table to table, so is INSERT INTO SELECT FROM.

regards

thomas

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] insert vs select into performance

2007-07-18 Thread Heikki Linnakangas
Adriaan van Os wrote:
 So, how does one (temporarily) disable WAL logging ? Or, for example,
 disable WAL logging for a temporary table ?

Operations on temporary tables are never WAL logged. Operations on other
tables are, and there's no way to disable it.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[PERFORM] insert vs select into performance

2007-07-17 Thread Thomas Finneid

Hi

I was doing some testing on insert compared to select into. I 
inserted 100 000 rows (with 8 column values) into a table, which took 14 
seconds, compared to a select into, which took 0.8 seconds.
(fyi, the inserts were batched, autocommit was turned off and it all
happened on the local machine)


Now I am wondering why the select into is that much faster?
Does the select into translate into a specially optimised function in c 
that can cut corners which an insert can not do (e.g. lazy copying), or 
is it some other reason?


The reason I am asking is that select into shows that a number of rows
can be inserted into a table quite a lot faster than one would think was
possible with ordinary sql. If that is the case, it means that if I
write a pl-pgsql insert function in C instead of sql, then I can have
my db perform an order of magnitude faster.


Any comments?

regards

thomas

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] insert vs select into performance

2007-07-17 Thread Michael Glaesemann


On Jul 17, 2007, at 14:38 , Thomas Finneid wrote:

I was doing some testing on insert compared to select into. I  
inserted 100 000 rows (with 8 column values) into a table, which  
took 14 seconds, compared to a select into, which took 0.8 seconds.
(fyi, the inserts where batched, autocommit was turned off and it  
all happend on the local machine)


Now I am wondering why the select into is that much faster?


It would be helpful if you included the actual queries you're using,  
as there are a number of variables:


1) If there are any constraints on the original table, the INSERT  
will be checking those constraints. AIUI, SELECT INTO does not  
generate any table constraints.


2a) Are you using INSERT INTO foo (foo1, foo2, foo2) SELECT foo1,  
foo2, foo3 FROM pre_foo or individual inserts for each row? The  
former would be faster than the latter.


2b) If you are doing individual inserts, are you wrapping them in a  
transaction? The latter would be faster.


Michael Glaesemann
grzm seespotcode net



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] insert vs select into performance

2007-07-17 Thread Arjen van der Meijden
Have you also tried the COPY-statement? Afaik select into is similar to 
what happens in there.


Best regards,

Arjen

On 17-7-2007 21:38 Thomas Finneid wrote:

Hi

I was doing some testing on insert compared to select into. I 
inserted 100 000 rows (with 8 column values) into a table, which took 14 
seconds, compared to a select into, which took 0.8 seconds.
(fyi, the inserts where batched, autocommit was turned off and it all 
happend on the local machine)


Now I am wondering why the select into is that much faster?
Does the select into translate into a specially optimised function in c 
that can cut corners which a insert can not do (e.g. lazy copying), or 
is it some other reason?


The reason I am asking is that select into shows that a number of rows 
can be inserted into a table quite a lot faster than one would think was 
possible with ordinary sql. If that is the case, it means that if I 
write an pl-pgsql insert function in C instead of sql, then I can have 
my db perform order of magnitude faster.


Any comments?

regards

thomas

---(end of broadcast)---
TIP 6: explain analyze is your friend



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

