On Thu, Sep 17, 2009 at 07:10:35PM -0400, Andrew Dunstan wrote:
>
>
> Greg Smith wrote:
>> On Thu, 17 Sep 2009, Dan Colish wrote:
>>
>>> - Performance appears to be the same, although I don't have a good
>>> way of testing this at the moment.
>>
>> Here's what I do to generate simple COPY performance test cases:
>>
>> CREATE TABLE t (i integer);
>> INSERT INTO t SELECT x FROM generate_series(1,100000) AS x;
>> \timing
>> COPY t TO '/some/file' WITH [options];
>> BEGIN;
>> TRUNCATE TABLE t;
>> COPY t FROM '/some/file' WITH [options];
>> COMMIT;
>>
>> You can adjust the size of the generated table based on whether you
>> want to minimize (small number) or maximize (big number) the impact of
>> the setup overhead relative to the actual processing time. Big numbers
>> make sense if the patch changes per-row work; small ones make sense if
>> it's mainly the COPY setup that has changed, or if you just want a
>> small bit of data to test against.
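>>
>> For instance, only the generate_series bound needs to change; these
>> row counts are just illustrative:
>>
>> -- small table: COPY setup overhead dominates
>> INSERT INTO t SELECT x FROM generate_series(1,1000) AS x;
>> -- large table: per-row processing dominates
>> INSERT INTO t SELECT x FROM generate_series(1,10000000) AS x;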
>>
>> A single-column table like this one is a good test case for seeing
>> whether per-row overhead has gone up. You'd want something with a
>> wider row for other types of performance tests.
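>>
>> Something like this would do for a wider row (a sketch only; the
>> column mix is arbitrary):
>>
>> CREATE TABLE t_wide (i integer, d date, s text, p text, n numeric);
>> INSERT INTO t_wide
>>     SELECT x,
>>            current_date + (x % 365),   -- spread of dates
>>            md5(x::text),               -- short varying string
>>            repeat('x', 100),           -- fixed-width padding
>>            x / 3.0                     -- fractional numeric
>>     FROM generate_series(1,100000) AS x;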
>>
>> The reason for the BEGIN/COMMIT is that this form uses an optimization
>> that lowers WAL volume during the COPY insertion (loading into a table
>> created or truncated in the same transaction can skip WAL logging),
>> which makes it more likely you'll be testing the performance of the
>> right thing.
>>
>>
>
> I usually prefer to test with a table that is more varied than anything
> you can make with generate_series. When I tested my ragged copy patch
> the other day I copied 1,000,000 rows out of a large table with a
> mixture of dates, strings, numbers and nulls.
>
> But then, that patch has a (tiny) per-field overhead, so I wanted to
> make sure that was well represented in the test.
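>
> (If you don't have a suitable table lying around, something like this
> can fake a similar mix; the column choices are only illustrative:
>
> CREATE TABLE t_mixed AS
>     SELECT x AS id,
>            current_date - (x % 1000) AS d,
>            md5(x::text) AS s,
>            CASE WHEN x % 10 = 0 THEN NULL ELSE x / 7.0 END AS n
>     FROM generate_series(1,1000000) AS x;
>
> That gives a million rows of dates, strings, numbers and nulls.)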
>
> You are certainly right about wrapping it in begin/truncate/commit (and
> when you do, make sure that archiving is not on).
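>
> (A quick way to verify that, assuming a server recent enough to have
> the setting:
>
> SHOW archive_mode;
>
> It should report "off" for the WAL-skipping optimization to apply.)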
>
> You probably want to make sure that the file is not on the same disk
> as the database, to avoid disk contention. Or, better, make sure that
> it is in the OS file system cache, or on a RAM disk.
>
> cheers
>
> andrew
If someone with a more significant setup can run tests, that would be
ideal. I only have my laptop, which has a single disk and is fairly
underpowered. That said, here are my results from running the script
above; it looks like the patch improves performance. I would be really
interested to see results on a larger data set and heavier iron.
--
--Dan
Without Patch:
CREATE TABLE
INSERT 0 100000
Timing is on.
COPY 100000
Time: 83.273 ms
BEGIN
Time: 0.412 ms
TRUNCATE TABLE
Time: 0.357 ms
COPY 100000
Time: 140.911 ms
COMMIT
Time: 4.909 ms
With Patch:
CREATE TABLE
INSERT 0 100000
Timing is on.
COPY 100000
Time: 80.205 ms
BEGIN
Time: 0.351 ms
TRUNCATE TABLE
Time: 0.346 ms
COPY 100000
Time: 124.303 ms
COMMIT
Time: 4.130 ms