On Thu, Sep 17, 2009 at 07:10:35PM -0400, Andrew Dunstan wrote:
>
>
> Greg Smith wrote:
>> On Thu, 17 Sep 2009, Dan Colish wrote:
>>
>>>     - Performance appears to be the same, although I don't have a good
>>>       way of testing this at the moment.
>>
>> Here's what I do to generate simple COPY performance test cases:
>>
>> CREATE TABLE t (i integer);
>> INSERT INTO t SELECT x FROM generate_series(1,100000) AS x;
>> \timing
>> COPY t TO '/some/file' WITH [options];
>> BEGIN;
>> TRUNCATE TABLE t;
>> COPY t FROM '/some/file' WITH [options];
>> COMMIT;
>>
>> You can adjust the size of the generated table depending on whether you  
>> want to minimize (small number) or maximize (big number) the impact of  
>> the setup overhead relative to actual processing time.  Big numbers make  
>> sense if the patch changes per-row work; small ones if it mainly changes  
>> COPY setup, or if you just want a small bit of data to test against.
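>>
>> For example, to scale the same setup up by 100x (the row count here is  
>> arbitrary):
>>
>> INSERT INTO t SELECT x FROM generate_series(1,10000000) AS x;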
>>
>> A table with a single column, like the one above, is a good test case  
>> for seeing whether per-row impact has gone up.  You'd want something  
>> with a wider row for other types of performance tests.
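>>
>> A wider variant might look like this (the column mix and widths are  
>> arbitrary, just for illustration):
>>
>> CREATE TABLE t_wide (i integer, s text, n numeric, d date);
>> INSERT INTO t_wide
>>     SELECT x, repeat('x', 100), x * 0.01, current_date
>>     FROM generate_series(1,100000) AS x;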
>>
>> The reason for the BEGIN/COMMIT is that this form uses an optimization  
>> that lowers WAL volume during the COPY insertion (a table truncated in  
>> the same transaction doesn't need its inserts WAL-logged, as long as  
>> archiving is off), which makes it more likely you'll be testing the  
>> performance of the right thing.
>>
>>
>
> I usually prefer to test with a table that is more varied than anything  
> you can make with generate_series. When I tested my ragged copy patch  
> the other day I copied 1,000,000 rows out of a large table with a  
> mixture of dates, strings, numbers and nulls.
>
> But then, that patch adds a (tiny) per-field overhead, so I wanted to  
> make sure that was well represented in the test.
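>
> If you don't have such a table handy, something along these lines can  
> approximate one (a sketch only; the name and value distributions are  
> arbitrary):
>
> CREATE TABLE t_mixed AS
>     SELECT x AS id,
>            current_date - (x % 365) AS d,
>            md5(x::text) AS s,
>            CASE WHEN x % 10 = 0 THEN NULL ELSE x * 0.37 END AS n
>     FROM generate_series(1,1000000) AS x;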
>
> You are certainly right about wrapping it in begin/truncate/commit (and  
> when you do, make sure that archiving is not on).
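>
> A quick way to check (the setting name assumes 8.3 or later):
>
> SHOW archive_mode;   -- should report 'off'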
>
> You probably want to make sure that the file is not on the same disk as  
> the database, to avoid disk contention. Or, better, make sure that it is  
> in the OS file system cache, or on a RAM disk.
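>
> On Linux, for example, pointing COPY at a tmpfs mount keeps the flat  
> file off the data disk entirely (the path is just illustrative):
>
> COPY t TO '/dev/shm/copytest.dat';
> COPY t FROM '/dev/shm/copytest.dat';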
>
> cheers
>
> andrew

If someone with a more significant setup can run tests, that would be
ideal. I only have my laptop, which has a single disk and is fairly
underpowered.

That said, here are my results from running the script above; it looks
like the patch improves performance (COPY FROM drops from about 141 ms
to 124 ms, roughly 12% faster). I would be really interested to see
results on a larger data set and heavier iron.

--
--Dan

Without Patch:

        CREATE TABLE
        INSERT 0 100000
        Timing is on.
        COPY 100000
        Time: 83.273 ms
        BEGIN
        Time: 0.412 ms
        TRUNCATE TABLE
        Time: 0.357 ms
        COPY 100000
        Time: 140.911 ms
        COMMIT
        Time: 4.909 ms

With Patch:

        CREATE TABLE
        INSERT 0 100000
        Timing is on.
        COPY 100000
        Time: 80.205 ms
        BEGIN
        Time: 0.351 ms
        TRUNCATE TABLE
        Time: 0.346 ms
        COPY 100000
        Time: 124.303 ms
        COMMIT
        Time: 4.130 ms



