Re: [HACKERS] [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up

2010-11-03 Thread Marc Cousin
The Saturday 30 October 2010 11:05:17, Andres Freund wrote :
 Hi,
 
 This thread died after me not implementing a new version and some potential
 license problems.
 
 I still think it's worthwhile (and I used it in production for some time) so
 I would like to implement a version fit for the next commitfest.
 
 The code I started out from is under the zlib license - which is, to
 my knowledge, compatible with PG's licence. What's the position of HACKERS
 there? There already is some separately licenced code around and we're
 already linking to zlib licenced code...
 
 For simplicity I asked Mark Adler (the original copyright owner) if he
 would be willing to relicence - he is not.
 
 For anybody not hoarding all old mail like me, here is a link to the archives
 about my old patch:
 
 http://archives.postgresql.org/message-id/201005202227.49990.and...@anarazel.de
 
 
 Andres

I forgot to report this a few months ago:

I had a very intensive COPY load, and this patch helped. The context was a 
server that was CPU bound on loading data (8 COPY on the same table in 
parallel, not indexed). This patch gave me a 10% boost in load time. I don't 
have the figures right now, but I could try to do this test again if this can 
help. At that time, I just tried it out of curiosity, but the load time was 
sufficient without it, so I didn't spend more time on it.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up

2010-10-31 Thread Robert Haas
On Sat, Oct 30, 2010 at 5:05 AM, Andres Freund and...@anarazel.de wrote:
 This thread died after me not implementing a new version and some potential
 license problems.

 I still think it's worthwhile (and I used it in production for some time) so I
 would like to implement a version fit for the next commitfest.

 The code I started out from is under the zlib license - which is, to my
 knowledge, compatible with PG's licence. What's the position of HACKERS there?
 There already is some separately licenced code around and we're already linking
 to zlib licenced code...

 For simplicity I asked Mark Adler (the original copyright owner) if he would
 be willing to relicence - he is not.

 For anybody not hoarding all old mail like me, here is a link to the archives
 about my old patch:

 http://archives.postgresql.org/message-id/201005202227.49990.and...@anarazel.de

IANAL, but the license doesn't appear incompatible to me.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up

2010-10-30 Thread Andres Freund
Hi,

This thread died after me not implementing a new version and some potential
license problems.

I still think it's worthwhile (and I used it in production for some time) so I
would like to implement a version fit for the next commitfest.

The code I started out from is under the zlib license - which is, to my
knowledge, compatible with PG's licence. What's the position of HACKERS there?
There already is some separately licenced code around and we're already linking
to zlib licenced code...

For simplicity I asked Mark Adler (the original copyright owner) if he would
be willing to relicence - he is not.

For anybody not hoarding all old mail like me, here is a link to the archives
about my old patch:

http://archives.postgresql.org/message-id/201005202227.49990.and...@anarazel.de


Andres



Re: [HACKERS] [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up

2010-05-30 Thread Andres Freund
On Sunday 30 May 2010 04:56:09 Greg Stark wrote:
 This sounds familiar. If you search back in the archives around 2004
 or so I think you'll find a similar discussion when we replaced the
 crc32 implementation with what we have now. We put a fair amount of
 effort into searching for faster implementations so if you've found
 one 3x faster I'm pretty startled. 
None of those considered processing more than one byte at a time.
Most if not all current architectures are more or less superscalar (explicitly
via the compiler or implicitly via somewhat intelligent silicon) - the current
algorithm has an ordering restriction that prevents any benefit from that.
Basically it needs the CRC of the previous byte to process the next one - the
zlib/my version computes 4 bytes independently and then combines them, which
results in much better overall utilization.
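
The ordering restriction can be seen in a minimal byte-at-a-time sketch. This is an illustration only: it uses the standard reflected CRC-32 (polynomial 0xEDB88320, as in zlib), not PostgreSQL's variant, and the demo_* names are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

static uint32_t demo_crc_table[256];

/* Fill the table for the standard reflected CRC-32 (polynomial 0xEDB88320). */
static void
demo_crc_init(void)
{
    for (uint32_t i = 0; i < 256; i++)
    {
        uint32_t c = i;

        for (int bit = 0; bit < 8; bit++)
            c = (c & 1) ? (c >> 1) ^ 0xEDB88320u : c >> 1;
        demo_crc_table[i] = c;
    }
}

/* One table lookup per input byte. */
static uint32_t
demo_crc32_bytewise(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t    crc = 0xFFFFFFFFu;

    while (len-- > 0)
    {
        /*
         * The table index depends on the crc value produced by the previous
         * iteration, so the lookups form a single serial dependency chain.
         */
        crc = demo_crc_table[(crc ^ *p++) & 0xFF] ^ (crc >> 8);
    }
    return crc ^ 0xFFFFFFFFu;
}
```

After calling demo_crc_init() once, demo_crc32_bytewise("123456789", 9) yields 0xCBF43926, the usual CRC-32 check value. No iteration of the loop can start its lookup before the previous one has finished.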

 Are you sure it's faster on all
 architectures and not a win sometimes and a loss other times? And are
 you sure it's faster in our use case where we're crcing small
 sequences of data often and not crcing a large block?
I tried on several and it was never a loss at 16+ bytes, never worse at 8, and
most of the time equal if not better at 4. Sizes of 1-4 are somewhat slower as
they use the same algorithm as the old version but have an additional jump.
That's a difference of about 3-4 cycles.

I will try to implement an updated patch sometime these days.

Andres



Re: [HACKERS] [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up

2010-05-29 Thread Bruce Momjian

Added to TODO:

Consider a faster CRC32 algorithm

* http://archives.postgresql.org/pgsql-hackers/2010-05/msg01112.php 

---

Andres Freund wrote:
 Hi,
 
 I started to analyze XLogInsert because it was the major bottleneck when
 creating some materialized view/cached tables/whatever.
 Analyzing it I could see that the content of the COMP_CRC32 macro was taking
 most of the time, which isn't immediately obvious when you profile because it
 doesn't show up as a separate function.
 I first put it into functions to make it easier to profile. I couldn't measure
 any difference for COPY, CTAS and a simple pgbench run on 3 kinds of hardware
 (Core2, older Xeon, older Sparc systems).
 
 I looked around a bit for faster implementations of CRC32 and found one in
 zlib. After adapting it (pg uses a slightly different, non-inverted
 computation) I found that it increases the speed of the CRC32 calculation
 itself 3 fold.
 It does that by not only using one lookup table but four (one for each byte of
 a word). Those four calculations are independent and thus are considerably
 faster on somewhat recent hardware.
 Also it does memory lookups in 4 byte steps instead of 1 byte as the pg
 version (that's only about an 8% benefit in itself).
 
 I wrote a preliminary patch which includes both, the original implementation 
 and the new one switchable via an #define.
 
 
 I tested performance differences in a small number of scenarios:
 - CTAS/INSERT ... SELECT (8-30%)
 - COPY (3-20%)
 - pgbench (no real difference unless directly after a checkpoint)
 
 Setup:
 
 CREATE TABLE blub (ai int, bi int, aibi int);
 CREATE TABLE speedtest (ai int, bi int, aibi int);
 
 
 INSERT ... SELECT:
 
 Statement:
 INSERT INTO blub SELECT a.i, b.i, a.i *b.i FROM generate_series(1, 1) 
 a(i), generate_series(1, 1000) b(i);
 
 legacy crc:
 
 11526.588
 11406.518
 11412.182
 11430.245
 
 zlib:
 9977.394
 9945.408
 9840.907
 9842.875
 
 
 COPY:
 Statement:
 ('blub' enlarged here 4 times, as otherwise the variances were too large)
 
 COPY blub TO '/tmp/b' BINARY;
 ...
 CHECKPOINT;TRUNCATE speedtest; COPY speedtest FROM '/tmp/b' BINARY;
 
 legacy:
 44835.840
 44832.876
 
 zlib:
 39530.549
 39365.109
 39295.167
 
 The performance differences are bigger if the table rows are significantly 
 bigger. 
 
 Do you think something like that is sensible? If yes, I will make it into a 
 proper patch and such.
 
 Thanks,
 
 Andres
 
 INSERT ... SELECT profile before patch:
 
 20.22% postgres  postgres   [.] comp_crc32
  5.77% postgres  postgres   [.] XLogInsert
  5.55% postgres  postgres   [.] LWLockAcquire
  5.21% postgres  [kernel.   [k] copy_user_generic_string
  4.64% postgres  postgres   [.] LWLockRelease
  4.39% postgres  postgres   [.] ReadBuffer_common
  2.75% postgres  postgres   [.] heap_insert
  2.22% postgres  libc-2.1   [.] memcpy
  2.09% postgres  postgres   [.] UnlockReleaseBuffer
  1.85% postgres  postgres   [.] hash_any
  1.77% postgres  [kernel.   [k] clear_page_c
  1.69% postgres  postgres   [.] hash_search_with_hash_value
  1.61% postgres  postgres   [.] heapgettup_pagemode
  1.50% postgres  postgres   [.] PageAddItem
  1.42% postgres  postgres   [.] MarkBufferDirty
  1.28% postgres  postgres   [.] RelationGetBufferForTuple
  1.15% postgres  postgres   [.] ExecModifyTable
  1.06% postgres  postgres   [.] RelationPutHeapTuple
 
 
 After:
 
  9.97% postgres  postgres   [.] comp_crc32
  5.95% postgres  [kernel.   [k] copy_user_generic_string
  5.94% postgres  postgres   [.] LWLockAcquire
  5.64% postgres  postgres   [.] XLogInsert
  5.11% postgres  postgres   [.] LWLockRelease
  4.63% postgres  postgres   [.] ReadBuffer_common
  3.45% postgres  postgres   [.] heap_insert
  2.54% postgres  libc-2.1   [.] memcpy
  2.03% postgres  postgres   [.] UnlockReleaseBuffer
  1.94% postgres  postgres   [.] hash_search_with_hash_value
  1.84% postgres  postgres   [.] hash_any
  1.73% postgres  [kernel.   [k] clear_page_c
  1.68% postgres  postgres   [.] PageAddItem
  1.62% postgres  postgres   [.] heapgettup_pagemode
  1.52% postgres  postgres   [.] RelationGetBufferForTuple
  1.47% postgres  postgres   [.] MarkBufferDirty
  1.30% postgres  postgres   [.] ExecModifyTable
  1.23% postgres  postgres   [.] RelationPutHeapTuple

[ Attachment, skipping... ]

 

[HACKERS] [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up

2010-05-20 Thread Andres Freund
Hi,

I started to analyze XLogInsert because it was the major bottleneck when 
creating some materialized view/cached tables/whatever.
Analyzing it I could see that the content of the COMP_CRC32 macro was taking
most of the time, which isn't immediately obvious when you profile because it
doesn't show up as a separate function.
I first put it into functions to make it easier to profile. I couldn't measure 
any difference for COPY, CTAS and a simple pgbench run on 3 kinds of hardware 
(Core2, older Xeon, older Sparc systems).
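
Moving a macro body into a function so that a profiler can attribute time to it could look roughly like the sketch below. DEMO_COMP_CRC32 is a hypothetical stand-in (a table-free, bit-at-a-time reflected CRC-32), not PostgreSQL's COMP_CRC32, and demo_comp_crc32 and DEMO_NOINLINE are names invented for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a COMP_CRC32-style macro: expanded at every
 * call site, so a profiler charges its cost to the calling function. */
#define DEMO_COMP_CRC32(crc, data, len) \
do { \
    const unsigned char *p_ = (const unsigned char *) (data); \
    size_t      n_ = (len); \
\
    while (n_-- > 0) \
    { \
        (crc) ^= *p_++; \
        for (int k_ = 0; k_ < 8; k_++) \
            (crc) = ((crc) & 1) ? ((crc) >> 1) ^ 0xEDB88320u : (crc) >> 1; \
    } \
} while (0)

/* Ask GCC/Clang not to inline the wrapper, so it keeps its own symbol. */
#if defined(__GNUC__)
#define DEMO_NOINLINE __attribute__((noinline))
#else
#define DEMO_NOINLINE
#endif

/* Out-of-line wrapper: shows up as a separate entry in a profile. */
DEMO_NOINLINE uint32_t
demo_comp_crc32(uint32_t crc, const void *data, size_t len)
{
    DEMO_COMP_CRC32(crc, data, len);
    return crc;
}
```

A caller starts from 0xFFFFFFFF, feeds data through demo_comp_crc32(), and XORs the result with 0xFFFFFFFF at the end. The only cost added over the macro is the function call itself.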

I looked around a bit for faster implementations of CRC32 and found one in
zlib. After adapting it (pg uses a slightly different, non-inverted
computation) I found that it increases the speed of the CRC32 calculation
itself 3 fold.
It does that by not only using one lookup table but four (one for each byte of 
a word). Those four calculations are independent and thus are considerably 
faster on somewhat recent hardware.
Also it does memory lookups in 4 byte steps instead of 1 byte as the pg
version (that's only about an 8% benefit in itself).
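
A minimal sketch of the four-table idea follows. It is neither the code from the patch nor zlib's source: it uses the standard reflected CRC-32 rather than PostgreSQL's variant, it assembles each word from individual bytes so that it does not depend on endianness or alignment (a compiler may or may not turn that into a single 4 byte load), and the slice_* and crc32_slice4 names are invented here.

```c
#include <stddef.h>
#include <stdint.h>

static uint32_t slice_table[4][256];

static void
slice_init(void)
{
    /* Table 0 is the ordinary byte-wise table (polynomial 0xEDB88320). */
    for (uint32_t i = 0; i < 256; i++)
    {
        uint32_t c = i;

        for (int bit = 0; bit < 8; bit++)
            c = (c & 1) ? (c >> 1) ^ 0xEDB88320u : c >> 1;
        slice_table[0][i] = c;
    }
    /* Table k holds the CRC of byte i followed by k zero bytes. */
    for (uint32_t i = 0; i < 256; i++)
        for (int k = 1; k < 4; k++)
        {
            uint32_t prev = slice_table[k - 1][i];

            slice_table[k][i] = (prev >> 8) ^ slice_table[0][prev & 0xFF];
        }
}

static uint32_t
crc32_slice4(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t    crc = 0xFFFFFFFFu;

    while (len >= 4)
    {
        /* Fold four input bytes into the running crc at once. */
        crc ^= (uint32_t) p[0] | ((uint32_t) p[1] << 8) |
               ((uint32_t) p[2] << 16) | ((uint32_t) p[3] << 24);
        /* The four lookups do not depend on each other. */
        crc = slice_table[3][crc & 0xFF] ^
              slice_table[2][(crc >> 8) & 0xFF] ^
              slice_table[1][(crc >> 16) & 0xFF] ^
              slice_table[0][crc >> 24];
        p += 4;
        len -= 4;
    }
    /* Remaining 0-3 bytes use the one-table, byte-at-a-time step. */
    while (len-- > 0)
        crc = slice_table[0][(crc ^ *p++) & 0xFF] ^ (crc >> 8);
    return crc ^ 0xFFFFFFFFu;
}
```

After slice_init(), crc32_slice4("123456789", 9) gives 0xCBF43926, the same value as a plain byte-wise CRC-32. The four tables take 4 kB instead of 1 kB, which is the price for breaking the per-byte dependency.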

I wrote a preliminary patch which includes both, the original implementation 
and the new one switchable via an #define.


I tested performance differences in a small number of scenarios:
- CTAS/INSERT ... SELECT (8-30%)
- COPY (3-20%)
- pgbench (no real difference unless directly after a checkpoint)

Setup:

CREATE TABLE blub (ai int, bi int, aibi int);
CREATE TABLE speedtest (ai int, bi int, aibi int);


INSERT ... SELECT:

Statement:
INSERT INTO blub SELECT a.i, b.i, a.i *b.i FROM generate_series(1, 1) 
a(i), generate_series(1, 1000) b(i);

legacy crc:

11526.588
11406.518
11412.182
11430.245

zlib:
9977.394
9945.408
9840.907
9842.875


COPY:
Statement:
('blub' enlarged here 4 times, as otherwise the variances were too large)

COPY blub TO '/tmp/b' BINARY;
...
CHECKPOINT;TRUNCATE speedtest; COPY speedtest FROM '/tmp/b' BINARY;

legacy:
44835.840
44832.876

zlib:
39530.549
39365.109
39295.167
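
For reference, the relative improvement implied by the timings above can be worked out as in the sketch below. The numbers are copied from the runs quoted above; the unit is not stated (milliseconds is an assumption, and the percentage does not depend on it). The helper names are made up.

```c
#include <stddef.h>

/* Timings from the INSERT ... SELECT and COPY runs above. */
static const double insert_legacy[] = {11526.588, 11406.518, 11412.182, 11430.245};
static const double insert_zlib[] = {9977.394, 9945.408, 9840.907, 9842.875};
static const double copy_legacy[] = {44835.840, 44832.876};
static const double copy_zlib[] = {39530.549, 39365.109, 39295.167};

/* Arithmetic mean of n samples. */
static double
mean_of(const double *v, size_t n)
{
    double      sum = 0.0;

    for (size_t i = 0; i < n; i++)
        sum += v[i];
    return sum / (double) n;
}

/* Percentage by which the new mean is lower than the legacy mean. */
static double
pct_reduction(const double *legacy, size_t nl, const double *fresh, size_t nf)
{
    double      l = mean_of(legacy, nl);

    return 100.0 * (l - mean_of(fresh, nf)) / l;
}
```

With these figures, pct_reduction(insert_legacy, 4, insert_zlib, 4) comes out at roughly 13.5 and pct_reduction(copy_legacy, 2, copy_zlib, 3) at roughly 12.1, both inside the ranges listed for CTAS/INSERT ... SELECT and COPY.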

The performance differences are bigger if the table rows are significantly 
bigger. 

Do you think something like that is sensible? If yes, I will make it into a 
proper patch and such.

Thanks,

Andres

INSERT ... SELECT profile before patch:

20.22% postgres  postgres   [.] comp_crc32
 5.77% postgres  postgres   [.] XLogInsert
 5.55% postgres  postgres   [.] LWLockAcquire
 5.21% postgres  [kernel.   [k] copy_user_generic_string
 4.64% postgres  postgres   [.] LWLockRelease
 4.39% postgres  postgres   [.] ReadBuffer_common
 2.75% postgres  postgres   [.] heap_insert
 2.22% postgres  libc-2.1   [.] memcpy
 2.09% postgres  postgres   [.] UnlockReleaseBuffer
 1.85% postgres  postgres   [.] hash_any
 1.77% postgres  [kernel.   [k] clear_page_c
 1.69% postgres  postgres   [.] hash_search_with_hash_value
 1.61% postgres  postgres   [.] heapgettup_pagemode
 1.50% postgres  postgres   [.] PageAddItem
 1.42% postgres  postgres   [.] MarkBufferDirty
 1.28% postgres  postgres   [.] RelationGetBufferForTuple
 1.15% postgres  postgres   [.] ExecModifyTable
 1.06% postgres  postgres   [.] RelationPutHeapTuple


After:

 9.97% postgres  postgres   [.] comp_crc32
 5.95% postgres  [kernel.   [k] copy_user_generic_string
 5.94% postgres  postgres   [.] LWLockAcquire
 5.64% postgres  postgres   [.] XLogInsert
 5.11% postgres  postgres   [.] LWLockRelease
 4.63% postgres  postgres   [.] ReadBuffer_common
 3.45% postgres  postgres   [.] heap_insert
 2.54% postgres  libc-2.1   [.] memcpy
 2.03% postgres  postgres   [.] UnlockReleaseBuffer
 1.94% postgres  postgres   [.] hash_search_with_hash_value
 1.84% postgres  postgres   [.] hash_any
 1.73% postgres  [kernel.   [k] clear_page_c
 1.68% postgres  postgres   [.] PageAddItem
 1.62% postgres  postgres   [.] heapgettup_pagemode
 1.52% postgres  postgres   [.] RelationGetBufferForTuple
 1.47% postgres  postgres   [.] MarkBufferDirty
 1.30% postgres  postgres   [.] ExecModifyTable
 1.23% postgres  postgres   [.] RelationPutHeapTuple
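
As a rough cross-check, the two comp_crc32 shares can be turned into an implied speedup. This sketch assumes that everything except comp_crc32 costs the same before and after the patch, which a sampling profile only approximates; the function names are invented.

```c
/*
 * With share b before and a after, and all other work unchanged:
 *   crc time before = b / (1 - b) * other
 *   crc time after  = a / (1 - a) * other
 */
static double
implied_crc_speedup(double share_before, double share_after)
{
    return (share_before / (1.0 - share_before)) /
           (share_after / (1.0 - share_after));
}

/* Fraction of total runtime saved under the same assumption. */
static double
implied_total_reduction(double share_before, double share_after)
{
    return 1.0 - (1.0 - share_before) / (1.0 - share_after);
}
```

For 20.22% before and 9.97% after, this gives a CRC speedup of about 2.3x and a total runtime reduction of about 11%, in the same range as the measured INSERT ... SELECT timings.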
From f8ec18769e581cf039535730d2088466c461d8f6 Mon Sep 17 00:00:00 2001
From: Andres Freund and...@anarazel.de
Date: Thu, 29 Apr 2010 22:19:08 +0200
Subject: [PATCH] Preliminary patch using an improved out of line crc32 computation for
 the xlog.

---
 src/backend/access/transam/xlog.c |   66 +-
 src/backend/utils/hash/pg_crc.c   |  142 -
 src/include/utils/pg_crc.h        |    9 ++-
 3 files changed, 180 insertions(+), 37 deletions(-)

diff --git a/src/backend/access/transam/xlog.c 

Re: [HACKERS] [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up

2010-05-20 Thread Stephen Frost
Andres,

* Andres Freund (and...@anarazel.de) wrote:
 Statement:
 INSERT INTO blub SELECT a.i, b.i, a.i *b.i FROM generate_series(1, 1) 
 a(i), generate_series(1, 1000) b(i);
 
 legacy crc:
 
 zlib:

Is this legacy crc using the function-based calls, or the macro?  Do you
have statistics for the zlib approach vs unmodified PG?

 Do you think something like that is sensible? If yes, I will make it into a 
 proper patch and such.

I think that in general we're typically looking for ways to improve
performance, yes.. :)

Thanks,

Stephen




Re: [HACKERS] [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up

2010-05-20 Thread Andres Freund
Hi Stephen,

On Thursday 20 May 2010 22:39:26 Stephen Frost wrote:
 * Andres Freund (and...@anarazel.de) wrote:
  Statement:
  INSERT INTO blub SELECT a.i, b.i, a.i *b.i FROM generate_series(1, 1)
  a(i), generate_series(1, 1000) b(i);
  
  legacy crc:
 Is this legacy crc using the function-based calls, or the macro?  Do you
 have statistics for the zlib approach vs unmodified PG?
'legacy' is out of line as well. I couldn't find a real performance difference
above noise between out of line (function) and inline (macro). If anything, out
of line was a bit faster (instruction cache usage could cause that).

So vanilla vs. zlib should give the same result as legacy vs. zlib.

Greetings, 
Andres



Re: [HACKERS] [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up

2010-05-20 Thread Robert Haas
On Thu, May 20, 2010 at 4:27 PM, Andres Freund and...@anarazel.de wrote:
 I looked a bit around for faster implementations of CRC32 and found one in
 zlib. After adapting it (pg uses slightly different computation (non-
 inverted)) I found that it increases the speed of the CRC32 calculation itself
 3 fold.

But zlib is not under the PostgreSQL license.

...Robert



Re: [HACKERS] [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up

2010-05-20 Thread Andres Freund
On Friday 21 May 2010 05:40:03 Robert Haas wrote:
 On Thu, May 20, 2010 at 4:27 PM, Andres Freund and...@anarazel.de wrote:
  I looked a bit around for faster implementations of CRC32 and found one
  in zlib. After adapting it (pg uses slightly different computation (non-
  inverted)) I found that it increases the speed of the CRC32 calculation
  itself 3 fold.
 
 But zlib is not under the PostgreSQL license.
Yes. But:
1. the zlib license shouldn't be a problem in itself - pg_dump also already
links to zlib
2. I planned to ask Mark Adler whether he would support relicensing those bits.
I have read some other discussions where he was supportive of doing such a
thing
3. Given that the idea was posted publicly on Usenet it is not hard to
produce an independent implementation.

So I do not see any big problems there... Or am I missing something?

Greetings,

Andres

/* zlib.h -- interface of the 'zlib' general purpose compression library
  version 1.2.2, October 3rd, 2004

  Copyright (C) 1995-2004 Jean-loup Gailly and Mark Adler

  This software is provided 'as-is', without any express or implied
  warranty.  In no event will the authors be held liable for any damages
  arising from the use of this software.

  Permission is granted to anyone to use this software for any purpose,
  including commercial applications, and to alter it and redistribute it
  freely, subject to the following restrictions:

  1. The origin of this software must not be misrepresented; you must not
 claim that you wrote the original software. If you use this software
 in a product, an acknowledgment in the product documentation would be
 appreciated but is not required.
  2. Altered source versions must be plainly marked as such, and must not be
 misrepresented as being the original software.
  3. This notice may not be removed or altered from any source distribution.

  Jean-loup Gailly jl...@gzip.org
  Mark Adler mad...@alumni.caltech.edu

*/

