Re: [HACKERS] [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up
On Saturday 30 October 2010 11:05:17, Andres Freund wrote:
> Hi,
>
> This thread died after me not implementing a new version and some potential
> license problems.
>
> I still think it's worthwhile (and I used it in production for some time), so I
> would like to implement a version fit for the next commitfest.
>
> The code I started out from is under the zlib license, which is to my
> knowledge compatible with PG's licence. What's the position of HACKERS there?
> There already is some separately licenced code around, and we're already
> linking to zlib-licenced code...
>
> For simplicity I asked Mark Adler (the original copyright owner) if he would
> be willing to relicence; he is not.
>
> For anybody not hoarding all old mail like me, here is a link to the archives
> about my old patch:
>
> http://archives.postgresql.org/message-id/201005202227.49990.and...@anarazel.de
>
> Andres

I forgot to report this a few months ago:

I had a very intensive COPY load, and this patch helped. The context was a server that was CPU bound on loading data (8 COPY on the same table in parallel, not indexed). This patch gave me a 10% boost in load time. I don't have the figures right now, but I could try to do this test again if this can help. At that time, I just tried it out of curiosity, but the load time was sufficient without it, so I didn't spend more time on it.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up
On Sat, Oct 30, 2010 at 5:05 AM, Andres Freund and...@anarazel.de wrote:
> This thread died after me not implementing a new version and some potential
> license problems.
>
> I still think it's worthwhile (and I used it in production for some time), so I
> would like to implement a version fit for the next commitfest.
>
> The code I started out from is under the zlib license, which is to my
> knowledge compatible with PG's licence. What's the position of HACKERS there?
> There already is some separately licenced code around, and we're already
> linking to zlib-licenced code...
>
> For simplicity I asked Mark Adler (the original copyright owner) if he would
> be willing to relicence; he is not.
>
> For anybody not hoarding all old mail like me, here is a link to the archives
> about my old patch:
>
> http://archives.postgresql.org/message-id/201005202227.49990.and...@anarazel.de

IANAL, but the license doesn't appear incompatible to me.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up
Hi,

This thread died after me not implementing a new version and some potential license problems.

I still think it's worthwhile (and I used it in production for some time), so I would like to implement a version fit for the next commitfest.

The code I started out from is under the zlib license, which is to my knowledge compatible with PG's licence. What's the position of HACKERS there? There already is some separately licenced code around, and we're already linking to zlib-licenced code...

For simplicity I asked Mark Adler (the original copyright owner) if he would be willing to relicence; he is not.

For anybody not hoarding all old mail like me, here is a link to the archives about my old patch:

http://archives.postgresql.org/message-id/201005202227.49990.and...@anarazel.de

Andres
Re: [HACKERS] [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up
On Sunday 30 May 2010 04:56:09 Greg Stark wrote:
> This sounds familiar. If you search back in the archives around 2004
> or so I think you'll find a similar discussion when we replaced the
> crc32 implementation with what we have now. We put a fair amount of
> effort into searching for faster implementations so if you've found
> one 3x faster I'm pretty startled.

None of those thought of computing more than one byte at the same time.

Most if not all current architectures are more or less superscalar (explicitly by the compiler or implicitly by somewhat intelligent silicon). The current algorithm has an ordering restriction that prevents any benefit from that: basically it needs the CRC of the last byte for the next one. The zlib/my version computes 4 bytes independently and then squashes them together, which results in much better overall usage.

> Are you sure it's faster on all architectures and not a win sometimes
> and a loss other times? And are you sure it's faster in our use case
> where we're crcing small sequences of data often and not crcing a
> large block?

I tried on several and it was never a loss at 16+ bytes, never worse at 8, and most of the time equal if not better at 4. Sizes of 1-4 are somewhat slower as they use the same algorithm as the old version but do have an additional jump. That's a difference of about 3-4 cycles.

I will try to implement an updated patch sometime these days.

Andres
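The ordering restriction and the four-table workaround described above can be sketched roughly as follows. This is a minimal illustration, not the patch itself: it assumes the standard reflected CRC-32 (polynomial 0xEDB88320, inverted start and end values) rather than PostgreSQL's non-inverted variant, and every name in it (`tab`, `crc32_init_tables`, `crc32_bytewise`, `crc32_slice4`, `crc32_checked`) is made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; tab[0] is the classic one-byte table. */
static uint32_t tab[4][256];

static void
crc32_init_tables(void)
{
	for (uint32_t i = 0; i < 256; i++)
	{
		uint32_t	c = i;

		for (int k = 0; k < 8; k++)
			c = (c & 1) ? (c >> 1) ^ 0xEDB88320u : c >> 1;
		tab[0][i] = c;
	}
	/* tab[t][i] is the CRC of byte i followed by t zero bytes. */
	for (uint32_t i = 0; i < 256; i++)
		for (int t = 1; t < 4; t++)
			tab[t][i] = (tab[t - 1][i] >> 8) ^ tab[0][tab[t - 1][i] & 0xFF];
}

/* One byte per step: each lookup depends on the previous CRC value. */
static uint32_t
crc32_bytewise(const void *data, size_t len)
{
	const unsigned char *p = data;
	uint32_t	crc = 0xFFFFFFFFu;

	while (len--)
		crc = tab[0][(crc ^ *p++) & 0xFF] ^ (crc >> 8);
	return crc ^ 0xFFFFFFFFu;
}

/* Four bytes per step: the four lookups do not depend on each other. */
static uint32_t
crc32_slice4(const void *data, size_t len)
{
	const unsigned char *p = data;
	uint32_t	crc = 0xFFFFFFFFu;

	while (len >= 4)
	{
		/* Assemble the word byte by byte so host endianness and
		 * alignment do not matter (a real version would load a word). */
		uint32_t	w = crc ^ ((uint32_t) p[0] | (uint32_t) p[1] << 8 |
							   (uint32_t) p[2] << 16 | (uint32_t) p[3] << 24);

		crc = tab[3][w & 0xFF] ^ tab[2][(w >> 8) & 0xFF] ^
			tab[1][(w >> 16) & 0xFF] ^ tab[0][w >> 24];
		p += 4;
		len -= 4;
	}
	while (len--)				/* tail of 0-3 bytes uses the old loop */
		crc = tab[0][(crc ^ *p++) & 0xFF] ^ (crc >> 8);
	return crc ^ 0xFFFFFFFFu;
}

/* Computes both ways and insists that they agree. */
static uint32_t
crc32_checked(const void *data, size_t len)
{
	uint32_t	a = crc32_bytewise(data, len);
	uint32_t	b = crc32_slice4(data, len);

	assert(a == b);
	return b;
}
```

After calling `crc32_init_tables()` once, `crc32_checked("123456789", 9)` returns 0xCBF43926, the usual check value for this CRC-32 variant. Inputs shorter than four bytes only ever reach the tail loop, which matches the observation above that very small sizes gain nothing and pay for one extra branch.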
Re: [HACKERS] [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up
Added to TODO:

	Consider a faster CRC32 algorithm
	* http://archives.postgresql.org/pgsql-hackers/2010-05/msg01112.php

---

Andres Freund wrote:
> Hi,
>
> I started to analyze XLogInsert because it was the major bottleneck when
> creating some materialized view/cached tables/whatever.
> Analyzing it I could see that the content of the COMP_CRC32 macro was taking
> most of the time, which isn't immediately obvious when you profile because it
> obviously doesn't show up as a separate function.
> I first put it into functions to make it easier to profile. I couldn't measure
> any difference for COPY, CTAS and a simple pgbench run on 3 kinds of hardware
> (Core2, older Xeon, older Sparc systems).
>
> I looked a bit around for faster implementations of CRC32 and found one in
> zlib. After adapting it (pg uses a slightly different computation
> (non-inverted)) I found that it increases the speed of the CRC32 calculation
> itself threefold.
> It does that by not only using one lookup table but four (one for each byte of
> a word). Those four calculations are independent and thus are considerably
> faster on somewhat recent hardware.
> Also it does memory lookups in 4 byte steps instead of 1 byte as the pg version
> (that's only about 8% benefit in itself).
>
> I wrote a preliminary patch which includes both the original implementation
> and the new one, switchable via a #define.
>
> I tested performance differences in a small number of scenarios:
> - CTAS/INSERT ... SELECT (8-30%)
> - COPY (3-20%)
> - pgbench (no real difference unless directly after a checkpoint)
>
> Setup:
>
> CREATE TABLE blub (ai int, bi int, aibi int);
> CREATE TABLE speedtest (ai int, bi int, aibi int);
>
> INSERT ... SELECT:
>
> Statement:
> INSERT INTO blub SELECT a.i, b.i, a.i * b.i FROM generate_series(1, 1) a(i), generate_series(1, 1000) b(i);
>
> legacy crc:
>     11526.588
>     11406.518
>     11412.182
>     11430.245
>
> zlib:
>     9977.394
>     9945.408
>     9840.907
>     9842.875
>
> COPY:
>
> Statement:
> ('blub' enlarged here 4 times, as otherwise the variances were too large)
>
> COPY blub TO '/tmp/b' BINARY;
> ...
> CHECKPOINT;TRUNCATE speedtest; COPY speedtest FROM '/tmp/b' BINARY;
>
> legacy:
>     44835.840
>     44832.876
>
> zlib:
>     39530.549
>     39365.109
>     39295.167
>
> The performance differences are bigger if the table rows are significantly
> bigger.
>
> Do you think something like that is sensible? If yes, I will make it into a
> proper patch and such.
>
> Thanks,
>
> Andres
>
> INSERT ... SELECT profile before patch:
>
>     20.22%  postgres  postgres  [.] comp_crc32
>      5.77%  postgres  postgres  [.] XLogInsert
>      5.55%  postgres  postgres  [.] LWLockAcquire
>      5.21%  postgres  [kernel.  [k] copy_user_generic_string
>      4.64%  postgres  postgres  [.] LWLockRelease
>      4.39%  postgres  postgres  [.] ReadBuffer_common
>      2.75%  postgres  postgres  [.] heap_insert
>      2.22%  postgres  libc-2.1  [.] memcpy
>      2.09%  postgres  postgres  [.] UnlockReleaseBuffer
>      1.85%  postgres  postgres  [.] hash_any
>      1.77%  postgres  [kernel.  [k] clear_page_c
>      1.69%  postgres  postgres  [.] hash_search_with_hash_value
>      1.61%  postgres  postgres  [.] heapgettup_pagemode
>      1.50%  postgres  postgres  [.] PageAddItem
>      1.42%  postgres  postgres  [.] MarkBufferDirty
>      1.28%  postgres  postgres  [.] RelationGetBufferForTuple
>      1.15%  postgres  postgres  [.] ExecModifyTable
>      1.06%  postgres  postgres  [.] RelationPutHeapTuple
>
> After:
>
>      9.97%  postgres  postgres  [.] comp_crc32
>      5.95%  postgres  [kernel.  [k] copy_user_generic_string
>      5.94%  postgres  postgres  [.] LWLockAcquire
>      5.64%  postgres  postgres  [.] XLogInsert
>      5.11%  postgres  postgres  [.] LWLockRelease
>      4.63%  postgres  postgres  [.] ReadBuffer_common
>      3.45%  postgres  postgres  [.] heap_insert
>      2.54%  postgres  libc-2.1  [.] memcpy
>      2.03%  postgres  postgres  [.] UnlockReleaseBuffer
>      1.94%  postgres  postgres  [.] hash_search_with_hash_value
>      1.84%  postgres  postgres  [.] hash_any
>      1.73%  postgres  [kernel.  [k] clear_page_c
>      1.68%  postgres  postgres  [.] PageAddItem
>      1.62%  postgres  postgres  [.] heapgettup_pagemode
>      1.52%  postgres  postgres  [.] RelationGetBufferForTuple
>      1.47%  postgres  postgres  [.] MarkBufferDirty
>      1.30%  postgres  postgres  [.] ExecModifyTable
>      1.23%  postgres  postgres  [.] RelationPutHeapTuple

[ Attachment, skipping... ]
[HACKERS] [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up
Hi,

I started to analyze XLogInsert because it was the major bottleneck when creating some materialized view/cached tables/whatever.
Analyzing it I could see that the content of the COMP_CRC32 macro was taking most of the time, which isn't immediately obvious when you profile because it obviously doesn't show up as a separate function.
I first put it into functions to make it easier to profile. I couldn't measure any difference for COPY, CTAS and a simple pgbench run on 3 kinds of hardware (Core2, older Xeon, older Sparc systems).

I looked a bit around for faster implementations of CRC32 and found one in zlib. After adapting it (pg uses a slightly different computation (non-inverted)) I found that it increases the speed of the CRC32 calculation itself threefold.
It does that by not only using one lookup table but four (one for each byte of a word). Those four calculations are independent and thus are considerably faster on somewhat recent hardware.
Also it does memory lookups in 4 byte steps instead of 1 byte as the pg version (that's only about 8% benefit in itself).

I wrote a preliminary patch which includes both the original implementation and the new one, switchable via a #define.

I tested performance differences in a small number of scenarios:
- CTAS/INSERT ... SELECT (8-30%)
- COPY (3-20%)
- pgbench (no real difference unless directly after a checkpoint)

Setup:

CREATE TABLE blub (ai int, bi int, aibi int);
CREATE TABLE speedtest (ai int, bi int, aibi int);

INSERT ... SELECT:

Statement:
INSERT INTO blub SELECT a.i, b.i, a.i * b.i FROM generate_series(1, 1) a(i), generate_series(1, 1000) b(i);

legacy crc:
    11526.588
    11406.518
    11412.182
    11430.245

zlib:
    9977.394
    9945.408
    9840.907
    9842.875

COPY:

Statement:
('blub' enlarged here 4 times, as otherwise the variances were too large)

COPY blub TO '/tmp/b' BINARY;
...
CHECKPOINT;TRUNCATE speedtest; COPY speedtest FROM '/tmp/b' BINARY;

legacy:
    44835.840
    44832.876

zlib:
    39530.549
    39365.109
    39295.167

The performance differences are bigger if the table rows are significantly bigger.

Do you think something like that is sensible? If yes, I will make it into a proper patch and such.

Thanks,

Andres

INSERT ... SELECT profile before patch:

    20.22%  postgres  postgres  [.] comp_crc32
     5.77%  postgres  postgres  [.] XLogInsert
     5.55%  postgres  postgres  [.] LWLockAcquire
     5.21%  postgres  [kernel.  [k] copy_user_generic_string
     4.64%  postgres  postgres  [.] LWLockRelease
     4.39%  postgres  postgres  [.] ReadBuffer_common
     2.75%  postgres  postgres  [.] heap_insert
     2.22%  postgres  libc-2.1  [.] memcpy
     2.09%  postgres  postgres  [.] UnlockReleaseBuffer
     1.85%  postgres  postgres  [.] hash_any
     1.77%  postgres  [kernel.  [k] clear_page_c
     1.69%  postgres  postgres  [.] hash_search_with_hash_value
     1.61%  postgres  postgres  [.] heapgettup_pagemode
     1.50%  postgres  postgres  [.] PageAddItem
     1.42%  postgres  postgres  [.] MarkBufferDirty
     1.28%  postgres  postgres  [.] RelationGetBufferForTuple
     1.15%  postgres  postgres  [.] ExecModifyTable
     1.06%  postgres  postgres  [.] RelationPutHeapTuple

After:

     9.97%  postgres  postgres  [.] comp_crc32
     5.95%  postgres  [kernel.  [k] copy_user_generic_string
     5.94%  postgres  postgres  [.] LWLockAcquire
     5.64%  postgres  postgres  [.] XLogInsert
     5.11%  postgres  postgres  [.] LWLockRelease
     4.63%  postgres  postgres  [.] ReadBuffer_common
     3.45%  postgres  postgres  [.] heap_insert
     2.54%  postgres  libc-2.1  [.] memcpy
     2.03%  postgres  postgres  [.] UnlockReleaseBuffer
     1.94%  postgres  postgres  [.] hash_search_with_hash_value
     1.84%  postgres  postgres  [.] hash_any
     1.73%  postgres  [kernel.  [k] clear_page_c
     1.68%  postgres  postgres  [.] PageAddItem
     1.62%  postgres  postgres  [.] heapgettup_pagemode
     1.52%  postgres  postgres  [.] RelationGetBufferForTuple
     1.47%  postgres  postgres  [.] MarkBufferDirty
     1.30%  postgres  postgres  [.] ExecModifyTable
     1.23%  postgres  postgres  [.] RelationPutHeapTuple

From f8ec18769e581cf039535730d2088466c461d8f6 Mon Sep 17 00:00:00 2001
From: Andres Freund and...@anarazel.de
Date: Thu, 29 Apr 2010 22:19:08 +0200
Subject: [PATCH] Preliminary patch using an improved out of line crc32 computation for the xlog.
---
 src/backend/access/transam/xlog.c |   66 +-
 src/backend/utils/hash/pg_crc.c   |  142 -
 src/include/utils/pg_crc.h        |    9 ++-
 3 files changed, 180 insertions(+), 37 deletions(-)

diff --git a/src/backend/access/transam/xlog.c
Re: [HACKERS] [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up
Andres,

* Andres Freund (and...@anarazel.de) wrote:
> Statement:
> INSERT INTO blub SELECT a.i, b.i, a.i * b.i FROM generate_series(1, 1) a(i), generate_series(1, 1000) b(i);
>
> legacy crc:
>
> zlib:

Is this legacy crc using the function-based calls, or the macro? Do you have statistics for the zlib approach vs unmodified PG?

> Do you think something like that is sensible? If yes, I will make it into a
> proper patch and such.

I think that in general we're typically looking for ways to improve performance, yes.. :)

	Thanks,

		Stephen
Re: [HACKERS] [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up
Hi Stephen,

On Thursday 20 May 2010 22:39:26 Stephen Frost wrote:
> * Andres Freund (and...@anarazel.de) wrote:
> > Statement:
> > INSERT INTO blub SELECT a.i, b.i, a.i * b.i FROM generate_series(1, 1) a(i), generate_series(1, 1000) b(i);
> >
> > legacy crc:
>
> Is this legacy crc using the function-based calls, or the macro? Do you
> have statistics for the zlib approach vs unmodified PG?

'legacy' is out of line as well. I couldn't find a real performance difference above noise between out of line (function) and inline (macro). If anything, out of line was a bit faster (instruction cache usage could cause that).

So vanilla vs. zlib should be the same as legacy vs. zlib.

Greetings,

Andres
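The inline versus out-of-line distinction discussed here can be illustrated with a small sketch. It is not the code from the patch: `DEMO_COMP_CRC32`, `demo_comp_crc32`, `demo_table`, `demo_init_table` and `demo_crc_of` are hypothetical names, the loop is the standard reflected CRC-32 rather than PostgreSQL's own computation, and the `noinline` attribute is a GCC/Clang extension assumed only to keep the compiler from folding the function back into its callers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint32_t demo_table[256];

static void
demo_init_table(void)
{
	for (uint32_t i = 0; i < 256; i++)
	{
		uint32_t	c = i;

		for (int k = 0; k < 8; k++)
			c = (c & 1) ? (c >> 1) ^ 0xEDB88320u : c >> 1;
		demo_table[i] = c;
	}
}

/*
 * Inline flavour: the loop is expanded at every call site, so a profiler
 * charges its cost to whichever function used the macro.
 */
#define DEMO_COMP_CRC32(crc, data, len) \
	do { \
		const unsigned char *p_ = (const unsigned char *) (data); \
		size_t		n_ = (len); \
		while (n_-- > 0) \
			(crc) = demo_table[((crc) ^ *p_++) & 0xFF] ^ ((crc) >> 8); \
	} while (0)

#if defined(__GNUC__)
#define DEMO_NOINLINE __attribute__((noinline))
#else
#define DEMO_NOINLINE
#endif

/*
 * Out-of-line flavour: same loop, but behind a real function, so it can
 * show up under its own symbol in a profile.
 */
static DEMO_NOINLINE uint32_t
demo_comp_crc32(uint32_t crc, const void *data, size_t len)
{
	DEMO_COMP_CRC32(crc, data, len);
	return crc;
}

/* Runs both flavours on the same input and checks that they agree. */
static uint32_t
demo_crc_of(const void *data, size_t len)
{
	uint32_t	inl = 0xFFFFFFFFu;
	uint32_t	out;

	DEMO_COMP_CRC32(inl, data, len);
	out = demo_comp_crc32(0xFFFFFFFFu, data, len);
	assert(inl == out);
	return out ^ 0xFFFFFFFFu;
}
```

Both flavours compute the same value; the only intended difference is where a profiler attributes the time. Whether one is measurably faster than the other depends on the compiler and the call sites, which is consistent with the "no difference above noise" observation above.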
Re: [HACKERS] [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up
On Thu, May 20, 2010 at 4:27 PM, Andres Freund and...@anarazel.de wrote:
> I looked a bit around for faster implementations of CRC32 and found one in
> zlib. After adapting it (pg uses slightly different computation
> (non-inverted)) I found that it increases the speed of the CRC32 calculation
> itself 3 fold.

But zlib is not under the PostgreSQL license.

...Robert
Re: [HACKERS] [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up
On Friday 21 May 2010 05:40:03 Robert Haas wrote:
> On Thu, May 20, 2010 at 4:27 PM, Andres Freund and...@anarazel.de wrote:
> > I looked a bit around for faster implementations of CRC32 and found one
> > in zlib. After adapting it (pg uses slightly different computation
> > (non-inverted)) I found that it increases the speed of the CRC32
> > calculation itself 3 fold.
>
> But zlib is not under the PostgreSQL license.

Yes. But:
1. The zlib license shouldn't be a problem in itself; pg_dump also already links to zlib.
2. I planned to ask Mark Adler whether he would support relicensing those bits. I have read some other discussions where he was supportive of doing such a thing.
3. Given that the idea was posted publicly on Usenet, it is not hard to produce an independent implementation.

So I do not see any big problems there... Or am I missing something?

Greetings,

Andres

/* zlib.h -- interface of the 'zlib' general purpose compression library
  version 1.2.2, October 3rd, 2004

  Copyright (C) 1995-2004 Jean-loup Gailly and Mark Adler

  This software is provided 'as-is', without any express or implied
  warranty.  In no event will the authors be held liable for any damages
  arising from the use of this software.

  Permission is granted to anyone to use this software for any purpose,
  including commercial applications, and to alter it and redistribute it
  freely, subject to the following restrictions:

  1. The origin of this software must not be misrepresented; you must not
     claim that you wrote the original software. If you use this software
     in a product, an acknowledgment in the product documentation would be
     appreciated but is not required.
  2. Altered source versions must be plainly marked as such, and must not be
     misrepresented as being the original software.
  3. This notice may not be removed or altered from any source distribution.

  Jean-loup Gailly jl...@gzip.org
  Mark Adler mad...@alumni.caltech.edu
*/