Re: CRC algorithm (was Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes)

Heikki Linnakangas Tue, 16 Sep 2014 03:50:57 -0700

On 09/16/2014 01:28 PM, Andres Freund wrote:

On 2014-09-16 15:43:06 +0530, Amit Kapila wrote:

On Sat, Sep 13, 2014 at 1:33 AM, Heikki Linnakangas <hlinnakan...@vmware.com>
wrote:

On 09/12/2014 10:54 PM, Abhijit Menon-Sen wrote:

At 2014-09-12 22:38:01 +0300, hlinnakan...@vmware.com wrote:

We probably should consider switching to a faster CRC algorithm again,
regardless of what we do with compression.


As it happens, I'm already working on resurrecting a patch that Andres
posted in 2010 to switch to zlib's faster CRC implementation.


As it happens, I also wrote an implementation of Slice-by-4 the other day
:-). Haven't gotten around to posting it, but here it is.


If we use this implementation for everything that uses the
COMP_CRC32() macro, won't it cause problems for databases created
with older versions?  I created a database with HEAD code and then
tried to start the server after applying this patch; it gives the error below:
FATAL:  incorrect checksum in control file


That's indicative of a bug. This really shouldn't cause such problems -
at least my version was compatible with the current definition, and IIRC
Heikki's should be the same in theory. If I read it right.

In general, the idea sounds quite promising.  To see how it performs
on small to medium size data, I used the attached test, which was
written by you (with some additional tests) during performance testing
of the WAL reduction patch in 9.4.


Yes, we should really do this.

The patched version gives better results in all cases
(in the range of 10~15%).  This is not a perfect test, but
it gives a fair idea that the patch is quite promising.  To test
the benefit to CRC calculation for full pages, we could add a
checkpoint during each test (maybe after the insert).  Let me know
what other kinds of tests you think are required to see the
gain/loss from this patch.


I actually think we don't really need this. It's pretty evident that
slice-by-4 is a clear improvement.

I think the main difference between this patch and what Andres
developed some time back is the code for a manually unrolled loop
doing 32 bytes at once, so once Andres or Abhijit posts an
updated version, we can do some performance tests to see
if there is any additional gain.


If Heikki's version works I see little need to use my/Abhijit's
patch. That version has part of it under the zlib license. If Heikki's
version is a 'clean room', then I'd say we go with it. It looks really
quite similar though... We can make minor changes like additional
unrolling without problems later on.

I used http://create.stephan-brumme.com/crc32/#slicing-by-8-overview as
reference - you can probably see the similarity. Any implementation is
going to look more or less the same, though; there aren't that many ways
to write the implementation.
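
To illustrate why implementations end up looking alike, here is a minimal
sketch of slice-by-4. It is not the patch from this thread. It assumes the
standard reflected CRC-32 polynomial (0xEDB88320) with the usual initial and
final inversion, and the names crc_table, crc_init_tables, crc32_bytewise and
crc32_slice4 are made up for the example. The point is that the sliced loop
must produce exactly the same value as the byte-at-a-time loop, so swapping
one for the other should not change any stored checksum.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical lookup tables: table 0 is the classic byte-wise table,
 * tables 1..3 extend it so four input bytes can be folded in per step. */
static uint32_t crc_table[4][256];

static void crc_init_tables(void)
{
    for (uint32_t i = 0; i < 256; i++) {
        uint32_t c = i;
        for (int k = 0; k < 8; k++)
            c = (c & 1) ? (c >> 1) ^ 0xEDB88320u : (c >> 1);
        crc_table[0][i] = c;
    }
    /* Each further table advances the previous one by one zero byte. */
    for (uint32_t i = 0; i < 256; i++)
        for (int t = 1; t < 4; t++)
            crc_table[t][i] = (crc_table[t - 1][i] >> 8) ^
                              crc_table[0][crc_table[t - 1][i] & 0xFF];
}

/* Reference: one table lookup per input byte. */
static uint32_t crc32_bytewise(uint32_t crc, const void *data, size_t len)
{
    const unsigned char *p = data;

    crc = ~crc;
    while (len--)
        crc = crc_table[0][(crc ^ *p++) & 0xFF] ^ (crc >> 8);
    return ~crc;
}

/* Slice-by-4: four table lookups per 32-bit chunk, then a byte-wise tail. */
static uint32_t crc32_slice4(uint32_t crc, const void *data, size_t len)
{
    const unsigned char *p = data;

    crc = ~crc;
    while (len >= 4) {
        /* Assemble the word byte by byte so the result does not depend
         * on host endianness or alignment. */
        crc ^= (uint32_t) p[0] | ((uint32_t) p[1] << 8) |
               ((uint32_t) p[2] << 16) | ((uint32_t) p[3] << 24);
        crc = crc_table[3][crc & 0xFF] ^
              crc_table[2][(crc >> 8) & 0xFF] ^
              crc_table[1][(crc >> 16) & 0xFF] ^
              crc_table[0][crc >> 24];
        p += 4;
        len -= 4;
    }
    while (len--)
        crc = crc_table[0][(crc ^ *p++) & 0xFF] ^ (crc >> 8);
    return ~crc;
}
```

Call crc_init_tables() once before use. A quick sanity check is to compare
crc32_slice4() against crc32_bytewise() over buffers of every length from 0
up to a few dozen bytes; any mismatch points to a bug in the tables or the
tail handling rather than to a format change.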


- Heikki



