Re: std.hash: More questions

Dmitry Olshansky Sun, 08 Jul 2012 06:15:25 -0700

On 08-Jul-12 17:09, Johannes Pfau wrote:

On Fri, 06 Jul 2012 01:24:04 +0400,
Dmitry Olshansky wrote:


The only thing I can think of that would require a start function is
using unconventional initial vectors.


Those could be done as template parameters though? (If the hash is
written as a templated struct).

Well, probably, but it will lead to code duplication for no real benefit. Maybe it'll be faster with constant vectors, but I'm not so sure.
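
A toy sketch of the two options being weighed here, for illustration only. `ToyHash`, `ToyHashRuntime`, and the mixing step are hypothetical names made up for this example; they are not part of the proposed std.hash API and the mixing step is not a real hash. The templated variant bakes the initial vector into the type, so every distinct vector yields a separate instantiation, while the runtime variant takes the vector in `start()`.

```d
// Initial vector as a template parameter: each value creates a new type.
struct ToyHash(uint initialValue = 0xFFFF_FFFF)
{
    private uint state = initialValue;

    void start() { state = initialValue; }

    void put(scope const(ubyte)[] data)
    {
        foreach (b; data)
            state = (state ^ b) * 16_777_619u; // toy mixing step, not a real hash
    }

    uint finish()
    {
        immutable result = state;
        start(); // reset for reuse
        return result;
    }
}

// Initial vector passed at run time through start(): a single type.
struct ToyHashRuntime
{
    private uint state = 0xFFFF_FFFF;

    void start(uint initialValue = 0xFFFF_FFFF) { state = initialValue; }

    void put(scope const(ubyte)[] data)
    {
        foreach (b; data)
            state = (state ^ b) * 16_777_619u; // same toy mixing step
    }

    uint finish()
    {
        immutable result = state;
        start();
        return result;
    }
}

// Two different vectors give two unrelated types (the duplication in question).
static assert(!is(ToyHash!(1) == ToyHash!(2)));

void main()
{
    ToyHash!(0x1234) a;
    a.put([1, 2, 3]);

    ToyHashRuntime b;
    b.start(0x1234);
    b.put([1, 2, 3]);

    // Same vector, same input: both variants agree.
    assert(a.finish() == b.finish());
}
```

Keeping `start()` also leaves room for wrappers around C libraries that expose an explicit init call.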


But e.g. OpenSSL has *_init functions as well, so we probably should
keep the start function even if it's just to allow wrappers for
OpenSSL?


CRC32 sums are usually presented as a uint, not a ubyte[4]. To fit
the rest of the API ubyte[4] is used. Now there's a small annoying
detail: The CRC32 should be printed in LSB-first order.

You probably meant MSB first.


The rosettacode.org site which I used to verify the CRC32 results said
LSB-first but it seems it only describes the data layout of the uint
value (Little Endian). The printf/writef result is indeed MSB-first.

When printing a uint like this, that works well:
writefln("%#x", 4157704578); //0xf7d18982
but this doesn't:
toHexString(*cast(ubyte[4]*)&4157704578); //8289D1F7


There is no problem; it's just the order of printing that is at fault. So I
suggest you *stop* doing a bswap.

It's just that printing something as an array of ubytes does it from
least significant byte to most significant. You could try to add
MSB/LSB first options to toHexString.
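
A minimal sketch of the two print orders, assuming the CRC is held as a uint. The helper names `hexLsbFirst` and `hexMsbFirst` are hypothetical and only stand in for whatever order option toHexString might get; the example uses std.bitmanip to fix the byte order explicitly, so it does not depend on the endianness of the machine.

```d
import std.bitmanip : nativeToBigEndian, nativeToLittleEndian;
import std.format : format;

// Hex string with the least significant byte first
// (what a byte-array dump shows on a little-endian machine).
string hexLsbFirst(uint value)
{
    ubyte[4] bytes = nativeToLittleEndian(value);
    return format("%(%02X%)", bytes[]);
}

// Hex string with the most significant byte first
// (matches what "%x" prints for the uint itself).
string hexMsbFirst(uint value)
{
    ubyte[4] bytes = nativeToBigEndian(value);
    return format("%(%02X%)", bytes[]);
}

void main()
{
    import std.stdio : writeln;

    uint crc = 0xf7d18982; // 4157704578, the value used above

    writeln(format("%#x", crc)); // 0xf7d18982
    writeln(hexLsbFirst(crc));   // 8289D1F7
    writeln(hexMsbFirst(crc));   // F7D18982
}
```

The value itself never changes; only the order in which its bytes are walked differs, which is why no bswap of the stored result is needed.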


Yes, but that's not very intuitive. Most people would expect the same
result (by default) that other languages provide:
http://rosettacode.org/wiki/CRC-32

I'll add the order option to toHexString but I think I'll also
add an alias crcToHexString/crcHexString or something like that.


I can't change toHexString as it's used for all hashes and it's
correct for SHA1, MD5, ...
So I currently use bswap in the CRC32 finish() implementation to fix
this issue.

no-no-no see the above ;)

Now the question is should I provide an additional finishUint
function which avoids the bswap?


Implementation issue:

The current implementation of SHA1 and MD5 uses memcpy which doesn't
work in CTFE IIRC and which also prevents the code from being pure.
I could replace those memcpy calls with array copying but I'm not
sure if memcpy was used for performance, so I'd like to keep it as
long as we have no performance tests.

Replace memcpy with array ops:
ptr1[x..y] = ptr2[x2..y2];
Note that it's better to have them be pointers, as it avoids bounds
checks & D runtime magic.

If need be I can provide benchmarks but I'm certain from the days of
optimizing std.regex that it's faster or on par with memcpy.
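
A minimal sketch of the suggested replacement, assuming the caller guarantees that both ranges are valid and do not overlap. `copyBytes` is a hypothetical helper name, not something from the SHA1 or MD5 code. Slicing a raw pointer has no length to check the bounds against, and the slice assignment is allowed inside a pure function, unlike a call to C's memcpy.

```d
// Copy n bytes with a slice assignment on raw pointers instead of memcpy.
// The caller is responsible for valid, non-overlapping ranges.
void copyBytes(ubyte* dst, const(ubyte)* src, size_t n) pure
{
    dst[0 .. n] = src[0 .. n];
}

void main()
{
    ubyte[8] buffer; // zero-initialized
    immutable ubyte[4] chunk = [1, 2, 3, 4];

    // Copy the chunk into the middle of the buffer.
    copyBytes(buffer.ptr + 2, chunk.ptr, chunk.length);

    ubyte[8] expected = [0, 0, 1, 2, 3, 4, 0, 0];
    assert(buffer == expected);
}
```

Whether this matches memcpy in speed for the hash code would still need the benchmarks mentioned above.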


OK great, pure is working. CTFE not yet, but that can be added later.

Do we want to add 'pure' as part of the functions in the Digest
interface? This would require all implementations to be pure, I don't
know if that's a good idea right now.

Some implementations may choose to call into the kernel for the respective crypto primitives. I'd say there's no need to slap pure on top of it in a hurry.


--
Dmitry Olshansky
