Re: BLAKE2 and Tree Implementation?

Jean-Pierre Münch Thu, 26 Nov 2015 13:56:21 -0800

Just to clarify:

We're only considering the depth 1 tree here, like


Root<-------------
|      |        |      |
Leaf Leaf Leaf Leaf

(epic ASCII art, I know ;)

and not something fancy as

Root<-----
 |            |
Node  Node<---
|     |        |      |
Leaf Leaf Leaf Leaf

?

Note well: I'm not promoting that we should implement the latter one,
but want to be sure.

Am 26.11.2015 um 19:20 schrieb Jeffrey Walton:
> Hi Everyone,
>
> BLAKE2 (https://blake2.net/blake2.pdf) has a tree implementation
> option that can execute in parallel. The parallel tree-code is missing
> from the RFC (RFC 7693), but I'm guessing its probably useful in some
> situations.
Example for everyone who can't think of one:
Say you're on your new PC with one of those fancy M.2 SSDs achieving
more than 1GB/s transfer rate. Now a single core implementation won't
cut in terms of speed to keep up with the speed of the SSD so you need
to use tree hashing (-> partitionize your message into chunks, hash each
of them and hash the hashes) to use all your CPU cores.
>
> I'm working on a BLAKE2 implementation, but it _will not_ include the
> tree implementation for the initial cut-in. However, I want to ensure
> we can add the tree code in the future without changing things or
> breaking clients. I'd like some feedback on the stub implementation
> for the tree-based code.
I'll also copy the path we choose here for my Skein (-Family)
implementation(s).
>
> I'm not examining the option of: we perform all the parallel
> processing in the "core library" using Windows threads, pthreads or
> OpenMP tasks. I don't feel its within purview of the library. However,
> we probably want to provide an OpenMP test case for the tree-based
> code to ensure it works as expected.
Didn't we once introduce a compile-time switch to allow for internal
OpenMP usage (it was in context of the RW blinding) or did we drop it
again? But anyways I *think* we may want to have some sort of helper
class / function to run a "fire-and-forget" BLAKE2 tree-hash using
OpenMP as the reference BLAKE2 code does.
>
> *****
>
> Crypto++ follows the Init/Update/Final pattern:
>
>     // Constructor follows RAII and performs Init()
I think Init() should be done in Restart() and the constructor should
just call Restart(). IIRC this is what is used in the rest of the
library (along with TruncatedFinal() also calling Restart() usually IIRC)
>
>     // Existing....
>     void Update(const byte* inString size_t length);
>
>     // Existing
>     void Final(byte* digest);
>     void TruncatedFinal(byte* digest, size_t length);
>
> *****
>
> Now we need to decide how to carve-in the tree-based code. I'm think
> four things need to be performed, similar to Hadoop processing:
>
>     (1) Data needs to be partitioned
>     (2) Data needs to be packaged
>     (3) Data needs to processed
>     (4) Results needs to combined
>
> (1) is the client's responsibility. (2) is the client's
> responsibility, but we need to provide the data structure. (3) is our
> responsibility. (4) is the client's responsibility.
>
> With that said, I'm thinking there should be a structure to represent
> the packaged data. I'm thinking it should be an inner class, similar
> to an iterator:
>
> (a)   BLAKE2::TreeNode
Is this supposed to be basically a struct holding a bunch of data
values? This is C-Style thinking and doesn't use the full beauty of C++ IMO.
My *personal* preference would be to make it a proper class that
implements everything from HashTransformation except TruncatedFinal()
(or maybe it should and give the bytes being passed to the root node?).
>
> I think we need an additional Update method, and that method should be
> const:
>
> (b)    bool BLAKE2::Update(TreeNode& node, bool throwOnError) const
This will update the current node state into the root state?
>
> (c) TreeNode will likely need to be an IN/OUT parameter. IN, it will
> hold a reference to the existing state of the BLAKE2 hash. The BLAKE2
> object state is constant. The TreeNode state is mutable.
I don't see any reason here for not making it "const TreeNode&"?
>
> (d) Update will perform the processing on the node, and only modify
> the members of TreeNode. It will not modify the state of the BLAKE2
> object.
So you'd do BLAKE2::TreeNode::Update() (w/o ref to parent node) to feed
in the data? I'm fine with that.
>
> (e) OUT, the TreeNode will provide the result of processing, meaning
> the transformed data.
Are we still talking about "BLAKE2::Update(TreeNode&,bool)const" ? If so
what transformed data would you like to output into the TreeNode&, after
all it's pretty useless after it has been incorporated into the parent
node's state.
>
> (f) Update will return true/false to indicate success/failure. If
> throwOnError is true, then Update will throw on failure instead. I
> think this is important because async methods and parallel processing
> complicates catching a throw and matching an error to offending code.
>
> (g) After the function returns, the results need to be combined and
> the BLAKE2 object needs to be updated.
There is a standard way to combine the data (internally) which was used
in the reference code IIRC. As for the API I *thought* the
Update(TreeNode&,bool) calls would merge the TreeNode results into the
parent node?
> This is where we update the BLAKE2 object state, if necessary.
Well, we need to for Final() to work...
>
> *****
>
> Does anyone see any gaps or problems with (a) - (g)?
>
> Does anyone have an idea of what should happen after (f) to accomplish
> (g)? What should the (g) function look like? what is its signature?
>
> Sorry about the long write-up. I know its not easy to parse. Thanks in
> advance.
It's an important topic worth the read. I hope I understood it all
correctly though... :)

BR

JPM
>
> Jeff
>
> -- 
> -- 
> You received this message because you are subscribed to the "Crypto++
> Users" Google Group.
> To unsubscribe, send an email to
> [email protected].
> More information about Crypto++ and this group is available at
> http://www.cryptopp.com.
> ---
> You received this message because you are subscribed to the Google
> Groups "Crypto++ Users" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to [email protected]
> <mailto:[email protected]>.
> For more options, visit https://groups.google.com/d/optout.

-- 
-- 
You received this message because you are subscribed to the "Crypto++ Users" 
Google Group.
To unsubscribe, send an email to [email protected].
More information about Crypto++ and this group is available at 
http://www.cryptopp.com.
--- 
You received this message because you are subscribed to the Google Groups 
"Crypto++ Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
For more options, visit https://groups.google.com/d/optout.

Re: BLAKE2 and Tree Implementation?

Reply via email to