On Thu, Nov 12, 2020 at 07:45:14PM +0200, Maamoun TK wrote:
> ---------- Forwarded message ---------
> From: Maamoun TK <maamoun...@googlemail.com>
> Date: Thu, Nov 12, 2020 at 7:42 PM
> Subject: Re: [PowerPC] GCM optimization
> To: Niels Möller <ni...@lysator.liu.se>
> 
> 
> On Thu, Nov 12, 2020 at 6:40 PM Niels Möller <ni...@lysator.liu.se> wrote:
> 
> > I gave it a test run on gcc112 in the gcc compile farm, and speedup of
> > gcm update seems to be 26 times(!) compared to the C version.
> >
> 
> That's reasonable; I got a similar speedup on POWER instances that are
> more stable than the gcc compile farm.
> 
> 
> > Where would that documentation be published? In the Nettle manual, as
> > some IBM white paper, or as a more-or-less academic paper, e.g., on
> > arxiv? I will not be able to spend much time on writing, but I'd be
> > happy to review.
> >
> 
> I'll start writing the paper once I get more details from IBM. Similar to
> the Intel documents, the document will be academic and practical at the same

Hi Mamone,

What do you need from the IBM side?  I may be able to help.  We'd definitely
like to support you and Niels in publishing your results.

> time. I'll go through the finite field equations to show how we get there,
> and I'll add a practical example that shows why this method is preferable
> and what speedup to expect from it. My intention is that other crypto
> libraries can take advantage of this document, or that it can be a starting
> point for further improvements to the algorithm, so I'm checking whether
> IBM would publish or approve such a document, the same as Intel.
> 
> 
> > I have a sketch of ARM Neon code doing the equivalent of two vpmsumd,
> > with reasonable parallelism. Quite a lot of instructions needed.
> >
> 
> If you don't have much time, you can send it here and I'll continue from
> that point. I'm planning to compare the new method with the usual method,
> with and without the Karatsuba algorithm.
> 
> > > +C Alignment of gcm_key table elements, which is declared in gcm.h
> > > +define(`TableElemAlign', `0x100')
> >
> > I still find this large constant puzzling. If I try
> >
> >   struct gcm_key key;
> >   printf("sizeof (key): %zu, sizeof(key.h[0]): %zu\n",
> >          sizeof(key), sizeof(key.h[0]));
> >
> > (I added it to the start of test_main in gcm-test.c) and run on the
> > gcc112 machine, I get
> >
> >   sizeof (key): 4096, sizeof(key.h[0]): 16
> >
> > Which is what I'd expect, with elements of size 16 bytes, not 256 bytes.
> >
> > I haven't yet had the time to read the code carefully.
> >
> 
> You see, the alignment of each element is 0x100 (256). The table has 16
> elements, and you got a table size of 4096, which is reasonable because
> 16*256=4096.
> 
> regards,
> Mamone
> _______________________________________________
> nettle-bugs mailing list
> nettle-bugs@lists.lysator.liu.se
> http://lists.lysator.liu.se/mailman/listinfo/nettle-bugs

-- 
George Wilson
IBM Linux Technology Center
Security Development
