On Thu, Nov 12, 2020 at 07:45:14PM +0200, Maamoun TK wrote:
> ---------- Forwarded message ---------
> From: Maamoun TK <maamoun...@googlemail.com>
> Date: Thu, Nov 12, 2020 at 7:42 PM
> Subject: Re: [PowerPC] GCM optimization
> To: Niels Möller <ni...@lysator.liu.se>
>
>
> On Thu, Nov 12, 2020 at 6:40 PM Niels Möller <ni...@lysator.liu.se> wrote:
>
> > I gave it a test run on gcc112 in the gcc compile farm, and speedup of
> > gcm update seems to be 26 times(!) compared to the C version.
> >
>
> That's reasonable, I got a similar speedup on more stable POWER instances
> than the gcc compile farm.
>
>
> > Where would that documentation be published? In the Nettle manual, as
> > some IBM white paper, or as a more-or-less academic paper, e.g., on
> > arxiv? I will not be able to spend much time on writing, but I'd be
> > happy to review.
> >
>
> I'll start writing the papers once I get more details from IBM. Similar to
> the Intel documents, the document will be academic and practical at the same

Hi Mamone,

What do you need from the IBM side? I may be able to help. We'd definitely
like to support you and Niels in publishing your results.

> time, I'll dive into finite field equations to demonstrate how we get there,
> and I'll also add a practical example to clarify why this method is
> preferable, in addition to its expected speedup. My intention is that other
> crypto libraries could take advantage of this document, or that it could be
> a starting point for further improvements to the algorithm, so I'm checking
> if IBM would publish or approve such a document the same as Intel.
>
>
> > I have a sketch of ARM Neon code doing the equivalent of two vpmsumd,
> > with reasonable parallelism. Quite a lot of instructions needed.
> >
>
> If you don't have much time, you can send it here and I'll continue from
> that point. I'm planning to compare the new method with the usual method
> with and without the Karatsuba algorithm.
>
> > > +C Alignment of gcm_key table elements, which is declared in gcm.h
> > > +define(`TableElemAlign', `0x100')
> >
> > I still find this large constant puzzling. If I try
> >
> >   struct gcm_key key;
> >   printf("sizeof (key): %zd, sizeof(key.h[0]): %zd\n", sizeof(key),
> >          sizeof(key.h[0]));
> >
> > (I added it to the start of test_main in gcm-test.c) and run on the
> > gcc112 machine, I get
> >
> >   sizeof (key): 4096, sizeof(key.h[0]): 16
> >
> > Which is what I'd expect, with elements of size 16 bytes, not 256 bytes.
> >
> > I haven't yet had the time to read the code carefully.
> >
>
> You see, the alignment of each element is 0x100 (256).
> The table has 16 elements and you got the size of the table 4096, which is
> reasonable because 16*256=4096
>
> regards,
> Mamone
> _______________________________________________
> nettle-bugs mailing list
> nettle-bugs@lists.lysator.liu.se
> http://lists.lysator.liu.se/mailman/listinfo/nettle-bugs

--
George Wilson
IBM Linux Technology Center
Security Development