Hi Werner,

It's been a while since my last response.  Do you have more comments or 
questions?  And what's the status of this patch?

Thanks.
-Danny
________________________________
From: Danny Tsen <[email protected]>
Sent: Sunday, March 1, 2026 8:19 PM
To: Werner Koch <[email protected]>; Danny Tsen via Gcrypt-devel 
<[email protected]>; Danny Tsen <[email protected]>
Subject: Re: [EXTERNAL] RE: [PATCH 0/5] dilithium-kyber: Optimized (i)NTT 
support for

Hi Werner,

I made some modifications for the ML-KEM format.  Here are the raw 
performance numbers for the ML-KEM NTT.  Hope this helps.

Thanks.
-Danny

[16:33] danny@ltcden12-lp1 mlkem-ipcri % ./perf_mlkem_test


=== Optimized assembly NTT test

cpu_time_used (sec)=0.016707
loops=100000
-->ops / sec = 5985515.053570


=== Original C NTT test

cpu_time_used (sec)=0.107232
loops=100000
-->ops / sec = 932557.445539
-->Optimized improvement over original = 5.418388
-->Optimized speed over original faster = 6.418388


=== Optimized Assembly Inverse NTT test

cpu_time_used (sec)=0.031500
loops=100000
-->ops / sec = 3174603.174603


=== Original C Inverse NTT test

cpu_time_used (sec)=0.138457
loops=100000
-->ops / sec = 722245.895838
-->Optimized improvement over original = 3.395460
-->Optimized speed over original faster = 4.395460
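
For readers of the numbers above, here is a minimal sketch of how the printed figures appear to relate to each other. The source of the test program is not part of this thread, so the function names below are hypothetical and the formulas are inferred only from the output: "ops / sec" matches loops divided by CPU time, "speed over original" matches the ratio of the two timings, and "improvement over original" matches that ratio minus one.

```c
/* Hypothetical helpers; names and formulas are inferred from the
 * printed output, not taken from the actual test program. */

/* Throughput: iterations completed per second of CPU time. */
static double ops_per_sec(unsigned long loops, double cpu_seconds)
{
    return (double)loops / cpu_seconds;
}

/* "Speed over original": how many times faster the optimized
 * version ran, given both timings for the same loop count. */
static double speedup(double opt_seconds, double orig_seconds)
{
    return orig_seconds / opt_seconds;
}

/* "Improvement over original": relative gain, i.e. speedup - 1. */
static double improvement(double opt_seconds, double orig_seconds)
{
    return speedup(opt_seconds, orig_seconds) - 1.0;
}
```

With the ML-KEM forward NTT timings above (0.016707 s optimized, 0.107232 s original), speedup() gives roughly 6.42 and improvement() roughly 5.42, which agrees with the printed values up to the rounding of the timings.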
________________________________
From: Gcrypt-devel <[email protected]> on behalf of Danny Tsen via 
Gcrypt-devel <[email protected]>
Sent: Monday, March 2, 2026 9:37 AM
To: Werner Koch <[email protected]>; Danny Tsen via Gcrypt-devel 
<[email protected]>
Subject: [EXTERNAL] RE: [PATCH 0/5] dilithium-kyber: Optimized (i)NTT support 
for

Hi Werner,

For some reason, I couldn't display your message at first, but I can read it 
now.  I don't have a good comparison format for ML-KEM performance, but here 
are the raw performance numbers for ML-DSA.

Thanks.
-Danny

[15:47] danny@ltcden12-lp1 mldsa-ntt_tests % ./perf_mldsa_ntt_opt


=== Optimized assembly NTT test

cpu_time_used (sec)=0.046582
loops=100000
-->ops / sec = 2146751.964278


=== Original C NTT test

cpu_time_used (sec)=0.229215
loops=100000
-->ops / sec = 436271.622712
-->Optimized improvement over original = 3.920678
-->Optimized speed over original faster = 4.920678


=== Optimized Assembly Inverse NTT test

cpu_time_used (sec)=0.052021
loops=100000
-->ops / sec = 1922300.609369


=== Original C Inverse NTT test

cpu_time_used (sec)=0.270790
loops=100000
-->ops / sec = 369289.855608
-->Optimized improvement over original = 4.205398
-->Optimized speed over original faster = 5.205398


________________________________
From: Werner Koch
Sent: Thursday, February 26, 2026 9:47 PM
To: Danny Tsen via Gcrypt-devel
Cc: Danny Tsen
Subject: [EXTERNAL] Re: [PATCH 0/5] dilithium-kyber: Optimized (i)NTT support 
for

On Thu, 26 Feb 2026 10:23, Danny Tsen said:

> I don't have benchmark for libgcrypt.  I do have my own testing
> performance number on NTT operation. That probably not what you are

I just noticed that we do have support for MLKEM and MLDSA in our
./bench-slope.  We should change that to make it easier to run
benchmarks.

I was actually looking only for a rough figure on how much performance
you gain with your patches.


Salam-Shalom,

   Werner

--
The pioneers of a warless world are the youth that
refuse military service.             - A. Einstein
_______________________________________________
Gcrypt-devel mailing list
[email protected]
https://lists.gnupg.org/mailman/listinfo/gcrypt-devel
