On 21.07.2017 11:26, Jan Glauber wrote:

> Nice catch. How much does the performance improve on Ryzen when you
> use arch_get_random_int()?

Okay, now I have some results for you:

On Ryzen 1800X (using arch_get_random_int()):

---
# dd if=/dev/urandom of=/dev/null bs=1M status=progress
8751415296 bytes (8,8 GB, 8,2 GiB) copied, 71,0079 s, 123 MB/s
# perf top
   57,37%  [kernel]                    [k] _extract_crng
   26,20%  [kernel]                    [k] chacha20_block
---

Better, but obviously there is still much room for improvement by reducing the number of calls to RDRAND.
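
To illustrate what "reducing the number of calls" could look like, here is a minimal user-space sketch, not kernel code and not part of the patch in this mail. It mixes in a hardware value only once every HW_MIX_INTERVAL extracted blocks instead of on every block. The names toy_crng, toy_extract(), fake_hw_random(), run_blocks() and the interval of 16 are all hypothetical and chosen only for illustration; fake_hw_random() merely counts calls and does not return real random data. Whether mixing less often is acceptable from a security point of view is a separate question that this sketch does not address.

```c
#include <stdint.h>

/* Assumed interval: mix hardware output once every 16 extracted blocks. */
#define HW_MIX_INTERVAL 16

/* Counts how often the (slow) hardware RNG would have been invoked. */
static unsigned long hw_calls;

/*
 * Hypothetical stand-in for arch_get_random_int(). It only counts calls
 * and returns a placeholder value; it is NOT a source of randomness.
 */
static int fake_hw_random(uint32_t *v)
{
	hw_calls++;
	*v = 0x9e3779b9u * (uint32_t)hw_calls;
	return 1;
}

/* Toy state, loosely shaped like the 16-word ChaCha20 state. */
struct toy_crng {
	uint32_t state[16];
	unsigned int blocks_since_mix;
};

/* Extract one block, touching the hardware RNG only every Nth call. */
static void toy_extract(struct toy_crng *c)
{
	uint32_t v;

	if (c->blocks_since_mix == 0 && fake_hw_random(&v))
		c->state[14] ^= v;
	c->blocks_since_mix = (c->blocks_since_mix + 1) % HW_MIX_INTERVAL;

	/* Stands in for the block counter that chacha20_block() advances. */
	c->state[12]++;
}

/* Extract n blocks from a fresh state; return the number of hardware calls. */
static unsigned long run_blocks(unsigned long n)
{
	struct toy_crng c = { { 0 }, 0 };
	unsigned long i;

	hw_calls = 0;
	for (i = 0; i < n; i++)
		toy_extract(&c);
	return hw_calls;
}
```

With these assumed values, run_blocks(1024) reports 64 hardware calls rather than 1024, which is the kind of reduction meant above.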

On Ryzen 1800X (with nordrand kernel option):

---
# dd if=/dev/urandom of=/dev/null bs=1M status=progress
22643998720 bytes (23 GB, 21 GiB) copied, 67,0025 s, 338 MB/s
---

Here is the patch I used:

--- drivers/char/random.c.orig  2017-07-03 01:07:02.000000000 +0200
+++ drivers/char/random.c       2017-07-21 11:57:40.541677118 +0200
@@ -859,13 +859,14 @@
  static void _extract_crng(struct crng_state *crng,
                           __u8 out[CHACHA20_BLOCK_SIZE])
  {
-       unsigned long v, flags;
+       unsigned int v;
+       unsigned long flags;

         if (crng_init > 1 &&
             time_after(jiffies, crng->init_time + CRNG_RESEED_INTERVAL))
                 crng_reseed(crng, crng == &primary_crng ? &input_pool : NULL);
         spin_lock_irqsave(&crng->lock, flags);
-       if (arch_get_random_long(&v))
+       if (arch_get_random_int(&v))
                 crng->state[14] ^= v;
         chacha20_block(&crng->state[0], out);
         if (crng->state[12] == 0)
