A seasoned programmer's take on Nim's docs

mratsim Sat, 14 Jan 2023 09:30:23 -0800

@jtv,

I have looked further into the subject. My answer will be based on the 
following papers:


  1. Parallel Random Numbers: As Easy as 1, 2, 3, 2011

John K. Salmon, Mark A. Moraes, Ron O. Dror, and David E. Shaw

<https://thesalmons.org/john/random123/papers/random123sc11.pdf>

<https://github.com/DEShawResearch/random123>

  2. New and Old Limits for AES Known-Key Distinguishers, 2017

Lorenzo Grassi and Christian Rechberger

<https://eprint.iacr.org/2017/255.pdf>

  3. Fast-key-erasure random-number generators, 2017

Daniel J. Bernstein

<https://blog.cr.yp.to/20170723-random.html>

<https://github.com/floodyberry/supercop/blob/a351f2c/crypto_sign/ntrumls439x/ref/fastrandombytes.c>

  4. Randen - fast backtracking-resistant random generator with 
AES+Feistel+Reverie, 2018

Jan Wassenberg, Robert Obryk, Jyrki Alakuijala, Emmanuel Mogenet

<https://arxiv.org/abs/1810.02227>

<https://github.com/google/randen>

  5. Scrambled linear pseudorandom number generators, 2021

David Blackman and Sebastiano Vigna

<http://vigna.di.unimi.it/ftp/papers/ScrambledLinear.pdf>

<https://prng.di.unimi.it/>

(Nim uses xoroshiro128+ by default)

  6. BearSSL - Constant-time

<https://bearssl.org/constanttime.html#aes>

  7. JAX PRNG design

<https://github.com/google/jax/blob/main/docs/jep/263-prng.md>

  8. Weave and Trace-of-Radiance, PRNG research for parallel Monte-Carlo 
simulations

<https://github.com/mratsim/trace-of-radiance/pull/1>

<https://github.com/mratsim/weave/issues/147>

  9. Computer Go strength being highly sensitive to Monte-Carlo simulation 
speed:

Can be used as real-world PRNG bench: 
<https://github.com/mratsim/golem-prime/blob/8953347/src/golem_prime.nim#L85-L101>

Monte-Carlo tree search and rapid action value estimation in computer Go

Sylvain Gelly, David Silver

<https://www.ics.uci.edu/~dechter/courses/ics-295/winter-2018/papers/mcts-gelly-silver.pdf>




> No, I did not say that non-cryptographic PRNGs would be slower than 
> cryptographic PRNGs. What I said is that you can absolutely make AES-CTR fast 
> enough to be the DEFAULT, because few people will need anything else. Most 
> people grossly overestimate the cost.

First of all, there is no hierarchy in Nim stdlib, Nim provides std/random and 
std/sysrand. And no package within the standard library forces either.

In [1], section 4.1, on 2011 CPUs.

> Using compiler intrinsics to invoke the AES instructions and gcc-4.5.2, we 
> have measured an AES PRNG running at 1.7 cpB on a single core, faster than 
> the Mersenne Twister in the C++0x standard lib > The software implementation 
> we used for AES is not the fastest known. AES has been reported to run at 
> 10–15 cpB on a variety of platforms [cites D. J. Bernstein and P. Schwabe. 
> New AES software speed records] (without AES-NI]

In [4], chapter 3

> The AES block cipher is well-understood and often hardware-accelerated. 
> Intel’s AESNI instructions are five to ten times faster than opti- mized 
> software implementations. This implies a software-only Randen would be 
> unacceptably slow (and likely vulnerable to side- channel attacks). On CPUs 
> without hardware AES, it may be faster to replace AES with SIMD-friendly 
> permutations such as ChaCha. However, most modern CPUs have AES hardware, 
> including POWER (VCIPHER) and ARMv8 (AESE)

In [6] Pornin provides bench in MB/s of AES-CTR with a classic table-based 
approach and with AES-NI: 170MB/s vs 2420MB/s, so a 14.2x factor

In [5] on the speed section of Vigna website

> Otherwise, with suitable hardware support I could just use AES in counter 
> mode and get 64 secure bits in 0.56 ns (or just use Randen). The tests were 
> performed on a 12th Gen Intel® Core™ i7-12700KF @3.60GHz using gcc 12.2.1.

xoroshiro128+ is still at 0.34 cycles per byte so 1.65x faster.

Which brings us to the next question, does that speed difference matter when 
both are sub-nanosecond? Likely not, see Randen paper [4] where PCG is faster 
on microbenchmark but not in real-life reservoir sampling scenario.

However 8x definitely matters, in [9] in table 1 p1869, between 3000 
Monte-Carlo simulations per move and 10000 simulations per move, there is a 69% 
to 83% winrate improvement and an ELO rating improvement from 1960 to 2110. In 
practice this means, that with 10k simulations per move, you would skip your 
first move to give a fair fighting chance to the program with 3k simulations 
per move.

Now, can AES-CTR be the default in Nim?

The issue is that it is way too slow when implemented in software, 8x to 14x 
slower than when hardware acceleration is available. Implementing it with 
hardware acceleration requires divergent code paths with intrinsics or inline 
assembly for all platforms we support and a C fallback, especially for RNG in 
the compile-time VM, which will slowdown drastically if we need RNG at 
compile-time, which does happen 
(<https://github.com/numforge/laser/blob/e23b5d6/laser/openmp.nim#L13>). 
Testing the ARM code will be very hard in CI as well not even talking about the 
rest of supported [hostCPU](https://nim-lang.org/docs/system.html#hostCPU) 
"i386", "alpha", "powerpc", "powerpc64", "powerpc64el", "sparc", "amd64", 
"mips", "mipsel", "arm", "arm64", "mips64", "mips64el", "riscv32", "riscv64" 
(and s390x is missing from the doc, 
<https://github.com/nim-lang/Nim/pull/20943>) This is something that Nim should 
delegate to libraries.

> Again, people use cryptographic PRNGs in these things all the time. For 
> example, cryptographic PRNGs are perfectly capable of repeatability for monte 
> carlo simulations, and have the advantage in such cases that random access 
> into the output stream is incredibly cheap, which is NOT the case with any 
> non-cryptographic PRNG, where doing so without saving state is neither 
> straightforward nor cheap, due to the nature of the algorithms.

Besides plain AES-CTR has vulnerabilities when used as a PRNG:

  * See Known-Key distinguishers paper [3], Randen uses 2 rounds for example.
  * if implemented in software it is not constant-time or about 2x even slower 
than table-based and its vulnerable to cache timing attacks [3, 6], see also 
<https://github.com/mratsim/constantine/wiki/Constant-time-arithmetics> 
(attacks on WPA3, TLS, Intel SGX Secure Enclave, libgcrypt/GPG, ...)



So you get pain and you're still not secure enough when security matter

> Note that I'm not saying to take fast, non-cryptographic PRNGs should be 
> taken out of the library. People who are worried about "what's Nim's fastest 
> option" can test them, otherwise they're not comparing apples and oranges.
> 
> And caring about the security of the programs written in Nim would be a MUCH 
> more unique and valuable thing to market around. I don't see it being the 
> thing everyone slams Rust for; instead, it's one of the things I've heard 
> people tout about it.

Nim offers a Cryptographically-Secure PRNG in 
<https://nim-lang.org/docs/sysrand.html>. It's pretty new so discoverability is 
bad. It should be mentioned in std/random docs for starters. And maybe 
std/random should be renamed std/pseudorandom. But I don't think we can do much 
beyond education for this problem.

Re: security, I'm probably the one of the people caring most about it:

  * I am greatly interested in making formal verification of Nim programs a 
reality: <https://github.com/nim-lang/RFCs/issues/222>
  * Similarly regarding proving that concurrent data structures and 
multithreaded applications are sound with a race/deadlock/livelock detector:
    * <https://github.com/mratsim/weave/issues/18>
  * Writing a guide for security auditors to audit Nim: 
<https://nimbus.guide/auditors-book/>
  * Writing a comprehensive cryptography library: 
<https://github.com/mratsim/constantine>



However, in this case either the PRNG will be too slow and insecure (software) 
or hard to test (HW acceleration on non-x86).

> That assumes people understand well enough to realize that, for instance, the 
> online game they're building. Most people are not security experts, make 
> incredibly wrong assumptions, and end up with risks by default.

In that case, either their RNG is critical (say online poker) and they hire an 
expert and have an audit as well. Or it's not and maybe it can be exploited for 
lootboxes or matchmaking but those can be handled with a refund, an apology and 
a bugfix.

There are scenarios where you can exploit a bad RNG that can surprise devs, for 
example to carry an eclipse attack in a P2P setting 
(<https://eprint.iacr.org/2015/263.pdf>). But the problem is education:

  * Does the code accepts all incoming peers for peer exchange (i.e. potential 
attacker-controller) ?
  * Does the code ensures that some percentage of connections will be initiated 
by itself (i.e. cannot be controlled by attacker) ?



If you don't say yes to the second one, even a cryptographically secure PRNG 
won't save you.

## My take

Finally, my recommendations. As said at the very beginning, Nim standard 
library provides no default, only options regarding RNG. Only thing we can do 
is improve discoverability and documentation:

  1. Rename std/random to std/pseudorandom to reinforce the fact that we're not 
dealing with true randomness
  2. Mention std/sysrand in std/pseudorandom documentation. Mention that 
anytime having a flawed RNG might lead to financial or security consequences, 
use std/sysrand instead. This includes games where players might want an edge, 
a network application that randomly accept peers or reject messages when 
rate-limiting or for cryptographic protocols.
  3. Mention in std/sha1 and std/md5 that those have not met cryptographic 
standards since 1996 and 2005 and can only be used to hash trusted data, for 
example files your compiler just generated. (and standardize output: 
<https://github.com/nim-lang/RFCs/issues/32>)
  4. Create a new hash table with SipHash or provide SipHash as a compile-time 
option for tables used in Network libraries and applications ( 
<https://en.wikipedia.org/wiki/SipHash> )

A seasoned programmer's take on Nim's docs

Reply via email to