Hi all,

Megane looked into #1650 and found that the srand(time(NULL)) call
is the culprit: because the hash table randomization is seeded from the
current time, the number of GCs can differ on every run.

This made me think it could be one reason our benchmarks are so
noisy, so I added a runtime option to fix the randomization seed on
startup.  This does indeed completely remove all variation in the
"number of major GCs" metric in the CHICKEN benchmarks, but unfortunately
it doesn't seem to do much for the CPU time or GC time metrics.

Oh well, I think it could be useful enough anyway, and perhaps a good
first stab at reducing the noisiness of our benchmarks.

The patch should speak for itself.
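
For anyone who wants to see the idea in isolation, here is a minimal
standalone sketch of the "seed once, explicit seed wins" pattern.  It is
not code from runtime.c: the names seed_initialized, seed_fixed,
seed_default and fill_random are made up for illustration, and it only
relies on the C standard's guarantee that srand() with the same seed
yields the same rand() sequence.

```c
#include <stdlib.h>
#include <time.h>

/* Hypothetical flag for this sketch; set once any seeding has happened. */
static int seed_initialized = 0;

/* Seed explicitly, roughly what an option like -:RNUMBER would do. */
void seed_fixed(unsigned int seed)
{
  srand(seed);
  seed_initialized = 1;
}

/* Startup path: fall back to the clock only if no explicit seed was given. */
void seed_default(void)
{
  if (!seed_initialized) {
    srand((unsigned int)time(NULL));
    seed_initialized = 1;
  }
}

/* Fill out[0..n-1] with the next n values from rand(). */
void fill_random(int *out, size_t n)
{
  size_t i;

  for (i = 0; i < n; ++i)
    out[i] = rand();
}
```

Calling seed_fixed(42) before seed_default() should give the same
fill_random() output on every run, while calling only seed_default()
gives a time-dependent sequence.  With the patch applied, the
command-line equivalent would be passing something like -:R42 to a
compiled program (the seed value itself is arbitrary).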

Cheers,
Peter

From eecb550157a0bf809132329928b9338c37875bd8 Mon Sep 17 00:00:00 2001
From: Peter Bex <pe...@more-magic.net>
Date: Thu, 17 Jun 2021 13:28:57 +0200
Subject: [PATCH] Add new -:R runtime option to influence how srand() is called

Without this, there's some nondeterminism when running an
application, which defends against symbol table stuffing attacks
but can make debugging or benchmarking GC issues difficult.

As noted in #1650, it is unsettling when a program behaves
differently every time you run it.

While we're at it, remove unnecessary C_fix() call on the output of
time(NULL), as srand() is not a CHICKEN C function accepting fixnums,
but a native C function accepting regular integers.
---
 NEWS                      |  2 ++
 manual/Using the compiler |  2 ++
 runtime.c                 | 12 +++++++++++-
 3 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/NEWS b/NEWS
index e06914c7..46af9bd1 100644
--- a/NEWS
+++ b/NEWS
@@ -50,6 +50,8 @@
   - Garbage collection algorithm has been changed to reduce thrashing
     when heap is almost full, by growing the heap sooner.  A new -:hf
     option was added to tweak when heap growth should occur.
+  - Added `-:R' runtime option to initialize rand() state
+    deterministically (should help with #1650 and benchmarking).
 
 - Compiler
   - Avoid re-using argvector when inline rest operations are being
diff --git a/manual/Using the compiler b/manual/Using the compiler
index 7e8ff6de..68f09908 100644
--- a/manual/Using the compiler	
+++ b/manual/Using the compiler	
@@ -237,6 +237,8 @@ by the startup code and will not be contained in the result of
 
 ; {{-:r}} : Writes trace output to stderr. This option has no effect in files compiled with the {{-no-trace}} options.
 
+; {{-:RNUMBER}} : Specifies the initial number passed to seed the {{rand()}} PRNG (which is currently only used to randomize the symbol table).  If not supplied, the current system time is used.  This can be useful when debugging or benchmarking because it removes a source of nondeterminism which can affect how soon or how often the GC is triggered.
+
 ; {{-:sNUMBER}} : Specifies stack size.
 
 ; {{-:tNUMBER}} : Specifies symbol table size.
diff --git a/runtime.c b/runtime.c
index 580c6fbe..93dd9d29 100644
--- a/runtime.c
+++ b/runtime.c
@@ -454,6 +454,7 @@ static C_TLS int
   stack_size_changed,
   dlopen_flags,
   heap_size_changed,
+  random_state_initialized = 0,
   chicken_is_running,
   chicken_ran_once,
   pass_serious_signals = 1,
@@ -845,7 +846,10 @@ int CHICKEN_initialize(int heap, int stack, int symbols, void *toplevel)
   current_module_handle = NULL;
   callback_continuation_level = 0;
   gc_ms = 0;
-  srand(C_fix(time(NULL)));
+  if (!random_state_initialized) {
+    srand(time(NULL));
+    random_state_initialized = 1;
+  }
 
   for(i = 0; i < C_RANDOM_STATE_SIZE / sizeof(C_uword); ++i)
     random_state[ i ] = rand();
@@ -1379,6 +1383,7 @@ void CHICKEN_parse_command_line(int argc, char *argv[], C_word *heap, C_word *st
 		 " -:huPERCENTAGE   set percentage of memory used at which heap will be shrunk\n"
 		 " -:hSIZE          set fixed heap size\n"
 		 " -:r              write trace output to stderr\n"
+		 " -:RSEED          initialize rand() seed with SEED (helpful for benchmark stability)\n"
 		 " -:p              collect statistical profile and write to file at exit\n"
 		 " -:PFREQUENCY     like -:p, specifying sampling frequency in us (default: 10000)\n"
 		 " -:sSIZE          set nursery (stack) size\n"
@@ -1494,6 +1499,11 @@ void CHICKEN_parse_command_line(int argc, char *argv[], C_word *heap, C_word *st
 	  show_trace = 1;
 	  break;
 
+	case 'R':
+	  srand((unsigned int)arg_val(ptr));
+	  random_state_initialized = 1;
+	  goto next;
+
 	case 'x':
 	  C_abort_on_thread_exceptions = 1;
 	  break;
-- 
2.20.1