[MIT-Scheme-devel] Symmetric MultiProcessing

Friar Puck Sun, 21 Dec 2014 12:03:48 -0800

> From: Friar Puck <[email protected]>
> Date: Tue, 25 Nov 2014 13:22:47 -0700
> 
> [...]  And SOMEONE should look at every global variable in the
> microcode.  Each deserves to be thread-local or protected by
> explicit serialization. [...]


Another rebase, another progress update: SOMEONE has sorted the
microcode.

There are 91 modules (.o files) in the Unix link but 43 do not define
any variables (according to nm, not a careful reading of the source).
Each of the 48 modules with variables was analyzed to ensure that each
variable is either pthread-local or shareable -- read-only (after
initialization) or read/written only by serial readers/writers.
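
To make the three categories concrete, here is a minimal C sketch.  It
is not taken from the microcode: every name in it (tick_size,
local_ticks, shared_total, run_workers) is hypothetical, and it
assumes POSIX threads plus the GCC/Clang __thread storage class that
the README excerpt below also mentions.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_WORKERS 16

/* Category 1: read-only after initialization.  Written once, before
   any worker thread exists, then only read.  (Hypothetical name.) */
static long tick_size;

/* Category 2: thread-local.  Each pthread gets its own zeroed copy,
   so no locking is needed.  (Hypothetical name.) */
static __thread long local_ticks;

/* Category 3: shared and mutable, so every access is serialized by
   an explicit mutex.  (Hypothetical names.) */
static long shared_total;
static pthread_mutex_t shared_total_lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg)
{
    long n = *(const long *)arg;
    for (long i = 0; i < n; i++)
        local_ticks += tick_size;       /* private copy, no lock */

    pthread_mutex_lock(&shared_total_lock);
    shared_total += local_ticks;        /* serialized writer */
    pthread_mutex_unlock(&shared_total_lock);
    return NULL;
}

/* Start nthreads workers that each count n ticks, wait for all of
   them, and return the shared total. */
long run_workers(int nthreads, long n)
{
    pthread_t threads[MAX_WORKERS];
    assert(nthreads >= 0 && nthreads <= MAX_WORKERS);

    tick_size = 1;                      /* initialize before any thread starts */
    shared_total = 0;

    for (int i = 0; i < nthreads; i++) {
        int rc = pthread_create(&threads[i], NULL, worker, &n);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < nthreads; i++)
        pthread_join(threads[i], NULL);

    return shared_total;                /* all writers have been joined */
}
```

Calling run_workers(3, 1000) from a main should return 3000, while the
calling thread's own local_ticks stays 0 because the workers only ever
touched their private copies.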

Each analysis was saved in a commit, the latest 48 on my rebased SMP
branch.  26 of them patch nothing except a top-level README.txt (a
temporary file) in which I tracked my progress through the analysis.
An example of the 26 is appended.

Are we ready to multi-process in a symmetric fashion?  Nearly!  In
fact I cheated and used a simple fib procedure to waste time and
consume stack (and only lightly touch the runtime system) inside a
varying number of threads (e.g. 3 to test 2 processors).

    (define (fib n)
      (cond ((<= n 2) n)
            (else (+ (fib (- n 1)) (fib (- n 2))))))

    (map join-thread*
         (map (lambda (n)
                (create-thread #f
                  (lambda ()
                    (let ((fib-34 (fib 34)))
                      (outf-error ";thread-" n ": " fib-34 "\n")
                      fib-34))))
              '(1 2 3)))

    $ microcode/scheme --library lib --processors 1
    ...
    ;process time: 25860 (25420 RUN + 440 GC); real time: 25887

    $ microcode/scheme --library lib --processors 2
    ...
    ;process time: 28090 (27850 RUN + 240 GC); real time: 14926

The encouraging results: 2 processors got it done in 58% of the real
time it took 1!

Note that I didn't compile anything.  When I do, the garbage collector
never runs.  When I don't, I see 2 processors (using 50% more heap)
take little more than half the GC time(!).  They must be consuming
heap at similar rates, taking full advantage of each flip regardless
of who runs out of heap first.

The impact on system performance of all the mutex grabbing and
releasing (on every thread switch!) isn't too bad.  When 9.2 ran the
same cheat/test, it took 23179 msec, so the SMPing world (single
processor) took just 12% longer.
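
For anyone checking the arithmetic, the percentages quoted above can
be reproduced from the timings in the transcripts with a small C
helper.  This is only a sketch: percent_of is a hypothetical name, and
I am assuming the 12% figure compares the 1-processor real time
(25887) against the 9.2 time (23179); using the process time (25860)
rounds to the same result.

```c
#include <assert.h>

/* Return a/b as a percentage, rounded to the nearest integer.
   Both arguments are times in milliseconds.  (Hypothetical helper.) */
long percent_of(long a, long b)
{
    assert(b > 0);
    return (a * 100 + b / 2) / b;
}

/* Using the figures from the transcripts:
     percent_of(14926, 25887)        is 58   2-processor vs 1-processor real time
     percent_of(25887, 23179) - 100  is 12   SMP single processor vs 9.2
     percent_of(240, 440)            is 55   2-processor vs 1-processor GC time */
```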

Encouraging results are just what SOMEONE needs when going in the ring
with up to 104(!) possible demons in the guise of without-interrupts.
And that's just the runtime system.

Yours in Scheme,
SOMEONE


$ git remote add puck git://birchwood-abbey.net/~matt/mit-scheme.git
$ git fetch puck SMP
$ git checkout puck/SMP
$ cd src/
$ ./Setup.sh
$ ./configure --enable-smp
$ make tags all check


commit 12827782fa527ac00259794813951514208e1d8e
Author: Matt Birkholz <[email protected]>
Date:   Thu Dec 4 17:19:42 2014 -0700

    smp: share: boot.o

diff --git a/README.txt b/README.txt
index e4b1ed0..fc7f441 100644
--- a/README.txt
+++ b/README.txt
@@ -200,17 +200,22 @@ command line.  The remaining 12 belong to the 7 microcode modules and
   bitstr.o:
 
   boot.o:
-  00000004 C OS_Name
-  00000004 C OS_Variant
-  00000050 B critical_section_hook
-  0000004c B critical_section_hook_p
-  00000048 B critical_section_name
-  00000024 B ffi_obstack
-  00000000 b initial_C_stack_pointer
-  00000004 b reload_saved_string
-  00000008 b reload_saved_string_length
-  00000004 C scheme_program_name
-  00000000 B scratch_obstack
+  00000004 C OS_Name                   read-only, initialized by OS_initialize
+  00000004 C OS_Variant                read-only, initialized by OS_initialize
+  00000050 B critical_section_hook     __thread
+  0000004c B critical_section_hook_p   __thread
+  00000048 B critical_section_name     __thread
+  00000024 B ffi_obstack               __thread
+  00000000 b initial_C_stack_pointer   read-only, initialized by main
+  00000004 b reload_saved_string       written by Prim_reload_save_string
+  00000008 b reload_saved_string_length written by Prim_reload_save_string
+  00000004 C scheme_program_name       read-only, initialized by main
+  00000000 B scratch_obstack           __thread
+
+       OK.  Localized (or not) already.  The reload-save-string and
+       reload-retrieve-string primitives are only used in
+       save-console-input and reset-console, presumably not by
+       multiple threads at the same time.
 
   char.o:
 

_______________________________________________
MIT-Scheme-devel mailing list
[email protected]
https://lists.gnu.org/mailman/listinfo/mit-scheme-devel
