Hi!

----

Below are some notes about thread support... feedback/rants/etc. would
be very welcome...

1. Goal:
The goal is that the shell can run multiple independent threads.
Each thread will have its own set of function-local variables and its
own set of signals, but all threads share one set of global variables
(which must be protected by locks) and other global resources (e.g.
I/O, locale, current working directory, list of
functions/types/namespaces).
Basically it should work like this:
$ sh -c 'function mythr { sleep 100 ; } ; builtin pthread_create ;
integer tid ; pthread_create -L -f mythr -t tid ; wait -T ${tid} ;
exit 0' #
The implementation of threads should add only a _minimum_ of
overhead, e.g. there should be no penalty for thread support in
scripts which do not use this feature.

Misc notes:
- Currently the shell passes around a |Shell_t*| as context... in the
future this needs to be replaced with a |Shellthr_t*| (=shell thread
context) which contains a pointer to the (global) |Shell_t*|, the pool
of function-local variables, and the signals+registered traps for this
thread.
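
A rough sketch of what such a thread context could look like. This is
only an illustration: the |Shell_t| here is a tiny stand-in for the
real structure, and every field name, the |SHELLTHR_NSIG| constant and
the |shellthr_init()| helper are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

#define SHELLTHR_NSIG 64   /* hypothetical upper bound on signal numbers */

/* Stand-in for the real (much larger) global shell state. */
typedef struct Shell_s {
    void *global_vars;     /* hypothetical: pool of global variables */
} Shell_t;

/* Hypothetical per-thread context; all field names are illustrative. */
typedef struct Shellthr_s {
    Shell_t *shp;                 /* pointer to the shared, global shell */
    void *local_vars;             /* this thread's function-local variables */
    sigset_t sigmask;             /* signals owned by this thread */
    char *traps[SHELLTHR_NSIG];   /* registered trap commands per signal */
} Shellthr_t;

/* Set up an empty thread context that refers back to the global shell. */
static int shellthr_init(Shellthr_t *thr, Shell_t *shp)
{
    assert(thr != NULL && shp != NULL);
    thr->shp = shp;
    thr->local_vars = NULL;
    for (size_t i = 0; i < SHELLTHR_NSIG; i++)
        thr->traps[i] = NULL;     /* no traps registered yet */
    return sigemptyset(&thr->sigmask);
}
```

Code which takes a |Shell_t*| today would take a |Shellthr_t*| instead
and reach the global state via |thr->shp|.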

- It should be possible to have more than one |Shell_t*| object around
so that an application like /usr/bin/make can use it instead of
launching a shell for each make rule execution. This is not for
multithread support but a useful side-effect of the cleanup work
required for thread support.

- Threads are restricted to the current subshell only... basically
working with this set of rules:
1. If the shell is not in a subshell yet (e.g. |shp->subshell==NULL|),
creating a thread works without calling |fork()|.
2. If a shell is in a subshell and creates a thread, the subshell
should call |sh_subfork()| before |pthread_create()| to make sure the
subshell instance and its threads are independent from the calling
subshell level and from threads in other subshell instances.
3. If a thread creates a new subshell then the implementation should
call |fork()| (to replicate only the calling thread and run it in a
separate independent process) to make sure the new subshell is
independent from the parent.
These rules are *needed* to isolate the subshell instances from
threads in other subshells; otherwise they could affect each other in
very unpredictable ways. As a side-effect we get back to the old rule
in thread programming: "Either you use threads or processes but not
both at the same time", which could be rewritten in our case to
"Either you use threads or subshells but not both at the same time"
(subshells will still work but run in a separate process if the parent
has any threads running).
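
The three rules can be written down as a small decision function. This
is a sketch only: the enum names and |isolate_action()| are made up
for illustration, and the "no threads running, so the subshell needs
no |fork()|" case is my reading of the last sentence above, not
something taken from the current code.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { REQ_THREAD, REQ_SUBSHELL } isolate_request_t;

typedef enum {
    ACT_CREATE_THREAD,        /* rule 1: pthread_create() only */
    ACT_SUBFORK_THEN_THREAD,  /* rule 2: sh_subfork(), then pthread_create() */
    ACT_FORK_SUBSHELL,        /* rule 3: fork() the new subshell */
    ACT_PLAIN_SUBSHELL        /* assumption: no threads, no fork() needed */
} isolate_action_t;

/* Decide how to keep threads and subshells isolated from each other. */
static isolate_action_t
isolate_action(isolate_request_t req, bool in_subshell, bool has_threads)
{
    assert(req == REQ_THREAD || req == REQ_SUBSHELL);
    if (req == REQ_THREAD)
        return in_subshell ? ACT_SUBFORK_THEN_THREAD : ACT_CREATE_THREAD;
    /* A new subshell only needs its own process if threads are running. */
    return has_threads ? ACT_FORK_SUBSHELL : ACT_PLAIN_SUBSHELL;
}
```

The point of the table form is that no combination ever leaves a
thread and a subshell sharing one process.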

- Once the first thread is started, some resources like the LC_*
variables and PWD (and "cd") become read-only resources (technically
there are |newlocale()| and the |openat()| API... but for a simple
thread prototype we need to reduce the scope of the work for now).

- We need some new options for builtins:
    - "kill" needs -T to send signals to a specific thread in the
current process
    - "wait" needs -T to wait for a specific thread to end
    - "jobs" needs -T to list the current threads belonging to the
current shell
    - We need a new builtin "pthread_create" which works more or less
like the POSIX |pthread_create| function

- The ".sh.*" global variable. As a global variable it comes with the
almighty horror of being... erm... global. The best I can think of
is this:
    1. All data in .sh.* is read-only unless [2] applies
    2. We create a sub-variable .sh.currthread which is a nameref to a
location where information about the current thread is stored (e.g.
.sh.currthread.tid for the current thread's tid number) along with any
data which is r/w. Any r/w data in the current .sh.<name> must be
replaced by a nameref pointing to .sh.currthread.<name> to prevent
multiple threads from stomping on each other

- List of jobs:
The list of jobs is a global resource and needs to be protected since
multiple threads may launch external processes concurrently

- I/O:
The list of open fds is a global resource and needs a r/w lock;
individual threads will have their normal sfio locking
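
A minimal sketch of the r/w lock idea using the POSIX
|pthread_rwlock_*| functions. The |fdtab_t| type, its size and the
|fdtab_*()| helpers are invented for this example and do not match the
shell's real fd bookkeeping.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define FDTAB_MAX 256   /* hypothetical table size */

/* Hypothetical global table of open fds, guarded by one r/w lock. */
typedef struct {
    pthread_rwlock_t lock;
    bool in_use[FDTAB_MAX];
} fdtab_t;

static int fdtab_init(fdtab_t *t)
{
    assert(t != NULL);
    for (int fd = 0; fd < FDTAB_MAX; fd++)
        t->in_use[fd] = false;
    return pthread_rwlock_init(&t->lock, NULL);
}

/* Readers may run concurrently: many threads can query the table. */
static bool fdtab_is_open(fdtab_t *t, int fd)
{
    bool open;
    if (fd < 0 || fd >= FDTAB_MAX)
        return false;
    pthread_rwlock_rdlock(&t->lock);
    open = t->in_use[fd];
    pthread_rwlock_unlock(&t->lock);
    return open;
}

/* Writers are exclusive: opening/closing an fd changes the table. */
static void fdtab_set(fdtab_t *t, int fd, bool open)
{
    assert(fd >= 0 && fd < FDTAB_MAX);
    pthread_rwlock_wrlock(&t->lock);
    t->in_use[fd] = open;
    pthread_rwlock_unlock(&t->lock);
}
```

Lookups should be far more common than opens/closes, which is why a
r/w lock looks like a better fit here than a plain mutex.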

- Functions/namespaces/types:
All three are global resources... the problem is that this may get
messy since the code is very intertwined/interconnected with many
other places

- Questions:
    - How can we separate the pool of global variables from the pool
of function-local variables ?
    - How should locking be done for the |nv_*()| API ? Technically
function-local variables do not require any locking (because they are
only visible to the current thread) ... but the global ones need it.
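
One possible shape for the second question, purely as a strawman: tag
each variable with its scope and take the lock only for global ones.
None of this is the |nv_*()| API; |var_t|, |var_get()|, |var_set()|
and the single global mutex are all hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>

typedef enum { SCOPE_LOCAL, SCOPE_GLOBAL } var_scope_t;

/* Hypothetical variable node; real name-value nodes are more complex. */
typedef struct {
    var_scope_t scope;
    long value;
} var_t;

/* Hypothetical single lock protecting all global variables. */
static pthread_mutex_t global_vars_lock = PTHREAD_MUTEX_INITIALIZER;

/* Store a value; only global variables pay for the lock. */
static void var_set(var_t *v, long value)
{
    assert(v != NULL);
    if (v->scope == SCOPE_GLOBAL) {
        pthread_mutex_lock(&global_vars_lock);
        v->value = value;
        pthread_mutex_unlock(&global_vars_lock);
    } else {
        v->value = value;   /* thread-private, no locking needed */
    }
}

/* Fetch a value with the same locking rule as var_set(). */
static long var_get(var_t *v)
{
    long value;
    assert(v != NULL);
    if (v->scope != SCOPE_GLOBAL)
        return v->value;
    pthread_mutex_lock(&global_vars_lock);
    value = v->value;
    pthread_mutex_unlock(&global_vars_lock);
    return value;
}
```

This keeps the fast path for function-local variables lock-free, but
it only works if the scope of a variable is known at lookup time,
which leads straight back to the first question.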

What else did I miss ?

----

Bye,
Roland

-- 
  __ .  . __
 (o.\ \/ /.o) roland.ma...@nrubsig.org
  \__\/\/__/  MPEG specialist, C&&JAVA&&Sun&&Unix programmer
  /O /==\ O\  TEL +49 641 3992797
 (;O/ \/ \O;)
_______________________________________________
ast-developers mailing list
ast-developers@research.att.com
https://mailman.research.att.com/mailman/listinfo/ast-developers
