Now, it is fairly straightforward to translate most of c--'s constructs
to something compilable by gcc with minimal processor-dependent code,
mainly just enough to pull values out of the stack frame for
continuations and to longjmp between threads.
The main thing that can't be done is get at the info needed for c--'s
garbage collection facilities. I have had some luck decoding the DWARF
debugging info attached to object files, which provides exactly the
information we need. A small example is below, obtained via the
'dwarfdump' program, which also has a library for getting at the same
info:
> <1>< 791> DW_TAG_subprogram
> DW_AT_sibling <896>
> DW_AT_external yes(1)
> DW_AT_name command_read
> DW_AT_decl_file 1 /lhome/john/prog/norm/command.c
> DW_AT_decl_line 16
> DW_AT_prototyped yes(1)
> DW_AT_low_pc 0x8048be3
> DW_AT_high_pc 0x8048d58
> DW_AT_frame_base <loclist with 3 entries follows>
> [ 0]<lowpc=0x13><highpc=0x14><from .debug_loc offset 0x1ce>DW_OP_breg4+4
> [ 1]<lowpc=0x14><highpc=0x16><from .debug_loc offset 0x1da>DW_OP_breg4+8
> [ 2]<lowpc=0x16><highpc=0x188><from .debug_loc offset 0x1e6>DW_OP_breg5+8
> <2>< 816> DW_TAG_variable
> DW_AT_name cbuf
> DW_AT_decl_file 1 /lhome/john/prog/norm/command.c
> DW_AT_decl_line 17
> DW_AT_type <627>
> DW_AT_location <loclist with 2 entries follows>
> [ 0]<lowpc=0x25><highpc=0xf4><from .debug_loc offset 0x1fa>DW_OP_reg6
> [ 1]<lowpc=0xf7><highpc=0x188><from .debug_loc offset 0x205>DW_OP_reg6
> <2>< 831> DW_TAG_variable
> DW_AT_name cbuf_size
> DW_AT_decl_file 1 /lhome/john/prog/norm/command.c
> DW_AT_decl_line 18
> DW_AT_type <503>
> DW_AT_location <loclist with 3 entries follows>
> [ 0]<lowpc=0x2d><highpc=0xf3><from .debug_loc offset 0x218>DW_OP_reg3
> [ 1]<lowpc=0xf7><highpc=0x15a><from .debug_loc offset 0x223>DW_OP_reg3
> [ 2]<lowpc=0x161><highpc=0x188><from .debug_loc offset 0x22e>DW_OP_reg3
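So, what this means (I think; I am still experimenting) is that
there is a function, command_read, with two variables, cbuf and
cbuf_size.
When the program counter is between 0x8048be3 and 0x8048d58, you
are in command_read. All other program counter values are offsets rather
than absolute addresses. They look relative to the base address of the
compilation unit, not of the function: the frame_base ranges run from
0x13 to 0x188, which is exactly the 0x175 bytes of command_read, so the
function itself starts at offset 0x13.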
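As a rough sketch of the address arithmetic involved (my own
illustration, not something dwarfdump produces): the helper names are
made up, and the base address is an assumption inferred from the dump,
namely DW_AT_low_pc 0x8048be3 minus the 0x13 at which the frame_base
ranges begin.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the dwarfdump output for command_read. */
#define COMMAND_READ_LOW_PC  0x8048be3u
#define COMMAND_READ_HIGH_PC 0x8048d58u  /* first address past the function */

/* Assumed base for the loclist offsets: low_pc minus the 0x13 at which
   the frame_base list begins. Inferred, not read from the dump. */
#define ASSUMED_BASE (COMMAND_READ_LOW_PC - 0x13u)

/* Nonzero if pc lies inside command_read. */
static int in_command_read(uintptr_t pc)
{
    return pc >= COMMAND_READ_LOW_PC && pc < COMMAND_READ_HIGH_PC;
}

/* Convert an absolute pc into the offset form used by the loclists. */
static uintptr_t pc_to_offset(uintptr_t pc)
{
    assert(pc >= ASSUMED_BASE);
    return pc - ASSUMED_BASE;
}
```

With these numbers the first instruction of command_read maps to offset
0x13 and the last byte of the function to 0x187, which matches the
ranges in the dump.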
First we need to identify the pointers. cbuf has a type of <627>, which
here is a pointer, and cbuf_size has one of <503>, which is an int. These
numbers refer to other entries in the debugging info, so which ones
represent pointers is easy to look up (or you can just remember the
variable names you generate). The important bit is the location
field:
> DW_AT_location <loclist with 2 entries follows>
> [ 0]<lowpc=0x25><highpc=0xf4><from .debug_loc offset 0x1fa>DW_OP_reg6
> [ 1]<lowpc=0xf7><highpc=0x188><from .debug_loc offset 0x205>DW_OP_reg6
What this means is that in the given program counter ranges, that
variable is stored in register 6. Variables are often stored in
registers, but their locations can also be given as offsets relative to
the frame pointer or as absolute memory locations. There is only one more
complication, the frame_base: rather than actually changing the stack
pointer when pushing and popping things, the compiler keeps track of the
offset internally, only modifying the stack pointer lazily when it makes
a function call. The frame_base section attached to the function records
these changes relative to the program counter.
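To make the location lists concrete, here is a minimal sketch of how
they might be represented and queried at run time. The struct and
function names are hypothetical; only the numbers come from the dump
(the loclists for cbuf and for the frame_base of command_read). On i386
I believe DWARF registers 4, 5 and 6 are esp, ebp and esi, but treat
that as an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Where a value lives: in a register, or at register + offset. */
enum loc_kind { LOC_REG, LOC_BREG };

struct loc_entry {
    uintptr_t lowpc;     /* first offset covered */
    uintptr_t highpc;    /* first offset NOT covered */
    enum loc_kind kind;
    int reg;             /* DWARF register number */
    long offset;         /* only meaningful for LOC_BREG */
};

/* DW_AT_location of cbuf, transcribed from the dump. */
static const struct loc_entry cbuf_loc[] = {
    { 0x25, 0xf4,  LOC_REG, 6, 0 },
    { 0xf7, 0x188, LOC_REG, 6, 0 },
};

/* DW_AT_frame_base of command_read, transcribed from the dump. */
static const struct loc_entry frame_base_loc[] = {
    { 0x13, 0x14,  LOC_BREG, 4, 4 },
    { 0x14, 0x16,  LOC_BREG, 4, 8 },
    { 0x16, 0x188, LOC_BREG, 5, 8 },
};

/* Return the entry covering pc_off, or NULL if no location is
   recorded for that offset. */
static const struct loc_entry *
loc_lookup(const struct loc_entry *list, size_t n, uintptr_t pc_off)
{
    size_t i;
    assert(list != NULL);
    for (i = 0; i < n; i++)
        if (pc_off >= list[i].lowpc && pc_off < list[i].highpc)
            return &list[i];
    return NULL;
}
```

Note the gap between 0xf4 and 0xf7 in cbuf's list: a lookup there
returns NULL, so a root finder would have to decide what a missing
location means (presumably that the variable holds nothing useful at
that point, though I have not verified this).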
This is exactly the info we need to locate all roots: just walk the
stack, perform a fast lookup on the stored program counter locations, and
get a list of the pointers out. Quite straightforward.
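The "fast lookup" could be as simple as a binary search over a table
sorted by address. The sketch below assumes such a table exists; the
struct, the function names and the second table row are hypothetical,
and an array of saved return addresses stands in for a real stack walk,
which cannot be written portably.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One row per compiled function, sorted by low_pc. */
struct func_roots {
    uintptr_t low_pc;
    uintptr_t high_pc;   /* exclusive */
    const char *name;
    int npointers;       /* number of pointer-typed variables */
};

/* First row is from the dump (cbuf is its one pointer);
   the second row is invented for illustration. */
static const struct func_roots root_table[] = {
    { 0x8048be3, 0x8048d58, "command_read", 1 },
    { 0x8048d60, 0x8048e00, "hypothetical_next", 2 },
};
#define ROOT_TABLE_LEN (sizeof root_table / sizeof root_table[0])

/* Binary search for the row whose range contains pc. */
static const struct func_roots *
find_func(const struct func_roots *t, size_t n, uintptr_t pc)
{
    size_t lo = 0, hi = n;
    assert(t != NULL);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (pc < t[mid].low_pc)
            hi = mid;
        else if (pc >= t[mid].high_pc)
            lo = mid + 1;
        else
            return &t[mid];
    }
    return NULL;
}

/* Sum pointer counts over a list of saved return addresses.
   Addresses outside the table (non-c-- frames) are skipped. */
static int count_roots(const uintptr_t *ret_addrs, size_t n)
{
    int total = 0;
    size_t i;
    for (i = 0; i < n; i++) {
        const struct func_roots *f =
            find_func(root_table, ROOT_TABLE_LEN, ret_addrs[i]);
        if (f != NULL)
            total += f->npointers;
    }
    return total;
}
```

A real table would carry the per-variable location lists rather than
just a count, but the lookup would have the same shape.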
I imagine the easiest way to take advantage of this would be to compile
the c-- program to C with an uninitialized pointer to the stack frame
info, compile it with gcc, then have another program
read the DWARF info, create an optimized table, add it to the executable
as a constant data section, and poke its address into the uninitialized
stack frame info pointer.
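On the generated-C side, the placeholder might look something like the
sketch below. All names are hypothetical. One deviation from the
description above, and it is my assumption about typical ELF toolchains:
a pointer left zero would normally land in .bss, which has no bytes in
the file to overwrite, so the sketch initializes it to a sentinel
instead. Relocations and position-independent executables are ignored
here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical header of the table the post-link tool would append. */
struct gc_frame_table {
    size_t nfuncs;       /* per-function rows would follow */
};

/* Sentinel meaning "this executable has not been patched yet". */
static const struct gc_frame_table gc_unpatched = { 0 };

/* The word the post-link tool would overwrite with the table address. */
const struct gc_frame_table *volatile gc_frame_table_ptr = &gc_unpatched;

/* Nonzero once the pointer no longer refers to the sentinel. */
int gc_table_ready(void)
{
    return gc_frame_table_ptr != &gc_unpatched;
}

/* Accessor for the collector; refuses to run on an unpatched binary. */
const struct gc_frame_table *gc_get_table(void)
{
    assert(gc_table_ready());
    return gc_frame_table_ptr;
}
```

The volatile qualifier is there so the compiler does not fold the
initial value into the comparison, since the pointer changes behind its
back.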
The task can be simplified by telling gcc that your garbage collection
function touches all registers (there is an attribute for this), so it
will nicely flush all values to memory for easy access by your root
finder.
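The attribute is not named above. One candidate, and this is my guess
rather than a confirmed choice, is returns_twice: the GCC documentation
says the compiler ensures all registers are dead before calling such a
function. Whether that is sufficient for a root finder is untested. The
function and macro names below are hypothetical.

```c
#include <assert.h>

/* Assumption: returns_twice is the kind of attribute meant. noinline
   keeps the call visible as a real call site. */
#if defined(__GNUC__)
#define GC_FLUSH_REGS __attribute__((returns_twice, noinline))
#else
#define GC_FLUSH_REGS
#endif

static int gc_in_progress;  /* guards against re-entry in this sketch */
static int gc_runs;         /* counts collections so the stub does something */

/* Hypothetical collector entry point; a real one would walk the stack
   and consult the DWARF-derived table here. */
GC_FLUSH_REGS int gc_collect(void)
{
    assert(!gc_in_progress);
    gc_in_progress = 1;
    gc_runs++;
    gc_in_progress = 0;
    return gc_runs;
}
```

If this works as hoped, values live across a call to gc_collect would
sit in stack slots rather than registers at the call site.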
I probably won't have time to run with this idea... but I thought I'd
throw it out there as a possibility if someone else wanted to work on it
or had comments...
John
--
John Meacham - ⑆repetae.net⑆john⑈