On Jan 16, 2008, at 7:39 PM, Geoffrey Broadwell wrote:
I am starting to implement a GLUT and OpenGL binding for Parrot. GLUT
is extremely callback-oriented.
Unfortunately, none of the GLUT callbacks fall within the current
limitations on Parrot NCI callbacks.
As you've discovered, callbacks are easy enough to implement portably
provided that an opaque pointer-sized value is one of the arguments
-- you register a single C function, which uses information stored in
the opaque structure to dispatch to the real handler -- and it's
unfortunate that GLUT's callbacks don't support this.
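For reference, the opaque-pointer pattern described above can be sketched in C roughly as follows. This is only an illustration, not Parrot's NCI code: `lib_callback`, `lib_fire`, `dispatch_info`, `generic_dispatch`, and `record_id` are all hypothetical names invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical library-side callback type: the library hands back the
   opaque pointer that was supplied at registration time. */
typedef void (*lib_callback)(void *user_data);

/* Hypothetical per-registration record stored behind the opaque pointer. */
struct dispatch_info {
    void (*real_handler)(int id);  /* the handler we actually want to run */
    int id;                        /* whatever context that handler needs */
};

int last_id_seen = -1;

/* Example "real" handler: just records which registration fired. */
void record_id(int id)
{
    last_id_seen = id;
}

/* The single C function registered with the library.  It recovers the
   record from the opaque pointer and dispatches to the real handler. */
void generic_dispatch(void *user_data)
{
    struct dispatch_info *info = user_data;
    assert(info != NULL && info->real_handler != NULL);
    info->real_handler(info->id);
}

/* Stand-in for the library firing an event on a registered callback. */
void lib_fire(lib_callback cb, void *user_data)
{
    cb(user_data);
}
```

Registering `generic_dispatch` with two different `dispatch_info` records gives two distinct logical callbacks from one C function, which is exactly what a callback signature without a user-data argument rules out.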
So ... how can Parrot support callbacks of the types that GLUT uses?
The trick of putting magical properties on a special user data parameter
won't work anymore. I've been thinking about this for a while, and have
come up with several possible schemes.
My first idea was that each time a new callback was registered, Parrot
would copy a tiny shim function and then poke the address(es) of the
data it needed to figure out what PIR routine to call directly into the
copied shim code.
This requires special knowledge per platform.
So my next idea was to have a shim function that did nothing but call
another function, thus putting its own address on the stack. The other
function would then reach up above its own stack frame and grab that
return address, using *that address* to look up the needed information
in a global callback registry. By making copies of this simple shim
every time a new callback was registered, each copy would get its own
address, and everything would magically work.
This was certainly better than the previous ideas, because it didn't
involve poking anything into the shim code, just being able to copy
it.
But it still requires per-platform support, at least for retrieving
the stack frame pointer.
But I still don't know how to do that in pure portable C. My guess is
that you could declare another function (which need not do anything) in
some special way so that it is guaranteed to be compiled into the object
code directly after the magic shim. Subtracting the address of the shim
function from the dummy function would then tell you exactly how many
bytes long the compiled shim is (including perhaps some padding, which
doesn't matter). This would allow you to copy the shim at will.
I've seen this done (so it 'works') but the underlying assumption
isn't portable.
But this still leaves the problems of:
A. Being able to portably force two functions to be compiled
sequentially in memory. I would hope this is easy, but I don't
recall which set of declarations makes exactly this guarantee.
B. Being able to execute a copy of the shim on architectures that have
non-executable stack and/or heap. I assume this had to be solved
before JIT could work, so I'm guessing this isn't a real problem.
C. Being able to reach up above one's own stack frame and grab the
return address from the shim function's dummy call. Is there a
trick to make this work in pure portable C?
No. What you're trying to do can't be done portably. Period.
However, once you place per-platform implementation on the table, it
becomes simple (in theory, at least). Rather than teach Parrot to
cope with GLUT callbacks, I suggest a scheme for creating 'GLUT-prime'
callbacks which accept the opaque structure pointer you're familiar
with, which Parrot can already deal with.
For example: Here's a callback interface that includes the opaque
handle:
void generic_handler( void* user_data );
struct foo foo_data;
install_generic_handler( &generic_handler, &foo_data );
Here's one without:
struct bar bar_data;
typedef void (*handler)();
handler bar_handler = new_handler( &generic_handler, &bar_data );
install_handler( bar_handler );
The new_handler() function creates a thunk that loads the data
argument and jumps to the generic handler. In 68K, the thunk code
might (if it were written out) look like:
asm void thunk_zero_args()
{
    MOVEA.L #$11223344,A0;    // load immediate handler address into A0
    MOVE.L  #$55667788,-(SP); // load immediate data pointer onto stack
    JMP     (A0)
}
asm void thunk_one_arg( void* arg1 )
{
    MOVEA.L #$11223344,A0;    // load immediate handler address into A0
    MOVE.L  #$55667788,-(SP); // load immediate data pointer onto stack
    MOVE.L  8(SP),-(SP);      // load previous arg 1 onto stack
    JMP     (A0)
}
asm void thunk_two_args( void* arg1, void* arg2 )
{
    MOVEA.L #$11223344,A0;    // load immediate handler address into A0
    MOVE.L  #$55667788,-(SP); // load immediate data pointer onto stack
    MOVE.L  12(SP),-(SP);     // load previous arg 2 onto stack
    MOVE.L  12(SP),-(SP);     // load previous arg 1 onto stack
    JMP     (A0)
}
Etc. Except you'd generate the actual machine code and store it
somewhere, returning the address from new_handler() (after calling
the magic OS routine that makes it executable).
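To make "generate the actual machine code and store it somewhere" a little more concrete, here is a rough C sketch that only builds the bytes of the zero-argument thunk into a caller-supplied buffer. It mirrors the thunk_zero_args listing as written and does not try to execute the result or make it executable. The function name `emit_thunk_zero_args` and its helpers are hypothetical, the addresses are assumed to be 32-bit, and the opcode words (0x207C, 0x2F3C, 0x4ED0) are the 68K encodings as I understand them, so check them against the processor manual before relying on them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Size in bytes of the zero-argument thunk: three opcode words plus
   two 32-bit immediates. */
enum { THUNK_ZERO_ARGS_SIZE = 14 };

/* 68K is big-endian, so store values most significant byte first,
   regardless of the byte order of the machine running this code. */
static void put16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v >> 8);
    p[1] = (uint8_t)(v & 0xFF);
}

static void put32(uint8_t *p, uint32_t v)
{
    put16(p, (uint16_t)(v >> 16));
    put16(p + 2, (uint16_t)(v & 0xFFFF));
}

/* Hypothetical helper: write the thunk_zero_args byte sequence into
   `out`, which must have room for THUNK_ZERO_ARGS_SIZE bytes.
   Returns the number of bytes written. */
size_t emit_thunk_zero_args(uint8_t *out,
                            uint32_t handler_addr,
                            uint32_t data_addr)
{
    assert(out != NULL);
    put16(out + 0, 0x207C);        /* MOVEA.L #imm32,A0   */
    put32(out + 2, handler_addr);  /*   handler address   */
    put16(out + 6, 0x2F3C);        /* MOVE.L #imm32,-(SP) */
    put32(out + 8, data_addr);     /*   data pointer      */
    put16(out + 12, 0x4ED0);       /* JMP (A0)            */
    return THUNK_ZERO_ARGS_SIZE;
}
```

A real new_handler() would still have to place these bytes in memory the OS allows to be executed and flush any instruction cache, which is the per-platform part.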
Now that I think about it, this is just machine-level currying.
Actually turning this into working code and implementing this for
other platforms are left as exercises for the reader.
I haven't written viruses either. Not that there's anything *wrong*
with that... oh, wait.
Josh