On Jan 16, 2008, at 7:39 PM, Geoffrey Broadwell wrote:

I am starting to implement a GLUT and OpenGL binding for Parrot.  GLUT
is extremely callback-oriented.

Unfortunately, none of the GLUT callbacks fall within the current
limitations on Parrot NCI callbacks.

As you've discovered, callbacks are easy enough to implement portably provided that an opaque pointer-sized value is one of the arguments -- you register a single C function, which uses information stored in the opaque structure to dispatch to the real handler -- and it's unfortunate that GLUT's callbacks don't support this.
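To make the opaque-pointer pattern concrete, here is a minimal sketch in C. Everything in it is hypothetical and for illustration only: `lib_register` and `lib_fire` stand in for a library that passes user data through (which GLUT does not), and `callback_info` is an invented record, not a Parrot structure.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-registration record; in a real binding this would
 * identify the high-level routine to invoke. */
struct callback_info {
    void (*real_handler)(int event, struct callback_info *self);
    int last_event;             /* state owned by this registration */
};

/* The single C function registered for every callback. It recovers the
 * record from the opaque pointer and dispatches to the real handler. */
static void generic_dispatch(int event, void *user_data)
{
    struct callback_info *info = user_data;
    info->real_handler(event, info);
}

/* Hypothetical stand-in for a library that hands user data back. */
static void (*lib_cb)(int, void *);
static void *lib_data;

static void lib_register(void (*cb)(int, void *), void *user_data)
{
    lib_cb = cb;
    lib_data = user_data;
}

static void lib_fire(int event)
{
    assert(lib_cb != NULL);     /* a callback must have been registered */
    lib_cb(event, lib_data);
}

/* Example real handler: remembers the event it was given. */
static void record_event(int event, struct callback_info *self)
{
    self->last_event = event;
}
```

Registering `generic_dispatch` together with the address of a `callback_info` is all the per-callback setup this needs; the library never has to know what the pointer means.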

So ... how can Parrot support callbacks of the types that GLUT uses?
The trick of putting magical properties on a special user data parameter won't work anymore. I've been thinking on this for a while, and come up
with several possible schemes.

My first idea was that each time a new callback was registered, Parrot
would copy a tiny shim function and then poke the address(es) of the
data it needed to figure out what PIR routine to call directly into the
copied shim code.

This requires special knowledge per platform.

So my next idea was to have a shim function that did nothing but call
another function, thus putting its own address on the stack. The other
function would then reach up above its own stack frame and grab that
return address, using *that address* to look up the needed information
in a global callback registry.  By making copies of this simple shim
every time a new callback was registered, each copy would get its own
address, and everything would magically work.

This was certainly better than the previous ideas, because it didn't
involve poking anything into the shim code, just being able to copy it.

But it still requires per-platform support, at least for retrieving the stack frame pointer.
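The registry half of that scheme can be sketched in portable C if you give up on copying code at run time and instead compile a fixed pool of shims ahead of time. To be clear, this is a swapped-in approximation, not the scheme described above: each shim knows its own slot number rather than discovering a return address, and the pool size caps the number of live callbacks. All names here (`bind_shim`, `MAX_SHIMS`, and so on) are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SHIMS 4                           /* arbitrary pool size */

typedef void (*glut_style_cb)(void);          /* no user-data argument */
typedef void (*bound_handler)(void *user_data);

/* Global callback registry, indexed by shim slot. */
static struct { bound_handler fn; void *data; } registry[MAX_SHIMS];

static void dispatch(int slot)
{
    assert(registry[slot].fn != NULL);
    registry[slot].fn(registry[slot].data);
}

/* Each shim is a separately compiled function, so each one has its own
 * address and its own hard-wired slot number. */
#define DEFINE_SHIM(n) static void shim_##n(void) { dispatch(n); }
DEFINE_SHIM(0)
DEFINE_SHIM(1)
DEFINE_SHIM(2)
DEFINE_SHIM(3)

static const glut_style_cb shims[MAX_SHIMS] = {
    shim_0, shim_1, shim_2, shim_3
};

/* Binds (fn, data) to a free shim and returns it, or NULL once the
 * pool is exhausted. */
static glut_style_cb bind_shim(bound_handler fn, void *data)
{
    for (int i = 0; i < MAX_SHIMS; i++) {
        if (registry[i].fn == NULL) {
            registry[i].fn = fn;
            registry[i].data = data;
            return shims[i];
        }
    }
    return NULL;
}

/* Example handler: increments the int it is pointed at. */
static void bump(void *user_data)
{
    ++*(int *)user_data;
}
```

The pointer returned by `bind_shim` can be handed to an API that takes a bare `void (*)(void)`. The cost is the hard limit on simultaneous registrations, which is exactly what run-time copying was meant to avoid.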

But I still don't know how to do that in pure portable C.  My guess is
that you could declare another function (which need not do anything) in some special way so that it is guaranteed to be compiled into the object code directly after the magic shim. Subtracting the address of the shim
function from the dummy function would then tell you exactly how many
bytes long the compiled shim is (including perhaps some padding, which
doesn't matter).  This would allow you to copy the shim at will.

I've seen this done (so it 'works') but the underlying assumption isn't portable.

But this still leaves the problems of:

A. Being able to portably force two functions to be compiled
   sequentially in memory.  I would hope this is easy, but I don't
   recall which set of declarations makes exactly this guarantee.

B. Being able to execute a copy of the shim on architectures that have
   non-executable stack and/or heap.  I assume this had to be solved
   before JIT could work, so I'm guessing this isn't a real problem.

C. Being able to reach up above one's own stack frame and grab the
   return address from the shim function's dummy call.  Is there a
   trick to make this work in pure portable C?

No.  What you're trying to do can't be done portably.  Period.

However, once you place per-platform implementation on the table, it becomes simple (in theory, at least). Rather than teach Parrot to cope with GLUT callbacks, I suggest a scheme for creating 'GLUT-prime' callbacks which accept the opaque structure pointer you're familiar with, which Parrot can already deal with.

For example: Here's a callback interface that includes the opaque handle:

        void generic_handler( void* user_data );
        
        struct foo foo_data;
        
        install_generic_handler( &generic_handler, &foo_data );

Here's one without:

        struct bar bar_data;
        
        typedef void (*handler)();
        
        handler bar_handler = new_handler( &generic_handler, &bar_data );
        
        install_handler( bar_handler );

The new_handler() function creates a thunk that loads the data argument and jumps to the generic handler. In 68K, the thunk code might (if it were written out) look like:

        asm void thunk_zero_args()
        {
                MOVEA.L #$11223344,A0;  // load immediate handler address into A0
                MOVE.L  #$55667788,-(SP);  // load immediate data pointer onto stack
                JMP (A0)
        }

        asm void thunk_one_arg( void* arg1 )
        {
                MOVEA.L #$11223344,A0;  // load immediate handler address into A0
                MOVE.L  #$55667788,-(SP);  // load immediate data pointer onto stack
                MOVE.L  8(SP),-(SP);  // load previous arg 1 onto stack
                JMP (A0)
        }

        asm void thunk_two_args( void* arg1, void* arg2 )
        {
                MOVEA.L #$11223344,A0;  // load immediate handler address into A0
                MOVE.L  #$55667788,-(SP);  // load immediate data pointer onto stack
                MOVE.L  12(SP),-(SP);  // load previous arg 2 onto stack
                MOVE.L  12(SP),-(SP);  // load previous arg 1 onto stack
                JMP (A0)
        }

Etc. Except you'd generate the actual machine code and store it somewhere, returning the address from new_handler() (after calling the magic OS routine that makes it executable).
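For a feel of what that looks like on one current platform, here is a rough sketch of a zero-argument new_handler() for x86-64 under the System V calling convention (Linux or macOS), using mmap() and mprotect() as the "magic OS routine". Treat it as an illustration under those assumptions rather than production code: it returns NULL everywhere else, burns a whole page per thunk, never frees it, and relies on converting between object and function pointers, which ISO C does not guarantee. The `remember` target is a made-up example handler.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1       /* exposes MAP_ANON under strict -std= modes */
#endif
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

typedef void (*handler)(void);
typedef void (*generic_handler_fn)(void *user_data);

/* Builds a zero-argument thunk that ends up calling target(data).
 * Returns NULL on platforms this sketch does not cover, or if the OS
 * refuses the mapping. */
static handler new_handler(generic_handler_fn target, void *data)
{
    assert(target != NULL);
#if defined(__x86_64__) && (defined(__linux__) || defined(__APPLE__))
    unsigned char code[] = {
        0x48, 0xBF, 0, 0, 0, 0, 0, 0, 0, 0,  /* movabs rdi, <data>   */
        0x48, 0xB8, 0, 0, 0, 0, 0, 0, 0, 0,  /* movabs rax, <target> */
        0xFF, 0xE0                           /* jmp rax              */
    };
    uintptr_t data_bits = (uintptr_t)data;
    uintptr_t target_bits = (uintptr_t)target;

    /* Patch the two 64-bit immediates into the template. */
    memcpy(code + 2, &data_bits, sizeof data_bits);
    memcpy(code + 12, &target_bits, sizeof target_bits);

    /* Write the code into a fresh writable page, then flip it to
     * read+execute so it is never writable and executable at once. */
    void *mem = mmap(NULL, sizeof code, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANON, -1, 0);
    if (mem == MAP_FAILED)
        return NULL;
    memcpy(mem, code, sizeof code);
    if (mprotect(mem, sizeof code, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, sizeof code);
        return NULL;
    }
    return (handler)mem;
#else
    (void)data;
    return NULL;
#endif
}

/* Example target: records the user data it was handed. */
static void *last_user_data;

static void remember(void *user_data)
{
    last_user_data = user_data;
}
```

Because the first argument travels in a register on this ABI, the thunk can jump straight to the target with the caller's return address still on top of the stack. Thunks for callbacks that take arguments of their own would have to shuffle registers first.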

Now that I think about it, this is just machine-level currying.

Actually turning this into working code and implementing this for other platforms are left as exercises for the reader.


I haven't written viruses either. Not that there's anything *wrong* with that... oh, wait.

Josh

