On Jan 7, 2004, at 10:39 AM, Adam Thomason wrote:

The 40 ops range from 0x200fca28 to 0x200fca90, with 0x200fca94 onward being
".long 0x0". The pc at failure is 0x7c0802a4. So it's probably safe to
assume the trouble is pre-pasm.

Yep, you're right. Thanks for all of the great information--I'm pretty sure I now understand what's going on. (But I don't know that I have a fix.) So here's the scoop:


On AIX, function calls via pointer expect that the supplied address isn't a pointer to where the code starts, but rather a pointer to a structure which contains a pointer to where the code starts, as well as some other information (a TOC pointer, and an optional environment pointer). Since that's not what jit_code is, it blows up. How it blows up is that rather than jumping to jit_code, it jumps to *jit_code--that is, it dereferences jit_code and interprets that value as a function pointer. But that value is actually going to be the numeric value of the first instruction:

(gdb) x/i jit_code
0x1024c00:      mflr    r0
(gdb) x/w jit_code
0x1024c00:      0x7c0802a6

You see above that the numeric value of that instruction is 0x7c0802a6. But instructions have to be on 4-byte boundaries, and for this reason the branch instructions ignore the 2 low-order bits (i.e., they round down to a multiple of 4), so branching to 0x7c0802a6 really lands you at 0x7c0802a4, which is where you are crashing.
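
To make the arithmetic concrete, here is a tiny sketch in portable C of the rounding described above. The function name branch_target is made up for illustration; it only models "ignore the 2 low-order bits" and is not anything from Parrot or AIX:

```c
#include <stdint.h>

/* Hypothetical helper: models a branch that ignores the 2 low-order
   bits of its target, i.e. rounds down to a multiple of 4. */
uint32_t branch_target(uint32_t value)
{
    return value & ~(uint32_t)3;
}
```

Feeding it the instruction word from the gdb output, branch_target(0x7c0802a6), gives 0x7c0802a4, which matches the pc at failure.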

(This extra level of indirection for function pointers is part of the AIX scheme for doing indirect function calls, and the _ptrgl is some boilerplate code for pulling out the pieces of that struct and putting them into the correct registers. If you use "stepi" instead of "nexti", which I almost suggested before but then didn't, you should see that you get into the _ptrgl code okay, but then crash at the bctr instruction. "nexti" steps over function calls, so it's being misleading in this case--"stepi" will continue to step an instruction at a time across branches.)
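
Here is a rough model of that indirection in plain C, in case it helps to see it spelled out. Everything in it is an assumption for illustration: the struct and field names (func_desc, entry, toc, env) are mine, and call_via_descriptor is only a portable stand-in for what the glue code does, not the real _ptrgl:

```c
#include <stddef.h>

typedef int (*entry_fn)(int);

/* Hypothetical descriptor layout, following the description above:
   a pointer to the code, then a TOC pointer, then an optional
   environment pointer. */
struct func_desc {
    entry_fn entry;  /* where the code actually starts */
    void    *toc;    /* would be loaded into r2 by the real glue */
    void    *env;    /* optional environment pointer */
};

/* Some ordinary function to call through the descriptor. */
int add_one(int x)
{
    return x + 1;
}

/* Stand-in for the glue: pull the entry point out of the descriptor
   and branch to it. The real glue would also set up the TOC and
   environment registers; this model just ignores those fields. */
int call_via_descriptor(const struct func_desc *fd, int arg)
{
    return fd->entry(arg);
}
```

With a descriptor initialized as { add_one, NULL, NULL }, call_via_descriptor(&fd, 41) calls add_one and returns 42. Handing the glue a raw code address instead of a descriptor is the mistake described above: it would read the first instruction word as if it were the entry field.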

So the problem is that the calling conventions differ between AIX and Mac OS X, and I think that we might need (at least) a different Parrot_jit_begin() and Parrot_jit_normal_op() for AIX. The above problem might not matter once we enter the JIT-generated code, but AIX also uses r2 as a TOC pointer and we'll probably need to maintain that properly. (At least, the NCI code will need to worry about that, once we are using a JIT-ish scheme on ppc to generate the needed stubs, since those will be jumping between libraries.) The AIX calling conventions seem to be closer to the Mac OS 9 conventions than to those for Mac OS X.

Just for fun, try changing runops_jit() to something like the following, and see what happens. I expect it to still blow up somewhere, but it might get further:

static opcode_t *
runops_jit(struct Parrot_Interp *interpreter, opcode_t *pc)
{
    struct { jit_f functPtr; void *toc; void *env; } temp;

#if JIT_CAPABLE
    jit_f jit_code = (jit_f) D2FPTR(init_jit(interpreter, pc));
    temp.functPtr = jit_code;
    temp.toc = NULL; /* should fill with current TOC, but I don't know
                        how to find out what that is */
    temp.env = NULL;
    /* call through the address of the struct, not the code address */
    ((jit_f) D2FPTR(&temp)) (interpreter, pc);
#endif
#endif
    return NULL;
}

I didn't compile that to make sure that the syntax is okay, but the idea is to make jit_code the first member of a struct big enough to hold 3 pointers, and to use a pointer to that as the function pointer.

One of the first things I did to get the port to work was introduce the
'#ifndef __IBMC__' guard in ppc_sync_cache in jit/ppc/jit_emit.h, just to get
the file to compile (xlC doesn't support any form of inlined asm).

I saw that, and a couple of things in CVS comments, and jumped to the conclusion that it was working on AIX, but now I'm thinking that it must never have worked there.


Might the
absence of the sync cause the trouble? If so, now is probably the time to
patch up the build system to support assembling a separate .s file containing
the necessary snippet. I can look into doing that, and calling it from
ppc_sync_cache, but I've no idea if there's any trouble w.r.t not inlining
that piece, so help would be appreciated.

I think it should be okay for it not to be inline, and alternatively it might make sense to just implement ppc_sync_cache entirely in assembly (though I guess that's a little more work). But that seems not to be the problem in this case.


JEff


