Ian Romanick wrote:
Ian Romanick wrote:

The one caveat with this patch is the x86 & SSE codegen is disabled for all TexCoord and MultiTexCoord commands. If you look at the changes to r200_vtxfmt_c.c, you'll see that I had to make some changes to the way those routines work.


The previous patch is committed. The attached patch adds x86 & SSE codegen back. I've changed the way the codegen works just slightly.

Each codegen stub consists of a bit of assembly code that needs to be relocated / fixed up at run-time. Prepended to the assembly code is a small preamble that describes how to do this. The preamble contains the size of the assembly stub and an array of "fix-ups" that need to be done. The stub code follows immediately after the array of fix-ups.

At run-time, the function r200_do_codegen is called to create the executable stub. It is passed a pointer to the stub's preamble and an array of fix-up values. Each entry in the stub's fix-up array specifies a size, an offset in the stub, and an element index to use for the fix-up. This is similar to how a reloc table works in an object file.
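To make the layout described above concrete, here is a minimal sketch in C. The struct names, field names, and the function do_codegen_sketch are hypothetical and are not the actual r200 definitions; the sketch only assumes what the text states (a size, a fix-up array of size/offset/index triples, then the code). It also assumes a little-endian host when a fix-up is smaller than 4 bytes, and it returns a plain malloc'd buffer rather than executable memory.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout; names are illustrative, not the real r200 structs. */
struct fixup_entry {
    uint32_t size;    /* number of bytes to patch (at most 4 here) */
    uint32_t offset;  /* byte offset of the patch inside the stub code */
    uint32_t index;   /* element of the caller's value array to store */
};

struct stub_preamble {
    uint32_t code_size;   /* size of the assembly stub in bytes */
    uint32_t num_fixups;  /* number of fixup_entry records that follow */
    /* then: num_fixups fixup_entry records, then the stub code itself */
};

/* Copy the stub code and apply every fix-up from the preamble.
 * Returns a malloc'd copy, or NULL on allocation failure or a bad entry.
 * The caller must supply at least as many values as the largest index. */
static uint8_t *do_codegen_sketch(const void *stub, const uint32_t *values)
{
    assert(stub != NULL && values != NULL);

    const struct stub_preamble *pre = stub;
    const struct fixup_entry *fix = (const struct fixup_entry *)(pre + 1);
    const uint8_t *code = (const uint8_t *)(fix + pre->num_fixups);

    uint8_t *out = malloc(pre->code_size);
    if (out == NULL)
        return NULL;
    memcpy(out, code, pre->code_size);

    for (uint32_t i = 0; i < pre->num_fixups; i++) {
        /* Reject entries that would write outside the copied stub. */
        if (fix[i].size > sizeof(uint32_t) ||
            fix[i].offset > pre->code_size ||
            fix[i].size > pre->code_size - fix[i].offset) {
            free(out);
            return NULL;
        }
        memcpy(out + fix[i].offset, &values[fix[i].index], fix[i].size);
    }
    return out;
}
```

A real version would copy into memory that is allowed to execute; that part is left out here on purpose.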

There are two obvious advantages. First, if a stub is modified, it is likely that only one file (the file containing the stub) needs to be updated. Second, code size (in the form of FIXUP macros) is cut way down.


There are a couple of advantages to this that aren't fully realized in this code. This is a *lot* more cross-platform. The only difference between r200_makeX86TexCoord2f and r200_makeSSETexCoord2f (and the non-existent r200_makePowerPCTexCoord2f) is a single pointer passed to r200_do_codegen. This should make it possible to cut down on a lot of redundant code. Additionally, since the codegen stubs contain all the information needed to do the fix-ups, it should be possible to share common assembly stubs in multiple places (e.g., _x86_Vertex3f in r200, radeon, and t_vertex).

One disadvantage shows up if the codegen_stub structure is changed: all of the assembly files will also have to change. However, there won't be any compiler warnings for any that are "missed." We'll just get mysterious codegen-related bugs. :(

Another disadvantage is that this code seems to be more prone to cut-and-paste type errors.

If this new method is acceptable to everyone, I'll modify the rest of the codegen stubs in the R200 driver to use it. I'd really like to put some form of r200_do_codegen in a shared location so that other places that do codegen can re-use it.

I can't help thinking there's a "right" way to do these fixups and we're just not using it. For instance, I don't know how the assembler "marks" addresses requiring relocation so that ld.so can find them efficiently later on, or whether we could use the same or a similar mechanism. I know Linux does a related trick with its copy_from_user code by emitting labels or pointers to another section of the object file.


It seems like your approach still involves guessing (or pre-calculating) offsets into the generated machine-code. We've done a slightly different thing in the t_vtx_* codegen by using a distinctive dword (0x10101010+n, as it turns out), and basically doing a search & replace on that value, which seems to work. The C code still knows which order the fixups are supposed to occur in the code being fixed up. I guess the approaches could be combined, so that the 'n' value took on the same meaning as the 'entry' field in your structs, so that you might get something like:

GLOBL( _x86_MultiTexCoord2fv_stub )
        .long   _x86_MultiTexCoord2fv_end - _x86_MultiTexCoord2fv
        .long   2
        .long   4, FIXUP(0), 0
        .long   4, FIXUP(1), 1
_x86_MultiTexCoord2fv:
        movl    4(%esp), %eax
        movl    8(%esp), %ecx
        and     $TEX_TARGET_MASK, %eax

        movl    FIXUP(0)(,%eax,4), %edx # texcoord_size[unit] is 1, 2, or 3
        movl    FIXUP(1)(,%eax,4), %eax # texcoord_ptr[unit]

        decl    %edx
        jne     .3_2fv

        movl    (%ecx), %edx
        movl    %edx, (%eax)
        ret

etc.
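For comparison, the search & replace fix-up used in the t_vtx_* codegen could look roughly like the sketch below. The function name patch_marker and the FIXUP_MARKER macro are made up for illustration; only the 0x10101010+n marker value comes from the description above. The scan is byte-wise because immediates and displacements in x86 code are not aligned, and it assumes the marker pattern never occurs by accident in the real instruction bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FIXUP_MARKER 0x10101010u  /* marker base; fix-up n is marker + n */

/* Replace the first occurrence of (FIXUP_MARKER + n) in the code buffer
 * with value.  Returns 1 if a marker was patched, 0 if none was found.
 * patch_marker is a hypothetical helper, not an existing Mesa function. */
static int patch_marker(uint8_t *code, size_t code_size,
                        uint32_t n, uint32_t value)
{
    assert(code != NULL);

    const uint32_t marker = FIXUP_MARKER + n;

    for (size_t i = 0; i + sizeof marker <= code_size; i++) {
        if (memcmp(code + i, &marker, sizeof marker) == 0) {
            memcpy(code + i, &value, sizeof value);
            return 1;
        }
    }
    return 0;
}
```

With this scheme the C side needs no offsets at all, at the cost of a linear scan per fix-up and the small risk of a false match.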



Secondly, I see you're using a single TexCoord2f function to cope with all possible sizes of the texcoord in the actual emitted vertex. This is certainly the simplest approach, but it's worth pointing out that it's possible to have multiple versions of TexCoord2f, etc., which are specialized for each emitted texcoord size, and thereby eliminate the branches in your code. It's probably not significant, though.
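As a rough illustration of the specialized-variant idea, written in C rather than assembly: pick one variant when the vertex format is chosen, so the per-call path has no size test. All names here are hypothetical, and the behaviour for an emitted size of 3 (padding the third component with 0.0f) is an assumption made for the sketch, not something taken from the driver.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*texcoord2f_fn)(float *dest, float s, float t);

/* One variant per emitted texcoord size; none of them branches. */
static void texcoord2f_size1(float *dest, float s, float t)
{
    (void)t;            /* only one component is emitted */
    dest[0] = s;
}

static void texcoord2f_size2(float *dest, float s, float t)
{
    dest[0] = s;
    dest[1] = t;
}

static void texcoord2f_size3(float *dest, float s, float t)
{
    dest[0] = s;
    dest[1] = t;
    dest[2] = 0.0f;     /* assumption: pad the unused third component */
}

/* Called once when the vertex format changes, not once per vertex. */
static texcoord2f_fn choose_texcoord2f(int emitted_size)
{
    static const texcoord2f_fn table[] = {
        texcoord2f_size1, texcoord2f_size2, texcoord2f_size3
    };
    assert(emitted_size >= 1 && emitted_size <= 3);
    return table[emitted_size - 1];
}
```

The cost is three stubs instead of one per entry point, which is where the trade-off against code size comes in.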

Keith





