I'm *finally* getting back to this.  Sheesh...

Keith Whitwell wrote:

I can't help thinking there's a "right" way to do these fixups and we're just not using it. For instance, I don't know how the assembler "marks" addresses requiring relocation so that ld.so can find them efficiently later on - or whether we could use the same or a similar mechanism. I know Linux does a related trick with its copy_from_user code by emitting labels or pointers to another section of the object file.

It would be nice if we somehow had access to that data. To do that we'd need some sort of custom tool (probably) that could pre-process the .S file and generate a table of offsets.


It seems like your approach still involves guessing (or pre-calculating) offsets into the generated machine-code.

There is still some error-prone human work involved. A cycle of assemble -> objdump -D -> examine code works well for short stubs like this. I can see it becoming very unwieldy for larger stubs. Hmm...perhaps a Python script could be written to automate that process...


We've done a slightly different thing in the t_vtx_* codegen by using a distinctive dword (0x10101010+n, as it turns out), and basically doing a search & replace on that value, which seems to work. The C code still knows which order the fixups are supposed to occur in the code being fixed up. I guess the approaches could be combined, so that the 'n' value took on the same meaning as the 'entry' field in your structs, so that you might get something like:

Right. I saw that code when I was part way through writing these stubs. The problem is that you can't fix-up anything except 4-byte values. I don't see a clear way to extend that setup to 1 or 2-byte values or (more importantly) 8-byte values.
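
For anyone following along, the search & replace scheme could be sketched roughly as below. This is not the actual t_vtx_* code; patch_dword and FIXUP_MARKER are hypothetical names, and only the 0x10101010+n marker value comes from the discussion above. It also shows where the 4-byte limitation comes from: the marker has to be wide enough that it won't match ordinary instruction bytes by accident.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FIXUP_MARKER 0x10101010u   /* marker base mentioned above */

/* Hypothetical helper: replace every dword equal to FIXUP_MARKER + n
 * found in code[0..size) with 'value'.  Returns the number of dwords
 * patched, so the caller can detect a missing or duplicated marker. */
static int patch_dword(uint8_t *code, size_t size, unsigned n, uint32_t value)
{
    const uint32_t marker = FIXUP_MARKER + n;
    int count = 0;
    size_t i;

    for (i = 0; i + sizeof(uint32_t) <= size; i++) {
        uint32_t word;

        /* Immediates in x86 code are not aligned, so read bytewise. */
        memcpy(&word, code + i, sizeof(word));
        if (word == marker) {
            memcpy(code + i, &value, sizeof(value));
            count++;
            i += sizeof(uint32_t) - 1;   /* skip past the patched dword */
        }
    }
    return count;
}
```

A 1 or 2-byte marker would collide with real opcode bytes far too easily for this kind of blind scan, which is the problem I was getting at.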


Secondly, I see you're using a single TexCoord2f function to cope with all possible sizes of the texcoord in the actual emitted vertex. This is certainly the simplest approach, but it's worth pointing out that it's possible to have multiple versions of TexCoord2f, etc. which are specialized for each emitted texcoord size, and thereby eliminate the branches in your code. It's probably not significant, though.

Yeah, I took the easy route. :) I wasn't too worried about TexCoord2f, and specializing for MultiTexCoord would require examining the texture unit number to figure out how to emit. That was the nice thing when everything was emitted as 2f. Now that unit 0 can be 2f, unit 1 can be 3f, and unit 2 be 1f, things get messy. :(
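
Just to make the specialized alternative concrete, here is a rough sketch in plain C rather than generated code. All of the names are made up for illustration (this is not what the driver does today). The idea is that the size test happens once, when the vertex format is chosen, instead of on every call. The 0 and 1 fill values are the usual GL defaults for the missing r and q components.

```c
#include <stddef.h>

/* Hypothetical signature: write one texcoord into the vertex at dst. */
typedef void (*texcoord2f_func)(float *dst, float s, float t);

static void texcoord2f_size1(float *dst, float s, float t)
{
    (void) t;                 /* only s is emitted */
    dst[0] = s;
}

static void texcoord2f_size2(float *dst, float s, float t)
{
    dst[0] = s;
    dst[1] = t;
}

static void texcoord2f_size3(float *dst, float s, float t)
{
    dst[0] = s;
    dst[1] = t;
    dst[2] = 0.0f;            /* default r */
}

static void texcoord2f_size4(float *dst, float s, float t)
{
    dst[0] = s;
    dst[1] = t;
    dst[2] = 0.0f;            /* default r */
    dst[3] = 1.0f;            /* default q */
}

/* Pick the branch-free variant once, when the emitted size is known. */
static texcoord2f_func choose_texcoord2f(int size)
{
    static const texcoord2f_func table[4] = {
        texcoord2f_size1, texcoord2f_size2,
        texcoord2f_size3, texcoord2f_size4
    };

    if (size < 1 || size > 4)
        return NULL;
    return table[size - 1];
}
```

For MultiTexCoord you would need one such choice per texture unit, which is where it gets messy.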


Part of the reason getting this codegen stuff done right is important to me is I have some codegen for sw Mesa that I've had kicking around for a number of months. It's currently written more like my R200 patch, but it could be done in any style. Either way, it will need some updating to get working again. Basically, I wrote stubs to fetch texel data from texture maps. Each gl_texture_image got a stub generated that was hard-coded for its format, height, and width. The codegen was used in place of the texture's FetchTexelFunc function. That alone gave a little more than a 5% speed up to tunnel. My ultimate (unrealized) goal was to codegen the entire texel-fetch (coordinate clamp, fetch, filter) stack.

The generated "fetch a filtered texel" functions could be used from the fragment program compiler, too.
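
To give a feel for what "hard-coded for its format, height, and width" buys, here is a C-level comparison. It is only an illustration: the struct and function names are hypothetical, the real stubs are generated machine code, and I'm assuming a 64-texel-wide image with 4 bytes per texel.

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-image state a generic fetch reads. */
struct image {
    const uint8_t *data;
    int width;            /* texels per row */
    int bytes_per_texel;
};

/* Generic path: width and texel size are loaded and multiplied per fetch. */
static void fetch_generic(const struct image *img, int i, int j,
                          uint8_t rgba[4])
{
    const uint8_t *src = img->data
        + (j * img->width + i) * img->bytes_per_texel;
    int c;

    for (c = 0; c < 4; c++)
        rgba[c] = src[c];
}

/* Roughly what a stub specialized for a 64-wide, 4-byte-per-texel image
 * would do: the multiplies collapse into constant shifts. */
static void fetch_64_rgba8(const uint8_t *data, int i, int j,
                           uint8_t rgba[4])
{
    const uint8_t *src = data + (((j << 6) + i) << 2);

    rgba[0] = src[0];
    rgba[1] = src[1];
    rgba[2] = src[2];
    rgba[3] = src[3];
}
```

In the real thing the data pointer would be baked into the stub as well, which is exactly the kind of fixup discussed above.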



