Please do not reply to this email: if you want to comment on the bug, go to
the URL shown below and enter your comments there.

https://bugs.freedesktop.org/show_bug.cgi?id=4197

[EMAIL PROTECTED] changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |NOTABUG




------- Additional Comments From [EMAIL PROTECTED]  2005-12-19 12:19 -------
(In reply to comment #8)

> (In reply to comment #7)
> > (In reply to comment #6)
> > I also expect that it will kill the performance of quite a few apps.
> > There's a reason that we use custom assembly for this part of the code,
> > and this patch seems to defeat most of that purpose.
> 
> if you *really* cared about performance (instead of trying to find an excuse)
> then you'd have looked hard at the whole GL API/ABI picture already. let's see
> what a GL API call ends up in on linux/i386.

First of all, you will attract many more bees with honey than with vinegar.

Second of all, OpenGL is an industry-wide defined application interface.  We
aren't at liberty to change it willy-nilly.

Third of all, people in the graphics industry care about two things:
correctness of rendering and the time required to do it.

>   app:          call [EMAIL PROTECTED]
>   app.plt:      jmp [EMAIL PROTECTED]
>   gl.dispatch:  mov eax,[_glapi_Dispatch]
>                 test eax,eax
>                 jz .get_dispatch
>                 jmp [eax+0x...]
>                 .get_dispatch:
>                 call _glapi_get_dispatch
>                 jmp [eax+0x...]
> 
> on the fast path, that's 6 insns, with no less than 3 memory accesses
> (potential cache misses), 2 of which are indirect control flow changes
> (potential branch misprediction). and all that just to get to the first insn
> of the actual GL API (which will do its prologue/epilogue on top of doing
> real work, mind you). if that's not an overkill then i don't know what is.

This isn't the fast path.  The fast path is the TLS case.  What is your point?
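
To make the distinction concrete, here is a rough C sketch of the two
dispatch styles. It is not the actual Mesa code: every name in it
(sketch_table, sketch_stub_global, sketch_stub_tls and so on) is made up for
illustration, and it assumes that in the TLS case the per-thread pointer is
always valid, so the stub needs no NULL test.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical one-slot dispatch table; a real one has hundreds of slots. */
struct sketch_table {
    int (*Vertex)(int x);
};

/* Stand-in for a driver's implementation of one entry point. */
static int sketch_driver_vertex(int x) { return x + 1; }
static struct sketch_table sketch_driver = { sketch_driver_vertex };

/* Non-TLS case: a global pointer, NULL when the slow lookup is needed. */
static struct sketch_table *sketch_global = NULL;

/* TLS case (assumption: always initialized, so no NULL test is required). */
static _Thread_local struct sketch_table *sketch_tls = &sketch_driver;

/* Stand-in for the slow lookup that the quoted asm calls on the jz path. */
static struct sketch_table *sketch_get_dispatch(void) { return &sketch_driver; }

/* Mirrors the quoted sequence: load, test, maybe call, indirect jump. */
int sketch_stub_global(int x)
{
    struct sketch_table *d = sketch_global;
    if (d == NULL)
        d = sketch_get_dispatch();
    return d->Vertex(x);
}

/* TLS variant: load the per-thread pointer and jump through the slot. */
int sketch_stub_tls(int x)
{
    return sketch_tls->Vertex(x);
}

/* Both stubs must reach the same driver function. */
void sketch_check(void)
{
    assert(sketch_stub_global(1) == sketch_stub_tls(1));
}
```

The point of the sketch is only that the TLS stub has no test and no
conditional branch; the real stubs are hand-written assembly.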

> my patch adds 5 insns to this, 3 of which are memory accesses. one memory
> access is the same as the old code (same potential cache miss), the other
> two are

So, your patch roughly doubles the worst-case function call overhead.  There are
a number of drivers (e.g., the open-source R200 driver) where the ABI overhead
is *more* instructions than the actual function being called (e.g., the run-time
codegen immediate mode functions).  Doubling (or even increasing by 30%) that
overhead will make a measurable impact on industry standard benchmarks.  I can
guarantee that people who care about graphics care way the hell more about
that than they do about textrels.

> guaranteed to be cached, one is the return address on the stack, the other is
> the call insn executed just before the access. in other words, the execution
> overhead in absolute terms (clock cycles) is minimal. where you can have a
> measurable impact is when the absolute overhead is comparable to the given GL
> API execution time itself. i recall ajax mentioned that some of them are very
> short (in terms of asm insns), but those are also the APIs that already suffer
> the 6 insns/3 memory access 'custom assembly' that you had for this purpose
> (again, in addition to the API's prologue/epilogue). adding 5 more will make
> it worse, but it's already bad and can't be the reason for outright rejection.
> if you really wanted to fix it, then you'd find a way for inlining such short
> API calls, that will not only eliminate the API call overhead (including
> prologue/epilogue code)

You clearly have zero understanding of how the OpenGL ABI works on Linux.  We
*CAN'T* inline this stuff in the application.  There is no way to know at
compile time what the actual function will be!  This layer provides a level of
indirection similar to a C++ virtual function table.  Are you going to suggest
that G++ somehow inline virtual functions at compile time?  There's no way to
know what function will be called because:

1. It depends on the driver that is currently loaded.
2. It depends on the libGL being used (Nvidia's libGL handles some of these
cases very differently than we do).
3. It depends on the current GL state.
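
A toy C sketch of why the call target cannot be known at compile time. All
names here (toy_table, toy_load_driver, toy_api_color and so on) are
hypothetical and stand in for the real dispatch machinery; the libGL case in
point 2 is not modeled. The application only ever sees toy_api_color, while
the function behind it changes with the loaded driver and with the current
state.

```c
#include <assert.h>

typedef int (*toy_color_fn)(int);

/* Hypothetical one-slot table playing the role of the dispatch table. */
struct toy_table {
    toy_color_fn Color;
};

/* Two "drivers" implement the same entry point differently. */
static int toy_color_driver_a(int c) { return c * 2; }
static int toy_color_driver_b(int c) { return c * 3; }

/* A state-specific variant that a driver may swap in at run time. */
static int toy_color_special_state(int c) { return -c; }

static struct toy_table toy_a = { toy_color_driver_a };
static struct toy_table toy_b = { toy_color_driver_b };
static struct toy_table *toy_current = &toy_a;

/* Point 1: the table depends on which driver is loaded. */
void toy_load_driver(int use_b)
{
    toy_current = use_b ? &toy_b : &toy_a;
}

/* Point 3: a state change rewrites the slot of the current table. */
void toy_enter_special_state(void)
{
    toy_current->Color = toy_color_special_state;
}

/* The only symbol the application links against. */
int toy_api_color(int c)
{
    return toy_current->Color(c);
}
```

The same call toy_api_color(5) yields 10 with the first driver, 15 after
toy_load_driver(1), and -5 after toy_enter_special_state(), so there is
nothing for the application's compiler to inline.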

> > Given the choice between removing TEXTRELs and improving (or maintaining)
> > performance, I will pick performance every time.
> 
> i think you don't realize what choice you're making (and i definitely disagree
> that you should be making that choice for all users and deny them a
> textrel-free X/GL). textrels are a subset of runtime code generation which
> itself is a

I think that penalizing the majority of our users to satisfy a minority is a
mistake.  If some particular distro wants a textrel-free libGL, I welcome them to
build the C version of the API dispatch routines with -fPIC.

End of story.
          
     
     
--           
Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email         
     
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.


_______________________________________________
Mesa3d-dev mailing list
Mesa3d-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev
