Hi,

On 11/06/2013 07:35 PM, Erik Faye-Lund wrote:
Sorry for the late reply.

On Fri, Nov 1, 2013 at 4:14 PM, Tapani <tapani.pa...@intel.com> wrote:
On 11/01/2013 03:31 PM, Erik Faye-Lund wrote:


That won't help for programs that maintain their own (explicit)
shader-cache, which was the intention of introducing binary shaders to
OpenGL ES in the first place.

OK, so we are talking about the extension; I thought you were referring to the
automatic caching. For the extension to work, we need at least more Piglit tests
to ensure that it does not break.
I was actually talking about both. But for the caching, it's
probably more forgivable, as developers probably know they changed the
compiler and can step around it by flushing the cache. Especially if
the build time gets included, like Paul suggested.

Yep, I'll add Paul's improvements to make this more robust/sane.

Of course every time you go and touch the
code, some functionality may break, be it this extension or something else.
That's completely irrelevant here. The rejection mechanism isn't
intended to catch bugs, but to detect intentional format changes. So
let's not confuse the issue.

Fair enough. What I meant is that the implementation relies on some things working the way they currently do, and that might change over time. I do admit that in its current state it can break easily, so it needs to be made more robust.

I'm not sure if Chromium, Qt or other users expect the glProgramBinary call to
always succeed; hopefully not.
If they do, they're buggy. But there is a chance for that, yes. But
I'm actually talking about that they might get *told* that everything
went well, and still get a broken shader. Or even a crash. Of course,
this only applies when mesa is built with local modifications, but
that happens a *lot* while debugging application issues. Perhaps bugs
start to disappear, because applications take another non-buggy
code-path? It'll probably only affect developers, and not happen in
the wild. But I still don't think that's a good excuse.
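
To illustrate, a well-behaved application would do something along these
lines. This is only a sketch: compile_and_link_from_source() and the
cache-store step are placeholders for the application's own code, and
entry-point loading is omitted.

/* Sketch of an application-side path (desktop GL with
 * ARB_get_program_binary; the OES_get_program_binary / ES 3.0 entry
 * points are analogous). */
#include <GL/gl.h>
#include <GL/glext.h>
#include <stdlib.h>

extern GLuint compile_and_link_from_source(void);  /* app's normal path */

GLuint load_program(const void *blob, GLsizei blob_size, GLenum blob_format)
{
    GLuint prog = glCreateProgram();
    GLint ok = GL_FALSE;

    if (blob != NULL) {
        glProgramBinary(prog, blob_format, blob, blob_size);
        glGetProgramiv(prog, GL_LINK_STATUS, &ok);
    }

    if (!ok) {
        /* The driver rejected the binary (or we had none): fall back to
         * source and refresh the cached blob afterwards. */
        glDeleteProgram(prog);
        prog = compile_and_link_from_source();

        GLint len = 0;
        glGetProgramiv(prog, GL_PROGRAM_BINARY_LENGTH, &len);
        if (len > 0) {
            void *buf = malloc(len);
            GLenum fmt = 0;
            GLsizei written = 0;
            glGetProgramBinary(prog, len, &written, &fmt, buf);
            /* ... store buf/written/fmt in the application's cache ... */
            free(buf);
        }
    }
    return prog;
}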

OK, even if not perfect, this should at least improve once I add the modification time and invalidate the cache based on that.
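
Roughly the kind of check I have in mind (the struct and helper below are
only a sketch to show the idea, not the actual patch):

/* Each cached blob gets stamped with the driver version and the
 * modification time of the installed driver binary; anything that does
 * not match exactly is thrown away and recompiled. */
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

struct cache_id {
    char     driver_version[64];  /* e.g. the driver's version string */
    uint64_t driver_mtime;        /* mtime of the installed driver binary */
};

static bool
cache_blob_usable(const struct cache_id *stored, const struct cache_id *current)
{
    return strncmp(stored->driver_version, current->driver_version,
                   sizeof(stored->driver_version)) == 0 &&
           stored->driver_mtime == current->driver_mtime;
}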

However, by a strict reading of the spec, I don't even think we're
allowed to break the shaders for just any reason. The wording of the
spec is "An implementation may reject a program binary if it
determines the program binary was produced by an incompatible or
outdated version of the compiler". The way I read that, changes that
don't modify the compiler aren't really allowed to reject previous
shaders. While diverging from the spec on this *might* not have many
real-world implications, at the very best your solution goes against
the intention of this rejection-mechanism.

IMO this is a bit of academic nit-picking on the subject; you can't really draw very strict lines here (as with some other Khronos specs). As an example, many structures are shared between the compiler code and other code. Changing those structures means you are changing the compiler code, so the produced shader will be incompatible. I think having a strict approach here is safer and still provides the functionality for most of the users; it is up to the driver to decide when a binary is unacceptable, for whatever reason, and that cannot be specified in a spec in a sufficient 'works for everyone' way.


I would assume they simply piggy-backed on their binary-shader
support. But I somehow doubt they have such a "lazy" approach to
binary shaders as this series attempts. I worked on
ARM_mali_shader_binary for the Mali-200 drivers myself, and our
approach was quite different from this, and surely much more robust.

With such a strong opinion, it would be nice to have some more technical
explanation. Why was it "surely more robust"?
Easy; it didn't break compatibility for every change to the driver. It
didn't even break when we updated the compiler. As a matter of fact,
I'm not even aware of *any* compatibility breakages (apart from if you
change the hardware, of course), ever.

I don't see my implementation breaking on 'every change to the driver'; it's just the check that is quite conservative about this. I will try to improve the structures I'm dumping to make the implementation more robust.

The implementation itself is likely very different, as it targets only a
particular GPU while the approach here is meant to be more generic.
Our implementation did also support multiple GPUs, however I was only
involved in the initial Mali-200 work. But the Mali-200 does not have
a unified ISA, so we had to touch on this area anyway.

To be honest, I find the whole idea of serializing the IR quite
repelling, as it goes against almost every intention of the extension.
Add to that mesa's IR not at all being stable, well.... yeah, that's a
super-highway to disaster.

Again, I would appreciate a bit more technical approach if possible. I can't
tell from this paragraph if you have a better idea for the architecture or
not.
So, what we did was to create a container-format for the shaders and
programs, including all the information needed to use these. We stored
binary machine-code, symbol-tables, plus information needed to do
binary-patching (IIRC this was only needed for the linking of binary
shaders, not for already-linked binary programs. I think we still kept
it, though, but mostly as an implementation-detail). The container
format was common for all cores, and so was the symbol-table. In
addition, each core had some dedicated code to read and write a blob
of their own, containing whatever information they needed to do their
job.

This is pretty much the same thing as NVIDIA did for their Tegra-driver.
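
To sketch the idea (the names and layout below are invented for
illustration, not the actual Mali or Tegra format):

/* A common header and symbol table shared by every core, followed by an
 * opaque per-core blob that each backend serializes itself. */
#include <stdint.h>

struct sb_symbol {
    uint32_t name_offset;    /* into the string table */
    uint32_t location;       /* uniform/attribute location */
    uint16_t gl_type;        /* GL type token */
    uint16_t array_size;
};

struct sb_program_header {
    uint32_t magic;            /* identifies the container format */
    uint32_t format_version;   /* bumped only on intentional format changes */
    uint32_t core_id;          /* which GPU/core wrote the core blob */
    uint32_t num_symbols;
    uint32_t symbols_offset;   /* array of struct sb_symbol */
    uint32_t strings_offset;   /* names for the symbols */
    uint32_t core_blob_offset; /* machine code, patch info, ... */
    uint32_t core_blob_size;   /* backend-defined contents */
    /* symbol table, string table and core blob follow */
};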

Storing halfway compiled code rather than fully compiled code seems
quite sketchy to me, as it introduces the need for compatibility at a
point where flexibility is needed. The result is a program-binary that
1) must still perform register allocation and scheduling when loading,
so the time-saving is limited, and 2) fails to reload even for trivial
bugfixes in unrelated code. To me, it sounds like it would be better
to simply not have the extension exposed than this middle-ground. I'd
hate to see applications having to do vendor detection for mesa to
disable caching that turned out to be more of a burden than a gain.

I'd like to add that this last piece of criticism is only about the
extension, and not of a shader cache. A shader-cache probably isn't
really exposed to developers with some expectation of what it will do.

An approach that seems much more sane to me would be to simply have
the different drivers deal entirely with the binary programs, and have
them default to not supporting any at all. Perhaps the drivers can
share some helper-code to serialize symbol tables and whatnot. Or even
have a common format for all high-level stuff, and let the drivers
serialize a blob of their own for each shader. It's more work, but IMO
the extension would become much more useful.
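
Something along these lines, with hypothetical names just to show the
split between the common layer and the drivers:

/* The common layer serializes the shared high-level state; a driver only
 * exposes program-binary support by filling in its own blob callbacks
 * (leaving them NULL means it advertises zero binary formats). */
#include <stdbool.h>
#include <stddef.h>

struct program_blob {
    void  *data;
    size_t size;
};

struct program_binary_driver_hooks {
    /* Serialize the driver's fully compiled code for one linked program. */
    bool (*serialize)(void *driver_ctx, void *gl_program,
                      struct program_blob *out);
    /* Reload it; returning false makes glProgramBinary report a link
     * failure so the application falls back to compiling from source. */
    bool (*deserialize)(void *driver_ctx, void *gl_program,
                        const struct program_blob *blob);
};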

Thanks for the reply and for taking time on this. I understand your approach and opinion. IMO it would be cool to share some of the cache between all the drivers, though. I've already seen nice shader-fb run improvements just with caching the linked IR; of course caching the driver binaries would/will make it much better, but I consider that an optimization.

// Tapani

