On 15.02.2017 17:49, Marek Olšák wrote:
On Wed, Feb 15, 2017 at 1:49 PM, Nicolai Hähnle <nhaeh...@gmail.com> wrote:
On 14.02.2017 23:51, Marek Olšák wrote:
On Tue, Feb 14, 2017 at 11:47 PM, Timothy Arceri <tarc...@itsqueeze.com> wrote:
On 15/02/17 08:35, Marek Olšák wrote:
On Tue, Feb 14, 2017 at 9:53 PM, Timothy Arceri <tarc...@itsqueeze.com> wrote:
On 15/02/17 02:14, Marek Olšák wrote:
On Tue, Feb 14, 2017 at 1:52 AM, Timothy Arceri <tarc...@itsqueeze.com> wrote:
---
src/gallium/drivers/r600/r600_pipe.c | 10 ++++++++++
src/gallium/drivers/radeon/r600_pipe_common.c | 2 +-
src/gallium/drivers/radeon/r600_pipe_common.h | 1 +
src/gallium/drivers/radeonsi/si_pipe.c | 11 +++++++++++
src/gallium/include/pipe/p_screen.h | 3 +++
src/mesa/state_tracker/st_context.c | 2 ++
6 files changed, 28 insertions(+), 1 deletion(-)
diff --git a/src/gallium/drivers/r600/r600_pipe.c b/src/gallium/drivers/r600/r600_pipe.c
index 5290f40..bdd8c0a 100644
--- a/src/gallium/drivers/r600/r600_pipe.c
+++ b/src/gallium/drivers/r600/r600_pipe.c
@@ -600,6 +600,7 @@ static void r600_destroy_screen(struct pipe_screen* pscreen)
 		compute_memory_pool_delete(rscreen->global_pool);
 	}
+	disk_cache_destroy(rscreen->b.b.disk_shader_cache);
 	r600_destroy_common_screen(&rscreen->b);
 }
@@ -633,6 +634,15 @@ struct pipe_screen *r600_screen_create(struct radeon_winsys *ws)
 		return NULL;
 	}
+	uint32_t mesa_timestamp;
+	if (disk_cache_get_function_timestamp(r600_screen_create, &mesa_timestamp)) {
+		char *timestamp_str;
+		if (asprintf(&timestamp_str, "%u", mesa_timestamp) != -1) {
+			rscreen->b.b.disk_shader_cache = disk_cache_create(r600_get_chip_name(&rscreen->b), timestamp_str);
+			free(timestamp_str);
+		}
+	}
+
 	if (rscreen->b.info.chip_class >= EVERGREEN) {
 		rscreen->b.b.is_format_supported = evergreen_is_format_supported;
 	} else {
diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c b/src/gallium/drivers/radeon/r600_pipe_common.c
index 95a6a48..4e5582f 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.c
+++ b/src/gallium/drivers/radeon/r600_pipe_common.c
@@ -722,7 +722,7 @@ static const char* r600_get_device_vendor(struct pipe_screen* pscreen)
 	return "AMD";
 }
-static const char* r600_get_chip_name(struct r600_common_screen *rscreen)
+const char* r600_get_chip_name(struct r600_common_screen *rscreen)
 {
 	switch (rscreen->info.family) {
 	case CHIP_R600: return "AMD R600";
diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h b/src/gallium/drivers/radeon/r600_pipe_common.h
index 6eff9aa..0449d4d 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.h
+++ b/src/gallium/drivers/radeon/r600_pipe_common.h
@@ -765,6 +765,7 @@ void radeon_save_cs(struct radeon_winsys *ws, struct radeon_winsys_cs *cs,
 		    struct radeon_saved_cs *saved);
 void radeon_clear_saved_cs(struct radeon_saved_cs *saved);
 bool r600_check_device_reset(struct r600_common_context *rctx);
+const char* r600_get_chip_name(struct r600_common_screen *rscreen);
 /* r600_gpu_load.c */
 void r600_gpu_load_kill_thread(struct r600_common_screen *rscreen);
diff --git a/src/gallium/drivers/radeonsi/si_pipe.c b/src/gallium/drivers/radeonsi/si_pipe.c
index 853d850..0bb95b1 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -712,6 +712,7 @@ static void si_destroy_screen(struct pipe_screen* pscreen)
 		}
 	}
 	pipe_mutex_destroy(sscreen->shader_parts_mutex);
+	disk_cache_destroy(sscreen->b.b.disk_shader_cache);
 	si_destroy_shader_cache(sscreen);
 	r600_destroy_common_screen(&sscreen->b);
 }
@@ -801,6 +802,16 @@ struct pipe_screen *radeonsi_screen_create(struct radeon_winsys *ws)
 		return NULL;
 	}
+	uint32_t mesa_timestamp, llvm_timestamp;
+	if (disk_cache_get_function_timestamp(radeonsi_screen_create, &mesa_timestamp) &&
+	    disk_cache_get_function_timestamp(LLVMInitializeAMDGPUTargetInfo, &llvm_timestamp)) {
+		char *timestamp_str;
+		if (asprintf(&timestamp_str, "%u_%u", mesa_timestamp, llvm_timestamp) != -1) {
+			sscreen->b.b.disk_shader_cache = disk_cache_create(r600_get_chip_name(&sscreen->b), timestamp_str);
Can you please at least make some effort to get close to 80 chars per
line even though it's not strictly required in this driver?
+			free(timestamp_str);
+		}
+	}
+
 	si_handle_env_var_force_family(sscreen);
 	if (!debug_get_bool_option("RADEON_DISABLE_PERFCOUNTERS", false))
diff --git a/src/gallium/include/pipe/p_screen.h b/src/gallium/include/pipe/p_screen.h
index b6203f1..6fd527f 100644
--- a/src/gallium/include/pipe/p_screen.h
+++ b/src/gallium/include/pipe/p_screen.h
@@ -42,6 +42,7 @@
 #include "pipe/p_format.h"
 #include "pipe/p_defines.h"
 #include "pipe/p_video_enums.h"
+#include "util/disk_cache.h"
@@ -318,6 +319,8 @@ struct pipe_screen {
    const void *(*get_compiler_options)(struct pipe_screen *screen,
                                        enum pipe_shader_ir ir,
                                        unsigned shader);
+
+   struct disk_cache *disk_shader_cache;
 };
Changes to gallium/include/pipe must be separate commits. We don't
change this interface unless there is a very good reason to do so. You
might have to get Brian's or Roland's ack if you want to pursue this
change, though the likelihood of acceptance is pretty low on this one.
However, you can ignore all the above, because there is a different
issue to discuss. If I understand it correctly, this only caches TGSI.
What's the point in putting the cache into the driver in that case?
I was trying to land the GLSL IR and TGSI pieces and avoid getting
stuck discussing the GCN cache patch at this point; there will
definitely be a GCN cache.
If
the goal is to have a GLSL->TGSI cache, it should live in st/mesa. If
the goal is to have a GLSL->GCN cache, it might live in the driver.
However, even that is not necessary if you just expose shader binaries
to st/mesa, in which case st/mesa can query binaries from the driver
and the GLSL->GCN cache can live entirely in st/mesa. If you need the
lib timestamp in st/mesa, you can also add a query to pipe_screen
returning the lib timestamp only. Then you'll have both native shader
binaries and the driver compiler timestamp exposed to st/mesa. So I'd
like to understand why the cache is in the driver.
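To make that concrete, here's a rough sketch of such a screen query; the
struct wrapper and hook name below are hypothetical, invented for
illustration, not existing Mesa code:

#include <stdint.h>

struct pipe_screen;

/* Hypothetical sketch: a screen-level query exposing the driver/compiler
 * build timestamp, so a GLSL->GCN cache could live entirely in st/mesa. */
struct pipe_screen_cache_query {
	/* returns the backend compiler timestamp (e.g. the LLVM build)
	 * for use as part of an on-disk cache key */
	uint32_t (*get_compiler_timestamp)(struct pipe_screen *screen);
};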
This might work OK for radeonsi, as you don't really have to worry about
variants, but what about other drivers? It would seem odd to add
something to the st if it's only used by a single driver, wouldn't it?
Happy to be wrong about this, but this was my thought process here.
OK. That's a fair point. Nouveau and radeonsi don't need a cache in
the driver,
Not sure what you're saying here. Having the GCN binaries in the cache
would certainly help.
That said, implementing this in the state tracker via an interface like
below makes sense to me.
but other drivers like r600g might. However, r600g compilation is so
much faster than radeonsi's that it might not be that important. Also,
given the rate of r600g development, I don't see a native shader cache
happening there. What other drivers are there, really? We have
freedreno and vc4 as the other two actively developed drivers. So you
can ask Rob or Eric if they need a shader cache inside the driver, and
if not, well, there is no point in putting the cache instance into
drivers.
I asked Eric about this a while ago, and while he hadn't thought about
it too much, he was thinking he would cache NIR. This is an interesting
approach, and something I avoided in i965 because caching NIR seemed
painful. Anyway, I'll point them to this thread for their comments.
Also, it's not clear to me where in the st I can store the binary so
that it is acceptable to and accessible by the driver. Any advice on
this would be appreciated.
The st is never accessible from the driver.
Sure, what I mean is: if we load the radeonsi/nouveau binary in the st
at link time, how would you pass that down to the driver?
I would add an interface for creating a shader from a binary. So we have
this:
- pipe_screen::get_compiler_timestamp
I think this should return a string (and have a slightly different name) so
we have the freedom to add additional information when required. I'm
thinking GPU type (though that's also in the screen name) and R600_DEBUG
flags that affect compilation.
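As an illustration, assembling such a string identifier might look like
the sketch below; build_cache_id() and its debug_flags parameter are
invented for the example, and asprintf() is the same GNU extension the
patch already uses:

#define _GNU_SOURCE /* for asprintf() */
#include <stdio.h>
#include <inttypes.h>

/* Sketch: build a cache identifier from the Mesa and LLVM timestamps
 * plus the chip name and any debug flags that affect compilation. */
static char *build_cache_id(uint32_t mesa_ts, uint32_t llvm_ts,
			    const char *chip, uint64_t debug_flags)
{
	char *id;

	/* asprintf() allocates the string; the caller must free() it */
	if (asprintf(&id, "%u_%u_%s_%" PRIx64, mesa_ts, llvm_ts, chip,
		     debug_flags) == -1)
		return NULL;
	return id;
}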
I talked with Timothy about this on IRC and the current plan is to
keep the on-disk cache inside gallium drivers. It might directly
interact with our in-memory cache, i.e. all in-memory cache misses
will fall back to the on-disk cache, which will load shaders from the
disk to the in-memory cache, and the in-memory cache will return a
cache hit. All shaders will stay in the in-memory cache after that.
Note that the on-disk cache shouldn't keep loaded shaders in memory,
because the in-memory cache will do that.
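In rough pseudocode, the lookup order would be as follows; every type
and helper name here is a hypothetical stand-in, not real radeonsi code:

struct si_screen;
struct si_shader;

/* hypothetical in-memory cache helpers */
struct si_shader *memory_cache_get(struct si_screen *s, const unsigned char *key);
void memory_cache_put(struct si_screen *s, const unsigned char *key,
		      struct si_shader *shader);
/* hypothetical on-disk loader; deserializes but retains nothing itself */
struct si_shader *disk_cache_load_shader(struct si_screen *s,
					 const unsigned char *key);

static struct si_shader *lookup_shader(struct si_screen *sscreen,
				       const unsigned char *key)
{
	struct si_shader *shader = memory_cache_get(sscreen, key);
	if (shader)
		return shader;		/* in-memory hit */

	/* In-memory miss: fall back to disk, then promote the shader into
	 * the in-memory cache so the on-disk cache keeps nothing loaded. */
	shader = disk_cache_load_shader(sscreen, key);
	if (shader)
		memory_cache_put(sscreen, key, shader);

	return shader;			/* NULL => compile from source */
}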
Makes sense.
- pipe_context::get_shader_binary
- pipe_context::create_shader_from_binary
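As a sketch, those two hooks might look something like this; the
signatures and the pipe_shader_binary container are guesses for
illustration, not a real interface:

#include <stdbool.h>
#include <stddef.h>

struct pipe_context;

/* hypothetical container for a serialized driver shader */
struct pipe_shader_binary {
	const void *data;
	size_t size;
};

/* sketch of the two hooks as struct pipe_context members */
struct pipe_context_cache_hooks {
	/* serialize a compiled shader so it can be written to disk */
	bool (*get_shader_binary)(struct pipe_context *ctx, void *shader,
				  struct pipe_shader_binary *out);

	/* recreate a driver shader object from a previously saved binary */
	void *(*create_shader_from_binary)(struct pipe_context *ctx,
					   const struct pipe_shader_binary *binary);
};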
Makes sense, but there's an open question: How do we store (optimized)
variants in the cache? I admit they're not as important for the raw loading
time, but it would be nice not to have the compiler threads busy during the
first seconds of a game, either.
We don't have to solve that all at once, but it would be nice to have an
interface that can be extended easily.
Adding optimized variants into the on-disk cache could add a lot of
complexity to the existing code, but what would the benefit be? If
the async compilation is a problem, we can either decrease the number
of threads, or decrease the thread priority, or both (IMO).
All valid points. I just wanted us to think through the interface a bit
more to keep options open. In any case, this probably isn't an issue
anyway if the disk cache lives inside radeonsi.
Cheers,
Nicolai
Marek