Re: [Mesa-dev] [PATCH shader-db 1/3] split-to-files: deal with minimum versions, other shader types

Eero Tamminen Wed, 09 Dec 2015 07:02:01 -0800

Hi,

On 11/09/2015 08:47 PM, Matt Turner wrote:

On Mon, Nov 9, 2015 at 10:46 AM, Ilia Mirkin <imir...@alum.mit.edu> wrote:

I used this script in conjunction with ST_DUMP_SHADERS. What other way is there?


Some local hack and we should probably finish and upstream.


Did anything happen with this?

I had to rewrite split-to-files because it didn't output all (ARB) shaders and it picks the wrong one [1] when the application re-uses the same program numbers.

[1] It picked the first one, although I think the last one will almost always be the most interesting. Picking the last one also makes it easy to dump different shader sets (that re-use the same program numbers) from the same application.
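
A minimal sketch of the difference between the two policies, using made-up records rather than real Mesa output. The names `records`, `keep_first` and `keep_last` are illustrative assumptions and do not appear in split-to-files.py:

```python
# Hypothetical (program number, shader source) pairs in log order.
# The application re-uses program number "3" after the menu is gone.
records = [
    ("3", "void main() { /* menu shader */ }"),
    ("3", "void main() { /* gameplay shader */ }"),
]


def keep_first(records):
    """Keep the first source seen for each program number."""
    kept = {}
    for prog, source in records:
        kept.setdefault(prog, source)  # later re-use is ignored
    return kept


def keep_last(records):
    """Keep the most recent source seen for each program number."""
    kept = {}
    for prog, source in records:
        kept[prog] = source  # later definitions overwrite earlier ones
    return kept


print(keep_first(records)["3"])
print(keep_last(records)["3"])
```

With the second policy, killing the application at a later point simply yields whatever shaders were current at that point.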


RFC patches for my changes are attached.


        - Eero

From c52f02ff664af269fa5268627624fe94c647ad37 Mon Sep 17 00:00:00 2001
From: Eero Tamminen <eero.t.tammi...@intel.com>
Date: Wed, 9 Dec 2015 16:29:43 +0200
Subject: [PATCH 1/2] Rewrite split-to-files.py to fix it

This rewrite improves on the previous version in the following ways:

* Improve recognition of shader end.

* Remove extra lines after shader ends (in normal shaders anything
  after last line with '}' that closes main(), and in ARB shaders, lines
  after END).

* Optimize parsing by using compiled regexps.

* Calculate (md5) hashes for normalized (single-line comments and white
  space removed) shader contents and identify duplicate shaders with
  those.

* If a program gets a new shader, output the latest one.  It should be
  the more relevant one.  It also allows dumping different shader sets,
  e.g. shaders for startup / game menu vs. actual game play, just by
  running the application further before killing it.

* When application replaces ARB shaders, continue instead of claiming
  to be done & exiting.  Same program numbers can be used if
  application removes previous programs.

* Tell user which shaders were duplicates and which were replaced by
  which shaders.

* Remove duplicate programs based on shader stage hashes (of their
  normalized sources) and tell user about this.

* Output shader stage sources in 3D pipeline order.

* Give ARB shaders different file name from normal shaders.
---
 split-to-files.py |  409 +++++++++++++++++++++++++++++++++++++++--------------
 1 file changed, 306 insertions(+), 103 deletions(-)
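
Note (not part of the commit message): the duplicate detection described above can be sketched on its own as below. This is a simplified illustration, not the exact code from the patch. `normalized_hash` is a hypothetical helper name, and the two sample shaders are made up. Python 3's `str.translate` takes a table keyed by code points, hence the `ord()` calls.

```python
import hashlib


def normalized_hash(source):
    """md5 hex digest of source without // comments, spaces and tabs."""
    parts = []
    for line in source.splitlines():
        offset = line.find("//")
        if offset >= 0:
            line = line[:offset]  # drop single-line comment
        # str.translate needs code points as keys in Python 3
        line = line.translate({ord(" "): None, ord("\t"): None})
        if line:
            parts.append(line)
    return hashlib.md5("".join(parts).encode()).hexdigest()


# Same shader, different formatting and comments.
a = "void main()\n{\n    gl_Position = vec4(0.0); // origin\n}\n"
b = "void main() {\n\tgl_Position=vec4(0.0);\n}\n"
# A genuinely different shader.
c = "void main() { gl_Position = vec4(1.0); }\n"

print(normalized_hash(a) == normalized_hash(b))
print(normalized_hash(a) == normalized_hash(c))
```

Since line breaks are dropped as well, shaders that differ only in layout hash the same. Block comments (/* ... */) are not stripped in this sketch.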

diff --git a/split-to-files.py b/split-to-files.py
index 151681e..7150622 100755
--- a/split-to-files.py
+++ b/split-to-files.py
@@ -2,122 +2,300 @@
 
 import re
 import os
+import hashlib
 import argparse
 
 
+class ShaderBase:
+    def __init__(self, prog, stage):
+        self.lines = []
+        self.progid = prog # latest
+        self.programs = {self.progid: True} # all
+        self.stage = stage
+        self.hash = None
+        self.hashed_len = 0
+        self.done = False
+        self.replaced = False
+        # filled by subclasses
+        self.shadernum = 0
+        self.req_start = None
+        self.req_end = None
+        self.warn = None
+
+    def append_line(self, line):
+        assert not self.done
+        self.lines.append(line)
+
+    def is_finished(self, line):
+        return False
+
+    def get_source(self):
+        assert self.done
+        return "\n".join(self.lines) + "\n"
+
+    def add_program(self, dup):
+        self.programs[dup.progid] = True
+
+    def del_program(self, dup):
+        del(self.programs[dup.progid])
+
+    def get_hash(self):
+        if self.hash:
+            return self.hash
+        assert self.done
+
+        # source without single line comments & whitespace
+        normalized = []
+        for line in self.lines:
+            offset = line.find("//")
+            if offset >= 0:
+                line = line[:offset]
+            # Python2: line = line.translate(None, " \t")
+            line = line.translate({ord(' '): None, ord('\t'): None})
+            if line:
+                normalized.append(line)
+        normalized = "".join(normalized).encode()
+
+        # create hash for normalized source
+        md5 = hashlib.md5()
+        self.hashed_len = len(normalized)
+        md5.update(normalized)
+        self.hash = md5.hexdigest()
+        return self.hash
+
+    def check_conflict(self, dup):
+        assert self.done and dup.done
+        if self.hash == dup.hash and self.hashed_len != dup.hashed_len:
+            print("ERROR: hash collision with %s" % dup.get_info())
+            exit(-1)
+        if self.stage != dup.stage:
+            # same shader for different stage, this isn't handled correctly at the moment
+            # all code assumes that each hash/shader represents just one shader stage
+            print("ERROR: duplicate is for different shader stage (%s)" % dup.stage)
+            exit(-1)
+
+    def get_info(self):
+        raise NotImplementedError # must be subclassed
+
+    def show_info(self):
+        print(self.get_info())
+
+
+class ShaderARB(ShaderBase):
+    def __init__(self, prog, stage):
+        # use the ARB-specific ID for both progid and the programs dict
+        ShaderBase.__init__(self, "%s-ARB_%s" % (prog, stage), stage)
+
+        self.req_start = "GL_ARB_{0}_program".format(self.stage)
+        # INTEL_DEBUG won't output anything for ARB programs unless you draw
+        self.req_end = "[test]\ndraw rect -1 -1 1 2"
+
+    def is_finished(self, line):
+        if line == "END":
+            self.lines.append(line)
+            self.done = True
+            return True
+        return False
+
+    def do_cleanup(self):
+        if self.done:
+            return
+        self.done = True
+        idx = len(self.lines)
+        while idx > 0:
+            idx -= 1
+            if "END" in self.lines[idx]:
+                # shader end
+                return
+            del(self.lines[idx])
+        print("- ERROR: empty shader!")
+
+    def get_info(self):
+        return "program %s shader" % self.progid
+
+    def get_version(self):
+        return 0
+
+    def write_stage(self, out):
+        out.write("[{0} program]\n".format(self.stage))
+        out.write(self.get_source())
+
+
+class Shader(ShaderBase):
+    map = {
+     "vertex": "[vertex shader]",
+     "fragment": "[fragment shader]",
+     "geometry": "[geometry shader]",
+     "tess ctrl": "[tessellation control shader]",
+     "tess eval": "[tessellation evaluation shader]"
+    }
+    r_version = re.compile(r"^#version (\d\d\d)")
+
+    def __init__(self, prog, stage, shadernum):
+        ShaderBase.__init__(self, prog, stage)
+        self.shadernum = shadernum
+
+    def get_info(self):
+        return "program %s %s shader %d" % (self.progid, self.stage, self.shadernum)
+
+    def do_cleanup(self):
+        if self.done:
+            return
+        self.done = True
+        idx = len(self.lines)
+        while idx > 0:
+            idx -= 1
+            if "}" in self.lines[idx]:
+                # main() end
+                return
+            del(self.lines[idx])
+        print("- ERROR: empty shader!")
+
+    def get_version(self):
+        source = self.get_source()
+        match = self.r_version.match(source)
+        if match:
+            return int(match.group(1), 10)
+        return 110
+
+    def write_stage(self, out):
+        out.write("%s\n" % self.map[self.stage])
+        out.write(self.get_source())
+
+
+def finish_shader(shader, programs, shaders):
+    shader.do_cleanup()
+
+    hashed = shader.get_hash()
+    #print("- hash %s" % hashed)
+
+    # link shader hash to a program stage
+    if shader.progid not in programs:
+        programs[shader.progid] = {}
+    prog_dict = programs[shader.progid]
+    if shader.stage in prog_dict and hashed != prog_dict[shader.stage]:
+        dup = shaders[prog_dict[shader.stage]]
+        assert shader.progid in dup.programs
+        print("- replacing earlier (different) shader")
+        dup.del_program(shader)
+        dup.replaced = True
+    prog_dict[shader.stage] = hashed
+
+    # add shader to shader dict, keyed by hash, check for duplicates
+    if hashed in shaders:
+        dup = shaders[hashed]
+        if shader.progid in dup.programs:
+            print("- recompiled earlier one")
+        else:
+            replaced = ""
+            if dup.replaced:
+                replaced = "(replaced) "
+            print("- duplicate of %s%s" % (replaced, dup.get_info()))
+            dup.add_program(shader)
+        shader.check_conflict(dup)
+    else:
+        shaders[hashed] = shader
+
+
 def parse_input(infile):
-    shaders = dict()
-    programs = dict()
-    shadertuple = ("bad", 0)
-    prognum = ""
+    r_decl = re.compile(r"GLSL (.*) shader (.*) source for linked program (.*):")
+    r_arb = re.compile(r"ARB_([^_]*)_program source for program (.*):")
+    r_end = re.compile(r"(GLSL IR|Mesa IR|GLSL source) for")
+    shaders = {}
+    programs = {}
     reading = False
-    is_glsl = True
 
     for line in infile.splitlines():
-        declmatch = re.match(
-            r"GLSL (.*) shader (.*) source for linked program (.*):", line)
-        arbmatch = re.match(
-            r"ARB_([^_]*)_program source for program (.*):", line)
-        if declmatch:
-            shadertype = declmatch.group(1)
-            shadernum = declmatch.group(2)
-            prognum = declmatch.group(3)
-            shadertuple = (shadertype, shadernum)
 
+        match_decl = r_decl.match(line)
+        match_arb = r_arb.match(line)
+
+        if reading:
+            if match_decl or match_arb or r_end.match(line) or shader.is_finished(line):
+                finish_shader(shader, programs, shaders)
+                reading = False
+            else:
+                shader.append_line(line)
+                continue
+
+        if match_decl:
+            prognum = match_decl.group(3)
             # don't save driver-internal shaders.
             if prognum == "0":
                 continue
 
-            if prognum not in shaders:
-                shaders[prognum] = dict()
-            if shadertuple in shaders[prognum]:
-                print("Warning: duplicate", shadertype, " shader ", shadernum,
-                      "in program", prognum, "...tossing old shader.")
-            shaders[prognum][shadertuple] = ''
+            shadernum = int(match_decl.group(2))
+            shadertype = match_decl.group(1)
+            shader = Shader(prognum, shadertype, shadernum)
+            shader.show_info()
             reading = True
-            is_glsl = True
-            print("Reading program {0} {1} shader {2}".format(
-                prognum, shadertype, shadernum))
-        elif arbmatch:
-            shadertype = arbmatch.group(1)
-            prognum = arbmatch.group(2)
-            if prognum in programs:
-                print("dupe!")
-                exit(1)
-            programs[prognum] = (shadertype, '')
+            continue
+
+        if match_arb:
+            shadertype = match_arb.group(1)
+            prognum = match_arb.group(2)
+            shader = ShaderARB(prognum, shadertype)
+            shader.show_info()
             reading = True
-            is_glsl = False
-            print("Reading program {0} {1} shader".format(prognum, shadertype))
-        elif re.match("GLSL IR for ", line):
-            reading = False
-        elif re.match("Mesa IR for ", line):
-            reading = False
-        elif re.match("GLSL source for ", line):
-            reading = False
-        elif reading:
-            if is_glsl:
-                shaders[prognum][shadertuple] += line + '\n'
-            else:
-                type, source = programs[prognum]
-                programs[prognum] = (type, ''.join([source, line, '\n']))
-
-    return (shaders, programs)
-
-
-def write_shader_test(filename, shaders):
-    print("Writing {0}".format(filename))
-    out = open(filename, 'w')
-
-    min_version = 110
-    for stage, num in shaders:
-        shader = shaders[(stage, num)]
-        m = re.match(r"^#version (\d\d\d)", shader)
-        if m:
-            version = int(m.group(1), 10)
-            if version > min_version:
-                min_version = version
-
-    out.write("[require]\n")
-    out.write("GLSL >= %.2f\n" % (min_version / 100.))
-    out.write("\n")
-
-    for stage, num in shaders:
-        if stage == "vertex":
-            out.write("[vertex shader]\n")
-        elif stage == "fragment":
-            out.write("[fragment shader]\n")
-        elif stage == "geometry":
-            out.write("[geometry shader]\n")
-        elif stage == "tess ctrl":
-            out.write("[tessellation control shader]\n")
-        elif stage == "tess eval":
-            out.write("[tessellation evaluation shader]\n")
+            continue
+
+    return programs, shaders
+
+
+def remove_duplicates(programs):
+    "parse_input() just identifies duplicate shaders, this removes programs whose stages are all duplicates"
+    toremove = []
+    hash2prog = {}
+    for program, stages in programs.items():
+        # tuple of hashes for all stages in a program
+        hashes = tuple(sorted(stages.values()))
+        if hashes in hash2prog:
+            toremove.append(program)
+            prevprog = hash2prog[hashes]
+            print("removing program %s which is duplicate of program %s" % (program, prevprog))
         else:
-            assert False, stage
-        out.write(shaders[(stage, num)])
-
-    out.close()
-
-def write_arb_shader_test(filename, type, source):
-    print("Writing {0}".format(filename))
-    out = open(filename, 'w')
-    out.write("[require]\n")
-    out.write("GL_ARB_{0}_program\n".format(type))
-    out.write("\n")
-    out.write("[{0} program]\n".format(type))
-    out.write(source)
-    # INTEL_DEBUG won't output anything for ARB programs unless you draw
-    out.write("\n[test]\ndraw rect -1 -1 1 2\n");
-    out.close()
-
-def write_files(directory, shaders, programs):
-    for prog in shaders:
-        write_shader_test("{0}/{1}.shader_test".format(directory, prog),
-                          shaders[prog])
-    for prognum in programs:
-        prog = programs[prognum]
-        write_arb_shader_test("{0}/{1}p-{2}.shader_test".format(directory,
-            prog[0][0], prognum), prog[0], prog[1])
+            hash2prog[hashes] = program
+
+    if toremove:
+        for program in toremove:
+            del(programs[program])
+    else:
+        print("- No duplicate programs")
+
+
+def get_req_check(shaders, hashes):
+    "do some extra program checks and return appropriate require section content"
+    types = {}
+    versions = []
+    for stage_hash in hashes:
+        shader = shaders[stage_hash]
+        versions.append(shader.get_version())
+        types[shader.__class__] = True
+        if shader.warn:
+            print("WARNING: %s" % shader.warn)
+
+    if len(types) > 1:
+        print("WARNING: program %s mixes different shader types in its stages" % shader.progid)
+
+    min_version = max(versions)
+    if min_version:
+        return "GLSL >= %.2f" % (min_version / 100.)
+    else:
+        # ARB
+        first_stage = shaders[list(hashes)[0]]
+        return first_stage.req_start
+
+
+def stage_order(stage):
+    order = {
+     "vertex": 1,
+     "tess ctrl": 2,
+     "tess eval": 3,
+     "geometry": 4,
+     "fragment": 5
+    }
+    return order[stage]
+
 
 def main():
     parser = argparse.ArgumentParser()
@@ -125,14 +303,39 @@ def main():
     parser.add_argument('mesadebug', help='MESA_GLSL=dump output file')
     args = parser.parse_args()
 
+    print("Parsing...")
     dirname = "shaders/{0}".format(args.appname)
     if not os.path.isdir(dirname):
         os.mkdir(dirname)
 
     with open(args.mesadebug, 'r') as infile:
-        shaders, programs = parse_input(infile.read())
+        programs, shaders = parse_input(infile.read())
+
+    print("\nRemoving duplicate programs...")
+    remove_duplicates(programs)
+
+    print("\nDumping...")
+    for program, stage_dict in programs.items():
+        filename = "%s/%s.shader_test" % (dirname, program)
+        print("Writing {0}".format(filename))
+        out = open(filename, 'w')
+
+        hashes = stage_dict.values()
+        out.write("[require]\n")
+        out.write("%s\n" % get_req_check(shaders, hashes))
+
+        stages = sorted(stage_dict.keys(), key=stage_order)
+        for stage in stages:
+            out.write("\n")
+            hashed = stage_dict[stage]
+            shader = shaders[hashed]
+            shader.write_stage(out)
+
+        if shader.req_end:
+            out.write("\n%s\n" % shader.req_end)
+
+        out.close()
 
-    write_files(dirname, shaders, programs)
 
 if __name__ == "__main__":
     main()
-- 
1.7.10.4

From d0657bed5a7592bf4bd428775129dec3376f1b6d Mon Sep 17 00:00:00 2001
From: Eero Tamminen <eero.t.tammi...@intel.com>
Date: Wed, 9 Dec 2015 16:50:22 +0200
Subject: [PATCH 2/2] Update README

* Note how to get different shader sets from same program with
  the rewritten split-to-files.py script
* Update list of supported drivers
* Remove "env" (unnecessary)
* More white space for readability
---
 README |   23 +++++++++++++++++++----
 1 file changed, 19 insertions(+), 4 deletions(-)

diff --git a/README b/README
index 06294c9..31355c2 100644
--- a/README
+++ b/README
@@ -4,18 +4,26 @@ A giant pile of shaders from various apps, for whatever purpose.  In
 particular, we use it to capture assembly output of the shader
 compiler for analysis of regressions in compiler behavior.
 
-Currently it supports Mesa's i965 and radeonsi drivers.
+Currently it supports Mesa's i965, radeonsi and freedreno drivers.
+
 
 === Capturing shaders ===
-env MESA_GLSL=dump appname |& tee log
+
+MESA_GLSL=dump appname |& tee log
 ./split-to-files.py appname log
-# clean up resulting files, as the parsing is just an assist, not actually
-# complete.
+# check & clean up resulting files in case log file had extra content
 $EDITOR shaders/appname/*
 
+If the application replaces a lot of shaders that are not duplicates,
+collect logs from different application stages (e.g. startup and
+gameplay) by killing it at suitable places.  ./split-to-files.py will
+save the latest version of each shader program's sources in each case.
+
+
 === i965 Usage ===
 
 === Running shaders ===
+
 ./run shaders 2> err | tee new-run
 
 # To run just a subset:
@@ -30,6 +38,7 @@ To compile shaders for an i965 PCI ID different from your system, pass
 to run.
 
 === Analysis ===
+
 ./report.py old-run new-run
 
 
@@ -44,8 +53,10 @@ Note that a debug mesa build required (ie. --enable-debug)
 -1 option for disabling multi-threading is required to avoid garbled shader dumps.
 
 === Analysis ===
+
 ./si-report.py old-run new-run
 
+
 === freedreno Usage ===
 
 === Running shaders ===
@@ -57,13 +68,17 @@ Note that a debug mesa build required (ie. --enable-debug)
 -1 option for disabling multi-threading is required to avoid garbled shader dumps.
 
 === Analysis ===
+
 ./fd-report.py old-run new-run
 
+
 === Dependencies ===
+
 run requires some GNU C extensions, render nodes (/dev/dri/renderD128),
 libepoxy, OpenMP, and Mesa configured with --with-egl-platforms=x11,drm
 
 === jemalloc ===
+
 Since run compiles shaders in different threads, malloc/free locking overhead
 from inside Mesa can be expensive. Preloading jemalloc can cut significant
 amounts of time:
-- 
1.7.10.4

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
