On 10/08/2016 06:58 PM, Jason Ekstrand wrote:
FYI, we use ralloc for a lot more than just the GLSL compiler, so the
first few changes make me a bit nervous.  There was someone working on
making our driver more undefined-memory-friendly, but I don't know what
happened to those patches.

There's a bunch of patches like that in this series:
https://lists.freedesktop.org/archives/mesa-dev/2016-June/120445.html

It looks like it just never landed, perhaps because it would have required more testing on misc drivers?


On Oct 8, 2016 3:58 AM, "Marek Olšák" <mar...@gmail.com> wrote:

    Hi,

    This patch series reduces the number of malloc calls in the GLSL
    compiler by 63%. That leads to better compile times and less heap
    thrashing.

    It's done by switching memory allocations in the GLSL compiler to my
    new linear allocator that allocates out of a fixed-sized buffer with
    a monotonically increasing offset. If more buffers are needed, it
    chains them.
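
For readers unfamiliar with the technique, here is a minimal sketch of a linear (bump) allocator of the kind described above: a fixed-size buffer, an offset that only grows, and a chain of buffers when one fills up. This is not the code from the series. Every name here (`sketch_linear`, `sketch_buf`, `SKETCH_BUF_SIZE`, and so on), the 2048-byte buffer size, and the 8-byte alignment are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical buffer size; the real allocator may pick something else. */
#define SKETCH_BUF_SIZE 2048

struct sketch_buf {
   struct sketch_buf *next;   /* previously filled buffer in the chain */
   size_t offset;             /* monotonically increasing bump offset */
   unsigned char data[SKETCH_BUF_SIZE];
};

struct sketch_linear {
   struct sketch_buf *latest; /* buffer currently being filled */
};

/* Returns `size` bytes from the current buffer, chaining a new buffer
 * when the current one cannot satisfy the request. Individual
 * allocations are never freed on their own. */
static void *
sketch_linear_alloc(struct sketch_linear *ctx, size_t size)
{
   assert(ctx != NULL);

   /* This sketch does not handle requests larger than one buffer. */
   if (size > SKETCH_BUF_SIZE)
      return NULL;

   /* Round up to 8 bytes (an assumed alignment). */
   size = (size + 7) & ~(size_t)7;

   if (!ctx->latest || ctx->latest->offset + size > SKETCH_BUF_SIZE) {
      struct sketch_buf *buf = malloc(sizeof(*buf));
      if (!buf)
         return NULL;
      buf->next = ctx->latest;
      buf->offset = 0;
      ctx->latest = buf;
   }

   void *ptr = ctx->latest->data + ctx->latest->offset;
   ctx->latest->offset += size;
   return ptr;
}

/* Releases every buffer in the chain at once. */
static void
sketch_linear_free_all(struct sketch_linear *ctx)
{
   while (ctx->latest) {
      struct sketch_buf *next = ctx->latest->next;
      free(ctx->latest);
      ctx->latest = next;
   }
}
```

The point of the design is that many small allocations cost one malloc call per buffer rather than one per object, which is why it suits short-lived allocations that are all released together.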

    The new allocator is used in all places where short-lived allocations
    are used with a high number of malloc calls. The series also contains
    other improvements not related to the new allocator that also improve
    compile times. The results are below.

    I tested with my shader-db, with shaders compiled only to TGSI
    (noop gallium driver).


    master + libc's malloc:

     real   0m54.182s
     user   3m33.640s
     sys    0m0.620s
     maxmem 275 MB


    master + jemalloc preloaded:

     real   0m45.044s
     user   2m56.356s
     sys    0m1.652s
     maxmem 284 MB


    the series + libc's malloc:

     real   0m46.221s
     user   3m2.080s
     sys    0m0.544s
     maxmem 270 MB


    the series + jemalloc preloaded:

     real   0m40.729s
     user   2m39.564s
     sys    0m1.232s
     maxmem 284 MB


    The series without jemalloc almost caught up with jemalloc + master.
    However, jemalloc also benefits.

    Current Mesa needs 54.182s and it drops to 40.729s with my series and
    jemalloc. The total change in compile time is -25% if we incorporate
    both. Without jemalloc, the difference is only -14.7%.
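
The percentages above follow from the "real" wall-clock times in the four runs. A small sketch of the arithmetic, assuming the usual relative-change formula (after - before) / before; the helper name `percent_change` is hypothetical:

```c
#include <stdio.h>

/* Relative change from `before` to `after`, in percent.
 * A negative result means the run got faster. */
static double
percent_change(double before, double after)
{
   return (after - before) / before * 100.0;
}

/* Prints the two comparisons quoted in the text, using the "real"
 * times from the runs above (in seconds). */
static void
print_compile_time_changes(void)
{
   const double master_libc = 54.182;
   const double series_libc = 46.221;
   const double series_jemalloc = 40.729;

   printf("series + jemalloc vs. master: %.1f%%\n",
          percent_change(master_libc, series_jemalloc));
   printf("series + libc vs. master:     %.1f%%\n",
          percent_change(master_libc, series_libc));
}
```

With the numbers above this comes out at roughly -24.8% and -14.7%, matching the rounded figures in the text.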

    With radeonsi, the improvement is slightly more than half of that
    (if you add the LLVM time). However, radeonsi also has asynchronous
    shader compilation hiding LLVM overhead in some cases, so it depends.

    Drivers with faster compiler backends will benefit more than radeonsi,
    but will probably not reach -25% or -14.7% (except softpipe, which uses
    TGSI as-is).

    The memory usage looks reasonable in all tested cases.

    Note: One of the first patches moves memset from ralloc to rzalloc.
    I tested and fixed the GLSL source -> TGSI path, but other codepaths
    may break, and you need to use valgrind to find all uninitialized
    variables that relied on ralloc doing memset (if there are any).
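
To illustrate the class of bug this note warns about, here is a minimal sketch using plain `malloc` as a stand-in. It does not use ralloc itself; `sketch_alloc`, `sketch_zalloc`, and `sketch_node` are hypothetical names, and the only assumption taken from the text is that one allocation call stops zeroing memory while a separate zeroing variant still does.

```c
#include <stdlib.h>
#include <string.h>

struct sketch_node {
   int flags;
   struct sketch_node *next;
};

/* Stand-in for an allocation call that no longer zeroes memory:
 * the returned bytes are indeterminate. */
static void *
sketch_alloc(size_t size)
{
   return malloc(size);
}

/* Stand-in for the zeroing variant: the memset lives here instead. */
static void *
sketch_zalloc(size_t size)
{
   void *ptr = malloc(size);
   if (ptr)
      memset(ptr, 0, size);
   return ptr;
}

/* Code that used to rely on implicit zeroing must now either call the
 * zeroing variant or initialize every field it later reads, as here. */
static struct sketch_node *
sketch_node_create(void)
{
   struct sketch_node *node = sketch_alloc(sizeof(*node));
   if (!node)
      return NULL;
   node->flags = 0;
   node->next = NULL;
   return node;
}
```

A caller that skipped the two explicit assignments in `sketch_node_create` would read indeterminate values, which is the kind of use valgrind reports as an uninitialized read.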

    You can also find it here:
    https://cgit.freedesktop.org/~mareko/mesa/log/?h=glsl-alloc-rework

    Please review.

     src/compiler/glsl/ast.h                             |   4 +-
     src/compiler/glsl/ast_to_hir.cpp                    |   4 +-
     src/compiler/glsl/ast_type.cpp                      |  13 ++-
     src/compiler/glsl/glcpp/glcpp-lex.l                 |   2 +-
     src/compiler/glsl/glcpp/glcpp-parse.y               | 203 +++++++++++++++++---------------------
     src/compiler/glsl/glcpp/glcpp.h                     |   1 +
     src/compiler/glsl/glsl_lexer.ll                     |  16 +--
     src/compiler/glsl/glsl_parser.yy                    | 202 +++++++++++++++++++-------------------
     src/compiler/glsl/glsl_parser_extras.cpp            |   6 +-
     src/compiler/glsl/glsl_parser_extras.h              |   4 +-
     src/compiler/glsl/glsl_symbol_table.cpp             |  19 ++--
     src/compiler/glsl/glsl_symbol_table.h               |   1 +
     src/compiler/glsl/ir.cpp                            |   4 +
     src/compiler/glsl/ir.h                              |  13 ++-
     src/compiler/glsl/link_uniform_blocks.cpp           |   2 +-
     src/compiler/glsl/list.h                            |   2 +-
     src/compiler/glsl/lower_packed_varyings.cpp         |   8 +-
     src/compiler/glsl/opt_constant_propagation.cpp      |  14 ++-
     src/compiler/glsl/opt_copy_propagation.cpp          |   7 +-
     src/compiler/glsl/opt_copy_propagation_elements.cpp |  19 ++--
     src/compiler/glsl/opt_dead_code_local.cpp           |  12 ++-
     src/compiler/glsl_types.cpp                         |  38 +------
     src/compiler/glsl_types.h                           |   6 +-
     src/compiler/nir/nir.c                              |   8 +-
     src/compiler/spirv/vtn_variables.c                  |   3 +-
     src/gallium/drivers/freedreno/ir3/ir3.c             |   2 +-
     src/gallium/drivers/vc4/vc4_cl.c                    |   2 +-
     src/gallium/drivers/vc4/vc4_program.c               |   2 +-
     src/gallium/drivers/vc4/vc4_simulator.c             |   5 +-
     src/mesa/drivers/dri/i965/brw_state_batch.c         |   5 +-
     src/util/ralloc.c                                   | 392 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---
     src/util/ralloc.h                                   |  93 ++++++++++++++++--
     32 files changed, 782 insertions(+), 330 deletions(-)

    Marek
    _______________________________________________
    mesa-dev mailing list
    mesa-dev@lists.freedesktop.org
    https://lists.freedesktop.org/mailman/listinfo/mesa-dev



