On Wed 21 Jun 2017, Jason Ekstrand wrote: > On Wed, Jun 21, 2017 at 12:33 PM, Chad Versace <[1]chadvers...@chromium.org> > wrote: > > On Fri 16 Jun 2017, Jason Ekstrand wrote: > > This commit replaces the complex and confusing set of disable flags with > > two fairly straightforward fields which describe the intended auxiliary > > surface usage and whether or not the miptree supports fast clears. > > Right now, supports_fast_clear can be entirely derived from aux_usage > > but that will not always be the case. > > > > This commit makes functional changes. One of these changes is that it > > re-enables multisampled fast-clears which were accidentally disabled in > > cec30a666930ddb8476a9452a89364a24979ff62 around a year ago. It should > > also enable CCS_E for window-system buffers which are Y-tiled. They > > will still get a full resolve like CCS_D but we will at least get some > > of the advantage of compression. > > --- > > src/mesa/drivers/dri/i965/brw_blorp.c | 4 +- > > src/mesa/drivers/dri/i965/intel_fbo.c | 2 +- > > src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 190 > +++++++++++++------------- > > src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 43 +++--- > > 4 files changed, 120 insertions(+), 119 deletions(-) > > > > diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c > b/src/mesa/drivers/dri > /i965/brw_blorp.c > > index 00092ee..9bd25f0 100644 > > --- a/src/mesa/drivers/dri/i965/brw_blorp.c > > +++ b/src/mesa/drivers/dri/i965/brw_blorp.c > > @@ -762,7 +762,7 @@ do_single_blorp_clear(struct brw_context *brw, > struct > gl_framebuffer *fb, > > if (set_write_disables(irb, ctx->Color.ColorMask[buf], > color_write_disable)) > > can_fast_clear = false; > > > > - if (irb->mt->aux_disable & INTEL_AUX_DISABLE_CCS || > > + if (!irb->mt->supports_fast_clear || > > !brw_is_color_fast_clear_compatible(brw, irb->mt, &ctx-> > Color.ClearColor)) > > can_fast_clear = false; > > > > @@ -785,7 +785,7 @@ do_single_blorp_clear(struct brw_context *brw, > struct > gl_framebuffer *fb, > > */ > > if (!irb->mt->mcs_buf) { > > assert(!intel_miptree_is_lossless_compressed(brw, irb->mt)); > > - if (!intel_miptree_alloc_ccs(brw, irb->mt, false)) { > > + if (!intel_miptree_alloc_ccs(brw, irb->mt)) { > > /* MCS allocation failed--probably this will only happen in > > * out-of-memory conditions. But in any case, try to > recover > > * by falling back to a non-blorp clear technique. > > diff --git a/src/mesa/drivers/dri/i965/intel_fbo.c > b/src/mesa/drivers/dri > /i965/intel_fbo.c > > index ee4aba9..6a64bcb 100644 > > --- a/src/mesa/drivers/dri/i965/intel_fbo.c > > +++ b/src/mesa/drivers/dri/i965/intel_fbo.c > > @@ -555,7 +555,7 @@ intel_renderbuffer_update_wrapper(struct brw_context > *brw, > > > > intel_renderbuffer_set_draw_offset(irb); > > > > - if (intel_miptree_wants_hiz_buffer(brw, mt)) { > > + if (mt->aux_usage == ISL_AUX_USAGE_HIZ && !mt->hiz_buf) { > > intel_miptree_alloc_hiz(brw, mt); > > if (!mt->hiz_buf) > > return false; > > diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/ > drivers/dri/i965/intel_mipmap_tree.c > > index 0f6d542..101317f 100644 > > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > > @@ -64,7 +64,7 @@ intel_miptree_alloc_mcs(struct brw_context *brw, > > */ > > static enum intel_msaa_layout > > compute_msaa_layout(struct brw_context *brw, mesa_format format, > > - enum intel_aux_disable aux_disable) > > + uint32_t layout_flags) > > { > > /* Prior to Gen7, all MSAA surfaces used IMS layout. */ > > if (brw->gen < 7) > > @@ -90,7 +90,7 @@ compute_msaa_layout(struct brw_context *brw, > mesa_format format, > > */ > > if (brw->gen == 7 && _mesa_get_format_datatype(format) == GL_INT) > { > > return INTEL_MSAA_LAYOUT_UMS; > > - } else if (aux_disable & INTEL_AUX_DISABLE_MCS) { > > + } else if (layout_flags & MIPTREE_LAYOUT_DISABLE_AUX) { > > /* We can't use the CMS layout because it uses an aux buffer, > the MCS > > * buffer. So fallback to UMS, which is identical to CMS > without the > > * MCS. */ > > @@ -148,9 +148,6 @@ intel_miptree_supports_ccs(struct brw_context *brw, > > if (brw->gen < 7) > > return false; > > > > - if (mt->aux_disable & INTEL_AUX_DISABLE_MCS) > > - return false; > > - > > /* This function applies only to non-multisampled render targets. */ > > if (mt->num_samples > 1) > > return false; > > @@ -215,6 +212,26 @@ intel_miptree_supports_ccs(struct brw_context *brw, > > return true; > > } > > > > +static bool > > +intel_miptree_supports_hiz(struct brw_context *brw, > > + struct intel_mipmap_tree *mt) > > +{ > > + if (!brw->has_hiz) > > + return false; > > + > > + switch (mt->format) { > > + case MESA_FORMAT_Z_FLOAT32: > > + case MESA_FORMAT_Z32_FLOAT_S8X24_UINT: > > + case MESA_FORMAT_Z24_UNORM_X8_UINT: > > + case MESA_FORMAT_Z24_UNORM_S8_UINT: > > + case MESA_FORMAT_Z_UNORM16: > > + return true; > > + default: > > + return false; > > + } > > +} > > + > > + > > /* On Gen9 support for color buffer compression was extended to single > > * sampled surfaces. This is a helper considering both auxiliary buffer > > * type and number of samples telling if the given miptree represents > > @@ -320,10 +337,9 @@ intel_miptree_create_layout(struct brw_context > *brw, > > mt->logical_width0 = width0; > > mt->logical_height0 = height0; > > mt->logical_depth0 = depth0; > > - mt->aux_disable = (layout_flags & MIPTREE_LAYOUT_DISABLE_AUX) != 0 ? > > - INTEL_AUX_DISABLE_ALL : INTEL_AUX_DISABLE_NONE; > > - mt->aux_disable |= INTEL_AUX_DISABLE_CCS; > > mt->is_scanout = (layout_flags & MIPTREE_LAYOUT_FOR_SCANOUT) != 0; > > + mt->aux_usage = ISL_AUX_USAGE_NONE; > > + mt->supports_fast_clear = false; > > mt->aux_state = NULL; > > mt->cpp = _mesa_get_format_bytes(format); > > mt->num_samples = num_samples; > > @@ -337,7 +353,7 @@ intel_miptree_create_layout(struct brw_context *brw, > > int depth_multiply = 1; > > if (num_samples > 1) { > > /* Adjust width/height/depth for MSAA */ > > - mt->msaa_layout = compute_msaa_layout(brw, format, mt-> > aux_disable); > > + mt->msaa_layout = compute_msaa_layout(brw, format, layout_flags); > > if (mt->msaa_layout == INTEL_MSAA_LAYOUT_IMS) { > > /* From the Ivybridge PRM, Volume 1, Part 1, page 108: > > * "If the surface is multisampled and it is a depth or > stencil > > @@ -460,8 +476,7 @@ intel_miptree_create_layout(struct brw_context *brw, > > if (!(layout_flags & MIPTREE_LAYOUT_FOR_BO) && > > _mesa_get_format_base_format(format) == GL_DEPTH_STENCIL && > > (brw->must_use_separate_stencil || > > - (brw->has_separate_stencil && > > - intel_miptree_wants_hiz_buffer(brw, mt)))) { > > + (brw->has_separate_stencil && intel_miptree_supports_hiz(brw, > mt)))) { > > uint32_t stencil_flags = MIPTREE_LAYOUT_ACCELERATED_UPLOAD; > > if (brw->gen == 6) { > > stencil_flags |= MIPTREE_LAYOUT_TILING_ANY; > > @@ -530,14 +545,44 @@ intel_miptree_create_layout(struct brw_context > *brw, > > return NULL; > > } > > > > - if (mt->aux_disable & INTEL_AUX_DISABLE_MCS) > > - assert(mt->msaa_layout != INTEL_MSAA_LAYOUT_CMS); > > - > > return mt; > > } > > > > > > /** > > + * Choose the aux usage for this miptree. This function must be called > fairly > > + * late in the miptree create process after we have a tiling. > > + */ > > +static void > > +intel_miptree_choose_aux_usage(struct brw_context *brw, > > + struct intel_mipmap_tree *mt) > > +{ > > + assert(mt->aux_usage == ISL_AUX_USAGE_NONE); > > + > > + if (mt->msaa_layout == INTEL_MSAA_LAYOUT_CMS) { > > + mt->aux_usage = ISL_AUX_USAGE_MCS; > > + } else if (intel_tiling_supports_ccs(brw, mt->tiling) && > > + intel_miptree_supports_ccs(brw, mt)) { > > + if (!unlikely(INTEL_DEBUG & DEBUG_NO_RBC) && > > + brw->gen >= 9 && !mt->is_scanout && > > + intel_miptree_supports_ccs_e(brw, mt)) { > > + mt->aux_usage = ISL_AUX_USAGE_CCS_E; > > + } else { > > + mt->aux_usage = ISL_AUX_USAGE_CCS_D; > > + } > > + } else if (intel_miptree_supports_hiz(brw, mt)) { > > + mt->aux_usage = ISL_AUX_USAGE_HIZ; > > + } > > + > > + /* We can do fast-clear on all auxiliary surface types that are > > + * allocated through the normal texture creation paths. > > + */ > > + if (mt->aux_usage != ISL_AUX_USAGE_NONE) > > + mt->supports_fast_clear = true; > > +} > > + > > + > > +/** > > * Choose an appropriate uncompressed format for a requested > > * compressed format, if unsupported. > > */ > > @@ -670,6 +715,9 @@ miptree_create(struct brw_context *brw, > > if (layout_flags & MIPTREE_LAYOUT_FOR_SCANOUT) > > mt->bo->cache_coherent = false; > > > > + if (!(layout_flags & MIPTREE_LAYOUT_DISABLE_AUX)) > > + intel_miptree_choose_aux_usage(brw, mt); > > + > > return mt; > > } > > > > @@ -726,29 +774,14 @@ intel_miptree_create(struct brw_context *brw, > > } > > } > > > > - /* If this miptree is capable of supporting fast color clears, set > > - * fast_clear_state appropriately to ensure that fast clears will > occur. > > - * Allocation of the MCS miptree will be deferred until the first > fast > > - * clear actually occurs or when compressed single sampled buffer is > > - * written by the GPU for the first time. > > + /* Since CCS_E can compress more than just clear color, we create > the > CCS > > + * for it up-front. For CCS_D which only compresses clears, we > create the > > + * CCS on-demand when a clear occurs that wants one. > > */ > > The above comment... > > > - if (intel_tiling_supports_ccs(brw, mt->tiling) && > > - intel_miptree_supports_ccs(brw, mt)) { > > - mt->aux_disable &= ~INTEL_AUX_DISABLE_CCS; > > - assert(brw->gen < 8 || mt->halign == 16 || num_samples <= 1); > > - > > - /* On Gen9+ clients are not currently capable of consuming > compressed > > - * single-sampled buffers. Disabling compression allows us to > skip > > - * resolves. > > - */ > > - const bool lossless_compression_disabled = INTEL_DEBUG & > DEBUG_NO_RBC; > > - const bool is_lossless_compressed = > > - unlikely(!lossless_compression_disabled) && > > - brw->gen >= 9 && !mt->is_scanout && > > - intel_miptree_supports_ccs_e(brw, mt); > > - > > - if (is_lossless_compressed) { > > - intel_miptree_alloc_ccs(brw, mt, is_lossless_compressed); > > + if (mt->aux_usage == ISL_AUX_USAGE_CCS_E) { > > + if (!intel_miptree_alloc_ccs(brw, mt)) { > > + intel_miptree_release(&mt); > > + return NULL; > > } > > ...and the above hunk appear twice in the commit. I believe that both > locations are necessary and sufficient. However, you can replace them > with a single location. > > Also, the patch contains two calls to intel_miptree_choose_aux_usage(). > Again, you can replace them with a single call. > > Intstead of having these two things duplicated, I think they should > each happen once along the common code path: > intel_miptree_create_layout(). That should lead to less fragile code, > I hope. > > Facts: > > * All miptree creation paths eventually lead to > intel_miptree_create_layout(). > > * The result of intel_miptree_choose_aux_usage() depends directly on > mt->msaa_layout and indirectly on mt->tiling. As far as I can > tell, it depends on no other miptree state. > > * mt->msaa_layout is set in exactly two places: the top of > intel_miptree_create_layout() and as a side-effect of > intel_miptree_alloc_ccs(). (Why intel_miptree_alloc_ccs, formerly > intel_miptree_alloc_non_msrt_mcs, sets mt->msaa_layout to > INTEL_MSAA_LAYOUT_CMS is a mystery to me. I choose to ignore it, > because I suspect that assignment is nowhere used afterwards. > WARNING: I have left the realm of facts!) > > * mt->tiling is set in exactly 3 locations in intel_mipmap_tree.c: > > a. intel_miptree_create_layout() sometimes sets mt->tiling as one > of its > last actions, when it calls brw_miptree_layout(). "sometimes", > because brw_miptree_layout() sets mt->tiling if and only if > !MIPTREE_LAYOUT_FOR_BO. > > b. intel_miptree_create_for_bo() sets mt->tiling immediately > after calling intel_miptree_create_layout(). > > c. miptree_create() sets mt->tiling in a weird fallback > immediately after it calls intel_miptree_create_layout(). > > Conclusions: > > * You can collapse the two calls of > intel_miptree_choose_aux_usage(), and the two pre-allocations of > the ccs_e, into a single location (which I'll call "pot o' gold") > at the tail of intel_miptree_create_layout() immediately before it > returns, IF... > > * IF the pot o' gold happens after mt->msaa_layout is set. This > is trivial because mt->msaa_layout is set at the top of > intel_miptree_create_layout() and nowhere else, modulo my > willfull ignorance of intel_miptree_alloc_ccs's weirdness. > > * IF the pot o' gold happens after mt->tiling is set. What's > required for this is: > > * Move the mt->tiling assignment in (b) > to occur immediately before the call to > intel_miptree_create_layout(). > Trivial fixup. > > * Fix the weird tiling fallback in item (c) by pushing it into > intel_miptree_create_layout(). Another trivial fixup. > > > I expect that I missed some details. I await your rebuttal. > > > You are correct that they could be placed on the common path, sort of. You > are > wrong that it's a good idea. Why? Modifiers. With modifiers, we have to > override the choice of aux and create the auxiliary surface ourselves so we > don't want it created for us. But we also can't create it with DISABLE_AUX > because we may want it to still do HALIGN16 if we're going to create one > ourselves. So, instead, what we do is to let create_for_bo go ahead and > choose > aux usage, then we stomp it to what we want, and then we set up the CCS. We > could choose aux usage in create_layout but I figured that would hint that aux > usage was something entirely chose by create_layout and not stompped later. I > don't like to imply that.
I fast-forwarded through the patch series, and see the final result in intel_miptree_create_for_dri_image(). Now I see why you want to make last-minute aux decisions for intel_miptree_create_for_dri_image() but not intel_miptree_create(). That was only issue with the patch. So, Reviewed-by: Chad Versace <chadvers...@chromium.org> _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev