On Mon, Dec 21, 2015 at 2:24 PM, Marek Olšák <mar...@gmail.com> wrote:
> On Mon, Dec 21, 2015 at 11:20 PM, Jason Ekstrand <ja...@jlekstrand.net> > wrote: > > > > On Dec 21, 2015 1:37 PM, "Marek Olšák" <mar...@gmail.com> wrote: > >> > >> On Mon, Dec 21, 2015 at 7:48 PM, Jason Ekstrand <ja...@jlekstrand.net> > >> wrote: > >> > > >> > On Dec 21, 2015 9:09 AM, "Connor Abbott" <cwabbo...@gmail.com> wrote: > >> >> > >> >> On Mon, Dec 21, 2015 at 11:45 AM, Marek Olšák <mar...@gmail.com> > wrote: > >> >> > On Mon, Dec 21, 2015 at 4:38 PM, Connor Abbott < > cwabbo...@gmail.com> > >> >> > wrote: > >> >> >> On Mon, Dec 21, 2015 at 6:39 AM, Marek Olšák <mar...@gmail.com> > >> >> >> wrote: > >> >> >>> On Mon, Dec 21, 2015 at 6:48 AM, Jason Ekstrand > >> >> >>> <ja...@jlekstrand.net> > >> >> >>> wrote: > >> >> >>>> > >> >> >>>> On Dec 20, 2015 7:43 PM, "Rob Clark" <robdcl...@gmail.com> > wrote: > >> >> >>>>> > >> >> >>>>> On Sun, Dec 20, 2015 at 10:29 PM, Connor Abbott > >> >> >>>>> <cwabbo...@gmail.com> > >> >> >>>>> wrote: > >> >> >>>>> > On Sun, Dec 20, 2015 at 10:04 PM, Rob Clark > >> >> >>>>> > <robdcl...@gmail.com> > >> >> >>>>> > wrote: > >> >> >>>>> >> On Sun, Dec 20, 2015 at 9:12 PM, Jason Ekstrand > >> >> >>>>> >> <ja...@jlekstrand.net> > >> >> >>>>> >> wrote: > >> >> >>>>> >>> > >> >> >>>>> >>> On Dec 19, 2015 5:55 PM, "Rob Clark" <robdcl...@gmail.com> > >> >> >>>>> >>> wrote: > >> >> >>>>> >>>> > >> >> >>>>> >>>> From: Rob Clark <robcl...@freedesktop.org> > >> >> >>>>> >>>> > >> >> >>>>> >>>> Jason, > >> >> >>>>> >>>> > >> >> >>>>> >>>> How much do you hate this idea? Seems like an easy > >> >> >>>>> >>>> alternative > >> >> >>>>> >>>> to > >> >> >>>>> >>>> using ralloc ctx's to clean up nir variants/clones, which > >> >> >>>>> >>>> would > >> >> >>>>> >>>> let > >> >> >>>>> >>>> us drop the parent memctx for nir_shader_create()/clone(), > >> >> >>>>> >>>> making > >> >> >>>>> >>>> it easier to introduce reference counting. > >> >> >>>>> >>> > >> >> >>>>> >>> I think "hate" is a but strong. I don't like it but it > works. > >> >> >>>>> >>> If we > >> >> >>>>> >>> really > >> >> >>>>> >>> want nir_shader refcounted, we'll have to do something. > >> >> >>>>> >> > >> >> >>>>> >> I suppose the alternate idea of moving the > nir_shader_clone() > >> >> >>>>> >> out > >> >> >>>>> >> of > >> >> >>>>> >> brw_compile_xyz(), and always passing in the clone would be > a > >> >> >>>>> >> cleaner > >> >> >>>>> >> way. It looks like each of the brw_compile_xyz() has > exactly > >> >> >>>>> >> one > >> >> >>>>> >> call-site, so doing the nir_shader_clone() inside doesn't > >> >> >>>>> >> really > >> >> >>>>> >> buy > >> >> >>>>> >> anything. > >> >> >>>> > >> >> >>>> Your forgetting that there may be *cough* other users of this > >> >> >>>> API... > >> >> >>>> We can > >> >> >>>> change those too but I would like the needs of the compiler > users > >> >> >>>> to > >> >> >>>> drive > >> >> >>>> the API, not the cloning. I still have some details to work out > >> >> >>>> there. In > >> >> >>>> any case, it doesn't really matter; we can figure something out. > >> >> >>>> > >> >> >>>>> >>> About refcounting... The more I think about it the more I'm > >> >> >>>>> >>> not > >> >> >>>>> >>> convinced > >> >> >>>>> >>> it's useful. As it stands, we have no use for it an I'm > not > >> >> >>>>> >>> convinced > >> >> >>>>> >>> you > >> >> >>>>> >>> do either. We'll see if I can convince you. :-) > >> >> >>>>> >>> > >> >> >>>>> >>> In the history of i965 using NIR, we've had about three > >> >> >>>>> >>> different ways > >> >> >>>>> >>> of > >> >> >>>>> >>> doing things: > >> >> >>>>> >>> > >> >> >>>>> >>> 1) GLSL is the gold copy and we run glsl_to_nir for every > >> >> >>>>> >>> shader/variant > >> >> >>>>> >>> compile. This is what we did when we first stated using > NIR > >> >> >>>>> >>> because > >> >> >>>>> >>> it was > >> >> >>>>> >>> easy and didn't involve reworking any plumbing. > >> >> >>>>> >>> > >> >> >>>>> >>> 2) Lowered NIR is the gold copy; variants are done entirely > >> >> >>>>> >>> in > >> >> >>>>> >>> the > >> >> >>>>> >>> back-end > >> >> >>>>> >>> IR. This is what we did up until about a month ago. > Because > >> >> >>>>> >>> variants > >> >> >>>>> >>> are > >> >> >>>>> >>> done in the back-end, we can run gksl_to_nir and do all of > >> >> >>>>> >>> our > >> >> >>>>> >>> optimizing > >> >> >>>>> >>> and lowering at link time. Going from NIR to the final > >> >> >>>>> >>> shader > >> >> >>>>> >>> binary > >> >> >>>>> >>> is > >> >> >>>>> >>> then a read-only operation as far as NIR is concerned. > >> >> >>>>> >>> > >> >> >>>>> >>> 3) Optimized but not lowered NIR is the gold copy; variants > >> >> >>>>> >>> are > >> >> >>>>> >>> sometimes > >> >> >>>>> >>> done in NIR. This is the scheme we use now. We call > >> >> >>>>> >>> glsl_to_nir and > >> >> >>>>> >>> do > >> >> >>>>> >>> some of the optimization and lowering at link time but > leave > >> >> >>>>> >>> it > >> >> >>>>> >>> in SSA > >> >> >>>>> >>> form. > >> >> >>>>> >>> When we go to compile the final shader, we make a copy, > apply > >> >> >>>>> >>> variants, do > >> >> >>>>> >>> the final lowering and then go into the back-end IR. > >> >> >>>>> >>> > >> >> >>>>> >>> In each of these cases, we know exactly where we need to > make > >> >> >>>>> >>> a > >> >> >>>>> >>> copy > >> >> >>>>> >>> without > >> >> >>>>> >>> the help of reference counting. In the first, we get a > fresh > >> >> >>>>> >>> copy > >> >> >>>>> >>> each time > >> >> >>>>> >>> so we are free to destroy the copy. In the second, we > never > >> >> >>>>> >>> have to > >> >> >>>>> >>> modify > >> >> >>>>> >>> the NIR so no copy. In the third scheme, we always have to > >> >> >>>>> >>> make > >> >> >>>>> >>> a > >> >> >>>>> >>> copy > >> >> >>>>> >>> because, even if variants are a no-op, we still have to go > >> >> >>>>> >>> out > >> >> >>>>> >>> of SSA > >> >> >>>>> >>> form > >> >> >>>>> >>> and do final lowering. You could say that we could avoid > >> >> >>>>> >>> making > >> >> >>>>> >>> that > >> >> >>>>> >>> copy. > >> >> >>>>> >>> However, the work to determine when we don't need variants > >> >> >>>>> >>> and > >> >> >>>>> >>> can do > >> >> >>>>> >>> all > >> >> >>>>> >>> our lowering up-front is far more than the work saved by > >> >> >>>>> >>> reference > >> >> >>>>> >>> counting. > >> >> >>>>> >>> > >> >> >>>>> >>> How about gallium? Here's how I imagine it would work > >> >> >>>>> >>> (please > >> >> >>>>> >>> correct > >> >> >>>>> >>> me of > >> >> >>>>> >>> I'm wrong): > >> >> >>>>> >>> > >> >> >>>>> >>> 1) In the TGSI case, tgsi_to_nir gets called for each > compile > >> >> >>>>> >>> so > >> >> >>>>> >>> you > >> >> >>>>> >>> get a > >> >> >>>>> >>> fresh mutable shader each time. In this case, you are free > >> >> >>>>> >>> to > >> >> >>>>> >>> do > >> >> >>>>> >>> whatever > >> >> >>>>> >>> you want with the shader without making a copy. > >> >> >>>>> >>> > >> >> >>>>> >>> 2) In the GLSL case, you run glsl_to_nir and do some basic > >> >> >>>>> >>> optimizations at > >> >> >>>>> >>> link time and hold onto the NIR shader. (Hold a reference > of > >> >> >>>>> >>> you'd > >> >> >>>>> >>> like.) > >> >> >>>>> >>> When you go to compile it in the back-end, it needs to do > >> >> >>>>> >>> it's > >> >> >>>>> >>> own > >> >> >>>>> >>> lowering > >> >> >>>>> >>> so it takes a reference and ends up making a copy. > >> >> >>>>> >>> > >> >> >>>>> >>> If this description is anywhere close to correct, then I > >> >> >>>>> >>> don't > >> >> >>>>> >>> think > >> >> >>>>> >>> you > >> >> >>>>> >>> really need it either. Determining whether or not you need > >> >> >>>>> >>> to > >> >> >>>>> >>> copy is > >> >> >>>>> >>> simply "if (comes_from_tgsi)”. Maybe there's something > >> >> >>>>> >>> subtle > >> >> >>>>> >>> about > >> >> >>>>> >>> the > >> >> >>>>> >>> gallium layer that I don't know that makes refcounting the > >> >> >>>>> >>> best > >> >> >>>>> >>> solution. > >> >> >>>>> >>> Please enlighten me of there is. > >> >> >>>>> >> > >> >> >>>>> >> This issue is that we *potentially* have both the state > >> >> >>>>> >> tracker > >> >> >>>>> >> and > >> >> >>>>> >> the driver both doing some of there own variant management. > >> >> >>>>> >> (Which > >> >> >>>>> >> tbh, isn't awesome, it would have been nice if someone > >> >> >>>>> >> realized > >> >> >>>>> >> earlier on that nearly every driver is going to have to do > >> >> >>>>> >> some > >> >> >>>>> >> sort > >> >> >>>>> >> of variant mgmt and figured out a way just to push it all > down > >> >> >>>>> >> to > >> >> >>>>> >> the > >> >> >>>>> >> driver.. but I can't see a good way to get there from here.) > >> >> >>>>> >> > >> >> >>>>> >> With TGSI as the IR, driver just unconditionally does > >> >> >>>>> >> tgsi_dup_tokens().. because of the whole thing where st does > >> >> >>>>> >> variants > >> >> >>>>> >> in some cases, things are defined that driver doesn't own > the > >> >> >>>>> >> copy of > >> >> >>>>> >> the TGSI IR passed in after the fxn call to driver returns. > >> >> >>>>> >> > >> >> >>>>> >> With NIR I was hoping to fix this, esp. since > >> >> >>>>> >> nir_shader_clone() > >> >> >>>>> >> is > >> >> >>>>> >> more heavyweight than tgsi_dup_tokens() (memcpy()). > >> >> >>>>> >> > >> >> >>>>> >> Refcnt'ing is a nice solution so that we can pass the > driver a > >> >> >>>>> >> reference that it owns. In cases where state tracker isn't > >> >> >>>>> >> doing > >> >> >>>>> >> variant mgmt, we pass it the one-and-only ref (avoiding > >> >> >>>>> >> clone). > >> >> >>>>> >> > >> >> >>>>> >> I'd suggested that in cases where st does variant mgmt, that > >> >> >>>>> >> st > >> >> >>>>> >> should > >> >> >>>>> >> do the clone/dup. But that was shot down: > >> >> >>>>> >> > >> >> >>>>> >> > >> >> >>>>> >> > >> >> >>>>> >> > http://lists.freedesktop.org/archives/mesa-dev/2015-October/097748.html > >> >> >>>> > >> >> >>>> It sounds like Marek's argument there is more about lifetime > >> >> >>>> management than > >> >> >>>> anything. He wants gallium modules to be able to create IR, > call > >> >> >>>> into the > >> >> >>>> driver, and then throw it away. In particular, he doesn't want > >> >> >>>> them > >> >> >>>> to have > >> >> >>>> to think about cloning. In a lot of ways it sounds a lot like > what > >> >> >>>> i965 > >> >> >>>> wants too. I really like having brw_compile_foo take a const > >> >> >>>> nir_shader. > >> >> >>>> The difference is that i965 basically always wants to clone > >> >> >>>> whereas a > >> >> >>>> gallium driver may not have to if gallium doesn't care what > >> >> >>>> happens > >> >> >>>> to the > >> >> >>>> shader when it's done. How common is this case? How important > is > >> >> >>>> it > >> >> >>>> to > >> >> >>>> optimize for? I don't know. > >> >> >>>> > >> >> >>>> One other thing that bothers me a bit: From Marek's comment, it > >> >> >>>> sounds like > >> >> >>>> the components want to just pass in IR and be agnostic about > >> >> >>>> whether > >> >> >>>> the > >> >> >>>> driver wants its own copy or wants to change it or whatever. > This > >> >> >>>> seems > >> >> >>>> like an argument for always cloning to me. From the perspective > >> >> >>>> of a > >> >> >>>> gallium module, "I want to hang in to this, I'll keep a > reference" > >> >> >>>> seems > >> >> >>>> exactly the same as "I want to hang onto this, I'll give the > >> >> >>>> driver a > >> >> >>>> copy". > >> >> >>>> How are they actually different given that the driver basically > >> >> >>>> has > >> >> >>>> to > >> >> >>>> modify what you give it in order to do lowering? > >> >> >>>> > >> >> >>>>> > Ugh... I didn't read this at the time, but I don't like > Marek's > >> >> >>>>> > response. My understanding of the situation, based on this > >> >> >>>>> > thread, > >> >> >>>>> > is > >> >> >>>>> > that there are some cases where the st knows that there's > only > >> >> >>>>> > going > >> >> >>>>> > to be one variant and can throw away the (NIR or TGSI) shader > >> >> >>>>> > after it > >> >> >>>>> > hands it to the driver, while at other times it has to hold > >> >> >>>>> > onto > >> >> >>>>> > all > >> >> >>>>> > the variants and only give the driver a read-only copy (or > >> >> >>>>> > duplicate > >> >> >>>> > >> >> >>>> As per above, my interpretation of Marek's comment is that he > >> >> >>>> doesn't > >> >> >>>> want > >> >> >>>> the st to have to think about cloning ever. He wants it to > assume > >> >> >>>> that > >> >> >>>> compilation never modifies the IR so the driver should always > >> >> >>>> clone. > >> >> >>>> You > >> >> >>>> have to keep in mind that Marek is most likely thinking about > >> >> >>>> caching > >> >> >>>> the > >> >> >>>> TGSI rather than doing in-place lowering in it. > >> >> >>>> > >> >> >>>> If I'm understanding Marek correctly, then it sounds like shader > >> >> >>>> compilation > >> >> >>>> should never touch the IR that's passed in. If this is the > case, > >> >> >>>> it > >> >> >>>> sounds > >> >> >>>> like always cloning is the way to go. At least its not *that* > >> >> >>>> expensive. > >> >> >>> > >> >> >>> Note that st/mesa needs to keep the FS IR because of glDrawPixels > >> >> >>> and > >> >> >>> glBitmap, and the VS IR because of edge flags, glRasterPos > >> >> >>> evaluation, > >> >> >>> selection and feedback modes. The last three are done with > >> >> >>> Draw/LLVM > >> >> >>> and only support TGSI. > >> >> >>> > >> >> >>> Therefore, st/mesa always hangs onto the IR and drivers can't > >> >> >>> modify > >> >> >>> it. It also needs VS in TGSI to be able to do everything > correctly. > >> >> >>> > >> >> >>> What other Gallium modules want or not want is not that > important, > >> >> >>> but > >> >> >>> changing the current semantics will require fixing a lot of > places. > >> >> >>> (state trackers - mesa, nine, xa; modules - blitter, draw, hud, > >> >> >>> postprocess, tests, vl) > >> >> >>> > >> >> >>> You really better think about whether changing all those and the > >> >> >>> risk > >> >> >>> of breaking them is worth it. > >> >> >>> > >> >> >>> Marek > >> >> >> > >> >> >> Well, we're talking about passing NIR here, not TGSI, so none of > >> >> >> those > >> >> >> places will need to be updated. NIR is a much more heavyweight IR, > >> >> >> and > >> >> >> copying is much more expensive, so it makes more sense there for > the > >> >> >> st to duplicate it and let the driver own the IR it passes in, to > >> >> >> reduce copying when there's no variant management necessary. > >> >> > > >> >> > My main concern was about TGSI, not so much about NIR. > >> >> > >> >> Yes, exactly -- we don't plan on changing the semantics of passing in > >> >> TGSI, so this only matters when the user passes in a NIR shader. > >> > > >> > I think two different concepts of ownership are getting conflated > here: > >> > Right/responsibility to delete and right to modify. > >> > > >> > The way I understand it, gallium, as it stands, gives neither to the > >> > driver. > >> > A back-end using NIR requires the right to modify but who deletes it > >> > doesn't > >> > ultimately matter. I think it's dangerous to pass one of these rights > >> > to > >> > the driver and not the other but we need to think about both. > >> > > >> > What I'm trying to say is that we have two options here: > >> > > >> > 1) gallium passes IR to the back-end that it is free to modify and is > >> > required to delete when it's done. > >> > > >> > 2) gallium passes read-only IR to the back-end and it always makes a > >> > copy. > >> > >> Not always. The copy is optional. Drivers are encouraged not to make a > >> copy if they don't need it. Or they can keep a copy in a different IR, > >> which is the same scenario for the original IR. > > > > Let's suppose that the driver always has to modify the IR before it can > be > > used. Does it have to make a copy then? I know that may not be the case > for > > TGSI but it is currently the case for NIR. > > If the driver wants to modify TGSI, it does have to make a copy > (tgsi_transform_shader automatically makes a copy). Regarding NIR, > define and use whatever makes sense for it. > Ok, that's what I thought. I'm just trying to make sure I get it all clear. I'm not 100% sure what the right thing to do for NIR is, but knowing what's required for TGSI helps. Thanks! --Jason
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev