Re: DIP44: scope(class) and scope(struct)

Artur Skawina Sat, 24 Aug 2013 14:57:30 -0700

On 08/24/13 22:09, H. S. Teoh wrote:
> On Sat, Aug 24, 2013 at 01:01:12PM +0200, Artur Skawina wrote:
>> On 08/24/13 08:30, Piotr Szturmaj wrote:
>>> H. S. Teoh wrote:
>>>> I've written up a proposal to solve the partially-constructed
>>>> object problem[*] in D in a very nice way by extending scope
>>>> guards:
>>>>
>>>>     http://wiki.dlang.org/DIP44
>>
>>> 2. allow referring to local variables and create the closure. This
>>> causes additional memory allocation and also reference to the
>>> closure must be stored somewhere, perhaps in the class hidden field.
>>> Of course, this is a "no go", I'm writing this here for comparison.
>>
>> That is what he's actually proposing. And, yes, it's not a good idea.
>> Implementing it via runtime delegates, implicit captures/allocs and
>> extra hidden fields inside aggregates would work, but the cost is too
>> high. It's also not much of an improvement over manually registering
>> and calling the delegates. Defining what happens when a ctor fails
>> would be a good idea, having a cleanup method which defaults to
>> `~this`, but can be overridden could help too.
> 
> The issue with that is that the initialization code and the cleanup code
> has to be separated, potentially by a lot of unrelated stuff in between.
> The genius of scope guards is that initialization and cleanup is written
> in one place even though they actually happen in different places, so
> it's very unlikely you will forget to cleanup correctly.


The "scope guards" are actually an ugly hack - but one that is useful
and where the alternatives aren't significantly better. In fact, its
"ugliness" makes it stand out more, which in this case is a positive
trait.

What makes your proposal different is:

a) scope guards are *local*. They are contained within a single (lexical)
   scope. Which means that you can easily spot them, just by looking at
   *one* continuous piece of source code. For most sanely written code
   the distance from where they are defined to where they are executed is
   small enough (ie functions spanning multiple pages are/should be
   relatively uncommon).
   You are proposing defining them in *completely different functions*.
   To see what will actually happen at destruction time one would have
   to read *every single ctor*. 

b) scope guards are just *static syntax sugar*. All they do is let you
   define blocks of code using slightly different syntax and move them
   closer to the location where related operations happen.
   Your proposal introduces *dynamic* behavior - it's no longer just a
   trivial rewriting from one form to another. /Order/ starts to matter.
      
   struct S {
      R r1, r2;

      this(A a) {
          auto cond = pick_one();
          if (cond) { r1 = i1(); scope(this) cl1(); } else { r2 = i2(); 
scope(this) cl2(); }
          //...           
          if (!cond) { r2 = i3(); scope(this) cl3(); } else { r1 = i4(); 
scope(this) cl4(); }
          //...           
      }

      this(B b) { r1 = i1(); scope(this) cl1(); r2 = i2(); scope(this) cl2(); }
      this(C c) { r2 = i2(); scope(this) cl2(); r1 = i1(); scope(this) cl1(); }
   }

   Consider what is going to (has to) happen when this object is destructed/
   finalized. If one path ends up being taken rarely (say, just under memory
   pressure) /and/ contains a race related to the cleanup order, the result can
   be subtle and extremely hard to reproduce bugs. Those would also happen when
   using an explicit scheme (manually registering the delegates etc), but you
   just made it all implicit - so now all of the magic is completely invisible
   and these kinds of bugs became much harder to spot.

>> There are other problems with that DIP, like making it harder to see
>> what's actually going on, by splitting the dtor code and having it
>> interleaved with another separate flow.
> 
> I think it's unhelpful to conflate scope(this) with dtors. Yes there is
> some overlap, but if you treat them separately, then there is no
> problem (assuming that a dtor is actually necessary).

See above. "scope(this)" moves parts of a dtor to inside the ctors (there
can be many of those) *and* introduces reordering.


>> It *is* possible to implement a similar solution without any RT cost,
>> but it would need:
>> a) flow analysis - to figure out the cleanup order, which might not be 
>>     statically known (these cases have to be disallowed)
>> b) a different approach for specifying the cleanup code, so that 
>>     implicit capturing of ctor state doesn't happen and it's not
>>     necessary to read the complete body of every ctor just to find
>>     out what a dtor does.
> 
> I think that's an unhelpful way of thinking about it. What about we
> think of it this way: the ctor is acquiring X number of resources, and
> by wrapping the resource-releasing code in scope(this), we guarantee
> that these resources will be correctly released. Basically, scope(this)
> will take care of invoking the release code whenever the object's
> lifetime is over, whether it's unsuccessful construction, or
> destruction.
> 
> It's just like saying scope guards are useless because it's equivalent
> to a try-catch block anyway (and in fact, that's how the front end
> implements scope guards). One may even argue scope guards are bad
> because the cleanup code is sprinkled everywhere rather than collected
> in one place. But it's not really about whether it's equivalent to

Locality matters. Checking which objects die at the end of a scope requires
you to scan only the current scope, something that has to be done anyway
(RAII etc).
This is very different from having to check all bodies of potentially
many different functions, plus manually verifying that the initialization
order is either the same or "legal" in all of them.


> another language construct; it's about better code maintainability.
> Separating the code that initializes something from the code that cleans
> up something makes it harder to maintain, and more error-prone (e.g.
> initialize 9 things, forget to clean up one of them). Keeping them
> together in the same place makes code correctness clearer.

RAII helps sometimes. Moving the field initializations out of the ctor and 
close to the declaration would be an alternative. But then you need a
smarter compiler and/or to manually specify the order. Note that the
"smarter compiler" is necessary -- to verify the order, because then
it all is just a static transformation/lowering and no dynamic arrays of
delegates nor hidden state is required. Relying on humans to catch
errors does not scale, so omitting the static verification step (which
would make the implementation simpler) is unfortunately not a practical
option.

artur

Re: DIP44: scope(class) and scope(struct)

Reply via email to