On Tuesday, 5 August 2014 at 21:17:14 UTC, H. S. Teoh via Digitalmars-d wrote:
On Tue, Aug 05, 2014 at 08:11:16PM +0000, via Digitalmars-d wrote:
On Tuesday, 5 August 2014 at 18:57:40 UTC, H. S. Teoh via Digitalmars-d
wrote:
>Exactly. I think part of the problem is that people have been using
>assert with the wrong meaning. In my mind, 'assert(x)' doesn't mean
>"abort if x is false in debug mode but silently ignore in release
>mode", as some people apparently think it means. To me, it means "at
>this point in the program, x is true". It's that simple.

A language construct with such a meaning is useless as a safety
feature.

I don't see it as a safety feature at all.

Sorry, I should have written "correctness feature". I agree that it's not very useful for safety per se. (But of course, safety and correctness are not unrelated.)



If I first have to prove that the condition is true before I can safely use an assert, I don't need the assert anymore, because I've
already proved it.

I see it as future proofing: I may have proven the condition for *this* version of the program, but all software will change (unless it's dead), and change means the original proof may no longer be valid, but this
part of the code is still written under the assumption that the
condition holds. In most cases, it *does* still hold, so in general
you're OK, but sometimes a change invalidates an axiom that, in
consequence, invalidates the assertion. Then the assertion will trip (in non-release mode, of course), telling me that my program logic has become invalid due to the change I made. So I'll have to fix the
problem so that the condition holds again.

Well, I think it's unlikely that you actually did prove the assert condition, except in trivial situations. This is related to the discussion about the ranges example, so I'll respond there.



If it is intended to be an optimization hint, it should be implemented as a pragma, not as a prominent feature meant to be widely used. (But I see that you have a different use case, see my comment below.)

And here is the beauty of the idea: rather than polluting my code with optimization hints, which are out-of-band (and which are generally unverified and may be outright wrong after the code undergoes several revisions), I am stating *facts* about my program logic that must hold -- which therefore fits in very logically with the code itself. It even self-documents the code, to some extent. Then as an added benefit, the compiler is able to use these facts to emit more efficient code. So to me, it *should* be a prominent, widely-used feature. I would use it, and
use it a *lot*.

I think this is where we disagree mainly: What you call facts is something I see as intentions that *should* be true, but are not *proven* to be so. Again, see below.



>The optimizer only guarantees (in theory) consistent program
>behaviour if the program is valid to begin with. If the program is
>invalid, all bets are off as to what its "optimized" version does.

There is a difference between invalid and undefined: A program is invalid ("buggy") if it doesn't do what its programmer intended, while "undefined" is a matter of the language specification. The (wrong) behaviour of an invalid program need not be undefined, and often isn't in practice.

To me, this distinction doesn't matter in practice, because in practice, an invalid program produces a wrong result, and a program with undefined behaviour also produces a wrong result. I don't care what kind of wrong result it is; what I care about is fixing the program so that it does *not* produce a wrong result.

Please see my response to Jeremy; the distinction is important:
http://forum.dlang.org/thread/hqxoldeyugkazolll...@forum.dlang.org?page=11#post-eqlyruvwmzbpemvnrebw:40forum.dlang.org



An optimizer may only transform code in a way that keeps the resulting
code semantically equivalent. This means that if the original
"unoptimized" program is well-defined, the optimized one will be too.

That's a nice property to have, but again, if my program produces a wrong result, then my program produces a wrong result. As a language user, I don't care that the optimizer may change one wrong result to a different wrong result. What I care about is to fix the code so that the program produces the *correct* result. To me, it only matters that the optimizer does the Right Thing when the program is correct to begin with. If the program was wrong, then it doesn't matter if the optimizer makes it a different kind of wrong; the program should be fixed so that
it stops being wrong.

We're not living in an ideal world, unfortunately. It is bad enough that programs are wrong as they are written; we don't need the compiler to transform these programs to do something that is still wrong, but also completely different. This would make your goal of fixing the program very hard to achieve. In an extreme case, a small error in several million lines of code could manifest at a completely different place, because you cannot rely on any determinism once undefined behaviour is involved.



>Yes, the people using assert as a kind of "check in debug mode but
>ignore in release mode" should really be using something else
>instead, since that's not what assert means. I'm honestly astounded
>that people would actually use assert as some kind of
>non-release-mode-check instead of the statement of truth that it was
>meant to be.

Well, when this "something else" is introduced, it will need to
replace almost every existing instance of "assert", as the latter must only be used if it is proven that the condition is always true. To
name just one example, it cannot be used in range `front` and
`popFront` methods to assert that the range is not empty, unless there
is an additional non-assert check directly before it.

I don't follow this reasoning. For .front and .popFront to assert that the range is non-empty, simply means that user code that attempts to do otherwise is wrong by definition, and must be fixed. I don't care if it's wrong as in invalid, or wrong as in undefined, the bottom line is
that code that calls .front or .popFront on an empty range is
incorrectly written, and therefore must be fixed.

Just above you wrote that you "may have proven the condition". But in code like the following, there cannot be a proof:

    @property T front() {
        assert(!empty);
        return _other_range.front;
    }

This is in the standard library. The authors of this piece of code cannot have proven that the user of the library only calls `front` on a non-empty range. Now consider the following example (mostly made up, but not unrealistic) that parses a text file (this could be a simple text-based data format):

    // ...
    // some processing
    // ...
    input.popFront();
    // end of line? => swallow and process next line
    if(input.front == '\n') { // <- this is wrong
        input.popFront();
        continue;
    }
    // ...
    // more code that doesn't call `input.popFront`
    // ...
    // more processing of input
    if(!input.empty) {    // <- HERE
        // use input.front
    }

With the above definition of `front`, the second check marked "HERE" can be removed by the compiler. Even worse, if you insert `writeln(input.empty)` before the check for debugging, it might also output "false" (depending on how far the compiler goes).

Yes, this code is wrong. But it's an easy mistake to make, it might not be detected during testing because you only use correctly formatted input files, and it might also not lead to crashes (the buffer is unlikely to end at a boundary to unmapped memory).

Now the assert - which is supposed to be helping the programmer write correct code - has made it _harder_ to detect the cause of an error.

What's worse is that it also removed a check that was necessary. The programmer could have inserted this check because that section of the code is security relevant, and they didn't want to rely on the input file being correct. The compiler has thereby turned a rather harmless mistake, one that would under normal circumstances only lead to incorrect output, into a potentially exploitable security bug.


-- snip --
But if I've convinced myself that it is
correct, then I might as well disable the emptiness checks so that my product will deliver top performance -- since that wouldn't be a problem
in a correct program.

The problem is, as I explained above, that it doesn't just disable the emptiness checks where the asserts are. A simple mistake can have subtle and hard to debug effects all over your program.

In theory, the optimizer could use CTFE to reduce the function call, and thereby discover that the code is invalid. We don't have that today, but
conceivably, we can achieve that one day.

But taking a step back, there's only so much the compiler can do at compile-time. You can't stop *every* unsafe usage of something without also making it useless. While the manufacturer of a sharp cutting tool will presumably do their best to ensure the tool is safe to use, it's impossible to prevent *every* possible unsafe usage of said tool. If the user points the chainsaw at his foot, he will lose his foot, and there's nothing the manufacturer can do to prevent this except shipping a non-functional chainsaw. If the user insists on asserting things that are untrue, there will always be a way to bypass the compiler's static
checks and end up with undefined behaviour at runtime.

I wouldn't be so pessimistic ;-)

I guess most assert conditions are simple, mostly just comparisons or equality checks of one value against a constant. These should be relatively easy to verify with some control/data flow analysis (which Walter has avoided until now, understandably).

But CTFE is on the wrong level. It could only detect some of the failed conditions. They need to be checked on a higher level, with real correctness proofs. If an assert condition cannot be proven -- because it's always wrong, or just sometimes, or because the knowledge available to the compiler is insufficient -- it must be rejected. Think of it as an extension of type and const checking.



It would be great if this were possible. In the example of `front` and `popFront`, programs that call these methods on a range that could theoretically be empty wouldn't compile. This might be useful for optimization, but above that it's useful for verifying correctness.

A sufficiently aggressive optimizer might be able to verify this at compile-time by static analysis. But even that has its limits... for
example:

        MyRange range;
        assert(range.empty);
        if (solveRiemannHypothesis()) // <-- don't know if this is true
                range.addElements(...);

        range.popFront(); // <-- should this compile or not?

It shouldn't, because it's not provable. However, most asserts are far less involved. There could be a specification of what is guaranteed to work, and what all compilers must therefore support.



Unfortunately this is not what has been suggested (nor, evidently, what was intended from the beginning)...

I don't think static analysis is *excluded* by the current proposal. I can see it as a possible future enhancement. But the fact that we don't
have it today doesn't mean we shouldn't step in that direction.

I just don't see how we're stepping into that direction at all. It seems like the opposite: instead of trying to prove the assertions statically, they're going to be believed without verification.
