Re: dereferencing null

Chad J Wed, 07 Mar 2012 17:50:16 -0800

On 03/07/2012 10:21 AM, Steven Schveighoffer wrote:

On Wed, 07 Mar 2012 10:10:32 -0500, Chad J
<chadjoan@__spam.is.bad__gmail.com> wrote:

On Wednesday, 7 March 2012 at 14:23:18 UTC, Chad J wrote:

I spoke too soon!
We missed one:

1. You forgot to initialize a variable.
2. Your memory has been corrupted, and some corrupted pointer
now points into no-mem land.
3. You are accessing memory that has been deallocated.
4. null was being used as a sentinel value, and it snuck into
a place where the value should not be a sentinel anymore.

I will now change what I said to reflect this:

I think I see where the misunderstanding is coming from.

I encounter (1) from time to time. It isn't a huge problem because
usually if I declare something the next thing on my mind is
initializing it. Even if I forget, I'll catch it in early testing. It
tends to never make it to anyone else's desk, unless it's a
regression. Regressions like this aren't terribly common though. If
you make my program crash from (1), I'll live.

I didn't even consider (2) and (3) as possibilities. Those are far
from my mind. I think I'm used to VM languages at this point (C#,
Java, Actionscript 3, Haxe, Synergy/DE|DBL, etc). In the VM, (2) and
(3) can't happen. I never worry about those. Feel free to crash these
in D.

I encounter (4) a lot. I really don't want my programs crashed when
(4) happens. Such crashes would be super annoying, and they can happen
at very bad times.


You can use sentinels other than null.

-Steve


Example?

Here, if you want, I'll start with a typical case.  Please make it right.

class UnreliableResource
{
        this(string sourceFile) {...}
        this(uint userId) {...}
        void doTheThing() {...}
}

void main()
{
        // Set this to a sentinel value for cases where the source does
        //   not exist, thus preventing proper initialization of res.
        UnreliableResource res = null;

        // The point here is that obtaining this unreliable resource
        //   is tricky business, and therefore complicated.
        //
        if ( std.file.exists("some_special_file") )
        {
                res = new UnreliableResource("some_special_file");
        }
        else
        {
                uint uid = getUserIdSomehow();
                if ( isValidUserId(uid) )
                {
                        res = new UnreliableResource(uid);
                }
        }

        // Do some other stuff.
        ...
        
        // Now use the resource.
        try
        {
                thisCouldBreakButItWont(res);
        }
        // Fairly safe if we were in a reasonable VM.
        catch ( NullDerefException e )
        {
                writefln("This shouldn't happen, but it did.");
        }
}

void thisCouldBreakButItWont(UnreliableResource res)
{
        if ( res !is null )
        {
                res.doTheThing();
        }
        else
        {
                doSomethingUsefulThatCanHappenWhenResIsNotAvailable();
                writefln("Couldn't find the resource thingy.");
                writefln("Resetting the m-rotor.  (NOOoooo!)");
        }
}

Please follow these constraints:

- Do not use a separate boolean variable for determining whether or not 'res' could be created. This violates a kind of SSOT (http://en.wikipedia.org/wiki/Single_Source_of_Truth) because it allows cases where the hypothetical "resIsInitialized" variable is true but res isn't actually initialized, or where "resIsInitialized" is false but res is actually initialized. It also doesn't throw catchable exceptions when the uninitialized class has methods called on it. In my pansy VM-based languages I always prefer to risk the null sentinel.

- Do not modify the implementation of UnreliableResource. It's not always possible.

- Try to make the solution something that could, in principle, be placed into Phobos and reused without a lot of refactoring in the original code.


...

Now I will think about this a bit...

This reminds me a lot of algebraic data types. I kind of want to say something like:

auto res = empty | UnreliableResource;

and then unwrap it:

        ...
        thisCantBreakAnymore(res);
}

void thisCantBreakAnymore(UnreliableResource res)
{
        res.doTheThing();
}

void thisCantBreakAnymore(empty)
{
        doSomethingUsefulThatCanHappenWhenResIsNotAvailable();
        writefln("Couldn't find the resource thingy.");
        writefln("Resetting the m-rotor.  (NOOoooo!)");
}

I'm not absolutely sure I'd want to go that path though, and since D is unlikely to do any of those things, I just want to be able to catch an exception if the sentinel value tries to have the "doTheThing()" method called on it.
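
For what it's worth, the "empty | UnreliableResource" idea can be approximated in library code. The following is only a rough sketch, assuming a Phobos recent enough to ship std.sumtype. The Empty marker struct, the MaybeResource alias, the useResource function, and the stand-in body of UnreliableResource are all made up for illustration.

```d
import std.stdio : writeln;
import std.sumtype : SumType, match;

// Stand-in for the real class, which is assumed to be unmodifiable.
class UnreliableResource
{
    string source;
    this(string sourceFile) { source = sourceFile; }
    string doTheThing() { return "did the thing with " ~ source; }
}

// Hypothetical marker type meaning "no resource available".
struct Empty {}

// Either Empty or an UnreliableResource; default-initializes to Empty.
alias MaybeResource = SumType!(Empty, UnreliableResource);

string useResource(MaybeResource res)
{
    // match needs a handler for every member type, so the
    // Empty case cannot be silently forgotten.
    return res.match!(
        (UnreliableResource r) => r.doTheThing(),
        (Empty _) => "Couldn't find the resource thingy."
    );
}

void main()
{
    MaybeResource res = Empty();
    writeln(useResource(res));

    res = new UnreliableResource("some_special_file");
    writeln(useResource(res));
}
```

One caveat with this sketch: nothing stops a null UnreliableResource reference from being stored in the SumType, so it moves the sentinel out of the class reference only as long as callers play along.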


I can maybe see invariants being used for this:

class UnreliableResource
{
        bool initialized = false;

        invariant
        {
                if (!initialized)
                        throw new Exception("Not initialized.");
        }

        void initialize(string sourceFile)
        {
                ...
        }

        void initialize(uint userId)
        {
                ...
        }

        void doTheThing() {...}
}

But as I think about it, this approach already has a lot of problems:

- It violates the condition that UnreliableResource shouldn't be modified to solve the problem. Sometimes the class in question is upstream or otherwise not available for modification.


- I have to add this stupid boilerplate to every class.

- There could be a mixin template to ease the boilerplate, but the D spec states that there can be only one invariant in a class. Using such a mixin would nix my ability to have an invariant for other things.

- Calling initialize(...) would violate the invariant. It can't be initialized in the constructor because we need to be able to have the instance exist temporarily in a state where it is constructed from a nullary do-nothing constructor and remains uninitialized until a beneficial codepath initializes it properly.

- It will not be present in release mode. This could be a deal-breaker in some cases.

- Using this means that instances of UnreliableResource should just never be null, and thus I am required to do an allocation even when the program will take codepaths that don't actually use the class. I'm usually not concerned too much with premature optimization, but allocations are probably a nasty thing to sprinkle about unnecessarily.

Maybe a proxy struct with opDispatch and such could be used to get around these limitations?

Ex usage: Initializable!(UnreliableResource) res;
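
A minimal sketch of what such a proxy might look like, under my own assumptions: Initializable, its isInitialized method, and the NullDerefException class are all hypothetical names (neither D nor Phobos defines them), and the body of UnreliableResource is a stand-in.

```d
import std.stdio : writeln;

// Hypothetical exception type; not part of D or Phobos.
class NullDerefException : Exception
{
    this(string msg, string file = __FILE__, size_t line = __LINE__)
    {
        super(msg, file, line);
    }
}

// Hypothetical proxy: wraps a possibly-null class reference and turns
// method calls made in the null state into a catchable exception.
struct Initializable(T) if (is(T == class))
{
    private T payload;  // stays null until something is assigned

    void opAssign(T value) { payload = value; }

    bool isInitialized() const { return payload !is null; }

    // Forward any other method call to the wrapped object.
    auto opDispatch(string name, Args...)(auto ref Args args)
    {
        if (payload is null)
            throw new NullDerefException(
                "called " ~ name ~ " on uninitialized " ~ T.stringof);
        return mixin("payload." ~ name ~ "(args)");
    }
}

// Stand-in for the real class; note that it is not modified.
class UnreliableResource
{
    string doTheThing() { return "did the thing"; }
}

void main()
{
    Initializable!UnreliableResource res;  // no allocation yet

    try
    {
        res.doTheThing();
    }
    catch (NullDerefException e)
    {
        writeln("caught: ", e.msg);
    }

    res = new UnreliableResource;
    writeln(res.doTheThing());
}
```

This keeps null as the single source of truth, leaves UnreliableResource alone, and still works in release mode since the check is an ordinary if. The sketch only forwards method calls, not field access, and it pays for a null check on every forwarded call.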
