On 11/20/18 1:04 PM, Johan Engelen wrote:
> On Tuesday, 20 November 2018 at 03:38:14 UTC, Jonathan M Davis wrote:

>> For @safe to function properly, dereferencing null _must_ be guaranteed to be memory safe, and for dmd it is, since it will always segfault. Unfortunately, as I understand it, it is currently possible with ldc's optimizer to run into trouble, since it'll do things like see that something must be null and therefore assume that it must never be dereferenced, since it would clearly be wrong to dereference it. And then when the code hits a point where it _does_ try to dereference it, you get undefined behavior. It's something that needs to be fixed in ldc, but based on discussions I had with Johan at DConf this year about the issue, I suspect that the spec is going to have to be updated to be very clear on how dereferencing null has to be handled before the ldc guys do anything about it. As long as the optimizer doesn't get involved, everything is fine, but as great as optimizers can be at making code faster, they aren't really written with stuff like @safe in mind.
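To make the hazard concrete, here is a hypothetical sketch (not verified ldc output) of the kind of reasoning an optimizing backend may apply when it treats a null dereference as undefined behavior:

```
void g(bool cond)
{
    int* p = null;
    if (cond)
        *p = 42;  // optimizer: "dereferencing null would be wrong, so this
                  // must be unreachable", and it may delete the store or
                  // anything that depends on it; if cond is true at runtime,
                  // the result is undefined behavior, not a clean segfault
}
```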

> One big problem is the way people talk and write about this issue. There is a difference between "dereferencing" in the language and reading from a memory address by the CPU.

In general, I always consider "dereferencing" the point at which code follows a pointer to read or write its data. The semantics of modifying the type to mean the data vs. the pointer to it seem less interesting. Types are compiler-internal things; the actual reads and writes are what cause the problems.

But really, it's the act of using a pointer to read/write the data it points at which causes the segfault. And in D, we assume that this action is @safe because of the MMU protecting the first page.
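For a concrete picture, a minimal sketch of what that assumption rests on (a typical OS leaves the first page of the address space unmapped):

```
import std.stdio;

@safe void main()
{
    int* p = null;
    // The actual dereference: the CPU tries to read address 0, which lies
    // in the unmapped first page, so the MMU raises a fault -> segfault.
    writeln(*p);
}
```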

> Confusing language semantics with what the CPU is doing happens often in the D community and is not helping these debates.

> D is proclaiming that dereferencing `null` must segfault, but that is not implemented by any of the compilers. It would require inserting null checks upon every dereference. (This may not be as slow as you may think, but it would probably not make code run faster.)

> An example:
> ```
> class A {
>     int i;
>     final void foo() {
>         import std.stdio;
>         writeln(__LINE__);
>         // i = 5;
>     }
> }
>
> void main() {
>     A a;
>     a.foo();
> }
> ```
>
> In this case, the actual null dereference happens on the last line of main. The program runs fine, however, since dlang 2.077.

Right, the point is that the segfault happens when null pointers are used to get at the data. If you turn something that is ultimately a pointer into another type of pointer, you aren't really dereferencing it. This happens when you pass `*pointer` into a function that takes a reference (or when you pass around a class reference).
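For example (a sketch with hypothetical names; in practice nothing faults here because nothing ever reads through the pointer):

```
void takesRef(ref int r)
{
    // r is never read or written here, so nothing touches address 0
}

void main()
{
    int* p = null;
    takesRef(*p);  // `*p` here only reinterprets the pointer as a reference;
                   // no read of address 0 happens, so no segfault in practice
}
```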

In any case, versions prior to 2.077 didn't segfault; they just had a prelude in front of every function which asserted that `this` wasn't null (you actually got a nice stack trace).
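Conceptually (a sketch of the idea, not actual compiler output), that prelude amounted to:

```
class A
{
    final void foo()
    {
        assert(this !is null);  // roughly what pre-2.077 compilers inserted;
                                // failing it produced the stack trace
        // ... original body ...
    }
}
```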

> Now when `foo` is modified such that it writes to member field `i`, the program does segfault (writes to address 0). D does not make dereferencing on class objects explicit, which makes it harder to see where the dereference is happening.

Again, the terms are confusing. You just said the dereference happens at `a.foo()`, right? I would consider the dereference to happen when the object's data is used, i.e., when you read or write what the pointer points at.
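Concretely, the variant Johan describes (the write uncommented) shows that split:

```
class A
{
    int i;
    final void foo()
    {
        i = 5;  // the real dereference: a write through the null `this`,
                // i.e. to (near) address 0, so the segfault happens here
    }
}

void main()
{
    A a;       // a null class reference
    a.foo();   // `foo` is final, so the call loads no vtable and succeeds;
               // the fault only occurs at the member write inside `foo`
}
```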


> So, I think all compiler implementations are not spec compliant on this point.

I think if the spec says that dereferencing doesn't mean following a pointer to its data and reading/writing that data, and it says null dereferences cause a segfault, then the spec needs to be updated. The @safe segfault is what the spec should focus on, not some abstract concept that exists only in the compiler.

If it means changing the terminology, then we should do that.

> I think most people believe that compliance is too costly for the kind of software one wants to write in D; the issue is similar to array bounds checking that people explicitly disable or work around. For compliance we would need to change the compiler to emit null checks on all @safe dereferences (the opposite direction was chosen in 2.077). It'd be interesting to do the experiment.
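The experiment might look roughly like this sketch (the helper name is hypothetical; the compiler would lower each @safe dereference to something like it):

```
// Hypothetical lowering target: an explicit check replaces the MMU trap.
ref int checkedDeref(int* p) @trusted
{
    assert(p !is null, "null dereference");
    return *p;
}

@safe unittest
{
    int* p = new int;
    checkedDeref(p) = 5;          // checked write instead of a raw `*p = 5`
    assert(checkedDeref(p) == 5); // checked read
}
```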

The whole point of using the MMU instead of instrumentation is that we can avoid the performance penalties and still be safe. The only loophole is large structures that may extend beyond the protected page. I would suggest that the compiler inject an extra read of the front of any such type (when @safe is enabled) to cause a segfault properly.
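Here's a sketch of that loophole and the suggested probe (hypothetical names, assuming a 4 KB guard page):

```
struct Huge
{
    ubyte[8192] pad;  // larger than a typical 4 KB guard page
    int tail;
}

int readTail(Huge* p)
{
    auto probe = *cast(const(ubyte)*) p;  // the suggested injected read of
                                          // the front: faults if p is null
    return p.tail;  // without the probe, a null p reads near address 8192,
                    // which may be mapped, so no segfault would occur
}
```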

-Steve