On 11/20/18 1:04 PM, Johan Engelen wrote:
> On Tuesday, 20 November 2018 at 03:38:14 UTC, Jonathan M Davis wrote:

>> For @safe to function properly, dereferencing null _must_ be guaranteed to be memory safe, and for dmd it is, since it will always segfault. Unfortunately, as I understand it, it is currently possible with ldc's optimizer to run into trouble, since it'll do things like see that something must be null and therefore assume that it must never be dereferenced, since it would clearly be wrong to dereference it. And then when the code hits a point where it _does_ try to dereference it, you get undefined behavior. It's something that needs to be fixed in ldc, but based on discussions I had with Johan at DConf this year about the issue, I suspect that the spec is going to have to be updated to be very clear on how dereferencing null has to be handled before the ldc guys do anything about it. As long as the optimizer doesn't get involved, everything is fine, but as great as optimizers can be at making code faster, they aren't really written with stuff like @safe in mind.
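To make the hazard concrete, here is a hypothetical sketch (not verified ldc output) of the kind of reasoning an optimizing backend may apply when it treats a null dereference as undefined behavior:

```
void g(bool cond)
{
    int* p = null;
    if (cond)
        *p = 42;  // optimizer: "dereferencing null would be wrong, so this
                  // must be unreachable", and it may delete the store or
                  // anything that depends on it; if cond is true at runtime,
                  // the result is undefined behavior, not a clean segfault
}
```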

> One big problem is the way people talk and write about this issue. There is a difference between "dereferencing" in the language and reading from a memory address by the CPU.

In general, I always consider "dereferencing" the point at which code follows a pointer to read or write its data. The semantics of modifying the type to mean the data vs. the pointer to it seem less interesting. Types are compiler-internal things; the actual reads and writes are what cause the problems.

But really, it's the act of using a pointer to read/write the data it points at which causes the segfault. And in D, we assume that this action is @safe because of the MMU protecting the first page.
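For a concrete picture, a minimal sketch of what that assumption rests on (a typical OS leaves the first page of the address space unmapped):

```
import std.stdio;

@safe void main()
{
    int* p = null;
    // The actual dereference: the CPU tries to read address 0, which lies
    // in the unmapped first page, so the MMU raises a fault -> segfault.
    writeln(*p);
}
```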

> Confusing language semantics with what the CPU is doing happens often in the D community and is not helping these debates.

> D is proclaiming that dereferencing `null` must segfault, but that is not implemented by any of the compilers. It would require inserting null checks upon every dereference. (This may not be as slow as you may think, but it would probably not make code run faster.)

> An example:
> ```
> class A {
>     int i;
>     final void foo() {
>         import std.stdio;
>         writeln(__LINE__);
>         // i = 5;
>     }
> }
>
> void main() {
>     A a;
>     a.foo();
> }
> ```
>
> In this case, the actual null dereference happens on the last line of main. The program runs fine, however, since dlang 2.077.

Right, the point is that the segfault happens when null pointers are used to get at the data. If you turn something that is ultimately a pointer into another type of pointer, you aren't really dereferencing it. This happens when you pass `*pointer` into a function that takes a reference (or when you pass around a class reference).
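For example (a sketch with hypothetical names; in practice nothing faults here because nothing ever reads through the pointer):

```
void takesRef(ref int r)
{
    // r is never read or written here, so nothing touches address 0
}

void main()
{
    int* p = null;
    takesRef(*p);  // `*p` here only reinterprets the pointer as a reference;
                   // no read of address 0 happens, so no segfault in practice
}
```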

In any case, versions prior to 2.077 didn't segfault; they just had a prelude in front of every function which asserted that `this` wasn't null (you actually got a nice stack trace).
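Conceptually (a sketch of the idea, not actual compiler output), that prelude amounted to:

```
class A
{
    final void foo()
    {
        assert(this !is null);  // roughly what pre-2.077 compilers inserted;
                                // failing it produced the stack trace
        // ... original body ...
    }
}
```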

> Now when `foo` is modified such that it writes to member field `i`, the program does segfault (writes to address 0). D does not make dereferencing on class objects explicit, which makes it harder to see where the dereference is happening.

Again, the terms are confusing. You just said the dereference happens at `a.foo()`, right? I would consider the dereference to happen when the object's data is used, i.e., when you read or write what the pointer points at.
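Concretely, the variant Johan describes (the write uncommented) shows that split:

```
class A
{
    int i;
    final void foo()
    {
        i = 5;  // the real dereference: a write through the null `this`,
                // i.e. to (near) address 0, so the segfault happens here
    }
}

void main()
{
    A a;       // a null class reference
    a.foo();   // `foo` is final, so the call loads no vtable and succeeds;
               // the fault only occurs at the member write inside `foo`
}
```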


> So, I think all compiler implementations are not spec compliant on this point.

I think if the spec says that dereferencing doesn't mean following a pointer to its data and reading/writing that data, and it says null dereferences cause a segfault, then the spec needs to be updated. The @safe segfault is what the spec should focus on, not some abstract concept that exists only in the compiler.

If it means changing the terminology, then we should do that.

> I think most people believe that compliance is too costly for the kind of software one wants to write in D; the issue is similar to array bounds checking that people explicitly disable or work around. For compliance we would need to change the compiler to emit null checks on all @safe dereferences (the opposite direction was chosen in 2.077). It'd be interesting to do the experiment.
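The experiment might look roughly like this sketch (the helper name is hypothetical; the compiler would lower each @safe dereference to something like it):

```
// Hypothetical lowering target: an explicit check replaces the MMU trap.
ref int checkedDeref(int* p) @trusted
{
    assert(p !is null, "null dereference");
    return *p;
}

@safe unittest
{
    int* p = new int;
    checkedDeref(p) = 5;          // checked write instead of a raw `*p = 5`
    assert(checkedDeref(p) == 5); // checked read
}
```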

The whole point of using the MMU instead of instrumentation is that we can avoid the performance penalties and still be safe. The only loophole is large structures that may extend beyond the protected page. I would suggest that the compiler inject an extra read of the front of any such type (when @safe is enabled) to cause a segfault properly.
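Here's a sketch of that loophole and the suggested probe (hypothetical names, assuming a 4 KB guard page):

```
struct Huge
{
    ubyte[8192] pad;  // larger than a typical 4 KB guard page
    int tail;
}

int readTail(Huge* p)
{
    auto probe = *cast(const(ubyte)*) p;  // the suggested injected read of
                                          // the front: faults if p is null
    return p.tail;  // without the probe, a null p reads near address 8192,
                    // which may be mapped, so no segfault would occur
}
```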

-Steve