Re: null dereference exception vs. segfault?
Pelle:
> If NotNull will be in a library, it should probably use enforce, if I have understood things correctly. External input, and all that. I think most of phobos does it like this currently.

I suspect that Andrei has still to "get" DbC :-)
(And your lib is not Phobos.)

Bye,
bearophile
Re: null dereference exception vs. segfault?
In the meantime I have written the first part of the Bugzilla entry about non-null:
http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=114391

Bye,
bearophile
Re: null dereference exception vs. segfault?
On 08/03/2010 01:08 AM, bearophile wrote:
> Pelle:
>> struct NotNull(T) if(is(typeof(T.init !is null))) {
>
> Is this enough?
> struct NotNull(T) if (is(T.init is null)) {
>
>> this(T t) {
>> enforce(t !is null, "Cannot create NotNull from null");
>
> enforce() is bad, use Design by Contract instead (a precondition with an assert inside).
>
> Bye,
> bearophile

If NotNull will be in a library, it should probably use enforce, if I have understood things correctly. External input, and all that. I think most of phobos does it like this currently.
Re: null dereference exception vs. segfault?
> Is this enough?
> struct NotNull(T) if (is(T.init is null)) {

Sorry, I meant:

struct NotNull(T) if (T.init is null) {

Bye,
bearophile
Re: null dereference exception vs. segfault?
Pelle:
> struct NotNull(T) if(is(typeof(T.init !is null))) {

Is this enough?
struct NotNull(T) if (is(T.init is null)) {

> this(T t) {
> enforce(t !is null, "Cannot create NotNull from null");

enforce() is bad, use Design by Contract instead (a precondition with an assert inside).

Bye,
bearophile
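As a rough illustration of the "precondition with an assert inside" style suggested here, a contract might look like the sketch below. The `Widget` class and the `sizeOf` function are hypothetical names invented for this example, and the sketch uses the `in`/`do` spelling of current compilers (compilers from the time of this thread spelled the body keyword `body`).

```d
// Hypothetical class, used only for illustration.
class Widget
{
    int size = 3;
}

// The precondition states that callers must not pass null.
// Contracts and asserts are compiled out when building with -release.
int sizeOf(Widget w)
in
{
    assert(w !is null, "w must not be null");
}
do
{
    return w.size;
}
```

With contracts enabled, a caller passing null gets an assert error that names the file and line rather than a segfault.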
Re: null dereference exception vs. segfault?
On 08/03/2010 12:32 AM, bearophile wrote:
> Pelle:
>> I think a good thing would be NonNull!T, but I haven't managed to create one. If this structure exists and becomes good practice to use, maybe we can get the good syntax in D3. In 20 years or so :P
>
> Maybe we are talking about two different things, I was talking about nonnull class references/pointers, you seem to talk about nullable values :-) Both can be useful in D, but they are different things. Nullable values are simpler to design, they are just wrapper structs that contain a value plus a boolean, plus if you want some syntax sugar to manage them with a shorter syntax.
>
> Bye,
> bearophile

I am talking about non-nullable references indeed. I don't think I mentioned nullable types, really.

I also created this, as the simplest NotNull-type conceivable:

struct NotNull(T) if(is(typeof(T.init !is null))) {
    private T _instance;

    this(T t) {
        enforce(t !is null, "Cannot create NotNull from null");
        _instance = t;
    }

    T get() {
        assert (_instance !is null,
            text("Supposed NotNull!(", T.stringof, ") is null"));
        return _instance;
    }

    alias get this;
}

This has the obvious bug in that you can declare a nonnull without an initializer and get a null from it. If we ever get @disable this(){} for structs, this struct can become better.

I'll probably try it out in some code.
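For readers wondering what the struct might look like once default construction can be disabled, here is a minimal sketch. It assumes a compiler that accepts `@disable this();` on structs (which the post above says was not available at the time), takes `enforce` from `std.exception`, and uses a hypothetical `Engine` class purely as a sample payload. It is not Pelle's final code.

```d
import std.exception : enforce;

// Hypothetical payload class, for illustration only.
class Engine
{
    int rpm = 900;
}

struct NotNull(T) if (is(typeof(T.init is null)))
{
    private T _instance;

    // Assumption: the compiler supports disabling default construction,
    // so "NotNull!Engine e;" without an initializer is rejected.
    @disable this();

    this(T t)
    {
        enforce(t !is null, "Cannot create NotNull from null");
        _instance = t;
    }

    T get()
    {
        assert(_instance !is null, "NotNull wrapper holds null");
        return _instance;
    }

    // Forward member access to the wrapped reference.
    alias get this;
}
```

Usage would be along the lines of `auto e = NotNull!Engine(new Engine);`, after which `e.rpm` reads through the wrapper.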
Re: null dereference exception vs. segfault?
Pelle:
> I think a good thing would be NonNull!T, but I haven't managed to create one. If this structure exists and becomes good practice to use, maybe we can get the good syntax in D3. In 20 years or so :P

Maybe we are talking about two different things, I was talking about nonnull class references/pointers, you seem to talk about nullable values :-) Both can be useful in D, but they are different things. Nullable values are simpler to design, they are just wrapper structs that contain a value plus a boolean, plus if you want some syntax sugar to manage them with a shorter syntax.

Bye,
bearophile
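The "value plus a boolean" wrapper described above can be sketched in a few lines. The name `Maybe` and its members are hypothetical, chosen for this example; Phobos has its own `std.typecons.Nullable`, which this sketch does not try to reproduce.

```d
// A nullable value: the payload plus a flag saying whether it is set.
struct Maybe(T)
{
    private T _value;
    private bool _hasValue = false;

    this(T value)
    {
        _value = value;
        _hasValue = true;
    }

    // True when no value has been stored.
    bool isNull() const
    {
        return !_hasValue;
    }

    // Returns the stored value; asserts that one is present.
    T get()
    {
        assert(_hasValue, "Maybe is empty");
        return _value;
    }
}
```

A default-constructed `Maybe!int` reports `isNull`, while `Maybe!int(5)` holds a value.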
Re: null dereference exception vs. segfault?
On 08/03/2010 12:02 AM, bearophile wrote:
> Pelle:
>> What I really wish for is non-nullable types, though. Maybe in D3... :P
>
> I think there is no enhancement request in Bugzilla about this, I will add one.

I think there has been, at least this has been discussed on the newsgroup.

> To implement this you have to think about the partially uninitialized objects too, this is a paper about it, given a class type T it defines four types (I think the four types are managed by the compiler only, the programmer uses only two of them, nullable class references and nonnullable ones):
> http://research.microsoft.com/pubs/67461/non-null.pdf
>
> If a language defaults to nonnullable references, then you can use this syntax:
>
> class T {}
> T nonnullable_instance = new T;
> T? nullable_instance;
>
> But now it's probably nearly impossible to make D references nonnullable on default, so that syntax can't be used. And I don't know what syntax to use yet. Suggestions welcome.
>
> Bye,
> bearophile

That is a good syntax indeed. What is also needed is a way of conditionally getting the reference out of the nullable. I think Delight uses something like this:

T? nullable;
if actual = nullable:
    actual.dostuff;

I think a good thing would be NonNull!T, but I haven't managed to create one. If this structure exists and becomes good practice to use, maybe we can get the good syntax in D3. In 20 years or so :P
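For comparison with the Delight snippet, D as it stands already allows a declaration inside an `if` condition, which gives a similar "use it only if it is there" shape for ordinary (nullable) class references. A small sketch, with `Task` and `runIfPresent` as hypothetical names:

```d
// Hypothetical class, for illustration only.
class Task
{
    int doStuff()
    {
        return 1;
    }
}

// Returns the result of doStuff, or -1 when the reference is null.
int runIfPresent(Task maybe)
{
    if (auto actual = maybe)   // branch taken only when maybe is non-null
        return actual.doStuff();
    return -1;
}
```

This does not make the type system track nullability; it only scopes the checked reference to the branch where it is known to be non-null.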
Re: null dereference exception vs. segfault?
> But now it's probably nearly impossible to make D references nonnullable on default, so that syntax can't be used. And I don't know what syntax to use yet. Suggestions welcome.

One of the few ideas I have had is to use the @ suffix for this:

class T {}
T nullable_reference;
T@ nonnullable_reference = new T@();

struct S {}
S nullable_pointer;
S@ nonnullable_pointer = new S@();

(Beside nonnullable class references/pointers, another way to catch bugs that I miss in D are the ranged integers of ObjectPascal/Ada. Walter doesn't like them, I think he thinks they are a failed idea, but I don't agree and I don't remember why he thinks so.)

Bye,
bearophile
Re: null dereference exception vs. segfault?
Pelle:
> Null Pointer Exception!

Ah, I see. I hate TLA (Three Letter Acronyms).

> What I really wish for is non-nullable types, though. Maybe in D3... :P

I think there is no enhancement request in Bugzilla about this, I will add one.

To implement this you have to think about the partially uninitialized objects too, this is a paper about it, given a class type T it defines four types (I think the four types are managed by the compiler only, the programmer uses only two of them, nullable class references and nonnullable ones):
http://research.microsoft.com/pubs/67461/non-null.pdf

If a language defaults to nonnullable references, then you can use this syntax:

class T {}
T nonnullable_instance = new T;
T? nullable_instance;

But now it's probably nearly impossible to make D references nonnullable on default, so that syntax can't be used. And I don't know what syntax to use yet. Suggestions welcome.

Bye,
bearophile
Re: null dereference exception vs. segfault?
On 08/02/2010 11:27 PM, bearophile wrote:
> Ryan W Sims:
>> The problem isn't how to check it on a case-by-case basis, there are plenty of ways to check that a given pointer is non-null. The problem is debugging _unexpected_ null dereferences, for which a NPE or its equivalent is very helpful, a segfault is _not_.
>
> I don't know what NPE is, but if you program with DbC your nulls are very often found out by asserts, so you have assert errors (that show line number & file name) instead of segfaults.

Null Pointer Exception!

However, I agree with getting segfaults from them. Otherwise, you will be tempted to use the exception handling mechanisms to catch null pointer exceptions, which is a bad thing.

I also agree with the notion of using DbC to find nulls.

What I really wish for is non-nullable types, though. Maybe in D3... :P
Re: null dereference exception vs. segfault?
Jonathan M Davis:
> As for indexing into an array, the array itself should be null or not. It has no size if it's null, so it makes no sense to talk about large arrays which are null.

Technically dynamic arrays in D are represented with a 2-word struct that contains a pointer and length. So empty dynamic arrays are two zero words.

In D there is also the literal [] that in my opinion is better to represent an empty array than just "null":
http://d.puremagic.com/issues/show_bug.cgi?id=3889

Bye,
bearophile
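The two-word representation described above can be checked directly. The following is a small sketch (the function name `emptyArraysAgree` is made up for this example) that compares a default-initialized dynamic array with one initialized from the `[]` literal:

```d
// Returns true when a default-initialized array and a []-initialized
// array both behave as empty arrays.
bool emptyArraysAgree()
{
    int[] a;        // default-initialized: null pointer, zero length
    int[] b = [];   // empty array literal

    // A dynamic array is a (length, pointer) pair: two machine words.
    static assert((int[]).sizeof == 2 * size_t.sizeof);

    return a is null
        && a.ptr is null
        && a.length == 0
        && b.length == 0
        && a == b;      // element-wise comparison of two empty arrays
}
```

Both arrays have length 0 and compare equal, which is why "a large array that is null" is not a meaningful state for a dynamic array.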
Re: null dereference exception vs. segfault?
Ryan W Sims:
> The problem isn't how to check it on a case-by-case basis, there are plenty of ways to check that a given pointer is non-null. The problem is debugging _unexpected_ null dereferences, for which a NPE or its equivalent is very helpful, a segfault is _not_.

I don't know what NPE is, but if you program with DbC your nulls are very often found out by asserts, so you have assert errors (that show line number & file name) instead of segfaults.

> Sorry, didn't mean to reopen a can of worms, just wanted to be clear.

When people that discuss are polite there is no problem in reopening the can now and then :-)

Bye,
bearophile
Re: null dereference exception vs. segfault?
On Monday, August 02, 2010 08:34:50 Jeffrey Yasskin wrote:
> That's good to know. Unfortunately, reading through a null pointer does cause undefined behavior: it's not a guaranteed segfault. Consider an object with a large array at the beginning, which pushes later members past the empty pages at the beginning of the address space. I don't suppose the D compiler watches for such large objects and emits actual null checks before indexing into them?

There are no null checks. When people have requested in the past that null checks be added (like you'd get in Java), Walter has indicated that he thought that there was no point to them because the OS takes care of them already by giving you a segfault. I'm not personally well-versed enough in exactly what goes on at the hardware or OS level to produce a segfault, so I can't say whether a segfault is absolutely guaranteed. It has been my understanding that it is.

As for indexing into an array, the array itself should be null or not. It has no size if it's null, so it makes no sense to talk about large arrays which are null. On top of that, bounds checking is usually done on arrays (off of the top of my head, I don't remember the exact circumstances under which it's removed, but it's almost always there), so you wouldn't be able to index past its end, and if it's an element of the array that you're dereferencing, then whether that element is null or not will determine whether it segfaults.

>> The pages that you're looking at there need to be updated for clarity.
>
> Nice use of the passive voice. Who needs to update them? Is their source somewhere you or I could send a patch?

Submit a bug report to bugzilla: http://d.puremagic.com/issues/

- Jonathan M Davis
Re: null dereference exception vs. segfault?
On 8/2/10 10:33 AM, bearophile wrote:
> Mafi:
>> If you want a NullPointerException as part of your program flow, you can use enforce() (in std.contracts I think). I don't think catching a NullPointerException in a big code block where you don't know which dereferencing should fail is good style.
>
> enforce() is not a panacea (panchrest); as far as I know DMD doesn't inline any function that contains enforce(). So sometimes an assert() is better, especially if it's inside a contract (precondition, etc). Design by Contract-style programming is not something that just happens, you have to train yourself for some time for it.
>
> Bye,
> bearophile

The problem isn't how to check it on a case-by-case basis, there are plenty of ways to check that a given pointer is non-null. The problem is debugging _unexpected_ null dereferences, for which a NPE or its equivalent is very helpful, a segfault is _not_.

Sorry, didn't mean to reopen a can of worms, just wanted to be clear.

--
rwsims
Re: null dereference exception vs. segfault?
Mafi:
> If you want a NullPointerException as part of your program flow, you can use enforce() (in std.contracts I think). I don't think catching a NullPointerException in a big code block where you don't know which dereferencing should fail is good style.

enforce() is not a panacea (panchrest); as far as I know DMD doesn't inline any function that contains enforce(). So sometimes an assert() is better, especially if it's inside a contract (precondition, etc). Design by Contract-style programming is not something that just happens, you have to train yourself for some time for it.

Bye,
bearophile
Re: null dereference exception vs. segfault?
On 02.08.2010 16:50, Ryan W Sims wrote:
> On 8/2/10 1:56 AM, Jonathan M Davis wrote:
>> On Sunday 01 August 2010 21:59:42 Ryan W Sims wrote:
>>> The following code fails with a "Bus error" (OSX speak for "Segfault," if I understand correctly).
>>>
>>> // types.d
>>> import std.stdio;
>>>
>>> class A {
>>>     int x = 42;
>>> }
>>>
>>> void fail_sometimes(int n) {
>>>     A a;
>>>     if (n == 0) {
>>>         a = new A;  // clearly a contrived example
>>>     }
>>>     assert(a.x == 42, "Wrong x value");
>>> }
>>>
>>> void main() {
>>>     fail_sometimes(1);
>>> }
>>>
>>> It's even worse if I do a 'dmd -run types.d', it just fails without even the minimalistic "Bus error." Is this correct behavior? I searched the archives & looked at the FAQ & found workarounds (registering a signal handler), but not a justification, and the threads were from a couple years ago. Wondering if maybe something has changed and there's a problem with my system?
>>>
>>> --
>>> rwsims
>>
>> You are getting a segmentation fault because you are dereferencing a null reference. All references are default initialized to null. So, if you fail to explicitly initialize them or to assign to them, then they stay null, and in such a case, you will get a segfault if you try to dereference them.
>
> Yes, I know *why* I'm getting a segfault, thank you - I set up the example explicitly to defeat the compiler's null checking to test the behavior. I was startled that there wasn't an exception thrown w/ a stack trace.
>
>> [snip]
>>
>> Unlike Java, there is no such thing as a NullPointerException in D. You just get segfaults - just like you would in C++. So, if you don't want segfaults from dereferencing null references, you need to make sure that they aren't null when you dereference them.
>>
>> - Jonathan M Davis
>
> That was my question, thanks. It seemed like such an un-D thing to have happen; I was surprised. I guess w/o the backing of a full virtual machine, it's trickier to catch null dereferences on the fly, but boy it'd be nice to have. Don't want to re-fire the debate here, though.
>
> --
> rwsims

If you want a NullPointerException as part of your program flow, you can use enforce() (in std.contracts I think). I don't think catching a NullPointerException in a big code block where you don't know which dereferencing should fail is good style.

Mafi
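To make the enforce() suggestion concrete, here is a small sketch of checking a reference right at the point of use, so the failure is a catchable exception that names the problem. It assumes `enforce` from `std.exception`, which is where current Phobos keeps it (the post above places it in std.contracts); `readX` and `readXOrDefault` are hypothetical helper names.

```d
import std.exception : enforce;

class A
{
    int x = 42;
}

// Throws an Exception with a message instead of segfaulting on null.
int readX(A a)
{
    enforce(a !is null, "a is null");
    return a.x;
}

// Narrow try block: only this one dereference can be the cause.
int readXOrDefault(A a, int fallback)
{
    try
    {
        return readX(a);
    }
    catch (Exception e)
    {
        return fallback;
    }
}
```

Keeping the check next to a single dereference follows the point made above: the handler knows exactly which access failed.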
Re: null dereference exception vs. segfault?
Jeffrey Yasskin:
> That's good to know. Unfortunately, reading through a null pointer does cause undefined behavior: it's not a guaranteed segfault. Consider an object with a large array at the beginning, which pushes later members past the empty pages at the beginning of the address space. I don't suppose the D compiler watches for such large objects and emits actual null checks before indexing into them?

I am not expert enough to give you a good answer about this, but do some tests :-)
And later if you want you may say the same things in the main D newsgroup.

Bye,
bearophile
Re: null dereference exception vs. segfault?
On Mon, Aug 2, 2010 at 1:49 AM, Jonathan M Davis wrote:
> On Monday 02 August 2010 00:05:40 Jeffrey Yasskin wrote:
>> Even better, you can annotate fail_sometimes with @safe, and it'll still access out-of-bounds memory.
>>
>> Take the following with a grain of salt since I'm really new to the language.
>>
>> gdb says:
>> Reason: KERN_PROTECTION_FAILURE at address: 0x0008
>> 0x1e52 in D4test14fail_sometimesFiZv ()
>>
>> which indicates that 'a' is getting initialized to null (possibly by process startup 0ing out the stack), and then x is being read out of it. You can get exactly the same crashes in C++ by reading member variables out of null pointers. The D compiler is supposed to catch the uninitialized variable ("It is an error to use a local variable without first assigning it a value." in http://www.digitalmars.com/d/2.0/function.html), but clearly it's missing this one.
>>
>> I haven't actually found where in the language spec it says that class variables are pointers, or what their default values are. I'd expect to find this in http://www.digitalmars.com/d/2.0/type.html, but no luck.
>>
>> Looking through the bug tracker ... Walter's response to http://d.puremagic.com/issues/show_bug.cgi?id=671 seems to indicate that he isn't serious about uninitialized use being an error. It's just undefined behavior like in C++.
>>
>> In any case, the fix for your problem will be to initialize 'a' before using it.
>
> _All_ variables in D are initialized with a default value. There should be _no_ undefined behavior with regards to initializations. D is very conscientious about avoiding undefined behavior. In the case of references and pointers, they are initialized to null.

That's good to know. Unfortunately, reading through a null pointer does cause undefined behavior: it's not a guaranteed segfault. Consider an object with a large array at the beginning, which pushes later members past the empty pages at the beginning of the address space. I don't suppose the D compiler watches for such large objects and emits actual null checks before indexing into them?

> The pages that you're looking at there need to be updated for clarity.

Nice use of the passive voice. Who needs to update them? Is their source somewhere you or I could send a patch?
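The large-object scenario can be made concrete with a sketch. The class `Big`, the 4096-byte figure for the unmapped region, and the `readTail` helper are all assumptions made for this example; the actual size of the protected region at address zero depends on the OS and is not stated in the thread.

```d
// Hypothetical class whose last field lies about 1 MiB into the object.
class Big
{
    ubyte[1 << 20] padding;
    int tail;
}

// Assumption for illustration: only the first 4096 bytes are unmapped.
enum assumedGuardSize = 4096;

// The instance is far larger than the assumed guard region, so an access
// to `tail` through a null reference would land well beyond it.
static assert(__traits(classInstanceSize, Big) > assumedGuardSize);

// An explicit check, since a hardware trap cannot be relied on here.
int readTail(Big b)
{
    if (b is null)
        throw new Exception("null Big reference");
    return b.tail;
}
```

Whether a given compiler inserts such a check automatically is exactly the open question in the post above; the sketch only shows the manual version.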
Re: null dereference exception vs. segfault?
On 8/2/10 1:56 AM, Jonathan M Davis wrote:
> On Sunday 01 August 2010 21:59:42 Ryan W Sims wrote:
>> The following code fails with a "Bus error" (OSX speak for "Segfault," if I understand correctly).
>>
>> // types.d
>> import std.stdio;
>>
>> class A {
>>     int x = 42;
>> }
>>
>> void fail_sometimes(int n) {
>>     A a;
>>     if (n == 0) {
>>         a = new A;  // clearly a contrived example
>>     }
>>     assert(a.x == 42, "Wrong x value");
>> }
>>
>> void main() {
>>     fail_sometimes(1);
>> }
>>
>> It's even worse if I do a 'dmd -run types.d', it just fails without even the minimalistic "Bus error." Is this correct behavior? I searched the archives & looked at the FAQ & found workarounds (registering a signal handler), but not a justification, and the threads were from a couple years ago. Wondering if maybe something has changed and there's a problem with my system?
>>
>> --
>> rwsims
>
> You are getting a segmentation fault because you are dereferencing a null reference. All references are default initialized to null. So, if you fail to explicitly initialize them or to assign to them, then they stay null, and in such a case, you will get a segfault if you try to dereference them.

Yes, I know *why* I'm getting a segfault, thank you - I set up the example explicitly to defeat the compiler's null checking to test the behavior. I was startled that there wasn't an exception thrown w/ a stack trace.

> [snip]
>
> Unlike Java, there is no such thing as a NullPointerException in D. You just get segfaults - just like you would in C++. So, if you don't want segfaults from dereferencing null references, you need to make sure that they aren't null when you dereference them.
>
> - Jonathan M Davis

That was my question, thanks. It seemed like such an un-D thing to have happen; I was surprised. I guess w/o the backing of a full virtual machine, it's trickier to catch null dereferences on the fly, but boy it'd be nice to have. Don't want to re-fire the debate here, though.

--
rwsims
Re: null dereference exception vs. segfault?
On Mon, 02 Aug 2010 00:59:42 -0400, Ryan W Sims wrote:
> The following code fails with a "Bus error" (OSX speak for "Segfault," if I understand correctly).
>
> // types.d
> import std.stdio;
>
> class A {
>     int x = 42;
> }
>
> void fail_sometimes(int n) {
>     A a;
>     if (n == 0) {
>         a = new A;  // clearly a contrived example
>     }
>     assert(a.x == 42, "Wrong x value");
> }
>
> void main() {
>     fail_sometimes(1);
> }
>
> It's even worse if I do a 'dmd -run types.d', it just fails without even the minimalistic "Bus error." Is this correct behavior? I searched the archives & looked at the FAQ & found workarounds (registering a signal handler), but not a justification, and the threads were from a couple years ago. Wondering if maybe something has changed and there's a problem with my system?

I'm not familiar with dmd -run, but you should be aware that asserts are not compiled into release code. Try changing the assert to this:

if(a.x != 42) writeln("Wrong x value");

FWIW, D does not have null pointer exceptions, even in debug mode. It's an oft-debated subject, but Walter hasn't ever budged on it. His view is that you should use a debugger to see where your code is failing. We have pointed out countless times that often it's not possible to have a debugger at hand, or even be able to reproduce the issue that caused the segfault while in a different environment. I don't know if we'll ever see null pointer exceptions, but I'd love them in debug mode only, or at least to see a stack trace when it occurs.

The latter can be done without Phobos/dmd help if someone can write such a signal handler function. I don't know enough about stack traces to understand how to do it.

-Steve
Re: null dereference exception vs. segfault?
Jonathan M Davis:
> _All_ variables in D are initialized with a default value. There should be _no_ undefined behavior with regards to initializations. D is very conscientious about avoiding undefined behavior.

See also:
http://d.puremagic.com/issues/show_bug.cgi?id=3820

Bye,
bearophile
Re: null dereference exception vs. segfault?
On Sunday 01 August 2010 21:59:42 Ryan W Sims wrote:
> The following code fails with a "Bus error" (OSX speak for "Segfault," if I understand correctly).
>
> // types.d
> import std.stdio;
>
> class A {
>     int x = 42;
> }
>
> void fail_sometimes(int n) {
>     A a;
>     if (n == 0) {
>         a = new A;  // clearly a contrived example
>     }
>     assert(a.x == 42, "Wrong x value");
> }
>
> void main() {
>     fail_sometimes(1);
> }
>
> It's even worse if I do a 'dmd -run types.d', it just fails without even the minimalistic "Bus error." Is this correct behavior? I searched the archives & looked at the FAQ & found workarounds (registering a signal handler), but not a justification, and the threads were from a couple years ago. Wondering if maybe something has changed and there's a problem with my system?
>
> --
> rwsims

You are getting a segmentation fault because you are dereferencing a null reference. All references are default initialized to null. So, if you fail to explicitly initialize them or to assign to them, then they stay null, and in such a case, you will get a segfault if you try to dereference them.

If you changed your code to

import std.stdio;

class A {
    int x = 42;
}

void fail_sometimes(int n) {
    A a;
    if (n == 0) {
        a = new A;  // clearly a contrived example
    }
    assert(a !is null, "a shouldn't be null");
    assert(a.x == 42, "Wrong x value");
}

void main() {
    fail_sometimes(1);
}

you would get output something like this

core.exception.asserter...@types.d(12): a shouldn't be null
./types() [0x804b888]
./types() [0x8049360]
./types() [0x8049399]
./types() [0x804ba54]
./types() [0x804b9b9]
./types() [0x804ba91]
./types() [0x804b9b9]
./types() [0x804b968]
/opt/lib32/lib/libc.so.6(__libc_start_main+0xe6) [0xf760bc76]
./types() [0x8049261]

Unlike Java, there is no such thing as a NullPointerException in D. You just get segfaults - just like you would in C++. So, if you don't want segfaults from dereferencing null references, you need to make sure that they aren't null when you dereference them.

- Jonathan M Davis
Re: null dereference exception vs. segfault?
On Monday 02 August 2010 00:05:40 Jeffrey Yasskin wrote:
> Even better, you can annotate fail_sometimes with @safe, and it'll still access out-of-bounds memory.
>
> Take the following with a grain of salt since I'm really new to the language.
>
> gdb says:
> Reason: KERN_PROTECTION_FAILURE at address: 0x0008
> 0x1e52 in D4test14fail_sometimesFiZv ()
>
> which indicates that 'a' is getting initialized to null (possibly by process startup 0ing out the stack), and then x is being read out of it. You can get exactly the same crashes in C++ by reading member variables out of null pointers. The D compiler is supposed to catch the uninitialized variable ("It is an error to use a local variable without first assigning it a value." in http://www.digitalmars.com/d/2.0/function.html), but clearly it's missing this one.
>
> I haven't actually found where in the language spec it says that class variables are pointers, or what their default values are. I'd expect to find this in http://www.digitalmars.com/d/2.0/type.html, but no luck.
>
> Looking through the bug tracker ... Walter's response to http://d.puremagic.com/issues/show_bug.cgi?id=671 seems to indicate that he isn't serious about uninitialized use being an error. It's just undefined behavior like in C++.
>
> In any case, the fix for your problem will be to initialize 'a' before using it.

_All_ variables in D are initialized with a default value. There should be _no_ undefined behavior with regards to initializations. D is very conscientious about avoiding undefined behavior. In the case of references and pointers, they are initialized to null. There's not really such a thing as using a variable without initializing it, because variables are default initialized if you don't initialize them yourself.

The _one_ exception would be if you explicitly initialized a variable to void:

int[] a = void;

In that case, you are _explicitly_ telling the compiler not to default initialize the variable. That _can_ lead to undefined behavior and is definitely unsafe. As such, it is intended solely for the purposes of optimizing code where absolutely necessary.

So, you really shouldn't have any variables in your code that weren't initialized, even if you didn't initialize them explicitly. The pages that you're looking at there need to be updated for clarity.

- Jonathan M Davis
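The default-initialization rule described above can be checked with a few locals. This is only a sketch; the function name `defaultsAreAsDescribed` is invented for the example, and `isNaN` is taken from `std.math`.

```d
import std.math : isNaN;

class A
{
    int x = 42;
}

// Returns true when locals declared without an initializer hold
// their type's default value.
bool defaultsAreAsDescribed()
{
    int i;        // integers default to 0
    double d;     // floating point defaults to NaN
    char c;       // char defaults to 0xFF
    A a;          // class references default to null
    int[] arr;    // dynamic arrays default to null (empty)

    // Writing "int[] raw = void;" instead would skip default
    // initialization; reading raw before assigning it is unsafe.

    return i == 0 && isNaN(d) && c == 0xFF && a is null && arr is null;
}
```

So the `A a;` in the original example is a well-defined null reference; the problem is only the later dereference.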
Re: null dereference exception vs. segfault?
Even better, you can annotate fail_sometimes with @safe, and it'll still access out-of-bounds memory.

Take the following with a grain of salt since I'm really new to the language.

gdb says:
Reason: KERN_PROTECTION_FAILURE at address: 0x0008
0x1e52 in D4test14fail_sometimesFiZv ()

which indicates that 'a' is getting initialized to null (possibly by process startup 0ing out the stack), and then x is being read out of it. You can get exactly the same crashes in C++ by reading member variables out of null pointers. The D compiler is supposed to catch the uninitialized variable ("It is an error to use a local variable without first assigning it a value." in http://www.digitalmars.com/d/2.0/function.html), but clearly it's missing this one.

I haven't actually found where in the language spec it says that class variables are pointers, or what their default values are. I'd expect to find this in http://www.digitalmars.com/d/2.0/type.html, but no luck.

Looking through the bug tracker ... Walter's response to http://d.puremagic.com/issues/show_bug.cgi?id=671 seems to indicate that he isn't serious about uninitialized use being an error. It's just undefined behavior like in C++.

In any case, the fix for your problem will be to initialize 'a' before using it.

On Sun, Aug 1, 2010 at 9:59 PM, Ryan W Sims wrote:
> The following code fails with a "Bus error" (OSX speak for "Segfault," if I understand correctly).
>
> // types.d
> import std.stdio;
>
> class A {
>     int x = 42;
> }
>
> void fail_sometimes(int n) {
>     A a;
>     if (n == 0) {
>         a = new A;  // clearly a contrived example
>     }
>     assert(a.x == 42, "Wrong x value");
> }
>
> void main() {
>     fail_sometimes(1);
> }
>
> It's even worse if I do a 'dmd -run types.d', it just fails without even the minimalistic "Bus error." Is this correct behavior? I searched the archives & looked at the FAQ & found workarounds (registering a signal handler), but not a justification, and the threads were from a couple years ago. Wondering if maybe something has changed and there's a problem with my system?
>
> --
> rwsims
null dereference exception vs. segfault?
The following code fails with a "Bus error" (OSX speak for "Segfault," if I understand correctly).

// types.d
import std.stdio;

class A {
    int x = 42;
}

void fail_sometimes(int n) {
    A a;
    if (n == 0) {
        a = new A;  // clearly a contrived example
    }
    assert(a.x == 42, "Wrong x value");
}

void main() {
    fail_sometimes(1);
}

It's even worse if I do a 'dmd -run types.d', it just fails without even the minimalistic "Bus error." Is this correct behavior? I searched the archives & looked at the FAQ & found workarounds (registering a signal handler), but not a justification, and the threads were from a couple years ago. Wondering if maybe something has changed and there's a problem with my system?

--
rwsims