On Fri, May 10, 2013 at 03:02:12PM -0700, Walter Bright wrote:
> On 5/10/2013 2:31 PM, H. S. Teoh wrote:
> >Note how much boilerplate is necessary to make the code work
> >*correctly*.
>
> It's worse than that. Experience shows that this rat's nest style of
> code often is incorrect because it is both complex and never tested.

Yeah, some of those if's are very difficult to trigger, and usually
nested so deep in the call tree that most people just don't bother
trying to trigger them. Besides, the lack of built-in unittests in C
means that even if somebody *did* test it at one point, it's very
unlikely that the 15 people who came along later and modified the code
will repeat the same test. And even if they did, it was probably not a
*thorough* test...

Once I was trying to track down a baffling bug that caused a daemon to
suddenly stop responding for no discernible reason. We spent many hours
trying to figure out what went wrong, but didn't get very far. The first
clue we found was that kill -11 didn't do anything. Now, we have a
segfault handler that writes the stacktrace to a log when the daemon
segfaults, you see, and when debugging we often deliberately use kill
-11 to segfault the daemon, then look at the log to find out what it was
doing at the time of the signal. This usually worked, but not this time.
The signal seemed to be completely ignored. Only kill -9 was capable of
making the stuck process go away.

At first we thought it was a stray call to signal() or sigaction() that
removed the stack trace handler, but closer inspection suggested that
this was not the case.

It turns out that this mysterious "stuck" state was caused by the stack
trace code -- but not in any of the usual ways. In order to produce the
trace, it uses fprintf to write info to the log, and fprintf in turn
calls malloc at various points to allocate the buffers it needs. Now, if
for some reason free() segfaults (e.g., you pass in an illegal pointer),
then libc is still holding the internal malloc mutex when the OS sends
the SEGV to the process, so when the stack trace handler then calls
fprintf, which in turn calls malloc, it deadlocks. Further SIGSEGVs
won't help, since they only make the deadlock worse.
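
To make the failure mode concrete, here is a minimal sketch of a SIGSEGV
handler that stays clear of malloc and stdio by calling only functions
POSIX lists as async-signal-safe (write, signal, raise). It is not our
actual handler: the names crash_log_fd, format_hex, write_all and
install_crash_handler are made up for illustration, the descriptor is
assumed to have been opened before anything can crash, and it logs only
the fault address rather than a full stack trace.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical: a descriptor opened *before* any crash can happen
   (a log file, or a socket to a logging daemon). */
static volatile sig_atomic_t crash_log_fd = STDERR_FILENO;

/* Format value as "0x..." into buf without touching stdio or malloc.
   Returns the number of characters written (excluding the NUL),
   or 0 if buf is too small. */
size_t format_hex(uintptr_t value, char *buf, size_t buflen)
{
    static const char digits[] = "0123456789abcdef";
    char tmp[2 * sizeof value];   /* enough for every hex digit */
    size_t n = 0, len = 0;

    do {                          /* digits come out least significant first */
        tmp[n++] = digits[value & 0xf];
        value >>= 4;
    } while (value != 0);

    if (buflen < n + 3)           /* "0x" + digits + NUL */
        return 0;
    buf[len++] = '0';
    buf[len++] = 'x';
    while (n > 0)
        buf[len++] = tmp[--n];
    buf[len] = '\0';
    return len;
}

/* write() may be interrupted or write only part of the data; retry. */
static void write_all(int fd, const char *data, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, data, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return;               /* nothing more we can safely do here */
        }
        data += n;
        len -= (size_t)n;
    }
}

/* The handler itself: no fprintf, no malloc, no locks. */
static void crash_handler(int sig, siginfo_t *info, void *context)
{
    static const char msg[] = "fatal signal, fault address ";
    char addr[2 * sizeof(uintptr_t) + 3];
    int fd = crash_log_fd;
    size_t len;

    (void)context;
    write_all(fd, msg, sizeof msg - 1);
    len = format_hex((uintptr_t)info->si_addr, addr, sizeof addr);
    write_all(fd, addr, len);
    write_all(fd, "\n", 1);

    /* Restore the default action and re-raise, so the process still
       dies with the original signal. */
    signal(sig, SIG_DFL);
    raise(sig);
}

/* Install the handler for SIGSEGV; returns 0 on success, -1 on error. */
int install_crash_handler(int log_fd)
{
    struct sigaction sa;

    crash_log_fd = log_fd;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = crash_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}
```

A program would call install_crash_handler(fd) once at startup, after
opening fd. Even the hex formatting is done by hand, because the
snprintf family is not on the async-signal-safe list either.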

All of this came about because we had overlooked the part of the POSIX
spec that says certain functions are unsafe to call in signal-handler
context. But then again... who hasn't?! (Hands up, those of you who knew
that fprintf has undefined behaviour inside a signal handler. Yeah, I
thought so.)

Eventually we had to rewrite the stack trace handler to only use write()
to a pre-opened socket to a logging daemon, since otherwise it was
impossible to actually write the stack trace anywhere without risking
undefined behaviour.

And none of this has even begun to address the original bug of why
free() was passed an illegal pointer in the first place. Isn't it fun
when most of the time you spend debugging actually goes into fixing the
error handler rather than the actual bug?

> While D doesn't make it more testable, at least it makes it simple,
> and hence more likely to be correct.

It makes a big difference when the language itself supports constructs
like exceptions or scope guards. Scope guards cut away almost all of the
boilerplate cruft in the equivalent C if-and-goto construct, making the
attached statement so simple that it's most likely correct, as you said.
They also eliminate the need to sprinkle various parts of that code
across 2 or 3 different places in an overly long function with an
unclear execution path, which in C is almost guaranteed to become buggy
after passing through the grubby hands of the next 5 unfortunate coders
assigned to work on the code.

And while the scope guard itself may be buggy (DMD bug, say), it does
get tested very often -- every D program that uses it constitutes a test
case -- so any such bugs are quickly noticed and weeded out.

Seriously, D has so spoiled me I can't stand programming in another
language these days. :-P


T

--
EMACS = Extremely Massive And Cumbersome System