Re: assert semantic change proposal
On Monday, 4 August 2014 at 03:22:51 UTC, Andrei Alexandrescu wrote:
> On 8/3/14, 6:59 PM, David Bregman wrote:
>> w.r.t. the one question about performance justification: I'm not necessarily asking for research papers and measurements, but based on these threads I'm not aware that there is any justification at all. For all I know this is all based on a wild guess that it will help performance "a lot", like someone who optimizes without profiling first. That certainly isn't enough to justify code breakage and massive UB injection, is it? I hope we can agree on that much at least!
>
> I think at this point (without more data) a bit of trust in one's experience would be needed. I've worked on performance on and off for years, and so has Walter. We have plenty of war stories that inform our expertise in the matter, including weird stuff like "swap these two enum values and you'll get a massive performance regression although the code is correct either way". I draw from numerous concrete cases that the right/wrong optimization at the right/wrong place may well be the difference between winning and losing. Consider the recent PHP engine that gets within 20% of hhvm; heck, I know where to go to make hhvm 20% slower with 50 lines of code (compare at 2M+). Conversely, gaining those 20% took months multiplied by Facebook's best engineers. Efficiency is hard to come by and easy to waste.
>
> I consider Walter's take on "assert" a modern, refreshing take on an old pattern that nicely preserves its spirit, and a good opportunity and differential advantage for D. If anything, these long threads have strengthened that belief. They have also clarified to me that:
>
> (a) We must make sure we don't transform @safe code into unsafe code; in the first approximation that may simply mean assert() has no special meaning in release mode. Also, bounds checking would probably need to not be elided by assert. I consider these challenging, but in good, gainful ways.
>
> (b) Deployment of optimizations must be carefully staggered and documented.
>
> Andrei

First of all, thank you for the reply. I agree with nearly everything you say. I also have significant experience with code optimization, and I greatly enjoyed the talk you gave on C++ optimization, partly because it validated what I've spent so much of my own efforts doing.

I think we reach different conclusions from our experience, though. My feeling is that typical asserts are unlikely to contain much information that can give a speedup. This is not to say that the compiler can't be helped by extra information; on the contrary, I wholeheartedly believe it can. However, I would guess this will usually require the asserts to be specifically written for that purpose, using inside knowledge about the kinds of information the optimizer is capable of using. In the end there is no substitute for measurement, so if we rely on experience we're both just guessing. Is it really justified to say that we're going to break stuff on a hunch that it'll help performance? Considering the downsides to reusing existing asserts, what if you're wrong about performance? If new, specialized asserts need to be written anyway, we might as well use a new keyword and avoid all the downsides, essentially giving the best of both worlds.

Also, I'm still curious about how you are evaluating the performance tradeoff in the first place, or do you even see it as a tradeoff? Is your estimation of the downside so small that any performance increase at all is sufficient to justify semantic change, UB injection and code breakage? If so, then I see why you treat it as a foregone conclusion; certainly, in a large enough codebase there will be some asserts here and there that allow you to shave off some instructions.
Re: And in the disruptive technologies section…
On Sunday, 3 August 2014 at 10:01:42 UTC, Russel Winder via Digitalmars-d wrote:
> The numba package (and llvmpy below it) is rapidly getting to production use stage; this means all those people using Python for data analysis (and there are a lot of them) will no longer be searching for Cython/C/C++/D to speed up their codes, they'll just @autojit their Python code to generate LLVM-based native code for the performance-critical sections. This will mean this arena of programming will only use C/C++/Fortran for ready-made libraries and not for anything new. I suspect this will make PyD and similar more or less redundant. On the up side, it is further emphasizing that LLVM is the short- and medium-term future of native code generation, and it is good that there is LDC, and that it is (almost) up to date with D versions.

Good data layout and avoiding reliance on AAs are necessary to be really fast. Python won't perform as fast as C/C++/D.
Re: assert semantic change proposal
On Monday, 4 August 2014 at 02:56:35 UTC, David Bregman wrote:
> On Monday, 4 August 2014 at 02:40:49 UTC, deadalnix wrote:
>> I allow myself to chime in. I don't have much time to follow the whole thing, but I have had this in my mind for quite a while. First things first: the proposed behavior is what I had in mind for SDC since pretty much day 1. It already uses hints to tell the optimizer the branch won't be taken, but I definitely want to go further.
>
> Not everyone had that definition in mind when writing their asserts.
>
>> By definition, when an assert that would have failed in debug has been removed in release, you are in undefined-behavior land already. So there is no reason not to optimize.
>
> By the new definition, yes. But is it reasonable to change the definition, and then retroactively declare previous code broken? Maybe the ends justify the means in this case, but it certainly isn't obvious that they do. I don't understand why breaking code is sacrilege one time, and the next time can be done without any justification.

The fact that the compiler can optimize based on assert is not new in the D world. Maybe it wasn't advertised properly, but it always was an option. If one wants to make sure a check is done, one can use expect.
Re: scope guards
On 4 August 2014 13:44, Dicebot via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
> On Monday, 4 August 2014 at 03:15:32 UTC, Manu via Digitalmars-d wrote:
>> Well, then they're not particularly useful in practice. I'm finding that I can rarely blanket an operation across all exceptions. The nature of exceptions is that they are of a particular type, so why have no access to that concept when trying to respond to them...
>
> You may have a coding style particularly alien to scope guards :) Those are very convenient to use as a simple and generic alternative to RAII, especially when interfacing with C libraries.

Sure, scope() may be useful for this, but it seems in my experience that destructors almost always perform this without any additional code at the callsite. C libraries are exactly the only case where I've managed to use scope() successfully. This seems like a waste. A key talking point of the language finds use almost exclusively in conjunction with another language... :/

> I find that most often one wants to catch the majority of exceptions only in somewhat high-level parts of code (i.e. the main loop) and the rest is just cleanup code - a perfect fit for scope guards.

I can't think of many instances where I would want to catch an exception at the main loop unless it was completely unexpected, like out of memory or something. Almost all exceptions I throw are in relation to bad input data, and they are to be caught at a slightly higher level of input processing. My code has become try/catch-tastic, and I really don't like looking at it. It rather sickens me and reminds me of Java, and I'm strongly tempted to just abandon my experiment and return to C-style error handling with sentinel values.

So... why not make scope guards more useful? It wouldn't be hard. scope(failure, MyException e) is completely non-destructive, and adds significant power to the concept.

> scope(success) is probably one I don't see a use case for though

Yeah, I can't imagine a use for it either.
Re: scope guards
On Monday, 4 August 2014 at 03:15:32 UTC, Manu via Digitalmars-d wrote:
> Well, then they're not particularly useful in practice. I'm finding that I can rarely blanket an operation across all exceptions. The nature of exceptions is that they are of a particular type, so why have no access to that concept when trying to respond to them...

You may have a coding style particularly alien to scope guards :) Those are very convenient to use as a simple and generic alternative to RAII, especially when interfacing with C libraries. I find that most often one wants to catch the majority of exceptions only in somewhat high-level parts of code (i.e. the main loop) and the rest is just cleanup code - a perfect fit for scope guards. scope(success) is probably one I don't see a use case for, though.
Re: assert semantic change proposal
On 8/3/14, 8:22 PM, Andrei Alexandrescu wrote:
> (a) We must make sure we don't transform @safe code into unsafe code; in the first approximation that may simply mean assert() has no special meaning in release mode.

... in @safe code! -- Andrei
Re: assert semantic change proposal
On 8/3/14, 6:59 PM, David Bregman wrote:
> w.r.t. the one question about performance justification: I'm not necessarily asking for research papers and measurements, but based on these threads I'm not aware that there is any justification at all. For all I know this is all based on a wild guess that it will help performance "a lot", like someone who optimizes without profiling first. That certainly isn't enough to justify code breakage and massive UB injection, is it? I hope we can agree on that much at least!

I think at this point (without more data) a bit of trust in one's experience would be needed. I've worked on performance on and off for years, and so has Walter. We have plenty of war stories that inform our expertise in the matter, including weird stuff like "swap these two enum values and you'll get a massive performance regression although the code is correct either way". I draw from numerous concrete cases that the right/wrong optimization at the right/wrong place may well be the difference between winning and losing. Consider the recent PHP engine that gets within 20% of hhvm; heck, I know where to go to make hhvm 20% slower with 50 lines of code (compare at 2M+). Conversely, gaining those 20% took months multiplied by Facebook's best engineers. Efficiency is hard to come by and easy to waste.

I consider Walter's take on "assert" a modern, refreshing take on an old pattern that nicely preserves its spirit, and a good opportunity and differential advantage for D. If anything, these long threads have strengthened that belief. They have also clarified to me that:

(a) We must make sure we don't transform @safe code into unsafe code; in the first approximation that may simply mean assert() has no special meaning in release mode. Also, bounds checking would probably need to not be elided by assert. I consider these challenging, but in good, gainful ways.

(b) Deployment of optimizations must be carefully staggered and documented.

Andrei
Re: scope guards
On 4 August 2014 12:04, Mike Parker via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
> On 8/4/2014 12:28 AM, Manu via Digitalmars-d wrote:
>> I'm trying to make better use of scope guards, but I find myself belting out try/catch statements almost everywhere. I'm rather disappointed, because scope guards are advertised to offer the promise of eliminating try/catch junk throughout your code, and I'm just not finding that to be the practical reality.
>>
>> I think the core of the problem is that scope(failure) is indiscriminate, but I want to filter it for particular exceptions. The other issue is that you still need a catch() if you don't actually want the program to terminate, which implies a try... :/
>
> Scope guards are for when you don't need to handle exceptions. If you need the exceptions, use try...catch. I don't think it would be a good idea to have two different means of handling exceptions.

Well, then they're not particularly useful in practice. I'm finding that I can rarely blanket an operation across all exceptions. The nature of exceptions is that they are of a particular type, so why have no access to that concept when trying to respond to them...
Re: assert semantic change proposal
On Monday, 4 August 2014 at 02:40:49 UTC, deadalnix wrote:
> I allow myself to chime in. I don't have much time to follow the whole thing, but I have had this in my mind for quite a while. First things first: the proposed behavior is what I had in mind for SDC since pretty much day 1. It already uses hints to tell the optimizer the branch won't be taken, but I definitely want to go further.

Not everyone had that definition in mind when writing their asserts.

> By definition, when an assert that would have failed in debug has been removed in release, you are in undefined-behavior land already. So there is no reason not to optimize.

By the new definition, yes. But is it reasonable to change the definition, and then retroactively declare previous code broken? Maybe the ends justify the means in this case, but it certainly isn't obvious that they do. I don't understand why breaking code is sacrilege one time, and the next time can be done without any justification.
Re: assert semantic change proposal
On Monday, 4 August 2014 at 02:31:36 UTC, John Carter wrote:
> On Monday, 4 August 2014 at 02:18:12 UTC, David Bregman wrote:
>> His post basically says that his real-life experience leads him to believe that a static analyzer based on using information from asserts will very likely generate a ton of warnings/errors, because real-life code is imperfect. In other words, if you use that information to optimize instead, you are going to get a ton of bugs, because the asserts are inconsistent with the code.
>
> No. My experience says deeper optimization comes from deeper understanding of the dataflow, and with deeper understanding of the dataflow comes stricter warnings about defective usage.

Yes, but that isn't what is being proposed. This is about optimization, not warnings or errors.

> ie. Good compiler writers, as Walter and the gcc guys clearly are, don't just slap in an optimization pass out of nowhere. They are all too painfully aware that if their optimization pass breaks anything, they will be fending off thousands of complaints that "Optimization X broke".

If you read the earlier threads, you will see Walter freely admits this will break code. Actually, he says that such code is already broken. This doesn't involve new warnings; it will just break silently. It would be very difficult to do otherwise (see Daniel Gibson's reply to your post).
Re: assert semantic change proposal
On Monday, 4 August 2014 at 00:34:30 UTC, Andrei Alexandrescu wrote:
> On 8/3/14, 4:51 PM, Mike Farnsworth wrote:
>> This all seems to have a very simple solution, to use something like: expect(). GCC for example has an intrinsic, __builtin_expect(), that is used to notify the compiler of a data constraint it can use in optimization for branches. Why not make something like this a first-class citizen in D (and even expand the concept to more than just branch prediction)?
>
> __builtin_expect is actually not that. It still generates code when the expression is false. It simply uses the static assumption to minimize jumps and maximize straight execution for the true case. -- Andrei

Yes, that's why I pointed out expanding it to actually throw an exception when the expectation isn't met. I guess that's really more like the assume() that has been mentioned?

At EA we used two versions of an assertion: assert(), which compiled out in non-debug builds, etc.; and verify(), which was kept in non-debug builds but just boiled back down to the condition. The latter was for when we relied on the side effects of the logic (used the condition in a real runtime branch), but really we wanted to know if it ever took the else, so to speak, as that was an error we never wanted to ship with.

FWIW, at my current job, in C++ we use an assert() that compiles out for final builds (very performance-driven code, so even the conditions tested have to scat), and we also have likely() and unlikely() macros that take advantage of __builtin_expect(). There are only a few places where we do both, where the assertion may be violated and we still want to recover nicely from it, but still don't want the performance suck of the else-case code polluting the instruction cache.
Re: assert semantic change proposal
On Monday, 4 August 2014 at 02:31:36 UTC, John Carter wrote:
> But since an optimization has to be based on additional hard information, they have, with every new version of gcc, used that information both for warnings and optimization.

Hmm. Not sure I made that clear.

ie. Yes, it is possible that a defect may be injected by an optimization that assumes an assert is true when it isn't. However, experience suggests that many (maybe two full orders of magnitude) more defects will be flagged. ie. In terms of defect reduction it's a big win rather than a loss.

The tragedy of C optimization and static analysis is that the language is so loosely defined in terms of how it is used that the compiler has very little to go on. This proposal looks to me to be a Big Win, because it gifts the compiler (and any analysis tools) with a huge amount of eminently usable information.
Re: assert semantic change proposal
On Sunday, 3 August 2014 at 21:57:08 UTC, John Carter wrote:
> On Sunday, 3 August 2014 at 19:47:27 UTC, David Bregman wrote:
>> Walter has proposed a change to D's assert function as follows [1]: "The compiler can make use of assert expressions to improve optimization, even in -release mode."
>
> Hmm. I really really do like that idea. I suspect it is one of those ideas of Walter's that has consequences that reach further than anyone foresees. But that's OK, because it is fundamentally the correct course of action; its implications, foreseen and unforeseen, will be correct. One "near term" implication is to permit deeper static checking of the code.

I allow myself to chime in. I don't have much time to follow the whole thing, but I have had this in my mind for quite a while. First things first: the proposed behavior is what I had in mind for SDC since pretty much day 1. It already uses hints to tell the optimizer the branch won't be taken, but I definitely want to go further. By definition, when an assert that would have failed in debug has been removed in release, you are in undefined-behavior land already. So there is no reason not to optimize.
Re: assert semantic change proposal
On Monday, 4 August 2014 at 02:18:12 UTC, David Bregman wrote:
> His post basically says that his real-life experience leads him to believe that a static analyzer based on using information from asserts will very likely generate a ton of warnings/errors, because real-life code is imperfect. In other words, if you use that information to optimize instead, you are going to get a ton of bugs, because the asserts are inconsistent with the code.

No. My experience says deeper optimization comes from deeper understanding of the dataflow, and with deeper understanding of the dataflow comes stricter warnings about defective usage.

ie. Good compiler writers, as Walter and the gcc guys clearly are, don't just slap in an optimization pass out of nowhere. They are all too painfully aware that if their optimization pass breaks anything, they will be fending off thousands of complaints that "Optimization X broke". Compiler users always blame the optimizer long before they blame their crappy code. Watching the gcc mailing list over the years, those guys bend over backwards to prevent that happening. But since an optimization has to be based on additional hard information, they have, with every new version of gcc, used that information both for warnings and optimization.
Re: assert semantic change proposal
On Monday, 4 August 2014 at 01:26:10 UTC, Daniel Gibson wrote:
> Am 04.08.2014 03:17, schrieb John Carter:
>> But that's OK. Because I bet 99.999% of those warnings will be pointing straight at bona fide defects.
>
> Well, that would make the problem more acceptable. However, it has been argued that it's very hard to warn about code that will be eliminated, because that code often only becomes dead or redundant due to inlining, template instantiation, mixin, ... and you can't warn in those cases. So I doubt that the compiler will warn every time it removes checks that are considered superfluous because of a preceding assert(). Cheers, Daniel

It is possible, just not as a default-enabled warning. Some compilers offer optimization diagnostics which can be enabled by a switch. I'm quite fond of those, as it's a much faster way to go through a list of compiler-highlighted failed/successful optimizations than being forced to check the asm output after every new compiler version or minor code refactoring. In my experience it actually works fine in huge projects; even if there are false positives you can analyse what changed from the previous version, as well as ignoring modules which you know are not performance critical.
Re: assert semantic change proposal
On Monday, 4 August 2014 at 01:19:28 UTC, Andrei Alexandrescu wrote:
> On 8/3/14, 6:17 PM, John Carter wrote:
>> Well, I'm the dogsbody who has the job of upgrading the toolchain and handling the fallout of doing so. So I have been walking multimegaline code bases through every gcc version in the last 15 years.
>
> Truth. This man speaks it. Great post, thanks! Andrei

His post basically says that his real-life experience leads him to believe that a static analyzer based on using information from asserts will very likely generate a ton of warnings/errors, because real-life code is imperfect. In other words, if you use that information to optimize instead, you are going to get a ton of bugs, because the asserts are inconsistent with the code. So his post completely supports the conclusion that you've disagreed with - unless this has convinced you and you're switching sides now (could it be?) :)
Re: scope guards
On 8/4/2014 12:28 AM, Manu via Digitalmars-d wrote:
> I'm trying to make better use of scope guards, but I find myself belting out try/catch statements almost everywhere. I'm rather disappointed, because scope guards are advertised to offer the promise of eliminating try/catch junk throughout your code, and I'm just not finding that to be the practical reality.
>
> I think the core of the problem is that scope(failure) is indiscriminate, but I want to filter it for particular exceptions. The other issue is that you still need a catch() if you don't actually want the program to terminate, which implies a try... :/

Scope guards are for when you don't need to handle exceptions. If you need the exceptions, use try...catch. I don't think it would be a good idea to have two different means of handling exceptions.
Re: assert semantic change proposal
On Monday, 4 August 2014 at 01:17:36 UTC, Andrei Alexandrescu wrote:
> On 8/3/14, 5:57 PM, David Bregman wrote:
>> On Monday, 4 August 2014 at 00:24:19 UTC, Andrei Alexandrescu wrote:
>>> On 8/3/14, 3:26 PM, David Bregman wrote:
>>>> On Sunday, 3 August 2014 at 22:15:52 UTC, Andrei Alexandrescu wrote:
>>>>> One related point that has been discussed only a little is the competitive aspect of it all. Generating fast code is of paramount importance for D's survival and thriving in the market. Competition in language design and implementation is acerbic and only getting more cutthroat. In the foreseeable future efficiency will become more important at scale, seeing as data is growing and frequency scaling has stalled.
>>>>
>>>> Would you care to address the questions about performance raised in the OP?
>>>
>>> I thought I just did.
>>
>> You made some generic statements about performance being good. This is obvious and undisputed. You did not answer any concerns raised in the OP. I am left to wonder if you even read it.
>
> I did read it. Forgive me, but I don't have much new to answer to it. It seems you consider the lack of a long answer accompanied by research and measurements offensive, and you also find my previous answers arrogant. This, to continue what I was mentioning in another post, is the kind of stuff I find difficult to answer meaningfully.

Well, I don't want this to devolve to the ad hominem level. I never used the word offensive, by the way, though I will admit to being temporarily offended by your description of my carefully constructed post as a self-important rehash :)

Basically, I didn't find your reply useful because, as I said, you were simply stating a generality about performance (which I agree with), and not addressing any concerns at all. If you don't have time to address this stuff right now, I completely understand - you are an important and busy person. But please don't give a generality or dodge the question, and then pretend the issue is addressed. This is what I call arrogant, and it is worse than no reply at all.

w.r.t. the one question about performance justification: I'm not necessarily asking for research papers and measurements, but based on these threads I'm not aware that there is any justification at all. For all I know this is all based on a wild guess that it will help performance "a lot", like someone who optimizes without profiling first. That certainly isn't enough to justify code breakage and massive UB injection, is it? I hope we can agree on that much at least!
Re: assert semantic change proposal
Am 04.08.2014 03:17, schrieb John Carter:
> As you get...
>
> * more and more optimization passes that rely on asserts,
> * in particular pre- and post-condition asserts within the standard libraries,
>
> ...you are going to have flocks of user code that used to compile without warning, and ran without any known defect, suddenly spewing error messages and warnings. But that's OK. Because I bet 99.999% of those warnings will be pointing straight at bona fide defects.

Well, that would make the problem more acceptable. However, it has been argued that it's very hard to warn about code that will be eliminated, because that code often only becomes dead or redundant due to inlining, template instantiation, mixin, ... and you can't warn in those cases. So I doubt that the compiler will warn every time it removes checks that are considered superfluous because of a preceding assert().

Cheers, Daniel
Re: assert semantic change proposal
On 8/3/14, 5:57 PM, David Bregman wrote:
> On Monday, 4 August 2014 at 00:24:19 UTC, Andrei Alexandrescu wrote:
>> On 8/3/14, 3:26 PM, David Bregman wrote:
>>> On Sunday, 3 August 2014 at 22:15:52 UTC, Andrei Alexandrescu wrote:
>>>> One related point that has been discussed only a little is the competitive aspect of it all. Generating fast code is of paramount importance for D's survival and thriving in the market. Competition in language design and implementation is acerbic and only getting more cutthroat. In the foreseeable future efficiency will become more important at scale, seeing as data is growing and frequency scaling has stalled.
>>>
>>> Would you care to address the questions about performance raised in the OP?
>>
>> I thought I just did.
>
> You made some generic statements about performance being good. This is obvious and undisputed. You did not answer any concerns raised in the OP. I am left to wonder if you even read it.

I did read it. Forgive me, but I don't have much new to answer to it. It seems you consider the lack of a long answer accompanied by research and measurements offensive, and you also find my previous answers arrogant. This, to continue what I was mentioning in another post, is the kind of stuff I find difficult to answer meaningfully.

Andrei
Re: assert semantic change proposal
On Sunday, 3 August 2014 at 19:47:27 UTC, David Bregman wrote:
> 2. Semantic change. The proposal changes the meaning of assert(), which will result in breaking existing code. Regardless of philosophizing about whether or not the code was "already broken" according to some definition of assert, the fact is that shipping programs that worked perfectly well before may no longer work after this change.

Subject to the caveat suggesting having two asserts with different names and different meanings, I am in a position to comment on this one from experience. So assume we do have a "hard assert" that is used within the standard libraries and a "soft assert" in user code (unless they explicitly choose to use the "hard assert"). What happens?

Well, I'm the dogsbody who has the job of upgrading the toolchain and handling the fallout of doing so. So I have been walking multimegaline code bases through every gcc version in the last 15 years. This is relevant because with every new version they have added stricter warnings and, more importantly, deeper optimizations. It's especially the deeper optimizations that are interesting here. They often come from better dataflow analysis, which results in more "insightful" warnings.

So, given that I'm taking megalines of C/C++ code from a warnings-free state on gcc version N to warnings-free on version N+1, I'll make some empirical observations.

* They have _always_ highlighted dodgy / non-portable / non-standard-compliant code.
* They have quite often highlighted existing defects in the code.
* They have quite often highlighted error-handling code as "unreachable", because it is... and the only sane thing to do is delete it.
* They have often highlighted the error-handling code of "defensive programmers" as opposed to DbC programmers. Why? Because around 30% of the code of a defensive programmer is error-handling crud that has never been executed, not even in development, and hence is untested and unmaintained.

The clean-up effort was often fairly largish, maybe a week or two, but always resulted in better code. Customer-impacting defects introduced by the new optimizations have been:

a) Very, very rare.
b) Invariably from really bad code that was blatantly defective, non-standard-compliant and non-portable.

So what do I expect from Walter's proposed change? Another guy in this thread complained about the compiler suddenly relying on thousands of global axioms from the core and standard libraries. Yup. Exactly what is going to happen. As you get...

* more and more optimization passes that rely on asserts,
* in particular pre- and post-condition asserts within the standard libraries,

...you are going to have flocks of user code that used to compile without warning, and ran without any known defect, suddenly spewing error messages and warnings. But that's OK. Because I bet 99.999% of those warnings will be pointing straight at bona fide defects. And yes, this will be a regular feature of life: new version of compiler, new optimization passes, new warnings... That's OK - clean 'em up, and a bunch of latent defects won't come back as customer complaints.
Re: assert semantic change proposal
On 8/3/14, 6:17 PM, John Carter wrote:
> Well, I'm the dogsbody who has the job of upgrading the toolchain and handling the fallout of doing so. So I have been walking multimegaline code bases through every gcc version in the last 15 years.

Truth. This man speaks it. Great post, thanks!

Andrei
Re: assert semantic change proposal
Am 04.08.2014 03:02, schrieb Andrei Alexandrescu:
> On 8/3/14, 5:40 PM, Daniel Gibson wrote:
>> Ok, so you agree that there's a downside and code (that you consider incorrect, but that most probably exists and works ok so far) will *silently* break (as in: the compiler won't tell you about it).
>
> Yes, I agree there's a downside. I missed the part where you agreed there's an upside :o).

I see a small upside in the concept of "syntax that tells the compiler it can take something for granted for optimization and that causes an error in debug mode". For me this kind of optimization is similar to GCC's __builtin_expect() to aid branch prediction: probably useful to get even more performance, but I guess I wouldn't use it in everyday code. However, I see no upside in redefining an existing keyword (that had a different meaning, or at least behavior, before and in most programming languages) to achieve this. /Maybe/ an attribute for assert() would also be ok, so we don't need a new keyword: @optimize assert(x); or @hint, @promise, @always, @for_real, whatever.

Cheers, Daniel
Re: assert semantic change proposal
On 8/3/14, 5:40 PM, Daniel Gibson wrote: Ok, so you agree that there's a downside and code (that you consider incorrect, but that most probably exists and works ok so far) will *silently* break (as in: the compiler won't tell you about it). Yes, I agree there's a downside. I missed the part where you agreed there's an upside :o). So when should this change be introduced? In 2.x or in D3? More aggressive optimizations should be introduced gradually in future releases of the D compilers. I think your perception of the downside is greater, and that of the upside is lesser, than mine. I don't really like the idea of introducing a silently breaking change in a minor version - and it destroys trust in future decisions for D. I understand. At some point there are judgment calls to be made that aren't going to please everybody. Andrei
Re: assume, assert, enforce, @safe
On 8/3/14, 11:55 AM, Kagamin wrote: On Saturday, 2 August 2014 at 17:36:46 UTC, David Bregman wrote: OK, I'm done. It's clear now that you're just being intellectually dishonest in order to "win" what amounts to a trivial argument. So much for professionalism. Haha, this time it's not as bad as it was in the catch syntax discussion. I even thought then they are blackmailed or something like that. It's really only this kind of stuff that has Walter and myself worried. We understand spirits can get heated during a debate, but the problem with such comebacks that really pull no punches is they instantly degrade the level of conversation, invite answers in kind, and are very difficult to respond to in meaningful ways. From what I can tell after many years of having at this, there's a sort of heat death of debate in which questions are asked in a definitive, magisterial manner (bearing an odd implied binding social contract), and any response except the desired one is instantly dismissed as simply stupid, intellectually dishonest, or, as it were, coming under duress. I can totally relate to people who hold a conviction strong enough to have difficulty acknowledging a contrary belief as even remotely reasonable, as I've fallen for that many times and I certainly will in the future. Improving awareness of it only improves the standing of debate for everyone involved. For my money, consider Walter's response: What I see is Microsoft attempting to bring D's assert semantics into C++. :) As I've mentioned before, there is inexorable pressure for this to happen, and it will happen. I find it to the point, clear, and funny. Expanded, it would go like "I see more similarities than differences, and a definite convergence dictated by market pressure." I find it highly inappropriate to qualify that response as intellectually dishonest, even after discounting for a variety of factors, and an apology would be in order. Andrei
Re: assert semantic change proposal
On Monday, 4 August 2014 at 00:24:19 UTC, Andrei Alexandrescu wrote: On 8/3/14, 3:26 PM, David Bregman wrote: On Sunday, 3 August 2014 at 22:15:52 UTC, Andrei Alexandrescu wrote: One related point that has been discussed only a little is the competitive aspect of it all. Generating fast code is of paramount importance for D's survival and thriving in the market. Competition in language design and implementation is acerbic and only getting more cutthroat. In the foreseeable future efficiency will become more important at scale seeing as data is growing and frequency scaling has stalled. Would you care to address the questions about performance raised in the OP? I thought I just did. You made some generic statements about performance being good. This is obvious and undisputed. You did not answer any concerns raised in the OP. I am left to wonder if you even read it. Availing ourselves of a built-in "assert" that has a meaning and informativeness unachievable to e.g. a C/C++ macro is a very important and attractive competitive advantage compared to these and other languages. Not really, you can redefine the C macro to behave exactly as proposed, using compiler specific commands to invoke undefined behavior. Didn't you say in the other thread that you tried exactly that? That might be possible, but one thing I was discussing with Walter (reverse flow analysis) may be more difficult with the traditional definition of assert. Also I'm not sure whether the C and C++ standards require assert to do nothing in NDEBUG builds. Walter has always meant assert the way he discusses it today. Has he (and subsequently he and I) been imprecise in documenting it? Of course, but that just means it's Tuesday. That said, should we proceed carefully about realizing this advantage? Of course; that's a given. But I think it's very important to fully understand the advantages of gaining an edge over the competition. Please comment on the concerns raised by the OP. 
Probably not - there's little motivation to do so. The original post is little else than a self-important rehash of a matter in which everybody has stated their opinion, several times, in an exchange that has long since run its course. Having everyone paste their thoughts once again seems counterproductive. Wow. Don't pretend like the questions are all "asked and answered". The concerns are legitimate, but the responses so far have been mostly arrogant handwaving. The fact that you believe you answered the performance concerns by merely stating "performance is important to make D competitive" is a case in point. There has been no evidence presented that there are any nontrivial performance gains to be had by reusing information from asserts. There has been no case made that the performance gains (if they exist) justify code breakage and other issues. There has been no effort to determine if there are alternate ways to achieve the goals which satisfy all groups. I could go on, and on, but I refer you back to the OP. I really believe this whole thing could be handled much better; it doesn't have to be a zero-sum game between the two sides of this issue. That's why I bothered to write the post, to try to achieve that.
Re: assert semantic change proposal
On 04.08.2014 02:30, Andrei Alexandrescu wrote: On 8/3/14, 4:01 PM, Timon Gehr wrote: On 08/04/2014 12:15 AM, Andrei Alexandrescu wrote: I suspect it is one of those ideas of Walter's that has consequences that reach further than anyone foresees. but that's OK, because it is fundamentally the correct course of action, its implications, foreseen and unforeseen, will be correct. Agreed. No, please hold on. Walter is not a supernatural being. There's something to be said about vision and perspective. Walter has always meant assert the way he discusses it today. This argument has no merit. Please stop bringing it up. Actually it does offer value: for a large fragment of the discussion, Walter has been confused that people have a very different understanding of assert than his. Yes, this kinda helps in understanding Walter's point. But as his point has only been communicated to *you*, not D users in general, you (and Walter) could be more understanding towards them being surprised and confused by this change of assert()'s semantics. Instead you insist that your interpretation of what assert() should *mean* is the ultimate truth, even though assert() doesn't *behave* that way in current D or any popular programming language I know of. BTW, TCPL ("K&R", second edition) defines assert as: "The assert macro is used to add diagnostics to programs: void assert(int expression) If expression is zero when assert(expression) is executed, the assert macro will print on stderr a message, (...) It then calls abort to terminate execution. (...) If NDEBUG is defined at the time <assert.h> is included, the assert macro is ignored." Of course K&R is not the ultimate truth, but it shows that there have been definitions (by clever people!) of assert() that contradict yours for a long time. Cheers, Daniel
Re: assert semantic change proposal
On 04.08.2014 02:28, Andrei Alexandrescu wrote: On 8/3/14, 3:35 PM, Daniel Gibson wrote: On 04.08.2014 00:15, Andrei Alexandrescu wrote: That said, should we proceed carefully about realizing this advantage? Of course; that's a given. But I think it's very important to fully understand the advantages of gaining an edge over the competition. Gaining an edge over the competition? Yes, as I explained. "A new DMD release broke my code in a totally unexpected way and people tell me it's because I'm using assert() wrong." This has been discussed several times, and I agree there's a downside. All I want to do is raise awareness of the upside, which is solid but probably less obvious to some. There's no need to trot out again in response the downside that has been mentioned many times already. Ok, so you agree that there's a downside and code (that you consider incorrect, but that most probably exists and works ok so far) will *silently* break (as in: the compiler won't tell you about it). So when should this change be introduced? In 2.x or in D3? I don't really like the idea of introducing a silently breaking change in a minor version - and it destroys trust in future decisions for D. Cheers, Daniel
Re: assert semantic change proposal
On 8/3/14, 4:24 PM, Martin Krejcirik wrote: On 3.8.2014 21:47, David Bregman wrote: Walter has proposed a change to D's assert function as follows [1]: "The compiler can make use of assert expressions to improve optimization, even in -release mode." Couldn't this new assert behaviour be introduced as a new optimization switch? Say -Oassert? It would be off by default and would work both in debug and release mode. That sounds like a good idea for carefully introducing assert-driven optimizations. -- Andrei
Re: assert semantic change proposal
On 8/3/14, 4:51 PM, Mike Farnsworth wrote: This all seems to have a very simple solution, to use something like: expect() GCC for example has an intrinsic, __builtin_expect() that is used to notify the compiler of a data constraint it can use in optimization for branches. Why not make something like this a first-class citizen in D (and even expand the concept to more than just branch prediction)? __builtin_expect is actually not that. It still generates code when the expression is false. It simply uses the static assumption to minimize jumps and maximize straight execution for the true case. -- Andrei
Re: assert semantic change proposal
On 8/3/14, 4:01 PM, Timon Gehr wrote: On 08/04/2014 12:15 AM, Andrei Alexandrescu wrote: I suspect it is one of those ideas of Walter's that has consequences that reach further than anyone foresees. but that's OK, because it is fundamentally the correct course of action, its implications, foreseen and unforeseen, will be correct. Agreed. No, please hold on. Walter is not a supernatural being. There's something to be said about vision and perspective. Walter has always meant assert the way he discusses it today. This argument has no merit. Please stop bringing it up. Actually it does offer value: for a large fragment of the discussion, Walter has been confused that people have a very different understanding of assert than his. Andrei
Re: assert semantic change proposal
On 8/3/14, 3:35 PM, Daniel Gibson wrote: On 04.08.2014 00:15, Andrei Alexandrescu wrote: That said, should we proceed carefully about realizing this advantage? Of course; that's a given. But I think it's very important to fully understand the advantages of gaining an edge over the competition. Gaining an edge over the competition? Yes, as I explained. "A new DMD release broke my code in a totally unexpected way and people tell me it's because I'm using assert() wrong. This has been discussed several times, and I agree there's a downside. All I want to do is raise awareness of the upside, which is solid but probably less obvious to some. There's no need to trot out again in response the downside that has been mentioned many times already. I've been using it like this in C/C++/Java/Python and D since forever and *now*, after >10 years, they changed D's meaning of assert() to somehow imply assume() and optimize my safety-checks away.. they pretend it was always planned like this but they never got around to tell anyone until recently. It took me a week to find this freaking bug!" Doesn't really sound like the kind of advantage over the competition that many people would appreciate. If some rant like this (from Reddit or whatever) is the first impression someone gets of D, he's unlikely to ever take a closer look, regardless of the many merits the language actually has. From what I remember there has been good reception (both on reddit and at work) of increased aggressiveness of compiler optimizations. (Yes, Johannes Pfau actually brought a similar argument somewhere in the depth of one of the other threads) D could still get this kind of edge over the competition by doing these kinds of optimizations when another keyword (like assume()) is used - without breaking any code. I don't think D will add assume(). Andrei
Re: assert semantic change proposal
On 8/3/14, 3:26 PM, David Bregman wrote: On Sunday, 3 August 2014 at 22:15:52 UTC, Andrei Alexandrescu wrote: One related point that has been discussed only a little is the competitive aspect of it all. Generating fast code is of paramount importance for D's survival and thriving in the market. Competition in language design and implementation is acerbic and only getting more cutthroat. In the foreseeable future efficiency will become more important at scale seeing as data is growing and frequency scaling has stalled. Would you care to address the questions about performance raised in the OP? I thought I just did. Availing ourselves of a built-in "assert" that has a meaning and informativeness unachievable to e.g. a C/C++ macro is a very important and attractive competitive advantage compared to these and other languages. Not really, you can redefine the C macro to behave exactly as proposed, using compiler-specific commands to invoke undefined behavior. Didn't you say in the other thread that you tried exactly that? That might be possible, but one thing I was discussing with Walter (reverse flow analysis) may be more difficult with the traditional definition of assert. Also I'm not sure whether the C and C++ standards require assert to do nothing in NDEBUG builds. Walter has always meant assert the way he discusses it today. Has he (and subsequently he and I) been imprecise in documenting it? Of course, but that just means it's Tuesday. That said, should we proceed carefully about realizing this advantage? Of course; that's a given. But I think it's very important to fully understand the advantages of gaining an edge over the competition. Please comment on the concerns raised by the OP. Probably not - there's little motivation to do so. The original post is little else than a self-important rehash of a matter in which everybody has stated their opinion, several times, in an exchange that has long since run its course.
Having everyone paste their thoughts once again seems counterproductive. Andrei
Re: assert semantic change proposal
Well, this is not just about branch prediction, but about "let the compiler assume the condition is always true and let it eliminate code that handles other cases". I /think/ that in C with GCC assume() could be implemented (for release mode, otherwise it's just like assert()) like

#define assume(cond) if(!(cond)) __builtin_unreachable()

I'm not sure what kind of optimizations GCC does based on "unreachable", though. However, something like expect() /might/ be a useful addition to the language as well. Maybe as an annotation for if()/else? Cheers, Daniel On 04.08.2014 01:51, Mike Farnsworth wrote: This all seems to have a very simple solution, to use something like: expect() GCC for example has an intrinsic, __builtin_expect(), that is used to notify the compiler of a data constraint it can use in optimization for branches. Why not make something like this a first-class citizen in D (and even expand the concept to more than just branch prediction)? That way you don't have to hijack the meaning of assert(), but optimizations can be made based on the condition. __builtin_expect() in gcc usually just figures the expected condition is fulfilled the vast majority of the time, but it could be expanded to make a lack of fulfillment trigger an exception (in non-release mode). And the compiler is always free to optimize with the assumption the expectation is met. On Sunday, 3 August 2014 at 19:47:27 UTC, David Bregman wrote: I am creating this thread because I believe the other ones [1,6] have gotten too bogged down in minutiae and the big picture has gotten lost. ... References: [1]: http://forum.dlang.org/thread/lrbpvj$mih$1...@digitalmars.com [2]: http://dlang.org/overview.html [3]: http://blog.llvm.org/2011/05/what-every-c-programmer-should-know_14.html [4]: http://blog.regehr.org/archives/213 [5]: http://en.wikipedia.org/wiki/Heisenbug [6]: http://forum.dlang.org/thread/jrxrmcmeksxwlyuit...@forum.dlang.org
Re: assert semantic change proposal
On Sunday, 3 August 2014 at 23:51:30 UTC, Mike Farnsworth wrote: This all seems to have a very simple solution, to use something like: expect() GCC for example has an intrinsic, __builtin_expect(), that is used to notify the compiler of a data constraint it can use in optimization for branches. Why not make something like this a first-class citizen in D (and even expand the concept to more than just branch prediction)? That way you don't have to hijack the meaning of assert(), but optimizations can be made based on the condition. __builtin_expect() in gcc usually just figures the expected condition is fulfilled the vast majority of the time, but it could be expanded to make a lack of fulfillment trigger an exception (in non-release mode). And the compiler is always free to optimize with the assumption the expectation is met. Indeed, having a new function instead of hijacking assert would seem to be the obvious solution. That's really interesting about the possibility of conveying probabilistic information to the compiler. Of course, we'd need different functions for constant axioms and probabilistic ones: we could use (for example) assume() for constants and expect() for probabilities.
Re: Guide for dmd development @ Win64?
On Saturday, 2 August 2014 at 20:49:12 UTC, Orvid King wrote: I actually use a shell script which I run from git's bash shell. It updates, builds, and installs DMD, druntime, and phobos. It currently is setup to build a 64-bit DMD with MSVC, and will build and install both the 32 and 64-bit druntime and phobos libraries. The dmd3 directory that I install to is basically a copy of the dmd2 directory created by the installer, except that I've deleted everything except the windows and src folders. https://gist.github.com/Orvid/7b254c307c701318488a Hm, I am using Visual Studio Express 2013, does that make any difference?
Re: assert semantic change proposal
On Sunday, 3 August 2014 at 23:24:08 UTC, Martin Krejcirik wrote: On 3.8.2014 21:47, David Bregman wrote: Walter has proposed a change to D's assert function as follows [1]: "The compiler can make use of assert expressions to improve optimization, even in -release mode." Couldn't this new assert behaviour be introduced as a new optimization switch? Say -Oassert? It would be off by default and would work both in debug and release mode. That would be an improvement over the current proposal in my opinion, but I see some issues. One is the general argument against more compiler switches: complexity, and people will always enable stuff that seems like it might give the fastest code. Another is: how do you mix and match code which is meant to be compiled with or without the switch? I suppose it could also be used to complement a new function that has the proposed behavior regardless of switches. Regardless, it goes to show there exists a design space of possible alternatives to the proposal.
Re: assert semantic change proposal
This all seems to have a very simple solution, to use something like: expect() GCC for example has an intrinsic, __builtin_expect(), that is used to notify the compiler of a data constraint it can use in optimization for branches. Why not make something like this a first-class citizen in D (and even expand the concept to more than just branch prediction)? That way you don't have to hijack the meaning of assert(), but optimizations can be made based on the condition. __builtin_expect() in gcc usually just figures the expected condition is fulfilled the vast majority of the time, but it could be expanded to make a lack of fulfillment trigger an exception (in non-release mode). And the compiler is always free to optimize with the assumption the expectation is met. On Sunday, 3 August 2014 at 19:47:27 UTC, David Bregman wrote: I am creating this thread because I believe the other ones [1,6] have gotten too bogged down in minutiae and the big picture has gotten lost. ... References: [1]: http://forum.dlang.org/thread/lrbpvj$mih$1...@digitalmars.com [2]: http://dlang.org/overview.html [3]: http://blog.llvm.org/2011/05/what-every-c-programmer-should-know_14.html [4]: http://blog.regehr.org/archives/213 [5]: http://en.wikipedia.org/wiki/Heisenbug [6]: http://forum.dlang.org/thread/jrxrmcmeksxwlyuit...@forum.dlang.org
Re: assert semantic change proposal
On Sunday, 3 August 2014 at 22:57:24 UTC, Ola Fosheim Grøstad wrote: Turning asserts in programs+libraries into globally available axioms is insane. I guess this is the heart of the difference in the way DbC programmers program. I know that program proving is impossibly hard, so my asserts are a kind of shortcut on it. I use asserts to define my architecture. So that I can say, "In my architecture, I know, by design, certain eventualities will never occur. I don't expect any compiler to be able to prove that for me (although it would be nice if it could), but I certainly will be relying on these facets of the architecture." When I assert, I'm stating "In my architecture, as I designed it, this will always be true, and everything in the code downstream of here can AND DOES rely on this. My code explicitly relies on these simplifying assumptions, and will go hideously wrong if those assumptions are false. So why can't the compiler rely on them too? Of course it can, as every single line I write after the assert is absolutely relying on the assert being true." My asserts are never "I believe this is true". They are _always_ "In this design, the following must be true, as I'm about to code absolutely relying on this fact." And if the compiler chooses to rely on it too... I can absolutely guarantee you that differing optimizations will be the least of my worries if the expression is false. However, that said, it is very clear to me that this is a very different usage of "assert" to what many of my colleagues do. Hence my suggestion we make explicit, by differing names, which usage we mean. Oh, and I will just throw this happy observation into the mix... in case you think this sort of optimization is too revolutionary... http://www.airs.com/blog/archives/120
Re: std.jgrandson
On Sunday, 3 August 2014 at 19:36:43 UTC, Sönke Ludwig wrote: Do you have a specific case in mind where the data format doesn't fit the process used by vibe.data.serialization? The data format iteration part *is* abstracted away there in basically a kind of traits structure (the "Serializer"). When serializing, the data always gets written in the order defined by the input value, while during deserialization the serializer defines how aggregates are iterated. This seems to fit all of the data formats that I had in mind. For example, we use a special binary serialization format for structs where the serialized content is actually a valid D struct - after updating internal array pointers one can simply do `cast(S*) buffer.ptr` and work with it normally. Doing this efficiently requires breadth-first traversal and keeping track of one upper level to update the pointers. This does not fit very well with the classical depth-first recursive traversal usually required by JSON-structure formats.
Re: assert semantic change proposal
On 3.8.2014 21:47, David Bregman wrote: > Walter has proposed a change to D's assert function as follows [1]: > "The compiler can make use of assert expressions to improve > optimization, even in -release mode." Couldn't this new assert behaviour be introduced as a new optimization switch ? Say -Oassert ? It would be off by default and would work both in debug and release mode. -- mk
Re: assert semantic change proposal
On Sunday, 3 August 2014 at 22:57:24 UTC, Ola Fosheim Grøstad wrote: On Sunday, 3 August 2014 at 22:18:29 UTC, John Carter wrote: My view, which I think corresponds with Walter's and Bertrand Meyer's, is that asserts define what correct behaviour is. No. The propositions describe what the correct behaviour ought to be. The asserts request them to be proved. And the sooner you know that, preferably at compile time, the better. And to do that you need a theorem prover capable of solving NP-hard problems. So you need a very intelligent programmer to write provably correct code without any special tools. Continuing past such an assert inevitably results in defective, possibly catastrophic, possibly flaky behaviour. And Walter thinks it would be a great idea to make that catastrophic behaviour occur with a much greater probability, and also every time you execute your program, undetected, not only in the select few cases where slight overlap in conditions were detected. So basically if your program contains an assert that states that the program should stop working 30 years from now, it is a good idea to make it fail randomly right away. That's the view that Andrei, Don and Walter have expressed very explicitly. People who think this is a great idea defy reason. They only learn from failure. You have to realize that a deduction engine cannot tolerate a single contradiction in its axioms. If there is a single contradiction it can basically deduce anything, possibly undetected. Turning asserts in programs+libraries into globally available axioms is insane. John proposes a separate function, so I think you two are in agreement on what really matters. Let's try to avoid going too deep into tangents, unlike the other threads - it didn't work well last time.
Re: assert semantic change proposal
On Sunday, 3 August 2014 at 22:51:58 UTC, John Carter wrote: The obvious simple correct resolution is to give them two different names. Agreed, but it remains to be seen if those in favor of the original proposal will also agree.
Re: assert semantic change proposal
On 08/04/2014 12:51 AM, John Carter wrote: On Sunday, 3 August 2014 at 22:19:16 UTC, Ola Fosheim Grøstad wrote: But go ahead. This will lead to a fork. What should fork is the two opposing intentions for assert. They should have two different names and different consequences. Yes. :)
Re: assert semantic change proposal
On 08/04/2014 12:15 AM, Andrei Alexandrescu wrote: I suspect it is one of those ideas of Walter's that has consequences that reach further than anyone foresees. but that's OK, because it is fundamentally the correct course of action, its implications, foreseen and unforeseen, will be correct. Agreed. No, please hold on. Walter is not a supernatural being. Walter has always meant assert the way he discusses it today. This argument has no merit. Please stop bringing it up. That said, should we proceed carefully about realizing this advantage? Of course; that's a given. That is reasonable. But I think it's very important to fully understand the advantages of gaining an edge over the competition. Note that this is achievable without claiming the Humpty Dumpty privilege once again. Furthermore the potential for the development of concepts is actually usually larger if concepts stay properly separated from the beginning. E.g. the current proposal already has the issue that an assumption of unreachability cannot be expressed in the straightforward way:

switch(x) {
    // ...
    default: assert(0); // cannot be optimized away in -release
}
Re: assert semantic change proposal
On Sunday, 3 August 2014 at 22:18:29 UTC, John Carter wrote: My view, which I think corresponds with Walter's and Bertrand Meyer's, is that asserts define what correct behaviour is. No. The propositions describe what the correct behaviour ought to be. The asserts request them to be proved. And the sooner you know that, preferably at compile time, the better. And to do that you need a theorem prover capable of solving NP-hard problems. So you need a very intelligent programmer to write provably correct code without any special tools. Continuing past such an assert inevitably results in defective, possibly catastrophic, possibly flaky behaviour. And Walter thinks it would be a great idea to make that catastrophic behaviour occur with a much greater probability, and also every time you execute your program, undetected, not only in the select few cases where slight overlap in conditions were detected. So basically if your program contains an assert that states that the program should stop working 30 years from now, it is a good idea to make it fail randomly right away. That's the view that Andrei, Don and Walter have expressed very explicitly. People who think this is a great idea defy reason. They only learn from failure. You have to realize that a deduction engine cannot tolerate a single contradiction in its axioms. If there is a single contradiction it can basically deduce anything, possibly undetected. Turning asserts in programs+libraries into globally available axioms is insane.
Re: assert semantic change proposal
On Sunday, 3 August 2014 at 22:19:16 UTC, Ola Fosheim Grøstad wrote: But go ahead. This will lead to a fork. What should fork is the two opposing intentions for assert. They should have two different names and different consequences. On Sunday, 3 August 2014 at 22:18:29 UTC, John Carter wrote: It comes down to two opposing views of what we use asserts for. To give a more concrete example of this... in the team I work with we have the following issue. When the DbC type programmers turn "asserts are fatal" on, we get asserts firing all over the place in non-DbC programmers' code. On closer inspection these come down to things like "stupid factory didn't connect cable A to device B; the installation instructions are clear, the cable should always be attached in the production model". And the solution is one of...
* Find a device B and plug cable A into it.
* There is a bug somewhere in the firmware.
* There is a bug in the firmware of device B.
* You have a debugger in the entrails of device B, so the heartbeat stopped.
* Something somewhere increased the latency so the timeout fired; maybe increase the timeout.
Whereas for DbC programmers a pre-condition assert firing means _very_ directly that the function that _directly_ invoked me is clearly defective in this manner. The bug is undoubtedly there; there may be a bug elsewhere as well, but it is undoubtedly a bug in my calling routine if it let defective values propagate as far as me. Or if a postcondition assert fired, it means _this_ function is defective in this manner. The simple harsh fact is DbC type programmers mean completely different things to non-DbC type programmers by "assert", yet unfortunately it is mapped onto the same thing with the same name. The obvious simple correct resolution is to give them two different names.
Re: assert semantic change proposal
On 04.08.2014 00:45, Dmitry Olshansky wrote: On 04-Aug-2014 02:35, Daniel Gibson wrote: On 04.08.2014 00:15, Andrei Alexandrescu wrote: That said, should we proceed carefully about realizing this advantage? Of course; that's a given. But I think it's very important to fully understand the advantages of gaining an edge over the competition. Gaining an edge over the competition? "A new DMD release broke my code in a totally unexpected way and people tell me it's because I'm using assert() wrong. I've been using it like this in C/C++/Java/Python and D since forever and *now*, after >10 years, they changed D's meaning of assert() to somehow imply assume() and optimize my safety-checks away.. they pretend it was always planned like this but they never got around to tell anyone until recently. It took me a week to find this freaking bug!" Wait a sec - it's not any better even today, it already strips them away. So unless debugging "-release -O -inline" programs is considered the norm, nothing changes. It strips them away, but it doesn't eliminate other code based on the (removed) assertion. See my other post for an example.
Re: assert semantic change proposal
On 04-Aug-2014 02:35, Daniel Gibson wrote: On 04.08.2014 00:15, Andrei Alexandrescu wrote: That said, should we proceed carefully about realizing this advantage? Of course; that's a given. But I think it's very important to fully understand the advantages of gaining an edge over the competition. Gaining an edge over the competition? "A new DMD release broke my code in a totally unexpected way and people tell me it's because I'm using assert() wrong. I've been using it like this in C/C++/Java/Python and D since forever and *now*, after >10 years, they changed D's meaning of assert() to somehow imply assume() and optimize my safety-checks away.. they pretend it was always planned like this but they never got around to telling anyone until recently. It took me a week to find this freaking bug!" Wait a sec - it's not any better even today, it already strips them away. So unless debugging "-release -O -inline" programs is considered the norm, nothing changes. -- Dmitry Olshansky
Re: assert semantic change proposal
On 04.08.2014 00:15, Andrei Alexandrescu wrote: That said, should we proceed carefully about realizing this advantage? Of course; that's a given. But I think it's very important to fully understand the advantages of gaining an edge over the competition. Gaining an edge over the competition? "A new DMD release broke my code in a totally unexpected way and people tell me it's because I'm using assert() wrong. I've been using it like this in C/C++/Java/Python and D since forever and *now*, after >10 years, they changed D's meaning of assert() to somehow imply assume() and optimize my safety-checks away.. they pretend it was always planned like this but they never got around to telling anyone until recently. It took me a week to find this freaking bug!" Doesn't really sound like the kind of advantage over the competition that many people would appreciate. If some rant like this (from Reddit or wherever) is the first impression someone gets of D, he's unlikely to ever take a closer look, regardless of the many merits the language actually has. (Yes, Johannes Pfau actually brought up a similar argument somewhere in the depths of one of the other threads.) D could still get this kind of edge over the competition by doing these kinds of optimizations when another keyword (like assume()) is used - without breaking any code. Cheers, Daniel
Re: assert semantic change proposal
On Sunday, 3 August 2014 at 22:15:52 UTC, Andrei Alexandrescu wrote: On 8/3/14, 2:57 PM, John Carter wrote: On Sunday, 3 August 2014 at 19:47:27 UTC, David Bregman wrote: Walter has proposed a change to D's assert function as follows [1]: "The compiler can make use of assert expressions to improve optimization, even in -release mode." Hmm. I really really do like that idea. I suspect it is one of those ideas of Walter's that has consequences that reach further than anyone foresees. But that's OK, because it is fundamentally the correct course of action; its implications, foreseen and unforeseen, will be correct. Agreed. I hope that "Agreed" was referring to some bits from the other paragraphs, and that you don't seriously agree with that blatantly self-contradictory statement about unforeseeable unforeseens :p One related point that has been discussed only a little is the competitive aspect of it all. Generating fast code is of paramount importance for D's survival and thriving in the market. Competition in language design and implementation is acerbic and only getting more cutthroat. In the foreseeable future efficiency will become more important at scale seeing as data is growing and frequency scaling has stalled. Would you care to address the questions about performance raised in the OP? Availing ourselves of a built-in "assert" that has a meaning and informativeness unachievable to e.g. a C/C++ macro is a very important and attractive competitive advantage compared to these and other languages. Not really, you can redefine the C macro to behave exactly as proposed, using compiler-specific commands to invoke undefined behavior. Didn't you say in the other thread that you tried exactly that? Walter has always meant assert the way he discusses it today. Has he (and subsequently he and I) been imprecise in documenting it? Of course, but that just means it's Tuesday. That said, should we proceed carefully about realizing this advantage? Of course; that's a given.
But I think it's very important to fully understand the advantages of gaining an edge over the competition. Please comment on the concerns raised by the OP.
Re: assert semantic change proposal
On Sunday, 3 August 2014 at 21:57:08 UTC, John Carter wrote: consequences that reach further than anyone foresees. But that's OK, because it is fundamentally the correct course of action; its implications, foreseen and unforeseen, will be correct. The implications are foreseen. Any assert that depends on any kind of notion of progress over time risks blowing up random logic undetected with a decent optimizer (> dmd). But go ahead. This will lead to a fork.
Re: assert semantic change proposal
On 8/3/14, 2:57 PM, John Carter wrote: On Sunday, 3 August 2014 at 19:47:27 UTC, David Bregman wrote: Walter has proposed a change to D's assert function as follows [1]: "The compiler can make use of assert expressions to improve optimization, even in -release mode." Hmm. I really really do like that idea. I suspect it is one of those ideas of Walter's that has consequences that reach further than anyone foresees. But that's OK, because it is fundamentally the correct course of action; its implications, foreseen and unforeseen, will be correct. Agreed. One related point that has been discussed only a little is the competitive aspect of it all. Generating fast code is of paramount importance for D's survival and thriving in the market. Competition in language design and implementation is acerbic and only getting more cutthroat. In the foreseeable future efficiency will become more important at scale seeing as data is growing and frequency scaling has stalled. Availing ourselves of a built-in "assert" that has a meaning and informativeness unachievable to e.g. a C/C++ macro is a very important and attractive competitive advantage compared to these and other languages. Walter has always meant assert the way he discusses it today. Has he (and subsequently he and I) been imprecise in documenting it? Of course, but that just means it's Tuesday. That said, should we proceed carefully about realizing this advantage? Of course; that's a given. But I think it's very important to fully understand the advantages of gaining an edge over the competition. Andrei
Re: assert semantic change proposal
On Sunday, 3 August 2014 at 20:05:22 UTC, bachmeier wrote: 3. Undefined behavior. Actually I have had an extensive battle within my own workplace on this subject and I think I have a reasonable insight into both points of view. It comes down to two opposing views of what we use asserts for. My view, which I think corresponds with Walter's and Bertrand Meyer's, is that asserts define what correct behaviour is. If an assert fires, your program is fundamentally defective in a manner that can only be corrected by a new version of the program. And the sooner you know that, preferably at compile time, the better. Continuing past such an assert inevitably results in defective, possibly catastrophic, possibly flaky behaviour. In the opposing view, an assert statement is a debug aid. In the same category as a logging printf. If it fires, it's "Huh. That's interesting. I didn't think that would happen, but OK, it does. Cool." Alas, these two uses have been given the same name: assert. One resolution would be to create two assert interfaces, one that the compiler pays attention to, and one that is just a "Huh. That's interesting, I didn't expect that."
Re: assert semantic change proposal
On Sunday, 3 August 2014 at 21:57:08 UTC, John Carter wrote: One "near term" implication is to permit deeper static checking of the code. Both in terms of "Well, actually there is a code path in which the assert expression could be false, flag it with a warning" and in terms of "There is a code path which is unused / incorrect / erroneous if the assert expression is true", flag it as an error/warning. Furthermore, in the presence of deeper compile-time function evaluation, I suspect we will get deeper and even more surprising consequences from this decision. Suddenly we have, at compile time, an expression we know to be true, always, at run time. Thus where possible, the compiler can infer as much as it can from this. The implications of that will be very very interesting and far reaching. I totally agree, static analysis tools should consider information contained in asserts. In the case of C/C++, I believe many of the analysis tools already do this. That doesn't mean it's a good idea for this information to be used for optimization though, for reasons explained in the OP. As I said, this choice will have very far reaching and unforeseen and unforeseeable consequences but that's OK, since it is fundamentally the correct choice, those consequences will be correct too. This is mystical-sounding gibberish. If the consequences are unforeseen and unforeseeable, then by definition you cannot foresee that they are correct.
Re: assert semantic change proposal
On Sunday, 3 August 2014 at 19:47:27 UTC, David Bregman wrote: Walter has proposed a change to D's assert function as follows [1]: "The compiler can make use of assert expressions to improve optimization, even in -release mode." Hmm. I really really do like that idea. I suspect it is one of those ideas of Walter's that has consequences that reach further than anyone foresees. But that's OK, because it is fundamentally the correct course of action; its implications, foreseen and unforeseen, will be correct. One "near term" implication is to permit deeper static checking of the code. Both in terms of "Well, actually there is a code path in which the assert expression could be false, flag it with a warning" and in terms of "There is a code path which is unused / incorrect / erroneous if the assert expression is true", flag it as an error/warning. Furthermore, in the presence of deeper compile-time function evaluation, I suspect we will get deeper and even more surprising consequences from this decision. Suddenly we have, at compile time, an expression we know to be true, always, at run time. Thus where possible, the compiler can infer as much as it can from this. The implications of that will be very very interesting and far reaching. As I said, this choice will have very far reaching and unforeseen and unforeseeable consequences but that's OK; since it is fundamentally the correct choice, those consequences will be correct too.
Re: checkedint call removal
On 08/03/2014 11:03 PM, Paolo Invernizzi wrote: ... But mostly yes: to me, undefined behaviour is the situation where I've developed the code to have 'a' in one place, and I have 'b'. If that is not, literally, undefined behaviour, I don't know what I should call it. ... You could name it a bug, or a programming error. Undefined behaviour is something specific. If a program contains undefined behaviour, it means that a conforming implementation can generate any arbitrary behaviour. ... everything seems to work fine for a good amount of time until the snowball comes down, violates the program logic catastrophically, and boom. Again, in other cases, the -release program will operate as expected. If I've grasped your point in the previous reply, yes, maybe. ... Ok. This was Johannes' point.
Re: assert semantic change proposal
On 03.08.2014 22:05, bachmeier wrote: Thanks for the summary. I apologize for the uninformed question, but is it possible to explain how the change wrt assert will break existing code? Those details are probably buried in the extensive threads you've referenced. I ask because my understanding of assert has always been that you should use it to test your programs but not rely on it at runtime. Example: assert(x !is null); if(x !is null) { x.foo = 42; } in release mode, the assert() will (according to Walter's plans) tell the compiler that x cannot be null, so it will optimize the "if(x !is null)" away. Currently this would not happen and this code won't segfault; with that optimization it would, in case x is null after all in release mode, in some circumstance that hasn't occurred during testing. Walter's stance on this is that if x is null under some circumstance, the program is buggy anyway and in an undefined state. Furthermore he interprets "assert()" as "the programmer asserts (as in promises) that the given condition is true". Other people say that assert(), as it's implemented in many programming languages (including D up to now), is just a runtime check with a funny name that can be deactivated (e.g. for/in release builds). Some have proposed to have an assume() that does what Walter wants assert() to do. Cheers, Daniel
Re: assert semantic change proposal
On Sunday, 3 August 2014 at 20:05:22 UTC, bachmeier wrote: Thanks for the summary. I apologize for the uninformed question, but is it possible to explain how the change wrt assert will break existing code? Those details are probably buried in the extensive threads you've referenced. I ask because my understanding of assert has always been that you should use it to test your programs but not rely on it at runtime. Yes, it was discussed in the threads. The basic way it happens is something like this: assert(x); if(!x) { // some side effect on the program // the optimizer will remove this path under the proposal } It's much more insidious if the assert and the if are separated by some distance, such as being in different functions or even modules. For example: assert(x < 1000); // this assert is wrong, but has never been hit during testing; unfortunate, but real-life programs are not bug-free. someFunction(x); Now suppose someFunction is a library sort of function, coded in "defensive programming" style. It does something important so it validates its input first to make sure nothing bad happens. But now someFunction is inlined, and code is generated with the (wrong) assumption that x<1000. The input validation checks are removed by the optimizer. As a result, someFunction runs with invalid input, and [user's hard drive is formatted, hacker gains root access, etc]. There are other ways too. The code does not explicitly need to have an if statement checking for !x to be broken by this - any implicit checks, any kind of control flow structures can be broken just the same.
Re: checkedint call removal
On Sunday, 3 August 2014 at 16:29:18 UTC, Timon Gehr wrote: On 08/03/2014 05:00 PM, Paolo Invernizzi wrote: On Sunday, 3 August 2014 at 14:10:29 UTC, Timon Gehr wrote: On 08/03/2014 03:01 PM, Paolo Invernizzi wrote: On Sunday, 3 August 2014 at 10:49:39 UTC, Timon Gehr wrote: On 08/03/2014 11:15 AM, Paolo Invernizzi wrote: and someone is seeing and exploiting that. (Undefined behaviour introduced in this way may be exploitable.) If the assertion triggers, that's not undefined behaviour: It will be, from now on, in -release; that is what this is all about. If an assert triggers, and you want to go ahead anyway with '-release', you should remove it from the code; that's my point. No problem at all with inferring optimisations from it. It's a bug, already implanted in the code, that the assertion is surfacing, _avoiding_ undefined behaviour occurring from the very next line of code. ... Most bugs do not lead to undefined behaviour. E.g. you can write buggy code in a language that does not have the concept. But mostly yes: to me, undefined behaviour is the situation where I've developed the code to have 'a' in one place, and I have 'b'. If that is not, literally, undefined behaviour, I don't know what I should call it. Security holes are not related to assertions at all; they are related to the unpredictable state that the program has reached, outside of the view of the programmers. And what do you think a wrong assertion is the manifestation of? (Hint: The assertion talks about the program state.) Again, the assertion will trigger on that, so I'm not entering the 'dunno what happened here' state: the program will halt. No security risk at all, even if the assert expression is crappy code. If you want to keep the safety net also in '-release', just stop using asserts and use enforce; at least you are clearly signalling the developer intention. Or, simply, don't use '-release'. But that, to me, is a bit of an abuse of assert...
Assertions are only checks that the reasoning about the flow and the conditions is going the way that was originally intended. If you have failures, But that is not the scenario. You don't turn on -release in order to disable assertions that you _know_ are failing. and you want to cope with them, you MUST remove the failing assertions from the code, and turn them into specific code to cope with the faulty condition. Something like 'warning mate! the commented assertion is triggered from time to time, Why do you assume they had noticed that the assertion was wrong? I don't understand now: I notice it because the program terminates with a stack trace telling me an assertion was triggered. What do you mean? I guess it's a pretty common practice to have testers stress the application with asserts on, prior to putting it into production with assertions disabled, once you are confident that the program is not buggy. That was a stereotypical example; You were not in the position to introduce a stereotypical example. If Johannes says that some code will break, and you say "I don't buy this", then you cannot introduce an example to argue against, where the additional circumstances are such that you have an easy point. This is a straw man. I don't think it's a straw man, because I've not changed anything that Johannes has said. You need to argue against the existence of the broken code. I.e. you need to argue for an (approximate) universal. I'm arguing, like others, that to me the code was already broken, but it seems that we can't agree on what broken means. If you introduce an example, you need to argue that the 'example' is actually the only (likely) way that code could be broken by this change. I.e. you'd need to argue that wrong assertions are always caught before they are part of a build where they are disabled. That's what usually happens, if you have good test coverage and a good testing period.
(You might also pursue the alternative direction of admitting that this is a possibility, but saying that it is very unlikely to happen, and then justify that opinion. ("That's true, this might happen, but I don't think this change will lead to many catastrophes, because...")) The point was simply that I don't think that such a change will summon voices claiming that the language has broken again some codebase. That is an easy rebuttal, because it's caused by a, IMHO, bad practical usage of the assert statement. what I was trying to argue it's that also if we do some dirty tricks to keep the train on the rails, (That's still part of the made up scenario.) if the program logic is flowed [sic] Maybe it hasn't been flawed in the -release build before because the faulty assertion was the only thing that was faulty, but nobody knew that it was faulty before the released system went down. Ok, now I've understood: are you arguing, (correct me if I'm wrong, I'm not searching a strawman, r
Re: std.jgrandson
On Sunday, 3 August 2014 at 17:40:48 UTC, Andrei Alexandrescu wrote: On 8/3/14, 10:19 AM, Sean Kelly wrote: I don't want to pay for anything I don't use. No allocations should occur within the parser and it should simply slice up the input. What to do about arrays and objects, which would naturally allocate arrays and associative arrays respectively? What about strings with backslash-encoded characters? This is tricky with a range. With an event-based parser I'd have events for object and array begin / end, but with a range you end up having an element that's a token, which is pretty weird. For encoded characters (and you need to make sure you handle surrogate pairs in your decoder) I'd still provide some means of decoding on demand. If nothing else, decode lazily when the user asks for the string value. That way the user isn't paying to decode strings he isn't interested in. No allocation works for tokenization, but parsing is a whole different matter. So the lowest layer should allow me to iterate across symbols in some way. Yah, that would be the tokenizer. But that will halt on comma and colon and such, correct? That's a tad lower than I'd want, though I guess it would be easy enough to build a parser on top of it. When I've done this in the past it was SAX-style (ie. a callback per type) but with the range interface that shouldn't be necessary. The parser shouldn't decode or convert anything unless I ask it to. Most of the time I only care about specific values, and paying for conversions on everything is wasted process time. That's tricky. Once you scan for 2 specific characters you may as well scan for a couple more, the added cost is negligible. In contrast, scanning once for finding termination and then again for decoding purposes will definitely be a lot more expensive. I think I'm getting a bit confused. 
For the JSON parser I wrote, the parser performs full validation but leaves the content as-is, then provides a routine to decode values from their string representation if the user wishes to. I'm not sure where scanning figures in here. Andrei
Re: std.jgrandson
On 8/3/14, 11:03 AM, Sönke Ludwig wrote: On 03.08.2014 17:14, Andrei Alexandrescu wrote: [snip] Ah okay, *phew* ;) But in that case I'd actually think about leaving off the backslash decoding in the low-level parser, so that slices could be used for immutable inputs in all cases - maybe with a name of "rawString" for the stored data and an additional "string" property that decodes on the fly. This may come in handy when the first comparative benchmarks together with rapidjson and the like are done. Yah, that's awesome. There's a public opCast(Payload) that gives the end user access to the Payload inside a Value. I forgot to add documentation to it. I see. Suppose opDispatch were dropped, would anything speak against "alias this"ing _payload to avoid the need for the manually defined operators? Correct. In fact the conversion was there but I removed it for the sake of opDispatch. What advantages are there to a tagged union? (FWIW: to me Algebraic and Variant are also tagged unions, just that the tags are not 0, 1, ..., n. That can be easily fixed for Algebraic by defining operations to access the index of the currently-stored type.) The two major points are probably that it's possible to use "final switch" on the type tag if it's an enum, So I just tried this: http://dpaste.dzfl.pl/eeadac68fac0. Sadly, the cast doesn't take. Without the cast the enum does compile, but not the switch. I submitted https://issues.dlang.org/show_bug.cgi?id=13247. and the type id can be easily stored in both integer and string form (which is not as conveniently possible with a TypeInfo). I think here pointers to functions "win" because getting a string (or anything else for that matter) is an indirect call away. std.variant has been among the first artifacts I wrote for D. It's a topic I've been dabbling in for a long time in a C++ context (http://goo.gl/zqUwFx), with always almost-satisfactory results.
I told myself if I get to implement things in D properly, then this language has good potential. Replacing the integral tag I'd always used with a pointer to function is, I think, net progress. Things turned out fine, save for the switch matter. An enum based tagged union design also currently has the unfortunate property that the order of enum values and that of the accepted types must be defined consistently, or bad things will happen. Supporting UDAs on enum values would be a possible direction to fix this: enum JsonType { @variantType!string string, @variantType!(JsonValue[]) array, @variantType!(JsonValue[string]) object } alias JsonValue = TaggedUnion!JsonType; But then there are obviously still issues with cyclic type references. So, anyway, this is something that still requires some thought. It could also be designed in a way that is backwards compatible with a pure "Algebraic", so it shouldn't be a blocker for the current design. I think something can be designed along these lines if necessary. Andrei
Re: std.jgrandson
On 8/3/2014 2:16 AM, Andrei Alexandrescu wrote: We need a better json library at Facebook. I'd discussed with Sönke the possibility of taking vibe.d's json to std but he said it needs some more work. So I took std.jgrandson to proof of concept state and hence ready for destruction: http://erdani.com/d/jgrandson.d http://erdani.com/d/phobos-prerelease/std_jgrandson.html Here are a few differences compared to vibe.d's library. I think these are desirable to have in that library as well: * Parsing strings is decoupled into tokenization (which is lazy and only needs an input range) and parsing proper. Tokenization is lazy, which allows users to create their own advanced (e.g. partial/lazy) parsing if needed. The parser itself is eager. * There's no decoding of strings. * The representation is built on Algebraic, with the advantages that it benefits from all of its primitives. Implementation is also very compact because Algebraic obviates a bunch of boilerplate. Subsequent improvements to Algebraic will also reflect themselves into improvements to std.jgrandson. * The JSON value (called std.jgrandson.Value) has no named member variables or methods except for __payload. This is so there's no clash between dynamic properties exposed via opDispatch. Well that's about it. What would it take for this to become a Phobos proposal? Destroy. Andrei If you're looking for serialization from statically known type layouts then I believe my JSON (de)serialization code (https://github.com/Orvid/JSONSerialization) might actually be of interest to you, as it uses no intermediate representation, nor does it allocate when it converts an object to JSON. As far as I know, even when only compiled with DMD, it's among the fastest JSON (de)serialization libraries. Unless it needs to convert a floating point number to a string, in which case I suppose you could certainly use a local buffer to write to, but at the moment it just converts it to a normal string that gets written to the output range.
It also supports (de)serializing what I called at the time "dynamic types", such as std.variant - though std.variant itself isn't actually supported, because that code is only there because I needed it for something else and wasn't using std.variant at the time.
Re: assert semantic change proposal
Thanks for the summary. I apologize for the uninformed question, but is it possible to explain how the change wrt assert will break existing code? Those details are probably buried in the extensive threads you've referenced. I ask because my understanding of assert has always been that you should use it to test your programs but not rely on it at runtime. On Sunday, 3 August 2014 at 19:47:27 UTC, David Bregman wrote: I am creating this thread because I believe the other ones [1,6] have gotten too bogged down in minutiae and the big picture has gotten lost. Walter has proposed a change to D's assert function as follows [1]: "The compiler can make use of assert expressions to improve optimization, even in -release mode." I would like to raise a series of questions, comments, and potential objections to this proposal which I hope will help clarify the big picture. 1. Who and Why? What is the impetus behind this proposal? What is the case for it? Walter made strong statements such as "there is inexorable pressure for this", and "this will happen", and I am wondering where this is coming from. Is it just Walter? If not, who or what is pushing this idea? (the 'yea' side, referred to below) 2. Semantic change. The proposal changes the meaning of assert(), which will result in breaking existing code. Regardless of philosophizing about whether or not the code was "already broken" according to some definition of assert, the fact is that shipping programs that worked perfectly well before may no longer work after this change. Q2a. In other areas, code breakage has recently been anathema. Why is this case different? Q2b. Has any attempt been made to estimate the impact of this change on existing code? Has code breakage been considered in making this proposal? 2c. I note that the proposal also breaks with (at least) one of D's stated "Major Design Goals".[2] ("Where D code looks the same as C code, have it either behave the same or issue an error.") 3. Undefined behavior. 
The purpose of the proposal is to improve code generation, and this is accomplished by allowing the compiler to generate code with arbitrary (undefined) behavior in the case that the assertion does not hold. Undefined behavior is well known to be a source of severe problems, such as security exploits[3,4], and so-called "heisenbugs"[5]. 3a. An alternate statement of the proposal is literally "in release mode, assert expressions introduce undefined behavior into your code if the expression is false". 3b. Since assert is such a widely used feature (with the original semantics, "more asserts never hurt"), the proposal will inject a massive amount of undefined behavior into existing code bases, greatly increasing the probability of experiencing problems related to undefined behavior. Q3c. Have the implications of so much additional undefined behavior been sufficiently considered and weighed against the performance benefits of the proposal? Q3d. How can the addition of large amounts of undefined behavior be reconciled with D's Major Design Goals #2,3,5,15,17? [2]? 3f. I note that it has been demonstrated in the other threads that the proposal as it stands can even break the memory safety guarantee of @safe code. 4. Performance. Q4a. What level of performance increases are expected of this proposal, for a representative sample of D programs? Q4b. Is there any threshold level of expected performance required to justify this proposal? For example, if a study determined that the average program could expect a speedup of 0.01% or less, would that still be considered a good tradeoff against the negatives? Q4c. Have any works or studies, empirical or otherwise, been done to estimate the expected performance benefit? Is there any evidence at all for a speedup sufficient to justify this proposal? Q4d. When evaluating the potential negative effects of the proposal on their codebase, D users may decide it is now too risky to compile with -release.
(Even if their own code has been constructed with the new assert semantics in mind, the libraries they use might not). Thus the effect of the proposal would actually be to decrease the performance of their program instead of increase it. Has this been considered in the evaluation of tradeoffs? 5. High level goals The feedback so far demonstrates that the proposal is controversial at least. While I do not endorse democratic or design-by-committee approaches to language design, I do think it is relevant if a large subset of users have issues with a proposal. Note that this is not bikeshedding, I believe it has now been sufficiently demonstrated there are real concerns about real negative effects of the proposal. 5a. Is this proposal the best way to go or is there an alternative that would achieve the same goals while satisfying both sides? 5b. Has the 'yea' side been sufficiently involved in this discussion? Are they aware of the tradeoffs? M
Re: std.jgrandson
On 03-Aug-2014 23:54, Dmitry Olshansky wrote: On 03-Aug-2014 21:40, Andrei Alexandrescu wrote: A simplified pseudo-code of JSON-parser inner loop is then: if(cur == '[') startArray(); else if(cur == '{'){ Aw. Stray brace.. -- Dmitry Olshansky
Re: std.jgrandson
03-Aug-2014 21:40, Andrei Alexandrescu wrote: On 8/3/14, 10:19 AM, Sean Kelly wrote: I don't want to pay for anything I don't use. No allocations should occur within the parser and it should simply slice up the input. What to do about arrays and objects, which would naturally allocate arrays and associative arrays respectively? What about strings with backslash-encoded characters? SAX-style would imply that an array is "parsed" by calling 6 user-defined callbacks inside of the parser: startArray, endArray, startObject, endObject, id and value. A simplified pseudo-code of the JSON parser's inner loop is then: if(cur == '[') startArray(); else if(cur == '{'){ startObject(); else if(cur == '}') endObject(); else if(cur == ']') endArray(); else{ if(expectObjectKey){ id(parseAsIdentifier()); } else value(parseAsValue()); } This is as barebones as it can get and is very fast in practice, esp. in the context of searching/extracting/matching specific sub-trees of JSON documents. -- Dmitry Olshansky
Re: std.jgrandson
On 8/3/14, 11:37 AM, Sönke Ludwig wrote: On 03.08.2014 17:34, Andrei Alexandrescu wrote: On 8/3/14, 2:38 AM, Sönke Ludwig wrote: [snip] We need to address the matter of std.jgrandson competing with vibe.data.json. Clearly at some point only one proposal will have to be accepted, so the other would be wasted work. Following our email exchange I decided to work on this because (a) you mentioned more work is needed and your schedule was unclear, (b) we need this at FB sooner rather than later, (c) there were a few things I thought could be improved in vibe.data.json. I hope that taking std.jgrandson to proof spurs things into action. Would you want to merge some of std.jgrandson's deltas into a new proposal std.data.json based on vibe.data.json? Here's a few things that I consider necessary: 1. Commit to a schedule. I can't abandon stuff in wait for the perfect design that may or may not come someday. This may be the crux w.r.t. the vibe.data.json implementation. My schedule will be very crowded this month, so I could only really start to work on it at the beginning of September. But apart from the mentioned points, I think your implementation is already the closest thing to what I have in mind, so I'm all for going the clean-slate route (I'll have to do a lot in terms of deprecation work in vibe.d anyway). What would be your estimated time of finishing? Would anyone want to take vibe.data.json and std.jgrandson, put them in a crucible, and have std.data.json emerge from it in a timely manner? My understanding is that everyone involved would be cool with that. Andrei
Re: std.jgrandson
On 03.08.2014 20:57, w0rp wrote: On Sunday, 3 August 2014 at 18:37:48 UTC, Sönke Ludwig wrote: The "undefined" state in the vibe.d version was necessary due to early API decisions and it's more or less a prominent part of it (specifically because the API was designed to behave similarly to JavaScript). In hindsight, I'd definitely avoid that. However, I don't think its existence (also in the form of Algebraic.init) is an issue per se, as long as such values are properly handled when converting the runtime value back to a JSON string (i.e. skipped or treated as null values). My issue with it is that if you ask for a key in an object which doesn't exist, you get an 'undefined' value back, just like JavaScript. I'd rather that be propagated as a RangeError, which is more consistent with associative arrays in the language and probably more correct. Yes, this is what I meant by the JavaScript part of the API. In addition to opIndex(), there should of course also be a .get(key, default_value) style accessor and the "in" operator. A minor issue is being able to create a Json object which isn't a valid Json object by itself. I'd rather the initial value was just 'null', which would match how pointers and class instances behave in the language. This is what I meant by it not being an issue by itself. But having such a special value of course has its pros and cons, and I could personally definitely also live with JSON values being initialized to JSON "null", if somebody hacks Algebraic to support that kind of use case.
assert semantic change proposal
I am creating this thread because I believe the other ones [1,6] have gotten too bogged down in minutiae and the big picture has gotten lost. Walter has proposed a change to D's assert function as follows [1]: "The compiler can make use of assert expressions to improve optimization, even in -release mode." I would like to raise a series of questions, comments, and potential objections to this proposal which I hope will help clarify the big picture. 1. Who and Why? What is the impetus behind this proposal? What is the case for it? Walter made strong statements such as "there is inexorable pressure for this", and "this will happen", and I am wondering where this is coming from. Is it just Walter? If not, who or what is pushing this idea? (the 'yea' side, referred to below) 2. Semantic change. The proposal changes the meaning of assert(), which will result in breaking existing code. Regardless of philosophizing about whether or not the code was "already broken" according to some definition of assert, the fact is that shipping programs that worked perfectly well before may no longer work after this change. Q2a. In other areas, code breakage has recently been anathema. Why is this case different? Q2b. Has any attempt been made to estimate the impact of this change on existing code? Has code breakage been considered in making this proposal? 2c. I note that the proposal also breaks with (at least) one of D's stated "Major Design Goals".[2] ("Where D code looks the same as C code, have it either behave the same or issue an error.") 3. Undefined behavior. The purpose of the proposal is to improve code generation, and this is accomplished by allowing the compiler to generate code with arbitrary (undefined) behavior in the case that the assertion does not hold. Undefined behavior is well known to be a source of severe problems, such as security exploits[3,4], and so-called "heisenbugs"[5]. 3a. 
An alternate statement of the proposal is literally "in release mode, assert expressions introduce undefined behavior into your code if the expression is false". 3b. Since assert is such a widely used feature (with the original semantics, "more asserts never hurt"), the proposal will inject a massive amount of undefined behavior into existing code bases, greatly increasing the probability of experiencing problems related to undefined behavior. Q3c. Have the implications of so much additional undefined behavior been sufficiently considered and weighed against the performance benefits of the proposal? Q3d. How can the addition of large amounts of undefined behavior be reconciled with D's Major Design Goals #2,3,5,15,17 [2]? 3f. I note that it has been demonstrated in the other threads that the proposal as it stands can even break the memory safety guarantee of @safe code. 4. Performance. Q4a. What level of performance increase is expected of this proposal, for a representative sample of D programs? Q4b. Is there any threshold level of expected performance required to justify this proposal? For example, if a study determined that the average program could expect a speedup of 0.01% or less, would that still be considered a good tradeoff against the negatives? Q4c. Have any works or studies, empirical or otherwise, been done to estimate the expected performance benefit? Is there any evidence at all for a speedup sufficient to justify this proposal? Q4d. When evaluating the potential negative effects of the proposal on their codebase, D users may decide it is now too risky to compile with -release. (Even if their own code has been constructed with the new assert semantics in mind, the libraries they use might not). Thus the effect of the proposal would actually be to decrease the performance of their program instead of increasing it. Has this been considered in the evaluation of tradeoffs? 5.
High level goals. The feedback so far demonstrates that the proposal is controversial, to say the least. While I do not endorse democratic or design-by-committee approaches to language design, I do think it is relevant if a large subset of users have issues with a proposal. Note that this is not bikeshedding; I believe it has now been sufficiently demonstrated that there are real concerns about real negative effects of the proposal. 5a. Is this proposal the best way to go, or is there an alternative that would achieve the same goals while satisfying both sides? 5b. Has the 'yea' side been sufficiently involved in this discussion? Are they aware of the tradeoffs? Mostly what I've seen is Walter defending the yea side from the perspective that the decision has already been made. Maybe if the yea side were consulted, they might easily agree to an alternative way of achieving the improved optimization goal, such as creating a new function that has the proposed semantics. References: [1]: http://forum.dlang.org/thread/lrbpvj$mih$1...@digitalmars.com [2]: http://dlang.org/overview.html [3
Re: std.jgrandson
On 03.08.2014 20:44, Dicebot wrote: On Sunday, 3 August 2014 at 08:04:40 UTC, Johannes Pfau wrote: API looks great but I'd like to see some simple serialize/deserialize functions as in vibed: http://vibed.org/api/vibe.data.json/deserializeJson http://vibed.org/api/vibe.data.json/serializeToJson Before going this route one needs to have a good vision of how it may interact with an imaginary std.serialization to avoid later deprecation. At the same time I have recently started to think that a dedicated serialization module that decouples aggregate iteration from the data storage format is in most cases impractical for performance reasons - different serialization methods imply very different efficient iteration strategies. Probably it is better to define serialization compile-time traits instead and require each `std.data.*` provider to implement those on its own in the most effective fashion. Do you have a specific case in mind where the data format doesn't fit the process used by vibe.data.serialization? The data format iteration part *is* abstracted away there in basically a kind of traits structure (the "Serializer"). When serializing, the data always gets written in the order defined by the input value, while during deserialization the serializer defines how aggregates are iterated. This seems to fit all of the data formats that I had in mind.
Re: std.jgrandson
On 8/3/14, 11:08 AM, Johannes Pfau wrote: On Sun, 03 Aug 2014 09:17:57 -0700, Andrei Alexandrescu wrote: On 8/3/14, 8:51 AM, Johannes Pfau wrote: Variant uses TypeInfo internally, right? No. https://github.com/D-Programming-Language/phobos/blob/master/std/variant.d#L210 That's a query for the TypeInfo. https://github.com/D-Programming-Language/phobos/blob/master/std/variant.d#L371 That could be translated to a comparison of pointers to functions. https://github.com/D-Programming-Language/phobos/blob/master/std/variant.d#L696 That, too, could be translated to a comparison of pointers to functions. It's a confusion. Let me clarify this. What Variant does is use pointers to functions instead of integers. The space overhead (one word) is generally the same due to alignment issues. Also, the handler function concept will always have more overhead than a simple tagged union. It is certainly useful if you want to store any type, but if you only want a limited set of types there are more efficient implementations. I'm not sure at all, actually. The way I see it, a pointer to a function offers most everything an integer does, plus universal functionality by actually calling the function. What it doesn't offer is ordering of small integers, but that can be easily arranged at a small cost. Andrei
Re: checkedint call removal
On Saturday, 2 August 2014 at 21:36:11 UTC, Tobias Pankrath wrote: Don't you agree that a program that throws AssertError in a non-release* build is broken? * this is not the opposite of debug Let's go to the definitions: Total correctness: The program can be proved to always terminate normally and return the correct result. Total correctness is extremely difficult to achieve on a regular computer. Partial correctness: The program can be proved to always return the correct result if it terminates normally. Partial correctness is what you should aim for. What goes on if it does not terminate depends on many factors. E.g. the FDIV bug in the Intel CPU: http://en.wikipedia.org/wiki/Pentium_FDIV_bug The FDIV bug is a good example that extensive testing is not enough and that you need either exhaustive testing or formal proofs to assert correctness. Equally clearly, the FDIV bug is not the result of an incorrect program. It is the result of an incorrect "axiom" in the CPU (the execution environment).
Re: std.jgrandson
On Sunday, 3 August 2014 at 18:37:48 UTC, Sönke Ludwig wrote: On 03.08.2014 17:34, Andrei Alexandrescu wrote: 6. Address w0rp's issue with undefined. In fact std.Algebraic does have an uninitialized state :o). My requirements would be the same, except for 6. The "undefined" state in the vibe.d version was necessary due to early API decisions and it's more or less a prominent part of it (specifically because the API was designed to behave similarly to JavaScript). In hindsight, I'd definitely avoid that. However, I don't think its existence (also in the form of Algebraic.init) is an issue per se, as long as such values are properly handled when converting the runtime value back to a JSON string (i.e. skipped or treated as null values). My issue with it is that if you ask for a key in an object which doesn't exist, you get an 'undefined' value back, just like JavaScript. I'd rather that be propagated as a RangeError, which is more consistent with associative arrays in the language and probably more correct. A minor issue is being able to create a Json object which isn't a valid Json object by itself. I'd rather the initial value was just 'null', which would match how pointers and class instances behave in the language.
Re: assume, assert, enforce, @safe
On Saturday, 2 August 2014 at 17:36:46 UTC, David Bregman wrote: OK, I'm done. It's clear now that you're just being intellectually dishonest in order to "win" what amounts to a trivial argument. So much for professionalism. Haha, this time it's not as bad as it was in the catch syntax discussion. Back then I even thought they were being blackmailed or something like that.
Re: std.jgrandson
On Sunday, 3 August 2014 at 08:04:40 UTC, Johannes Pfau wrote: API looks great but I'd like to see some simple serialize/deserialize functions as in vibed: http://vibed.org/api/vibe.data.json/deserializeJson http://vibed.org/api/vibe.data.json/serializeToJson Before going this route one needs to have a good vision of how it may interact with an imaginary std.serialization to avoid later deprecation. At the same time I have recently started to think that a dedicated serialization module that decouples aggregate iteration from the data storage format is in most cases impractical for performance reasons - different serialization methods imply very different efficient iteration strategies. Probably it is better to define serialization compile-time traits instead and require each `std.data.*` provider to implement those on its own in the most effective fashion.
Re: std.jgrandson
On 03.08.2014 17:34, Andrei Alexandrescu wrote: On 8/3/14, 2:38 AM, Sönke Ludwig wrote: [snip] We need to address the matter of std.jgrandson competing with vibe.data.json. Clearly at some point only one proposal will have to be accepted, so the other would be wasted work. Following our email exchange I decided to work on this because (a) you mentioned more work is needed and your schedule was unclear, (b) we need this at FB sooner rather than later, (c) there were a few things I thought could be improved in vibe.data.json. I hope that taking std.jgrandson to proof spurs things into action. Would you want to merge some of std.jgrandson's deltas into a new proposal std.data.json based on vibe.data.json? Here's a few things that I consider necessary: 1. Commit to a schedule. I can't abandon stuff in wait for the perfect design that may or may not come someday. This may be the crux w.r.t. the vibe.data.json implementation. My schedule will be very crowded this month, so I could only really start to work on it at the beginning of September. But apart from the mentioned points, I think your implementation is already the closest thing to what I have in mind, so I'm all for going the clean-slate route (I'll have to do a lot in terms of deprecation work in vibe.d anyway). 2. Avoid UTF decoding. 3. Offer a lazy token stream as a basis for a non-lazy parser. A lazy general parser would be considerably more difficult to write and would only serve a small niche. On the other hand, a lazy tokenizer is easy to write and make efficient, and can serve as a basis for user-defined specialized lazy parsers if the user wants so. 4. Avoid string allocation. String allocation can be replaced with slices of the input when these two conditions are true: (a) the input type is string, immutable(byte)[], or immutable(ubyte)[]; (b) there are no backslash-encoded sequences in the string, i.e. the input string and the actual string are the same. 5. Build on std.variant through and through.
Again, anything that doesn't work is a usability bug in std.variant, which was designed for exactly this kind of stuff. Exposing the representation such that user code benefits from the Algebraic's primitives may be desirable. 6. Address w0rp's issue with undefined. In fact std.Algebraic does have an uninitialized state :o). Sönke, what do you think? My requirements would be the same, except for 6. The "undefined" state in the vibe.d version was necessary due to early API decisions and it's more or less a prominent part of it (specifically because the API was designed to behave similarly to JavaScript). In hindsight, I'd definitely avoid that. However, I don't think its existence (also in the form of Algebraic.init) is an issue per se, as long as such values are properly handled when converting the runtime value back to a JSON string (i.e. skipped or treated as null values).
Re: Associative Ranges
On Sunday, 3 August 2014 at 08:50:47 UTC, Marc Schütz wrote: On Sunday, 3 August 2014 at 06:19:12 UTC, Freddy wrote: On Friday, 1 August 2014 at 23:57:37 UTC, Freddy wrote: I'm just curious, do Associative Ranges exist? If so, where can I find them? I started thinking about them when I asked this question: http://forum.dlang.org/thread/vauuognmhvtjrktaz...@forum.dlang.org I started a phobos fork for this, what do you think so far: https://github.com/Superstar64/phobos/blob/60d3472b1056b298319976f105aa3b9b3f165e97/std/range.d#L1357-1420 Nice! A few comments after a cursory glance: 1) "aligns" sounds strange, use "corresponds" instead. 2) It would be preferable that associative ranges mimic built-in associative arrays as closely as possible, i.e. the range members should be called `byKey` and `byValue`, and it should allow iteration over the associative range directly, instead of using the `range` member. 3) For the value and key ranges, there should be a guarantee that they can be zipped through, i.e. that the elements in them are in the same order, so keys and values correspond to each other. The built-in associative arrays provide `byKey` and `byValue`, which satisfy this condition. Also, won't the implicit range make it hard to implement functions like map (should you map over the whole range or just the values)?
Re: Associative Ranges
On Sunday, 3 August 2014 at 08:50:47 UTC, Marc Schütz wrote: On Sunday, 3 August 2014 at 06:19:12 UTC, Freddy wrote: On Friday, 1 August 2014 at 23:57:37 UTC, Freddy wrote: I'm just curious, do Associative Ranges exist? If so, where can I find them? I started thinking about them when I asked this question: http://forum.dlang.org/thread/vauuognmhvtjrktaz...@forum.dlang.org I started a phobos fork for this, what do you think so far: https://github.com/Superstar64/phobos/blob/60d3472b1056b298319976f105aa3b9b3f165e97/std/range.d#L1357-1420 Nice! A few comments after a cursory glance: 1) "aligns" sounds strange, use "corresponds" instead. 2) It would be preferable that associative ranges mimic built-in associative arrays as closely as possible, i.e. the range members should be called `byKey` and `byValue`, and it should allow iteration over the associative range directly, instead of using the `range` member. 3) For the value and key ranges, there should be a guarantee that they can be zipped through, i.e. that the elements in them are in the same order, so keys and values correspond to each other. The built-in associative arrays provide `byKey` and `byValue`, which satisfy this condition. Alright, I fixed it. Although, should the implicit range be a forward range, an input range, or something else? And how do you implement front, popFront, and empty on a built-in associative array?
Re: std.jgrandson
On Sun, 03 Aug 2014 09:17:57 -0700, Andrei Alexandrescu wrote: > On 8/3/14, 8:51 AM, Johannes Pfau wrote: > > > > Variant uses TypeInfo internally, right? > > No. > https://github.com/D-Programming-Language/phobos/blob/master/std/variant.d#L210 https://github.com/D-Programming-Language/phobos/blob/master/std/variant.d#L371 https://github.com/D-Programming-Language/phobos/blob/master/std/variant.d#L696 Also, the handler function concept will always have more overhead than a simple tagged union. It is certainly useful if you want to store any type, but if you only want a limited set of types there are more efficient implementations.
Re: std.jgrandson
On 03.08.2014 17:14, Andrei Alexandrescu wrote: On 8/3/14, 2:38 AM, Sönke Ludwig wrote: A few thoughts based on my experience with vibe.data.json: 1. No decoding of strings appears to mean that "Value" also always contains encoded strings. This seems to be a leaky and also error-prone abstraction. For the token stream, performance should be the top priority, so it's okay to not decode there, but "Value" is a high level abstraction of a JSON value, so it should really hide all implementation details of the storage format. Nonono. I think there's a confusion. The input strings are not UTF decoded for the simple reason that there's no need (all tokenization decisions are taken on the basis of ASCII characters/code units). The backslash-prefixed characters are indeed decoded. An optimization I didn't implement yet is to use slices of the input wherever possible (when the input is string, immutable(byte)[], or immutable(ubyte)[]). That will reduce allocations considerably. Ah okay, *phew* ;) But in that case I'd actually think about leaving off the backslash decoding in the low level parser, so that slices could be used for immutable inputs in all cases - maybe with a name of "rawString" for the stored data and an additional "string" property that decodes on the fly. This may come in handy when the first comparative benchmarks together with rapidjson and the like are done. 2. Algebraic is a good choice for its generic handling of operations on the contained types (which isn't exposed here, though). However, a tagged union type in my experience has quite some advantages for usability. Since adding a type tag possibly affects the interface in a non-backwards compatible way, this should be evaluated early on. There's a public opCast(Payload) that gives the end user access to the Payload inside a Value. I forgot to add documentation to it. I see.
Suppose that opDispatch would be dropped, would anything speak against "alias this"ing _payload to avoid the need for the manually defined operators? What advantages are there to a tagged union? (FWIW: to me Algebraic and Variant are also tagged unions, just that the tags are not 0, 1, ..., n. That can be easily fixed for Algebraic by defining operations to access the index of the currently-stored type.) The two major points are probably that it's possible to use "final switch" on the type tag if it's an enum, and the type id can be easily stored in both integer and string form (which is not as conveniently possible with a TypeInfo). (...) The way I see it, good work on tagged unions must be integrated within std.variant (either by modifying Variant/Algebraic or by adding new types to it). I am very strongly opposed to adding a tagged union type only for JSON purposes, which I'd consider essentially a usability bug in std.variant, the opposite of dogfooding, etc. Definitely agree there. An enum based tagged union design also currently has the unfortunate property that the order of enum values and that of the accepted types must be defined consistently, or bad things will happen. Supporting UDAs on enum values would be a possible direction to fix this: enum JsonType { @variantType!string string, @variantType!(JsonValue[]) array, @variantType!(JsonValue[string]) object } alias JsonValue = TaggedUnion!JsonType; But then there are obviously still issues with cyclic type references. So, anyway, this is something that still requires some thought. It could also be designed in a way that is backwards compatible with a pure "Algebraic", so it shouldn't be a blocker for the current design.
Re: GDC UDA Attributes (was 'checkedint call removal')
On 08/03/14 17:14, Iain Buclaw via Digitalmars-d wrote: > On 2 August 2014 14:54, Artur Skawina via Digitalmars-d >> But I'm not sure if exposing `attribute` like that would be >> a good idea (until now I was always using a static import, so >> name clashes were not a problem); I'd probably rename it to >> `__attribute__`. > > This can be done, we can deprecate attribute in favour of > __attribute__ and add new enums for shortcut paths to access the > internal attributes. > > One way could be as above, alternatively if name clashing is a > problem, then we can always prefix with GNU_ > > --- > enum GNU_forceinline = __attribute__("forceinline"); > /* etc... */ > > auto GNU_target(A...)(A args) > if(A.length > 0 && is(A[0] == string)) > { > return __attribute__("target", args); > } > --- > > Then in user code: > > --- > import gcc.attribute; > > @GNU_forceinline int foobar(); > @GNU_target("sse3") float4 mySSE3Func(); > --- No; the advantage of using these magic UDA is that they don't need to be vendor-specific. IOW one can define "inline" as the GDC magic attribute, or nothing. Maybe even some other compiler can support it. And the source code remains portable. @GNU_forceinline @LLVM_forceinline @DM_forceinline @SDC_forceinline int foobar(); // ... artur
Re: checkedint call removal
On 08/03/14 17:06, Walter Bright via Digitalmars-d wrote: > On 8/2/2014 1:06 PM, Artur Skawina via Digitalmars-d wrote: >> There's nothing wrong with `assume`, it's very useful for optimizations. >> But it's too dangerous to tack `assume` onto `assert`. If they are kept >> separate then it's at least possible to carefully audit every 'assume'. >> People *will* use them for micro-optimizations, and they *will* make >> mistakes. > > This seems contradictory. You say it is fine to have assume affect > optimization, i.e. insert bugs if the assumes are wrong, but not fine to > check at runtime that the assumes are correct? I'm saying there's nothing wrong with having an assume(-like) directive; by default `assume` should of course check the condition at RT just like `assert` does [1]. I'm against _redefining_ `assert` to mean `assume`. In practice there will always be a possibility of an assert failing at RT due to eg a different environment or inputs[2]. If programs always were perfect and absolutely bug-free, asserts and diagnostics would never be required at all... Do you really think that it is unreasonable to expect that, in a language called *D*, a very well known decades old C concept isn't redefined? That wouldn't be appropriate even for a language called "Mars"... Of course everybody initially expects that an 'assert' in D acts like the C equivalent. There isn't anything in the "spec" (ie dlang.org) that even hints at D's assert being different. ("[...] if the result is false, an AssertError is thrown. If the result is true, then no exception is thrown.[...]"). I just looked at the "contracts" page for the first time ever, and found this: "assert in function bodies works by throwing an AssertError, which can be caught and handled". That could reasonably be interpreted to mean that a failing assertion in D still leaves the program in a well defined valid (!) state. 
Most people who are already aware of `assert` will know a definition like this one: http://pubs.opengroup.org/onlinepubs/009695399/functions/assert.html "assert - insert program diagnostics [...] The assert() macro shall insert diagnostics into programs". Nobody expects that disabling diagnostics introduces undefined behavior. Hence, slightly inaccurate or not 100% up-to-date asserts are not considered a real problem. And they are not a problem. Except in D? artur [1] An `__assume` that never checks at RT would also be useful for low-level code, where failure is not an option. [2] Asserts are not for /directly/ validating inputs, yes, but inputs are often necessary to get the program to an invalid state.
Re: std.jgrandson
On 8/3/14, 10:19 AM, Sean Kelly wrote: I don't want to pay for anything I don't use. No allocations should occur within the parser and it should simply slice up the input. What to do about arrays and objects, which would naturally allocate arrays and associative arrays respectively? What about strings with backslash-encoded characters? No allocation works for tokenization, but parsing is a whole different matter. So the lowest layer should allow me to iterate across symbols in some way. Yah, that would be the tokenizer. When I've done this in the past it was SAX-style (ie. a callback per type) but with the range interface that shouldn't be necessary. The parser shouldn't decode or convert anything unless I ask it to. Most of the time I only care about specific values, and paying for conversions on everything is wasted process time. That's tricky. Once you scan for 2 specific characters you may as well scan for a couple more, the added cost is negligible. In contrast, scanning once for finding termination and then again for decoding purposes will definitely be a lot more expensive. I suggest splitting number into float and integer types. In a language like D where these are distinct internal types, it can be valuable to know this up front. Yah, that kept on sticking like a sore thumb throughout. Is there support for output? I see the makeArray and makeObject routines... Ideally, there should be a way to serialize JSON against an OutputRange with optional formatting. Not yet, and yah those should be in. Andrei
Re: std.jgrandson
On 8/3/14, 9:49 AM, Daniel Gibson wrote: On 03.08.2014 09:16, Andrei Alexandrescu wrote: We need a better json library at Facebook. I'd discussed with Sönke the possibility of taking vibe.d's json to std but he said it needs some more work. So I took std.jgrandson to proof of concept state and hence ready for destruction: http://erdani.com/d/jgrandson.d http://erdani.com/d/phobos-prerelease/std_jgrandson.html Is the name supposed to stay or just a working title? Just a working title, but of course if it were wildly successful... but then again it's not. -- Andrei
Re: std.jgrandson
I don't want to pay for anything I don't use. No allocations should occur within the parser and it should simply slice up the input. So the lowest layer should allow me to iterate across symbols in some way. When I've done this in the past it was SAX-style (ie. a callback per type) but with the range interface that shouldn't be necessary. The parser shouldn't decode or convert anything unless I ask it to. Most of the time I only care about specific values, and paying for conversions on everything is wasted process time. I suggest splitting number into float and integer types. In a language like D where these are distinct internal types, it can be valuable to know this up front. Is there support for output? I see the makeArray and makeObject routines... Ideally, there should be a way to serialize JSON against an OutputRange with optional formatting.
Re: std.jgrandson
Am 03.08.2014 09:16, schrieb Andrei Alexandrescu: We need a better json library at Facebook. I'd discussed with Sönke the possibility of taking vibe.d's json to std but he said it needs some more work. So I took std.jgrandson to proof of concept state and hence ready for destruction: http://erdani.com/d/jgrandson.d http://erdani.com/d/phobos-prerelease/std_jgrandson.html Is the name supposed to stay or just a working title? "std.j*grandson*" (being the successor of "std.j*son*") is of course a funny play on words, but it's not really obvious at first sight what it does. i.e. if someone skims the std. modules in the documentation, looking for json, he'd probably not think that this is the new json module. std.json2 or something like that would be more obvious. Cheers, Daniel
Re: checkedint call removal
On 08/03/2014 05:00 PM, Paolo Invernizzi wrote: On Sunday, 3 August 2014 at 14:10:29 UTC, Timon Gehr wrote: On 08/03/2014 03:01 PM, Paolo Invernizzi wrote: On Sunday, 3 August 2014 at 10:49:39 UTC, Timon Gehr wrote: On 08/03/2014 11:15 AM, Paolo Invernizzi wrote: because every few milliseconds an assert is triggered Right, and software does not have security holes because otherwise they would obviously be exploited every few milliseconds during in-house testing. That is a totally different matter: Well, no. Security holes are about things that the programmer is _totally missing_, The programmer(s!) may be _totally missing_ the conditions that lead to an assertion failure. In fact, unless assertions are intentionally misused, this is always the case. and someone is seeing and exploiting that. (Undefined behaviour introduced in this way may be exploitable.) If the assertion triggers, that's not undefined behaviour: It will from now on be in -release, this is what this is all about. it's a bug, already implanted in the code, that the assertion is surfacing, _avoiding_ undefined behaviour occurring from the very next line of code. ... Most bugs do not lead to undefined behaviour. E.g. you can write buggy code in a language that does not have the concept. Security holes are not related to assertions at all; they are related to the unpredictable state that the program has reached, outside of the view of the programmers. ... And what do you think a wrong assertion is the manifestation of? (Hint: The assertion talks about the program state.) Assertions are only checks that the reasoning about the flow and the conditions is going the way that was originally intended. If you have failures, But that is not the scenario. You don't turn on -release in order to disable assertions that you _know_ are failing. and you want to cope with them, you MUST remove the failing assertions from the code, and turn them into specific code to cope with the faulty condition. 
Something like 'warning mate! the commented assertion is triggered from time to time, Why do you assume they had noticed that the assertion was wrong? That was a stereotypical example; You were not in the position to introduce a stereotypical example. If Johannes says that some code will break, and you say "I don't buy this", then you cannot introduce an example to argue against, where the additional circumstances are such that you have an easy point. This is a straw man. You need to argue against the existence of the broken code. I.e. you need to argue for an (approximate) universal. If you introduce an example, you need to argue that the 'example' is actually the only (likely) way that code could be broken by this change. I.e. you'd need to argue that wrong assertions are always caught before they are part of a build where they are disabled. (You might also pursue the alternative direction of admitting that this is a possibility, but saying that it is very unlikely to happen, and then justify that opinion. ("That's true, this might happen, but I don't think this change will lead to many catastrophes, because...")) what I was trying to argue it's that also if we do some dirty tricks to keep the train on the rails, (That's still part of the made up scenario.) if the program logic is flowed [sic] Maybe it hasn't been flawed in the -release build before because the faulty assertion was the only thing that was faulty, but nobody knew that it was faulty before the released system went down. you can have an avalanche effect in some cases: Note again, 'some cases'. Those are again different cases than Johannes had in mind. everything seems to work fine for a good amount of time until the snowball comes down, violates catastrophically the program logic and boom. Again, in other cases, the -release program will operate as expected. An analogy, to be taken with a grain of salt: Planes crash sometimes, even though this is not part of their intended design. 
It would still be a bad idea to install bombs in some planes that explode when they seem about to crash according to _some_ of many simple tests of measured parameters proposed by one of (possibly) _many_ designers of the plane, especially if those tests were proposed without knowledge that this was going to be their purpose.
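To ground the disagreement in code, here is a minimal sketch (a hypothetical scenario, with the -release semantics under discussion in this thread) of how an assertion that is merely wrong today could become memory corruption if asserts are turned into optimizer assumptions:

```d
// The author believed this invariant always holds; a caller can violate it.
void store(int[] a, size_t i)
{
    assert(i < a.length, "index assumed to be in bounds");
    a[i] = 42;
}

void main()
{
    auto a = new int[](4);
    // Debug build: the assert fires and surfaces the bug immediately.
    // Current -release: the assert is skipped, but the array bounds check
    // still turns the bad write into a clean RangeError.
    // Proposed -release (assert becomes assume): the optimizer may take the
    // false fact "i < a.length" and elide the bounds check, so the write
    // lands out of bounds -- undefined behaviour, possibly surfacing far
    // from the faulty assertion, which is the avalanche effect described.
    store(a, 10);
}
```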
Re: checkedint call removal
On Sunday, 3 August 2014 at 15:16:31 UTC, Andrei Alexandrescu wrote: On 8/3/14, 8:10 AM, Walter Bright wrote: We could establish a rule for @safe that function arguments that are pointers must be pointers to valid memory, not past the end. I think that's a good stance. -- Andrei Agreed, see my other post. In fact, if I remember correctly this is not the first time that a variant of this question pops up and I think we already came to this conclusion at least once. Of course, this also entails that it must be impossible to obtain such a pointer in @safe code. Right now, there are a still a few holes in that regard (see Bugzilla), e.g. someArray[$ .. $].ptr. Cheers, David
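For reference, the `$ .. $` hole mentioned here can be written down in two lines — a sketch of the problem as described in the post, not a recommendation:

```d
// a[$ .. $] is a valid empty slice, but its .ptr is one past the end of a.
// This compiles in @safe code today, yet under the proposed rule ("pointer
// arguments must point to valid memory") handing the result to a @safe
// function would be unsound.
@safe int* pastEnd(int[] a)
{
    return a[$ .. $].ptr; // non-null, but not dereferenceable
}
```

A @safe callee that null-checks and then dereferences such a pointer, as in Andrei's earlier example, would corrupt memory without any @system code in sight.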
Re: std.jgrandson
On 8/3/14, 8:51 AM, Johannes Pfau wrote: Am Sun, 03 Aug 2014 08:34:20 -0700 schrieb Andrei Alexandrescu : On 8/3/14, 2:38 AM, Sönke Ludwig wrote: [snip] We need to address the matter of std.jgrandson competing with vibe.data.json. Clearly at a point only one proposal will have to be accepted so the other would be wasted work. [...] 4. Avoid string allocation. String allocation can be replaced with slices of the input when these two conditions are true: (a) input type is string, immutable(byte)[], or immutable(ubyte)[]; (b) there are no backslash-encoded sequences in the string, i.e. the input string and the actual string are the same. I think for the lowest level interface we could avoid allocation completely: The tokenizer could always return slices to the raw string, even if a string contains backslash-encode sequences or if the token is a number. Simply expose that as token.rawValue. Then add a function, Token.decodeString() and token.decodeNumber() to actually decode the numbers. decodeString could additionally support decoding into a buffer. That works but not e.g. for File.byLine which reuses its internal buffer. But it's a neat idea for arrays of immutable bytes. If the input is not sliceable, read the input into an internal buffer first and slice that buffer. At that point the cost of decoding becomes negligible. The main usecase for this is if you simply stream lots of data and you only want to parse very little of it and skip over most content. Then you don't need to decode the strings. Awesome. This is also true if you only write a JSON formatter: No need to decode and encode the strings. But wouldn't that still need to encode \n, \r, \t, \v? 5. Build on std.variant through and through. Again, anything that doesn't work is a usability bug in std.variant, which was designed for exactly this kind of stuff. Exposing the representation such that user code benefits of the Algebraic's primitives may be desirable. Variant uses TypeInfo internally, right? No. 
Andrei
Re: Official PPA for dmd
That's unfortunate. Anyone know why? On Sun, Aug 3, 2014 at 4:35 AM, Jordi Sayol via Digitalmars-d < digitalmars-d@puremagic.com> wrote: > El 01/08/14 21:34, Andrew Pennebaker via Digitalmars-d ha escrit: > > I'm happy to see an official .DEB for installing DMD! Could we please > host this in a PPA, to make it easier for Debian/Ubuntu users to install? > > dmd backend license is not compatible with PPA. > > -- > Jordi Sayol > -- Cheers, Andrew Pennebaker www.yellosoft.us
Re: std.jgrandson
Am Sun, 03 Aug 2014 08:34:20 -0700 schrieb Andrei Alexandrescu : > On 8/3/14, 2:38 AM, Sönke Ludwig wrote: > [snip] > > We need to address the matter of std.jgrandson competing with > vibe.data.json. Clearly at a point only one proposal will have to be > accepted so the other would be wasted work. > > [...] > > 4. Avoid string allocation. String allocation can be replaced with > slices of the input when these two conditions are true: (a) input > type is string, immutable(byte)[], or immutable(ubyte)[]; (b) there > are no backslash-encoded sequences in the string, i.e. the input > string and the actual string are the same. I think for the lowest level interface we could avoid allocation completely: The tokenizer could always return slices to the raw string, even if a string contains backslash-encode sequences or if the token is a number. Simply expose that as token.rawValue. Then add a function, Token.decodeString() and token.decodeNumber() to actually decode the numbers. decodeString could additionally support decoding into a buffer. If the input is not sliceable, read the input into an internal buffer first and slice that buffer. The main usecase for this is if you simply stream lots of data and you only want to parse very little of it and skip over most content. Then you don't need to decode the strings. This is also true if you only write a JSON formatter: No need to decode and encode the strings. > > 5. Build on std.variant through and through. Again, anything that > doesn't work is a usability bug in std.variant, which was designed > for exactly this kind of stuff. Exposing the representation such that > user code benefits of the Algebraic's primitives may be desirable. > Variant uses TypeInfo internally, right? I think as long as it uses TypeInfo it can't replace all use-cases for a standard tagged union.
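A compilable sketch of the `rawValue`/`decode*` split proposed here — the names follow the suggestion above, and the escape handling is deliberately simplified (it skips \uXXXX, for instance):

```d
import std.array : appender;
import std.conv : to;

struct Token
{
    enum Kind { str, number /* , ... */ }
    Kind kind;
    const(char)[] rawValue; // undecoded slice of the input

    // Pay for conversion only when the caller asks for it.
    double decodeNumber() const
    {
        return rawValue.to!double;
    }

    string decodeString() const
    {
        auto app = appender!string();
        for (size_t i = 0; i < rawValue.length; ++i)
        {
            if (rawValue[i] == '\\' && i + 1 < rawValue.length)
            {
                ++i;
                switch (rawValue[i])
                {
                    case 'n': app.put('\n'); break;
                    case 't': app.put('\t'); break;
                    default:  app.put(rawValue[i]); break; // \" \\ \/ ...
                }
            }
            else
                app.put(rawValue[i]);
        }
        return app.data;
    }
}
```

A streaming consumer that skips most of the document never touches the decoders at all, which is exactly the use case described above.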
Re: checkedint call removal
On Sunday, 3 August 2014 at 15:06:56 UTC, Walter Bright wrote: On 8/2/2014 1:06 PM, Artur Skawina via Digitalmars-d wrote: There's nothing wrong with `assume`, it's very useful for optimizations. But it's too dangerous to tack `assume` onto `assert`. If they are kept separate then it's at least possible to carefully audit every 'assume'. People *will* use them for micro-optimizations, and they *will* make mistakes. This seems contradictory. You say it is fine to have assume affect optimization, i.e. insert bugs if the assumes are wrong, Yes but not fine to check at runtime that the assumes are correct? No, it would be fine for assume to do runtime checks on debug. I.e. The semantics you want assert to have, would be fine for assume. But I'd also expect assume to be used much more conservatively than assert. And the naming is much clearer this way.
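A sketch of what the separation argued for here could look like as a library helper. The "unreachable" hint is compiler-specific (LDC and GDC expose intrinsics for it), so a plain assert(0) stands in for it below:

```d
// assert: a runtime check in debug builds, a plain no-op in -release.
// assume: a runtime check in debug builds, an optimizer hint in -release --
//         and therefore something to audit as carefully as @system code.
void assume(bool cond, string msg = "assumption violated")
{
    debug
    {
        assert(cond, msg);
    }
    else
    {
        if (!cond)
            assert(0); // stand-in for a compiler 'unreachable' intrinsic
    }
}
```

With two distinct names, every optimization-bearing `assume` can be grepped for and reviewed, while ordinary `assert`s keep their traditional meaning.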
Any emacs expert in the house?
We need work on this: http://stackoverflow.com/questions/25089090/emacs-d-mode-cannot-handle-backquoted-backslashes Andrei
Re: Short overview on D
On Sunday, 3 August 2014 at 13:27:40 UTC, Rikki Cattermole wrote: On 4/08/2014 12:30 a.m., Bayan Rafeh wrote: On Sunday, 3 August 2014 at 11:56:32 UTC, Rikki Cattermole wrote: On 3/08/2014 11:53 p.m., Bayan Rafeh wrote: Small question. Can anyone give me an example of when one would use a parameterized block as opposed to a parameterized class or method? mixin templates take the context for which they are mixed in. Basically:

---
mixin template FooT()
{
    void FooT()
    {
        //...
        writeln(bar);
    }
}

void myfunc()
{
    string bar;
    //...
    mixin FooT;
}
---

Templated classes/structs/unions/methods or functions are used if you want to modify code gen itself. I'm not sure if we're going to be using templates much so I'm going to include a couple of examples you gave, plus the link that Gary posted which is better than anything I could write right now. Do you think that's enough? If you're using writeln, you're using templates. Pretty much, once you start using them, you'll realize inherently how powerful they are. That's why I believe it should be more than just touched upon :) Very well then. I will try my best to do them justice. And I see what you mean with the with statement. It certainly is an intriguing concept, and I would love to see something like that. It's quite underused. It'll be quite exciting to see it being actually used in frameworks. I'll add it in there on the off chance that someone designing a framework will stumble upon this tutorial and make use of it. Anonymous: I see. C++ first it is.
Re: std.jgrandson
On Sunday, 3 August 2014 at 15:14:43 UTC, Andrei Alexandrescu wrote: 3. Use of "opDispatch" for an open set of members has been criticized for vibe.data.json before and I agree with that criticism. The only advantage is saving a few keystrokes (json.key instead of json["key"]), but I came to the conclusion that the right approach to work with JSON values in D is to always directly deserialize when/if possible anyway, which mostly makes this a moot point. Interesting. Well if experience with opDispatch is negative then it should probably not be used here, or only offered on an opt-in basis. I support this opinion. opDispatch looks cool with JSON objects when you implement it, but it results in many subtle quirks when you consider something like range traits, for example - most annoying to encounter and debug. It is not worth the gain.
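The range-traits quirk mentioned here is easy to reproduce: an open-set opDispatch makes a type appear to have *any* member, so duck-typing traits misfire. A simplified sketch (the real isInputRange performs a few more checks, but the failure mode is the same):

```d
struct Json
{
    // Open-set opDispatch: answers to every member name.
    Json opDispatch(string name)()
    {
        return this;
    }
}

// A simplified duck-typing test in the style of range traits:
enum looksLikeRange(T) = is(typeof((T t) {
    auto e = t.empty;  // opDispatch happily "provides" all of these...
    auto f = t.front;
    t.popFront();      // ...even though none of them behave like a range
}));

static assert(looksLikeRange!Json); // surprising result for a JSON value
```

Generic code that dispatches on such traits then silently takes the wrong path for JSON values, which is the kind of subtle quirk being described.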
Re: std.jgrandson
On 8/3/14, 2:38 AM, Sönke Ludwig wrote: [snip] We need to address the matter of std.jgrandson competing with vibe.data.json. Clearly at a point only one proposal will have to be accepted so the other would be wasted work. Following our email exchange I decided to work on this because (a) you mentioned more work is needed and your schedule was unclear, (b) we need this at FB sooner rather than later, (c) there were a few things I thought can be improved in vibe.data.json. I hope that taking std.jgrandson to proof spurs things into action. Would you want to merge some of std.jgrandson's deltas into a new proposal std.data.json based on vibe.data.json? Here's a few things that I consider necessary: 1. Commit to a schedule. I can't abandon stuff in wait for the perfect design that may or may not come someday. 2. Avoid UTF decoding. 3. Offer a lazy token stream as a basis for a non-lazy parser. A lazy general parser would be considerably more difficult to write and would only serve a small niche. On the other hand, a lazy tokenizer is easy to write and make efficient, and serve as a basis for user-defined specialized lazy parsers if the user wants so. 4. Avoid string allocation. String allocation can be replaced with slices of the input when these two conditions are true: (a) input type is string, immutable(byte)[], or immutable(ubyte)[]; (b) there are no backslash-encoded sequences in the string, i.e. the input string and the actual string are the same. 5. Build on std.variant through and through. Again, anything that doesn't work is a usability bug in std.variant, which was designed for exactly this kind of stuff. Exposing the representation such that user code benefits of the Algebraic's primitives may be desirable. 6. Address w0rp's issue with undefined. In fact std.Algebraic does have an uninitialized state :o). Sönke, what do you think? Andrei
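Condition (b) of point 4 amounts to a single cheap scan. A sketch of the slice-or-copy decision (`unescape` is a placeholder for the real decoder, not part of any proposal):

```d
import std.algorithm.searching : canFind;

// Given the body of a JSON string token (without the surrounding quotes),
// return a zero-copy slice of the input when no escape sequences occur,
// and a freshly decoded string otherwise.
string stringValue(string raw)
{
    return raw.canFind('\\') ? unescape(raw) : raw;
}

string unescape(string raw)
{
    // Placeholder: a real implementation handles \" \\ \n \t \uXXXX etc.
    return raw;
}
```

Since most real-world JSON strings contain no escapes, condition (b) holds in the common case, and when condition (a) also holds the allocation disappears entirely.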
scope guards
I'm trying to make better use of scope guards, but I find myself belting out try/catch statements almost everywhere. I'm rather disappointed, because scope guards are advertised as a way to eliminate try/catch junk throughout your code, and I'm just not finding that to be the practical reality. I think the core of the problem is that scope(failure) is indiscriminate, but I want to filter it for particular exceptions. The other issue is that you still need a catch() if you don't actually want the program to terminate, which implies a try... :/ One thing that may be leveraged to eliminate most catch blocks is the existing ability to return from scope guard blocks, allowing one to gracefully return from a function while unwinding, akin to a catch. The problem then is that you can't handle specific exceptions. I'm thinking this would make all the difference:

---
scope(failure, MyException e) // only executed for exceptions of type MyException
{
    writeln(e.msg);      // can refer to the exception in this failure block
    return failureValue; // and can gracefully return from the function too
}
---

That would eliminate about 80% of my try/catch blocks. The remaining ones suffer from the problem where I want to respond to exceptions NOT of a specific type, ie, clean up in the case of an unexpected/unknown exception.

---
scope(failure, ~MyException)
{
    // clean up, because some unexpected exception occurred that I don't/can't handle.
}
---

Is there already some mechanism to do this? I couldn't find anything in the docs. It seems like an obvious thing to want to do.
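For comparison, everything the proposed scope(failure, MyException e) form would do can be written today — it just drags the try/catch right back in, which is the complaint. MyException, doRiskyWork and failureValue are the post's own placeholders, defined minimally here so the sketch is self-contained:

```d
import std.stdio : writeln;

class MyException : Exception
{
    this(string msg) { super(msg); }
}

enum failureValue = -1;

int doRiskyWork()
{
    throw new MyException("something went wrong");
}

int process()
{
    // Proposed:
    //   scope(failure, MyException e) { writeln(e.msg); return failureValue; }
    // Today's equivalent:
    try
    {
        return doRiskyWork();
    }
    catch (MyException e)
    {
        writeln(e.msg);      // can refer to the exception
        return failureValue; // and gracefully return from the function
    }
}
```

The second proposed form, scope(failure, ~MyException), has no direct equivalent today short of catching Exception, testing the dynamic type with a cast, and rethrowing.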
Re: checkedint call removal
On 8/3/14, 8:10 AM, Walter Bright wrote: On 8/2/2014 1:23 PM, Andrei Alexandrescu wrote: Assume we choose that, there's still murky ground: @system void fun(int[] p) { gun(p.ptr + p.length); } @safe void gun(int* p) { if (p) *p = 42; } This passes semantic checking but is unsafe, and the unsafety is in the @safe code. Well, that's fine, we might say. The problem is this works against our stance that "inspect @system code by hand, @safe code will take care of itself". The problem is that pointers just past the end have this weird property "the pointer is okay but not for dereferencing". We could establish a rule for @safe that function arguments that are pointers must be pointers to valid memory, not past the end. I think that's a good stance. -- Andrei
Re: std.jgrandson
On 8/3/14, 2:38 AM, Sönke Ludwig wrote: A few thoughts based on my experience with vibe.data.json: 1. No decoding of strings appears to mean that "Value" also always contains encoded strings. This seems to be an error-prone leaky abstraction. For the token stream, performance should be top priority, so it's okay to not decode there, but "Value" is a high-level abstraction of a JSON value, so it should really hide all implementation details of the storage format. Nonono. I think there's a confusion. The input strings are not UTF decoded for the simple reason there's no need (all tokenization decisions are taken on the basis of ASCII characters/code units). The backslash-prefixed characters are indeed decoded. An optimization I didn't implement yet is to use slices of the input wherever possible (when the input is string, immutable(byte)[], or immutable(ubyte)[]). That will reduce allocations considerably. 2. Algebraic is a good choice for its generic handling of operations on the contained types (which isn't exposed here, though). However, a tagged union type in my experience has quite some advantages for usability. Since adding a type tag possibly affects the interface in a non-backwards-compatible way, this should be evaluated early on. There's a public opCast(Payload) that gives the end user access to the Payload inside a Value. I forgot to add documentation to it. What advantages are there to a tagged union? (FWIW: to me Algebraic and Variant are also tagged unions, just that the tags are not 0, 1, ..., n. That can be easily fixed for Algebraic by defining operations to access the index of the currently-stored type.) 2.b) I'm currently working on a generic tagged union type that also enables operations between values in a natural generic way. This has the big advantage of not having to manually define operators like in "Value", which is error prone and often limited (I've had to make many fixes and additions in this part of the code over time). 
I did notice that vibe.json has quite a repetitive implementation, so reducing it would be great. The way I see it, good work on tagged unions must be integrated within std.variant (either by modifying Variant/Algebraic or by adding new types to it). I am very strongly opposed to adding a tagged union type only for JSON purposes, which I'd consider essentially a usability bug in std.variant, the opposite of dogfooding, etc. 3. Use of "opDispatch" for an open set of members has been criticized for vibe.data.json before and I agree with that criticism. The only advantage is saving a few keystrokes (json.key instead of json["key"]), but I came to the conclusion that the right approach to work with JSON values in D is to always directly deserialize when/if possible anyway, which mostly makes this a moot point. Interesting. Well if experience with opDispatch is negative then it should probably not be used here, or only offered on an opt-in basis. This approach has a lot of advantages, e.g. reduction of allocations, performance of field access and avoiding typos when accessing fields. Especially the last point is interesting, because opDispatch-based field access gives the false impression that a static field is accessed. Good point. The decision to minimize the number of static fields within "Value" reduces the chance of accidentally accessing a static field instead of hitting opDispatch, but there are still *some* static fields/methods and any later addition of a symbol must now be considered a breaking change. Right now the idea is that the only named member is __payload. Well then there's opXxxx as well. The idea is/was to add all other functionality as free functions. 3.b) Bad interaction of UFCS and opDispatch: Functions like "remove" and "assume" certainly look like they could be used with UFCS, but opDispatch destroys that possibility. Yah, agreed. The bummer is people coming from Python won't be able to continue using the same style without opDispatch. 
4. I know the stance on this is often "The D module system has enough facilities to disambiguate" (which is not really a valid argument, but rather just the lack of a counter argument, IMO), but I highly dislike the choice to leave off any mention of "JSON" or "Json" in the global symbol names. Using the module either requires to always use a renamed import or a manual alias, or the resulting source code will always leave the reader wondering what kind of data is actually handled. Handling multiple "value" types in a single piece of code, which is not uncommon (e.g. JSON + BSON/ini value/...) would always require explicit disambiguation. I'd certainly include the "JSON" or "Json" part in the names. Good point, I agree. 5. Whatever happens, *please* let's aim for a module name of std.data.json (similar to std.digest.*), so that any data formats added later are nicely organized. All existing data format support (XML + CSV) doesn't f
Re: GDC UDA Attributes (was 'checkedint call removal')
On 2 August 2014 14:54, Artur Skawina via Digitalmars-d wrote:

> +
> +enum noinline = attribute("noinline");
> +enum inline = attribute("forceinline");
> +enum flatten = attribute("flatten");
>
> But I'm not sure if exposing `attribute` like that would be
> a good idea (until now I was always using a static import, so
> name clashes were not a problem); I'd probably rename it to
> `__attribute__`.

This can be done, we can deprecate attribute in favour of __attribute__ and add new enums for shortcut paths to access the internal attributes. One way could be as above; alternatively, if name clashing is a problem, then we can always prefix with GNU_:

---
enum GNU_forceinline = __attribute__("forceinline");
/* etc... */

auto GNU_target(A...)(A args) if (A.length > 0 && is(A[0] == string))
{
    return __attribute__("target", args);
}
---

Then in user code:

---
import gcc.attribute;

@GNU_forceinline int foobar();
@GNU_target("sse3") float4 mySSE3Func();
---

Regards Iain.