Re: Scott Meyers' DConf 2014 keynote The Last Thing D Needs
Okay. That seriously got munged. Let's try that again...

On Tue, 27 May 2014 06:42:41 -1000, Andrei Alexandrescu via Digitalmars-d-announce digitalmars-d-announce@puremagic.com wrote:

> http://www.reddit.com/r/programming/comments/26m8hy/scott_meyers_dconf_2014_keynote_the_last_thing_d/
> https://news.ycombinator.com/newest (search that page, if not found click More and search again)
> https://www.facebook.com/dlang.org/posts/855022447844771
> https://twitter.com/D_Programming/status/471330026168651777

Fortunately, for the most part, I think that we've avoided the types of inconsistencies that Scott describes for C++, but we definitely have some of our own. The ones that come to mind at the moment are:

1. The order of the dimensions of multi-dimensional static arrays is backwards in comparison to what most everyone expects.

    int[4][5][6] foo;

is the same as

    int foo[6][5][4];

and has the same dimensions as

    auto bar = new int[][][](6, 5, 4);

The reasons for it stem from the fact that the compiler reads types outward from the variable name (which is very important to understand in C because of its function pointer syntax, but not so important in D). However, once we allowed const(int)* foo; and disallowed (int)const* foo;, I think that we threw that particular bit of consistency with C/C++ out the window, and we really should have just made static array dimensions read left-to-right. Unfortunately, I don't think that we can fix that at this point, because doing so would cause silent breakage (or at minimum, breakage that would stay silent until RangeErrors were thrown at runtime).

2. We're inconsistent with dynamic array dimensions.

    auto foo = new int[5];

is the same as

    auto foo = new int[](5);

but once you get into multi-dimensional arrays, it's just confusing, because

    auto foo = new int[4][5][6];

does _not_ declare a multi-dimensional dynamic array but rather a dynamic array of length 6 whose elements are multi-dimensional static arrays of dimensions 4 and 5.
Instead, what you need to do is

    auto foo = new int[][][](4, 5, 6);

IMHO, we should have made it illegal to put dynamic array dimensions inside the brackets rather than the parens, but I don't know if we can change that. It wouldn't be silent breakage, but it _would_ break a lot of existing code - especially because so many people put the array dimension between the brackets for single-dimension dynamic arrays.

3. const, immutable, and inout on the left-hand side of a function declaration are unfortunately legal. This inevitably trips people up, because they think that the attribute applies to the return type, when it actually applies to the function itself. This was done to make the function attributes consistent, because all of the others can go on either side, but the result is that it's essentially bad practice to ever put any attribute on the left-hand side which could apply to the return type, because it looks like a bug. If we just made it illegal for those attributes to go on the left, the problem would be solved, and the result would be far less confusing and bug-prone. I think that we could make that change with minimal breakage (since it's already bad practice to put them on the left-hand side), but AFAIK, Walter is against the idea.

4. There are some cases (such as with static constructors and unittest blocks) where the attributes have to go on the left for some reason. I don't remember the reasons for it, but it's an inconsistency which definitely trips up even seasoned D programmers from time to time.

5. The fact that pure is called pure is very problematic at this point as far as explaining things to folks goes. We should probably consider renaming it to something like @noglobal, but I'm not sure that that would go over very well given the amount of breakage involved. It _does_ require a lot of explaining though.

6. The situation with ranges and strings is kind of ugly, with strings being treated as ranges of code points.
I don't know what the correct solution to this is, since treating them as ranges of code units promotes efficiency but makes code more error-prone, whereas treating them as ranges of graphemes would just cost too much. Ranges of code points are _mostly_ correct but still incorrect, and _more_ efficient than graphemes but still quite a bit less efficient than code units. So, it's kind of like it's got the best and worst of both worlds. The current situation causes inconsistencies with everything else (forcing us to use isNarrowString all over the place) and definitely requires frequent explaining, but it does prevent some classes of problems. So, I don't know. I used to be in favor of the current situation, but at this point, if we could change it, I think that I'd argue in favor of just treating strings as ranges of code units and then having wrappers for ranges of code points or graphemes. It seems like the current situation promotes either using ubyte[] (if you care about efficiency) or the new grapheme facilities in std.uni if you care about correctness.
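The dimension-ordering quirk in items 1 and 2 is easy to verify; a minimal sketch (variable names are illustrative):

```d
void main()
{
    // Item 1: static array dimensions read right-to-left in the type,
    // so int[4][5][6] is 6 elements, each of type int[4][5].
    int[4][5][6] foo;
    static assert(foo.length == 6);
    static assert(foo[0].length == 5);
    static assert(foo[0][0].length == 4);

    // Item 2: with the parens form of new, the dimensions read
    // left-to-right, outermost first.
    auto bar = new int[][][](6, 5, 4);
    assert(bar.length == 6);
    assert(bar[0].length == 5);
    assert(bar[0][0].length == 4);
}
```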
Re: Scott Meyers' DConf 2014 keynote The Last Thing D Needs
On Wed, 28 May 2014 16:07:08 -0700, Walter Bright via Digitalmars-d-announce digitalmars-d-announce@puremagic.com wrote:

> Some of the inconsistencies you mentioned and Brian mentioned in his talk are actually the result of consistencies. I know this is a bit of a difficult thing to wrap one's head around, but having something be mathematically consistent and humanly consistent are often at severe odds.

I don't disagree, but I also think that we need to be very careful when they're at odds, because rules that are inconsistent from the human's perspective tend to result in buggy code. In some cases, it's best to better educate the programmer, whereas in others, it's better to just make it consistent for the programmer - especially when being consistent with one thing means being inconsistent with another. Overall, I think that we've done a decent job of it, but there are definitely places (e.g. static array declarations) where I think we botched it.

- Jonathan M Davis
Re: Scott Meyers' DConf 2014 keynote The Last Thing D Needs
On Thu, 29 May 2014 08:23:26 +0200, Timon Gehr via Digitalmars-d-announce digitalmars-d-announce@puremagic.com wrote:

> In any case, simply reversing the order for static array types using an ad-hoc rewrite rule would be a huge wart, even more severe than the other points you raised, and we definitely wouldn't be trading one kind of consistency for another.

In every other case, array dimensions are read from left-to-right, and thanks to const(int)* foo; we already threw out the whole idea of types really being read outward from the variable name. Outside of static arrays, I don't think that we have anything that would even care if we declared that types were always read left-to-right. If we had always had static array dimensions read left-to-right in their declarations, I very much doubt that you would have much of anyone complaining about it being inconsistent. If anything, that's _more_ consistent with everything else. It's just that it doesn't fit with how C/C++ compilers read types. The only reason that I don't argue strongly for changing it is that it would break every existing program which uses multi-dimensional static arrays, and the breakage would be easy to miss at compile time. So, unfortunately, I think that we're stuck. But aside from arguing that it's how C/C++ reads types, I don't see much of an argument for why it makes any sense for static array dimensions to be read from right-to-left in declarations.

- Jonathan M Davis
Re: Scott Meyers' DConf 2014 keynote The Last Thing D Needs
On Thu, 29 May 2014 01:31:44 -0700, Ali Çehreli via Digitalmars-d-announce digitalmars-d-announce@puremagic.com wrote:

> On 05/29/2014 12:59 AM, Jonathan M Davis via Digitalmars-d-announce wrote:
>> So, unfortunately, I think that we're stuck.
>
> You make it sound like there is a problem. ;)
>
>> I don't see much of an argument for why it makes any sense for static array dimensions to be read from right-to-left in declarations.
>
> Language does not say anything about how people read declarations. Both static array dimensions and indexing are consistent currently in D. When declaring, it is always Type[length]; when indexing, it is always arr[index].

It's consistent until you have multiple dimensions. Then you end up with the dimensions being listed right-to-left for static array declarations and left-to-right in all other cases.

> Note that there is no such thing as a multi-dimensional array in C, C++, or D. Hence, there is no reading from any direction; there is a simple and consistent syntax.

??? C, C++, and D all have multi-dimensional arrays. e.g.

    int a[5][6]; // C/C++
    int[6][5] a; // D

    int** a;     // C/C++
    int[][] a;   // D

    int* a[5];   // C/C++
    int[5][] a;  // D

I don't see how you could argue that they don't have multi-dimensional arrays.

- Jonathan M Davis
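The C/D correspondence being argued over can be checked directly on the D side; a small sketch, assuming nothing beyond the declarations in the post:

```d
void main()
{
    // D analogue of C's int a[5][6]: 5 elements, each an int[6].
    int[6][5] a;
    static assert(a.length == 5);
    static assert(a[0].length == 6);

    // Jagged (array-of-arrays) form, analogous to the int[][] example;
    // structurally it behaves exactly like a two-dimensional array.
    auto b = new int[][](2, 3);
    assert(b.length == 2);
    assert(b[0].length == 3);
}
```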
Re: Scott Meyers' DConf 2014 keynote The Last Thing D Needs
On Thu, 29 May 2014 07:32:48 -0700, Ali Çehreli via Digitalmars-d-announce digitalmars-d-announce@puremagic.com wrote:

> On 05/29/2014 03:00 AM, Jonathan M Davis via Digitalmars-d-announce wrote:
>> I don't see how you could argue that they don't have multi-dimensional arrays.
>
> Their specs don't have such a thing. It is possible to have arrays where elements are arrays, but that does not make those concepts language constructs.

And how is an array of arrays _not_ a multi-dimensional array? As far as I can tell, they're exactly the same thing, just phrased differently.

- Jonathan M Davis
Re: Adam D. Ruppe's D Cookbook now available!
On Fri, 30 May 2014 11:48:56 +0000, Chris via Digitalmars-d-announce digitalmars-d-announce@puremagic.com wrote:

> On Friday, 30 May 2014 at 11:46:35 UTC, w0rp wrote:
>> I received my copy this morning, earlier than I thought I would. I shall check it out over the weekend. I suspect I'll probably know a lot of the things in the book, but I'm the type who likes to watch introductory lectures because there's always something I didn't see before.
>
> You're right, of course. There's _always_ something you can learn, even if you think you know it all.

What I find sometimes is that even if I know most things about something, I still forget things that I knew (particularly if I don't use that knowledge regularly), so reading something like this can jog your memory and thus improve your knowledge, even if you actually had known all of it at some point (though the odds are still that there are at least a few things in it that you never learned, even if you know a lot).

- Jonathan M Davis
Re: Real time captioning of D presentations
On Mon, 02 Jun 2014 10:00:17 -0700, Walter Bright via Digitalmars-d-announce digitalmars-d-announce@puremagic.com wrote:

> On 6/2/2014 8:46 AM, Iain Buclaw via Digitalmars-d-announce wrote:
>> However, what you can't do is change the accent to one that you may better understand. I know a lot of Europeans don't quite follow me sometimes. :)
>
> Captioning also helps people who aren't native English speakers.

And native English speakers as well. It's not at all infrequent that I end up temporarily turning on subtitles in a movie that I'm watching, because an actor didn't say a line clearly enough. There's no reason why a talk would be any different in that regard - especially since it only gets one take.

- Jonathan M Davis
Re: Interview at Lang.NEXT
On Wed, 04 Jun 2014 07:33:01 +0000, Joakim via Digitalmars-d-announce digitalmars-d-announce@puremagic.com wrote:

> On Wednesday, 4 June 2014 at 06:19:05 UTC, Andrei Alexandrescu wrote:
>> http://www.reddit.com/r/programming/comments/27911b/conversation_with_andrei_alexandrescu_all_things/
>
> wtf, the "Mid Quality" video is 1280x720 resolution HD video, guess they think every programmer has a super-fast internet connection. ;) The mp4 for Android/iPhone is a bandwidth-friendly 640x360 resolution.

Well, regardless of internet connection speeds, I would have considered 720p to be mid quality, but I work with video for a living and tend to be a bit of a videophile. Between work and messing with encoding settings for transcoding Blu-rays, I've pretty much been ruined. I practically can't watch DVDs anymore, and even Blu-rays frequently look pretty bad to me. But obviously, streaming high quality video over the internet can be expensive (and networks tend to behave badly even when you're just streaming a lot of video locally).

- Jonathan M Davis
Re: Interview at Lang.NEXT
On Thu, 05 Jun 2014 09:30:44 +0200, Andrei Alexandrescu via Digitalmars-d-announce digitalmars-d-announce@puremagic.com wrote:

> On 6/5/14, 7:59 AM, Nick Sabalausky wrote:
>> So let me get this straight: There are programmers out there who find the occasional type annotations on some declarations to be significantly more work than following a convention of nearly *quadrupling* the amount of code they have to write? Two to three lines of tests for every one line of real code is considered rapid development, saving developer time, just getting things done, etc? And all that's considered a style of coding? You're right, I really don't understand that style of coding at all. ;)
>>
>> Don't get me wrong, I am pretty big on unittests, but even still: If people are trying to save developer time by replacing each minor type annotation with several extra unittests (which are less reliable anyway - greater room for human error), then something's gone horribly wrong.
>>
>>> It's usually quite hard to explain such differences in coding style to people that are used to static typing.
>>
>> That doesn't surprise me. It's also very difficult to explain 2+2==5 to people who are accustomed to basic arithmetic. ;)
>
> I have to confess this echoes a few similar confusions I have about the use and advocacy of dynamically-typed languages. One argument I've heard a while back was that static type errors are "not proportional response" and that static types only detect the most trivial of bugs, so why bother at all. But then the heavy-handed approach to unittesting espoused by dynamic languages, of which arguably a good part would be automated by a static type system, seems to work against that argument.

Indeed. It just makes no sense to claim that using dynamic typing is simpler and easier when you're then forced to write a bunch of test code just to catch bugs that the compiler in a statically typed language would have caught for you anyway.

Though I confess what horrifies me the most about dynamic languages is code like this:

    if(cond) var = "hello world";
    else     var = 42;

The fact that an if statement can change the type of a variable is just atrocious IMHO. Maybe I've just spent too much of my time in statically typed languages, but I just do not understand the draw that dynamically typed languages have for some people. They seem to think that avoiding a few simple things that you have to do in your typical statically typed language is somehow a huge improvement, when it causes them so many serious problems that static languages just don't have.

- Jonathan M Davis
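For contrast, the only way to get that if-changes-the-type behavior in D is to opt into it explicitly; a minimal sketch using std.variant (variable names are illustrative):

```d
import std.variant : Variant;

void main()
{
    bool cond = true;

    // Variant explicitly opts into runtime typing, so the branches
    // may store values of different types.
    Variant var;
    if (cond) var = "hello world";
    else      var = 42;
    assert(var.type == typeid(string));

    // With an ordinary statically typed variable, the same pattern
    // simply does not compile:
    // string s = cond ? "hello world" : 42; // error: incompatible types
}
```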
Re: Chuck Allison's talk is up
On Thu, 05 Jun 2014 23:51:42 +0000, Olivier Henley via Digitalmars-d-announce digitalmars-d-announce@puremagic.com wrote:

> ... Sorry, I know it's annoying to have someone telling you guys what to do. I would rather post a sticky thread, referencing Dicebot's channel, myself, but I'm brand new here and don't have any credentials to do so.

There is no such thing as a sticky thread here. This isn't a normal web forum. It's a newsgroup with multiple front-ends: you can view it as a newsgroup, as a mailing list, or via the web forum. The result is that it has no features which aren't compatible with all three - including sticky threads.

- Jonathan M Davis
Re: DMD 2.066 Alpha
On Fri, 13 Jun 2014 12:00:39 -0700, Andrei Alexandrescu via Digitalmars-d-announce digitalmars-d-announce@puremagic.com wrote:

> On 6/13/14, 10:15 AM, Nick Sabalausky wrote:
>> On 6/13/2014 12:49 PM, Andrei Alexandrescu wrote:
>>> Being able to negate the final: label is nice to have but not a must. Adding a keyword for that doesn't scale - it would mean we'd need to add one keyword to undo each label.
>>
>> No it doesn't mean that. virtual is very well established industry-wide as the anti-final. Just because we accept that does not mean we can't still do @~pure, @nothrow(false) or whatever for the ones which don't already have well-established names.
>
> I don't see it necessary to add virtual as a keyword to D.

If we were going to go with final by default, then adding virtual would make a lot of sense IMHO - especially given that it's what people expect from other languages and the fact that virtual would then likely be used far more often than final. Without that, however, marking a function as virtual becomes a lot less critical, and it becomes a question of whether the familiarity of using virtual instead of !final or final(false) (or whatever we come up with) is worth adding another keyword and having it in addition to !final or final(false) (since presumably, we'd still need that form for generic code even with virtual). And actually, having virtual would then open the door for !virtual or virtual(false) in addition to !final or final(false), etc. So, while having virtual would be nice, it's probably complicating things too much from the user's perspective when combined with the ability to explicitly negate attributes.

- Jonathan M Davis
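For context on the `final:` discussion: once the label is in effect, there is no attribute to switch a later member function back to virtual, which is the gap that `virtual`, `!final`, or `final(false)` would fill. A minimal sketch (class and method names are illustrative):

```d
class Base
{
    int flexible() { return 1; }  // class methods are virtual by default

final:
    // Every member function after this label is non-virtual, and there
    // is no keyword to turn "virtual" back on for later members; they
    // would have to be declared before the label instead.
    int fixed() { return 2; }
}

class Derived : Base
{
    override int flexible() { return 10; }   // fine: still virtual
    // override int fixed() { return 20; }   // error: cannot override final function
}

void main()
{
    Base b = new Derived;
    assert(b.flexible() == 10);  // dispatches to Derived
    assert(b.fixed() == 2);      // final: statically bound to Base
}
```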
Re: D:YAML 0.5 (also, D:YAML 0.4.5, TinyEndian 0.1)
On Wednesday, 6 August 2014 at 17:09:50 UTC, Kiith-Sa wrote:

> D:YAML is a YAML parser and emitter for D.

Thanks a lot for working on this. I actually really hate YAML, but I'm forced to work with it sometimes, and this library saved me from having to write a parser for it myself.

- Jonathan M Davis
Re: DMD v2.066.0-rc1
On Thursday, 7 August 2014 at 17:05:29 UTC, Manu via Digitalmars-d-announce wrote:

> I've never encountered anybody try and use MSC from the command line in about 15 years professionally.

LOL. That's almost always how I use VS when I'm forced to use it at work. As soon as I figured out that I could build from the command line, I stopped opening VS unless I had to in order to run the debugger. But I'm not even vaguely a typical Windows developer. I'm pretty hardcore Linux, all things considered.

- Jonathan M Davis
Re: DMD v2.066.0-rc1
On Monday, 11 August 2014 at 16:29:10 UTC, Nick Sabalausky wrote:

> On 8/9/2014 10:57 AM, Dicebot wrote:
>> actually avoided learning anything out of the default comfort zone and called that _professional attitude_.
>
> People have some truly bizarre ideas about what constitutes professionalism. At a previous job I had, at one particular developer's meeting with one of the brass (it was a weekly meeting that primarily served to make this particular manager/co-owner feel like she was being useful - not that she ever was - by sticking her fingers where they didn't belong), by pure chance all the developers happened to be wearing shirts with collars. The manager made a big point about how happy she was to see that, because (paraphrasing here) "shirt collars are professional". Yea, forget competence, skill, ability, work ethic, demeanor... no, apparently professionalism involves... shirt collars. Idiot.
>
> That's not the only example of clothing-based naivety I've seen among people who *should* know better: It's truly disturbing how many businesspeople can be trivially fooled into thinking any old random con artist is a trustworthy professional, simply by the con artist walking into any dept store and buying a suit to wear. "Oh, I see he's wearing a suit. That means he must be very professional!" People are morons.

The sad reality is that your physical appearance - including your clothing - can have a big impact on how people perceive you, so in many situations, wearing nicer clothing can have a definite effect. This is particularly true when dealing with stuff like sales, where you're constantly having to deal with new people. That's not to say that clothing makes the man, but impressions like that can matter, even if it seems like they shouldn't. So, it makes a lot of sense for some folks to wear nicer - or "professional" - clothes as part of their job. However, for engineers, it's ridiculous. We shouldn't normally be interacting with anyone where it would matter. So, attire like t-shirts and jeans should be fine. Our clothing should have little impact on our job, and in most cases, if an engineering manager is pushing for that sort of thing, I think that it's a very bad sign.

- Jonathan M Davis
Re: DMD v2.066.0-rc1
On Thursday, 14 August 2014 at 19:14:32 UTC, Nick Sabalausky wrote:

> On 8/7/2014 1:05 PM, Manu via Digitalmars-d-announce wrote:
>> That's what I mean about this culture; it's the opposite of linux, and it outright rejects practises that are linux-like.
>
> While I don't doubt that's true of a lot of people in the industry, I have to question how much stubbornly clinging to ignorance can really count as a "culture". I'm tempted to claim that isn't culture at all, it's just pandemic pigheaded ignorance.

Somehow, I doubt that anyone claims that you pull your punches or that you don't speak your mind... :)

- Jonathan M Davis
Re: D 2.066 is out. Enjoy!
On Tuesday, 19 August 2014 at 00:23:22 UTC, Nick Sabalausky wrote:

> On 8/18/2014 7:14 PM, Dicebot wrote:
>> I also propose to start 2.067 beta branch right now and declare it yet another bug-fixing release.
>
> Seconded.

Regardless of whether we start another release that quickly or not, I think that we really need to figure out how to do regression-free releases more along the lines of 2 months apart. And if we're getting a lot of regressions, maybe we should operate more like the Linux kernel, which has a merge window of something like a week after a release before they start turning that into the next release. In our case, that would mean continuing to merge changes into master all the time, but creating a new branch and starting to stabilize it within a week or two of actually completing the previous release. Certainly, I don't think that we should wait more than a month before branching: if we took a month, that would leave a month to get all of the regressions ironed out and still have a 2-month release cycle, and with how things have been going, I'm not sure that we'd even manage that in a month.

- Jonathan M Davis
Re: D 2.066 is out. Enjoy!
On Tuesday, 19 August 2014 at 04:26:48 UTC, Andrei Alexandrescu wrote:

> Well that's what happened - someone started 2.067. What's the advantage of doing this? Now we need to worry about master and 2.067 instead of just master. -- Andrei

Well, what you do at that point is just fix all of the regressions on the branch, and when it's ready, you do another release. You don't put anything else on it. All of the normal dev work goes on master. And at some point after the branch has been released as the next release, you branch again.

Now, unless we have enough regressions on master that it's going to take us over a month to fix them, I think that branching right after releasing is a bit much, though if some of the regressions are bad enough, maybe it would make sense to release faster. And given how long we've been trying to get 2.066 ready after branching it, and how much work has been done on master since then, maybe it makes sense. I don't know.

I would have thought, though, that we'd aim to branch something like 2 to 4 weeks after releasing and then take about a month to make sure that all regressions are fixed, so that we get a release about every two months. All the major dev work just continues on master, and it ends up on a branch about every two months, staggered from when that branch gets released as an official release. Certainly, aiming for something along those lines would get us faster releases than we've been managing. We've been waiting way too long to branch and then been rather slow about getting through all of the regressions. By branching earlier, we should be able to release more quickly.

- Jonathan M Davis
Re: Fix #2529: explicit protection package #3651
On Tuesday, 19 August 2014 at 17:11:19 UTC, Dicebot wrote:

> On Tuesday, 19 August 2014 at 17:08:25 UTC, Walter Bright wrote:
>> On 8/19/2014 7:01 AM, Dicebot wrote:
>>> Walter, now that the release is out, can you please state your opinion about https://github.com/D-Programming-Language/dmd/pull/3651 ? It is blocking the Phobos module split and decoupling.
>>
>> I keep thinking there's gotta be a way to do this without language changes.
>
> Any specific ideas? I can't imagine any clean solution - and the proposed language extension fits naturally into the existing system without introducing any new concepts. It is also somewhat frequently asked about in the NG.

Yeah, I don't see how this could be done without a language change. Currently, modules in sub-packages are treated no differently from modules in completely different packages, so anything you did to give access to a module in a super-package to one in a sub-package would give access to any module.

- Jonathan M Davis
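For reference, the extension proposed in that pull request lets `package` name an ancestor package explicitly, so a symbol in a sub-package can be shared across the super-package without becoming public; a sketch of the proposed syntax (the module and function names here are hypothetical):

```d
module std.internal.helpers;  // hypothetical module in a sub-package

// Plain `package` visibility only reaches other modules in std.internal.
package void localHelper() {}

// The proposed package(<ancestor>) form widens access to every module
// under std, without exposing the symbol to unrelated packages.
package(std) void sharedHelper() {}
```

This is the kind of cross-package access the Phobos module split needs, since a split-out std.internal module otherwise has no way to stay non-public while serving the rest of std.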
Re: D 2.066 is out. Enjoy!
On Thursday, 21 August 2014 at 15:20:49 UTC, Daniel Murphy wrote:

> "Jacob Carlborg" wrote in message news:lt50m0$20f0$1...@digitalmars.com...
>>> Support for C++ templates was in the last release, and the new pull request is only for special mangling of some stl declarations.
>>
>> You see, I get confused of all the syntax changes ;)
>
> Don't worry, so did Walter.

LOL. Yeah, well, it would be nice if we could get an actual list of the C++ features that D currently supports somewhere (and how to use them if it's not obvious). You've been doing so much great work on that that I have no clue what the current state of things is. For instance, this is the first I've heard of anything about template support; I'd thought that we were never going to support templates. Is it just for name mangling, or for actually compiling them?

- Jonathan M Davis
Re: D 2.066 is out. Enjoy!
On Tuesday, 19 August 2014 at 08:14:41 UTC, novice2 wrote:

> http://dlang.org/changelog.html
> Version D 2.066 August 18, 2014
> ...
> Phobos enhancements
> 1. Bugzilla 3780: getopt improvements by Igor Lesik
>
> Sorry, I can't find these improvements in getopt.d or in http://dlang.org/phobos/std_getopt.html. Was this announced prematurely, so that these changes will be seen in 2.067?

I suspect that the changelog was done by dates rather than based on what was actually merged. Someone else commented that some of the items in there are going to be in 2.067 and not 2.066, and 2.066 took long enough after it was branched that it would be easy to accidentally list 2.067 items for 2.066 if you were looking at merge dates rather than what actually went onto the 2.066 branch.

- Jonathan M Davis
Re: D 2.066 is out. Enjoy!
On Thursday, 21 August 2014 at 20:33:56 UTC, Walter Bright wrote: On 8/21/2014 11:54 AM, Jonathan M Davis wrote: LOL. Yeah, well, it would be ni going to support C+ce if we could get an actual list of the C++ features that D currently supports somewhere (and how to use them if it's not obvious). You've been doing so much great work on that that I have no clue what the current state of things is. For instance, this is the first I've heard of anything about template support; I'd thought that we were never going to support templates. Is it just for name mangling or for actually compiling them? The thing is, while the code was there, there wasn't a single test case for it in the test suite. Furthermore, at least for Elf, there was no support for the special mangling done for ::std:: stuff. The thing is, modern C++ practice makes heavy use of std types. Having an interface to C++ code is fairly unusable unless D can also interface to std::string, std::vector, and a few others. The first step is to support the mangling of them. Then, try to construct a workalike on the D side that follows D rules, and yet is able to seamlessly interact with the corresponding C++ code. We'll see how far we can get with that, and then evaluate what to do next. There are no plans for actually compiling C++ code with a D compiler. The plan is for support like we do for C - have a .d header file for it. Well, I wouldn't have expected us to be compiling C++ per se, but previously, it seemed like the party line was that we wouldn't be supporting C++ templates at all because of how hard they were and because we don't want a C++ compiler in the D compiler. I'm certainly all for anything we can do for C++ compatability without going off the deep end. I just don't hear much about what we're actually doing right now. So, I really have no idea what the current status of that is. 
With what was said at dconf and comments like these, it seems like we're making huge progress in comparison to where we were, and as far as I can tell, about the only way to hear about it is to either pay a lot of attention to dmd pulls or to see an occasional comment from Daniel talking about it or from someone who's paying close attention to what he's up to. So, at some point in the near future, it would be nice if there were somewhere that actually said what D can do with C++ now, even if that doesn't include everything that's going to be coming or if much of it is marked as experimental and relatively untested. - Jonathan M Davis
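For readers unfamiliar with the interfacing model Walter describes (a .d "header" file, like for C, rather than compiling C++), here is a minimal sketch of what that looks like on the D side. The Counter class and createCounter function are hypothetical, and this shows only basic mangling/vtable interop, not the std::string/std::vector work discussed above:

```d
// C++ side (built with the system C++ compiler), shown as a comment:
//   class Counter {
//   public:
//       virtual int increment();
//   };
//   Counter* createCounter();

// D side: declare the C++ class as an extern(C++) interface so that the
// D compiler emits matching name mangling and uses the C++ vtable
// layout. The implementation stays in the C++ object file.
extern(C++) interface Counter
{
    int increment();
}

// A factory function implemented in C++, declared here for linking:
extern(C++) Counter createCounter();

void useIt()
{
    Counter c = createCounter();
    int n = c.increment(); // direct call into the C++ code, no wrapper layer
}
```

The point of mangling support is exactly this: the D declarations link directly against the unmodified C++ object code.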
Re: core.stdcpp
On Tue, 26 Aug 2014 07:00:26 + Ola Fosheim Grøstad via Digitalmars-d-announce digitalmars-d-announce@puremagic.com wrote: On Tuesday, 26 August 2014 at 06:35:18 UTC, Walter Bright wrote: The implementation of it, however, is going to be ugly and very specific to each C++ compiler. The user shouldn't need to have to see that ugliness, though. Sounds easier to write your own ::std:: on the C++ side... Quite possibly, but then it wouldn't integrate with existing C++ libraries built with the system's C++ compiler, which would be the point. - Jonathan M Davis
Re: Blog post on hidden treasure in the D standard library.
On Sat, 30 Aug 2014 10:44:18 + safety0ff via Digitalmars-d-announce digitalmars-d-announce@puremagic.com wrote: On Saturday, 30 August 2014 at 07:59:16 UTC, Gary Willoughby wrote: Stop being such a grammar nazi. I didn't bring it up because I felt like being pedantic, I brought it up as a suggestion to make it more pleasant to read. Since you've already been labelled as a pedant, perhaps you should learn the difference between pedantry and Nazism. Can we please stop arguing this (and that goes for both sides of this). This discussion is not in the least bit productive. There is no question that failing to capitalize the letter i when it's used as a pronoun is bad English, and there's no way that anyone fluent enough in English to write an article like this is not going to know that. So, clearly, he's doing it on purpose. I agree that bad grammar - especially when it's so blatant like this - detracts from what's being written, and pointing out that it was a problem was fine, but continuing to argue about it serves no purpose. Clearly, he knows that what he's doing is bad grammar, but he's doing it anyway for whatever his personal reasons are. Arguing about it like this isn't helping. It's just increasing the level of contention around here. - Jonathan M Davis
Re:
On Monday, April 14, 2014 20:47:06 Brad Roberts via Digitalmars-d wrote: Another flurry of bounces floated through today (which I handled by removing the suspensions, again). The only practical choice is a fairly intrusive one. I've enabled the from_is_list option, meaning that the 'from' address from mail originating through the list will be different. I have no idea how well or badly this will work out, but it's that or stop the mail/news gateway which is an even worse option. I've set the mailman from_is_list option to 'wrap_message'. This will likely be the first message through the list with that option set, so we'll see how it works out. I've done this for only the digitalmars.d list so far, but if it works well enough, I'll make the same change to every list. If any of you work at yahoo, would you please visit whatever team is responsible for deciding to cause this world of pain and thank them for me? Yikes. This is making it much harder to see who each message comes from, what with via Digitalmars-d tacked onto the end of everyone's name. Bleh. The guys at Yahoo are definitely making life harder for the rest of us. - Jonathan M Davis
Re: D UFCS anti-pattern
On Thu, 24 Apr 2014 22:21:32 -0400 Steven Schveighoffer via Digitalmars-d digitalmars-d@puremagic.com wrote: Recently, I observed a conversation happening on the github pull request system. In phobos, we have the notion of output ranges. One is allowed to output to an output range by calling the function 'put'. Here is the implementation of put:

void put(R, E)(ref R r, E e)
{
    static if (is(PointerTarget!R == struct))
        enum usingPut = hasMember!(PointerTarget!R, "put");
    else
        enum usingPut = hasMember!(R, "put");
    enum usingFront = !usingPut && isInputRange!R;
    enum usingCall = !usingPut && !usingFront;

    static if (usingPut && is(typeof(r.put(e))))
    {
        r.put(e);
    }
    else static if (usingPut && is(typeof(r.put((E[]).init))))
    {
        r.put((&e)[0..1]);
    }
    else static if (usingFront && is(typeof(r.front = e, r.popFront())))
    {
        r.front = e;
        r.popFront();
    }
    else static if ((usingPut || usingFront) && isInputRange!E && is(typeof(put(r, e.front))))
    {
        for (; !e.empty; e.popFront()) put(r, e.front);
    }
    else static if (usingCall && is(typeof(r(e))))
    {
        r(e);
    }
    else static if (usingCall && is(typeof(r((E[]).init))))
    {
        r((&e)[0..1]);
    }
    else
    {
        static assert(false, "Cannot put a " ~ E.stringof ~ " into a " ~ R.stringof);
    }
}

There is an interesting issue here -- put can basically be overridden by a member function of the output range, also named put. I will note that this function was designed and written before UFCS came into existence. So most of the machinery here is designed to detect whether a 'put' member function exists. One nice thing about UFCS: now any range that has a writable front() can put any other range whose elements can be put into front, via the pseudo-method put. In other words:

void foo(int[] arr)
{
    int[] result = new int[arr.length];
    result.put(arr); // put arr into result.
}

But there is an issue with this. If the destination range actually implements the put member function, but doesn't implement all of the global function's niceties, r.put(...) is not as powerful/useful as put(r,...). 
Therefore, the odd recommendation is to *always* call put(r,...). I find this, at the very least, to be confusing. Ironically, here is a case where a call that so obviously should work via UFCS is not usable as UFCS. The anti-pattern here is using member functions to override or specialize UFCS behavior. In this case, we even hook the UFCS call with the member function, encouraging the name conflict! As a possible solution, I would recommend simply changing the name of the hook and having the UFCS function forward to the hook. This way, calling put(r,...) and r.put(...) is always consistent. Does this make sense? Anyone have any other possible solutions? A relevant bug report (where I actually advocate for adding more of this horrible behavior): https://issues.dlang.org/show_bug.cgi?id=12583 If it doesn't work to override a free function with a member function, I honestly don't see much point to UFCS. The whole idea behind it is to make it so that you don't have to care whether a function is a free function or a member function. The current situation essentially forces you to not use UFCS except in cases where you're trying to add member functions to built-in types. And as such, calling functions on user-defined types using UFCS runs a high risk of not compiling, because all it takes is for the user-defined type to define a function with the same name - even if it takes completely different arguments - and now the compiler won't even try to use the free function anymore. I really think that we should fix it so that stuff like outputRange.put(foo) works - including when types define put themselves. 
AFAIK, that means changing the overload rules so that member functions conflict with free functions only when they take the same arguments - in which case the member function would be called, as it is now, except that the cases where a free function matches the arguments would also work, allowing us to override free functions with member functions where appropriate and prevent simple name collisions from making UFCS not work (i.e. when the member function takes completely different arguments, UFCS would still use the free function). Without a change along those lines, I'd be strongly inclined to argue against using UFCS in any situation except in those where you need to add member functions to the built-in types. And the only common case for that that I'm aware of is making it so that arrays can function as ranges. But this issue goes far beyond put. - Jonathan M Davis
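A minimal, self-contained sketch of the collision being described (Sink and the free put here are made-up stand-ins, not the Phobos definitions): once a type has any member named put, the compiler never falls back to the free function, even for argument lists only the free function accepts:

```d
// Free function, playing the role of the range primitive:
void put(R)(ref R r, string s)
{
    r.data ~= s;
}

struct Sink
{
    string data;
    // Member with the same name but a different signature:
    void put(char c) { data ~= c; }
}

void main()
{
    Sink s;
    put(s, "hi");   // OK: ordinary call to the free function
    s.put('x');     // OK: member function
    // s.put("hi"); // Error: the member 'put' is found first, so UFCS is
                    // never attempted, even though only the free function
                    // can accept a string.
    assert(s.data == "hix");
}
```

This is exactly the lookup rule the proposed overload-rule change would relax: the commented-out line would fall back to the free function instead of failing to compile.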
Re: Issue 9148
On Sat, 26 Apr 2014 00:44:13 -0400 Steven Schveighoffer via Digitalmars-d digitalmars-d@puremagic.com wrote: On Fri, 25 Apr 2014 23:26:29 -0400, Xinok xi...@live.com wrote: On Saturday, 26 April 2014 at 01:57:06 UTC, bearophile wrote: This is one of the largest problems left in the implementation of D purity: https://issues.dlang.org/show_bug.cgi?id=9148 One example of the refused code: void foo(const int[] a) { int bar() pure { return a[0]; } } void main() {} Bye, bearophile I think this would break the D conventions of purity. 'a' is not immutable, only const, meaning one or more elements could change, making 'bar' impure. It should compile. Purity in D is not the same as the more traditional definition of purity. For example, this compiles: int foo(int[] a) pure {return a[0];} I see no difference from the above. It's the same, and it isn't. Your example has no access to anything other than a, whereas in bearophile's example, bar has access to its outer scope. So, you can have fun like auto baz = new int[](5); void foo(const int[] a) { int bar() pure { return a[0]; } auto i = bar(); baz[0] = 2; auto j = bar(); } void main() { foo(baz); } Now, if we treat the outer scope like a mutable this pointer (which it probably essentially is underneath the hood), then bar can probably be weakly pure, but we have to be very careful not to consider it strongly pure in spite of the fact that it has no explicit, mutable arguments. Contrast that with int foo(const int[] a) pure { return a[0]; } which is the same as your example save for the const, and that function _can_ be strongly pure if it's passed an immutable array (dmd doesn't currently take strong purity that far - it only treats immutable parameters as strongly pure, not immutable arguments to const parameters - but there's no reason why it couldn't). Your example can't be strongly pure either way though, since its parameter is mutable. 
In any case, I think that bearophile's example is subtly different from yours, though I think that we can make it weakly pure if we're very careful about how we handle the outer scope. However, I'm not sure that treating it as weakly pure buys us anything except in the case where we're trying to make the outer function pure as well. - Jonathan M Davis
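To make the weak/strong distinction above concrete, here is a small sketch (the function names are illustrative; the behavior matches dmd's rules as described in the thread, with the nested case being the one issue 9148 asks to allow):

```d
// Weakly pure: may be called from pure code, but its mutable parameter
// means two calls cannot be assumed to return the same value.
int weak(int[] a) pure { return a[0]; }

// Strongly pure: with an immutable argument, repeated calls with the
// same array could legally be merged by the compiler.
int strong(immutable(int)[] a) pure { return a[0]; }

void outer(const int[] a)
{
    // bearophile's case: bar reads the enclosing scope through what is
    // effectively a mutable context pointer, so it could at most be
    // treated as weakly pure. At the time of this thread, dmd rejected
    // the annotation here:
    // int bar() pure { return a[0]; }  // refused: issue 9148
    int bar() { return a[0]; }          // compiles without the annotation
    auto i = bar();
}

void main()
{
    assert(weak([1, 2]) == 1);
    assert(strong([3, 4]) == 3);
    outer([5]);
}
```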
Re: DIP61: Add namespaces to D
On Sun, 27 Apr 2014 23:49:41 -0700 Walter Bright via Digitalmars-d digitalmars-d@puremagic.com wrote: On 4/27/2014 11:17 PM, Jacob Carlborg wrote: On 27/04/14 21:39, Walter Bright wrote: std.datetime is a giant kitchen sink. This is not the best way to organize things. Using smaller modules under packages is a much better way. It's taken an amazingly long time for the core developers to realize this. I'm glad it's happened though :) We've known it for a long time, but nobody has done anything about it. For example, the new package.d feature was specifically designed so that long modules can be broken up without breaking user code. Nobody has yet modified any Phobos code to actually do this. It's my fault as far as std.datetime goes. I had it mostly done last summer but then didn't have time to finish it, and enough has changed since then that I'm going to have to start over. And life has been quite hectic for me, making it so that I'm not getting to stuff like this as soon as I'd like. I hope to get back to it soon though. It's long past time that it get done. - Jonathan M Davis
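A sketch of how the package.d feature enables such a split without breaking user code; the submodule names below are illustrative, since the actual std.datetime split had not happened yet at the time of this post:

```d
// std/datetime/package.d
// Because this file is named package.d, 'import std.datetime;' resolves
// to it, so code written against the old monolithic module keeps
// compiling unchanged while the implementation lives in submodules.
module std.datetime;

public import std.datetime.date;     // e.g. Date, DateTime
public import std.datetime.systime;  // e.g. SysTime, Clock
public import std.datetime.timezone; // e.g. TimeZone, LocalTime
```

Users who want faster compiles can then import only the submodule they need, while `import std.datetime;` continues to pull in everything.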
Re: DIP61: Add namespaces to D
On Mon, 28 Apr 2014 10:45:40 +0200 Andrej Mitrovic via Digitalmars-d digitalmars-d@puremagic.com wrote: On 4/28/14, Jonathan M Davis via Digitalmars-d digitalmars-d@puremagic.com wrote: It's my fault as far as std.datetime goes. I had it mostly done last summer but then didn't have time to finish it, and enough has changed since then that I'm going to have to start over. And life has been quite hectic for me, making it so that I'm not getting to stuff like this as soon as I'd like. I hope to get back to it soon though. It's long past time that it get done. Hey if you're out of time, let us know. Maybe give us just a small guide on where to move things around, and we'll take it from there and split it up into packages. I think that I can get to it soon (though unfortunately, I've thought that for a while now and still haven't reached that point). My current plan is to make a number of smaller pull requests to clean it up a bit first. I started with https://github.com/D-Programming-Language/phobos/pull/2088 but it's blocked by a compiler bug. And since it could be a while before that's fixed, I should probably just do some other pull requests to do more of the cleanup and deal with the merge conflicts that it causes. At least Andrei already removed _assertPred, so I don't have to deal with that (and that made splitting std.datetime more of a pain from what I recall - particularly since I was trying to remove it at the same time, which wasn't a good idea). - Jonathan M Davis
Re: DIP61: Add namespaces to D
On Mon, 28 Apr 2014 17:19:16 +0200 Andrej Mitrovic via Digitalmars-d digitalmars-d@puremagic.com wrote: On 4/28/14, Dicebot via Digitalmars-d digitalmars-d@puremagic.com wrote: Yeah, it is just a random idea I have just had. I'm afraid you're 7 years too late for that patent. :P https://issues.dlang.org/show_bug.cgi?id=1297 I find it rather funny that bugzilla still labels it as new. :) But that's bugzilla for you. - Jonathan M Davis
Re: Parallel execution of unittests
On Wed, 30 Apr 2014 08:59:42 -0700 Andrei Alexandrescu via Digitalmars-d digitalmars-d@puremagic.com wrote: On 4/30/14, 8:54 AM, bearophile wrote: Andrei Alexandrescu: A coworker mentioned the idea that unittests could be run in parallel In D we have strong purity to make more safe to run code in parallel: pure unittest {} This doesn't follow. All unittests should be executable concurrently. -- Andrei In general, I agree. In reality, there are times when having state across unit tests makes sense - especially when there's expensive setup required for the tests. While it's not something that I generally like to do, I know that we have instances of that where I work. Also, if the unit tests have to deal with shared resources, they may very well be theoretically independent but would run afoul of each other if run at the same time - a prime example of this would be std.file, which has to operate on the file system. I fully expect that if std.file's unit tests were run in parallel, they would break. Unit tests involving sockets would be another type of test which would be at high risk of breaking, depending on what sockets they need. Honestly, the idea of running unit tests in parallel makes me very nervous. In general, across modules, I'd expect it to work, but there will be occasional cases where it will break. Across the unittest blocks in a single module, I'd be _very_ worried about breakage. There is nothing whatsoever in the language which guarantees that running them in parallel will work or even makes sense. All that protects us is the convention that unit tests are usually independent of each other, and in my experience, it's common enough that they're not independent that I think that blindly enabling parallelization of unit tests across a single module is definitely a bad idea. - Jonathan M Davis
Re: Parallel execution of unittests
On Wed, 30 Apr 2014 17:58:34 + Atila Neves via Digitalmars-d digitalmars-d@puremagic.com wrote: Unit tests though, by definition (and I'm aware there are more than one) have to be independent. Have to not touch the filesystem, or the network. Only CPU and RAM. I disagree with this. A unit test is a test that tests a single piece of functionality - generally a function - and there are functions which have to access the file system or network. And those tests are done in unittest blocks just like any other unit test. I would very much consider std.file's tests to be unit tests. But even if you don't want to call them unit tests, because they access the file system, the reality of the matter is that tests like them are going to be run in unittest blocks, and we have to take that into account when we decide how we want unittest blocks to be run (e.g. whether they're parallelizable or not). - Jonathan M Davis
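An example of the kind of test in question: a perfectly ordinary unittest block for a single function that nevertheless must touch the file system (slurp is a made-up function; the file operations are std.file/std.path):

```d
import std.file : readText, remove, tempDir, write;
import std.path : buildPath;

// The function under test: trivially thin over I/O, so there is no way
// to unit test it meaningfully without touching the file system.
string slurp(string path)
{
    return readText(path);
}

unittest
{
    // A single-function unit test in every practical sense - but two
    // tests using the same fixed name under tempDir() would race if
    // unittest blocks were executed in parallel.
    auto f = buildPath(tempDir(), "slurp_test.txt");
    write(f, "hello");
    scope(exit) remove(f);
    assert(slurp(f) == "hello");
}
```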
Re: Parallel execution of unittests
On Wed, 30 Apr 2014 13:26:40 -0700 Andrei Alexandrescu via Digitalmars-d digitalmars-d@puremagic.com wrote: On 4/30/14, 10:50 AM, Jonathan M Davis via Digitalmars-d wrote: There is nothing whatsoever in the language which guarantees that running them in parallel will work or even makes sense. Default thread-local globals? -- Andrei Sure, that helps, but it's trivial to write a unittest block which depends on a previous unittest block, and as soon as a unittest block uses an external resource such as a socket or file, then even if a unittest block doesn't directly depend on the end state of a previous unittest block, it still depends on external state which could be affected by other unittest blocks. So, ultimately, the language really doesn't ensure that running a unittest block can be parallelized. If it's pure as bearophile suggested, then it can be done, but as long as a unittest block is impure, then it can rely on global state - even inadvertently - (be it state directly in the program or state outside the program) and therefore not work when parallelized. So, I suppose that you could parallelize unittest blocks if they were marked as pure (though I'm not sure if that's currently a legal thing to do), but impure unittest blocks aren't guaranteed to be parallelizable. I'm all for making it possible to parallelize unittest block execution, but as it stands, doing so automatically would be a bad idea. We could make it so that a unittest block could be marked as parallelizable, or we could even move towards making parallelizable the default and require that a unittest block be marked as unparallelizable, but we'd have to be very careful with that, as it will break code if we're not careful about how we handle the transition. I'm inclined to think that marking unittest blocks as pure to parallelize them is a good idea, because then the unittest blocks that are guaranteed to be parallelizable are run in parallel, whereas those that aren't wouldn't be. 
The primary downside would be the cases where the programmer knew that a test could safely be parallelized but it wasn't pure, since those unittest blocks wouldn't be run in parallel. - Jonathan M Davis
Re: Parallel execution of unittests
On Wed, 30 Apr 2014 18:53:22 + monarch_dodra via Digitalmars-d digitalmars-d@puremagic.com wrote: On Wednesday, 30 April 2014 at 15:54:42 UTC, bearophile wrote: We've resisted named unittests but I think there's enough evidence to make the change. Yes, the optional name for unittests is an improvement: unittest {} unittest foo {} I am very glad your coworker find such usability problems :-) If we do name the unittests, then can we name them with strings? No need to pollute the namespace with ugly symbols. Also: // unittest Sort: Non-Lvalue RA range { ... } // vs // unittest SortNonLvalueRARange { ... } // It would be simple enough to avoid polluting the namespace. IIRC, right now, the unittest blocks get named after the line number that they're on. All we'd have to do is change it so that their name included the name given by the programmer rather than being based on the line number. e.g. unittest(testFoo) { } results in a function called something like unittest_testFoo. - Jonathan M Davis
Re: Parallel execution of unittests
On Wed, 30 Apr 2014 21:09:14 +0100 Russel Winder via Digitalmars-d digitalmars-d@puremagic.com wrote: On Wed, 2014-04-30 at 11:19 -0700, Jonathan M Davis via Digitalmars-d wrote: unittest blocks just like any other unit test. I would very much consider std.file's tests to be unit tests. But even if you don't want to call them unit tests, because they access the file system, the reality of the matter is that tests like them are going to be run in unittest blocks, and we have to take that into account when we decide how we want unittest blocks to be run (e.g. whether they're parallelizable or not). In which case D is wrong to allow them in the unittest blocks and should introduce a new way of handling these tests. And even then all tests can and should be parallelized. If they cannot be then there is an inappropriate dependency. Why? Because Andrei suddenly proposed that we parallelize unittest blocks? If I want to test a function, I'm going to put a unittest block after it to test it. If that means accessing I/O, then it means accessing I/O. If that means messing with mutable, global variables, then that means messing with mutable, global variables. Why should I have to put the tests elsewhere or make it so that they don't run when the -unittest flag is used just because they don't fall under your definition of unit test? There is nothing in the language which has ever mandated that unittest blocks be parallelizable or that they be pure (which is essentially what you're saying all unittest blocks should be). And restricting unittest blocks so that they have to be pure (be it conceptually pure or actually pure) would be a _loss_ of functionality. Sure, let's make it possible to parallelize unittest blocks where appropriate, but I contest the idea that we should start requiring that unittest blocks be pure (which is what a function has to be in order to be parallelized, whether it's actually marked as pure or not). 
That would force us to come up with some other testing mechanism to run those tests when there is no need to do so (and I would argue that there is no compelling reason to do so other than ideology with regards to what is truly a unit test). On the whole, I think that unittest blocks work very well as they are. If we want to expand on their features, then great, but let's do so without adding new restrictions to them. - Jonathan M Davis
Re: Parallel execution of unittests
On Wed, 30 Apr 2014 15:33:17 -0700 H. S. Teoh via Digitalmars-d digitalmars-d@puremagic.com wrote: On Wed, Apr 30, 2014 at 02:48:38PM -0700, Jonathan M Davis via Digitalmars-d wrote: On Wed, 30 Apr 2014 21:09:14 +0100 Russel Winder via Digitalmars-d digitalmars-d@puremagic.com wrote: [...] In which case D is wrong to allow them in the unittest blocks and should introduce a new way of handling these tests. And even then all tests can and should be parallelized. If they cannot be then there is an inappropriate dependency. Why? Because Andrei suddenly proposed that we parallelize unittest blocks? If I want to test a function, I'm going to put a unittest block after it to test it. If that means accessing I/O, then it means accessing I/O. If that means messing with mutable, global variables, then that means messing with mutable, global variables. Why should I have to put the tests elsewhere or make it so that they don't run when the -unittest flag is used just because they don't fall under your definition of unit test? [...] What about allowing pure marking on unittests, and those unittests that are marked pure will be parallelized, and those that aren't marked will be run serially? I think that that would work, and if we added purity inference to unittest blocks as Nordlow suggests, then you wouldn't even have to mark them as pure unless you wanted to enforce that it be runnable in parallel. - Jonathan M Davis
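What that scheme would look like from the user's side; pure is already accepted as an attribute on unittest blocks, while the inference (and a test runner scheduling off it) is the proposal under discussion, not behavior the compiler had at the time:

```d
// Verified by the compiler to touch no global mutable state, so a test
// runner could schedule it in parallel with any other pure test:
pure unittest
{
    assert([3, 1, 2].dup.length == 3);
}

// Impure: mutates process-wide state (an environment variable), so
// under the proposal it would always be run serially - either because
// inference finds it impure or because it was never marked pure:
unittest
{
    import std.process : environment;
    environment["D_TEST_MARKER"] = "1";
    scope(exit) environment.remove("D_TEST_MARKER");
    assert(environment["D_TEST_MARKER"] == "1");
}
```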
Re: Parallel execution of unittests
On Wed, 30 Apr 2014 20:33:06 -0400 Steven Schveighoffer via Digitalmars-d digitalmars-d@puremagic.com wrote: On Wed, 30 Apr 2014 13:50:10 -0400, Jonathan M Davis via Digitalmars-d digitalmars-d@puremagic.com wrote: On Wed, 30 Apr 2014 08:59:42 -0700 Andrei Alexandrescu via Digitalmars-d digitalmars-d@puremagic.com wrote: On 4/30/14, 8:54 AM, bearophile wrote: Andrei Alexandrescu: A coworker mentioned the idea that unittests could be run in parallel In D we have strong purity to make more safe to run code in parallel: pure unittest {} This doesn't follow. All unittests should be executable concurrently. -- Andrei In general, I agree. In reality, there are times when having state across unit tests makes sense - especially when there's expensive setup required for the tests. int a; unittest { // set up a; } unittest { // use a; } == unittest { int a; { // set up a; } { // use a; } } It makes no sense to do it the first way, you are not gaining anything. It can make sense to do it the first way when it's more like LargeDocumentOrDatabase foo; unittest { // set up foo; } unittest { // test something using foo } unittest { // do other tests using foo which then take advantage of changes made // by the previous test rather than doing all of those changes to // foo in order to set up this test } In general, I agree that tests shouldn't be done that way, and I don't think that I've ever done it personally, but I've seen it done, and for stuff that requires a fair bit of initialization, it can save time to have each test build on the state of the last. But even if we all agree that that sort of testing is a horrible idea, the language supports it right now, and automatically parallelizing unit tests will break any code that does that. Honestly, the idea of running unit tests in parallel makes me very nervous. In general, across modules, I'd expect it to work, but there will be occasional cases where it will break. Then you didn't write your unit-tests correctly. 
True unit tests, anyway. In fact, the very quality that makes unit tests so valuable (that they are independent of other code) is ruined by sharing state across tests. If you are going to share state, it really is one unit test. All it takes is that tests in two separate modules which have separate functionality access the file system or sockets or some other system resource, and they could end up breaking due to the fact that the other test is messing with the same resource. I'd expect that to be a relatively rare case, but it _can_ happen, so simply parallelizing tests across modules does risk test failures that would not have occurred otherwise. Across the unittest blocks in a single module, I'd be _very_ worried about breakage. There is nothing whatsoever in the language which guarantees that running them in parallel will work or even makes sense. All that protects us is the convention that unit tests are usually independent of each other, and in my experience, it's common enough that they're not independent that I think that blindly enabling parallelization of unit tests across a single module is definitely a bad idea. I think that if we add the assumption, the resulting fallout would be easy to fix. Note that we can't require unit tests to be pure -- non-pure functions need testing too :) Sure, they need testing. Just don't test them in parallel, because they're not guaranteed to work in parallel. That guarantee _does_ hold for pure functions, because they don't access global, mutable state. So, we can safely parallelize a unittest block that is pure, but we _can't_ safely parallelize one that isn't - not in a guaranteed way. I can imagine that even if you could only parallelize 90% of unit tests, that would be an effective optimization for a large project. In such a case, the rare (and I mean rare to the point of I can't think of a single use-case) need to deny parallelization could be marked. std.file's unit tests would break immediately. 
It wouldn't surprise me if std.socket's unit tests broke. std.datetime's unit tests would probably break on Posix systems, because some of them temporarily set the local time zone - which sets it for the whole program, not just the current thread (those tests aren't done on Windows, because Windows only lets you set it for the whole OS, not just the program). Any tests which aren't pure risk breakage due to changes in whatever global, mutable state they're accessing. I would strongly argue that automatically parallelizing any unittest block which isn't pure is a bad idea, because it's not guaranteed to work, and it _will_ result in bugs in at least some cases. If we make it so that unittest blocks have their purity inferred (and allow you to mark them as pure to enforce that they be pure if you want to require
Re: Parallel execution of unittests
On Wed, 30 Apr 2014 14:35:45 -0700 Andrei Alexandrescu via Digitalmars-d digitalmars-d@puremagic.com wrote: Agreed. I think we should look into parallelizing all unittests. -- I'm all for parallelizing all unittest blocks that are pure, as doing so would be safe, but I think that we're making a big mistake if we try and insist that all unittest blocks be able to be run in parallel. Any that aren't pure are not guaranteed to be parallelizable, and any which access system resources or other global, mutable state stand a good chance of breaking. If we make it so that the functions generated from unittest blocks have their purity inferred, then any unittest block which can safely be parallelized could then be parallelized by the test runner based on its purity, and any impure unittest functions could then be safely run in serial. And if you want to make sure that a unittest block is parallelizable, then you can just explicitly mark it as pure. With that approach, we don't risk breaking existing unit tests, and it allows tests that need to not be run in parallel to work properly by guaranteeing that they're still run serially. And it even makes it so that many tests are automatically parallelizable without the programmer having to do anything special for it. - Jonathan M Davis
Re: Parallel execution of unittests
On Wed, 30 Apr 2014 22:32:33 -0700 Andrei Alexandrescu via Digitalmars-d digitalmars-d@puremagic.com wrote: On 4/30/14, 10:01 PM, Jonathan M Davis via Digitalmars-d wrote: I'm all for parallelizing all unittest blocks that are pure, as doing so would be safe, but I think that we're making a big mistake if we try and insist that all unittest blocks be able to be run in parallel. Any that aren't pure are not guaranteed to be parallelizable, and any which access system resources or other global, mutable state stand a good chance of breaking. There are a number of assumptions here: (a) most unittests that can be effectively parallelized can be actually inferred (or declared) as pure; (b) most unittests that cannot be inferred as pure are likely to break; (c) it's a big deal if unittests break. I question all of these assumptions. In particular I consider unittests that depend on one another an effective antipattern that needs to be eradicated. Even if they don't depend on each other, they can depend on the system. std.file's unit tests will break if we parallelize them, because it operates on files and directories, and many of those tests operate on the same temp directories. That can be fixed by changing the tests, but it will break the tests. Other tests _can't_ be fixed if we force them to run in parallel. For instance, some of std.datetime's unit tests set the local time zone of the system in order to test that LocalTime works correctly. That sets it for the whole program, so all threads will be affected even if they're running other tests. Right now, this isn't a problem, because those tests set the timezone at their start and reset it at their end. But if they were made to run in parallel with any other tests involving LocalTime, there's a good chance that those tests would have random test failures. They simply can't be run in parallel due to a system resource that we can't make thread-local. 
So, regardless of how we want to mark up unittest blocks as parallelizable or not parallelizable (be it explicit, implict, using pure, or using something else), we do need a way to make it so that a unittest block is not run in parallel with any other unittest block. We can guarantee that pure functions can safely be run in parallel. We _cannot_ guarantee that impure functions can safely be run in parallel. I'm sure that many impure unittest functions could be safely run in parallel, but it would require that the programmer verify that if we don't want undefined behavior - just like programmers have to verify that @system code is actually @safe. Simply running all unittest blocks in parallel is akin to considering @system code @safe in a particular piece of code simply because by convention that code should be @safe. pure allows us to detect guaranteed, safe parallelizability. If we want to define some other way to make it so a unittest block can be marked as parallelizable regardless of purity, then fine. But automatically parallelizing impure functions means that we're going to have undefined behavior for those unittest functions, and I really think that that is a bad idea - in addition to the fact that some unittest blocks legitimately cannot be run in parallel due to the use of system resources, so parallelizing them _will_ not only break them but make them impossible to write in a way that's not broken without adding mutexes to the unittest blocks to stop the test runner from running them in parallel. And IMHO, if we end up having to do that anywhere, we've done something very wrong with how unit tests work. - Jonathan M Davis
Re: Parallel execution of unittests
On Wed, 30 Apr 2014 23:56:53 -0700 Andrei Alexandrescu via Digitalmars-d digitalmars-d@puremagic.com wrote: I don't think undefined behavior is at stake here, and I find the simile invalid. Thread isolation is a done deal in D and we may as well take advantage of it. Worst that could happen is that a unittest sets a global and surprisingly the next one doesn't see it. At any rate I think it's pointless to insist on limiting parallel running to pure - let me just say I understood the point (thanks) so there is no need to restate it, and that I think it doesn't take us to a good place. I'm only arguing for using pure on the grounds that it _guarantees_ that the unittest block is safely parallelizable. If we decide that that guarantee isn't necessary, then we decide that it isn't necessary, though I definitely worry that not having that guarantee will be problematic. I do agree though that D's thread-local by default helps quite a bit in ensuring that most tests will be runnable in parallel. However, if we went with purity to indicate parallelizability, I could easily see doing it implicitly based on purity and allowing for a UDA or somesuch which marked a unittest block as trusted pure so that it could be run in parallel. So, I don't think that going with pure would necessarily be too restrictive. It just would require that the programmer do some extra work to be able to treat a unittest block as safely parallelizable when the compiler couldn't guarantee that it was. Ultimately, my biggest concern here is that it be possible to guarantee that a unittest block is not run in parallel with any other unittest block if that particular unittest requires it for any reason, and some folks seem to be arguing that such tests are always invalid, and I want to make sure that we don't ever consider that to be the case for unittest blocks in D. If we do parallel by default and allow for some kind of markup to make a unittest block serial, then that can work.
I fully expect that switching to parallel by default would break a number of tests, which I do think is a problem (particularly since a number of those tests will be completely valid), but it could also be an acceptable one - especially if for the most part, the code that it breaks is badly written code. Regardless, we will need to make sure that we message the change clearly in order to ensure that a minimal number of people end up with random test failures due to the change. On a side note, regardless of whether we want to use purity to infer parallelizability, I think that it's very cool that we have the capability to do so if we so choose, whereas most other languages have no way of even coming close to being able to tell whether a function can be safely parallelized or not. The combination of attributes such as pure and compile-time inference is very cool indeed. - Jonathan M Davis
Re: Parallel execution of unittests
On Thu, 01 May 2014 07:26:59 + Dicebot via Digitalmars-d digitalmars-d@puremagic.com wrote: On Thursday, 1 May 2014 at 04:50:30 UTC, Jonathan M Davis via Digitalmars-d wrote: std.file's unit tests would break immediately. It wouldn't surprise me if std.socket's unit tests broke. std.datetime's unit tests would probably break on Posix systems, because some of them temporarily set the local time zone - which sets it for the whole program, not just the current thread (those tests aren't done on Windows, because Windows only lets you set it for the whole OS, not just the program). Any tests which aren't pure risk breakage due to changes in whatever global, mutable state they're accessing. We really should think about separating Phobos tests into unit tests and higher level ones (in separate top-level source folder). The fact that importing std.file in my code with `rdmd -unittest` may trigger file I/O makes me _extremely_ disgusted. How did we even get here? Honestly, I see no problem with std.file's unit tests triggering I/O. That's what the module _does_. And it specifically uses the system's temp directory so that it doesn't screw with anything else on the system. Separating the tests out into some other set of tests wouldn't buy us anything IMHO. The tests need to be run regardless, and they need to be run with the same frequency regardless. Splitting those tests out would just make them harder for developers to run, because now they'd have to worry about running two sets of tests instead of just one. As far as I can see, splitting out tests that do I/O would be purely for ideological reasons and would be of no practical benefit. In fact, it would be _less_ practical if we were to do so. - Jonathan M Davis
Re: More radical ideas about gc and reference counting
On Wed, 30 Apr 2014 13:21:33 -0700 Andrei Alexandrescu via Digitalmars-d digitalmars-d@puremagic.com wrote: Walter and I have had a long chat in which we figured our current offering of abstractions could be improved. Here are some thoughts. There's a lot of work ahead of us on that and I wanted to make sure we're getting full community buy-in and backup. First off, we're considering eliminating destructor calls from within the GC entirely. It makes for a faster and better GC, but the real reason here is that destructors are philosophically bankrupt in a GC environment. I think there's no need to argue that in this community. The GC never guarantees calling destructors even today, so this decision would be just a point in the definition space (albeit an extreme one). I really don't like the fact that struct destructors are not called by the GC, and if anything, I'd be inclined to argue for finding a way to guarantee that they get run rather than guaranteeing that they never get run. It's just far too easy to have a struct expect that its destructor will be run and then have issues when it's not run. But it would be better to define struct destructors as never getting run rather than having it be undefined as it is now. We're considering deprecating ~this() for classes in the future. While it's not good to rely on finalizers, they're good to have as backup if the appropriate cleanup function doesn't get called like it's supposed to. They're not as critical as they'd be in Java, since we have structs, but I'd be disinclined to remove finalizers from D without a really good reason. Also, we're considering a revamp of built-in slices, as follows. Slices of types without destructors stay as they are. Slices T[] of structs with destructors shall be silently lowered into RCSlice!T, defined inside object.d. That type would occupy THREE words, one of which being a pointer to a reference count. 
That type would redefine all slice primitives to update the reference count accordingly. RCSlice!T will not convert implicitly to void[]. Explicit cast(void[]) will be allowed, and will ignore the reference count (so if a void[] extracted from a T[] via a cast outlives all slices, dangling pointers will ensue). I foresee any number of theoretical and practical issues with this approach. Let's discuss some of them here. I'm really going to have to think about this one. It's such a radical change that I really don't know what to think about it. It will be interesting to see what others have to say about it. - Jonathan M Davis
Re: More radical ideas about gc and reference counting
On Wed, 30 Apr 2014 14:00:31 -0700 Andrei Alexandrescu via Digitalmars-d digitalmars-d@puremagic.com wrote: On 4/30/14, 1:57 PM, Timon Gehr wrote: On 04/30/2014 10:45 PM, Andrei Alexandrescu wrote: An extreme one indeed, it would break a lot of my code. Every D project I wrote that does networking manages memory using a class that resides on the managed heap, but holds the actual wrapped data in the unmanaged heap. So should I take it those classes all have destructors? -- Andrei (Yes, those destructors free the unmanaged memory.) Thanks... that would need to change :o). -- Andrei And it doesn't even work now, because it's not guaranteed that finalizers get run. And IIRC, based on some of the discussions at dconf last year, dealing with the GC and unloading shared libraries would probably make the situation even worse. But not being able to rely on finalizers running does put us in a bit of a pickle, because it basically means that any case where you need a finalizer, you should probably be using reference counting rather than the GC. That would tend to mean that either classes are going to need to be wrapped in a struct that reference-counts them and/or they're going to need to be allocated with a custom allocator rather than the GC. This problem makes me think of C#'s using blocks where the object is created at the beginning of the using block, and its dispose method is called when that block is exited (it may also be collected then, but I don't remember). I don't think that that's quite what we want, since there are plenty of cases where you want to pass a class around, so some kind of reference counting would be better, but we probably should consider having some kind of standard function similar to dispose so that there's a standard method to call when a class needs to be cleaned up. And that could tie into a struct in Phobos that would do the reference counting by wrapping the class. - Jonathan M Davis
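A rough sketch of the kind of Phobos wrapper being suggested, with every name here (RC, Resource, dispose) purely hypothetical: a reference-counting struct that calls a by-convention cleanup method deterministically, instead of relying on a finalizer that may never run.

```d
// All names hypothetical; a sketch of a reference-counting wrapper
// that calls a by-convention cleanup method deterministically, rather
// than relying on the GC finalizer, which is not guaranteed to run.
struct RC(T) if (is(T == class))
{
    private T _obj;
    private size_t* _count;

    this(T obj)
    {
        _obj = obj;
        _count = new size_t;
        *_count = 1;
    }

    this(this) // postblit: copying the wrapper bumps the count
    {
        if (_count !is null)
            ++*_count;
    }

    ~this()
    {
        if (_count !is null && --*_count == 0)
            _obj.dispose(); // the C#-Dispose-like convention
    }

    alias _obj this; // forward member access to the wrapped class
}

class Resource
{
    bool disposed;
    void dispose() { disposed = true; /* release non-GC resources */ }
}

void main()
{
    auto r = RC!Resource(new Resource);
    {
        auto copy = r; // count: 2
    }                  // copy destroyed; count back to 1, no dispose yet
    assert(!r.disposed);
} // r destroyed; count hits 0 and dispose() runs deterministically
```

std.typecons.RefCounted existed at the time for structs; the sketch above is only about what a class-capable equivalent might look like.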
Re: A few considerations on garbage collection
On Wed, 30 Apr 2014 20:08:03 -0400 Steven Schveighoffer via Digitalmars-d digitalmars-d@puremagic.com wrote: On Wed, 30 Apr 2014 14:15:03 -0400, Dmitry Olshansky dmitry.o...@gmail.com wrote: IIRC they do, it's only arrays of such that doesn't. Anyhow having such a dangerous construct built-in (new = resource leak) in the language becomes questionable. No, they don't. Only objects are marked as having a finalizer. The finalize flag in the GC assumes that the first size_t in the block is a pointer to a vtable. A struct cannot have this. We need to fundamentally modify how this works if we want finalizers for structs to be called, but I think it's worth doing. IIRC, Rainer's precise GC does this. It would be _very_ cool if we could make it so that struct destructors get run when they're on the GC heap. I'm sure that the fact that they're not called currently creates a number of subtle bugs when structs are used which assume that their destructor is going to be called when they're destroyed/freed. - Jonathan M Davis
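A small example of the behavior being complained about, as it stood at the time of this discussion (later druntime versions changed struct finalization, so results may differ on newer compilers):

```d
// Behavior as of the time of this discussion; later druntime versions
// changed struct finalization, so newer compilers may differ.
import core.memory : GC;

struct S
{
    static int dtorCalls;
    ~this() { ++dtorCalls; }
}

void main()
{
    {
        S onStack; // scoped lifetime: destructor guaranteed at scope exit
    }
    assert(S.dtorCalls == 1);

    S* onHeap = new S; // GC heap: the block is not marked for finalization
    onHeap = null;
    GC.collect();
    // dtorCalls stays at 1 here: the collector reclaims the memory
    // without ever running S's destructor.
}
```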
Re: Parallel execution of unittests
On Thu, 01 May 2014 10:42:54 -0400 Steven Schveighoffer via Digitalmars-d digitalmars-d@puremagic.com wrote: On Thu, 01 May 2014 00:49:53 -0400, Jonathan M Davis via Digitalmars-d digitalmars-d@puremagic.com wrote: On Wed, 30 Apr 2014 20:33:06 -0400 Steven Schveighoffer via Digitalmars-d digitalmars-d@puremagic.com wrote: I do think there should be a way to mark a unit test as don't parallelize this. Regardless what our exact solution is, a key thing is that we need to be able to have both tests which are run in parallel and tests which are run in serial. Switching to parallel by default will break code, but that may be acceptable. And I'm somewhat concerned about automatically parallelizing unit tests which aren't pure just because it's still trivial to write unittest blocks that aren't safely parallelizable (even if most such examples typically aren't good practice) whereas they'd work just fine now. But ultimately, my main concern is that we not enforce that all unit tests be parallelized, because that precludes certain types of tests. A function may be impure, but run in a pure way. True. The idea behind using purity is that it guarantees that the unittest blocks would be safely parallelizable. But even if we were to go with purity, that doesn't preclude having some way to mark a unittest as parallelizable in spite of its lack of purity. It just wouldn't be automatic. Anything that requires using the local time zone should be done in a single unit test. Most everything in std.datetime should use a defined time zone instead of local time. Because LocalTime is the default timezone, most of the tests use it. In general, I think that that's fine and desirable, because LocalTime is what most everyone is going to be using. Where I think that it actually ends up being a problem (and will eventually necessitate that I rewrite a number of the tests - possibly most of them) is when tests end up making assumptions that can break in certain time zones.
So, in the long run, I expect that far fewer tests will use LocalTime than is currently the case, but I don't think that I agree that it should be avoided on quite the level that you seem to. It is on my todo list though to go over std.datetime's unit tests and make it stop using LocalTime where that will result in the tests failing in some time zones. Take for example, std.datetime. The constructor for SysTime has this line in it: _timezone = tz is null ? LocalTime() : tz; All unit tests that pass in a specific tz (such as UTC) could be pure calls. But because of that line, they can't be! Pretty much nothing involving SysTime is pure, because adjTime can't be pure, because LocalTime's conversion functions can't be pure, because it calls the system's functions to do the conversions. So, very few of SysTime's unit tests could be parallelized based on purity. The constructor is just one of many places where SysTime can't be pure. So, it's an example of the types of tests that would have to be marked as explicitly parallelizable if we used purity as a means of determining automatic parallelizability. - Jonathan M Davis
Re: Parallel execution of unittests
On Thu, 01 May 2014 14:40:41 -0700 Andrei Alexandrescu via Digitalmars-d digitalmars-d@puremagic.com wrote: On 5/1/14, 2:28 PM, Jason Spencer wrote: But it seems the key question is whether order can EVER be important for any reason. I for one would be willing to give up parallelization to get levelized tests. What are you seeing on your project? How do you allow tests to have dependencies and avoid order issues? Why is parallelization more important than that? I'll be blunt. What you say is technically sound (which is probably why you believe it is notable) but seems to me an unnecessarily complex engineering contraption that in all likelihood has more misuses than good uses. I fully understand you may think I'm a complete chowderhead for saying this; in the past I've been in your place and others have been in mine, and it took me years to appreciate both positions. -- Andrei It's my understanding that given how druntime is put together, it should be possible to override some of its behaviors such that you could control the order in which tests were run (the main thing lacking at this point is the fact that you can currently only control it at module-level granularity) and that that's what existing third party unit test frameworks for D do. So, I would think that we could make it so that the default test runner does things the sensible way that works for most everyone, and then anyone who really wants more control can choose to override the normal test runner to run the tests the way that they want to. That should be essentially the way that it is now. The main question then is which features we think are sensible for everyone, and I think that based on this discussion, at this point, it's primarily 1. Make it possible for druntime to access unit test functions individually. 2. Make it so that druntime runs unit test functions in parallel unless they're marked as needing to be run in serial (probably with a UDA for that purpose). 3.
Make it so that we can name unittest blocks so that stack traces have better function names in them. With those sorted out, we can look at further features like whether we want to be able to run unit tests by name (or whatever other nice features we can come up with), but we might as well start there rather than trying to come up with a comprehensive list of the features that D's unit testing facilities should have (especially since we really should be erring on the side of simple). - Jonathan M Davis
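The override mechanism referred to above can be sketched with druntime's actual hook, Runtime.moduleUnitTester. The module-level granularity limitation is visible in the code: each ModuleInfo exposes a single function aggregating all of that module's unittest blocks, so per-block dispatch isn't possible from here without further runtime changes.

```d
// Sketch using druntime's real customization hook. Note the granularity:
// m.unitTest is one aggregated function per module, not one function
// per unittest block.
import core.runtime;

shared static this()
{
    Runtime.moduleUnitTester = function bool()
    {
        foreach (m; ModuleInfo)
        {
            if (auto test = m.unitTest)
            {
                // A custom runner could hand these off to a task pool;
                // plain serial execution is shown for simplicity.
                test();
            }
        }
        return true; // true means "tests passed, go on to run main()"
    };
}

unittest { assert(1 + 1 == 2); } // picked up by the runner above

void main() {} // compile with -unittest
```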
Re: DIP(?) Warning to facilitate porting to other archs
On Thu, 01 May 2014 11:17:09 + Temtaime via Digitalmars-d digitalmars-d@puremagic.com wrote: Hi everyone. I think we need a -w64 flag (or some other name; suggestions?) that warns if code may not compile on other archs. Example: size_t a; uint b = a; // ok on 32 without a warning but fail on 64 with error And on 32 with -w64 it'll be : Warning : size_t.sizeof may be greater than 32 bit What do you think? Should I create a proposal, or does nobody care about porting and it's useless? Any ideas are welcome. The compiler doesn't even know that size_t exists. It's just an alias in object_.d. So, it could be fairly involved to get the compiler to warn about something like this. And while in some respects, this would be nice to have, I don't think that it's actually a good idea. IMHO, the compiler pretty much has no business warning about anything. As far as the compiler is concerned, everything should be either an error or nothing (and Walter agrees with me on this; IIRC, the only reason that he added warnings in the first place was as an attempt to appease some folks). About the only exception would be deprecation-related warnings, because those are items that aren't currently errors but are going to be errors. If warnings are in the compiler, programmers are forced to fix them as if they were errors (because it's bad practice to leave compiler warnings in your build), and they can actually affect what does and doesn't compile thanks to the -w flag (which can be particularly nasty when stuff like template constraints get involved). Warnings belong in lint-like tools where the user can control what they want to be warned about, including things that would be useful to them but most other folks wouldn't care about.
So, unless you're suggesting that we make it an error to assign a value of size_t to a uint, I don't think that it makes any sense for the compiler to say anything about this, and given the fact that it doesn't know anything about size_t anyway, it's probably not particularly reasonable to have the compiler warn about it even if we agreed that it would be a good idea. D is ideally suited to writing lint-like tools, and as I understand it, Brian Schott has written one. I don't know what state it's currently in or what exactly it can warn you about at this point, but I think that it would be better to look at putting warnings like this in such a tool than to try and get it put in the compiler. - Jonathan M Davis
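For reference, the portable alternatives available today without any compiler warning can be sketched in code (std.conv.to does the checked narrowing):

```d
// Portable handling of size_t -> uint today, without compiler warnings.
import std.conv : to;

void main()
{
    size_t a = 42;

    // uint b = a;         // fine on 32-bit; a compile error on 64-bit,
    //                     // where size_t is 64 bits wide

    uint b = cast(uint) a; // compiles everywhere but silently truncates
                           // large values - the bug the proposed -w64
                           // flag was meant to catch

    uint c = to!uint(a);   // compiles everywhere; throws
                           // ConvOverflowException at runtime if the
                           // value doesn't fit in a uint

    assert(b == 42 && c == 42);
}
```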
Re: DIP(?) Warning to facilitate porting to other archs
On Fri, 02 May 2014 15:54:37 -0400 Nick Sabalausky via Digitalmars-d digitalmars-d@puremagic.com wrote: Warnings ARE a built-in lint-like tool. Perhaps, but having them in the compiler is inherently flawed, because you have little-to-no control over what it warns about, and you're forced to essentially treat them as errors, because it's incredibly error-prone to leave any warnings in the build (they mask real problems too easily). As such, it makes no sense to have warnings in the compiler IMHO. On top of that, lint itself proves that lint tends to not get used. True, that is a problem. But if folks really want the warnings, they can go to the extra effort. And I'd much rather err on the side of folks screwing up because they didn't bother to run the tool than having to fix nonexistent problems in my code because someone convinced a compiler dev to make the compiler warn about something that's a problem some of the time but isn't a problem in what I'm actually doing. - Jonathan M Davis
Re: More radical ideas about gc and reference counting
On Fri, 02 May 2014 21:03:15 + monarch_dodra via Digitalmars-d digitalmars-d@puremagic.com wrote: On Friday, 2 May 2014 at 15:06:59 UTC, Andrei Alexandrescu wrote: So now it looks like dynamic arrays also can't contain structs with destructors :o). -- Andrei Well, that's always been the case, and even worse, since in a dynamic array, destructors are guaranteed to *never* be run. Furthermore, given that appending causes relocation, which duplicates, you are almost *guaranteed* to leak your destructors. You just can't keep track of the usage of a naked dynamic array. This usually comes as a great surprise to users in .learn. It's also the reason why using File[] never ends well... Heck, I probably knew that before, but I had completely forgotten. If you'd asked me yesterday whether struct destructors were run in dynamic arrays, I'd have said yes. And if someone like me doesn't remember that, would you expect the average D programmer to? The current situation is just plain bug-prone. Honestly, I really think that we need to figure out how to make it so that struct destructors are guaranteed to be run so long as the memory that they're in is collected. Without that, having destructors in structs anywhere other than directly on the stack is pretty much broken. - Jonathan M Davis
Re: DIP(?) Warning to facilitate porting to other archs
On Fri, 02 May 2014 22:39:12 + Meta via Digitalmars-d digitalmars-d@puremagic.com wrote: On Friday, 2 May 2014 at 21:40:09 UTC, Jonathan M Davis via Digitalmars-d wrote: True, that is a problem. But if folks really want the warnings, they can go to the extra effort. Why are we making people go to extra effort to get lint-like functionality if we want it to be something that everyone uses? Whether a linter is a separate logical entity within the compiler or a library that can be hooked into, it should be on by default. The problem is that some of what gets warned about is _not_ actually a problem. If it always were, it would be an error. So, unless you have control over exactly what gets warned about and have the ability to disable the warning in circumstances where it's wrong, it makes no sense to have the warnings, because you're forced to treat them as errors and always fix them, even if the fix is unnecessary. If the compiler provides that kind of control, then fine, it can have warnings, but dmd doesn't and won't, because Walter doesn't want it to have a vast assortment of flags to control anything (warnings included). That being the case, it makes no sense to put the warnings in the compiler. With a lint tool however, you can configure it however you want (especially because there isn't necessarily one, official tool, making it possible to have a lint tool that does exactly what you want for your project). It's not tied to what the language itself requires, making it much more sane as a tool for giving warnings. The compiler tends to have to do what fits _everyone's_ use case, and that just doesn't work for warnings. Putting warnings in the compiler always seems to result in forcing people to change their code to make the compiler shut up about something that is perfectly fine. - Jonathan M Davis
Re: Reopening the debate about non-nullable-by-default: initialization of member fields
On Sat, 03 May 2014 00:50:14 + Idan Arye via Digitalmars-d digitalmars-d@puremagic.com wrote: We are all sick and tired of this debate, but today I've seen a question in Stack Exchange's Programmers board that raises a point I don't recall being discussed here: http://programmers.stackexchange.com/questions/237749/how-do-languages-with-maybe-types-instead-of-nulls-handle-edge-conditions Consider the following code:

    class Foo {
        void doSomething() {}
    }

    class Bar {
        Foo foo;

        this(Foo foo) {
            doSomething();
            this.foo = foo;
        }

        void doSomething() {
            foo.doSomething();
        }
    }

Constructing an instance of `Bar`, of course, segfaults when it calls `doSomething` that tries to call `foo`'s `doSomething`. The non-nullable-by-default should avoid such problems, but in this case it doesn't work since we call `doSomething` in the constructor, before we initialized `foo`. Yeah, I brought this up before, and it's one of the reasons why I'm against non-nullable by default. It means that class references will need to be treated the same as structs whose init property is disabled, which can be _very_ limiting. And I don't know if we currently handle structs with disabled init properties correctly in all cases, since it's not all that hard for something subtle to have been missed and allow for such a struct to be used before it was actually initialized (and the fact that not much code uses them would make it that much more likely that such a bug would go unnoticed). Hopefully, all those issues have been sorted out by now though. If so, then I would think that we already have all of the rules in place for how non-nullable references would be dealt with with regards to initialization, but they'd still be very limiting, because most of D expects that all types have an init property. - Jonathan M Davis
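The initialization rules being referred to can be demonstrated with @disable this(); the names here are illustrative:

```d
// Illustrative only: how D models "no usable default value" for a type,
// which is how non-nullable class references would have to behave.
struct NonNull
{
    private int* _p;

    @disable this(); // forbids default construction of this type

    this(int* p)
    {
        assert(p !is null);
        _p = p;
    }
}

void main()
{
    // NonNull n;        // error: default construction is disabled
    auto p = new int;
    *p = 5;
    auto n = NonNull(p); // must be initialized with a real value
    assert(*n._p == 5);
}
```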
Re: The Current Status of DQt
On Sat, 03 May 2014 11:00:37 + w0rp via Digitalmars-d digitalmars-d@puremagic.com wrote: So, I am eager to hear what people think about all of this. Does anyone like the work that I have done, and will it be useful? Have I committed some terrible crime against nature, for which I must be punished? Does anyone have any ideas about things that could be improved, or where to go next? Please, let me know. I can't really comment much on your approach or implementation, because I haven't looked at what you've done, and while I do have some experience with Qt, I haven't done a lot with it (and I haven't done a lot with GUI programming in general), so I'm not in a good position to comment on or review a D wrapper for Qt. That being said, if I were to write a GUI application in either C++ or D, I would want to use Qt (preferably Qt5). And given what is on my todo list, I expect that I'll be looking at writing a GUI application in D within a year or two. So, having a usable wrapper library for Qt in D is something that I'm very interested in seeing happen. I wasn't aware of your efforts in that regard (I was just aware of QtD, though it's not clear to me how actively developed it is at this point, since it was my understanding that the original devs dropped it, but I know that some folks have repos of it with more recent changes), but I'm very glad that someone is taking up this torch, and I wish you the best of luck with it. I'm just not likely to be of much help in reviewing or critiquing it at this point. However, there are quite a few folks around here who are not only much more familiar with GUI development but who are also very opinionated on the matter, so hopefully some of them will be able to chime in with useful insights. - Jonathan M Davis
Re: More radical ideas about gc and reference counting
On Sat, 03 May 2014 22:44:39 -0400 Nick Sabalausky via Digitalmars-d digitalmars-d@puremagic.com wrote: On 5/3/2014 6:44 PM, Andrei Alexandrescu wrote: On 5/3/14, 12:40 PM, Walter Bright wrote: On 5/1/2014 7:59 AM, Andrei Alexandrescu wrote: If a class has at least one member with a destructor, the compiler might need to generate a destructor for the class. And in fact that's what dmd does. Which suggests a simple solution for calling destructors for structs and arrays: * Lower new for structs to return: new S; - return (new class { S member; }).member; * Lower array construction similarly. Then voila, the anonymous classes will destroy structs and arrays appropriately. Uhh, but doesn't this completely break as soon as class dtors go away? Based on other comments, I think that Andrei has been convinced that class destructors can't go away at this point and that there certainly isn't any consensus that it would even be desirable for them to go away (though what he'd choose to do if we could break code willy-nilly, I don't know). So, this particular proposal is presumably done with the idea that class destructors are here to stay. Rather, it's trying to make it so that the destructors for structs on the heap get run unlike now - which is a much better direction to try and go IMHO. - Jonathan M Davis
Re: More radical ideas about gc and reference counting
On Sat, 03 May 2014 15:44:03 -0700 Andrei Alexandrescu via Digitalmars-d digitalmars-d@puremagic.com wrote: On 5/3/14, 12:40 PM, Walter Bright wrote: On 5/1/2014 7:59 AM, Andrei Alexandrescu wrote: If a class has at least one member with a destructor, the compiler might need to generate a destructor for the class. And in fact that's what dmd does. Which suggests a simple solution for calling destructors for structs and arrays: * Lower new for structs to return: new S; - return (new class { S member; }).member; * Lower array construction similarly. Then voila, the anonymous classes will destroy structs and arrays appropriately. This might be a good approach, though I confess that it strikes me as rather weird to wrap structs in classes like that. It also leaves open the question of how to deal with structs that are newed directly rather than put in an array. _Those_ really need to be destroyed properly as well. And unless you're suggesting that S* effectively become a pointer to a struct within an anonymous class in all cases, not only would this not work for structs which were newed up directly on the heap, but I'd be worried about what would happen when someone did something like arr[5] with an array of structs. Would they get a pointer to a class or a struct? It was my understanding that some of what Rainer Schutze did with his precise GC involved adding RTInfo which could make it possible to run struct destructors for structs that were newed up directly on the heap. If that is indeed the case, then I would think that we could use that for arrays of structs as well. - Jonathan M Davis
Re: D For A Web Developer
On Sat, 03 May 2014 19:36:53 -0700 Walter Bright via Digitalmars-d digitalmars-d@puremagic.com wrote: On 5/3/2014 6:57 PM, Nick Sabalausky wrote: I'm not sure mock networks can really be used for testing a client-only lib of some specific protocol. There may also be other examples. There's also the question of whether or not D's unittest {...} should *expect* to be limited to tests that are *technically* unit tests. Currently, unittest {...} is useful for more forms of testing than just unit tests. I think it's debatable whether we want to kill off those uses without offering a comparable alternative with reasonable migration. I'm not suggesting killing off anything. I'm suggesting it may not be good practice to use unit tests for testing actual networks. I'd write unit tests which talked to sockets on the same computer if that was what was required to test a particular function, but I would definitely consider it bad practice to have a unit test try to talk to anything on a separate computer. Now, if you're using unittest blocks for something other than unit tests, then I guess that that could be fine, though I question that it's good practice to use unittest blocks for other purposes. Regardless, unittest blocks don't really put any restrictions on what kind of code can go in them, and I'd prefer that that stay the case. The discussion on parallelizing unit tests threatens that on some level, but as long as we have the means to mark unittest blocks in some manner that tells the test runner not to run them in parallel with any other unittest blocks, then I think that we should be fine on that front. - Jonathan M Davis
Re: Scenario: OpenSSL in D language, pros/cons
On Sun, 04 May 2014 08:34:19 + Daniele M. via Digitalmars-d digitalmars-d@puremagic.com wrote: I have read this excellent article by David A. Wheeler: http://www.dwheeler.com/essays/heartbleed.html And since D language was not there, I mentioned it to him as a possible good candidate due to its static typing and related features. However, now I am asking the community here: would a D implementation (with GC disabled) of OpenSSL have been free from Heartbleed-type vulnerabilities? Specifically http://cwe.mitre.org/data/definitions/126.html and http://cwe.mitre.org/data/definitions/20.html as David mentions. I find this perspective very interesting, please advise :) @safe code protects against indexing an array out-of-bounds. So, if OpenSSL had been implemented in D, and its heartbeat feature used @safe code, then heartbleed would not have been possible. As soon as an attempt was made to index past the end of the array, it would have thrown a RangeError and killed the program. Now, even if OpenSSL had been implemented in D, if it had used @system or @trusted code for its heartbeat feature, then it could have had the bug just as easily in D as it did in C. And given all of the horrible practices in OpenSSL, I very much doubt that having it written in D would have prevented much, because anyone making the choices that the OpenSSL folks have been making would likely have ended up with horrible D code which was mostly @system and probably doing all kinds of things that are inherently risky, forgoing many of the benefits that D provides.
I think that it's safe to say that D makes it easier to catch problems like this, but it doesn't entirely prevent them, and bad programming practices can pretty much always get around protections that the language provides unless the language provides no ways around those protections (which isn't the case in D, because it's a systems language and needs to provide low-level access and features for those programs that need them - it just doesn't use those by default). If I had more time, I'd actually be tempted to write an SSL implementation in D, but even if I were to do an excellent job of it, it would still need to be vetted by security experts to make sure that it didn't have horrible security bugs in it (much as it would be likely that there would be fewer thanks to the fact that it would be written in D), and I suspect that it's the kind of thing that many people aren't likely to trust because of how critical it is. So, I don't know how good an idea it is at this point for someone to write an implementation of SSL or TLS in D. Certainly, it's the type of thing where we've generally tried to wrap existing C libraries in order to avoid having to spend the time, effort, and expertise required to fully implement it ourselves. The Go guys did it, but if I understand correctly, the fellow that did it was one of the OpenSSL developers, so presumably he's already very familiar with all of the ins and outs of SSL, and I don't know if any of us here are (I'm certainly not - if I were doing it, I'd pretty much just have to go off of its associated RFCs, for better or worse). At this point though, if I were looking at using an existing implementation of SSL, I'd probably be looking at GnuTLS rather than OpenSSL given how horrible OpenSSL's codebase is. I don't know that GnuTLS is any better, but it wouldn't be hard for it to be.
OpenSSL is horrible both with regards to its implementation and its API, and we'd all be better off if something better replaced it (be it GnuTLS or something else). Unfortunately, even if something better _were_ written in D, it's probably only the D folks who would benefit, since it's not terribly likely at this point that very many folks are going to wrap a D library in order to use it in another language. However, regardless of whether we ever end up with an SSL implementation in D, I think that in the long run, D will show itself to be much, much better than C or C++ at writing code that has a low number of security bugs. - Jonathan M Davis
Re: Scenario: OpenSSL in D language, pros/cons
On Sun, 04 May 2014 21:18:22 +0000 Daniele M. via Digitalmars-d digitalmars-d@puremagic.com wrote: On Sunday, 4 May 2014 at 10:23:38 UTC, Jonathan M Davis via Digitalmars-d wrote: And then comes my next question: except for that malloc-hack, would it have been possible to write it in @safe D? I guess that if not, module(s) could have been made un-@safe. Not saying that a similar separation of concerns was not possible in OpenSSL itself, but that D could have made it less development-expensive in my opinion. I don't know what all OpenSSL is/was doing, and I haven't looked into it in great detail. I'm familiar with what caused heartbleed, and I'm somewhat familiar with OpenSSL's API from having dealt with it at work, but most of what I know about OpenSSL, I know from co-workers who have had to deal with it and other stuff that I've read about it, and in general, from what I understand, it's just plain badly designed and badly written, and it's a miracle that it works as well as it does. Most of the problems seem to stem from how the project is managed (including having horrible coding style and generally not liking to merge patches), but it's also certain that a number of the choices that they've made make it easier for security problems to creep in (e.g. using their own malloc in an attempt to gain some speed on some OSes). From what I know of SSL itself (and I've read some of the spec, but not all of it), very little of it (and probably none of it save for the actual operations on the sockets) actually requires anything that's @system. The problem is when you go to great lengths to optimize the code, which the OpenSSL guys seem to have done. When you do that, you do things like turn off array bounds checking and generally try to avoid many of the safety features that a language like D provides, since many of them do incur at least some overhead. Actually implementing SSL itself wouldn't take all that long from what I understand.
The main problem is in maintenance - probably in particular with regards to the fact that you'd have to keep adding support for more encryption methods as they come out (which technically aren't part of SSL itself), but I'm not familiar enough with the details to know all of the nooks and crannies that would cause maintenance nightmares. The base spec is less than 70 pages long though. The fellow who answered the question here seems to think that implementing SSL itself is actually fairly easy and that he's done it several times for companies already: http://security.stackexchange.com/questions/55465 I fully expect that if someone were to implement it in D, it would be safer out of the box than a C implementation would be. But if you had to start playing tricks to get it faster, that would increase the security risk, and in order for folks to trust it, you'd have to get the code audited, which is a whole other order of pain (and one that potentially costs money, depending on who does the auditing). If I had more time, I'd actually be tempted to write an SSL implementation in D, but even if I were to do an excellent job of it, it would still need to be vetted by security experts to make sure that it didn't have horrible security bugs in it (much as it would be likely that there would be fewer thanks to the fact that it would be written in D), and I suspect that it's the kind of thing that many people aren't likely to trust because of how critical it is. Nobody would expect/trust a single person to do this job :P Working in an open source project would be best. If someone around here implemented SSL in D, I fully expect that it would be open source, and I fully expect that one person could do it. It's just a question of how long it would take them - though obviously sharing the work among multiple people could make it faster. Where it's definitely required that more people get involved is when you want the code audited to ensure that it's actually safe and secure.
And that's where implementing an SSL library is fundamentally different from implementing most libraries - it's so integral to security that it doesn't really cut it to just throw an implementation together and toss it out there for folks to use if they're interested. Unfortunately, even if something better _were_ written in D, it's probably only the D folks who would benefit, since it's not terribly likely at this point that very many folks are going to wrap a D library in order to use it in another language. Here I don't completely agree: if we can have a binary-compatible implementation done in D, then we would be able to modify software to eventually use it as a dependency. I don't see the necessary D dependencies as prohibitive here. If D were to be part of a typical Linux distro, a D compiler would have to be part of a typical Linux distro. We're getting there with gdc as it's getting merged into gcc, but I don't think that it's ended up on many distros.
Re: Scenario: OpenSSL in D language, pros/cons
On Sun, 04 May 2014 13:29:33 +0000 Meta via Digitalmars-d digitalmars-d@puremagic.com wrote: The only language I would really trust is one in which it is impossible to write unsafe code, because you can then know that the developers can't use such unsafe hacks, even if they wanted to. Realistically, I think that you ultimately have to rely on the developers doing a good job. Good tools help a great deal (including a programming language that's safe by default while still generally being efficient), but if you try and restrict the programmer such that they can only do things that are guaranteed to be safe, I think that you're bound to make it impossible to do a number of things, which tends to not only be very frustrating to the programmers, but it can also make it impossible to get the performance that you need in some circumstances. So, while you might be able to better trust a library written in a language that's designed to make certain types of problems impossible, I don't think that it's realistic for that language to get used much in anything performance critical like an SSL implementation. Ultimately, I think that the trick is to make things as safe as they can be without actually making it so that the programmer can't do what they need to be able to do. And while I don't think that D hit the perfect balance on that one (e.g. we should have made @safe the default if we wanted that), I think that we've done a good job of it overall - certainly far better than C or C++. - Jonathan M Davis
Re: Scenario: OpenSSL in D language, pros/cons
On Mon, 05 May 2014 07:39:13 +0000 Paulo Pinto via Digitalmars-d digitalmars-d@puremagic.com wrote: Sometimes I wonder how much money have C design decisions cost the industry in terms of anti-virus, static and dynamic analyzers tools, operating systems security enforcements, security research and so on. All avoidable with bound checking by default and no implicit conversions between arrays and pointers. Well, a number of years ago, the folks who started the codebase of the larger products at the company I work at insisted on using COM everywhere, because we _might_ have to interact with 3rd parties, and they _might_ not want to use C++. So, foolishly, they mandated that _nowhere_ in the codebase should any C++ objects be passed around except by pointer. They then had manual reference counting on top of that to deal with memory management. That decision has cost us man _years_ in time working on reference counting-related bugs. Simply using smart pointers instead would probably have saved the company millions. COM may have its place, but forcing a whole C++ codebase to function that way was just stupid, especially when pretty much none of it ever had to interact directly with 3rd party code (and even if it had, it should have been done through strictly defined wrapper libraries; it doesn't make sense that 3rd parties would hook into the middle of your codebase). Seemingly simple decisions can have _huge_ consequences - especially when that decision affects millions of lines of code, and that's definitely the case with some of the decisions made for C. Some of them may have been unavoidable given the hardware situation and programming climate at the time that C was created, but we've been paying for them ever since. And unfortunately, the way things are going at this point, nothing will ever really overthrow C. We'll have to deal with it on some level for a long, long time to come. - Jonathan M Davis
Re: Parallel execution of unittests
On Mon, 05 May 2014 10:00:54 +0000 bearophile via Digitalmars-d digitalmars-d@puremagic.com wrote: Walter Bright: D has so many language features, we need a higher bar for adding new ones, especially ones that can be done straightforwardly with existing features. If I am not wrong, all this is needed here is a boolean compile-time flag, like __is_main_module. I think this is a small enough feature and gives enough back that saves time, to deserve to be a built-in feature. I have needed this for four or five years and the need/desire isn't going away. As far as I can tell, adding a feature wouldn't add much over simply using a version block for defining your demos. Just because something is done in python does not mean that it is appropriate for D or that it requires adding features to D in order to support it. Though I confess that I'm biased against it, because not only have I never needed the feature that you're looking for, but I'd actually consider it bad practice to organize code that way. It makes no sense to me to make it so that any arbitrary module can be the main module for the program. Such code should be kept separate IMHO. And I suspect that most folks who either haven't done much with python and/or who don't particularly like python would agree with me. Maybe even many of those who use python would; I don't know. Regardless, I'd strongly argue that this is a case where using user-defined versions is the obvious answer. It may not give you what you want, but it gives you what you need in order to make it so that a module has a main that's compiled in only when you want it to be. And D is already quite complicated. New features need to pass a high bar, and adding a feature just so that something is built-in rather than using an existing feature which solves the problem fairly simply definitely does not pass that bar IMHO. I'm completely with Walter on this one. - Jonathan M Davis
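The version-block approach being advocated might look like the following sketch (the module and version names are made up for illustration; normally the version would be set on the command line with dmd -version=demoMain rather than in the source):

```d
// mymodule.d - a reusable module that gains a demo entry point only when
// the demoMain version identifier is set.
module mymodule;

version = demoMain; // stand-in for: dmd -version=demoMain mymodule.d

int triple(int x) { return x * 3; }

version (demoMain)
{
    void main()
    {
        import std.stdio : writeln;
        writeln("demo: triple(14) = ", triple(14));
    }
}
```

Without the version identifier set, the module compiles as a plain library module with no main.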
Re: Scenario: OpenSSL in D language, pros/cons
On Mon, 05 May 2014 10:24:27 +0000 via Digitalmars-d digitalmars-d@puremagic.com wrote: On Monday, 5 May 2014 at 09:32:40 UTC, JR wrote: On Sunday, 4 May 2014 at 21:18:24 UTC, Daniele M. wrote: And then comes my next question: except for that malloc-hack, would it have been possible to write it in @safe D? I guess that if not, module(s) could have been made un-@safe. Not saying that a similar separation of concerns was not possible in OpenSSL itself, but that D could have made it less development-expensive in my opinion. TDPL SafeD visions notwithstanding, @safe is very very limiting. I/O is forbidden so simple Hello Worlds are right out, let alone advanced socket libraries. I/O is not forbidden, it's just that writeln and friends currently can't be made safe, but that is being worked on AFAIK. While I/O usually goes through the OS, the system calls can be manually verified and made @trusted. As the underlying OS calls are all C functions, there will always be @system code involved in I/O, but in most cases, we should be able to wrap those functions in D functions which are @trusted. Regardless, I would think that SSL could be implemented without sockets - that is, all of its operations should be able to operate on arbitrary data regardless of whether that data is sent over a socket or not. And if that's the case, then even if the socket operations themselves had to be @system, then everything else should still be able to be @safe. Most of the problems with @safe stem either from library functions that don't use it like they should, or because the compiler does not yet do a good enough job with attribute inference on templated functions. Both problems are being addressed, so the situation will improve over time.
Regardless, there's nothing fundamentally limited about @safe except for operations which are actually unsafe with regards to memory, and any case where something isn't @safe when it's actually memory safe should be and will be fixed (as well as any situation which isn't memory safe but is considered @safe anyway - we do unfortunately still have a few of those). - Jonathan M Davis
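The wrapping pattern described above might be sketched like this (assuming a POSIX system; safeWrite is a made-up name, not a Phobos function):

```d
// The raw C call is @system, but a narrow D wrapper can be @trusted,
// because the pointer/length pair comes from a D slice whose bounds are
// valid by construction - so the rest of the code can stay @safe.
import core.sys.posix.unistd : write;
import std.string : representation;

@trusted auto safeWrite(int fd, const(ubyte)[] data)
{
    // Justification for @trusted: data.ptr and data.length describe valid
    // memory, so the @system call cannot be made to overrun a buffer.
    return write(fd, data.ptr, data.length);
}

@safe void main()
{
    safeWrite(1, "hello from @safe code\n".representation);
}
```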
Re: Thread name conflict
On Mon, 05 May 2014 15:55:13 +0400 Dmitry Olshansky via Digitalmars-d digitalmars-d@puremagic.com wrote: Why the heck should internal symbols conflict with public from other modules? No idea. Because no one has been able to convince Walter that it's a bad idea for private symbols to be visible. Instead, we've kept the C++ rules for that, and they interact very badly with module-level symbols - something that C++ doesn't have to worry about. Unfortunately, as I understand it, fixing it isn't quite as straightforward as making private symbols invisible. IIRC, Martin Nowak had a good example as to why as well as a way to fix the problem, but unfortunately, I can't remember the details now. Regardless, I think that most of us agree that the fact that private symbols conflict with those from other modules is highly broken. And it makes it _very_ easy to break code by making any changes to a module's implementation. The question is how to convince Walter. It'll probably require that someone just go ahead and implement it and then argue about a concrete implementation rather than arguing about the idea. - Jonathan M Davis
Re: Parallel execution of unittests
On Mon, 05 May 2014 11:26:29 +0000 bearophile via Digitalmars-d digitalmars-d@puremagic.com wrote: Jonathan M Davis: Such code should be kept separate IMHO. This means that you now have two modules, so to download them atomically you need some kind of packaging, like a zip. If your project is composed by many modules this is not a problem. But if you have a single module project (and this happens often in Python), going from 1 to 2 files is not nice. I have written tens of reusable D modules, and some of them have a demo or are usable stand-alone when you have simpler needs. Honestly, I wouldn't even consider distributing something that was only a single module in size unless it were on the scale of std.datetime, which we've generally agreed is too large for a single module. So, a single module wouldn't have enough functionality to be worth distributing. And even if I were to distribute such a module, I'd let its documentation speak for itself and otherwise just expect the programmer to read the code. Regardless, the version specifier makes it easy to have a version where main is defined for demos or whatever else you might want to do with it. So, I'd suggest just using that. I highly doubt that you'd be able to talk either Walter or Andrei into supporting a separate feature for this. At this point, we're trying to use what we already have to implement new things rather than adding new features to the language, no matter how minor it might seem. New language features are likely to be restricted to things where we really need them to be language features. And this doesn't fit that bill. - Jonathan M Davis
Re: Thread name conflict
On Mon, 05 May 2014 13:11:29 + Dicebot via Digitalmars-d digitalmars-d@puremagic.com wrote: On Monday, 5 May 2014 at 12:48:11 UTC, Jonathan M Davis via Digitalmars-d wrote: On Mon, 05 May 2014 15:55:13 +0400 Dmitry Olshansky via Digitalmars-d digitalmars-d@puremagic.com wrote: Why the heck should internal symbols conflict with public from other modules? No idea. Because no one has been able to convince Walter that it's a bad idea for private symbols to be visible. Instead, we've kept the C++ rules for that, and they interact very badly with module-level symbols - something that C++ doesn't have to worry about. As far as I know Walter does not object changes here anymore. It is only matter of agreeing on final design and implementing. Well, that's good to hear. Unfortunately, as I understand it, fixing it isn't quite as straightforward as making private symbols invisible. IIRC, Martin Nowak had a good example as to why as well as a way to fix the problem, but unfortunately, I can't remember the details now. I remember disagreeing with Martin about handling protection checks from template instances. Those are semantically verified at declaration point but actual instance may legitimately need access to private symbols of instantiating module (think template mixins). Probably there were other corner cases but I can't remember those I have not been arguing about :) IIRC, it had something to do with member functions, but I'd have to go digging through the newsgroup archives for the details. In general though, I think that private symbols should be ignored by everything outside of the module unless we have a very good reason to do otherwise. Maybe they should still be visible for the purposes of reflection or some other case where seeing the symbols would be useful, but they should never conflict with anything outside of the module without a really good reason. Anyway, DIP22 is on agenda for DMD 2.067 so this topic is going to be back to hot state pretty soon. 
It's long past time that we got this sorted out. - Jonathan M Davis
Re: Get object address when creating it in for loop
On Mon, 05 May 2014 16:15:42 +0000 hardcoremore via Digitalmars-d digitalmars-d@puremagic.com wrote: How to get an address of newly created object and put it in pointer array? int maxNeurons = 100; Neuron*[] neurons = new Neuron*[](maxNeurons); Neuron n; for(int i = 0; i < maxNeurons; i++) { n = new Neuron(); neurons[i] = &n; // here &n always returns same address } writefln("Thread func complete. Len: %s", neurons); This script above will print array with all the same address values, why is that? &n gives you the address of the local variable n, not of the object on the heap that it points to. You don't normally get at the address of class objects in D. There's rarely any reason to. Classes always live on the heap, so they're already references. Neuron* is by definition a pointer to a class _reference_ not to an instance of Neuron. So, you'd normally do Neuron[] neurons; for your array. I very much doubt that you really want an array of Neuron*. IIRC, you _can_ get at an address of a class instance by casting its reference to void*, but I'm not sure, because I've never done it. And even then, you're then using void*, not Neuron*. Also FYI, questions like this belong in D.learn. The D newsgroup is for general discussions about D, not for questions related to learning D. - Jonathan M Davis
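A sketch of what the answer suggests - storing the class references themselves rather than Neuron* pointers:

```d
// Classes in D are reference types: a Neuron variable is already a
// reference to the heap instance, so an array of Neuron is what's wanted.
class Neuron {}

void main()
{
    enum maxNeurons = 100;
    auto neurons = new Neuron[](maxNeurons);
    foreach (i; 0 .. maxNeurons)
        neurons[i] = new Neuron(); // each element refers to a distinct instance

    // Casting a reference to void* yields the instance's address;
    // distinct instances live at distinct addresses.
    assert(cast(void*) neurons[0] !is cast(void*) neurons[1]);
}
```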
Re: Scenario: OpenSSL in D language, pros/cons
On Tue, 06 May 2014 09:56:11 +0200 Timon Gehr via Digitalmars-d digitalmars-d@puremagic.com wrote: On 05/05/2014 12:41 PM, Jonathan M Davis via Digitalmars-d wrote: Regardless, there's nothing fundamentally limited about @safe except for operations which are actually unsafe with regards to memory What does 'actually unsafe' mean? @safe will happily ban statements that will never 'actually' result in memory corruption. @safe is supposed to ban any code where it can't prove that it can't access memory that it shouldn't - be it to read from it or to write to it - as well as code that it can't prove will not result in other code accessing memory that it shouldn't have access to. It's quite possible that some code which shouldn't be being banned is currently being banned, and it may be the case that we'll be stuck with some code being banned simply because we can't make the compiler smart enough to know that it doesn't need to ban it (though hopefully that's kept to a minimum). But @trusted is there precisely for the situations where an operation is unsafe in the general case but is perfectly safe in a specific case based on information that the compiler does not have access to but the programmer does (such as knowing that certain values will never be passed to the given function). However, I don't think that there's much question that at this point @safe isn't quite correct. There are definitely cases right now that are considered @safe when they shouldn't be (e.g. https://issues.dlang.org/show_bug.cgi?id=8838 ), and I expect that there are also cases where code is considered @system when the compiler should consider it @safe (though I don't know of any of those off the top of my head). - Jonathan M Davis
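A made-up sketch of the kind of case described: pointer arithmetic is @system in general, but here the programmer has checked an invariant the compiler can't see, so a narrow @trusted function can vouch for it (elementAt is a hypothetical name):

```d
// Pointer arithmetic is banned in @safe code because it can go out of
// bounds in general; a small @trusted function can vouch for a specific,
// checked case.
@trusted int elementAt(int[] arr, size_t i)
{
    assert(i < arr.length); // the invariant that justifies @trusted
    return *(arr.ptr + i);  // @system operation, safe under the check
}

@safe void main()
{
    assert(elementAt([10, 20, 30], 1) == 20);
}
```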
Re: [OT] DConf - How to survive without a car?
On Tue, 06 May 2014 10:20:45 +0800 Lionello Lunesu via Digitalmars-d digitalmars-d@puremagic.com wrote: Hi all, After last year's incident with my tires getting slashed, I'm really hoping I can do without a car during this year's DConf. How feasible is this? I'll be staying at Aloft. Would be great if there's someone I can share a ride with. I've also seen there's a public bus going more or less to FB and back, so I should be good there. (Right?) But how about getting to SFO or down town? Am I causing myself a whole lot of pain (albeit of a different kind) by not renting a car? To be clear, I'm not looking for an economical option, just peace of mind. While getting around in that area without a car is certainly possible, it's also going to be far more of a pain, and the odds of your tires being slashed even once are quite low, let alone a second time. So, I'd suggest that you just rent a car and not worry about your tires getting slashed. Clearly, you were quite unlucky last year, but I'd be very surprised if you were that unlucky again. Obviously, it's up to you though. - Jonathan M Davis
From fields missing names in mailing list
Ever since the mailing list software was changed to say sender via Digitalmars-d, a number of the messages have been from via Digitalmars-d - they're missing the actual sender. And for many of them, the person who sent the message didn't bother to put a signature on it, making it so that you can't tell who it's from. One example of this is the Enforced @nogc for dtors? thread. Several of the messages are missing the sender. However, if I look at forum.dlang.org, there are names on those posts - e.g. the two posts by Ola Fosheim Grøstad have his name on them in the forum, but his name does not show up before via Digitalmars-d in the From field in the e-mails. So, I'm guessing that he posted via the forum, and his name got lost somehow when the message was translated into an e-mail for the mailing list. I don't know if this is a problem with the mailing list software, the forum software, or both, but it makes it hard to follow who's saying what. It's bad enough that everyone's names now have via Digitalmars-d tacked onto the end of them (since that definitely makes the From field harder to read), but not even putting the sender's name in it makes the thread impossible to follow. So, I'd appreciate it if we could get this problem sorted out sometime soon. Thanks. - Jonathan M Davis
Re: From slices to perfect imitators: opByValue
On Wed, 07 May 2014 20:58:21 -0700 Andrei Alexandrescu via Digitalmars-d digitalmars-d@puremagic.com wrote: So there's this recent discussion about making T[] be refcounted if and only if T has a destructor. That's an interesting idea. More generally, there's the notion that making user-defined types as powerful as built-in types is a Good Thing(tm). Which brings us to something that T[] has that user-defined types cannot have. Consider: import std.stdio; void fun(T)(T x) { writeln(typeof(x).stringof); } void main() { immutable(int[]) a = [ 1, 2 ]; writeln(typeof(a).stringof); fun(a); } This program outputs: immutable(int[]) immutable(int)[] which means that the type of that value has subtly and silently changed in the process of passing it to a function. This change was introduced a while ago (by Kenji I recall) and it enabled a lot of code that was gratuitously rejected. This magic of T[] is something that custom ranges can't avail themselves of. In order to bring about parity, we'd need to introduce opByValue which (if present) would be automatically called whenever the object is passed by value into a function. This change would allow library designers to provide good solutions to making immutable and const ranges work properly - the way T[] works. There are of course a bunch of details to think about and figure out, and this is a large change. Please chime in with thoughts. Thanks! As far as I can see, opByValue does the same thing as opSlice, except that it's used specifically when passing to functions, whereas this code immutable int[] a = [1, 2, 3]; immutable(int)[] b = a[]; or even immutable int[] a = [1, 2, 3]; immutable(int)[] b = a; compiles just fine. So, I don't see how adding opByValue helps us any. Simply calling opSlice implicitly for user-defined types in the same places that it's called implicitly on arrays would solve that problem. We may even do some of that already, though I'm not sure.
The core problem in either case is that const(MyStruct!T) has no relation to MyStruct!(const T) or even const(MyStruct!(const T)). They're different template instantiations and therefore can have completely different members. So, attempts to define opSlice such that it returns a tail-const version of the range tend to result in recursive template instantiations which then blow the stack (or maybe error out due to too many levels - I don't recall which at the moment - but regardless, it fails). I think that careful and clever use of static ifs could resolve that, but that's not terribly pleasant. At best, it would result in an idiom that everyone would have to look up exactly how to do correctly every time they needed to define opSlice. Right now, you'd have to declare something like struct MyRange(T) { ... static if(isMutable!T) MyRange!(const T) opSlice() const {...} else MyRange opSlice() const {...} ... } and I'm not even sure that that quite works, since I haven't even attempted to define a tail-const opSlice recently. Whereas ideally, you'd just do something more like struct MyRange(T) { ... MyRange!(const T) opSlice() const {...} ... } but that doesn't currently work due to recursive template instantiations. I don't know quite how we can make it work (maybe making the compiler detect when MyRange!T and MyRange!(const T) are effectively identical), but I think that that's really the problem that we need to solve, not coming up with a new function, because opSlice is already there to do what we need (though it may need to have some additional implicit calls added to it to make it match when arrays are implicitly sliced). Regardless, I concur that this is a problem that sorely needs solving. Without it, const and ranges really don't mix at all. - Jonathan M Davis
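For what it's worth, the static if idiom sketched above might be fleshed out as follows; the post itself says this is untested territory, so treat it as an assumption rather than a known-good pattern:

```d
// Tail-const opSlice via static if: the mutable-T instantiation returns
// MyRange!(const T), while the already-const instantiation returns itself,
// which avoids the recursive-instantiation blowup described above.
import std.traits : isMutable;

struct MyRange(T)
{
    T[] data;

    static if (isMutable!T)
        MyRange!(const T) opSlice() const
        {
            // Same elements, but the slice "head" is mutable again.
            return MyRange!(const T)(data);
        }
    else
        MyRange opSlice() const
        {
            return MyRange(data);
        }
}

void main()
{
    const r = MyRange!int([1, 2, 3]);
    MyRange!(const int) s = r[]; // tail-const slice of a const range
    assert(s.data == [1, 2, 3]);
    s.data = s.data[1 .. $];     // the head is mutable; elements are not
    assert(s.data[0] == 2);
}
```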
Re: From slices to perfect imitators: opByValue
On Thu, 08 May 2014 06:48:57 +0000 bearophile via Digitalmars-d digitalmars-d@puremagic.com wrote: Currently only the slices decay into mutables, while an immutable int doesn't become mutable: That's because what's happening is that the slice operator for arrays is defined to return a tail-const slice of the array, and then any time you slice it - be it explicit or implicit - you get a tail-const slice. It really has nothing to do with passing an argument to a function beyond the fact that that triggers an implicit call to the slice operator. For feature parity here, what we really should be looking at is how to make opSlice have feature parity, not adding a new function. And ints can't be sliced, so there is no situation where you end up with a tail-const slice of an int. - Jonathan M Davis
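The behavior being described can be checked directly:

```d
// Slicing an immutable array - explicitly or implicitly - yields the
// tail-immutable type: the elements stay immutable, but the slice itself
// (pointer and length) is mutable.
void main()
{
    immutable int[] a = [1, 2, 3];
    static assert(is(typeof(a[]) == immutable(int)[]));

    immutable(int)[] b = a; // implicit slice on assignment
    b = b[1 .. $];          // rebinding the slice head is fine
    assert(b[0] == 2);
    // b[0] = 5;            // error: the elements are still immutable
}
```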
Re: From slices to perfect imitators: opByValue
On Thu, 08 May 2014 12:38:44 +0200 Timon Gehr via Digitalmars-d digitalmars-d@puremagic.com wrote: On 05/08/2014 08:55 AM, Jonathan M Davis via Digitalmars-d wrote: As far as I can see, opByValue does the same thing as opSlice, except that it's used specifically when passing to functions, whereas this code immutable int[] a = [1, 2, 3]; immutable(int)[] b = a[]; or even immutable int[] a = [1, 2, 3]; immutable(int)[] b = a; compiles just fine. So, I don't see how adding opByValue helps us any. Simply calling opSlice implicitly for user-defined types in the same places that it's called implicitly on arrays would solve that problem. We may even do some of that already, though I'm not sure. Automatic slicing on function call is not what actually happens. You can see this more clearly if you pass an immutable(int*) instead of an immutable(int[]): there is no way to slice the former and the mechanism that determines the parameter type to be immutable(int)* is the same. (And indeed doing the opSlice would have undesirable side-effects, for example, your pet peeve, implicit slicing of stack-allocated static arrays would happen on any IFTI call with a static array argument. :o) Also, other currently idiomatic containers couldn't be passed to functions.) Ah, you're right. The fact that slicing an array results in a tail-const array is part of the equation, but implicit slicing isn't necessarily part of it (though in the case of IFTI, you still get an implicit slice - it's just that that's a side effect of what type the parameter is inferred as). Still, the core problem is that MyRange!(const T) is not the same as const(MyRange!T), and that needs to be solved for opSlice to work properly.
I'd still be inclined to try and just solve the problem with opSlice rather than introducing opByValue (maybe by having a UDA on opSlice which indicates that it should be implicitly sliced in the same places that a dynamic array would?), but as you point out, the problem is unfortunately a bit more complicated than that. - Jonathan M Davis
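To make the opSlice-based approach concrete, here is a minimal sketch of what a user-defined range with a tail-const opSlice could look like. The names (MyRange, takesRange) are illustrative only; the open question in the thread is whether such an opSlice should be called implicitly in the same places it is for built-in arrays.

```d
// Hypothetical sketch: a range whose opSlice returns a tail-const view,
// mirroring what slicing does for built-in arrays.
struct MyRange(T)
{
    T[] data;

    // Slicing a const MyRange!T yields a MyRange!(const T) over the same
    // data, just as slicing a const(T[]) yields a const(T)[].
    MyRange!(const T) opSlice() const
    {
        return MyRange!(const T)(data);
    }
}

void takesRange(R)(R r) {}

void main()
{
    const MyRange!int r;
    takesRange(r[]); // explicit slicing works today; the debate is
                     // whether r alone should trigger opSlice implicitly
}
```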
Re: From slices to perfect imitators: opByValue
On Thu, 08 May 2014 14:48:18 +0200 Sönke Ludwig via Digitalmars-d digitalmars-d@puremagic.com wrote: On 08.05.2014 13:05, monarch_dodra wrote: On Thursday, 8 May 2014 at 07:09:24 UTC, Sönke Ludwig wrote: Just a general note: This is not only interesting for range/slice types, but for any user defined reference type (e.g. RefCounted!T or Isolated!T). Not necessarily: As soon as indirections come into play, you are basically screwed, since const is turtles all the way down. So for example, the conversion from const RefCounted!T to RefCounted!(const T) is simply not possible, because it strips the const-ness of the ref count. What we would *really* need here is NOT: const RefCounted!T -> RefCounted!(const T) But rather RefCounted!T -> RefCounted!(const T) The idea is to cut out the head const directly. This also applies to most ranges too BTW. We'd be much better off if we never used `const MyRange!T` to begin with, but simply had a conversion from `MyRange!T` to `MyRange!(const T)`, which references the same data. In fact, I'm wondering if this might not be a more interesting direction to explore. The reference count _must_ be handled separately from the payload's const-ness, or a const(RefCount!T) would be completely dysfunctional. Unless the reference count is completely separate from const(RefCount!T) (which would mean that the functions which accessed it couldn't be pure - otherwise they couldn't access the reference count), const(RefCount!T) _is_ completely dysfunctional. The fact that D's const can't be cast away in order to do mutation without violating the type system means that pretty much anything involving references or pointers is dysfunctional with const if you ever need to mutate any of it (e.g. for a mutex or a reference count). std.typecons.RefCounted is completely broken if you make it const, and I would expect that of pretty much any wrapper object intended to do reference counting.
Being able to do RefCounted!T -> RefCounted!(const T) makes some sense, but const(RefCounted!T) pretty much never makes sense. That being said, unlike monarch_dodra, I think that it's critical that we find a way to do the equivalent of const(T[]) -> const(T)[] with ranges - i.e. const(Range!T) or const(Range!(const T)) -> Range!(const T). I don't think that Range!T -> Range!(const T) will be enough at all. It's not necessarily the case that const(Range!T) -> Range!(const T) would always work, but it's definitely the case that it would work if the underlying data was in an array, and given what it takes for a forward range to work, it might even be the case that a forward range could be made to do that conversion by definition. The problem is the actual mechanism of converting const(Range!T) to Range!(const T) in the first place (due to recursive template instantiations and the fact that the compiler doesn't understand that they're related). The concept itself is perfectly sound in many cases - unlike with const(RefCounted!T) - because in most cases, with a range, it's the data being referred to which needs to stay const, whereas the bookkeeping stuff for it can be copied and thus be made mutable. With a reference count, however, you have to mutate what's actually being pointed to rather than being able to make a copy and mutate that. So, const(RefCounted!T) -> RefCounted!(const T) will never work - not unless the reference is outside of const(RefCounted!T), in which case, it can't be pure, which can be just as bad as not working with const. - Jonathan M Davis
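To illustrate why the const(Range!T) to Range!(const T) conversion can't happen automatically: to the type system, the two instantiations are unrelated aggregates, even though a human can see that the conversion is sound when the bookkeeping can simply be copied. The struct below is a made-up example, not code from the thread.

```d
// Illustrative only: Range!(const int) and const(Range!int) are unrelated
// types as far as the compiler is concerned.
struct Range(T)
{
    T[] data;          // the payload, which must stay const
    size_t frontIndex; // bookkeeping: copyable, so it can be mutable again
}

void main()
{
    const Range!int cr = Range!int([1, 2, 3], 0);
    // Range!(const int) r = cr; // error: no implicit conversion exists
    // The sound conversion has to be written out by hand, field by field:
    Range!(const int) r = Range!(const int)(cr.data, cr.frontIndex);
    assert(r.data == [1, 2, 3]);
}
```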
Re: From slices to perfect imitators: opByValue
On Thu, 08 May 2014 16:33:06 + David Nadlinger via Digitalmars-d digitalmars-d@puremagic.com wrote: On Thursday, 8 May 2014 at 16:30:13 UTC, Sönke Ludwig wrote: For what practical reason would that be the case? I know that the spec states undefined behavior, but AFAICS, there is neither an existing, nor a theoretical reason, why this should fail: Compiler optimizations based on immutability. Or even just based on const. Optimizations based on const are going to be rarer, because other objects of the same type but which are mutable could refer to the same object and thus mutate it, but if the object is thread-local (as is the default), then there will still be some cases where the compiler will be able to assume that the object isn't mutated even if immutable isn't involved at all. If you're even attempting to cast away const and then mutate the object, you need to have a really good understanding of how the compiler could even theoretically optimize based on const (especially since even if an optimization isn't done now, and your code works, it could be added later and break your code). So, I'd strongly argue that casting away const from an object and mutating it is a fundamentally broken idiom in D. You may have a better chance of avoiding blowing your foot off if immutable isn't involved, but you still risk serious bugs. - Jonathan M Davis
Re: From slices to perfect imitators: opByValue
On Thu, 08 May 2014 17:18:03 +0200 Sönke Ludwig via Digitalmars-d digitalmars-d@puremagic.com wrote: Right, which is my point: const(RefCount!T) *is* dysfunctional, which is why you'd want to skip it out entirely in the first place. This holds true for types implemented with RefCount, such as Array and Array.Range. Okay, I didn't know that. For various reasons (mostly weak ref support) I'm using my own RefCount template, which casts away const-ness of the reference counter internally. Which technically violates the type system and isn't something that should be done - though you _should_ be able to get away with it as long as immutable isn't involved. Still, the compiler is permitted to assume that const objects aren't mutated (because that's what const is supposed to guarantee), so you're risking subtle bugs due to compiler optimizations and whatnot. - Jonathan M Davis
Re: From slices to perfect imitators: opByValue
On Thu, 08 May 2014 17:39:25 +0200 Sönke Ludwig via Digitalmars-d digitalmars-d@puremagic.com wrote: On 08.05.2014 16:22, Jonathan M Davis via Digitalmars-d wrote: On Thu, 08 May 2014 14:48:18 +0200 Sönke Ludwig via Digitalmars-d digitalmars-d@puremagic.com wrote: On 08.05.2014 13:05, monarch_dodra wrote: On Thursday, 8 May 2014 at 07:09:24 UTC, Sönke Ludwig wrote: Just a general note: This is not only interesting for range/slice types, but for any user defined reference type (e.g. RefCounted!T or Isolated!T). Not necessarily: As soon as indirections come into play, you are basically screwed, since const is turtles all the way down. So for example, the conversion from const RefCounted!T to RefCounted!(const T) is simply not possible, because it strips the const-ness of the ref count. What we would *really* need here is NOT: const RefCounted!T -> RefCounted!(const T) But rather RefCounted!T -> RefCounted!(const T) The idea is to cut out the head const directly. This also applies to most ranges too BTW. We'd be much better off if we never used `const MyRange!T` to begin with, but simply had a conversion from `MyRange!T` to `MyRange!(const T)`, which references the same data. In fact, I'm wondering if this might not be a more interesting direction to explore. The reference count _must_ be handled separately from the payload's const-ness, or a const(RefCount!T) would be completely dysfunctional. Unless the reference count is completely separate from const(RefCount!T) (which would mean that the functions which accessed it couldn't be pure - otherwise they couldn't access the reference count), const(RefCount!T) _is_ completely dysfunctional. The fact that D's const can't be cast away in order to do mutation without violating the type system means that pretty much anything involving references or pointers is dysfunctional with const if you ever need to mutate any of it (e.g. for a mutex or a reference count).
std.typecons.RefCounted is completely broken if you make it const, and I would expect that of pretty much any wrapper object intended to do reference counting. Being able to do RefCounted!T -> RefCounted!(const T) makes some sense, but const(RefCounted!T) pretty much never makes sense. That being said, unlike monarch_dodra, I think that it's critical that we find a way to do the equivalent of const(T[]) -> const(T)[] with ranges - i.e. const(Range!T) or const(Range!(const T)) -> Range!(const T). I don't think that Range!T -> Range!(const T) will be enough at all. It's not necessarily the case that const(Range!T) -> Range!(const T) would always work, but it's definitely the case that it would work if the underlying data was in an array, and given what it takes for a forward range to work, it might even be the case that a forward range could be made to do that conversion by definition. The problem is the actual mechanism of converting const(Range!T) to Range!(const T) in the first place (due to recursive template instantiations and the fact that the compiler doesn't understand that they're related). The concept itself is perfectly sound in many cases - unlike with const(RefCounted!T) - because in most cases, with a range, it's the data being referred to which needs to stay const, whereas the bookkeeping stuff for it can be copied and thus be made mutable. With a reference count, however, you have to mutate what's actually being pointed to rather than being able to make a copy and mutate that. So, const(RefCounted!T) -> RefCounted!(const T) will never work - not unless the reference is outside of const(RefCounted!T), in which case, it can't be pure, which can be just as bad as not working with const. - Jonathan M Davis Unless I'm completely mistaken, it's safe to cast away const when it is known that the original reference was constructed as mutable. Anyway, this is what I do in my own RefCount struct.
But my main point was that any user defined reference type is affected by the head vs. tail const issue, not just range types. So a decent solution should solve it for all of those types. Part of my point was that getting a tail-const slice of a range is fundamentally different from trying to get tail-const behavior with a ref-counted struct. With a range, the bookkeeping that would end up being mutable in the tail-const slice can usually be copied in order to be mutable and still work properly. In a ref-counted struct, however, the ref-count is a reference or pointer, and copying it wouldn't help. You need to be able to mutate the same count that every other reference to the same object is using. And that requires casting away const (thus violating the type system) rather than being able to make a copy. So, while it might be the case that the issue of tail-constness can be generalized beyond slices in a useful way, I don't think that it applies to reference
Re: From slices to perfect imitators: opByValue
On Thu, 08 May 2014 18:10:28 +0200 Timon Gehr via Digitalmars-d digitalmars-d@puremagic.com wrote: On 05/08/2014 06:02 PM, monarch_dodra wrote: If you have const data referencing mutable data, then yes, you can cast away all the const you want, but at that point, it kind of makes the whole const thing moot. This is not guaranteed to work. I guess the only related thing that is safe to do is casting away const, but then not modifying the memory. Exactly. It's effectively illegal to cast away const and then mutate the object. The compiler lets you do it, because D is a systems language, but the compiler is free to assume that the object wasn't modified, so unless you know what you're doing and are very, very careful, you're risking subtle bugs. Really, casting away const and then mutating the now-mutable object is not something that you should ever be doing. - Jonathan M Davis
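For clarity, the broken idiom being discussed looks like this. It compiles (D being a systems language lets you write the cast), but the compiler remains free to assume the const reference was never used to mutate anything. This is an illustrative snippet, not code from the thread.

```d
// The broken idiom: casting away const and then mutating. Do not do this.
void mutateAnyway(const(int)* p)
{
    // Compiles, but the compiler may optimize on the assumption that *p is
    // unchanged through a const reference. If p actually points to
    // immutable data, this is flatly undefined behavior.
    *(cast(int*) p) = 42;
}

void main()
{
    int x = 1;
    mutateAnyway(&x); // "works" today, but relies on unspecified behavior
}
```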
Re: The Current Status of DQt
On Fri, 09 May 2014 09:56:09 + Kagamin via Digitalmars-d digitalmars-d@puremagic.com wrote: Please see this public service announcement: http://xkcd.com/1179/ Though it lists 20130227 as discouraged format, but it's a valid ISO 8601 format, and phobos Date.toISOString generates string in that format: http://dlang.org/phobos/std_datetime.html#.Date.toISOString Yes, it's supported, because it's standard, but it's preferred that toISOExtString be used precisely because the non-extended format is not only discouraged, but it's harder to read (which is probably why it's discouraged). - Jonathan M Davis
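The difference between the two Phobos formatting functions mentioned above, shown concretely:

```d
import std.datetime : Date;

void main()
{
    auto d = Date(2013, 2, 27);
    assert(d.toISOString()    == "20130227");   // valid ISO 8601, but discouraged
    assert(d.toISOExtString() == "2013-02-27"); // extended format: prefer this
}
```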
Re: range behaviour
On Tue, 13 May 2014 18:38:44 +0200 Benjamin Thaut via Digitalmars-d digitalmars-d@puremagic.com wrote: I know that there was a recent discussion about how the methods of ranges should behave. E.g. - Does empty always have to be called before calling front or popFront? Certainly, ranges are pretty much always used this way, but there was some debate as to whether empty could have work done in it (and thus _had_ to be called). However, I believe that the consensus was that yes, empty had to be called (certainly, both Walter and Andrei felt that way). - Is it allowed to call front multiple times between two calls to popFront? Definitely. _Lots_ of range-based code would break otherwise - though there are cases where that can cause problems depending on what you rely on (e.g. map!(a => to!string(a)) will return equal strings, but they aren't the _same_ string). Was there a result of that discussion? Is it documented somewhere? AFAIK, there's just the semi-recent newsgroup discussion on the matter, though maybe someone put something up on the wiki. - Jonathan M Davis
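The map caveat above can be demonstrated directly: each call to front re-runs the lambda, so the results compare equal but are distinct allocations.

```d
import std.algorithm : map;
import std.conv : to;

void main()
{
    auto r = map!(a => to!string(a))([1, 2, 3]);
    auto s1 = r.front;
    auto s2 = r.front; // front is recomputed on each call
    assert(s1 == s2);  // equal contents...
    assert(s1 !is s2); // ...but not the _same_ string
}
```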
Re: range behaviour
On Tue, 13 May 2014 10:30:47 -0700 H. S. Teoh via Digitalmars-d digitalmars-d@puremagic.com wrote: Of course, for efficiency purposes range-based code (esp. Phobos code) should try their best to only call .front once. But it should be perfectly permissible to call .front multiple times. Oh, but that's part of the fun. If the range's front returns by ref or auto ref - or if it actually made front a public member variable - then it would be _less_ efficient to assign the result of front to a variable in order to avoid calling front multiple times. Much as ranges have a defined API, there's enough flexibility in the API and in how it's implemented for any given range that generalities about what is more or less efficient or which is more or less likely to avoid unintended behaviors aren't a sure thing - which is part of why that particular discussion was instigated in the first place and why discussions about stuff like whether front can be transitive or not keep popping up from time to time. In general, it's probably better to avoid calling front multiple times though - the issue with map being a prime example of where the problem can be worse than simply an issue of efficiency - and in most cases, ranges don't have a front which returns by ref or auto ref. But unfortunately, because that's just _most_ cases, it does make it so that we can't make a blanket statement about what is more or less efficient. - Jonathan M Davis
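A sketch of the by-ref case described above (RefFrontRange is a made-up name): when front returns by ref, calling it twice is just two cheap accesses, whereas caching it in a local forces a copy.

```d
// Hypothetical range whose front returns by ref - no copy on access.
struct RefFrontRange
{
    int[] data;
    ref int front() { return data[0]; }
    bool empty() const { return data.length == 0; }
    void popFront() { data = data[1 .. $]; }
}

void main()
{
    auto r = RefFrontRange([1, 2, 3]);
    r.front = 42;          // writing through the ref works directly;
    assert(r.front == 42); // "auto f = r.front;" would instead copy the value
}
```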
Re: range behaviour
On Tue, 13 May 2014 13:29:32 -0400 Steven Schveighoffer via Digitalmars-d digitalmars-d@puremagic.com wrote: On Tue, 13 May 2014 12:58:09 -0400, Jonathan M Davis via Digitalmars-d digitalmars-d@puremagic.com wrote: On Tue, 13 May 2014 18:38:44 +0200 Benjamin Thaut via Digitalmars-d digitalmars-d@puremagic.com wrote: I know that there was a recent discussion about how the methods of ranges should behave. E.g. - Does empty always have to be called before calling front or popFront? Certainly, ranges are pretty much always used this way, but there was some debate as to whether empty could have work done in it (and thus _had_ to be called). However, I believe that the consensus was that yes, empty had to be called (certainly, both Walter and Andrei felt that way). I don't agree there was a consensus. I think empty should not have to be called if it's already logically known that the range is not empty. The current documentation states that, and I don't think there was an agreement that we should change it, despite the arguments from Walter. In any case, I think generic code for an unknown range type in an unknown condition should have to call empty, since it cannot logically prove that it's not. Even if it was required, it would be an unenforceable policy, just like range.save. Yeah, and they're both cases where the common case will work just fine if you do it wrong but which will break in less common cases, meaning that the odds are much higher that the bug won't be caught. :( - Jonathan M Davis
Re: Memory allocation purity
On Wed, 14 May 2014 22:42:46 + Brian Schott via Digitalmars-d digitalmars-d@puremagic.com wrote: What is the plan for the pure-ity of memory management? Right now the new operator is considered to be pure even though it is not, but related functions like malloc, GC.addRange, GC.removeRange, and others are not. This prevents large portions of std.allocator from being pure. This then prevents any containers library built on std.allocator from being pure, which does the same for any functions and data structures written using those containers. If malloc can never be considered pure, even when hidden behind an allocator, why can it be considered pure when hidden behind the GC? I think malloc should definitely be considered pure for the same reasons that new is considered pure. I don't know about the other memory management functions though. I'd really have to think through their side effects to have an opinion on them. If we can make them pure though, that would certainly help with ensuring that allocators can be pure. - Jonathan M Davis
Re: Memory allocation purity
On Wed, 14 May 2014 17:00:39 -0700 Walter Bright via Digitalmars-d digitalmars-d@puremagic.com wrote: On 5/14/2014 3:42 PM, Brian Schott wrote: If malloc can never be considered pure, even when hidden behind an allocator, It cannot be pure as long as it can fail. why can it be considered pure when hidden behind the GC? Because GC failures are not recoverable, so the pure allocation cannot fail. Then we should create a wrapper for malloc which throws a MemoryError when malloc fails. Then malloc failures would be the same as GC failures. - Jonathan M Davis
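A rough sketch of the wrapper being proposed (mallocOrError is a hypothetical name): convert a null return from malloc into the same unrecoverable Error a failed GC allocation produces, so that failure is no longer an obstacle to treating it as pure.

```d
import core.exception : onOutOfMemoryError;
import core.stdc.stdlib : malloc;

// Hypothetical wrapper: malloc whose failure is unrecoverable, like the GC's.
void* mallocOrError(size_t size)
{
    auto p = malloc(size);
    if (p is null)
        onOutOfMemoryError(); // throws OutOfMemoryError, never returns
    return p;
}
```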
Re: Memory allocation purity
On Thu, 15 May 2014 01:33:34 + Idan Arye via Digitalmars-d digitalmars-d@puremagic.com wrote: On Wednesday, 14 May 2014 at 22:50:10 UTC, w0rp wrote: I think even C malloc should be considered pure. True, it affects global state by allocating memory, but it never changes existing values, it just allows for new values. free is pure because it isn't side-effecting, it deallocates what you give it. That's just my perspective on it though, others might have other views on it. `free` is not pure, because if you have a reference to that memory that reference is no longer valid. But does that really matter from the perspective of pure? That's really more of an @safety issue. There might be some way that that violates purity, but I can't think of one at the moment. free can't be strongly pure, because its arguments couldn't be immutable (or even const) without violating the type system, but I _think_ that it's fine for free to be weakly pure. It's quite possible that I'm missing something though. - Jonathan M Davis
Re: Memory allocation purity
On Thu, 15 May 2014 01:25:52 + Kapps via Digitalmars-d digitalmars-d@puremagic.com wrote: On Thursday, 15 May 2014 at 00:00:37 UTC, Walter Bright wrote: Because GC failures are not recoverable, so the pure allocation cannot fail. Is this intentionally the case? I always thought you could handle it with onOutOfMemoryError if you knew that your program could handle running out of memory and free some memory (i.e., cached data) to recover. I've never actually had a use for it myself, so I'm just basing this off of newsgroup discussions I remember (possibly incorrectly) reading. It's intended that all Errors be considered unrecoverable. You can catch them when necessary to try and do cleanup, but that's fairly risky, and you should probably only do it when you're very sure of what's going on (which generally means that the Error was thrown very close to where you're catching it). There's no guarantee that automatic cleanup occurs when Errors are thrown (unlike with Exceptions), so when an Error is thrown, the program tends to be in a weird state on top of the weird state that caused the Error to be thrown in the first place. In theory, you could catch an OutOfMemoryError very close to where it was thrown and deal with it, but that's really not what was intended. - Jonathan M Davis
Re: Memory allocation purity
On Thu, 15 May 2014 05:51:14 + via Digitalmars-d digitalmars-d@puremagic.com wrote: Yep, purity implies memoing. No, it doesn't. _All_ that it means when a function is pure is that it cannot access global or static variables unless they can't be changed after being initialized (e.g. they're immutable, or they're const value types), and it can't call any other functions which aren't pure. It means _nothing_ else. And it _definitely_ has nothing to do with functional purity. Now, combined with other information, you _can_ get functional purity out of it - e.g. if all the parameters to a function are immutable, then it _is_ functionally pure, and optimizations requiring functional purity can be done with that function. But by itself, pure means nothing of the sort. So, no, purity does _not_ imply memoization. - Jonathan M Davis
Re: Memory allocation purity
On Thu, 15 May 2014 07:22:02 + via Digitalmars-d digitalmars-d@puremagic.com wrote: On Thursday, 15 May 2014 at 06:59:08 UTC, Jonathan M Davis via Digitalmars-d wrote: And it _definitely_ has nothing to do with functional purity. Which makes it pointless and misleading. Now, combined with other information, you _can_ get functional purity out of it - e.g. if all the parameters to a function are immutable, then it _is_ functionally pure, and optimizations requiring functional purity can be done with that function. No, you can't say it is functionally pure if you can flip a coin with a pure function. To do that you would need a distinction between prove pure and assume pure as well as having immutable reference types that ban identity comparison. So, no, purity does _not_ imply memoization. It should, or use a different name. Originally, pure required that the function parameters be immutable in addition to disallowing the function from accessing global or static variables or calling functions that weren't pure. It allowed for mutation within the function, and it allowed for allocation via new, but from the outside, the function _was_ functionally pure. The problem was that it was almost useless. You just couldn't do anything in a pure function that mattered most of the time. You couldn't call any other functions from the pure function unless they were pure, which meant that the arguments to them had to be immutable. That just didn't work, because while the arguments to the first function were immutable, what it had to do internally often involved operating on other variables which were created within the function and which were not immutable and didn't need to be immutable - but you couldn't use them with any functions unless they were immutable, thanks to the fact that all pure functions had to have immutable parameters, and pure functions could only call pure functions. It just didn't work. So, Don introduced the idea of weak purity.
What it comes down to is that it's an extension of the concept that mutation within a pure function is fine just so long as its arguments aren't mutated. We made it so that pure functions _didn't_ have to have immutable parameters. They just couldn't access anything that wasn't passed to them as arguments. This meant that they could only mutate what they were given and thus they didn't violate the strong purity of the original pure function which had immutable parameters. e.g. string strongFunc(immutable string str, int i) pure { auto result = str ~ " hello world"; weakFunc(result, i); return result; } void weakFunc(ref string str, int i) pure { foreach(j; 0 .. i) str ~= to!string(j); } The strong guarantees that strongFunc has which make it functionally pure are not violated by the fact that weakFunc is _not_ functionally pure. But by marking it pure, it guarantees that it can safely be called from a strongly pure function without violating the guarantees of that strongly pure function. To do that, _all_ we need to guarantee is that the weakly pure function cannot access anything save for what's passed into it (since if it could access global variables, that would violate the guarantees of any other pure functions that called it), but we do need that guarantee. The result is that the pure attribute doesn't in and of itself mean functional purity anymore, but it _can_ be used to build a function which is functionally pure. You could argue that a different attribute should be used other than pure to mark weakly pure functions, but that would just complicate things. The compiler is capable of figuring out the difference between a weakly pure and strongly pure function on its own from the function signature alone. So, we only need one attribute - one to mark the fact that the function can't access global, mutable state and can't call any functions that can.
And we were already marking strongly pure functions with pure, so it made perfect sense to use it on weakly pure functions as well. At that point, it was just up to the compiler to detect whether the function was strongly pure or not and thus was functionally pure and could be used in optimizations. So, sorry that it offends your sensibilities that pure by itself does not indicate functional purity, but it's a building block for functional purity, and the evolution of things made it make perfect sense to use the pure attribute for this. And even if pure _didn't_ enable functional purity, it would still be highly useful just from the fact that a pure function (be it weak or strong) cannot access global variables, and that makes it _much_ easier to reason about code, because you know that it isn't accessing anything that wasn't passed to it. I recommend that you read this article by David Nadlinger: http://klickverbot.at/blog/2012/05/purity-in-d/ - Jonathan M Davis
Re: Memory allocation purity
On Thu, 15 May 2014 10:14:48 +0200 luka8088 via Digitalmars-d digitalmars-d@puremagic.com wrote: On 15.5.2014. 8:58, Jonathan M Davis via Digitalmars-d wrote: On Thu, 15 May 2014 05:51:14 + via Digitalmars-d digitalmars-d@puremagic.com wrote: Yep, purity implies memoing. No, it doesn't. _All_ that it means when a function is pure is that it cannot access global or static variables unless they can't be changed after being initialized (e.g. they're immutable, or they're const value types), and it can't call any other functions which aren't pure. It means _nothing_ else. And it _definitely_ has nothing to do with functional purity. Now, combined with other information, you _can_ get functional purity out of it - e.g. if all the parameters to a function are immutable, then it _is_ functionally pure, and optimizations requiring functional purity can be done with that function. But by itself, pure means nothing of the sort. So, no, purity does _not_ imply memoization. - Jonathan M Davis Um. Yes it does. http://dlang.org/function.html#pure-functions functional purity (i.e. the guarantee that the function will always return the same result for the same arguments) The fact that it should not be able to affect or be affected by the global state is not a basis for purity, but rather a consequence. Even other sources are consistent on this matter, and this is what purity by definition is. Then reread the paragraph at the top of the section of the documentation that you linked to: Pure functions are functions which cannot access global or static, mutable state save through their arguments. This can enable optimizations based on the fact that a pure function is guaranteed to mutate nothing which isn't passed to it, and in cases where the compiler can guarantee that a pure function cannot alter its arguments, it can enable full, functional purity (i.e. the guarantee that the function will always return the same result for the same arguments).
That outright says that pure only _can_ enable functional purity - in particular, when the compiler is able to guarantee that the function cannot mutate its arguments. pure itself, however, means nothing of the sort. The fact that pure functions cannot access global state _is_ the basis for functional purity when combined with the guarantee that the arguments cannot be mutated. If you get hung up on what the concept of functional purity is or what you thought pure was before using D, then you're going to have a hard time understanding what pure means in D. And yes, it's a bit weird, but it comes from the practical standpoint of how to make functional purity possible without being too restrictive to be useful. So, it really doesn't matter what other sources say about what purity means. That's not what D's pure means. D's pure is just a building block for what purity normally means. It makes it so that the compiler can detect functional purity and then optimize based on it, but it doesn't in and of itself have anything to do with functional purity. If the documentation isn't getting that across, then I guess that it isn't clear enough. But I would have thought that the part that said and in cases where the compiler can guarantee that a pure function cannot alter its arguments, it can enable full, functional purity would have made it clear that D's pure is _not_ functionally pure by itself. The first part of the paragraph says what pure really means: Pure functions are functions which cannot access global or static, mutable state save through their arguments. Everything else with regards to functional purity is derived from there, but in and of itself, that's _all_ that the pure attribute in D means. See also: http://klickverbot.at/blog/2012/05/purity-in-d/ - Jonathan M Davis
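The point that D's pure does not imply memoization can be shown with a few lines (bump is an illustrative name): this function is pure in D's sense - it touches no global state - yet returns a different result for the same argument value, so caching its results would be wrong.

```d
// Weakly pure: accesses nothing but its argument, yet is not memoizable.
int bump(int* counter) pure
{
    return ++(*counter);
}

void main()
{
    int c = 0;
    assert(bump(&c) == 1);
    assert(bump(&c) == 2); // same argument, different result
}
```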
Re: Memory allocation purity
On Thu, 15 May 2014 10:10:57 + via Digitalmars-d digitalmars-d@puremagic.com wrote: On Thursday, 15 May 2014 at 09:23:00 UTC, Jonathan M Davis via Digitalmars-d wrote: functions that weren't pure. It allowed for mutation within the function, and it allowed for allocation via new, but from the outside, the function _was_ functionally pure. If it didn't return the memory allocated with new and if the call to new resulted in an exception, yes. It just didn't work. That I question. A pure function ( http://en.wikipedia.org/wiki/Pure_function ) depends on the values of the parameters, and only that. That is most useful. Those value can be very complex. You could have a pure member function look up values in a cache. Then the configuration of entire cache is the value. You need to think about this in terms of pre/post conditions in Hoare Logic (which I am not very good at btw). So, Don introduced the idea of weak purity. What it comes down to is that it's an extension of the concept that mutation within a pure function is fine just so long as its arguments aren't mutated. We made it so that pure functions _didn't_ have to have immutable parameters. They just couldn't access anything that wasn't passed to them as arguments. This meant that they could only mutate what they were given and thus they didn't violate the strong purity of the original pure function which had immutable parameters. And that's fine as long as nobody else is holding a reference to those mutable parameters. That would only matter if the compiler were trying to optimize based on pure functions with mutable parameters. It doesn't. And it would actually be very difficult for it to do so without doing full program optimization, which really doesn't work with the C linking model that D uses. The fact that we have thread-local by default helps, but it's not enough for more than very simple cases. The compiler doesn't care about optimizing weakly pure functions. 
The whole purpose of weakly pure functions is to have functions which aren't functionally pure but can still be used in functions that _are_ functionally pure. If you think in terms of a context for purity such as a transaction then you can even allow access to globals as long as they remain constant until the transaction is committed (or you leave the context where purity is desired). Meaning, you can memoize within that context. That doesn't work with D's model, because it doesn't have any concept of transactions like that. It also doesn't really have any concept of memoization either. The most that it does that is anything like memoization is optimize away multiple calls to a function within a single expression (or maybe within a statement - I don't remember which). So, auto y = sqrt(x) * sqrt(x); might become something more like auto temp = sqrt(x); y = temp * temp; but after that, the result of sqrt(x) is forgotten. So, in reality, the optimization gains from strongly pure functions are pretty minimal (almost non-existent really). If we were willing to do code flow analysis, we could probably make more optimizations (assuming that the exact same function call was made several times within a single function, which isn't all that common anyway), but Walter is generally against doing code flow analysis in the compiler due to the complications that it adds. We have some, but not a lot. The two main gains for purity are 1. being able to know for sure that a function doesn't access any global variables, which makes it easier to reason about the code. 2. being able to implicitly convert types to and from mutable, const, and immutable based on the knowledge that a particular value has to be unique. I'd say that functional purity was really the original goal of adding pure to the language, but it's really those two effects which have given us the most benefit. 
#2 in particular was unexpected, and the compiler devs keep finding new places that they can take advantage of it, which makes dealing with immutable a lot more pleasant - particularly when it comes to creating immutable objects that require mutation in order to be initialized properly. functions that called it), but we do need that guarantee. The result is that the pure attribute doesn't in and of itself mean functional purity anymore, but it _can_ be used to build a function which is functionally pure. But, that can be deduced by the compiler, so what is the point of having pure for weakly pure? Clearly you only need to specify strongly pure? It can't be deduced from the signature, and the compiler has to be able to know based only on the signature, because it doesn't necessarily have the source code for the function available. The only functions for which the compiler ever deduces anything from their bodies are templated functions, because it always has their bodies, and if it didn't do the attribute inference for you, you'd be forced
Re: Memory allocation purity
On Thu, 15 May 2014 10:48:07 + Don via Digitalmars-d digitalmars-d@puremagic.com wrote: Yes. 'strong pure' means pure in the way that the functional language crowd means 'pure'. 'weak pure' just means doesn't use globals. But note that strong purity isn't an official concept, it was just the terminology I used when explaining to Walter what I meant. I don't like the term because it's rather misleading -- in reality you could define a whole range of purity strengths (more than just two). The stronger the purity, the more optimizations you can apply. Yeah, I agree. The problem is that it always seems necessary to use the term weak pure to describe the distinction - or maybe I just suck at coming up with a better way to describe it than you did initially. Your recent post in this thread talking about @noglobal seems to be a pretty good alternate way to explain it though. Certainly, the term pure throws everyone off at first. - Jonathan M Davis
Re: Memory allocation purity
On Thu, 15 May 2014 08:43:11 -0700 Andrei Alexandrescu via Digitalmars-d digitalmars-d@puremagic.com wrote: On 5/15/14, 6:28 AM, Dicebot wrote: This is not true. Because of such code you can't ever automatically memoize strongly pure function results by compiler. A very practical concern. I think code that doesn't return pointers should be memoizable. Playing tricks with pointer comparisons would be appropriately penalized. -- Andrei Agreed. The fact that a pure function can return newly allocated memory pretty much kills the idea of being able to memoize pure functions that return pointers or references, because the program's behavior would change if it were to memoize the result and reuse it. However, that should have no effect on pure functions that return value types - even if the function took pointers or references as arguments or allocated memory internally. They should be perfectly memoizable. - Jonathan M Davis
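The difference between the two cases can be sketched like this (hypothetical names, a minimal sketch rather than anything from the thread):

```d
class Box
{
    int v;
    this(int v) pure { this.v = v; }
}

int square(int x) pure { return x * x; }     // value result
Box boxed(int x) pure { return new Box(x); } // freshly allocated result

void main()
{
    // A value result carries no identity: substituting the first call's
    // result for the second is unobservable, so memoizing it is safe.
    assert(square(5) == square(5));

    // A reference result has identity: each call allocates a new object,
    // and a program can legally observe that with 'is'.
    assert(boxed(5) !is boxed(5));
    // Memoizing boxed(5) would flip that assert, changing behavior.
}
```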
Re: Memory allocation purity
On Thu, 15 May 2014 11:03:13 -0700 Walter Bright via Digitalmars-d digitalmars-d@puremagic.com wrote: On 5/15/2014 2:45 AM, Don wrote: An interesting side-effect of the recent addition of @nogc to the language, is that we get this ability back. I hadn't thought of that. Pretty cool! Definitely, but we also need to be careful with it. If @nogc just restricts allocations by the GC and not allocations in general, and if we make it so that malloc is pure (even if it's only when wrapped by a function which throws an Error when malloc returns null), then I don't think that we quite get it back, because while the GC may not have allocated any objects, malloc could still have been used to allocate them. We'd need to be able to say that _nothing_ is allocated within the function, which isn't quite what @nogc does as I understand it (though admittedly, I haven't paid much attention to the discussions on it, much as I would have liked to). So, maybe we need to find a way to make it so that a wrapped malloc can be pure but isn't @nogc? Though if we go that route, that implies that @nogc should have been @noalloc. Regardless, I think that making malloc pure definitely affects the issue of whether a @nogc function can be assumed to not return newly allocated memory. - Jonathan M Davis
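The "wrapped malloc" idea can be sketched roughly like this. The redeclared prototype and the wrapper name are assumptions for illustration only (druntime later grew a real pureMalloc in core.memory); the point is that such a function is pure yet plainly allocates, so @nogc alone can't guarantee a function returns no newly allocated memory:

```d
// Assumption for illustration: redeclare the C prototype with 'pure'
// so that pure code is allowed to call it.
private extern (C) void* malloc(size_t size) pure nothrow @nogc;

// Pure from the outside: for a given size it either returns fresh
// memory or aborts with an Error, and it touches no global state that
// the type system can see - yet it clearly allocates (and is not GC
// allocation, so it would pass a @nogc check).
void* allocOrDie(size_t size) pure
{
    auto p = malloc(size);
    if (p is null)
        assert(0, "out of memory"); // throws an Error, allowed in pure code
    return p;
}
```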
Re: hijackable/customizable keyword for solving the customized algorithm issue?
On Fri, 16 May 2014 16:45:28 + Yota via Digitalmars-d digitalmars-d@puremagic.com wrote: On Thursday, 15 May 2014 at 17:08:58 UTC, monarch_dodra wrote: On Thursday, 15 May 2014 at 12:16:52 UTC, Steven Schveighoffer wrote: On Thu, 15 May 2014 02:05:08 -0400, monarch_dodra monarchdo...@gmail.com wrote: move will also delegate to proxyMove. This is the correct solution IMO. The principle of least surprise should dictate that a.foo should always mean the same thing. All that is required to enforce this is to make the hook function have a different name. The issue(s) I have with "the hook must have a different name" is: 1. In the algorithm implementation, you must explicitly test for the hook, which, if we want to expand on, would turn all our algorithms into: void foo(T)(T t) { static if (is(typeof(t.fooHook()))) return t.fooHook(); else ... } 2. The overridden hook is only useful if you import the corresponding algorithm. So for example, in my 3rd party library, if my type has findHook, I'd *have* to import std.algorithm.find for it to be useful. Unless I want to make a direct call to findHook, which would be ugly... How about a middle ground? Have the function names be identical, and decorate the member version with @proxy or @hook, rather than decorating the original definition. I'd find this to be less surprising. That sounds like an interesting idea, but it would require adding the concept to the compiler - though I suppose that that's a downside with monarch_dodra's original proposal as well. Neither can be done in the library itself. That's not necessarily the end of the world, but at this point, I'm very hesitant to support adding another attribute to the language without a really, really good reason. This particular problem can be solved by Steven's suggestion of using proxy functions with different names, which works within the language as it's currently defined. 
So, while that might not be an altogether desirable solution, I'm inclined to believe that it's good enough to make it not worth adding additional attributes to the language to solve the problem. - Jonathan M Davis
Re: Memory allocation purity
On Sun, 18 May 2014 06:58:25 -0700 H. S. Teoh via Digitalmars-d digitalmars-d@puremagic.com wrote: On Sat, May 17, 2014 at 11:51:44AM -0700, Jonathan M Davis via Digitalmars-d wrote: On Thu, 15 May 2014 08:43:11 -0700 Andrei Alexandrescu via Digitalmars-d digitalmars-d@puremagic.com wrote: On 5/15/14, 6:28 AM, Dicebot wrote: This is not true. Because of such code you can't ever automatically memoize strongly pure function results by compiler. A very practical concern. I think code that doesn't return pointers should be memoizable. Playing tricks with pointer comparisons would be appropriately penalized. -- Andrei Agreed. The fact that a pure function can return newly allocated memory pretty much kills the idea of being able to memoize pure functions that return pointers or references, because the program's behavior would change if it were to memoize the result and reuse it. However, that should have no effect on pure functions that return value types - even if the function took pointers or references as arguments or allocated memory internally. They should be perfectly memoizable. [...] bool func(int x) /* pure? */ { int[] a, b; a = new int[x]; b = new int[x]; return (a.ptr < b.ptr); } Where do you draw the line? H. I think that it was pointed out somewhere else in this thread that that sort of comparison is already undefined in C (and thus probably D) - certainly it's not something that's really valid to do normally. However, there are other things that we currently do normally which have similar problems (e.g. toHash on Object hashing the reference). Maybe if we could come up with a set of operations which weren't valid in a pure function because they'd behave differently depending on which memory block was given to new, then we could make it work. But we may simply need to say that memoization of pure functions just isn't going to work if we allow allocations to take place in pure functions. That wouldn't be ideal, but I'm also not convinced that it matters much. 
It's already so rare that memoization of a function call can occur, that I'm pretty much convinced that memoization is useless as an optimization - at least as long as the compiler is doing it. After all, how often does a function get called with the same arguments within a single function let alone a single expression (and as I understand it, memoization only ever occurs at this point within either a single expression or statement - I don't remember which - but regardless, it's not even within a whole function)? And since there's no way that the compiler is going to memoize calls to a function across functions or even across calls to the same function which is calling the function which is being memoized, unless it's very common to call a function with the same arguments within a single function (and I really don't think that it is), memoization by the compiler is really of minimal benefit, much as it would be nice to have it where we can. Regardless, we should err on the side of not memoizing in order to avoid undefined behavior due to memory allocations causing the pure function to act slightly differently across function calls (even when it's given the same arguments). - Jonathan M Davis
Re: Memory allocation purity
On Mon, 19 May 2014 05:16:13 + via Digitalmars-d digitalmars-d@puremagic.com wrote: On Monday, 19 May 2014 at 01:19:29 UTC, Jonathan M Davis via Digitalmars-d wrote: It's already so rare that memoization of a function call can occur, that I'm pretty much convinced that memoization is useless as an optimization - at least as long as the compiler is doing it. After all, how often does a function get called with the same arguments within a single function let alone a single expression (and as I understand it, memoization only ever occurs at this point within either a single expression or statement - I don't remember which - but regardless, it's not even within a whole function)? Memoization is valid throughout the program. Opportunities occur frequently with generic programming and inlining. I seriously question that assertion. How often do you really call the same function with the same arguments? And given that for the memoization to be done by the compiler without saving the result somewhere (which could actually _harm_ efficiency in many cases and certainly isn't something that the compiler does right now), the most that it could possibly memoize would be within a single function, that makes it all the less likely that it could memoize any calls. And given that the most it currently memoizes would be within a single expression (or possibly a single statement - I can't remember which), that makes it so that about the only time that it can memoize function calls is when you do something like foo(2) * foo(2), and how often does _that_ happen? Sure, it happens from time to time, but in my experience, it's very rare to make the same function call with the same arguments within a single expression or statement. Regardless, we should err on the side of not memoizing in order to avoid Then you don't need strict pure functions. As far as I'm concerned the two big gains from pure are 1. 
it makes it easier to reason about code, because it guarantees that the function didn't access any global or static variables. 2. it allows us to implicitly convert to different levels of mutability for the return type of pure functions where the compiler can guarantee that the return value was allocated within the function. Any other benefits we get from pure are great, but they're incidental in comparison. And in particular, in my experience, memoization opportunities are so rare that there's really no point in worrying about them. They're just a nice bonus when they do happen. - Jonathan M Davis
Re: Memory allocation purity
On Mon, 19 May 2014 06:05:26 + via Digitalmars-d digitalmars-d@puremagic.com wrote: On Monday, 19 May 2014 at 05:39:49 UTC, Jonathan M Davis via Digitalmars-d wrote: 1. it makes it easier to reason about code, because it guarantees that the function didn't access any global or static variables. It can, through the parameters, like an array of pointers. And avoiding IO is not sufficient to mark 90% of my code as weakly pure. Except that if most of your code is marked pure, then there aren't very many points where it could access a global or static variable. And regardless of that, the fact that the only way to access a global or static variable in a pure function is through one of its arguments still guarantees that the function isn't messing with anything that wasn't passed to it, so that still helps a lot with being able to reason about code. Sure, it could still be passed an argument that points to a global variable (directly or indirectly), but then you only have to worry about globals being accessed if they were passed in (directly or indirectly). Sure, it's not as big a gain as when a function is strongly pure, but it's good enough for most cases. 2. it allows us to implicitly convert to different levels of mutability for the return type of pure functions where the compiler can guarantee that the return value was allocated within the function. But if you can have a struct/pointer as a parameter then you can clearly return objects not allocated in the function? The compiler is smart enough in many cases to determine whether the return value could have been passed in or not (though it wouldn't surprise me if it could be made smarter in that regard). 
With a function like string foo(string bar) pure {...} it can't assume that the return type is unique, because it could have been passed in via the parameter, but with string foo(char[] bar) pure {..} or int* foo(string bar) pure {..} it could, because it's impossible for the parameter to be returned from the function (unless casting that breaks the type system is used anyway - and the compiler is free to assume that that isn't done). So, it varies quite a bit as to whether a pure function is guaranteed to be returning newly allocated memory or not, but the compiler often can determine that, and when it can, it makes dealing with immutable far, far more pleasant. It's particularly useful when you need to allocate an immutable object but also need to mutate it as part of initializing it. If you do it in a pure function where the compiler knows that the object couldn't have been passed in, then the return type can be freely converted to various levels of mutability - including immutable - without having to use immutable within the function. - Jonathan M Davis
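The unique-return conversion can be sketched directly (hypothetical names; a minimal sketch of the pattern described above):

```d
// The char and size_t parameters can never alias the returned array,
// so the compiler knows the result of this pure function is newly
// allocated and therefore unique.
char[] repeat(char c, size_t n) pure
{
    auto buf = new char[](n);
    buf[] = c; // mutate freely while the value is being constructed
    return buf;
}

void main()
{
    // Because the result of the strongly pure call is known to be
    // unique, the mutable char[] converts implicitly to immutable:
    string s = repeat('x', 3);
    assert(s == "xxx");
}
```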
Re: Memory allocation purity
On Mon, 19 May 2014 07:37:55 + via Digitalmars-d digitalmars-d@puremagic.com wrote: On Monday, 19 May 2014 at 06:30:46 UTC, Jonathan M Davis via Digitalmars-d wrote: makes dealing with immutable far, far more pleasant. It's particularly useful when you need to allocate an immutable object but also need to mutate it as part of initializing it. If you do it in a pure function where the compiler knows that the object couldn't have been passed in, then the return type can be freely converted to various levels of mutability - including immutable - without having to use immutable within the function. It does not appear as a clean design that functions should have different semantics than a block. What matters is that the object reference is unique. Confusing this with pure seems like a bad idea. I don't follow you. The fact that D's pure helps the compiler determine cases where it knows that the return value of a function is unique is a key feature of pure and has proven to be a great idea. Perhaps you're hung up on the fact that the term pure is being used, and you're thinking about functional purity? If so, forget about pure in the functional sense if you want to discuss D purity. You need to think of it as something more like @noglobal. That combined with other information in the function signature allows the compiler to determine cases where it knows that the returned value is unique. It also can lead to the compiler determining that a function is functionally pure and thus memoizable, but at this point, that's pretty incidental to what pure is and does. It's part of it, but it's not the primary feature of what D's pure is or is for. It's unfortunate that the language's evolution led us to using the term pure for what it's currently used for, but we're pretty much stuck with it at this point. Regardless, the fact that D's pure allows us to determine when the return value of a function has to be unique is fantastic and has proven very helpful. - Jonathan M Davis
Re: current ref args state.
On Mon, 19 May 2014 09:34:48 + evilrat via Digitalmars-d digitalmars-d@puremagic.com wrote: as topic says, sometimes ref has a 'little' problem now; it is unavoidable in some cases and causes some readability problems in code and much more... imagine we have a function void getMyNumber(ref int number) which assigns 42 to 'number'. now let's look at C# code first: --- C# code --- int someNumber = 3; MyCoolLibraryClass.getMyNumber(ref someNumber); --- ok, it clearly shows where our variable could be changed. now look at the same D code: --- D code --- int someNumber = 3; getMyNumber(someNumber); assert(someNumber == 42); --- wait... what? what does this function do? is it using our nice variable to do some computation with it? or maybe it assigned something to our variable? and what if we had a lot of other function calls before this? how do we know our variable stays the same all the time we feed it to functions? now this is the problem: ref arguments aren't as obvious as they should be, and it is very easy to mess up in code and spend hours to see what's going wrong. currently i'm trying to add a /*ref*/ comment before passing such variables to functions. we should really change this behaviour and add a warning when passing ref args to functions if the call doesn't have the ref keyword like in C#. i don't ask to enforce it, but such a little tweak would help a lot. i may have overlooked proposals for this case; if any, please give a link. sorry for a possible duplicate topic. It's been discussed before and shot down. Among other things, requiring ref at the call site would break a lot of code, and it wouldn't play well with UFCS at all. Also, making ref optional at the call site is by far worse than not having it at all, because then the lack of it still tells you absolutely nothing but gives the false sense of security that it's not being passed by ref. 
And we really shouldn't be adding more warnings, since in the grand scheme of things, a warning is pretty much the same thing as an error, since not only is it bad practice to leave warnings in your code, but the -w flag makes warnings into errors anyway. C++ is in the same situation with regards to its references, and it works fine. However, folks do often use pointers instead of references in C++ in order to make it clearer, and you're free to do the same in D if you really don't like having functions which have ref parameters. I really don't expect that this aspect of D is going to change at this point. - Jonathan M Davis
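The pointer workaround mentioned above can be sketched like this (hypothetical names):

```d
void getMyNumber(ref int number) { number = 42; }

// The C++-style workaround: take a pointer so mutation shows at the call.
void getMyNumberP(int* number) { *number = 42; }

void main()
{
    int a = 3;
    getMyNumber(a);   // nothing at the call site hints that 'a' may change
    assert(a == 42);

    int b = 3;
    getMyNumberP(&b); // the '&' flags the possible mutation
    assert(b == 42);
}
```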
Re: Memory allocation purity
On Mon, 19 May 2014 09:42:31 -0400 Steven Schveighoffer via Digitalmars-d digitalmars-d@puremagic.com wrote: On Sun, 18 May 2014 09:58:25 -0400, H. S. Teoh via Digitalmars-d digitalmars-d@puremagic.com wrote: On Sat, May 17, 2014 at 11:51:44AM -0700, Jonathan M Davis via Digitalmars-d wrote: On Thu, 15 May 2014 08:43:11 -0700 Andrei Alexandrescu via Digitalmars-d digitalmars-d@puremagic.com wrote: On 5/15/14, 6:28 AM, Dicebot wrote: This is not true. Because of such code you can't ever automatically memoize strongly pure function results by compiler. A very practical concern. I think code that doesn't return pointers should be memoizable. Playing tricks with pointer comparisons would be appropriately penalized. -- Andrei Agreed. The fact that a pure function can return newly allocated memory pretty much kills the idea of being able to memoize pure functions that return pointers or references, because the program's behavior would change if it were to memoize the result and reuse it. Memoizing reference returns that are immutable should be fine. Only if you consider it okay for the behavior of the function to change upon memoization - or at least don't consider it to matter that the objects are then equal but not the same object (which can matter for reference types). It has less of an impact when you're dealing with immutable objects, because changing the value of one won't change the value of another, but it can still change the behavior of the program due to the fact that they're not actually the same object. So, I'd be inclined to argue that no functions which return memory should be memoizable. And given that the compiler can only memoize functions within a single expression (or maybe statement - I can't remember which) - I don't think that that restriction even costs us much. - Jonathan M Davis
Re: Memory allocation purity
On Mon, 19 May 2014 13:11:43 -0400 Steven Schveighoffer via Digitalmars-d digitalmars-d@puremagic.com wrote: On Mon, 19 May 2014 12:35:26 -0400, Jonathan M Davis via Digitalmars-d digitalmars-d@puremagic.com wrote: On Mon, 19 May 2014 09:42:31 -0400 Steven Schveighoffer via Digitalmars-d digitalmars-d@puremagic.com wrote: On Sun, 18 May 2014 09:58:25 -0400, H. S. Teoh via Digitalmars-d digitalmars-d@puremagic.com wrote: On Sat, May 17, 2014 at 11:51:44AM -0700, Jonathan M Davis via Digitalmars-d wrote: On Thu, 15 May 2014 08:43:11 -0700 Andrei Alexandrescu via Digitalmars-d digitalmars-d@puremagic.com wrote: On 5/15/14, 6:28 AM, Dicebot wrote: This is not true. Because of such code you can't ever automatically memoize strongly pure function results by compiler. A very practical concern. I think code that doesn't return pointers should be memoizable. Playing tricks with pointer comparisons would be appropriately penalized. -- Andrei Agreed. The fact that a pure function can return newly allocated memory pretty much kills the idea of being able to memoize pure functions that return pointers or references, because the program's behavior would change if it were to memoize the result and reuse it. Memoizing reference returns that are immutable should be fine. Only if you consider it okay for the behavior of the function to change upon memoization - or at least don't consider it to matter that the objects are then equal but not the same object (which can matter for reference types). It shouldn't matter. Something that returns immutable references can return that same thing again if asked the same way. Except that a pure function _can't_ return the same object by definition - not unless it was passed in via an argument. 
And unless the compiler can guarantee that the object being returned _isn't_ newly allocated, then it has to assume that it could have been, in which case, it can't assume that two calls to the same pure function return the same object. It may return _equal_ objects - but not the same object. And since we're talking about reference types, the fact that they're not the same object can definitely have an impact on the behavior of the program. Nobody should be looking at the address in any meaningful way. Maybe they shouldn't be, but there's nothing stopping them. It's perfectly legal to write code which depends on the value of the address of an object. So, memoizing the result of a pure function which returns a reference type _will_ have an impact on the behavior of some programs. It has less of an impact when you're dealing with immutable objects, because changing the value of one won't change the value of another, but it can still change the behavior of the program due to the fact that they're not actually the same object. Such a program is incorrectly written. Well, then Object.toHash is incorrectly written. And given that the compiler can only memoize functions within a single expression (or maybe statement - I can't remember which) - I don't think that that restriction even costs us much. It can make a huge difference, and it doesn't have to be memoized within the same expression, it could be memoized globally with a hashtable, or within the same function. Not by the compiler. All it ever does with regards to memoization is optimize out multiple calls to the same function with the same argument within a single expression. It doesn't even make that optimization elsewhere within a function. Sure, a programmer could choose to explicitly memoize the result, but then it's up to them to not memoize it when it shouldn't be memoized. - Jonathan M Davis
Re: Memory allocation purity
On Mon, 19 May 2014 14:33:55 -0400 Steven Schveighoffer via Digitalmars-d digitalmars-d@puremagic.com wrote: The whole POINT of pure functions is that it will return the same thing. The fact that it lives in a different piece of memory or not is not important. We have to accept that. Any code that DEPENDS on that being in a different address is broken. Except that if you're talking about reference types, and the reference points to two different objects, then they're _equal_ rather than being the same object. That's the whole point of the is operator - to check whether two objects are in fact the same object. I agree that relying on things like whether one address in memory is larger or smaller than another address in unrelated memory is just plain broken, but it's perfectly legitimate for code to depend on whether a particular object is the same object as another rather than equal. And memoizing the result of a function - be it pure or otherwise - will therefore have an effect on the semantics of perfectly valid programs. And honestly, even if equality were all that mattered for memoization, the fact that the compiler only does it on a very, very restrictive basis (since it never saves the result, only optimizes out extraneous calls) makes it so that memoization is kind of useless from the standpoint of the compiler. It's only really useful when the programmer does it, and they can decide whether it's okay to memoize a function based on the requirements of their program (e.g. whether reference identity matters or just object equality). - Jonathan M Davis
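The equal-versus-same distinction that the is operator exists for can be sketched like this (hypothetical type; a minimal sketch, not code from the thread):

```d
class Point
{
    int x, y;
    this(int x, int y) { this.x = x; this.y = y; }

    // Value equality: two Points compare equal when their fields match.
    override bool opEquals(Object o)
    {
        auto p = cast(Point) o;
        return p !is null && x == p.x && y == p.y;
    }
}

void main()
{
    auto a = new Point(1, 2);
    auto b = new Point(1, 2);
    assert(a == b);  // equal: same value
    assert(a !is b); // but not the same object
    // Memoization that hands back 'a' for both calls would make
    // 'a is b' true - valid code can legitimately observe that change.
}
```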