[Issue 16080] [REG2.071.0] Internal error: backend\cgobj.c 3406 when building static library
https://issues.dlang.org/show_bug.cgi?id=16080

Sönke Ludwig changed:

What    | Removed | Added
Summary | Internal error: backend\cgobj.c 3406 when building static library | [REG2.071.0] Internal error: backend\cgobj.c 3406 when building static library
Legal operator overloading
Is it legal/possible to overload the unary * operator? Also is it legal/possible to individually overload the comparison operators and not return a bool? (Before you ask no I'm not crazy, I am trying to make a library solution to multiple address spaces for supporting OpenCL/CUDA in D, and vector comparisons.)
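A minimal sketch related to the question above, not an authoritative answer: D's `opUnary` template accepts `"*"` as an operator string, so a wrapper type can define its own dereference. The comparison operators are lowered to `opEquals` and `opCmp` rather than being overloadable one by one, so the sketch assumes an element-wise vector comparison would be exposed as a named method instead. `Ptr`, `Vec4` and `lessThan` are hypothetical names chosen for illustration.

```d
// Hypothetical pointer wrapper: unary * is provided through opUnary!"*".
struct Ptr(T)
{
    T* raw;

    // Returning by ref lets callers both read and write through *p.
    ref T opUnary(string op)() if (op == "*")
    {
        return *raw;
    }
}

// Hypothetical vector type: the element-wise comparison is a named method,
// because <, <=, > and >= all go through opCmp in D.
struct Vec4
{
    float[4] data;

    bool[4] lessThan(Vec4 rhs) const
    {
        bool[4] mask;
        foreach (i; 0 .. 4)
            mask[i] = data[i] < rhs.data[i];
        return mask;
    }
}
```

With these definitions, `*p` on a `Ptr!int` reads or writes the pointed-to value, and `a.lessThan(b)` yields a per-element mask.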
[Issue 15925] [REG 2.071] Import declaration from mixin templates are ignored
https://issues.dlang.org/show_bug.cgi?id=15925

--- Comment #11 from github-bugzi...@puremagic.com ---
Commits pushed to master at https://github.com/dlang/dmd

https://github.com/dlang/dmd/commit/f0f38381ed27fd8a4d2e36d13623698970cff7bd
Revert "[REG 2.071] Fix Issue 15925 - Import declaration from mixin templates are ignored"

https://github.com/dlang/dmd/commit/64f3fdb27b6d9d465c307b45c26a6c9fe10844b8
Merge pull request #5800 from dlang/revert-5724-fix-15925-stable

https://github.com/dlang/dmd/commit/b86f3b3b357f6d0edc0c8a60552657b922443017
fix Issue 15925 - Import declaration from mixin templates are ignored

https://github.com/dlang/dmd/commit/ba178e607c33e121142ec15c5064d953fd87a191
Merge pull request #5810 from MartinNowak/fix15925
Re: Copyright for Phobos to D Foundation
On 30/05/2016 4:53 AM, Iain Buclaw wrote:
> On Sunday, 29 May 2016 at 10:59:57 UTC, rikki cattermole wrote:
> > Uninstall dmd and have ldc installed.
>
> Until the maintainers fix their problem with not sharing or upstreaming bug fixes and contributions. Then I would not make such endorsements.

I was making the point that dub could handle ldc as the default if dmd is not on the PATH variable.
Re: Transient ranges
On Sunday, May 29, 2016 18:27:53 default0 via Digitalmars-d wrote:
> On Sunday, 29 May 2016 at 18:09:29 UTC, Steven Schveighoffer wrote:
> > On 5/29/16 1:45 PM, Steven Schveighoffer wrote:
> >> On 5/27/16 7:42 PM, Seb wrote:
> >>> So what about the convention to explicitly declare a `.transient` enum
> >>> member on a range, if the front element value can change?
> >>
> >> enum isTransient(R) = is(typeof(() {
> >>     static assert(isInputRange!R);
> >>     static assert(hasIndirections!(ElementType!R));
> >>     static assert(!allIndirectionsImmutable!(ElementType!R)); // need to write this
> >> }));
> >
> > obviously, this is better as a simple && statement between the
> > three requirements :) When I started writing, I thought I'd
> > have to write some runtime code.
> >
> > -Steve
>
> Would that make a range of polymorphic objects transient?

It would make pretty much anything that isn't a value type - including a type that's actually a value but uses postblit to do it - be treated as transient, with the one exception being that if the reference types involved are immutable (be they the element type or members in the element type), then it's not treated as transient. This means a very large number of ranges will be treated as being transient, which is completely unacceptable IMHO. Having a transient front is _not_ the norm, and code is usually written with the assumption that front is not transient. In almost all cases, if a range-based function happens to work with a transient front, it's by luck and not because it was designed that way.

You can't statically check for transience, because it depends on runtime behavior. At best, you can statically eliminate a fairly small portion of the ranges as not having transient fronts. If we want to actually support transient fronts, it really needs to be explicit IMHO.

Regardless, I don't think that we want to need to be checking for transience in range-based functions in general.
It's too much extra complication for too little benefit. A very small number of ranges actually have or benefit from having a transient front, and I don't think that it's worth supporting them as ranges given how much that affects everything else. Otherwise, you end up with the 1% case causing problems for all range-based code. - Jonathan M Davis
Re: Transient ranges
On Sunday, 29 May 2016 at 17:36:24 UTC, Steven Schveighoffer wrote:
> Wholly disagree. If we didn't cache the element, D would be a laughingstock of performance-minded tests.

byLine already is a laughingstock performance-wise: https://issues.dlang.org/show_bug.cgi?id=11810

It's way faster to read the entire file into a buffer and iterate by line over that.

I have to agree with Jonathan. I see a lot of proposals in this thread, but I have yet to see a cost/benefit analysis that's pro transient support. The amount of changes needed to support them is not commensurate with any possible benefits.
Re: Transient ranges
On Sunday, May 29, 2016 13:36:24 Steven Schveighoffer via Digitalmars-d wrote:
> On 5/27/16 9:48 PM, Jonathan M Davis via Digitalmars-d wrote:
> > On Friday, May 27, 2016 23:42:24 Seb via Digitalmars-d wrote:
> >> So what about the convention to explicitly declare a
> >> `.transient` enum member on a range, if the front element value
> >> can change?
> >
> > Honestly, I don't think that supporting transient ranges is worth it.
> > Every single range-based function would have to either test that the
> > "transient" enum wasn't there or take transient ranges into account, and
> > realistically, that isn't going to happen. For better or worse, we do
> > have byLine in std.stdio, which has a transient front, but aside from
> > the performance benefits, it's been a disaster.
>
> Wholly disagree. If we didn't cache the element, D would be a
> laughingstock of performance-minded tests.

Having byLine not copy its buffer is fine. Having it be a range is not. Algorithms in general just do not play well with that behavior, and I don't think that it's reasonable to expect them to.

> > It's way too error-prone. We now have byLineCopy to combat that, but of
> > course, byLine is the more obvious function and thus more likely to be
> > used (plus it's been around longer), so a _lot_ of code is going to end
> > up using it, and a good chunk of that code really should be using
> > byLineCopy.
>
> There's nothing actually wrong with using byLine, and copying on demand.
> Why such a negative connotation?

Because it does not play nicely with ranges, and aside from a few rare ranges like byLine that have to deal directly with I/O, transience isn't even useful. Having an efficient solution that plays nicely with I/O is definitely important, but it doesn't need to be a range, especially when it complicates ranges in general. byLine doesn't even work with std.array.array, and if even that doesn't work, I don't see how a range could be considered well-behaved.
> > I'm of the opinion that if you want a transient front, you should just use
> > opApply and skip ranges entirely.
>
> So you want to make this code invalid? Why?
>
> foreach(i; map!(a => a.to!int)(stdin.byLine))
> {
>     // process each integer
>     ...
> }
>
> You want to make me copy each line to a heap-allocated string so I can
> parse it?!!

If it's a range, then it can be passed around to other algorithms with impunity, and almost nothing is written with the idea that a range's front is transient. There's no way to check for transience, and I don't think that it's even vaguely worth adding yet another range primitive that has to be checked for everywhere just for this case. Transience does _not_ play nicely with algorithms in general.

Using opApply doesn't completely solve the problem (since the buffer could still escape - we'd need some kind of scope attribute or wrapper to fix that problem), but it makes it so that you can't pass such a range around and run into problems with all of the algorithms that don't play nicely with it. So, instead, you end up with code that looks something like

    foreach(line; stdin.byLine())
    {
        auto i = line.to!int();
        ...
    }

And yes, it's slightly longer, but it prevents a whole class of bugs by not having it be a range with a transient front.

> > Allowing for front to be transient - whether you can check for it or
> > not - simply is not worth the extra complications. I'd love it if we
> > deprecated byLine's range functions, and made it use opApply instead and
> > just declare transient ranges to be completely unsupported. If you want
> > to write your code to have a transient front, you can obviously take
> > that risk, but you're on your own.
>
> There is no way to disallow front from being transient. In fact, it
> should be assumed that it is the default unless it's wholly a value-type.

Pretty much no range-based code is written with the idea that front is transient. It's pretty much the opposite.
Unfortunately, we can't check for all of the proper range semantics at compile time (be it having to do with transience, the fact that front needs to be the same every time until popFront is called, that save has to actually result in a range that will have exactly the same elements, or whatever other runtime behavior that ranges are supposed to adhere to), but just because something can't be checked for doesn't mean that it should be considered reasonable or valid.

IMHO, a range with a transient front should be considered as valid as a range that returns a different value every time that front is called without popFront having been called. Neither can be tested for, but both cause problems.

If we're going to support transience, then we _need_ to have some sort of flag/enum in the type to indicate that the range is transient, but that complicates everything, because then all range implementations have to check for it and pass it on when they wrap that type, and many algorithms will have to explicitly check
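The opApply approach argued for in this message can be sketched roughly as follows. This is only an illustration under stated assumptions: `LineBuffer` and `sumLines` are hypothetical names, and an in-memory string stands in for the file that byLine would read. The buffer is reused between iterations, so the slice handed to the loop body is only valid for that iteration, yet the type is not a range and so cannot be passed to range algorithms by accident.

```d
import std.conv : to;

// Hypothetical line iterator with a reused (transient) buffer.
struct LineBuffer
{
    string text;    // stand-in for an input file (assumption for the sketch)
    char[] buffer;  // overwritten on every iteration

    int opApply(scope int delegate(const(char)[] line) dg)
    {
        size_t start = 0;
        while (start < text.length)
        {
            // Find the end of the current line.
            size_t end = start;
            while (end < text.length && text[end] != '\n')
                ++end;

            // Copy the line into the reused buffer, growing it if needed.
            immutable len = end - start;
            if (buffer.length < len)
                buffer.length = len;
            buffer[0 .. len] = text[start .. end];

            // A non-zero result means the loop body used break or return.
            if (auto result = dg(buffer[0 .. len]))
                return result;
            start = end + 1;
        }
        return 0;
    }
}

// Parses one integer per line, consuming each line before the next one
// overwrites the buffer.
int sumLines(string text)
{
    int total = 0;
    auto lines = LineBuffer(text);
    foreach (line; lines)
        total += line.to!int;
    return total;
}
```

As the message notes, nothing here stops the loop body from storing `line` somewhere; the sketch only removes the range interface.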
Re: Transient ranges
On Sunday, 29 May 2016 at 17:45:00 UTC, Steven Schveighoffer wrote:
> On 5/27/16 7:42 PM, Seb wrote:
> > So what about the convention to explicitly declare a `.transient` enum member on a range, if the front element value can change?
>
>     enum isTransient(R) = is(typeof(() {
>         static assert(isInputRange!R);
>         static assert(hasIndirections!(ElementType!R));
>         static assert(!allIndirectionsImmutable!(ElementType!R)); // need to write this
>     }));
>
> -Steve

allIndirectionsImmutable could probably just be is(T : immutable) (i.e. implicitly convertible to immutable). Value types without (or with immutable only) indirections should be convertible to immutable, since the value is being copied.
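The trait discussed above, written as the simple `&&` combination mentioned elsewhere in the thread, might look like the sketch below. The name `mayHaveTransientFront` is hypothetical, and the "implicitly convertible to immutable" test is spelled out here as `is(E : immutable(E))`, which is an assumption about how that check would be expressed. As the thread points out, such a test can only flag ranges that might be transient; actual transience depends on runtime behavior.

```d
import std.range.primitives : ElementType, isInputRange;
import std.traits : hasIndirections;

// Hypothetical trait: true when the element type has indirections that are
// not all immutable, i.e. when front *could* be overwritten by popFront.
enum bool mayHaveTransientFront(R) =
    isInputRange!R
    && hasIndirections!(ElementType!R)
    && !is(ElementType!R : immutable(ElementType!R));

// Value elements: no indirections, so never flagged.
static assert(!mayHaveTransientFront!(int[]));
// Mutable char[] elements (as with byLine): flagged.
static assert(mayHaveTransientFront!(char[][]));
// string elements: all indirections are immutable, so not flagged.
static assert(!mayHaveTransientFront!(string[]));
```

Note that a range of class objects would be flagged by this sketch, which matches the concern raised in the thread that such a test treats a very large number of ranges as transient.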
Re: The Case Against Autodecode
On 5/29/2016 5:56 PM, H. S. Teoh via Digitalmars-d wrote: As far as Unicode is concerned, it is a standard for representing *written* text, not spoken language, so concepts like phonemes aren't even relevant in the first place. Let's not get derailed from the present discussion by confusing the two. As far as D is concerned, we are not going to invent our own concepts around text that is different from Unicode or redefine Unicode terms. Unicode is what it is, and D is going to work with it.
Re: String compare in words?
On average there would be less than 4 bytes remaining to compare. So a simple straightforward byte comparison should do the job efficiently.
Re: The Case Against Autodecode
On Sunday, 29 May 2016 at 17:35:35 UTC, Nick Sabalausky wrote: Unlike Python, we wouldn't be maintaining a "with auto-decoding" fork for years and years and years, ensuring nobody ever had a pressing reason to bother migrating. If it happens, they better. The D1 fork was maintained for almost three years for a good reason. Heck, we weather breaking fixes enough anyway. Not nearly on a scale similar to changing how strings are iterated; not since the D1/D2 split. It was an annoying pain (at least to me), but I got through it fine and never even entertained the thought of just sticking with the old compiler. Not sure most people even noticed it. Point is, in D, even when something does need to change, life goes on fine. As long as we don't maintain a long-term fork ;) The problem is not active users. The problem is companies who have > 10K LOC and libraries that are no longer maintained. E.g. It took Sociomantic eight years after D2's release to switch only a few parts of their projects to D2. With the loss of old libraries/old code (even old answers on SO), all of a sudden you lose a lot of the network effect that makes programming languages much more useful.
Re: String compare in words?
On Sunday, 29 May 2016 at 20:40:52 UTC, qznc wrote:
> On Sunday, 29 May 2016 at 18:15:16 UTC, qznc wrote:
> > On Sunday, 29 May 2016 at 17:38:17 UTC, Jonathan M Davis wrote:
> > > And if you're not simply comparing for equality, what are you looking to figure out? Without more information about what you're trying to do, it's kind of hard to help you.
> >
> > If I write the comparison naively, the assembly clearly shows a "movzbl" [0]. It loads a single byte! The other single byte load is encoded in the address mode of "cmp".
> >
> > Implementation:
> >
> >     bool stringcmp(string x, string y) {
> >         foreach(i; 0..x.length) {
> >             if (x[i] != y[i]) // byte compare
> >                 return false;
> >         }
> >         return true;
> >     }
> >
> > It makes no sense to load single bytes here. Since we only want to check for equality, we could load two full words and compare four or eight bytes in one go.
>
> Ok, to answer my own question, this looks good:
>
>     bool string_cmp_opt(immutable(ubyte)[] x, immutable(ubyte)[] y) {
>         pragma(inline, false);
>         if (x.length != y.length) return false;
>         int i=0;
>         // word-wise compare is faster than byte-wise
>         if (x.length > size_t.sizeof)
>             for (; i < x.length - size_t.sizeof; i+=size_t.sizeof) {
>                 size_t* xw = cast(size_t*) &x[i];
>                 size_t* yw = cast(size_t*) &y[i];
>                 if (*xw != *yw) return false;
>             }
>         // last sub-word part
>         for (; i < x.length; i+=1) {
>             if (x[i] != y[i]) // byte compare
>                 return false;
>         }
>         return true;
>     }
>
> Any comments or recommendations?

I don't know if this would be faster, but here is my attempt. It assumes the arrays start at an address multiple of 8.

    if (x is y) return true;
    if (x.length != y.length) return false;
    size_t l = x.length;
    // const pointers, so this also works for immutable(ubyte)[] arguments
    const(ubyte)* a = x.ptr, b = y.ptr;
    // compare 8 bytes at a time
    for (size_t n = l>>3; n != 0; --n, a+=8, b+=8)
        if (*cast(long*)a ^ *cast(long*)b) return false;
    // then the remaining 4, 2 and 1 bytes
    if (l & 4) {
        if (*cast(int*)a ^ *cast(int*)b) return false;
        a+= 4; b+= 4;
    }
    if (l & 2) {
        if (*cast(short*)a ^ *cast(short*)b) return false;
        a+=2; b+=2;
    }
    return (l & 1) && (*a ^ *b) ? false : true;

If the pointers are not on an address multiple of 8, one has to invert the trailing tests so that they consume the bytes at the front of the array until the address becomes a multiple of 8. The trailing tests could also be replaced by a simple sequential byte compare. I don't know which is faster.
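The variant mentioned last, 8-byte blocks followed by a plain sequential byte compare for the tail, could be sketched like this. It is an illustration rather than a benchmarked implementation: `equalWordwise` is a hypothetical name, and the sketch assumes a target such as x86 where unaligned 8-byte loads are permitted.

```d
import std.string : representation;

// Hypothetical helper: equality check that compares 8 bytes at a time and
// finishes with a byte-by-byte loop for the last (length % 8) bytes.
bool equalWordwise(const(ubyte)[] x, const(ubyte)[] y)
{
    if (x.length != y.length) return false;
    if (x.ptr is y.ptr) return true; // same memory, same length

    auto a = x.ptr;
    auto b = y.ptr;
    size_t n = x.length;

    // Word-wise part. Assumption: unaligned loads are acceptable here.
    for (; n >= ulong.sizeof; n -= ulong.sizeof)
    {
        if (*cast(const(ulong)*) a != *cast(const(ulong)*) b)
            return false;
        a += ulong.sizeof;
        b += ulong.sizeof;
    }

    // Sequential byte compare for the remaining bytes.
    for (; n != 0; --n)
    {
        if (*a != *b) return false;
        ++a;
        ++b;
    }
    return true;
}
```

For strings, `representation` from std.string gives the underlying `immutable(ubyte)[]`, for example `equalWordwise("abc".representation, "abd".representation)`.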
Re: The Case Against Autodecode
On Sun, May 29, 2016 at 01:13:36PM +, Tobias M via Digitalmars-d wrote: > On Sunday, 29 May 2016 at 12:41:50 UTC, Chris wrote: > > Ok, you have a point there, to be precise is a multigraph (a > > digraph)(cf. [1]). In French you can have multigraphs consisting of > > three or more characters /o/, as in Irish => /i:/. > > However, a phoneme is not necessarily a spoken "character" as > > represents one phoneme but consists of two "characters" or > > graphemes. can represent two different phonemes (voiced and > > unvoiced "th" as in `this` vs. `thorough`). > > What I meant was, a phoneme is the "character" (smallest unit) in a > spoken language, not that it corresponds to a character (whatever that > means). [...] Calling a phoneme a "character" is misleading. A phoneme is a logical sound unit in a spoken language, whereas a "character" is a unit of written language. The two do not necessarily have a direct correspondence (or even any correspondence whatsoever). In a language like English, whose writing system was codified many hundreds of years ago, the spoken language has sufficiently diverged from the written language (specifically, in the way words are spelt) that the correspondence between the two is complex at best, downright arbitrary at worst. For example, the 'o' in "women" and the 'i' in "fish" map to the same phoneme, the short /i/, in (common dialects of) spoken English, in spite of being two completely different characters. Therefore conflating "character" and "phoneme" is misleading and is only confusing the issue. As far as Unicode is concerned, it is a standard for representing *written* text, not spoken language, so concepts like phonemes aren't even relevant in the first place. Let's not get derailed from the present discussion by confusing the two. T -- What are you when you run out of Monet? Baroque.
[Issue 16094] error: overlapping slice assignment (CTFE)
https://issues.dlang.org/show_bug.cgi?id=16094

Era Scarecrow changed:

What     | Removed | Added
Keywords |         | CTFE
[Issue 16094] New: error: overlapping slice assignment (CTFE)
https://issues.dlang.org/show_bug.cgi?id=16094

Issue ID: 16094
Summary: error: overlapping slice assignment (CTFE)
Product: D
Version: D2
Hardware: x86
OS: Windows
Status: NEW
Severity: minor
Priority: P1
Component: dmd
Assignee: nob...@puremagic.com
Reporter: rtcv...@yahoo.com

The following function f works properly when compiled, but during CTFE it breaks and errors out with:

    Error: overlapping slice assignment [3..6] = [0..3]

    char[] f() {
        char[] x = new char[6];
        x[3..6] = x[0..3];
        return x;
    }
    enum cpy = f();

version: dmd 2.071.0 32bit
OS: Windows 7 x64
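Until the issue is fixed, one possible workaround is to copy element by element, which does not go through the slice-assignment check at all. This is a sketch under that assumption, not something confirmed in the report; `fillAndCopy` is a hypothetical variant of `f` that also fills the first half so the result can be checked.

```d
// Hypothetical variant of f() from the report that avoids slice assignment.
char[] fillAndCopy()
{
    char[] x = new char[6];

    // Fill the first half with 'a', 'b', 'c'.
    foreach (i; 0 .. 3)
        x[i] = cast(char)('a' + i);

    // Element-wise copy instead of x[3..6] = x[0..3].
    foreach (i; 0 .. 3)
        x[3 + i] = x[i];

    return x;
}

// Forces evaluation during CTFE.
static assert(fillAndCopy() == "abcabc");
```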
Re: A ready to use Vulkan triangle example for D
On Sunday, 29 May 2016 at 00:42:56 UTC, maik klein wrote: On Sunday, 29 May 2016 at 00:37:54 UTC, Alex Parrill wrote: On Saturday, 28 May 2016 at 19:32:58 UTC, maik klein wrote: Btw does this even work? I think the struct initializers have to be Foo foo = { someVar: 1 }; `:` instead of a `=` I didn't do this because I actually got autocompletion for `vertexInputStateCreateInfo.` and that meant less typing for me. No, its equals. In C it's a colon, which is a tad confusing. https://dpaste.dzfl.pl/bd29c970050a Gah, I got them backwards. Colon in D, equals in C. Could have sworn I checked before I posted that...
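For reference, the D form that the thread settles on (a colon between field name and value) looks like this in a small self-contained sketch. `Foo`, `scale` and `makeFoo` are hypothetical names used only for illustration; fields that are not named in the initializer keep their default values.

```d
struct Foo
{
    int someVar;
    float scale = 1.0f;
}

Foo makeFoo()
{
    // D struct initializer: `name: value`, not `name = value`.
    Foo foo = { someVar: 1 };
    return foo;
}
```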
Re: faster splitter
On Sunday, 29 May 2016 at 21:07:21 UTC, qznc wrote:
> On Sunday, 29 May 2016 at 18:50:40 UTC, Jon Degenhardt wrote:
> > A minor thing - you might consider also calculating the median and the median version of MAD (median of absolute deviations from the median). The reason is that benchmarks often have outliers in the max time dimension, and the median will do a better job of reducing the effect of those outliers than the mean. Your benchmark code could publish both forms.
>
> I don't think that would help. This is not a normal distribution, so mean and median don't match anyways. What would you learn from the median?

Oh, I see. The benchmark varies the data on each run and aggregates, is that right? Sorry, I missed that.
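The two statistics suggested in the quoted message can be computed as in the following sketch. The function names `median` and `medianAbsDev` are hypothetical and not taken from the benchmark code under discussion; the input is assumed to be a non-empty list of timings.

```d
import std.algorithm.iteration : map;
import std.algorithm.sorting : sort;
import std.array : array;
import std.math : fabs;

// Median of a non-empty sample; works on a sorted copy of the input.
double median(const(double)[] samples)
{
    assert(samples.length > 0, "median of an empty sample is undefined");
    auto sorted = samples.dup;
    sort(sorted);
    immutable n = sorted.length;
    return n % 2 ? sorted[n / 2]
                 : (sorted[n / 2 - 1] + sorted[n / 2]) / 2;
}

// MAD: median of the absolute deviations from the median.
double medianAbsDev(const(double)[] samples)
{
    immutable m = median(samples);
    return median(samples.map!(x => fabs(x - m)).array);
}
```

For the timings 1, 2, 3, 4 and 100, the mean is 22 while the median is 3 and the MAD is 1, which shows how little a single outlier moves the median-based figures.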
Re: PowerNex - New release of my D kernel
On Sunday, 29 May 2016 at 23:15:13 UTC, Wild wrote:
> Hey!
>
> I have a new release of my D kernel called PowerNex. This release should be a bit more interesting than the last one, which I released back in November 2015. This one contains a working memory manager, a custom TTY renderer, BMP image renderer, a VFS, etc. More information is in the GitHub release.
>
> https://github.com/Vild/PowerNex/releases/tag/v0.1.0-ALPHA
>
> The GitHub release also has a precompiled ISO. The project is fully open source and located at https://github.com/Vild/PowerNex under the MPLv2 license.
>
> Hopefully someone will find this interesting. All feedback is appreciated.
>
> -Dan

Interesting.
Re: A technique to mock "static interfaces" (e.g. isInputRange)
On Sunday, 29 May 2016 at 22:20:05 UTC, Era Scarecrow wrote: On Thursday, 26 May 2016 at 09:40:26 UTC, Atila Neves wrote: On Wednesday, 25 May 2016 at 21:52:37 UTC, Alex Parrill wrote: Have you looked at std.typecons.AutoImplement at all? I'd never seen it before, thanks! I recall adding on a wishlist somewhere that every week or something that a video or article is made about these things. Preferably a video. It could be them talking about design decisions of certain features in D, or more likely exposure to an entire module in D that covers the basics and when you'd use each function/feature/template and why. I'm not talking you need an hour long video or something, maybe like the introduction to the STL format, or even rapid firing and showing quick premade examples/use-cases of why it was used (for optimal or problem solving reasons). The video could be say 10 minutes long. There's so much in the library I would have to go through and try to remember and I know very little of it. Exposing myself to it feels like a chore sometimes, and often I don't. Do you know about Adam's This Week in D? http://arsdnet.net/this-week-in-d/2016-may-22.html Vladimir also tries to maintain his Planet of D http://planet.dsource.org/ But the main problem is just time (videos even take more time), maybe we should have a user-based newsletter or forum where everyone can submit articles.
Re: The Case Against Autodecode
On 5/29/2016 4:47 AM, Tobias Müller wrote: No, this is well established terminology, you are confusing several things here: For D, we should stick with the terminology as defined by Unicode.
Re: Free the DMD backend
On Sunday, 29 May 2016 at 10:56:57 UTC, Russel Winder wrote:
> This is why LDC should be seen in the D community as the main production toolchain, and Dub should default to LDC for compilation.

Agreed, especially since LDC supports more platforms.
Re: PowerNex - New release of my D kernel
On Sunday, 29 May 2016 at 23:15:13 UTC, Wild wrote:
> I have a new release of my D kernel called PowerNex.
>
> -Dan

Nice work!
PowerNex - New release of my D kernel
Hey!

I have a new release of my D kernel called PowerNex. This release should be a bit more interesting than the last one, which I released back in November 2015. This one contains a working memory manager, a custom TTY renderer, BMP image renderer, a VFS, etc. More information is in the GitHub release.

https://github.com/Vild/PowerNex/releases/tag/v0.1.0-ALPHA

The GitHub release also has a precompiled ISO. The project is fully open source and located at https://github.com/Vild/PowerNex under the MPLv2 license.

Hopefully someone will find this interesting. All feedback is appreciated.

-Dan
Re: Read registry keys recursively
On Sunday, 29 May 2016 at 16:46:34 UTC, Era Scarecrow wrote:
> you should see the problem. Here's the correct line!
>
>     writeRegistryKeys(k.getKey(key.name()));

This just occurred to me: I tried to keep to the example, but I shouldn't have. Since you already have the inner key, just pass that and it works. It's far more obvious what's going on now.

    writeRegistryKeys(key);
Re: A technique to mock "static interfaces" (e.g. isInputRange)
On Thursday, 26 May 2016 at 09:40:26 UTC, Atila Neves wrote: On Wednesday, 25 May 2016 at 21:52:37 UTC, Alex Parrill wrote: Have you looked at std.typecons.AutoImplement at all? I'd never seen it before, thanks! I recall adding on a wishlist somewhere that every week or something that a video or article is made about these things. Preferably a video. It could be them talking about design decisions of certain features in D, or more likely exposure to an entire module in D that covers the basics and when you'd use each function/feature/template and why. I'm not talking you need an hour long video or something, maybe like the introduction to the STL format, or even rapid firing and showing quick premade examples/use-cases of why it was used (for optimal or problem solving reasons). The video could be say 10 minutes long. There's so much in the library I would have to go through and try to remember and I know very little of it. Exposing myself to it feels like a chore sometimes, and often I don't.
Beta D 2.071.1-b2
Second beta for the 2.071.1 release. http://dlang.org/download.html#dmd_beta http://dlang.org/changelog/2.071.1.html Please report any bugs at https://issues.dlang.org -Martin
Re: String compare in words?
On Sunday, 29 May 2016 at 20:51:19 UTC, Seb wrote:
> On Sunday, 29 May 2016 at 20:40:52 UTC, qznc wrote:
> > [...]
>
> Isn't that something that the compiler should optimize for you when you do an equality comparison? Is it really faster than ldc (with all optimizations turned on)?

It can be faster because of inlining. memcmp is a runtime call.
Re: The Case Against Autodecode
On 05/12/2016 10:15 PM, Walter Bright wrote: > On 5/12/2016 9:29 AM, Andrei Alexandrescu wrote: >> I am as unclear about the problems of autodecoding as I am about the > necessity >> to remove curl. Whenever I ask I hear some arguments that work well > emotionally >> but are scant on reason and engineering. Maybe it's time to rehash > them? I just >> did so about curl, no solid argument seemed to come together. I'd be > curious of >> a crisp list of grievances about autodecoding. -- Andrei > > Here are some that are not matters of opinion. > > 6. Autodecoding has two choices when encountering invalid code units - > throw or produce an error dchar. Currently, it throws, meaning no > algorithms using autodecode can be made nothrow. There are more than 2 choices here, see the related discussion on avoiding redundant unicode validation https://issues.dlang.org/show_bug.cgi?id=14519#c32.
Re: full copies on assignment
On Friday, 27 May 2016 at 08:59:43 UTC, Marc Schütz wrote: Yes indeed it does. Thanks. Something in my version must have been different.
Re: faster splitter
On Sunday, 29 May 2016 at 18:50:40 UTC, Jon Degenhardt wrote:
> A minor thing - you might consider also calculating the median and the median version of MAD (median of absolute deviations from the median). The reason is that benchmarks often have outliers in the max time dimension, and the median will do a better job of reducing the effect of those outliers than the mean. Your benchmark code could publish both forms.

I don't think that would help. This is not a normal distribution, so mean and median don't match anyways. What would you learn from the median?

When looking at the assembly I don't like the single-byte loads. Since string (ubyte[] here) is of extraordinary importance, it should be worthwhile to use word loads [0] instead. Really fancy would be SSE. However, this is more advanced and requires further introspection.

[0] http://forum.dlang.org/post/aahhopdcrengakvoe...@forum.dlang.org
Re: String compare in words?
On Sunday, 29 May 2016 at 20:40:52 UTC, qznc wrote:
> On Sunday, 29 May 2016 at 18:15:16 UTC, qznc wrote:
> > [...]
>
> Ok, to answer my own question, this looks good:
>
>     bool string_cmp_opt(immutable(ubyte)[] x, immutable(ubyte)[] y) {
>         pragma(inline, false);
>         if (x.length != y.length) return false;
>         int i=0;
>         // word-wise compare is faster than byte-wise
>         if (x.length > size_t.sizeof)
>             for (; i < x.length - size_t.sizeof; i+=size_t.sizeof) {
>                 size_t* xw = cast(size_t*) &x[i];
>                 size_t* yw = cast(size_t*) &y[i];
>                 if (*xw != *yw) return false;
>             }
>         // last sub-word part
>         for (; i < x.length; i+=1) {
>             if (x[i] != y[i]) // byte compare
>                 return false;
>         }
>         return true;
>     }
>
> Any comments or recommendations?

Isn't that something that the compiler should optimize for you when you do an equality comparison? Is it really faster than ldc (with all optimizations turned on)?
Re: The Case Against Autodecode
On Sun, May 29, 2016 at 03:55:22PM -0400, Andrei Alexandrescu via Digitalmars-d wrote: > On 05/29/2016 09:42 AM, Tobias M wrote: > > On Friday, 27 May 2016 at 19:43:16 UTC, H. S. Teoh wrote: > > > On Fri, May 27, 2016 at 03:30:53PM -0400, Andrei Alexandrescu via > > > Digitalmars-d wrote: > > > > On 5/27/16 3:10 PM, ag0aep6g wrote: > > > > > I don't think there is value in distinguishing by language. > > > > > The point of Unicode is that you shouldn't need to do that. > > > > > > > > It seems code points are kind of useless because they don't > > > > really mean anything, would that be accurate? -- Andrei > > > > > > That's what we've been trying to say all along! :-P They're a > > > kind of low-level Unicode construct used for building "real" > > > characters, i.e., what a layperson would consider to be a > > > "character". > > > > Code points are *the fundamental unit* of unicode. AFAIK most (all?) > > algorithms in the unicode spec are defined in terms of code points. > > Sure, some algorithms also work on the code unit level. That can be > > used as an optimization, but they are still defined on code points. > > > > Code points are also abstracting over the different representations > > (UTF-...), providing a uniform "interface". > > So now code points are good? -- Andrei It depends on what you're trying to accomplish. That's the point we're trying to get at. For some operations, working with code points makes the most sense. But for other operations, it does not. There is no one representation that is best for all situations; it needs to be decided on a case-by-case basis. Which is why forcing everything to decode to code points eventually leads to problems. T -- Customer support: the art of getting your clients to pay for your own incompetence.
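The point that different operations call for different units can be made concrete with a small sketch. The helper names `codeUnits`, `codePoints` and `graphemes` are hypothetical; the Phobos calls used are `walkLength` from std.range and `byGrapheme` from std.uni. The example string is an "e" followed by U+0301 (combining acute accent), which a reader would see as one character.

```d
import std.range : walkLength;
import std.uni : byGrapheme;

// UTF-8 code units: the raw array length of a string.
size_t codeUnits(string s) { return s.length; }

// Code points: what iterating a string as a range (auto-decoding) yields.
size_t codePoints(string s) { return s.walkLength; }

// Graphemes: user-perceived characters.
size_t graphemes(string s) { return s.byGrapheme.walkLength; }

unittest
{
    string s = "e\u0301";
    assert(codeUnits(s) == 3);   // 'e' is 1 byte, U+0301 is 2 bytes in UTF-8
    assert(codePoints(s) == 2);  // two code points
    assert(graphemes(s) == 1);   // one visible character
}
```

Which of the three counts is the right one depends on the operation, which is the case-by-case decision described in the message above.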
Re: String compare in words?
On Sunday, 29 May 2016 at 17:42:48 UTC, Era Scarecrow wrote: Worse I'm not sure if the code generation already does that and possibly does a better job than what we could do by hand... Not with dmd v2.071.0 or ldc 0.17.1. At least not in all the variations I tried to trick them with, like copying into fixed-size array. Well, they did insert a memcmp and libc is probably optimized like that, but they copied data first, which is unnecessary.
Re: String compare in words?
On Sunday, 29 May 2016 at 18:15:16 UTC, qznc wrote:
> On Sunday, 29 May 2016 at 17:38:17 UTC, Jonathan M Davis wrote:
> > And if you're not simply comparing for equality, what are you looking to figure out? Without more information about what you're trying to do, it's kind of hard to help you.
>
> If I write the comparison naively, the assembly clearly shows a "movzbl" [0]. It loads a single byte! The other single byte load is encoded in the address mode of "cmp".
>
> Implementation:
>
>     bool stringcmp(string x, string y) {
>         foreach(i; 0..x.length) {
>             if (x[i] != y[i]) // byte compare
>                 return false;
>         }
>         return true;
>     }
>
> It makes no sense to load single bytes here. Since we only want to check for equality, we could load two full words and compare four or eight bytes in one go.

Ok, to answer my own question, this looks good:

    bool string_cmp_opt(immutable(ubyte)[] x, immutable(ubyte)[] y) {
        pragma(inline, false);
        if (x.length != y.length) return false;
        int i=0;
        // word-wise compare is faster than byte-wise
        if (x.length > size_t.sizeof)
            for (; i < x.length - size_t.sizeof; i+=size_t.sizeof) {
                size_t* xw = cast(size_t*) &x[i];
                size_t* yw = cast(size_t*) &y[i];
                if (*xw != *yw) return false;
            }
        // last sub-word part
        for (; i < x.length; i+=1) {
            if (x[i] != y[i]) // byte compare
                return false;
        }
        return true;
    }

Any comments or recommendations?
Re: Keeping a mutable reference to a struct with immutable members
On Sunday, 29 May 2016 at 19:52:37 UTC, Basile B. wrote:
> Do you have a simple, concise, runnable example to show?

This is the example I was using to test solutions; it's similar to where I encountered the problem in the first place.

    import core.stdc.stdlib : malloc, free;
    import std.stdio;
    import std.range;
    import std.traits;

    struct RepeatRange(Range) if(isForwardRange!Range){
        Range* source;
        Range original;

        this(Range original){
            this.original = original;
            this.repeat(original.save);
        }

        @property auto ref front(){
            return this.source.front;
        }
        void popFront(){
            this.source.popFront();
            if(this.source.empty) this.repeat(this.original.save);
        }

        @nogc void repeat(Range from){
            if(this.source) free(this.source);
            ubyte* newptr = cast(ubyte*) malloc(Range.sizeof);
            assert(newptr !is null, "Failed to allocate memory.");
            // byte-wise copy, since Range cannot be assigned to
            ubyte* fromptr = cast(ubyte*) &from;
            for(size_t i; i < Range.sizeof; i++) newptr[i] = fromptr[i];
            this.source = cast(Range*) newptr;
        }

        this(this){
            auto source = *this.source;
            this.source = null;
            this.repeat(source);
        }
        ~this(){
            if(this.source) free(this.source);
        }

        enum bool empty = false;
    }

    struct SomeForwardRange{
        int value = 0;
        const int other = 1; // Immutable member

        enum bool empty = false;
        @property auto ref save(){
            return SomeForwardRange(this.value);
        }
        @property auto ref front(){
            return this.value;
        }
        void popFront(){
            this.value++;
        }
    }

    void main(){
        auto range = RepeatRange!SomeForwardRange(SomeForwardRange(0));
        foreach(item; range.take(10)){
            writeln(item);
        }
    }
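One alternative to the manual malloc and byte copy, offered only as a sketch: if using the garbage collector is acceptable (the example above is @nogc, so this is a different trade-off), a one-element array literal copies the value onto the GC heap by initialization rather than assignment, and the pointer itself can then be rebound freely. `heapCopy` and `Item` are hypothetical names for illustration.

```d
// Hypothetical stand-in for a range type with a const member.
struct Item
{
    int value = 0;
    const int other = 1;
}

// Plain assignment is rejected for such a struct.
static assert(!__traits(compiles, { Item a, b; a = b; }));

// Copies `value` into a fresh GC allocation and returns a pointer to it.
// The array literal initializes its element, so no assignment is involved.
T* heapCopy(T)(T value)
{
    return [value].ptr;
}
```

A member such as `Item* source` can then be re-pointed with `source = heapCopy(Item(9));` without ever assigning to an existing `Item`.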
Re: A technique to mock "static interfaces" (e.g. isInputRange)
On Friday, 27 May 2016 at 18:49:12 UTC, Jacob Carlborg wrote: On 2016-05-27 15:12, Atila Neves wrote: [...] Hmm, here's the code inline: module red; [...] Yep, that's definitely crazier than what I posted. Nifty! Atila
Re: The Case Against Autodecode
On 05/29/2016 09:42 AM, Tobias M wrote: On Friday, 27 May 2016 at 19:43:16 UTC, H. S. Teoh wrote: On Fri, May 27, 2016 at 03:30:53PM -0400, Andrei Alexandrescu via Digitalmars-d wrote: On 5/27/16 3:10 PM, ag0aep6g wrote: > I don't think there is value in distinguishing by language. > The point of Unicode is that you shouldn't need to do that. It seems code points are kind of useless because they don't really mean anything, would that be accurate? -- Andrei That's what we've been trying to say all along! :-P They're a kind of low-level Unicode construct used for building "real" characters, i.e., what a layperson would consider to be a "character". Code points are *the fundamental unit* of unicode. AFAIK most (all?) algorithms in the unicode spec are defined in terms of code points. Sure, some algorithms also work on the code unit level. That can be used as an optimization, but they are still defined on code points. Code points are also abstracting over the different representations (UTF-...), providing a uniform "interface". So now code points are good? -- Andrei
Re: Keeping a mutable reference to a struct with immutable members
On Sunday, 29 May 2016 at 19:09:13 UTC, pineapple wrote: On Sunday, 29 May 2016 at 18:52:36 UTC, pineapple wrote: What's the best way to handle something like this? Well I did get something to work but it's ugly and I refuse to believe there isn't a better way to handle this. Where `Range` is an alias to a struct with an immutable member, and `this.source` is the attribute that I need to be able to re-assign to a locally scoped return value: this.source = cast(Range*) newptr; Do you have a simple, concise runnable example to show?
[Issue 16085] Imported name causes lookup deprecation warning even if masked by member name
https://issues.dlang.org/show_bug.cgi?id=16085

--- Comment #4 from Martin Nowak ---
This is just an occurrence of wrong code caused by issue 314, b/c selective imports weren't checked for visibility until recently.

    struct Bucketizer
    {
        import whatever : reallocate; // <- private
    }

    struct Segregator(LargeAllocator)
    {
        LargeAllocator _large;
        void reallocate()
        {
            _large.reallocate(); // deprecation
        }
    }

--
[Issue 16085] Imported name causes lookup deprecation warning even if masked by member name
https://issues.dlang.org/show_bug.cgi?id=16085

Martin Nowak changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |INVALID

--- Comment #3 from Martin Nowak ---
Error message at https://github.com/dlang/phobos/commit/140c51447feb951d62016cf62d8349e013ecd521^

std/experimental/allocator/building_blocks/segregator.d(188): Deprecation: std.experimental.allocator.building_blocks.bucketizer.Bucketizer!(FreeList!(GCAllocator, 0LU, 18446744073709551615LU, cast(Flag)false), 1LU, 128LU, 16LU).Bucketizer.reallocate is not visible from module std.experimental.allocator.building_blocks.segregator
std/experimental/allocator/building_blocks/segregator.d(182): Deprecation: std.experimental.allocator.building_blocks.bucketizer.Bucketizer!(FreeList!(GCAllocator, 0LU, 18446744073709551615LU, cast(Flag)false), 129LU, 256LU, 32LU).Bucketizer.reallocate is not visible from module std.experimental.allocator.building_blocks.segregator
std/experimental/allocator/building_blocks/segregator.d(188): Deprecation: std.experimental.allocator.building_blocks.bucketizer.Bucketizer!(FreeList!(GCAllocator, 0LU, 18446744073709551615LU, cast(Flag)false), 257LU, 512LU, 64LU).Bucketizer.reallocate is not visible from module std.experimental.allocator.building_blocks.segregator
std/experimental/allocator/building_blocks/segregator.d(182): Deprecation: std.experimental.allocator.building_blocks.bucketizer.Bucketizer!(FreeList!(GCAllocator, 0LU, 18446744073709551615LU, cast(Flag)false), 513LU, 1024LU, 128LU).Bucketizer.reallocate is not visible from module std.experimental.allocator.building_blocks.segregator
std/experimental/allocator/building_blocks/segregator.d(188): Deprecation: std.experimental.allocator.building_blocks.bucketizer.Bucketizer!(FreeList!(GCAllocator, 0LU, 18446744073709551615LU, cast(Flag)false), 1025LU, 2048LU, 256LU).Bucketizer.reallocate is not visible from module std.experimental.allocator.building_blocks.segregator
std/experimental/allocator/building_blocks/segregator.d(182): Deprecation: std.experimental.allocator.building_blocks.bucketizer.Bucketizer!(FreeList!(GCAllocator, 0LU, 18446744073709551615LU, cast(Flag)false), 2049LU, 3584LU, 512LU).Bucketizer.reallocate is not visible from module std.experimental.allocator.building_blocks.segregator

--
Re: Keeping a mutable reference to a struct with immutable members
On Sunday, 29 May 2016 at 18:52:36 UTC, pineapple wrote:
> What's the best way to handle something like this?

Well I did get something to work but it's ugly and I refuse to believe there isn't a better way to handle this. Where `Range` is an alias to a struct with an immutable member, and `this.source` is the attribute that I need to be able to re-assign to a locally scoped return value:

    import core.stdc.stdlib : malloc, free;

    if(this.source) free(source);
    ubyte* newptr = cast(ubyte*) malloc(Range.sizeof);
    assert(newptr !is null, "Failed to allocate memory.");
    Range saved = this.original.save;
    ubyte* savedptr = cast(ubyte*) &saved;
    for(size_t i; i < Range.sizeof; i++){
        newptr[i] = savedptr[i];
    }
    this.source = cast(Range*) newptr;
[Issue 15324] symbol is already defined / size of symbol changed
https://issues.dlang.org/show_bug.cgi?id=15324 --- Comment #6 from Ellery Newcomer--- this prevents pyd from compiling under gdc: https://github.com/ariovistus/pyd/issues/42 --
Keeping a mutable reference to a struct with immutable members
I found another post on this subject and the advice there was "don't put const members in your structs" - http://forum.dlang.org/thread/m87ln2$idv$1...@digitalmars.com This doesn't work out so well when the templated struct is referring to what happens to be a const array. I thought I could get it done with pointers, but then I realized my data was going out-of-scope. I tried using `Unqual!T thing_i_need_to_reassign_sometimes` where T was immutable but that didn't solve anything, either. What's the best way to handle something like this?
Re: faster splitter
On Sunday, 29 May 2016 at 12:22:23 UTC, qznc wrote: I played around with the benchmark. Some more numbers: The mean slowdown is 114, which means 14% slower than the fastest one. The mean absolute deviation (MAD) is 23. More precisely, the mean deviation above the mean slowdown of 103 is 100 and -13 below the mean slowdown. 1191 of the 10000 runs were above the mean slowdown and 8770 below. The 39 missing runs are equal to the mean slowdown. A minor thing - you might consider also calculating the median and the median version of MAD (median of absolute deviations from the median). The reason is that benchmarks often have outliers in the max time dimension, and the median will do a better job of reducing the effect of those outliers than the mean. Your benchmark code could publish both forms. --Jon
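The two statistics suggested above can be sketched as follows. This is only an illustration: the function names `median` and `medianAbsDev` and the sample numbers are made up for the example and are not taken from the benchmark in the thread.

```d
import std.algorithm : map, sort;
import std.array : array;
import std.math : abs;

/// Median of a non-empty sample; the input is copied before sorting.
double median(const(double)[] values)
{
    assert(values.length > 0, "median of an empty sample is undefined");
    auto sorted = values.dup;
    sort(sorted);
    immutable n = sorted.length;
    // Odd count: middle element. Even count: mean of the two middle elements.
    return n % 2 ? sorted[n / 2] : (sorted[n / 2 - 1] + sorted[n / 2]) / 2;
}

/// Median of the absolute deviations from the median.
double medianAbsDev(const(double)[] values)
{
    immutable m = median(values);
    return median(values.map!(v => abs(v - m)).array);
}

void main()
{
    // Hypothetical slowdown figures with one large outlier.
    double[] slowdowns = [100, 101, 103, 104, 500];
    assert(median(slowdowns) == 103);
    assert(medianAbsDev(slowdowns) == 2);
}
```

In this made-up sample the single outlier pulls the mean up to 181.6, while the median stays at 103, which is the effect described in the reply.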
Re: D plugin for Visual Studio Code
On Sunday, 29 May 2016 at 18:20:29 UTC, Martin Nowak wrote: I used the web search (which is really bad) and tried D, dlang, D lang, and D language. Apparently didn't try dlang, my fault. Would it be possible to add the other search terms? Same point had already been raised in a plugin review btw. Maybe you could also rename the plugin to dlang?
Re: Transient ranges
On Sunday, 29 May 2016 at 18:09:29 UTC, Steven Schveighoffer wrote:
> On 5/29/16 1:45 PM, Steven Schveighoffer wrote:
> > On 5/27/16 7:42 PM, Seb wrote:
> > > So what about the convention to explicitly declare a `.transient` enum member on a range, if the front element value can change?
> >
> > enum isTransient(R) = is(typeof(() {
> >     static assert(isInputRange!R);
> >     static assert(hasIndirections!(ElementType!R));
> >     static assert(!allIndirectionsImmutable!(ElementType!R)); // need to write this
> > }));
>
> obviously, this is better as a simple && statement between the three requirements :) When I started writing, I thought I'd have to write some runtime code.
>
> -Steve

Would that make a range of polymorphic objects transient?
[Issue 16016] Remove std.concurrencybase from the docs
https://issues.dlang.org/show_bug.cgi?id=16016 --- Comment #1 from github-bugzi...@puremagic.com --- Commits pushed to master at https://github.com/dlang/phobos https://github.com/dlang/phobos/commit/64ba09993f4d201e703cb13c0d1aa34bd83960f4 fix issue 16016 - remove std.concurrencybase from the docs https://github.com/dlang/phobos/commit/b58c1610a21ce96f9f2299fbe7f17c7cf40e8a7a Merge pull request #4377 from wilzbach/fix_issue_16016 fix issue 16016 - remove std.concurrencybase from the docs --
Re: D plugin for Visual Studio Code
On Sunday, 22 May 2016 at 12:39:08 UTC, WebFreak001 wrote: Is the name really that bad? I mean you can find it if you search for `dlang` in the editor because it has dlang in the description. I used the web search (which is really bad) and tried D, dlang, D lang, and D language.
Re: String compare in words?
On Sunday, 29 May 2016 at 17:38:17 UTC, Jonathan M Davis wrote: And if you're not simply comparing for equality, what are you looking to figure out? Without more information about what you're trying to do, it's kind of hard to help you. If I write the comparison naively, the assembly clearly shows a "movzbl" [0]. It loads a single byte! The other single byte load is encoded in the address mode of "cmp". Implementation: bool stringcmp(string x, string y) { foreach(i; 0..x.length) { if (x[i] != y[i]) // byte compare return false; } return true; } It makes no sense to load single bytes here. Since we only want to check for equality, we could load two full words and compare four or eight bytes in one go. This example is simplified and far-fetched. Actually, this is about the find algorithm [1]. [0] http://goo.gl/ttybAB [1] http://forum.dlang.org/post/vdjraubhtoqtxeshj...@forum.dlang.org
Re: Transient ranges
On 5/29/16 1:45 PM, Steven Schveighoffer wrote:
> On 5/27/16 7:42 PM, Seb wrote:
> > So what about the convention to explicitly declare a `.transient` enum member on a range, if the front element value can change?
>
> enum isTransient(R) = is(typeof(() {
>     static assert(isInputRange!R);
>     static assert(hasIndirections!(ElementType!R));
>     static assert(!allIndirectionsImmutable!(ElementType!R)); // need to write this
> }));

obviously, this is better as a simple && statement between the three requirements :) When I started writing, I thought I'd have to write some runtime code.

-Steve
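The `&&` form mentioned above might look like the sketch below. The helper that the post marks as "need to write this" does not exist in Phobos, so the version here is a deliberately simplified stand-in written for this example: it only recognises types without indirections, `immutable` types, and dynamic arrays of `immutable` elements.

```d
import std.range.primitives : ElementType, isInputRange;
import std.traits : hasIndirections;

// Simplified, hypothetical helper: true only when mutable data cannot be
// reached through T. Anything it does not recognise is treated as mutable.
template allIndirectionsImmutable(T)
{
    static if (!hasIndirections!T || is(T == immutable))
        enum allIndirectionsImmutable = true;
    else static if (is(T == E[], E))
        enum allIndirectionsImmutable = is(E == immutable);
    else
        enum allIndirectionsImmutable = false;
}

// The three requirements from the post, joined with &&.
enum isTransient(R) = isInputRange!R
    && hasIndirections!(ElementType!R)
    && !allIndirectionsImmutable!(ElementType!R);

static assert(!isTransient!(int[]));     // value elements
static assert(!isTransient!(string[]));  // immutable(char)[] elements
static assert( isTransient!(char[][]));  // mutable slices
static assert( isTransient!(Object[]));  // class references

void main() {}
```

The last assertion shows what the follow-up question in the thread is getting at: under this definition any range of mutable class references counts as transient, whether or not `front` is ever actually overwritten.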
[Issue 16093] Trivial case of passing a template function to another template function doesn't compile
https://issues.dlang.org/show_bug.cgi?id=16093

--- Comment #2 from Max Samukha ---
(In reply to b2.temp from comment #1)
> It's because f is local. When f is static (i.e like a free function) it works

    void main() {
        int x = 1;
        void f() { x += 1; }
        bar!f(); // ok
    }

'f' above is local, and the compiled program works as expected. Can't see why making it a template should affect the semantics. Also, if the following didn't compile, the whole "pass functions by alias" business would go down the drain:

    void bar(alias f)() { f(1); }

    void main() {
        int y = 1;
        alias f = (x) { y += x; }; // local, equivalent to "void f(A)(A x) { y += x; }"
        bar!f(); // ok
    }

--
Re: @trusting generic functions
On 5/28/16 7:50 AM, Lodovico Giaretta wrote:
> Let's say I have a generic function that uses pointers. It will be inferred @system by the compiler, but I know that the pointer usage can be @trusted. The problem is that if I declare the function @trusted, I'm also implicitly trusting any call to @system methods of the template parameter. Dumb example (pointer usage can be trusted, but doSomething may not):
>
>     auto doSomethingDumb(T)(ref T t) // I want it @trusted if doSomething is at least @trusted
>     {
>         T* pt = &t;
>         return pt.doSomething();
>     }
>
> Is there any way around this? Any way to declare a function @trusted as long as the methods of the template argument are at least @trusted? Thank you in advance.

You can create a trusted expression by using a lambda and immediately calling it. ag0aep6g brought it up. I would write it like this (untested, but I think this works):

    return (() @trusted => &t)().doSomething();

The key is to limit your code that is tainted by @trusted to as little code as possible. Note that doSomethingDumb will be inferred @safe and not @trusted. The compiler should NEVER infer @trusted (for obvious reasons).

-Steve
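A small self-contained sketch of the trusted-lambda approach described above, assuming the compiler's default safety settings. The types `SafeThing` and `SystemThing` are hypothetical stand-ins for the template argument; only the address-of expression is wrapped in `@trusted`, so the safety of `doSomething` is still inferred from the argument type.

```d
struct SafeThing
{
    int n;
    int doSomething() @safe { return n + 1; }
}

struct SystemThing
{
    int doSomething() @system { return 0; }
}

auto doSomethingDumb(T)(ref T t)
{
    // Only taking the address is trusted; the call itself is still checked.
    return (() @trusted => &t)().doSomething();
}

// Callable from @safe code when doSomething is @safe ...
static assert( __traits(compiles, () @safe { SafeThing s; doSomethingDumb(s); }));
// ... and rejected there when doSomething is @system.
static assert(!__traits(compiles, () @safe { SystemThing s; doSomethingDumb(s); }));

void main() @safe
{
    auto s = SafeThing(41);
    assert(doSomethingDumb(s) == 42);
}
```

Marking the whole function `@trusted` instead would make the second `static assert` fail, which is the hole the original question was worried about.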
Re: Split general into multiple threads
On 5/29/16 7:28 AM, Dicebot wrote: On 05/26/2016 08:07 PM, Seb wrote: I think we all agree that general is having too much traffic and according to CyberShadow [1] this again is just an approval issue, however I expect this a bit controversial, so please no OT! Only other category proposals. Proposed categories: - DMD - DRuntime - Phobos - Language design (or Idea pool) - D Foundation + resources - Events - Other (formerly known as General) I want to stress that whatever categories we pick, we have to adapt them anyways if we realize that something is noisy again or too silent. https://github.com/CyberShadow/DFeed/issues/66 Without moderators to move mismatching topic between groups any more fine grained separation will do more harm than good. Note that this isn't possible with the NG being backed by NNTP. It would be nice to be able to move conversations. Instead of "please use D.learn instead", you would see "moved to more appropriate D.learn". I'd hate to lose my NNTP interface though :) -Steve
Re: String compare in words?
On Sunday, 29 May 2016 at 17:38:17 UTC, Jonathan M Davis wrote: In what way are you trying to compare them? If all you're doing is comparing them for equality, then just use ==. e.g. if(str1 == str2) { } And if you're not simply comparing for equality, what are you looking to figure out? Without more information about what you're trying to do, it's kind of hard to help you. I'm reminded that the GNU stdlib has a string compare function which defaults to using larger double words to get a speedup, and I think he wants to do that the same way. Although unless they are both the same size and both divisible by the size of size_t, then it's a multi-stage process to do correctly. Worse I'm not sure if the code generation already does that and possibly does a better job than what we could do by hand...
Re: Transient ranges
On 5/27/16 7:42 PM, Seb wrote:
> So what about the convention to explicitly declare a `.transient` enum member on a range, if the front element value can change?

    enum isTransient(R) = is(typeof(() {
        static assert(isInputRange!R);
        static assert(hasIndirections!(ElementType!R));
        static assert(!allIndirectionsImmutable!(ElementType!R)); // need to write this
    }));

-Steve
Re: Transient ranges
On 5/27/16 9:48 PM, Jonathan M Davis via Digitalmars-d wrote: On Friday, May 27, 2016 23:42:24 Seb via Digitalmars-d wrote: So what about the convention to explicitely declare a `.transient` enum member on a range, if the front element value can change? Honestly, I don't think that supporting transient ranges is worth it. Every single range-based function would have to either test that the "transient" enum wasn't there or take transient ranges into account, and realistically, that isn't going to happen. For better or worse, we do have byLine in std.stdio, which has a transient front, but aside from the performance benefits, it's been a disaster. Wholly disagree. If we didn't cache the element, D would be a laughingstock of performance-minded tests. It's way too error-prone. We now have byLineCopy to combat that, but of course, byLine is the more obvious function and thus more likely to be used (plus it's been around longer), so a _lot_ of code is going to end up using it, and a good chunk of that code really should be using byLineCopy. There's nothing actually wrong with using byLine, and copying on demand. Why such a negative connotation? I'm of the opinion that if you want a transient front, you should just use opApply and skip ranges entirely. So you want to make this code invalid? Why? foreach(i; map!(a => a.to!int)(stdin.byLine)) { // process each integer ... } You want to make me copy each line to a heap-allocated string so I can parse it?!! Allowing for front to be transient - whether you can check for it or not - simply is not worth the extra complications. I'd love it if we deprecated byLine's range functions, and made it use opApply instead and just declare transient ranges to be completely unsupported. If you want to write your code to have a transient front, you can obviously take that risk, but you're on your own. There is no way to disallow front from being transient. 
In fact, it should be assumed that it is the default unless it's wholly a value-type. -Steve
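The disagreement above is easier to follow with a concrete transient range. The sketch below does not use `File.byLine`; it defines a made-up range, `ReusedBufferLines`, that reuses one buffer for every `front` the way the thread describes, and shows both the pitfall (storing the slices) and the copy-on-demand pattern defended in the reply.

```d
import std.algorithm : map;
import std.array : array;

/// Hypothetical range whose front always points into the same buffer.
struct ReusedBufferLines
{
    private const(string)[] lines;
    private char[] buffer;

    this(const(string)[] lines, size_t capacity)
    {
        this.lines = lines;
        buffer = new char[](capacity);
    }

    @property bool empty() const { return lines.length == 0; }

    @property char[] front()
    {
        immutable n = lines[0].length;
        buffer[0 .. n] = lines[0][];  // overwrite the shared buffer
        return buffer[0 .. n];
    }

    void popFront() { lines = lines[1 .. $]; }
}

void main()
{
    auto src = ["aa", "bb", "cc"];

    // Storing front directly keeps three slices of the same buffer.
    auto aliased = ReusedBufferLines(src, 8).array;
    foreach (line; aliased)
        assert(line == "cc");

    // Copying on demand gives independent strings.
    auto copied = ReusedBufferLines(src, 8).map!(l => l.idup).array;
    assert(copied == ["aa", "bb", "cc"]);
}
```

Code that consumes each element before advancing, like the `map!(a => a.to!int)` pipeline in the reply, never notices the reuse; only code that keeps earlier elements around is affected.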
Re: The Case Against Autodecode
On 05/12/2016 08:47 PM, Jack Stouffer wrote:
> If you're serious about removing auto-decoding, which I think you and others have shown has merits, you have to have THE SIMPLEST migration path ever, or you will kill D. I'm talking a simple press of a button. I'm not exaggerating here. Python, a language which was much more popular than D at the time, came out with two versions in 2008: Python 2.7 which had numerous unicode problems, and Python 3.0 which fixed those problems. Almost eight years later, and Python 2 is STILL the more popular version despite Py3 having five major point releases since and Python 2 only getting security patches. Think the tango vs phobos problem, only a little worse. D is much less popular now than was Python at the time, and Python 2 problems were more straight forward than the auto-decoding problem. You'll need a very clear migration path, years long deprecations, and automatic tools in order to make the transition work, or else D's usage will be permanently damaged.

As much as I agree on the importance of a good smooth migration path, I don't think the "Python 2 vs 3" situation is really all that comparable here. Unlike Python, we wouldn't be maintaining a "with auto-decoding" fork for years and years and years, ensuring nobody ever had a pressing reason to bother migrating. And on top of that, we don't have a culture and design philosophy that promotes "do the lazy thing first and the robust thing never". D users are more likely than dynamic language users to be willing to make a few changes for the sake of improvement. Heck, we weather breaking fixes enough anyway. There was even one point within the last couple years where something (forget offhand what it was) was removed from std.datetime and its replacement was added *in the very same compiler release*. No transition period. It was an annoying pain (at least to me), but I got through it fine and never even entertained the thought of just sticking with the old compiler.
Not sure most people even noticed it. Point is, in D, even when something does need to change, life goes on fine. As long as we don't maintain a long-term fork ;) Naturally, minimizing breakage is important here, but I really don't think Python's UTF migration situation is all that comparable.
Re: String compare in words?
On Sunday, May 29, 2016 17:13:49 qznc via Digitalmars-d-learn wrote: > Given two string (or char[] or ubyte[]) objects, I want to > compare them. The naive loop accesses the arrays byte-wise. How > could I turn this into a word-wise compare for better performance? > > Is a cast into size_t[] ok? Some Phobos helper functions? In what way are you trying to compare them? If all you're doing is comparing them for equality, then just use ==. e.g. if(str1 == str2) { } And if you're not simply comparing for equality, what are you looking to figure out? Without more information about what you're trying to do, it's kind of hard to help you. - Jonathan M Davis
Re: String compare in words?
On Sunday, 29 May 2016 at 17:13:49 UTC, qznc wrote: Given two string (or char[] or ubyte[]) objects, I want to compare them. The naive loop accesses the arrays byte-wise. How could I turn this into a word-wise compare for better performance? Is a cast into size_t[] ok? Some Phobos helper functions? Assuming you don't have codepoints and only want to do a raw compare, I believe you can as long as the sizes align.
Re: Transient ranges
On 5/29/16 7:28 AM, ZombineDev wrote: On Sunday, 29 May 2016 at 11:15:19 UTC, Dicebot wrote: On 05/28/2016 08:27 PM, Joseph Rushton Wakeling wrote: On Saturday, 28 May 2016 at 01:48:08 UTC, Jonathan M Davis wrote: On Friday, May 27, 2016 23:42:24 Seb via Digitalmars-d wrote: So what about the convention to explicitely declare a `.transient` enum member on a range, if the front element value can change? Honestly, I don't think that supporting transient ranges is worth it. I have personally wondered if there was a case for a TransientRange concept where the only primitives defined are `empty` and `front`. `popFront()` is not defined because the whole point is that every single call to `front` will produce a different value. I would prefer such ranges to not have `front` and return new item from `popFront` instead but yes, I would much prefer it to existing form, transient or not. It is impossible to correctly define input range without caching front which may not be always possible and may have negative performance impact. Because of that, a lot of Phobos ranges compromise `front` consistency in favor of speed up - which only seems to work because most algorithms need to access `front` once. I believe this is biggest issue in D ranges design right now, by large margin. +1 I think making popFront return a value for transient ranges is a sound idea. It would allow to easily distinguish between InputRange and TransientRange with very simple CT introspection. The biggest blocker is to teach the compiler to recognize TransientRange types in foreach. This doesn't help at all. I can still make a "transient" range with all three range primitives. There seems to be a misunderstanding about what a transient range is. byLine is a transient range that requires the front element be cacheable (I have to build the line somewhere, and reusing that buffer provides performance). Shoehorning into a popFront-only style "range" does not solve the problem. 
Not only that, but now I would have to cache BOTH the front element and the next one. -Steve
Re: Transient ranges
On 5/29/16 7:15 AM, Dicebot wrote: On 05/28/2016 08:27 PM, Joseph Rushton Wakeling wrote: On Saturday, 28 May 2016 at 01:48:08 UTC, Jonathan M Davis wrote: On Friday, May 27, 2016 23:42:24 Seb via Digitalmars-d wrote: So what about the convention to explicitely declare a `.transient` enum member on a range, if the front element value can change? Honestly, I don't think that supporting transient ranges is worth it. I have personally wondered if there was a case for a TransientRange concept where the only primitives defined are `empty` and `front`. `popFront()` is not defined because the whole point is that every single call to `front` will produce a different value. I would prefer such ranges to not have `front` and return new item from `popFront` instead but yes, I would much prefer it to existing form, transient or not. It is impossible to correctly define input range without caching front which may not be always possible and may have negative performance impact. Because of that, a lot of Phobos ranges compromise `front` consistency in favor of speed up - which only seems to work because most algorithms need to access `front` once. What problems are solvable only by not caching the front element? I can't think of any. And there is no way to define "transient" ranges in a way other than explicitly declaring they are transient. There isn't anything inherent or introspectable about such ranges. -Steve
String compare in words?
Given two string (or char[] or ubyte[]) objects, I want to compare them. The naive loop accesses the arrays byte-wise. How could I turn this into a word-wise compare for better performance? Is a cast into size_t[] ok? Some Phobos helper functions?
Re: The Case Against Autodecode
On Sunday, 29 May 2016 at 13:04:18 UTC, Tobias M wrote: On Sunday, 29 May 2016 at 12:08:52 UTC, default0 wrote: I am pretty sure that a single grapheme in unicode does not correspond to your notion of "character". I am pretty sure that what you think of as a "character" is officially called "Grapheme Cluster" not "Grapheme". Grapheme is a linguistic term. AFAIUI, a grapheme cluster is a cluster of codepoints representing a grapheme. It's called "cluster" in the unicode spec, because there is no dedicated grapheme unit. I put "character" into quotes, because the term is not really well defined. I just used it for a short and concise answer. I'm sure there's a better/more correct definition of grapheme/phoneme, but it's probably also much longer and more complicated. Which is why we need to agree on a terminology, i.e. be clear when we use linguistic terms and when we use Unicode specific terminology.
[Issue 5108] Clarification on template alias parameters
https://issues.dlang.org/show_bug.cgi?id=5108 b2.t...@gmx.com changed: What|Removed |Added CC||b2.t...@gmx.com --- Comment #1 from b2.t...@gmx.com --- see also https://issues.dlang.org/show_bug.cgi?id=16093 --
[Issue 16093] Trivial case of passing a template function to another template function doesn't compile
https://issues.dlang.org/show_bug.cgi?id=16093

b2.t...@gmx.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |b2.t...@gmx.com

--- Comment #1 from b2.t...@gmx.com ---
It's because f is local. When f is static (i.e like a free function) it works

    void bar(alias f)() { f(); }

    void main() {
        static void f()() { }
        bar!f();
    }

The doc is not clear about this case. So even if this is invalid the doc needs at least to list exactly what's supported or not. https://issues.dlang.org/show_bug.cgi?id=5108

--
Re: Copyright for Phobos to D Foundation
On Sunday, 29 May 2016 at 10:59:57 UTC, rikki cattermole wrote: Uninstall dmd and have ldc installed. Until the maintainers fix their problem with not sharing or upstreaming bug fixes and contributions. Then I would not make such endorsements.
Re: Transient ranges
On Sunday, 29 May 2016 at 15:45:14 UTC, Joseph Rushton Wakeling wrote: On Sunday, 29 May 2016 at 11:28:11 UTC, ZombineDev wrote: On Sunday, 29 May 2016 at 11:15:19 UTC, Dicebot wrote: I would prefer such ranges to not have `front` and return new item from `popFront` instead but yes, I would much prefer it to existing form, transient or not. It is impossible to correctly define input range without caching front which may not be always possible and may have negative performance impact. Because of that, a lot of Phobos ranges compromise `front` consistency in favor of speed up - which only seems to work because most algorithms need to access `front` once. I believe this is biggest issue in D ranges design right now, by large margin. +1 I think making popFront return a value for transient ranges is a sound idea. It would allow to easily distinguish between InputRange and TransientRange with very simple CT introspection. The biggest blocker is to teach the compiler to recognize TransientRange types in foreach. I don't follow your reasoning here. In the proposal I put forward, if a range doesn't define `popFront()`, it's not an InputRange, it's a TransientRange. Conversely, if it _does_ define `popFront()`, it _is_ an InputRange. What's the problem with introspecting that? Yes it can be introspected in library, but it breaks the expectations of the unsuspecting user. For example: auto r = getSomeRange(); assert (r.front == r.front); So, under your suggestion, the above could fail just because an implementation detail was changed like modifying some input range to become a transient range and somehow the code still compiled. Now I'm leaning more towards the buffer + popFront + empty trio, because it's harder to misuse. An alternative would be tagging such ranges with an enum or UDA, but we would need to verify/change too many places in phobos for this to work.
Re: Read registry keys recursively
On Sunday, 29 May 2016 at 15:48:49 UTC, TheDGuy wrote:
> Hello, I am wondering what is wrong with my code:
>
>     import std.windows.registry;
>     import std.stdio;
>
>     void main(){
>         Key lclM = Registry.localMachine();
>         Key hrdw = lclM.getKey("HARDWARE");
>         writeRegistryKeys(hrdw);
>     }
>
>     void writeRegistryKeys(Key k){
>         foreach(Key key; k.keys){
>             writeRegistryKeys(key.getKey(key.name()));
>         }
>         writeln(k.name());
>     }
>
> I get: std.windows.registry.RegistryException@std\windows\registry.d(511): Failed to open requested key: "ACPI"
>
> Even though there is a key called 'ACPI' under localmachine/hardware?

Well this was a fun thing to figure out. Geez... You have everything good except for one line.

    writeRegistryKeys(key.getKey(key.name()));

Let's translate that. Assume key = ACPI, so the call becomes ACPI.getKey("ACPI"): it asks the ACPI key for a subkey that is also named "ACPI", and that lookup fails. The subkey has to be requested from the parent `k` instead. Here's the correct line!

    writeRegistryKeys(k.getKey(key.name()));
Re: Copyright for Phobos to D Foundation
On Sunday, 29 May 2016 at 10:54:34 UTC, Russel Winder wrote: On Sat, 2016-05-28 at 17:50 +, Seb via Digitalmars-d wrote: One thing that confused me a lot in the beginning, is that every Phobos module has its own copyright - I am not a lawyer, but it sounded for me pretty weird that in theory I could get sued by a lot of Oracle-like patent trolls. I imagine the same effect also for companies when they read a different copyright on every module in Phobos. I am not sure of the situation with GDC since the GCC folk are involved with that – Iain may be able to take a view on this. Also I am not totally sure of the LDC situation. However I think it would be a very good idea if DMD, LDC, GDC, Phobos, druntime, and Dub and the repository were all copyright the D Foundation, and that all contributions had a copyright share or transfer. Licensing the common parts of the frontend and library as under the D Foundation should be no problem. Walter may have to let the FSF know though in case a clause needs changing in the copyright assignments.
Re: Transient ranges
On Sunday, 29 May 2016 at 15:45:14 UTC, Joseph Rushton Wakeling wrote: On Sunday, 29 May 2016 at 11:28:11 UTC, ZombineDev wrote: On Sunday, 29 May 2016 at 11:15:19 UTC, Dicebot wrote: I would prefer such ranges to not have `front` and return new item from `popFront` instead but yes, I would much prefer it to existing form, transient or not. It is impossible to correctly define input range without caching front which may not be always possible and may have negative performance impact. Because of that, a lot of Phobos ranges compromise `front` consistency in favor of speed up - which only seems to work because most algorithms need to access `front` once. I believe this is biggest issue in D ranges design right now, by large margin. +1 I think making popFront return a value for transient ranges is a sound idea. It would allow to easily distinguish between InputRange and TransientRange with very simple CT introspection. The biggest blocker is to teach the compiler to recognize TransientRange types in foreach. I don't follow your reasoning here. In the proposal I put forward, if a range doesn't define `popFront()`, it's not an InputRange, it's a TransientRange. Conversely, if it _does_ define `popFront()`, it _is_ an InputRange. What's the problem with introspecting that? Nothing. Just that it could lead to a lot of surprising mistakes, currently the following yields an error: struct A { auto front(){ ...} enum empty = false; } A as; foreach (a; as) {} // ERROR: no method popFront found with the proposed change it would compile.
Re: Transient ranges
On Sunday, 29 May 2016 at 15:45:14 UTC, Joseph Rushton Wakeling wrote: What's the problem with introspecting that? There is none :) it could be implemented today.
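As a rough illustration of "it could be implemented today", the proposal (only `empty` and `front`, no `popFront`) can be detected with ordinary compile-time introspection. The trait names below are invented for this sketch and are not part of Phobos.

```d
import std.range.primitives : isInputRange;

// Hypothetical trait: R offers a usable `empty` and a non-void `front`.
enum hasEmptyAndFront(R) = is(typeof((R r) {
    bool e = r.empty;
    auto f = r.front;
}));

// Under the proposal, a range with empty/front but no popFront is transient.
enum isProposedTransientRange(R) = hasEmptyAndFront!R && !isInputRange!R;

struct Counter  // every call to front yields a new value
{
    int n;
    enum empty = false;
    @property int front() { return n++; }
}

struct UpToThree  // ordinary input range
{
    int n;
    @property bool empty() const { return n >= 3; }
    @property int front() const { return n; }
    void popFront() { ++n; }
}

static assert( isProposedTransientRange!Counter);
static assert(!isProposedTransientRange!UpToThree);
static assert(!isProposedTransientRange!int);

void main() {}
```

This only covers detection in library code; the concerns raised earlier in the thread about `foreach` support and about code that assumes `r.front == r.front` are separate questions.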
Re: How to hash any type to an integer?
On Sunday, 29 May 2016 at 11:05:21 UTC, Gary Willoughby wrote: I'm currently implementing a hash map as an exercise and wondered if there is a built-in function I could use to hash keys effectively? What I'm looking for is a function that hashes any variable (of any type) to an integer. I've been looking at the `getHash` function of the `TypeInfo` class but that only seems to return the passed pointer. Any ideas? How about hashOf? https://dlang.org/phobos/object.html#.hashOf
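A minimal sketch of how `hashOf` could be used for the bucket lookup in a hand-written hash map. The helper `bucketFor` and the `Point` struct are made up for this example; only `hashOf` itself comes from druntime's `object` module, which is imported implicitly.

```d
import std.stdio : writeln;

// Illustrative key type, not from the original question.
struct Point
{
    int x, y;
}

/// Maps any hashable key to a bucket index in [0, bucketCount).
/// `bucketFor` is a hypothetical helper name.
size_t bucketFor(K)(K key, size_t bucketCount)
{
    return hashOf(key) % bucketCount;
}

void main()
{
    writeln(bucketFor(42, 16));
    writeln(bucketFor("hello", 16));
    writeln(bucketFor(Point(1, 2), 16));
}
```

The concrete hash values are implementation details and may differ between compiler releases, so they should not be persisted or compared across builds.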
Re: Is placing data with align(32) on the stack with 16-byte alignment an error?
Am Sun, 29 May 2016 13:20:12 + schrieb Johan Engelen: > On Sunday, 29 May 2016 at 12:07:02 UTC, Marco Leise wrote: > > > > void main() { > > import core.simd; > > Matrix4x4 matrix; // No warning > > float8 vector; // No warning > > } > > Did you do some LDC IR/asm testing? No :) > With LDC, the type `float8` has 32-byte alignment and so will be > placed with that alignment on the stack. Ok, so practically all compilers honor the alignment attribute and DMD should follow suit. If I'm not mistaken, this is also a C interop ABI issue now. > For your Matrix4x4 user > type (I'll assume you meant to write `align(64)`), that alignment > becomes part of the type and will be put on the stack with > 64-byte alignment. (aliasing does not work: `alias Byte8 = > align(8) byte; Byte8 willBeUnaligned;`) Actually align(64), yes. But for this example align(32) was enough as I just wanted to focus on AVX types now. > I believe LDC respects the type's alignment when selecting > instructions, so when you specified align(32) byte for your type > it can use the aligned load instructions. If you did not specify > that alignment, or a lower alignment, it will use unaligned loads. > > A problem arises when you cast a (pointer of a) type with lower > alignment to a type with higher alignment; in that case, > currently LDC assumes that cast was valid in terms of alignment > and ! > > -Johan That sounds reasonable. Thanks for the insight. -- Marco
Read registry keys recursively
Hello, i am wondering what is wrong with my code: import std.windows.registry; import std.stdio; void main(){ Key lclM = Registry.localMachine(); Key hrdw = lclM.getKey("HARDWARE"); writeRegistryKeys(hrdw); } void writeRegistryKeys(Key k){ foreach(Key key; k.keys){ writeRegistryKeys(key.getKey(key.name())); } writeln(k.name()); } i get: std.windows.registry.RegistryException@std\windows\registry.d(511): Failed to open requested key: "ACPI" Even though there is a key called 'ACPI' under localmachine/hardware?
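A possible explanation, offered as an assumption rather than a confirmed diagnosis: the `key` handed to the loop body by `k.keys` appears to be the already-opened subkey, so `key.getKey(key.name())` would look for a key named "ACPI" inside ACPI itself. Under that assumption, recursing on `key` directly should be enough. The sketch below is Windows-only and untested here; individual keys may still throw a `RegistryException` if access is denied.

```d
version (Windows)
{
    import std.stdio : writeln;
    import std.windows.registry;

    void writeRegistryKeys(Key k)
    {
        // Assumption: each `key` is the opened subkey itself,
        // so it can be passed on without another getKey call.
        foreach (Key key; k.keys)
            writeRegistryKeys(key);
        writeln(k.name);
    }

    void main()
    {
        Key hardware = Registry.localMachine.getKey("HARDWARE");
        writeRegistryKeys(hardware);
    }
}
else
{
    // The registry API only exists on Windows.
    void main() {}
}
```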
Re: Transient ranges
On Sunday, 29 May 2016 at 11:28:11 UTC, ZombineDev wrote: On Sunday, 29 May 2016 at 11:15:19 UTC, Dicebot wrote: I would prefer such ranges to not have `front` and return new item from `popFront` instead but yes, I would much prefer it to existing form, transient or not. It is impossible to correctly define input range without caching front which may not be always possible and may have negative performance impact. Because of that, a lot of Phobos ranges compromise `front` consistency in favor of speed up - which only seems to work because most algorithms need to access `front` once. I believe this is biggest issue in D ranges design right now, by large margin. +1 I think making popFront return a value for transient ranges is a sound idea. It would allow to easily distinguish between InputRange and TransientRange with very simple CT introspection. The biggest blocker is to teach the compiler to recognize TransientRange types in foreach. I don't follow your reasoning here. In the proposal I put forward, if a range doesn't define `popFront()`, it's not an InputRange, it's a TransientRange. Conversely, if it _does_ define `popFront()`, it _is_ an InputRange. What's the problem with introspecting that?
Re: D, GTK, Qt, wx,…
On Sunday, 29 May 2016 at 11:03:36 UTC, Russel Winder wrote: From what I can tell QtD is in need of effort or restarting. I will probably give it another shot when D has better interop with C++. Particularly, when multiple inheritance of C++ interfaces is implemented, Walter admits that mingling C++ namespaces into D name hierarchy is a horrible idea, and so on.
Re: SQLite-D alpha is here [did it mention it works at CTFE]
Support for reading Index-Trees and there(WITHOUT ROWID) tables, has landed in master! Now the real fun can start. CTFE Query translation and optimization.
Re: aliasing expressions and identifiers
On Friday, 27 May 2016 at 11:50:42 UTC, Marc Schütz wrote: On Friday, 27 May 2016 at 10:04:14 UTC, Nick Treleaven wrote: On Thursday, 26 May 2016 at 08:29:41 UTC, Marc Schütz wrote: RCArray!int arr = [7]; ref r = arr[0]; arr = [9];// this releases the old array r++; // use after free ... statically prevent the above from compiling using: RCArray(T) { ... ref opIndex(size_t) return; Local refs cannot be assigned from a function returning ref if that function has any parameters marked with the return attribute. If there is no attribute, local refs + function returning ref is OK. Huh? `return` means that the returned reference is owned by the RCArray struct and must therefore not outlive it. If the RCArray is a local variable (or parameter), the local ref is always declared after it (because it must be initialized immediately), and will have a shorter scope than the RCArray. Therefore, such an assignment is always accepted. What about if the RCArray (of ref count 1) is assigned to a different one after the local ref is initialised? That is what we're discussing -it's your example above(!) It can be solved in one of two ways: Either by making the owner (`arr`) non-mutable during the existence of the references, thereby forbidding the call to `bar()` (I would prefer this one, as it's cleaner and can be used for many more things, e.g. the byLine problem) I don't see directly how this affects byLine.front, that does not return a reference. It returns a reference in the wider sense, namely a slice to a private buffer that gets overwritten by each call to `byLine`. Currently, DIP25 only applies to `ref`s in the narrow sense, but I'm assuming it will be generalized to include pointer, slices, class references, AAs and hidden context pointers. 
Making the ByLine range constant as long as there's a reference to its buffer would prevent surprises like this: auto lines = stdin.byLine.array; // => all elements of `lines` are the same, should have used `byLineCopy` So your solution would statically prevent popFront because front has escaped. I think we should just prevent front from escaping.
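For readers who have not run into the `byLine` surprise, here is a small sketch of the two usual ways to store lines safely: copy each line out of the reused buffer, or use `byLineCopy`. The helper name `readAllLines` and the temporary file name are invented for the example.

```d
import std.algorithm : map;
import std.array : array;
import std.file : remove, tempDir, write;
import std.path : buildPath;
import std.stdio : File;

/// Reads all lines, copying each one out of byLine's reused buffer.
/// `readAllLines` is a hypothetical helper, not a Phobos function.
string[] readAllLines(string path)
{
    return File(path).byLine.map!(line => line.idup).array;
}

void main()
{
    auto path = buildPath(tempDir(), "byline_demo.txt");
    write(path, "aaa\nbbb\nccc\n");
    scope (exit) remove(path);

    // Copying each line before storing it keeps the lines distinct.
    assert(readAllLines(path) == ["aaa", "bbb", "ccc"]);

    // byLineCopy allocates a fresh string per line by itself.
    assert(File(path).byLineCopy.array == ["aaa", "bbb", "ccc"]);
}
```

Storing the result of plain `byLine.array` is the case the thread wants to reject statically: the stored slices may all alias the same internal buffer.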
Re: The Case Against Autodecode
On Friday, 27 May 2016 at 19:43:16 UTC, H. S. Teoh wrote: On Fri, May 27, 2016 at 03:30:53PM -0400, Andrei Alexandrescu via Digitalmars-d wrote: On 5/27/16 3:10 PM, ag0aep6g wrote: > I don't think there is value in distinguishing by language. > The point of Unicode is that you shouldn't need to do that. It seems code points are kind of useless because they don't really mean anything, would that be accurate? -- Andrei That's what we've been trying to say all along! :-P They're a kind of low-level Unicode construct used for building "real" characters, i.e., what a layperson would consider to be a "character". Code points are *the fundamental unit* of unicode. AFAIK most (all?) algorithms in the unicode spec are defined in terms of code points. Sure, some algorithms also work on the code unit level. That can be used as an optimization, but they are still defined on code points. Code points are also abstracting over the different representations (UTF-...), providing a uniform "interface".
Re: Is placing data with align(32) on the stack with 16-byte alignment an error?
On Sunday, 29 May 2016 at 12:07:02 UTC, Marco Leise wrote: void main() { import core.simd; Matrix4x4 matrix; // No warning float8 vector; // No warning } Did you do some LDC IR/asm testing? With LDC, the type `float8` has 32-byte alignment and so will be placed with that alignment on the stack. For your Matrix4x4 user type (I'll assume you meant to write `align(64)`), that alignment becomes part of the type and will be put on the stack with 64-byte alignment. (aliasing does not work: `alias Byte8 = align(8) byte; Byte8 willBeUnaligned;`) I believe LDC respects the type's alignment when selecting instructions, so when you specified align(32) byte for your type it can use the aligned load instructions. If you did not specify that alignment, or a lower alignment, it will use unaligned loads. A problem arises when you cast a (pointer of a) type with lower alignment to a type with higher alignment; in that case, currently LDC assumes that cast was valid in terms of alignment and ! -Johan
Re: Why does std.container.array does not work with foraech( i, a; array ) {} ?
On Sunday, 29 May 2016 at 09:07:07 UTC, Jonathan M Davis wrote: On Sunday, May 29, 2016 07:14:12 ParticlePeter via Digitalmars-d-learn wrote: Which of the op(Index) operators is responsible for enabling this kind of syntax? Would it be possible to get it work with UFCS or would I have to wrap the array? std.container.array.Array works with foreach via ranges. foreach(e; myContainer) { } gets lowered to foreach(e; myContainer[]) { } which in turn gets lowered to something like for(auto r = myContainer[]; !r.empty; r.popFront()) { auto e = r.front; } Ranges do not support indices with foreach, and that's why you're not able to get the index with foreach and Array. However, if you use std.range.lockstep, you can wrap a range to get indices with foreach. e.g. foreach(i, e; lockstep(myContainer[])) { } http://dlang.org/phobos/std_range.html#.lockstep - Jonathan M Davis Thanks, due to your answer I found a way which is even better for me. I pimped the Array containers with some UFCS functions anyway, one of them returns the array data as a slice and this works nicely with that foreach variant as well auto data( T )( Array!T array ) { if( array.length == 0 ) return null; return (())[ 0..array.length ]; } // this works now foreach( i, a; someArrayContainer.data ) { ... } - PP
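Another option that needs no UFCS helper is `std.range.enumerate`, which pairs each element of a range with its index. A minimal sketch, assuming a Phobos version that ships `enumerate` (2.067 or later, as far as I know):

```d
import std.container.array : Array;
import std.range : enumerate;
import std.stdio : writeln;

void main()
{
    auto arr = Array!int(10, 20, 30);

    // arr[] yields the container's range; enumerate adds the index.
    foreach (i, e; arr[].enumerate)
        writeln(i, ": ", e);
}
```

Unlike the slice-returning helper, this never exposes the container's internal storage, at the cost of going through the range interface for every element.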
Re: The Case Against Autodecode
On Sunday, 29 May 2016 at 12:41:50 UTC, Chris wrote: Ok, you have a point there, to be precise <sh> is a multigraph (a digraph)(cf. [1]). In French you can have multigraphs consisting of three or more characters /o/, as in Irish => /i:/. However, a phoneme is not necessarily a spoken "character" as <sh> represents one phoneme but consists of two "characters" or graphemes. <th> can represent two different phonemes (voiced and unvoiced "th" as in `this` vs. `thorough`). What I meant was, a phoneme is the "character" (smallest unit) in a spoken language, not that it corresponds to a character (whatever that means). My point was that we have to be _very_ careful not to mix our cultural experience with written text with machine representations. There's bound to be confusion. That's why we should always make clear what we refer to when we use the words grapheme, character, code point etc. I used 'character' in quotes, because it's not a well defined term. Code point, grapheme and phoneme are well defined.
Re: The Case Against Autodecode
On Sunday, 29 May 2016 at 12:08:52 UTC, default0 wrote: I am pretty sure that a single grapheme in unicode does not correspond to your notion of "character". I am pretty sure that what you think of as a "character" is officially called "Grapheme Cluster" not "Grapheme". Grapheme is a linguistic term. AFAIUI, a grapheme cluster is a cluster of codepoints representing a grapheme. It's called "cluster" in the unicode spec, because there is no dedicated grapheme unit. I put "character" into quotes, because the term is not really well defined. I just used it for a short and concise answer. I'm sure there's a better/more correct definition of grapheme/phoneme, but it's probably also much longer and more complicated.
Re: Copyright for Phobos to D Foundation
On Sun, 2016-05-29 at 11:24 +, Seb via Digitalmars-d wrote: > […] > > It could all be made electronically & automated, if it's > important to us. > See e.g. how the Python Software Foundation handles this: > > https://www.python.org/psf/contrib > > Can't we at least make it a requirement for future submissions? > So that we can slowly cleanup the mess instead of creating more. I note that the current Chair of the PSF is a practicing lawyer. Whatever the D Foundation does re copyright and ownership, it should not do it on the basis of what programmers say. -- Russel.
Re: Copyright for Phobos to D Foundation
On Sun, 2016-05-29 at 23:43 +1200, rikki cattermole via Digitalmars-d wrote: > On 29/05/2016 11:10 PM, Russel Winder via Digitalmars-d wrote: > > […] > > There is also that ldc and gdc Debian packages cannot both be > > installed > > and have gdc work :-( > > So you want gdc and ldc to both be default at the same time? > > Because with --compiler switch, and a little bit of shell magic its > not > all that hard to switch between them for default and one offs. Sorry I didn't add enough detail: this is not to do with Dub, or SCons, it is to do with Debian packaging, gdc, and ldc. The gdc module search path and the ldc module search path both have the same directory as the primary directory for search even though the install directories are different. Given that gdc and ldc are generally many D versions apart, gdc is trying to use modules from the future and it breaks. Using ldc is fine. So to use gdc you cannot have ldc installed from packages. -- Russel.
[Issue 16093] New: Trivial case of passing a template function to another template function doesn't compile
https://issues.dlang.org/show_bug.cgi?id=16093 Issue ID: 16093 Summary: Trivial case of passing a template function to another template function doesn't compile Product: D Version: D2 Hardware: x86_64 OS: Linux Status: NEW Severity: normal Priority: P1 Component: dmd Assignee: nob...@puremagic.com Reporter: maxsamu...@gmail.com void bar(alias f)() { f(); } void main() { void f()() { } bar!f(); } Error: function test.main.f!().f is a nested function and cannot be accessed from test.bar!(f).bar Non-template nested functions are accepted: void main() { void f() { } bar!f(); // ok } --
Re: Transient ranges
On Sunday, 29 May 2016 at 11:28:11 UTC, ZombineDev wrote: On Sunday, 29 May 2016 at 11:15:19 UTC, Dicebot wrote: On 05/28/2016 08:27 PM, Joseph Rushton Wakeling wrote: On Saturday, 28 May 2016 at 01:48:08 UTC, Jonathan M Davis wrote: On Friday, May 27, 2016 23:42:24 Seb via Digitalmars-d wrote: So what about the convention to explicitely declare a `.transient` enum member on a range, if the front element value can change? Honestly, I don't think that supporting transient ranges is worth it. I have personally wondered if there was a case for a TransientRange concept where the only primitives defined are `empty` and `front`. `popFront()` is not defined because the whole point is that every single call to `front` will produce a different value. I would prefer such ranges to not have `front` and return new item from `popFront` instead but yes, I would much prefer it to existing form, transient or not. It is impossible to correctly define input range without caching front which may not be always possible and may have negative performance impact. Because of that, a lot of Phobos ranges compromise `front` consistency in favor of speed up - which only seems to work because most algorithms need to access `front` once. I believe this is biggest issue in D ranges design right now, by large margin. +1 I think making popFront return a value for transient ranges is a sound idea. It would allow to easily distinguish between InputRange and TransientRange with very simple CT introspection. The biggest blocker is to teach the compiler to recognize TransientRange types in foreach. I don't think that should be a huge problem, but after having looked at the compiler code [1]: we should name it neither front nor popFront, because how would the compiler know that it is supposed to be transient and not a normal InputRange without front or popFront for which it should throw an error? 
Idea 1: A new name that makes it easier to see that transient ranges are something completely different from normal ranges. How about `next`? Problem 1: One can't use algorithms that work on transient ranges (map, reduce) anymore. Idea 2: Help the compiler with @Transient or `enum transient = true`. Problem 2: How would the transience be automatically forwarded to ranges that work on it? Btw thinking longer about it - transient ranges aren't bad per se. They obey the InputRange contract and e.g. the following works just fine. It's just impossible to distinguish between a transient and a non-transient InputRange. ``` // input: 1\n2\n3\n4\n... void main() { import std.stdio, std.conv, std.algorithm; stdin .byLine .map!((a) => a.to!int) .sum .writeln; } ``` [1] https://github.com/dlang/dmd/blob/master/src/statement.d#L2596
Re: The Case Against Autodecode
On Sunday, 29 May 2016 at 11:47:30 UTC, Tobias Müller wrote: On Sunday, 29 May 2016 at 11:25:11 UTC, Chris wrote: Unicode graphemes are not always the same as graphemes in natural (written) languages. If <é> is composed in Unicode, it is still one grapheme in a written language, not two distinct characters. However, in natural languages two characters can be one grapheme, as in English <sh>, it represents the sound in `shower, shop, fish`. In German the same sound is represented by three characters as in `Schaf` ("sheep"). A bit nit-picky but we should make clear that we talk about "Unicode graphemes" that map to single characters on the written page. But is that at all possible across all languages? To avoid confusion and misunderstandings we should agree on the terminology first. No, this is well established terminology, you are confusing several things here: - A grapheme is a "character" as written on the page - A phoneme is a spoken "character" - A codepoint is the fundamental "unit" of unicode Graphemes are built from one or more codepoints. Phonemes are a different topic and not really covered by the unicode standard AFAIK. Except for the IPA notation, but these are again graphemes that represent phonemes. Ok, you have a point there, to be precise <sh> is a multigraph (a digraph)(cf. [1]). In French you can have multigraphs consisting of three or more characters /o/, as in Irish => /i:/. However, a phoneme is not necessarily a spoken "character" as <sh> represents one phoneme but consists of two "characters" or graphemes. <th> can represent two different phonemes (voiced and unvoiced "th" as in `this` vs. `thorough`). My point was that we have to be _very_ careful not to mix our cultural experience with written text with machine representations. There's bound to be confusion. That's why we should always make clear what we refer to when we use the words grapheme, character, code point etc. [1] https://en.wikipedia.org/wiki/Grapheme
Re: Is placing data with align(32) on the stack with 16-byte alignment an error?
On Sunday, 29 May 2016 at 12:07:02 UTC, Marco Leise wrote: I'll try to be concise: The stack on x64 is 16-byte aligned, enough for SSE registers, but not the 32-byte AVX registers. Any data structure containing AVX registers cannot be guaranteed to be correctly aligned on the stack, but we get no warning if we try anyway: align(32) struct Matrix4x4 { float[4][4] m; } void main() { import core.simd; Matrix4x4 matrix; // No warning float8 vector; // No warning } Now some people use align(64) just as a performance hint, for example to have a 64-byte data structure fill 1 cache-line exactly (and for all the other things like C interop, file alignment, etc.). On the other hand AVX is the first instruction set that makes use of alignments above 16 so the game has changed and will continue to do so with future x86 SIMD extensions. Perspective A: We now have "authoritative" alignments that must be honored with explicit warnings/errors if not, and the status-quo: alignment hints that should be honored, but are silently ignored on the stack. The language could express this with an imagined "forcealign(32)" attribute, which disallows placing such data structures on the 16-byte aligned stack. ("forcealign" naturally overrides any smaller "align" attribute.) Perspective B: AVX vectors should generally be assumed to be unaligned. Unlike SSE, all but the "aligned load" instructions work with unaligned memory operands and the potential speed penalty. Aligned loads could be replaced with unaligned loads and the code would work again. But as compiler intrinsics continue to emit aligned loads for SIMD, this only works for AVX code written in asm - intrinsics continue to be a heisen-bug mine field. Thoughts? Some platforms don't even support unaligned loads/stores so alignment should always be honored, IMO. Otherwise SIMD types would be unusable, because you can't assume that they can be placed on the stack with correct alignment.
Re: Transient ranges
On Sunday, 29 May 2016 at 11:36:37 UTC, ZombineDev wrote: On Sunday, 29 May 2016 at 11:28:11 UTC, ZombineDev wrote: On Sunday, 29 May 2016 at 11:15:19 UTC, Dicebot wrote: On 05/28/2016 08:27 PM, Joseph Rushton Wakeling wrote: On Saturday, 28 May 2016 at 01:48:08 UTC, Jonathan M Davis wrote: On Friday, May 27, 2016 23:42:24 Seb via Digitalmars-d wrote: So what about the convention to explicitely declare a `.transient` enum member on a range, if the front element value can change? Honestly, I don't think that supporting transient ranges is worth it. I have personally wondered if there was a case for a TransientRange concept where the only primitives defined are `empty` and `front`. `popFront()` is not defined because the whole point is that every single call to `front` will produce a different value. I would prefer such ranges to not have `front` and return new item from `popFront` instead but yes, I would much prefer it to existing form, transient or not. It is impossible to correctly define input range without caching front which may not be always possible and may have negative performance impact. Because of that, a lot of Phobos ranges compromise `front` consistency in favor of speed up - which only seems to work because most algorithms need to access `front` once. I believe this is biggest issue in D ranges design right now, by large margin. +1 I think making popFront return a value for transient ranges is a sound idea. It would allow to easily distinguish between InputRange and TransientRange with very simple CT introspection. The biggest blocker is to teach the compiler to recognize TransientRange types in foreach. Scratch that: Another option is to make popFront return a new range, ala slice[1..$] (like std.range.dropOne) which would have the benefit of allowing const/immutable ranges to work. 
This won't work safely, because the compiler would need to disallow access to the previous instance of the range (sort of like Rust's moved-from objects), but that's currently not possible. I proposed that idea because I have other uses for immutable ranges, unrelated to this discussion. Isn't the idea to communicate "hey! You may not possibly cache whatever value you get in an iteration"? If yes, just removing .front seems odd as, well, I can still obviously just cache the result of .popFront if I'm inexperienced with ranges. Maybe instead simply rename .front to .buffer for transient ranges? That name would more accurately convey that this is just a buffer the range uses to dump the current front value in, but that if you want to cache it you will need to duplicate it (because, hey, this is the buffer of the range, not yours). Not sure this is a great idea or a great name, but simply removing .front and returning the current iteration value from .popFront imho does not convey "you should not expect to be able to simply cache this by assignment" to me - arguably this is not a problem since how transient ranges work would be a documented thing, but that's just my 2 cents.
Re: Is placing data with align(32) on the stack with 16-byte alignment an error?
P.S.: From the following bug report, it looks like gcc and icc honor stack alignments >= 16: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44948 That would be a good solution for dmd, too. -- Marco
Re: faster splitter
On Sunday, 29 May 2016 at 12:22:23 UTC, qznc wrote: I played around with the benchmark. Some more numbers: $ make ldc ldmd2 -O -release -inline -noboundscheck *.d -ofbenchmark.ldc ./benchmark.ldc E: wrong result with Chris find E: wrong result with Chris find E: wrong result with Chris find std find: 153 ±25 +66 (1934) -15 (7860) manual find: 122 ±28 +80 (1812) -17 (8134) qznc find: 125 ±16 +18 (4644) -15 (5126) Chris find: 148 ±29 +75 (1976) -18 (7915) Andrei find: 114 ±23 +100 (1191) -13 (8770) (avg slowdown vs fastest; absolute deviation) $ make dmd dmd -O -release -inline -noboundscheck *.d -ofbenchmark.dmd ./benchmark.dmd E: wrong result with Chris find E: wrong result with Chris find E: wrong result with Chris find std find: 160 ±27 +44 (3162) -20 (6709) manual find: 148 ±28 +54 (2701) -19 (7178) qznc find: 102 ±3 +27 ( 766) -1 (9136) Chris find: 175 ±30 +55 (2796) -21 (7106) Andrei find: 122 ±22 +46 (2554) -14 (7351) (avg slowdown vs fastest; absolute deviation) The additional numbers on the right are the ±MAD separated by above or below the mean. For example Andrei find with ldc: Andrei find: 114 ±23 +100 (1191) -13 (8770) The mean slowdown is 114, which means 14% slower than the fastest one. The mean absolute deviation (MAD) is 23. More precisely, the mean deviation above the mean slowdown of 114 is 100 and -13 below the mean slowdown. 1191 of the 10000 runs were above the mean slowdown and 8770 below. The 39 missing runs are equal to the mean slowdown. What bothers me is that changing the alphabet changes the numbers so much. Currently, if you restrict the alphabets for haystack and needle, the numbers change. The benchmark already does a random subset on each run, but there is definitely a bias.
You can avoid "E: wrong result with Chris find" by using outer: for (auto i = 0; i < haystack.length-needle.length; i++) { if (haystack[i] != needle[0]) continue; for (size_t j = i+1, k = 1; k < needle.length; ++j, ++k) if (haystack[j] != needle[k]) continue outer; return haystack[i..$]; } It's a tad faster. I'm planning to test on more varied data and see where a bias may occur.
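For anyone who wants to try the labeled-continue version, here is a self-contained sketch. It is not the code from the benchmark repository: the name `manualFind` and the `string` signature are assumptions, and two details differ from the snippet above. The loop bound is `<=` instead of `<`, since the original bound appears to skip the last possible match position, and a needle longer than the haystack is handled up front to avoid unsigned underflow.

```d
import std.stdio : writeln;

/// Returns the part of `haystack` starting at the first occurrence of
/// `needle`, or an empty slice if there is none.
/// `manualFind` is a hypothetical name for this sketch.
string manualFind(string haystack, string needle)
{
    if (needle.length == 0)
        return haystack;
    if (needle.length > haystack.length)
        return haystack[$ .. $];

    immutable last = haystack.length - needle.length;
    outer:
    for (size_t i = 0; i <= last; ++i)
    {
        if (haystack[i] != needle[0])
            continue;
        // First code unit matched; compare the rest of the needle.
        for (size_t j = i + 1, k = 1; k < needle.length; ++j, ++k)
            if (haystack[j] != needle[k])
                continue outer;
        return haystack[i .. $];
    }
    return haystack[$ .. $];
}

void main()
{
    writeln(manualFind("hello world", "wor"));
}
```

The comparison works on UTF-8 code units, which is fine for substring search because a valid needle can only match at code point boundaries of a valid haystack.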
Re: faster splitter
I played around with the benchmark. Some more numbers: $ make ldc ldmd2 -O -release -inline -noboundscheck *.d -ofbenchmark.ldc ./benchmark.ldc E: wrong result with Chris find E: wrong result with Chris find E: wrong result with Chris find std find: 153 ±25 +66 (1934) -15 (7860) manual find: 122 ±28 +80 (1812) -17 (8134) qznc find: 125 ±16 +18 (4644) -15 (5126) Chris find: 148 ±29 +75 (1976) -18 (7915) Andrei find: 114 ±23 +100 (1191) -13 (8770) (avg slowdown vs fastest; absolute deviation) $ make dmd dmd -O -release -inline -noboundscheck *.d -ofbenchmark.dmd ./benchmark.dmd E: wrong result with Chris find E: wrong result with Chris find E: wrong result with Chris find std find: 160 ±27 +44 (3162) -20 (6709) manual find: 148 ±28 +54 (2701) -19 (7178) qznc find: 102 ±3 +27 ( 766) -1 (9136) Chris find: 175 ±30 +55 (2796) -21 (7106) Andrei find: 122 ±22 +46 (2554) -14 (7351) (avg slowdown vs fastest; absolute deviation) The additional numbers on the right are the ±MAD separated by above or below the mean. For example Andrei find with ldc: Andrei find: 114 ±23 +100 (1191) -13 (8770) The mean slowdown is 114, which means 14% slower than the fastest one. The mean absolute deviation (MAD) is 23. More precisely, the mean deviation above the mean slowdown of 114 is 100 and -13 below the mean slowdown. 1191 of the 10000 runs were above the mean slowdown and 8770 below. The 39 missing runs are equal to the mean slowdown. What bothers me is that changing the alphabet changes the numbers so much. Currently, if you restrict the alphabets for haystack and needle, the numbers change. The benchmark already does a random subset on each run, but there is definitely a bias.
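The split statistic can be sanity-checked by hand, assuming 10000 runs in total (1191 + 8770 + 39): (1191 * 100 + 8770 * 13) / 10000 is about 23.3, which matches the reported MAD of 23. Below is a minimal sketch of how such a summary could be computed. It is not the benchmark's actual code; `SplitDeviation` and `splitDeviation` are hypothetical names.

```d
import std.stdio : writefln;

/// Summary of a sample: mean, mean absolute deviation, and the mean
/// deviation of the values above and below the mean (with counts).
struct SplitDeviation
{
    double mean, mad, above, below;
    size_t nAbove, nBelow;
}

/// Hypothetical helper; expects a non-empty sample.
SplitDeviation splitDeviation(const(double)[] xs)
{
    double total = 0;
    foreach (x; xs)
        total += x;
    immutable m = total / xs.length;

    double devAll = 0, devAbove = 0, devBelow = 0;
    size_t nAbove, nBelow;
    foreach (x; xs)
    {
        immutable d = x - m;
        devAll += d < 0 ? -d : d;
        if (d > 0) { devAbove += d; ++nAbove; }
        else if (d < 0) { devBelow += d; ++nBelow; }
    }
    return SplitDeviation(m, devAll / xs.length,
        nAbove ? devAbove / nAbove : 0.0,
        nBelow ? devBelow / nBelow : 0.0,
        nAbove, nBelow);
}

void main()
{
    auto s = splitDeviation([100.0, 100.0, 100.0, 140.0]);
    writefln("%s ±%s +%s (%s) %s (%s)",
        s.mean, s.mad, s.above, s.nAbove, s.below, s.nBelow);
}
```

A large positive deviation carried by few runs, as in the Andrei find line, points to a skewed distribution with occasional slow outliers rather than uniformly noisy timings.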
Is placing data with align(32) on the stack with 16-byte alignment an error?
I'll try to be concise: The stack on x64 is 16-byte aligned, enough for SSE registers, but not the 32-byte AVX registers. Any data structure containing AVX registers cannot be guaranteed to be correctly aligned on the stack, but we get no warning if we try anyway: align(32) struct Matrix4x4 { float[4][4] m; } void main() { import core.simd; Matrix4x4 matrix; // No warning float8 vector; // No warning } Now some people use align(64) just as a performance hint, for example to have a 64-byte data structure fill 1 cache-line exactly (and for all the other things like C interop, file alignment, etc.). On the other hand AVX is the first instruction set that makes use of alignments above 16 so the game has changed and will continue to do so with future x86 SIMD extensions. Perspective A: We now have "authoritative" alignments that must be honored with explicit warnings/errors if not, and the status-quo: alignment hints that should be honored, but are silently ignored on the stack. The language could express this with an imagined "forcealign(32)" attribute, which disallows placing such data structures on the 16-byte aligned stack. ("forcealign" naturally overrides any smaller "align" attribute.) Perspective B: AVX vectors should generally be assumed to be unaligned. Unlike SSE, all but the "aligned load" instructions work with unaligned memory operands and the potential speed penalty. Aligned loads could be replaced with unaligned loads and the code would work again. But as compiler intrinsics continue to emit aligned loads for SIMD, this only works for AVX code written in asm - intrinsics continue to be a heisen-bug mine field. Thoughts? -- Marco
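A small sketch for checking what a given compiler actually does with such a variable. The helper `isAligned` is made up for this example. The compile-time check on `.alignof` should hold everywhere; the runtime result is deliberately only printed, not asserted, because whether the stack slot ends up 32-byte aligned is exactly the compiler-dependent behavior under discussion. `float8` is left out so the example builds without AVX enabled.

```d
import std.stdio : writefln;

align(32) struct Matrix4x4
{
    float[4][4] m;
}

// The declared alignment is part of the type and visible at compile time.
static assert(Matrix4x4.alignof == 32);

/// True if the address `p` is a multiple of `alignment` bytes.
/// Hypothetical helper for this sketch.
bool isAligned(const(void)* p, size_t alignment)
{
    return (cast(size_t) p) % alignment == 0;
}

void main()
{
    Matrix4x4 matrix;
    writefln("stack address %s, 32-byte aligned: %s",
        &matrix, isAligned(&matrix, 32));
}
```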
Re: The Case Against Autodecode
On Sunday, 29 May 2016 at 11:47:30 UTC, Tobias Müller wrote: On Sunday, 29 May 2016 at 11:25:11 UTC, Chris wrote: Unicode graphemes are not always the same as graphemes in natural (written) languages. If <é> is composed in Unicode, it is still one grapheme in a written language, not two distinct characters. However, in natural languages two characters can be one grapheme, as in English <sh>, it represents the sound in `shower, shop, fish`. In German the same sound is represented by three characters as in `Schaf` ("sheep"). A bit nit-picky but we should make clear that we talk about "Unicode graphemes" that map to single characters on the written page. But is that at all possible across all languages? To avoid confusion and misunderstandings we should agree on the terminology first. No, this is well established terminology, you are confusing several things here: - A grapheme is a "character" as written on the page - A phoneme is a spoken "character" - A codepoint is the fundamental "unit" of unicode Graphemes are built from one or more codepoints. Phonemes are a different topic and not really covered by the unicode standard AFAIK. Except for the IPA notation, but these are again graphemes that represent phonemes. I am pretty sure that a single grapheme in unicode does not correspond to your notion of "character". I am pretty sure that what you think of as a "character" is officially called "Grapheme Cluster" not "Grapheme". See here: http://www.unicode.org/glossary/#grapheme_cluster
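The three levels being argued about can be seen side by side in D itself. The sketch below uses a decomposed "é" (the letter e followed by U+0301, combining acute accent) and counts it as code units, code points, and grapheme clusters; the function names are made up for the example.

```d
import std.range : walkLength;
import std.stdio : writeln;
import std.uni : byGrapheme;

/// Number of UTF-8 code units (bytes) in `s`.
size_t codeUnits(string s) { return s.length; }

/// Number of code points; walkLength decodes the string as it walks.
size_t codePoints(string s) { return s.walkLength; }

/// Number of grapheme clusters as segmented by std.uni.
size_t graphemeClusters(string s) { return s.byGrapheme.walkLength; }

void main()
{
    string s = "e\u0301"; // 'e' + combining acute accent

    writeln(codeUnits(s));        // 3: one byte for 'e', two for U+0301
    writeln(codePoints(s));       // 2
    writeln(graphemeClusters(s)); // 1: what a reader sees as one "character"
}
```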
Re: D, GTK, Qt, wx,…
On 29/05/16 at 13:03, Russel Winder via Digitalmars-d-learn wrote: > GTK+ has a reputation for being dreadful on OSX and even worse on > Windows. Qt on the other hand has a reputation for being the most > portable – though clearly wx is (arguably) the most portable. > > We have GtkD which is brilliant, especially as it has GStreamer > support. > > From what I can tell QtD is in need of effort or restarting. > > Is there even a wxD? > > Or perhaps there is an alternative that fits the bill of being > production ready now, and either gives the same UI across all platforms > or provides a platform UI with no change of source code, just a > recompilation. > https://github.com/nomad-software/tkd
Re: Avoid GC with closures
On Sunday, 29 May 2016 at 11:16:57 UTC, Dicebot wrote:
> On 05/28/2016 09:58 PM, Iakh wrote:
>> Yeah. It doesn't capture any context. But once it does it would be an error.
>
> Custom allocators are not very suitable for things like closures because of undefined lifetime. Even if it was allowed to replace allocator, you would be limited to either GC or RC based one anyway to keep things @safe.

Maybe an interface for a ref counting allocator (that leverages Andrei's idea to use AffixAllocator's interface for storing RC metadata) can be used, provided that the interface is in druntime and the compiler knows how to use it.

BTW, AffixAllocator's interface abstracts whether the metadata is stored next to the allocation or in a separate area, but the current design needs to be fixed w.r.t. shared-ness, because it breaks the type system: http://forum.dlang.org/post/bscksxwxuvzefymbi...@forum.dlang.org
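For readers following along, here is a small sketch of the situation Iakh describes, as far as I understand it. The function names (`applyTwice`, `addOffsetTwice`, `makeAdder`) are hypothetical and not taken from the original code in this thread. It assumes the usual rule that a delegate literal passed to a `scope` parameter does not need a heap-allocated context, while one that escapes does.

```d
/// Calls `dg` twice. `scope` promises the delegate does not outlive the call,
/// so a capturing literal passed here needs no heap-allocated context.
int applyTwice(scope int delegate(int) @nogc dg, int x) @nogc
{
    return dg(dg(x));
}

/// Captures the parameter `offset` and still compiles under @nogc.
int addOffsetTwice(int x, int offset) @nogc
{
    return applyTwice((int v) => v + offset, x);
}

// For contrast, returning the delegate lets it escape, so the compiler
// has to allocate a GC closure and rejects the function under @nogc:
//
//   int delegate(int) makeAdder(int offset) @nogc
//   {
//       return (int v) => v + offset;
//   }

unittest
{
    assert(addOffsetTwice(1, 3) == 7); // (1 + 3) + 3
}
```

This only covers the case where the lifetime is known to be bounded by the call. The escaping case is exactly where Dicebot's point about undefined lifetime applies, and where a GC or RC based scheme would be needed.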
Re: is my code to get CTFE instantiated object valid D ?
On Sunday, 29 May 2016 at 05:43:31 UTC, Mike Parker wrote:
> On Sunday, 29 May 2016 at 05:35:33 UTC, Mike Parker wrote:
>> Well then, this completely breaks my understanding of variable scope.
>
> OK, I see now at [1] the following:
>
> "Immutable data doesn't have synchronization problems, so the compiler doesn't place it in TLS."
>
> I've read that page more than once, but I had forgotten this bit. Still, I don't see anything there about const. I would not expect const variables to behave the same way, given the weaker guarantee about modification. But if they are intended to behave that way, then, IMO, it should not be possible to reinitialize them in a static constructor.
>
> [1] https://dlang.org/migrate-to-shared.html

It's reasonable to treat const variables like immutable when the const variable has no indirections. However, it shouldn't allow rewriting the variable in each thread ctor.

-Steve
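A rough sketch of the TLS point from the quoted page, in case it helps. The names `limit`, `counter` and `counterSeenByNewThread` are invented for this example. It only illustrates the documented part (module-level `immutable` data is not thread-local, ordinary module-level variables are); it deliberately says nothing about how `const` is or should be handled, since that is the open question here.

```d
import core.thread : Thread;

immutable int limit;   // a single instance, visible to every thread
int counter;           // thread-local: each thread gets its own copy

// Runs once per process, before main; the place to initialize immutable globals.
shared static this()
{
    limit = 10;
}

/// Returns the value of `counter` observed by a freshly started thread.
int counterSeenByNewThread()
{
    int seen = -1;
    auto worker = new Thread({
        seen = counter; // this thread's own copy, initialized to int.init
        counter = 99;   // does not affect the spawning thread
    });
    worker.start();
    worker.join();
    return seen;
}

unittest
{
    counter = 1;
    assert(counterSeenByNewThread() == 0); // the worker saw its own zeroed copy
    assert(counter == 1);                  // our copy is untouched
    assert(limit == 10);
}
```

The `immutable` global is set in `shared static this()` because a plain `static this()` runs once per thread, which is the re-initialization Steve argues should not be allowed.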
Re: The Case Against Autodecode
On Sunday, 29 May 2016 at 11:25:11 UTC, Chris wrote:
> Unicode graphemes are not always the same as graphemes in natural (written) languages. If <é> is composed in Unicode, it is still one grapheme in a written language, not two distinct characters. However, in natural languages two characters can be one grapheme, as in English <sh>, which represents the sound in `shower, shop, fish`. In German the same sound is represented by three characters, as in `Schaf` ("sheep"). A bit nit-picky, but we should make clear that we talk about "Unicode graphemes" that map to single characters on the written page. But is that at all possible across all languages? To avoid confusion and misunderstandings we should agree on the terminology first.

No, this is well-established terminology; you are confusing several things here:
- A grapheme is a "character" as written on the page
- A phoneme is a spoken "character"
- A codepoint is the fundamental "unit" of Unicode

Graphemes are built from one or more codepoints. Phonemes are a different topic and not really covered by the Unicode standard AFAIK. Except for the IPA notation, but these are again graphemes that represent phonemes.
Re: Split general into multiple threads
On Sunday, 29 May 2016 at 11:35:12 UTC, Seb wrote:
> On Sunday, 29 May 2016 at 11:28:05 UTC, Dicebot wrote:
>> On 05/26/2016 08:07 PM, Seb wrote:
>>> I think we all agree that general is having too much traffic and according to CyberShadow [1] this again is just an approval issue, however I expect this to be a bit controversial, so please no OT! Only other category proposals.
>>>
>>> Proposed categories:
>>> - DMD
>>> - DRuntime
>>> - Phobos
>>> - Language design (or Idea pool)
>>> - D Foundation + resources
>>> - Events
>>> - Other (formerly known as General)
>>>
>>> I want to stress that whatever categories we pick, we have to adapt them anyway if we realize that something is noisy again or too silent.
>>>
>>> [1] https://github.com/CyberShadow/DFeed/issues/66
>>
>> Without moderators to move mismatching topics between groups, any more fine-grained separation will do more harm than good. NG in its current form is simply not a good tool for focused technical discussion and won't be. "Get rid of OT" idea sounds like mockery to anyone who doesn't spend his entire lifetime posting here because 90+% of NG posts are absolutely irrelevant.
>
> To avoid confusion: the proposal is to reduce the number of forums from 14 (currently) to 7. So we are halving the number and are _not_ introducing more fine-grained separation. Instead the idea was to rearrange the forums to something like this:
>
> New users:
> - Learn / Help
>
> Community:
> - General
> - Announce (Official announcements)
> - Broadcast
>
> Development:
> - Core (Language and standard library development)
> - GDC
> - Third-party (Dub universe)

I like this list better than the current one, but with one change: taking LDC out of Core and renaming it to "LDC and LLVM", so other D projects that leverage LLVM can be hosted there (e.g. SDC, Calypso, CPP2D, etc.) and so it is on par with GDC.