Re: shared - i need it to be useful
On 10/18/18 9:09 PM, Manu wrote:
On Thu, Oct 18, 2018 at 5:30 PM Timon Gehr via Digitalmars-d wrote:
On 18.10.18 23:34, Erik van Velzen wrote:

If you have an object which can be used in both a thread-safe and a thread-unsafe way, that's a bug or code smell.

Then why do you not just make all members shared? Because with Manu's proposal, as soon as you have a shared method, all members effectively become shared.

No they don't, only the facets that overlap with the shared method. I tried to present an example before:

struct Threadsafe
{
    int x;
    Atomic!int y;

    void foo() shared { ++y; }  // <- shared interaction only affects 'y'
    void bar() { ++x; ++y; }    // <- not threadsafe, but does not violate foo's commitment; only interaction with 'y' has any commitment associated with it
    void unrelated() { ++x; }   // <- no responsibilities are transposed here, you can continue to do whatever you like throughout the class where 'y' is not concerned
}

In practice, and in my direct experience, classes tend to have exactly one 'y', and either zero (pure utility) or many such 'x' members. The threadsafe API interacts with 'y', and the rest is just normal thread-local methods which interact with all members thread-locally, and may also interact with 'y' while not violating any threadsafety commitments.

I promised I wouldn't respond; I'm going to break that (obviously). But that's because after reading this description I ACTUALLY understand what you are looking for. I'm going to write a fuller post later, but I can't right now. The critical thing here is: you want a system where you can divvy up a type into pieces you share and pieces you don't. But then you *don't* want to have to share only the shared pieces. You want to share the whole thing and be sure that it can't access your unshared pieces. This critical requirement makes things a bit more interesting.

For the record, the most difficult thing in reaching this understanding was that whenever I proposed anything, your answer was something like 'I just can't work with that', and when I asked why, you said 'because it's useless', etc. Fully explaining this point is key to understanding your thinking.

To be continued...

-Steve
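For reference, the same 'facets' pattern can be expressed in today's D by using core.atomic directly in place of the hypothetical Atomic!int; a minimal sketch (names illustrative):

import core.atomic;

struct Threadsafe
{
    int x;        // thread-local facet: no commitments attached
    shared int y; // shared facet: only ever touched via atomics

    // callable on shared instances; interacts only with 'y'
    void foo() shared { atomicOp!"+="(y, 1); }

    // thread-local method; may touch both facets without violating foo's promise
    void bar() { ++x; atomicOp!"+="(y, 1); }

    void unrelated() { ++x; } // never touches 'y' at all
}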
Re: shared - i need it to be useful
On 10/18/18 5:22 PM, Manu wrote:
On Thu, Oct 18, 2018 at 12:15 PM Steven Schveighoffer via Digitalmars-d wrote:
On 10/18/18 2:55 PM, Manu wrote:
On Thu, Oct 18, 2018 at 7:20 AM Steven Schveighoffer via Digitalmars-d wrote:
On 10/18/18 10:11 AM, Simen Kjærås wrote:
On Thursday, 18 October 2018 at 13:35:22 UTC, Steven Schveighoffer wrote:

struct ThreadSafe
{
    private int x;
    void increment()
    {
        ++x; // I know this is not shared, so no reason to use atomics
    }
    void increment() shared
    {
        atomicIncrement(&x); // use atomics, to avoid races
    }
}

But this isn't thread-safe, for the exact reasons described elsewhere in this thread (and in fact, incorrectly leveled at Manu's proposal). Someone could write this code:

void foo()
{
    ThreadSafe* a = new ThreadSafe();
    shareAllOver(a);

Error: cannot call function shareAllOver(shared(ThreadSafe)*) with type ThreadSafe*

And here you expect a user to perform an unsafe-cast (which they may not understand), and we have no language semantics to enforce the transfer of ownership. How do you assure that the user yields the thread-local instance?

No, I expect them to do: auto a = new shared(ThreadSafe)();

I don't have any use for this design in my application. I can't use the model you prescribe, at all.

Huh? This is the same thing you are asking for. How were you intending to make a thread-safe thing sharable? Surely it will be typed as shared, right? How else will you pass it to multiple threads?

I think requiring the cast is un-principled in every way that D values.

No cast is required. If you have shared data, it's shared. If you have thread-local data, it's unshared. Allocate the data the way you expect to use it.

All data is thread-local, and occasionally becomes shared during periods. I can't make use of the model you describe.

If data is shared, it is shared. Once it is shared, it never goes back. In your model, everything is *assumed* shared, so that's what you need to do: initialize it as shared. It still works just as you like, even if you never actually share it, or share it periodically.

My proposal is more permissive, and allows a wider range of application designs. What are the disadvantages?

The opposite is true. More designs are allowed by restricting casting, as I have demonstrated many times. It's only if you intend to turn unshared data into shared data where you need an unsafe cast.

It's unnecessary though, because threadsafe functions are threadsafe! You're pointlessly forcing un-safety. Why would I prefer a design that forces unsafe interactions to perform safe operations?

No unsafe interactions are required for a type that is defensively shared. Just make it always shared, and you don't have any problems. It's not even as difficult as immutable, because you can still modify shared data. For instance, the shared constructor doesn't have to have special rules about initialization; it can just assume shared from the beginning.

Your design is immutable, mine is const.

No, your design is not const; const works on normal types. It's applicable to anything. Your design is only applicable to special types that experts write. It's not applicable to int, for instance. It feels more like a special library than a compiler feature.

Tell me, how many occurrences of 'immutable' can you find in your software? ... how about const?

I generally use inout whenever possible, or const when that is more appropriate. But that is for methods. For data, I generally use immutable when I want a constant. But like I said, something can't be both shared and unshared. So having shared pointers point at unshared data makes no sense -- once it's shared, it's shared. So shared really can't be akin to const.

Which is more universally useful? If you had to choose one or the other, which one could you live without?

I would hate to have a const where you couldn't read the data; I probably would rather have immutable.

I said I would stop commenting on this thread, and I didn't keep that promise. I really am going to stop now. I'm pretty sure Walter will not agree with this mechanism, so until you convince him, I don't really need to be spending time on this. We seem to be completely understanding each other's mechanisms, but not agreeing which one is correct, based on (from both sides) hypothetical types and usages.

-Steve
Re: shared - i need it to be useful
On 10/18/18 2:42 PM, Stanislav Blinov wrote:
On Thursday, 18 October 2018 at 18:26:27 UTC, Steven Schveighoffer wrote:
On 10/18/18 1:47 PM, Stanislav Blinov wrote:
On Thursday, 18 October 2018 at 17:17:37 UTC, Atila Neves wrote:
On Monday, 15 October 2018 at 18:46:45 UTC, Manu wrote:

1. shared should behave exactly like const, except in addition to inhibiting write access, it also inhibits read access.

How is this significantly different from now?

shared int i;
++i;

Error: read-modify-write operations are not allowed for shared variables. Use core.atomic.atomicOp!"+="(i, 1) instead.

There's not much one can do to modify a shared value as it is.

i = 1;
int x = i;
shared int y = i;

This should be fine, y is not shared when being created.

'y' isn't, but 'i' is. It's fine on amd64, but that's incidental.

OH, I didn't even notice that `i` didn't have a type, so it was a continuation of the original example! I read it as declaring y as shared and assigning it to a thread-local (which it isn't actually). My bad.

-Steve
Re: shared - i need it to be useful
On 10/18/18 2:55 PM, Manu wrote:
On Thu, Oct 18, 2018 at 7:20 AM Steven Schveighoffer via Digitalmars-d wrote:
On 10/18/18 10:11 AM, Simen Kjærås wrote:
On Thursday, 18 October 2018 at 13:35:22 UTC, Steven Schveighoffer wrote:

struct ThreadSafe
{
    private int x;
    void increment()
    {
        ++x; // I know this is not shared, so no reason to use atomics
    }
    void increment() shared
    {
        atomicIncrement(&x); // use atomics, to avoid races
    }
}

But this isn't thread-safe, for the exact reasons described elsewhere in this thread (and in fact, incorrectly leveled at Manu's proposal). Someone could write this code:

void foo()
{
    ThreadSafe* a = new ThreadSafe();
    shareAllOver(a);

Error: cannot call function shareAllOver(shared(ThreadSafe)*) with type ThreadSafe*

And here you expect a user to perform an unsafe-cast (which they may not understand), and we have no language semantics to enforce the transfer of ownership. How do you assure that the user yields the thread-local instance?

No, I expect them to do: auto a = new shared(ThreadSafe)();

I think requiring the cast is un-principled in every way that D values.

No cast is required. If you have shared data, it's shared. If you have thread-local data, it's unshared. Allocate the data the way you expect to use it. It's only if you intend to turn unshared data into shared data where you need an unsafe cast. It's not even as difficult as immutable, because you can still modify shared data. For instance, the shared constructor doesn't have to have special rules about initialization; it can just assume shared from the beginning.

-Steve
Re: shared - i need it to be useful
On 10/18/18 2:59 PM, Manu wrote:
On Thu, Oct 18, 2018 at 7:20 AM Steven Schveighoffer via Digitalmars-d wrote:
On 10/18/18 10:11 AM, Simen Kjærås wrote:

    a.increment(); // unsafe, non-shared method call
}

When a.increment() is being called, you have no idea if anyone else is using the shared interface.

I do, because unless you have cast the type to shared, I'm certain there is only thread-local aliasing to it.

No, you can never be sure. Your assumption depends on the *user* engaging in an unsafe operation (the cast), and correctly performing a conventional act: they must correctly and safely transfer ownership.

Not at all. No transfer of ownership is needed, no cast is needed. If you want to share something, declare it shared.

My proposal puts all requirements on the author, not the user. I think this is a much more trustworthy relationship, and in terms of cognitive load, author:users is a 1:many relationship, and I place the load on the '1', not the 'many'.

Sure, but we can create a system today where smart people make objects that do the right thing without compiler help. We don't need to break the guarantees of shared to do it.

-Steve
Re: shared - i need it to be useful
On 10/18/18 2:24 PM, Manu wrote:

I understand your argument, and I used to think this too... but I concluded differently for 1 simple reason: usability.

You have not demonstrated why your proposal is usable, and the proposal to simply make shared not accessible while NOT introducing implicit conversion is somehow not usable. I find quite the opposite -- the implicit conversion introduces more pitfalls and fewer guarantees from the compiler.

I have demonstrated these usability considerations in production. I am confident it's the right balance.

Are these considerations the list below, or are they something else? If so, can you list them?

I propose: 1. Normal people don't write thread-safety, a very small number of unusual people do this. I feel very good about biasing 100% of the cognitive load INSIDE the shared method. This means the expert, and ONLY the expert, must make decisions about thread-safety implementation.

Thread safety is not easy. But it's also not generic. In terms of low-level things like atomics and lock-free implementations, those ARE generic and SHOULD only be written by experts. But other than that, you can't know how someone has designed all the conditions in their code. For example, you can have an expert write mutex locks and semaphores. But they can't tell you the proper order to lock different objects to ensure there's no deadlock. That's application specific.

2. Implicit conversion allows users to safely interact with safe things without doing unsafe casts. I think it's a complete design fail if you expect any user anywhere to perform an unsafe cast to call a perfectly thread-safe function. The user might not properly understand their obligations.

I also do not expect anyone to perform unsafe casts in normal use. I expect them to use more generic well-written types in a shared-object library. Casting should be very rare.

3. The practical result of the above is, any complexity relating to safety is completely owned by the threadsafe author, and not cascaded to the user. You can't expect users to understand, and make correct decisions about, threadsafety. Safety should be the default position.

I think these are great rules, and none are broken by keeping the explicit cast requirement in place.

I recognise the potential loss of an unsafe optimised thread-local path. 1. This truly isn't a big deal. If this is really hurting you, you will notice on the profiler, and deploy a thread-exclusive path assuming the context supports it.

This is a mischaracterization. The thread-local path is perfectly safe, because only one thread can be accessing the data. That's why it's thread-local and not shared.

2. I will trade that for confidence in safe interaction every day of the week. Safety is the right default position here.

You can be confident that any shared data is properly synchronized via the API provided. No confidence should be lost here.

2. You just need to make the unsafe thread-exclusive variant explicit, eg:

It is explicit: the thread-exclusive variant is not marked shared, and cannot be called on data that is actually shared and needs synchronization.

struct ThreadSafe
{
    private int x;
    void unsafeIncrement() // <- make it explicit
    {
        ++x; // User has asserted that no sharing is possible, no reason to use atomics
    }
    void increment() shared
    {
        atomicIncrement(&x); // object may be shared
    }
}

This is more design by convention.

I think this is quite a reasonable and clearly documented compromise. I think absolutely-reliably-threadsafe-by-default is the right default position. And if you want to accept unsafe operations for optimisation circumstances, then you're welcome to deploy that in your code as you see fit.

All thread-local operations are thread-safe by default, because there can be only one thread using it. That is the beauty of the current regime, regardless of how broken shared is -- unshared is solid. We shouldn't want to break that guarantee.

If the machinery is not a library for distribution and local to your application, and you know for certain that your context is such that thread-local and shared are mutually exclusive, then you're free to make the unshared overload not-threadsafe; you can do this because you know your application context. You just shouldn't make widely distributed tooling this way.

I can make widely distributed tooling that does both shared and unshared versions of the code, and ALL are thread safe. No choices are necessary, no compromise on performance, and no design by convention.

I will indeed do this myself in some cases, because I know those facts about my application. But I wouldn't compromise the default design of shared for this optimisation potential... deliberately deployed optimisation is okay to be unsafe when taken in context.

Except it's perfectly thread safe to use data without synchronization in one thread.
Re: shared - i need it to be useful
On 10/18/18 1:47 PM, Stanislav Blinov wrote:
On Thursday, 18 October 2018 at 17:17:37 UTC, Atila Neves wrote:
On Monday, 15 October 2018 at 18:46:45 UTC, Manu wrote:

1. shared should behave exactly like const, except in addition to inhibiting write access, it also inhibits read access.

How is this significantly different from now?

shared int i;
++i;

Error: read-modify-write operations are not allowed for shared variables. Use core.atomic.atomicOp!"+="(i, 1) instead.

There's not much one can do to modify a shared value as it is.

i = 1;
int x = i;
shared int y = i;

This should be fine, y is not shared when being created. However, this still is allowed, and shouldn't be:

y = 5;

-Steve
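For context, the sanctioned way to read or write a shared variable today is through the core.atomic primitives; a minimal sketch of what the rule would force in place of the plain assignments above:

import core.atomic;

shared int y;

void example()
{
    atomicStore(y, 5);     // instead of the currently-allowed 'y = 5;'
    int x = atomicLoad(y); // synchronised read instead of 'int x = y;'
}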
Re: shared - i need it to be useful
On 10/18/18 1:17 PM, Atila Neves wrote:
On Monday, 15 October 2018 at 18:46:45 UTC, Manu wrote:

1. shared should behave exactly like const, except in addition to inhibiting write access, it also inhibits read access.

How is this significantly different from now?

shared int i;
++i;

i = i + 1; // OK(!)

-Steve
Re: shared - i need it to be useful
On 10/18/18 10:11 AM, Simen Kjærås wrote:
On Thursday, 18 October 2018 at 13:35:22 UTC, Steven Schveighoffer wrote:

struct ThreadSafe
{
    private int x;
    void increment()
    {
        ++x; // I know this is not shared, so no reason to use atomics
    }
    void increment() shared
    {
        atomicIncrement(&x); // use atomics, to avoid races
    }
}

But this isn't thread-safe, for the exact reasons described elsewhere in this thread (and in fact, incorrectly leveled at Manu's proposal). Someone could write this code:

void foo()
{
    ThreadSafe* a = new ThreadSafe();
    shareAllOver(a);

Error: cannot call function shareAllOver(shared(ThreadSafe)*) with type ThreadSafe*

    a.increment(); // unsafe, non-shared method call
}

When a.increment() is being called, you have no idea if anyone else is using the shared interface.

I do, because unless you have cast the type to shared, I'm certain there is only thread-local aliasing to it.

This is one of the issues that MP (Manu's Proposal) tries to deal with. Under MP, your code would *not* be considered thread-safe, because the non-shared portion may interfere with the shared portion. You'd need to write two types:

struct ThreadSafe
{
    private int x;
    void increment() shared { atomicIncrement(&x); }
}

struct NotThreadSafe
{
    private int x;
    void increment() { ++x; }
}

These two are different types with different semantics, and forcing them both into the same struct is an abomination.

Why? What if I wanted to have an object that is local for a while, but then I want it to be shared (and I ensure carefully, when I cast to shared, that there are no other aliases to it)?

In your case, the user of your type will need to ensure thread-safety.

No, the contract the type provides is: if you DON'T cast unshared to shared or vice versa, the type is thread-safe. If you DO cast unshared to shared, then the type is thread-safe as long as you no longer use the unshared reference. This is EXACTLY how immutable works.

You may not have any control over how he's doing things, while you *do* control the code in your own type (and module, since that also affects things). Under MP, the type is what needs to be thread-safe, and once it is, the chance of a user mucking things up is much lower.

Under MP, the type is DEFENSIVELY thread-safe, locking or using atomics unnecessarily when it's thread-local.

-Steve
Re: shared - i need it to be useful
On 10/18/18 9:35 AM, Steven Schveighoffer wrote:

struct NotThreadsafe
{
    private int x;
    void local()
    {
        ++x; // <- invalidates the method below, you violate the other function's `shared` promise
    }
    void notThreadsafe() shared
    {
        atomicIncrement(&x);
    }
}

[snip]

But on top of that, if I can't implicitly cast mutable to shared, then this ACTUALLY IS thread safe, as long as all the casting in the module is sound (easy to search and verify), and hopefully all the casting is encapsulated in primitives like you have written. Because someone on the outside would have to cast a mutable item into a shared item, and this puts the responsibility on them to make sure it works.

Another thing to point out -- I can make x public (not private), and it's STILL THREAD SAFE.

-Steve
Re: shared - i need it to be useful
On 10/18/18 2:20 AM, Manu wrote:
On Wed, Oct 17, 2018 at 5:05 AM Timon Gehr via Digitalmars-d wrote:
[... all text ...]

OMFG, I just spent about 3 hours writing a super-detailed reply to all of Timon's posts in aggregate... I clicked send... and it's gone. I don't know if this is a gmail thing, a mailing list thing... no idea... but it's... gone. I can't repeat that effort :(

If it's gmail, it should be in the sent folder, no? I've never had a gmail message that got sent fail to go into the sent box.

-Steve
Re: shared - i need it to be useful
On 10/17/18 10:26 PM, Manu wrote:
On Wed, Oct 17, 2018 at 6:50 PM Steven Schveighoffer via Digitalmars-d

The implicit cast means that you have to look at more than just your method. You have to look at the entire module, and figure out all the interactions, to see if the thread-safe method actually is thread safe. That's programming by convention, and fully trusting the programmer.

I don't understand... how can the outer context affect the threadsafety of a properly encapsulated thing?

[snip]

You need to take it for an intellectual spin. Show me how it's corrupt rather than just presenting discomfort with the idea in theory. You're addicted to some concepts that you've carried around for a long time. There is no value in requiring casts; they're just a funky smell, and force the user to perform potentially unsafe manual conversions, or interactions that they don't understand.

For example (your example):

struct NotThreadsafe
{
    private int x;
    void local()
    {
        ++x; // <- invalidates the method below, you violate the other function's `shared` promise
    }
    void notThreadsafe() shared
    {
        atomicIncrement(&x);
    }
}

First, note the comment. I can't look ONLY at the implementation of "notThreadSafe" (assuming the function name is less of a giveaway) in order to guarantee that it's actually thread safe. I have to look at the WHOLE MODULE. Anything could potentially do what local() does. I added private to x to at least give the appearance of thread safety.

But on top of that, if I can't implicitly cast mutable to shared, then this ACTUALLY IS thread safe, as long as all the casting in the module is sound (easy to search and verify), and hopefully all the casting is encapsulated in primitives like you have written. Because someone on the outside would have to cast a mutable item into a shared item, and this puts the responsibility on them to make sure it works.

I'm ALL FOR having shared be completely unusable as-is unless you cast (thanks for confirming what I suspected in your last post). It's the implicit casting which I think makes things way more difficult, and completely undercuts the utility of the compiler's mechanical checking.

And on top of that, I WANT that implementation. If I know something is not shared, why would I ever want to use atomics on it? I don't like needlessly throwing away performance. This is how I would write it:

struct ThreadSafe
{
    private int x;
    void increment()
    {
        ++x; // I know this is not shared, so no reason to use atomics
    }
    void increment() shared
    {
        atomicIncrement(&x); // use atomics, to avoid races
    }
}

The beauty of shared not being implicitly castable is it allows you to focus on the implementation at hand, with the knowledge that nothing else can meddle with it. The goal of mechanical checking should be to narrow the focus of what needs to be proven correct.

-Steve
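A sketch of how the two overloads get selected under this scheme, assuming no implicit conversion between local and shared; the allocation style decides up front which path you get (atomicIncrement is swapped for the real core.atomic.atomicOp, and names are illustrative):

import core.atomic;

struct ThreadSafe
{
    private int x;
    void increment()        { ++x; }                 // thread-local: no atomics needed
    void increment() shared { atomicOp!"+="(x, 1); } // shared: atomics required
}

void usage()
{
    auto local = new ThreadSafe;       // never shared: fast path
    local.increment();

    auto shd = new shared(ThreadSafe); // shared from birth: atomic path
    shd.increment();
    // 'shd' can be handed to other threads; no cast was ever needed
}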
Re: shared - i need it to be useful
On 10/17/18 6:37 PM, Manu wrote:
On Wed, Oct 17, 2018 at 12:35 PM Steven Schveighoffer via Digitalmars-d wrote:
On 10/17/18 2:46 PM, Manu wrote:
On Wed, Oct 17, 2018 at 10:30 AM Steven Schveighoffer via

What the example demonstrates is that while you are trying to disallow implicit casting of a shared pointer to an unshared pointer, you have inadvertently allowed it by leaving behind an unshared pointer that is the same thing.

This doesn't make sense... you're showing a thread-local program. The thread owning the unshared pointer is entitled to the unshared pointer. It can make as many copies as it likes. They are all thread-local.

It's assumed that a shared int pointer can be passed to another thread, right? Do I have to write a full program to demonstrate?

And that shared(int)* provides no access. No other thread with that pointer can do anything with it.

So then it's a misnomer -- it's not really shared, because I can't do anything with it.

There's only one owning thread, and you can't violate that without unsafe casts.

Then what is the point of shared? Like, why would you share data that NOBODY CAN USE?

You can call shared methods. They promise threadsafety. That's a small subset of the program, but that's natural; only a very small subset of the program is safe to be called from a shared context.

All I can see is that a shared method promises to be callable on shared or unshared data. In essence, it promises nothing. It's the programmer who must implement the thread safety, and there really is no help at all from the compiler for this. At some level, there will be either casts or intrinsics, both of which are unsafe without knowing all the context of the object. In any case, it's simply a false guarantee of thread safety, which might as well be a convention of "any function which starts with TS_ is supposed to be thread safe".

shared in the current form promises one thing and one thing only -- data marked as shared is actually sharable between threads, and data not marked as shared is actually not shared between threads. This new regime you are proposing does nothing extra or new, except break that guarantee.

At SOME POINT, shared data needs to be readable and writable. Any correct system is going to dictate how that works. It's a good start to make shared data unusable unless you cast. But then to make it implicitly castable from unshared defeats the whole purpose.

No. No casting! This is antiquated workflow... I'm not trying to take it away from you, but it's not an interesting model for the future. `shared` can model more than just that. You can call threadsafe methods. Shared methods explicitly dictate how the system works, and in a very clear and obvious/intuitive way. The implicit cast makes using threadsafe objects more convenient when you only have one, which is extremely common.

The implicit cast means that you have to look at more than just your method. You have to look at the entire module, and figure out all the interactions, to see if the thread-safe method actually is thread safe. That's programming by convention, and fully trusting the programmer.

I don't think this thread is going anywhere, so I'll just have to wait and see if someone else can explain it better. I'm a firm no on implicit casting from mutable to shared.

-Steve
Re: shared - i need it to be useful
On 10/17/18 2:46 PM, Manu wrote:
On Wed, Oct 17, 2018 at 10:30 AM Steven Schveighoffer via

What the example demonstrates is that while you are trying to disallow implicit casting of a shared pointer to an unshared pointer, you have inadvertently allowed it by leaving behind an unshared pointer that is the same thing.

This doesn't make sense... you're showing a thread-local program. The thread owning the unshared pointer is entitled to the unshared pointer. It can make as many copies as it likes. They are all thread-local.

It's assumed that a shared int pointer can be passed to another thread, right? Do I have to write a full program to demonstrate?

There's only one owning thread, and you can't violate that without unsafe casts.

Then what is the point of shared? Like, why would you share data that NOBODY CAN USE? At SOME POINT, shared data needs to be readable and writable. Any correct system is going to dictate how that works. It's a good start to make shared data unusable unless you cast. But then to make it implicitly castable from unshared defeats the whole purpose.

In order for a datum to be safely shared, it must be accessed with synchronization or atomics by ALL parties.

** Absolutely **

If you have one party that can simply change it without those, you will get races.

*** THIS IS NOT WHAT I'M PROPOSING *** I've explained it a few times now, but people aren't reading what I actually write, and just assume based on what shared already does that they know what I'm suggesting. You need to eject all presumptions from your mind, take the rules I offer as verbatim, and do thought experiments from there.

What seems to be a mystery here is how one is to actually manipulate shared data. If it's not usable as shared data, how does one use it?

That's why shared/unshared is more akin to mutable/immutable than mutable/const.

Only if you misrepresent my suggestion.

It's not misrepresentation, I'm trying to fill in the holes with the only logical possibilities I can think of.

It's true that only one thread will have thread-local access. It's not valid any more than having one mutable alias to immutable data.

And this is why the immutable analogy is invalid. It's like const. shared offers restricted access (like const), not a different class of thing.

No, not at all. Somehow one must manipulate shared data. If shared data cannot be read or written, there is no reason to share it. So LOGICALLY, we have to assume, yes, there actually IS a way to manipulate shared data through these very carefully constructed and guarded things.

There is one thread with thread-local access, and many threads with shared access. If a shared (threadsafe) method can be defeated by thread-local access, then it's **not threadsafe**, and the program is invalid.

struct NotThreadsafe
{
    int x;
    void local()
    {
        ++x; // <- invalidates the method below, you violate the other function's `shared` promise
    }
    void notThreadsafe() shared
    {
        atomicIncrement(&x);
    }
}

So the above program is invalid. Is it compilable with your added allowance of implicit casting to shared? If it's not compilable, why not? If it is compilable, how in the hell does your proposal help anything? I get the exact behavior today without any changes (except today, I need to explicitly cast, which puts the onus on me).

struct Atomic(T)
{
    void opUnary(string op : "++")() shared { atomicIncrement(&val); }
    private T val;
}

struct Threadsafe
{
    Atomic!int x;
    void local() { ++x; }
    void threadsafe() shared { ++x; }
}

Naturally, local() is redundant, and it's perfectly fine for a thread-local to call threadsafe() via implicit conversion.

In this case, yes. But that's not because of anything the compiler can prove. How does Atomic work? I thought shared data was not usable? I'm being pedantic, because every time I say "well at some point you must be able to modify things", you explode. Complete the sentence: "In order to read or write shared data, you have to ..."

Here's another one, where only a subset of the object is modeled to be threadsafe (this is particularly interesting to me):

struct Threadsafe
{
    int x;
    Atomic!int y;
    void notThreadsafe() { ++x; ++y; }
    void threadsafe() shared { ++y; }
}

In these examples, the thread-local function *does not* undermine the threadsafety of threadsafe(); it MUST NOT undermine the threadsafety of threadsafe(), or else threadsafe() **IS NOT THREADSAFE**. In the second example, you can see how it's possible and useful to do thread-local work without invalidating the object's threadsafety commitments.

I've said this a bunch of times; there are 2 rules: 1. shared inhibits read and write access to members. 2. `shared` methods must be threadsafe.

From there, shared becomes interesting and useful.

Given rule 1, how does Atomic!int actually work, if it can't read or write its members?
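For what it's worth, in today's D the sentence completes as "...you have to go through the core.atomic primitives (or cast behind a lock)". A sketch of an Atomic!int-style wrapper built on those primitives, which take `ref shared T` and are the trusted escape hatch that rule 1 would leave open (illustrative, not Manu's exact design):

import core.atomic;

struct Atomic(T)
{
    private T val;

    // core.atomic's intrinsics accept shared lvalues directly,
    // so no cast is needed here
    void opUnary(string op : "++")() shared { atomicOp!"+="(val, 1); }

    T load() shared { return atomicLoad(val); }
}

void demo()
{
    shared Atomic!int counter;
    ++counter;                    // calls the shared opUnary
    assert(counter.load() == 1);  // synchronised read
}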
Re: shared - i need it to be useful
On 10/17/18 12:27 PM, Nicholas Wilson wrote:
On Wednesday, 17 October 2018 at 15:51:04 UTC, Steven Schveighoffer wrote:
On 10/17/18 9:58 AM, Nicholas Wilson wrote:
On Wednesday, 17 October 2018 at 13:25:28 UTC, Steven Schveighoffer wrote:

It's identical to the top one. You now have a new unshared reference to shared data. This is done WITHOUT any agreed-upon synchronization.

It isn't, you typo'd it (I originally missed it too). int *p3 = cast(int*)p2; vs int *p3 = p;

It wasn't a typo.

The first example assigns p2, the second assigns p (which is thread-local), _not_ p2 (which is shared).

I'm confused. Here they are again:

int *p;
shared int *p2 = p;
int *p3 = cast(int*)p2;

int *p;
shared int *p2 = p;
int *p3 = p;

I'll put some asserts in that show they accomplish the same thing:

assert(p3 is p2);
assert(p3 is p);
assert(p2 is p);

What the example demonstrates is that while you are trying to disallow implicit casting of a shared pointer to an unshared pointer, you have inadvertently allowed it by leaving behind an unshared pointer that is the same thing.

While we do implicitly allow mutable to cast to const, it's because const is a weak guarantee. It's a guarantee that the data may not change via *this* reference, but could change via other references.

Shared doesn't have the same characteristics. In order for a datum to be safely shared, it must be accessed with synchronization or atomics by ALL parties. If you have one party that can simply change it without those, you will get races. That's why shared/unshared is more akin to mutable/immutable than mutable/const. It's true that only one thread will have thread-local access. It's not valid any more than having one mutable alias to immutable data.

-Steve
Re: shared - i need it to be useful
On 10/17/18 9:58 AM, Nicholas Wilson wrote:
On Wednesday, 17 October 2018 at 13:25:28 UTC, Steven Schveighoffer wrote:

It's identical to the top one. You now have a new unshared reference to shared data. This is done WITHOUT any agreed-upon synchronization.

It isn't, you typo'd it (I originally missed it too). int *p3 = cast(int*)p2; vs int *p3 = p;

It wasn't a typo. It's identical in that both result in a thread-local pointer equivalent to p. Effectively, you can "cast" away shared without having to write a cast. I was trying to demonstrate the ineffectiveness of preventing implicit casting from shared to mutable if you allow unshared data to implicitly cast to shared.

It's the same problem with mutable and immutable. It's why we can't allow the implicit casting. Explicit casting is OK as long as you don't later modify the data. In the same vein, explicit casting of local to shared is OK as long as you don't ever treat the data as local again. Which should require a cast to say "I know what I'm doing, compiler".

-Steve
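A sketch of that convention using std.concurrency, which only permits shared or immutable indirections to cross threads; the explicit cast marks the hand-off, and nulling the local alias afterwards is exactly the by-convention discipline being described (names illustrative):

import std.concurrency;

void worker()
{
    shared(int)* p = receiveOnly!(shared(int)*)();
    // ... from here, access *p only through proper synchronisation ...
}

void example()
{
    int* local = new int;
    auto tid = spawn(&worker);

    send(tid, cast(shared(int)*)local); // explicit: "I know what I'm doing, compiler"
    local = null;                       // convention: never treat it as local again
}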
Re: shared - i need it to be useful
On 10/17/18 10:33 AM, Nicholas Wilson wrote:
On Wednesday, 17 October 2018 at 14:26:43 UTC, Timon Gehr wrote:
On 17.10.2018 16:14, Nicholas Wilson wrote:

I was thinking that mutable -> shared const, as opposed to mutable -> shared, would get around the issues that Timon posted.

Unfortunately not. For example, the thread with the mutable reference is not obliged to actually make the changes that are performed on that reference visible to other threads.

Yes, but that is covered by not being able to read non-atomically from a shared reference.

All sides must participate in synchronization for it to make sense. The mutable side has no obligation to use atomics. It can use ++data, and race conditions will happen.

-Steve
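A sketch of exactly that failure mode, where one side uses atomics and the mutable side uses plain ++; this is the bug being described, not a pattern to copy (the cast stands in for the proposed implicit conversion):

import core.atomic;
import core.thread;

int data; // module-level, so thread-local by type

void raceDemo()
{
    auto p = cast(shared int*)&data; // leak this thread's copy as shared

    auto t = new Thread({
        foreach (i; 0 .. 1_000_000)
            atomicOp!"+="(*p, 1);    // this side synchronises...
    });
    t.start();

    foreach (i; 0 .. 1_000_000)
        ++data;                      // ...this side doesn't: a data race

    t.join();
    // the final value of 'data' is unpredictable, typically short of 2_000_000
}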
Re: shared - i need it to be useful
On 10/17/18 10:18 AM, Timon Gehr wrote:
On 17.10.2018 15:40, Steven Schveighoffer wrote:
On 10/17/18 8:02 AM, Timon Gehr wrote:

Now, if a class has only shared members, that is another story. In this case, all references should implicitly convert to shared. There's a DIP I meant to write about this. (For all qualifiers, not just shared.)

When you say "shared members", you mean all the data is shared too or just the methods are shared? If not the data, D has a problem with encapsulation. Not only must all the methods on the class be shared, but ALL code in the entire module must be marked as using a shared class instance. Otherwise, other functions could modify the private data without using the proper synch mechanisms. We are better off requiring the cast, or enforcing that one must use a shared object to begin with. I think any sometimes-shared object is in any case going to benefit from parallel implementations for when the thing is unshared. -Steve

The specific proposal was that, for example, if a class is defined like this:

shared class C
{
    // ...
}

then shared(C) and C are implicitly convertible to each other. The change is not fully backwards-compatible, because right now, this annotation just makes all members (data and methods) shared, but child classes may introduce unshared members.

OK, so the proposal is that all data and function members are shared. That makes sense. In one sense, because the class reference is conflated with the type modifier, it would be useful to have a C that isn't shared but whose class data is shared.

-Steve
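For reference, a minimal sketch of the annotation as it exists today: marking the class itself shared makes every data member and method shared, and Timon's proposed rule would additionally make C and shared(C) interconvertible.

import core.atomic;

shared class C
{
    int x; // typed shared(int): all data members are shared

    void bump() // implicitly a shared method
    {
        atomicOp!"+="(x, 1);
    }
}

void use()
{
    auto c = new shared(C);
    c.bump();
}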
Re: shared - i need it to be useful
On 10/17/18 8:02 AM, Timon Gehr wrote:

Now, if a class has only shared members, that is another story. In this case, all references should implicitly convert to shared. There's a DIP I meant to write about this. (For all qualifiers, not just shared.)

When you say "shared members", you mean all the data is shared too or just the methods are shared? If not the data, D has a problem with encapsulation. Not only must all the methods on the class be shared, but ALL code in the entire module must be marked as using a shared class instance. Otherwise, other functions could modify the private data without using the proper synch mechanisms.

We are better off requiring the cast, or enforcing that one must use a shared object to begin with. I think any sometimes-shared object is in any case going to benefit from parallel implementations for when the thing is unshared.

-Steve
Re: shared - i need it to be useful
On 10/16/18 8:26 PM, Manu wrote:
On Tue, Oct 16, 2018 at 2:20 PM Steven Schveighoffer via Digitalmars-d wrote:
On 10/16/18 4:26 PM, Manu wrote:
On Tue, Oct 16, 2018 at 11:30 AM Steven Schveighoffer via Digitalmars-d wrote:

int x;
shared int *p = &x; // allow implicit conversion, currently error
passToOtherThread(p);
useHeavily(&x);

What does this mean? It can't do anything... that's the whole point here. I think I'm struggling here with people bringing presumptions to the thread. You need to assume the rules I define in the OP for the experiment to work.

OK, I wrote a whole big response to this, and I went and re-quoted the above, and now I think I understand what the point of your statement is. I'll first say that if you don't want to allow implicit casting of shared to mutable,

It's critical that this is not allowed. It's totally unreasonable to cast from shared to thread-local without synchronisation.

OK, so even with synchronization in the second thread when you cast, you still have a thread-local pointer in the originating thread WITHOUT synchronization. It's as bad as casting away const.

Of course! But shared has a different problem from const. Const allows the data to change through another reference; shared cannot allow changes without synchronization. Changes without synchronization are *easy* with an unshared reference. Data can't be shared and unshared at the same time.

then you can't allow implicit casting from mutable to shared. Because it's mutable, races can happen.

I don't follow...

You seem to be saying that shared data is unusable. But why the hell have it then? At some point it has to be usable. And the agreed-upon use is totally defeated if you also have some stray non-shared reference to it.

There is, in fact, no difference between:

int *p;
shared int *p2 = p;
int *p3 = cast(int*)p2;

Totally illegal!! You casted away shared. That's as bad as casting away const.

But if you can't do anything with shared data, how do you use it?

and this:

int *p;
shared int *p2 = p;
int *p3 = p;

There's nothing wrong with this... I don't understand the point?

It's identical to the top one. You now have a new unshared reference to shared data. This is done WITHOUT any agreed-upon synchronization. So really, the effort to prevent the reverse cast is defeated by allowing the implicit cast.

Only the caller has the thread-local instance. You can take a thread-local pointer to a thread-local within the context of a single thread. So, it's perfectly valid for `p` and `p3` to exist in a single scope. `p2` is fine here too... and if that shared pointer were to escape to another thread, it wouldn't be a threat, because it's not readable or writable, and you can't make it back into a thread-local pointer without carefully/deliberately deployed machinery.

Huh? If shared data can never be used, why have it? Pretend that p is not a pointer to an int, but a pointer to an UNSHARED type that has shared methods on it and unshared methods (for when you don't need any sync). Now the shared methods will obey the sync, but the unshared ones won't. The result is races. I can't understand how you don't see that.

There is a reason we disallow assigning from mutable to immutable without a cast. Yet, it is done in many cases, because you are sometimes building an immutable object with mutable pieces, and want to cast the final result.

I don't think analogy to immutable has a place in this discussion, or at least, I don't understand the relevance... I think the reasonable analogy is const.

No, immutable is more akin to shared, because immutable and mutable are completely different. const can point at mutable or immutable data. shared can't be both shared and unshared. There's no comparison. Data is either shared or not shared; there is no middle ground. There is no equivalent of const to say "this data could be shared, or could be unshared".

In this case, it's ON YOU to make sure it's correct, and the traditional mechanism for the compiler giving you the responsibility is to require a cast.

I think what you're talking about are behaviours relating to casting shared *away*, and that's some next-level shit. Handling in that case is no different to the way it exists today. You must guarantee that the pointer you possess becomes thread-local before casting it to a thread-local pointer. In my application framework, I will never cast shared away under my proposed design. We don't have any such global locks.

OK, so how does shared data actually operate? Somewhere, the magic has to turn into real code. If not casting away shared, what do you suggest?

OK, so here is where I think I misunderstood your point. When you said a lock-free queue would be unusable if it wasn't shared, I thought you meant it would be unusable if we didn't allow the implicit cast.
Re: shared - i need it to be useful
On 10/16/18 6:24 PM, Nicholas Wilson wrote:
On Tuesday, 16 October 2018 at 21:19:26 UTC, Steven Schveighoffer wrote:

OK, so here is where I think I misunderstood your point. When you said a lock-free queue would be unusable if it wasn't shared, I thought you meant it would be unusable if we didn't allow the implicit cast. But I realize now, you meant you should be able to use a lock-free queue without it being actually shared anywhere. What I say to this is that it doesn't need to be usable. I don't care to use a lock-free queue in a thread-local capacity. I'll just use a normal queue, which is easy to implement, and doesn't have to worry about race conditions or using atomics. A lock-free queue is a special thing, very difficult to get right, and only really necessary if you are going to share it. And used for performance reasons!

I think this comes up where the queue was originally shared, you acquired a lock on the thing it is a member of, and you want to continue using it through your exclusive reference.

Isn't that a locking queue? I thought we were talking lock-free?

-Steve
Re: shared - i need it to be useful
On 10/16/18 4:26 PM, Manu wrote:
On Tue, Oct 16, 2018 at 11:30 AM Steven Schveighoffer via Digitalmars-d wrote:
On 10/16/18 2:10 PM, Manu wrote:
On Tue, Oct 16, 2018 at 6:35 AM Steven Schveighoffer via Digitalmars-d wrote:
On 10/16/18 9:25 AM, Steven Schveighoffer wrote:
On 10/15/18 2:46 PM, Manu wrote:

From there, it opens up another critical opportunity; T* -> shared(T)* promotion. Const would be useless without T* -> const(T)* promotion. Shared suffers a similar problem. If you write a lock-free queue for instance, and all the methods are `shared` (ie, threadsafe), then under the current rules, you can't interact with the object when it's not shared, and that's fairly useless.

Oh, I didn't see this part. Completely agree with Timon on this, no implicit conversions should be allowed.

Why?

int x;
shared int *p = &x; // allow implicit conversion, currently error
passToOtherThread(p);
useHeavily(&x);

What does this mean? It can't do anything... that's the whole point here. I think I'm struggling here with people bringing presumptions to the thread. You need to assume the rules I define in the OP for the experiment to work.

OK, I wrote a whole big response to this, and I went and re-quoted the above, and now I think I understand what the point of your statement is.

I'll first say that if you don't want to allow implicit casting of shared to mutable, then you can't allow implicit casting from mutable to shared. Because it's mutable, races can happen. There is, in fact, no difference between:

int *p;
shared int *p2 = p;
int *p3 = cast(int*)p2;

and this:

int *p;
shared int *p2 = p;
int *p3 = p;

So really, the effort to prevent the reverse cast is defeated by allowing the implicit cast.

There is a reason we disallow assigning from mutable to immutable without a cast. Yet, it is done in many cases, because you are sometimes building an immutable object with mutable pieces, and want to cast the final result. In this case, it's ON YOU to make sure it's correct, and the traditional mechanism for the compiler giving you the responsibility is to require a cast.

OK, so here is where I think I misunderstood your point. When you said a lock-free queue would be unusable if it wasn't shared, I thought you meant it would be unusable if we didn't allow the implicit cast. But I realize now, you meant you should be able to use a lock-free queue without it being actually shared anywhere.

What I say to this is that it doesn't need to be usable. I don't care to use a lock-free queue in a thread-local capacity. I'll just use a normal queue, which is easy to implement, and doesn't have to worry about race conditions or using atomics. A lock-free queue is a special thing, very difficult to get right, and only really necessary if you are going to share it. And used for performance reasons! Why would I want to incur performance penalties when using a lock-free queue in an unshared mode? I would actually expect 2 separate implementations of the primitives, one for shared, one for unshared.

What about primitives that would be implemented the same? In that case, the shared method becomes:

auto method() { return (cast(Queue*)&this).method; }

Is this "unusable"? Without a way to say you can call this on shared or unshared instances, we need to do it this way. But I would trust the queue to handle this properly depending on whether it was typed shared or not.

-Steve
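Spelled out, that forwarding pattern looks something like this; 'Queue' here is an illustrative stand-in, not a real implementation:

struct Queue
{
    private int[] items;

    // thread-local implementation: no synchronisation overhead
    void push(int v) { items ~= v; }

    // shared overload with identical logic, forwarded to the thread-local one.
    // Only sound if the shared instance is not actually being accessed
    // concurrently here; a genuinely shared queue would lock or use a
    // lock-free algorithm instead.
    void push(int v) shared { (cast(Queue*)&this).push(v); }
}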
Re: shared - i need it to be useful
On 10/16/18 2:10 PM, Manu wrote:
On Tue, Oct 16, 2018 at 6:35 AM Steven Schveighoffer via Digitalmars-d wrote:
On 10/16/18 9:25 AM, Steven Schveighoffer wrote:
On 10/15/18 2:46 PM, Manu wrote:

From there, it opens up another critical opportunity; T* -> shared(T)* promotion. Const would be useless without T* -> const(T)* promotion. Shared suffers a similar problem. If you write a lock-free queue for instance, and all the methods are `shared` (ie, threadsafe), then under the current rules, you can't interact with the object when it's not shared, and that's fairly useless.

Oh, I didn't see this part. Completely agree with Timon on this, no implicit conversions should be allowed.

Why?

int x;
shared int *p = &x; // allow implicit conversion, currently error
passToOtherThread(p);
useHeavily(&x);

How is this safe? Thread1 is using x without locking, while the other thread has to lock. In order for synchronization to work, both sides have to agree on a synchronization technique and abide by it.

If you want to have a lock-free implementation of something, you can abstract the assignments and reads behind the proper mechanisms anyway, and still avoid locking (casting is not locking).

Sorry, I don't understand what you're saying. Can you clarify?

I'd still mark a lock-free implementation shared, and all its methods shared. shared does not mean you have to lock, just cast away shared. A lock-free container still has to do some special things to make sure it avoids races, and having an "unusable" state aids in enforcing this.

-Steve
Re: shared - i need it to be useful
On 10/16/18 9:25 AM, Steven Schveighoffer wrote:
On 10/15/18 2:46 PM, Manu wrote:

From there, it opens up another critical opportunity; T* -> shared(T)* promotion. Const would be useless without T* -> const(T)* promotion. Shared suffers a similar problem. If you write a lock-free queue for instance, and all the methods are `shared` (ie, threadsafe), then under the current rules, you can't interact with the object when it's not shared, and that's fairly useless.

Oh, I didn't see this part. Completely agree with Timon on this, no implicit conversions should be allowed. If you want to have a lock-free implementation of something, you can abstract the assignments and reads behind the proper mechanisms anyway, and still avoid locking (casting is not locking).

-Steve
Re: shared - i need it to be useful
On 10/15/18 2:46 PM, Manu wrote:

Okay, so I've been thinking on this for a while... I think I have a pretty good feel for how shared is meant to be.

1. shared should behave exactly like const, except in addition to inhibiting write access, it also inhibits read access. I think this is the foundation for a useful definition for shared, and it's REALLY easy to understand and explain. The current situation where you can arbitrarily access shared members undermines any value it has. Shared must assure you don't access members unsafely, and the only way to do that with respect to data members is to inhibit access completely. I think shared is just const without read access.

Assuming this world... how do you use shared?

1. traditional; assert that the object has become thread-local by acquiring a lock, cast shared away
2. object may have shared methods; such methods CAN be called on shared instances. such methods may internally implement synchronisation to perform their function. perhaps methods of a lock-free queue structure for instance, or operator overloads on `Atomic!int`, etc.

In practice, there is no functional change in usage from the current implementation, except we disallow unsafe accesses (which will make the thing useful).

From there, it opens up another critical opportunity; T* -> shared(T)* promotion. Const would be useless without T* -> const(T)* promotion. Shared suffers a similar problem. If you write a lock-free queue for instance, and all the methods are `shared` (ie, threadsafe), then under the current rules, you can't interact with the object when it's not shared, and that's fairly useless.

Assuming the rules above ("can't read or write to members"), and the understanding that `shared` methods are expected to have threadsafe implementations (because that's the whole point), what are the risks from allowing T* -> shared(T)* conversion? All the risks that I think have been identified previously assume that you can arbitrarily modify the data. That's insanity... assume we fix that... I think the promotion actually becomes safe now...? Destroy...

This is a step in the right direction. But there is still one problem -- shared is inherently transitive. So casting away shared is super-dangerous, even if you lock the shared data, because any of the subreferences will become unshared and read/writable. For instance:

struct S
{
    int x;
    int* y;
}

shared int z;
auto s1 = shared(S)(1, &z);
auto s2 = shared(S)(2, &z);

S* s1locked = s1.lock;

Now I have access to z via s1locked as an unshared int, and I never locked z. Potentially one could do the same thing via s2, and now there are 2 mutable references, potentially in 2 threads.

All of this, of course, is manual. So technically we could manually implement it properly inside S. But this means shared doesn't help us much. We really need, on top of shared, a way to specify something is tail-shared. That is, all the data in S is unshared, but anything it points to is still shared. That at least helps the person implementing the manual locking from doing stupid things himself.

-Steve
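Steve's example, expanded into a compilable sketch; lockAndCastAway is a hypothetical stand-in for the 'lock' operation he describes, and the construction casts are part of the setup, not a recommendation:

struct S
{
    int x;
    int* y; // transitively shared whenever S is shared
}

shared int z;

// hypothetical: assume some lock protecting 's' is held when this is called
S* lockAndCastAway(ref shared S s)
{
    return cast(S*)&s; // strips shared *transitively*
}

void demo()
{
    shared S s1 = cast(shared) S(1, cast(int*)&z);

    S* s1locked = lockAndCastAway(s1);
    int* leaked = s1locked.y; // unshared view of z -- but z was never locked,
                              // and another instance (or thread) can still reach it
}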
Re: fork vs. posix_spawn (vfork)
On 10/14/18 7:36 AM, notna wrote:

Hi D gurus. I read an interesting post from GitLab [1] about how they improved performance by 30x just by going to Go 1.9... because Go again went from "fork" to "posix_spawn"... I've searched the GitHub DLANG org for "posix_spawn" and didn't find a hit... so asking myself and you: is DLANG still on "fork", and could there be some performance improvement potential?

[1] https://about.gitlab.com/2018/01/23/how-a-fix-in-go-19-sped-up-our-gitaly-service-by-30x/

Related: https://issues.dlang.org/show_bug.cgi?id=14770

-Steve
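For context, D's high-level process API is std.process, which (as far as I can tell) is implemented with fork + exec on Posix rather than posix_spawn; minimal usage:

import std.process;
import std.stdio;

void main()
{
    // spawnProcess forks and execs on Posix under the hood
    auto pid = spawnProcess(["ls", "-l"]);
    auto status = wait(pid); // block until the child exits
    writeln("exited with ", status);
}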
Re: D Logic bug
On 10/12/18 6:06 AM, Kagamin wrote:
On Thursday, 11 October 2018 at 23:17:15 UTC, Jonathan Marler wrote:

I had a look at the table again; looks like the ternary operator is on there, just called the "conditional operator". And to clarify, D's operator precedence is close to C/C++ but doesn't match exactly. This is likely a result of the grammar differences rather than an intentional one. For example, the "Conditional operator" in D actually has a higher priority than an assignment, but in C++ it's the same and is evaluated right-to-left. So this expression would be different in C++ and D:

a ? b : c = d

In D it would be:

(a ? b : c) = d

And in C++ would be:

a ? b : (c = d)

That's https://issues.dlang.org/show_bug.cgi?id=14186

Wow, interesting that C precedence is different from C++ here.

-Steve
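A small self-contained demonstration of the difference, runnable in D (the C/C++ reading is in the comments):

void main()
{
    int a = 1, b = 10, c = 20, d = 99;

    a ? b : c = d; // D parses this as (a ? b : c) = d, assigning to 'b' here;
                   // C/C++ parse it as a ? b : (c = d), so 'c' would only be
                   // assigned when 'a' is false

    assert(b == 99); // holds in D
    assert(c == 20); // 'c' untouched in D
}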
Re: D Logic bug
On 10/11/18 9:16 PM, Jonathan Marler wrote:
On Thursday, 11 October 2018 at 23:29:05 UTC, Steven Schveighoffer wrote:
On 10/11/18 7:17 PM, Jonathan Marler wrote:

I had a look at the table again; looks like the ternary operator is on there, just called the "conditional operator". And to clarify, D's operator precedence is close to C/C++ but doesn't match exactly. This is likely a result of the grammar differences rather than an intentional one. For example, the "Conditional operator" in D actually has a higher priority than an assignment, but in C++ it's the same and is evaluated right-to-left. So this expression would be different in C++ and D:

Not in my C/D code. It would have copious parentheses everywhere :)

Good :) Yep. General rule of thumb for me, after having been burned many many times -- always use parentheses to define order of operations when dealing with bitwise operations (and, or, xor) and the ternary operator. I think I do make an exception when it's a simple assignment. i.e.:

a = cond ? 1 : 2;

That case is actually very strange, I don't know if it's something that's really common.

Yes, that explains why myself, Jonathan Davis, and certainly others didn't know there were actually differences between C++ and D operator precedence :) I wasn't sure myself, but having a quick look at each's operator precedence table made it easy to find an expression that behaves differently in both.

I actually was curious whether DMC followed the rules (hey, maybe Walter just copied his existing code!), but it does follow C's rules.

-Steve
Re: D Logic bug
On 10/11/18 7:17 PM, Jonathan Marler wrote:

I had a look at the table again; looks like the ternary operator is on there, just called the "conditional operator". And to clarify, D's operator precedence is close to C/C++ but doesn't match exactly. This is likely a result of the grammar differences rather than an intentional one. For example, the "Conditional operator" in D actually has a higher priority than an assignment, but in C++ it's the same and is evaluated right-to-left. So this expression would be different in C++ and D:

Not in my C/D code. It would have copious parentheses everywhere :)

That case is actually very strange, I don't know if it's something that's really common.

-Steve
Re: LDC2 1.9.0 beta 1 bug
On 10/5/18 5:41 AM, Kagamin wrote:
On Thursday, 4 October 2018 at 12:51:27 UTC, Shachar Shemesh wrote:

More to the point, however, expanding the call to the second form means that I can *never* supply non-default values to arg1 and arg2.

You wrote it yourself: f!()(true, 'S')

This is a terrible workaround. It looks OK with no vararg parameters, but lousy if you have any. i.e.:

f(arg1, arg2, arg3, true, 'S')

becomes:

f!(typeof(arg1), typeof(arg2), typeof(arg3))(arg1, arg2, arg3, true, 'S');

-Steve
Re: LDC2 1.9.0 beta 1 bug
On 10/4/18 8:51 AM, Shachar Shemesh wrote:

I got this as a report from a user, not directly running this, which is why I'm not opening a bug report. Consider the following function:

void f(ARGS...)(ARGS args, bool arg1 = true, char arg2 = 'H');

Now consider the following call to it:

f(true, 'S');

Theoretically, this can either be calling f!()(true, 'S') or f!(bool, char)(true, 'S', true, 'H'). Under 1.8.0, it would do the former. Under 1.9.0-beta1, the latter. Why is this a bug? Two reasons. First, this is a change of behavior. More to the point, however, expanding the call to the second form means that I can *never* supply non-default values to arg1 and arg2.

You are correct that it's a change in behavior. Johan brought this up earlier when the release happened [1], and I agree with both you and him that the behavior change requires at least a deprecation cycle. But it doesn't seem to be getting traction with the people who have made the decision in the first place (and Walter simply said to post a bug report, which has happened).

I will point out a couple things:

1. Yes, you can supply non-default values to arg1 and arg2, you just can't use IFTI (implicit function template instantiation). I can't begin to describe how useless this is.

2. The problem with the original behavior is that you couldn't *actually use* the default parameters. In other words, this doesn't compile:

f();

So technically, it was simply an error to provide default parameters in a template variadic (the explicit instantiation workaround was allowed, but again, useless).

My argument in the bug report is that the whole reason it was added (to allow file and line numbers to be runtime parameters in exception constructors) is more correctly fixed by fixing another issue, https://issues.dlang.org/show_bug.cgi?id=18919, and the expected way that default parameters behave should be implemented instead. Both the old way and the new way have large inconsistency problems.

-Steve

[1] https://forum.dlang.org/post/myuhmpfygyufxpucv...@forum.dlang.org
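The two behaviors side by side, as a compilable sketch of what is described above:

void f(ARGS...)(ARGS args, bool arg1 = true, char arg2 = 'H') { }

void main()
{
    // old behavior (1.8.0): deduced as f!()(true, 'S'),
    //   i.e. arg1 = true, arg2 = 'S'
    // new behavior (1.9.0): deduced as f!(bool, char)(true, 'S', true, 'H'),
    //   so the defaults can never be overridden via IFTI
    f(true, 'S');

    // explicit instantiation pins the old meaning, but defeats IFTI:
    f!()(true, 'S');

    // and, as noted, under the old rule this was an error, since ARGS
    // could not be deduced at all:
    // f();
}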
Re: `shared`...
On 10/1/18 7:09 PM, Manu wrote: On Mon, Oct 1, 2018 at 8:55 AM Timon Gehr via Digitalmars-d wrote: On 01.10.2018 04:29, Manu wrote:

struct Bob
{
    void setThing() shared;
}

As I understand, `shared` attribution intends to guarantee that I do synchronisation internally. This method is declared shared, so if I have shared instances, I can call it... because it must handle thread-safety internally.

void f(ref shared Bob a, ref Bob b)
{
    a.setThing(); // I have a shared object, can call shared method
    b.setThing(); // ERROR
}

This is the bit of the design that doesn't make sense to me... The method is shared, which suggests that it must handle thread-safety. My instance `b` is NOT shared, that is, it is thread-local. So, I know that there's not a bunch of threads banging on this object... but the shared method should still work! A method that handles thread-safety doesn't suddenly not work when it's only accessed from a single thread. ... shared on a method does not mean "this function handles thread-safety". It means "the `this` pointer of this function is not guaranteed to be thread-local". You can't implicitly create an alias of a reference that is supposed to be thread-local such that the resulting reference can be freely shared among threads. I don't understand. That's the point of `scope`... is that it won't escape the reference. 'freely shared' is the antithesis of `scope`. I feel like I don't understand the design... mutable -> shared should work the same as mutable -> const... because surely that's safe? No. The main point of shared (and the main thing you need to understand) is that it guarantees that if something is _not_ `shared`, it is not shared among threads. Your analogy is not correct, going from thread-local to shared is like going from mutable to immutable. We're talking about `mutable` -> `shared scope`. That's like going from mutable to const. `shared scope` doesn't say "I can share this", what it says is "this may be shared, but *I won't share it*", and that's the key. By passing a thread-local as `shared scope`, the receiver accepts that the argument _may_ be shared (it's not in this case), but it will not become shared in the call. That's the point of scope, no? If the suggested typing rule was implemented, we would have the following way to break the type system, allowing arbitrary aliasing between mutable and shared references, completely defeating `shared`:

class C { /*...*/ }
shared(C) sharedGlobal;
struct Bob
{
    C unshared;
    void setThing() shared { sharedGlobal = unshared; }
}
void main()
{
    C c = new C(); // unshared!
    Bob(c).setThing();
    shared(C) d = sharedGlobal; // shared!
    assert(c !is d); // would fail (currently does not even compile)
    // sendToOtherThread(d);
    // c.someMethod(); // (potential) race condition on unshared data
}

Your entire example depends on escaping references. I think you missed the point? The problem with mutable wildcards is that you can assign them. This exposes the problem in your design. The reason const works is because you can't mutate it. Shared is not the same. simple example:

void foo(scope shared int *a, scope shared int *b) { a = b; }

If I can bind a to a local mutable int pointer, and b as a pointer to a global shared int, the assignment is now considered OK (types and scopes are the same), but now my local points at a shared int without the shared adornments. The common wildcard you need between shared and mutable is *unique*.
That is, even though it's typed as shared or unshared, the compiler has guaranteed there is no other reference to that data. In that case, you can move data from one place to another without compromising the system (as you assign from one unique pointer to another, the original must be nullified, otherwise the wildcard still would not work, and the unique property would cease to be accurate). IMO, the correct way to deal with shared would be to make it 100% unusable. Not readable, or writable. And then you have to cast away shared to make it work (and hopefully performing the correct locking to make sure your changes are defined). I don't think there's a magic bullet that can fix this. -Steve
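A minimal sketch of that "cast away shared under a lock" discipline (hypothetical names; the shared data is never touched directly, only through a cast while the lock is held):

import core.sync.mutex : Mutex;

shared int counter;
__gshared Mutex counterLock;

shared static this() { counterLock = new Mutex; }

void bump()
{
    counterLock.lock();
    scope(exit) counterLock.unlock();
    auto p = cast(int*) &counter; // cast away shared only while locked
    ++*p;
}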
Re: `shared`...
On 10/1/18 7:56 PM, Steven Schveighoffer wrote: On 10/1/18 7:09 PM, Manu wrote: Your entire example depends on escaping references. I think you missed the point? The problem with mutable wildcards is that you can assign them. This exposes the problem in your design. The reason const works is because you can't mutate it. Shared is not the same. simple example: void foo(scope shared int *a, scope shared int *b) { a = b; } Haha, of course, this has no effect! In order for it to show the problem, a has to be ref'd. -Steve
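Presumably the corrected example would look something like this (a sketch of what was meant; with ref, the rebinding is visible through the caller's pointer, which is what lets a shared pointer escape into a mutable one under the proposed rule):

void foo(ref shared(int)* a, scope shared(int)* b)
{
    a = b; // the caller's pointer is now rebound
}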
Re: Updating D beyond Unicode 2.0
On 9/26/18 4:43 PM, Walter Bright wrote: But expanding it seems of vanishingly little value. Note that each thing that gets added to D adds weight to it, and it needs to pull its weight. Nothing is free. It may be the weight is already there in the form of unicode symbol support, just the range of the characters supported isn't good enough for some languages. It might be like replacing your refrigerator -- you get an upgrade, but it's not going to take up any more space because you get rid of the old one. I would like to see the PR before passing judgment on the heft of the change. The value is simply in the consistency -- when some of the words for your language can be valid symbols but others can't, then it becomes a weird guessing game as to what is supported. It would be like saying all identifiers can have any letters except `q`. Sure, you can get around that, but it's weirdly exclusive. I claim complete ignorance as to what is required, it hasn't been technically laid out what is at stake, and I'm not bilingual anyway. It could be true that I'm completely misunderstanding the positions of others. -Steve
Re: BetterC and CTFE mismatch
On 9/26/18 5:08 AM, Sebastiaan Koppe wrote: On Wednesday, 26 September 2018 at 08:22:26 UTC, Simen Kjærås wrote: This is essentially an arbitrary restriction. The basic reason is if a function is compiled (even just for CTFE), it ends up in the object files, and you've asked for only betterC functions to end up in the object files. -- Simen So anything I do at CTFE has to be betterC as well? That is a bummer. This is an artificial, and not really intended, limitation. Essentially, a function used for CTFE has to be a real function. If it's defined, it's expected to be callable from runtime as well as CTFE. But I can't see why, if you don't call it from runtime, it should matter. I think this has to do with the places betterC is enforced in the compiler. I'll try to work around this, but I would like to see this fixed. Is there anything I can do to move this forward? I'd suggest a bug report if one hasn't been made. -Steve
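A minimal sketch of the limitation as described (assuming the compiler still behaves this way; hypothetical function name):

// dmd -betterC ctfe_demo.d
string greet(string name)
{
    return "hello, " ~ name; // ~ would need the GC at run time
}

enum msg = greet("world"); // greet is only ever called during CTFE...
// ...but as a regular function it is still compiled into the object file,
// so -betterC rejects the GC concatenation anyway.

extern(C) void main() {}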
Re: Updating D beyond Unicode 2.0
On 9/26/18 5:54 AM, rjframe wrote: On Fri, 21 Sep 2018 16:27:46 +, Neia Neutuladh wrote: I've got this coded up and can submit a PR, but I thought I'd get feedback here first. Does anyone see any horrible potential problems here? Or is there an interestingly better option? Does this need a DIP? I just want to point out since this thread is still living that there have been very few answers to the actual question ("should I submit my PR?"). Walter did answer the question, with the reasons that Unicode identifier support is not useful/helpful and could cause issues with tooling. Which is likely correct; and if we really want to follow this logic, Unicode identifier support should be removed from D entirely. This is a non-starter. We can't break people's code, especially for trivial reasons like 'you shouldn't code that way because others don't like it'. I'm pretty sure Walter would be against removing Unicode support for identifiers. I don't recall seeing anyone in favor providing technical reasons, save the OP. There doesn't necessarily need to be a technical reason. In fact, there really isn't one -- people can get by with using ASCII identifiers just fine (and many/most people do). Supporting Unicode would be purely for social or inclusive reasons (it may make D more approachable to non-English speaking schoolchildren for instance). As an only-English speaking person, it doesn't bother me either way to have Unicode identifiers. But the fact that we *already* support Unicode identifiers leads me to expect that we support *all* Unicode identifiers. It doesn't make a whole lot of sense to only support some of them. Especially since the work is done, it makes sense to me to ask for the PR for review. Worst case scenario, it sits there until we need it. I suggested this as well. https://forum.dlang.org/post/poaq1q$its$1...@digitalmars.com I think it stands a good chance of getting incorporated, just for the simple fact that it's enabling and not disruptive. -Steve
Re: Updating D beyond Unicode 2.0
On 9/26/18 2:50 AM, Shachar Shemesh wrote: On 25/09/18 15:35, Dukc wrote: Another reason is that something may not have a good translation to English. If there is an enum type listing city names, it is IMO better to write them as normal, using Unicode. CityName.seinäjoki, not CityName.seinaejoki. This sounded like a very compelling example, until I gave it a second thought. I now fail to see how this example translates to a real-life scenario. City names (data, changes over time) as enums (compile time set) seem like a horrible idea. That may sound like a very technical objection to an otherwise valid point, but I really think that's not the case. The properties that cause city names to be poor candidates for enum values are the same as those that make them Unicode candidates. Hm... I could actually see some "clever" use of opDispatch being used to define cities or other such names. In any case, I think the biggest pro for supporting Unicode symbol names is -- we already support Unicode symbol names. It doesn't make a whole lot of sense to only support some of them. -Steve
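Something like this, for instance (a hypothetical sketch, using static opDispatch so any identifier -- Unicode included -- becomes a "city" at compile time):

struct CityName
{
    // undefined member lookups become opDispatch!"name" instantiations
    static string opDispatch(string name)()
    {
        return name;
    }
}

void main()
{
    assert(CityName.seinäjoki == "seinäjoki");
}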
Re: Forums intermittently going down?
On 9/25/18 5:05 PM, H. S. Teoh wrote: On Tue, Sep 25, 2018 at 08:41:51PM +, Vladimir Panteleev via Digitalmars-d wrote: On Tuesday, 25 September 2018 at 18:26:58 UTC, CharlesM wrote: Yeah it happened again today. I heard this site was made in D, maybe is because the GC? No, just old server hardware and database fragmentation. Wow, that's GC-phobia like I've never seen before! Well, I thought it might be GC related also. It behaves similarly to how you would expect a GC pause to behave (several fast responses, then one that takes 5 seconds to come back). But lately, I've noticed I just get the "down for maintenance" message more than a delayed response. In any case, I generally don't use the forum except read-only mode on my phone. For posting, I'm generally using NNTP. I'll note that when I started running into DB slowdowns on a system (not related to D), adding one index fixed the issue. Sometimes linear searches are fast enough to hide in plain sight :) -Steve
Re: Updating D beyond Unicode 2.0
On 9/24/18 3:18 PM, Patrick Schluter wrote: On Monday, 24 September 2018 at 13:26:14 UTC, Steven Schveighoffer wrote: 2. There are no rules about what *encoding* is acceptable, it's implementation defined. So various compilers have different rules as to what will be accepted in the actual source code. In fact, I read somewhere that not even ASCII is guaranteed to be supported. Indeed. IBM mainframes have C compilers too, but not ASCII. They code in EBCDIC. That's why, for instance, it's not portable to do things like:

if (c >= 'A' && c <= 'Z') printf("CAPITAL LETTER\n");

The range check doesn't hold in EBCDIC, where the uppercase letters are not contiguous. Right. But it's just a side-note -- I'd guess all modern compilers support ASCII, and definitely ones that we would want to interoperate with. Besides, that example is more concerned about *input data* encoding, not *source code* encoding. If the above is written in ASCII, then I would assume that the bytes in the source file are the ASCII bytes, and probably the IBM compilers would not know what to do with such files (it would all be gibberish if you opened it in an EBCDIC editor). You'd first have to translate it to EBCDIC, which is a red flag that likely this isn't going to work :) -Steve
Re: Updating D beyond Unicode 2.0
On 9/24/18 2:20 PM, Martin Tschierschke wrote: On Monday, 24 September 2018 at 14:34:21 UTC, Steven Schveighoffer wrote: On 9/24/18 10:14 AM, Adam D. Ruppe wrote: On Monday, 24 September 2018 at 13:26:14 UTC, Steven Schveighoffer wrote: Part of the reason, which I haven't read here yet, is that all the keywords are in English. Eh, those are kinda opaque sequences anyway, since the meanings aren't quite what the normal dictionary definition is anyway. Look up "int" in the dictionary... or "void", or even "string". They are just a handful of magic sequences we learn with the programming language. (And in languages like Rust, "fn", lol.) Well, even on top of that, the standard library is full of English words that read very coherently when used together (if you understand English). I can't imagine a long chain of English algorithms with some Chinese one pasted in the middle looks very good :) I suppose you could alias them all... You might get really funny error messages: 🙂 can't be casted to int. Haha, it could be cynical as well: int can’t be casted to int🤔 Oh, the games we could play. -Steve
Re: Updating D beyond Unicode 2.0
On 9/24/18 10:14 AM, Adam D. Ruppe wrote: On Monday, 24 September 2018 at 13:26:14 UTC, Steven Schveighoffer wrote: Part of the reason, which I haven't read here yet, is that all the keywords are in English. Eh, those are kinda opaque sequences anyway, since the meanings aren't quite what the normal dictionary definition is anyway. Look up "int" in the dictionary... or "void", or even "string". They are just a handful of magic sequences we learn with the programming language. (And in languages like Rust, "fn", lol.) Well, even on top of that, the standard library is full of English words that read very coherently when used together (if you understand English). I can't imagine a long chain of English algorithms with some Chinese one pasted in the middle looks very good :) I suppose you could alias them all... -Steve
Re: Updating D beyond Unicode 2.0
On 9/22/18 12:56 PM, Neia Neutuladh wrote: On Saturday, 22 September 2018 at 12:35:27 UTC, Steven Schveighoffer wrote: But aren't we arguing about the wrong thing here? D already accepts non-ASCII identifiers. Walter was doing that thing that people in the US who only speak English tend to do: forgetting that other people speak other languages, and that people who speak English can learn other languages to work with people who don't speak English. I don't think he was doing that. I think what he was saying was, D tried to accommodate users who don't normally speak English, and they still use English (for the most part) for coding. I'm actually surprised there isn't much code out there that is written with identifiers other than ASCII, given that C99 supported them. I assumed it was because they weren't supported. Now I learn that they are supported, yet almost all C code I've ever seen is written in English. Perhaps that's just because I don't frequent foreign language sites though :) But many people here speak English as a second language, and vouch for their cultures still using English to write code. He was saying it's inevitably a mistake to use non-ASCII characters in identifiers and that nobody does use them in practice. I would expect people probably do try to use them in practice, it's just that the problems they run into aren't worth the effort (tool/environment support). But I have no first or even second hand experience with this. It does seem like Walter has a lot of experience with it though. Walter talking like that sounds like he'd like to remove support for non-ASCII identifiers from the language. I've gotten by without maintaining a set of personal patches on top of DMD so far, and I'd like it if I didn't have to start. I don't think he was saying that. I think he was against expanding support for further Unicode identifiers because the first effort did not produce any measurable benefit. I'd be shocked, given the recent positions of Walter and Andrei, if they decided to remove non-ASCII identifiers that are currently supported, thereby breaking any existing code. What languages need an upgrade to Unicode symbol names? In other words, what symbols aren't possible with the current support? Chinese and Japanese have gained about eleven thousand symbols since Unicode 2. Unicode 2 covers 25 writing systems, while Unicode 11 covers 146. Just updating to Unicode 3 would give us Cherokee, Ge'ez (multiple languages), Khmer (Cambodian), Mongolian, Burmese, Sinhala (Sri Lanka), Thaana (Maldivian), Canadian aboriginal syllabics, and Yi (Nuosu). Very interesting! I would agree that we should at least add support for Unicode symbols that are used in spoken languages, especially since we already have support for symbols that aren't ASCII. I don't see the downside, especially if you can already use Unicode 2.0 symbols for identifiers (the ship has already sailed). It could be a good incentive to get kids in countries where English isn't commonly spoken to try D out as a first programming language ;) Using your native language to show example code could be a huge benefit for teaching coding. My recommendation is to put the PR up for review (that you said you had ready) and see what happens. Having an actual patch to talk about could change minds. At the very least, it's worth not wasting your efforts that you have already spent. Even if it does need a DIP, the PR can show that one less piece of effort is needed to get it implemented. -Steve
Re: Updating D beyond Unicode 2.0
On 9/24/18 12:23 AM, Neia Neutuladh wrote: On Monday, 24 September 2018 at 01:39:43 UTC, Walter Bright wrote: On 9/23/2018 3:23 PM, Neia Neutuladh wrote: Okay, that's why you previously selected C99 as the standard for what characters to allow. Do you want to update to match C11? It's been out for the better part of a decade, after all. I wasn't aware it changed in C11. http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf page 522 (PDF numbering) or 504 (internal numbering). Outside the BMP, almost everything is allowed, including many things that are not currently mapped to any Unicode value. Within the BMP, a heck of a lot of stuff is allowed, including a lot that D doesn't currently allow. GCC hasn't even updated to the C99 standard here, as far as I can tell, but clang-5.0 is up to date. I searched around for the current state of symbol names in C, and found some really crappy rules, though maybe this site isn't up to date?: https://en.cppreference.com/w/c/language/identifier What I understand from that is:

1. Yes, you can use any Unicode character you want in C/C++ (seemingly since C99).
2. There are no rules about what *encoding* is acceptable; it's implementation defined. So various compilers have different rules as to what will be accepted in the actual source code. In fact, I read somewhere that not even ASCII is guaranteed to be supported.

The result being that you have to write the identifiers with an ASCII escape sequence in order for it to be actually portable. Which, to me, completely defeats the purpose of using such identifiers in the first place. For example, on that page, they have a line that works in clang, not in GCC (tagged as implementation defined):

char *🐱 = "cat";

The portable version looks like this:

char *\U0001f431 = "cat";

Seriously, who wants to use that? Now, D can potentially do better (especially when all front-ends are the same) and support such things in the spec, but I think the argument "because C supports it" is kind of bunk. Or am I reading it wrong? In any case, I would expect that symbol name support should be focused only on languages which people use, not emojis. If there are words in Chinese or Japanese that can't be expressed using D, while other words can, it would seem inconsistent to a Chinese or Japanese speaking user, and I think we should work to fix that. I just have no idea what the state of that is. I also tend to agree that most code is going to be written in English, even when the primary language of the user is not. Part of the reason, which I haven't read here yet, is that all the keywords are in English. Someone has to kind of understand those to get the meaning of some constructs, and it's going to read strangely with the non-English words. One group which I believe hasn't spoken up yet is the group making the hunt framework, who I believe are all Chinese? At least their web site is. It would be good to hear from a group like that which has a lot of experience writing mature D code (it appears all to be in English) and how they feel about the support. -Steve
Re: Updating D beyond Unicode 2.0
On 9/22/18 8:58 AM, Jonathan M Davis wrote: On Saturday, September 22, 2018 6:37:09 AM MDT Steven Schveighoffer via Digitalmars-d wrote: On 9/22/18 4:52 AM, Jonathan M Davis wrote: I was laughing out loud when reading about composing "family" emojis with zero-width joiners. If you told me that was a tech parody, I'd have believed it. Honestly, I was horrified to find out that emojis were even in Unicode. It makes no sense whatsoever. Emojis are supposed to be sequences of characters that can be interpreted as images. Treating them like Unicode symbols is like treating entire words like Unicode symbols. It's just plain stupid and a clear sign that Unicode has gone completely off the rails (if it was ever on them). Unfortunately, it's the best tool that we have for the job. But aren't some (many?) Chinese/Japanese characters representing whole words? It's true that they're not characters in the sense that Roman characters are characters, but they're still part of the alphabets for those languages. Emojis are specifically formed from sequences of characters - e.g. :) is two characters which are already expressible on their own. They're meant to represent a smiley face, but it's a sequence of characters already. There's no need whatsoever to represent anything extra in Unicode. It's already enough of a disaster that there are multiple ways to represent the same character in Unicode without nonsense like emojis. It's stuff like this that really makes me wish that we could come up with a new standard that would replace Unicode, but that's likely a pipe dream at this point. But there are tons of emojis that have nothing to do with sequences of characters. Like houses, or planes, or whatever. I don't even know what the sequences of characters are for them. I think it started out like that, but turned into something else. Either way, I can't imagine any benefit from using emojis in symbol names. -Steve
Re: Updating D beyond Unicode 2.0
On 9/22/18 4:52 AM, Jonathan M Davis wrote: I was laughing out loud when reading about composing "family" emojis with zero-width joiners. If you told me that was a tech parody, I'd have believed it. Honestly, I was horrified to find out that emojis were even in Unicode. It makes no sense whatsoever. Emojis are supposed to be sequences of characters that can be interpreted as images. Treating them like Unicode symbols is like treating entire words like Unicode symbols. It's just plain stupid and a clear sign that Unicode has gone completely off the rails (if it was ever on them). Unfortunately, it's the best tool that we have for the job. But aren't some (many?) Chinese/Japanese characters representing whole words? -Steve
Re: Updating D beyond Unicode 2.0
On 9/21/18 9:08 PM, Neia Neutuladh wrote: On Friday, 21 September 2018 at 20:25:54 UTC, Walter Bright wrote: But identifiers? I haven't seen hardly any use of non-ascii identifiers in C, C++, or D. In fact, I've seen zero use of it outside of test cases. I don't see much point in expanding the support of it. If people use such identifiers, the result would most likely be annoyance rather than illumination when people who don't know that language have to work on the code. you *do* know that not every codebase has people working on it who only know English, right? If I took a software development job in China, I'd need to learn Chinese. I'd expect the codebase to be in Chinese. Because a Chinese company generally operates in Chinese, and they're likely to have a lot of employees who only speak Chinese. And no, you can't just transcribe Chinese into ASCII. Same for Spanish, Norwegian, German, Polish, Russian -- heck, it's almost easier to list out the languages you *don't* need non-ASCII characters for. Anyway, here's some more D code using non-ASCII identifiers, in case you need examples: https://git.ikeran.org/dhasenan/muzikilo But aren't we arguing about the wrong thing here? D already accepts non-ASCII identifiers. What languages need an upgrade to unicode symbol names? In other words, what symbols aren't possible with the current support? Or maybe I'm misunderstanding something. -Steve
Re: Jai compiles 80,000 lines of code in under a second
On 9/21/18 10:19 AM, Nicholas Wilson wrote: On Friday, 21 September 2018 at 09:21:34 UTC, Petar Kirov [ZombineDev] wrote: I have been watching Jonathan Blow's Jai for a while myself. There are many interesting ideas there, and many of them are what made me like D so much in the first place. It's very important to note that the speed claims he has been making are all a matter of developer discipline. You can have an infinite loop executed at compile-time in both D and Jai. You're going to OOM pretty fast in D if you try :) I can see the marketing now, "D finds infinite loops in compile-time code way faster than Jai!". -Steve
Re: Truly @nogc Exceptions?
On 9/20/18 1:58 PM, Adam D. Ruppe wrote: On Thursday, 20 September 2018 at 17:14:12 UTC, Steven Schveighoffer wrote: I don't know how a performance problem can occur on an error being thrown anyway -- the process is about to end. Walter's objection was code size - it would throw stuff out of cache lines, even if it doesn't need to actually run. So like this line:

int[] a;
a = a[1 .. $];

With no bounds checking, it is just "inc a.ptr;" but with bounds checking it becomes something more like:

mov ecx, a.length
cmp ecx, 1          // in other words, if length >= offset
jae proceed
push line
push file
call d_arraybounds  // throws the error
proceed:
inc a.ptr

Now, what my patch did was just, right before push line, it inserted "push length; push offset;". I believe this to be trivial since they are already loaded in registers or immediates and thus just a couple bytes for those instructions, but Walter (as I recall, this was a while ago and I didn't look up his exact words when writing this) said even a couple bytes are important for such a common operation, as it throws off the L1 caches. I never got around to actually measuring the performance impact to prove one way or another. But... even if that is a problem, dmd -O will usually rearrange that to avoid the jump on the in-bounds case, and I'm sure ldc/gdc do too, so the extra pushes' instruction bytes are off the main execution path anyway and thus shouldn't waste cache space. idk though. regardless, to me, the extra info is *well* worth the cost anyway. Sounds like a case of premature optimization at best. Besides, if it is a performance issue, you aren't doing bounds checks on every slice/index anyway. I know in iopipe, to squeeze out every bit of performance, I avoid bounds checks when I know from previous asserts the bounds are correct. -Steve
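The trick mentioned at the end looks something like this (a sketch, not iopipe's actual code; indexing through .ptr is @system precisely because it skips the bounds check):

int sum(int[] a, size_t i, size_t j) @system
{
    assert(i <= j && j <= a.length); // bounds proven once, up front
    int total = 0;
    foreach (k; i .. j)
        total += a.ptr[k]; // no per-element bounds check
    return total;
}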
Re: Truly @nogc Exceptions?
On 9/20/18 12:24 PM, Adam D. Ruppe wrote: On Thursday, 20 September 2018 at 15:52:03 UTC, Steven Schveighoffer wrote: I needed to know what the slice parameters that were failing were. Aye. Note that RangeError is called by the compiler though, so you gotta patch dmd to make it pass the arguments to it for index. Ugh. I did a PR for this once but it got shot down because of an alleged (without evidence btw) performance degradation. Ugh. Well, you can always override that. Just do the check yourself and throw the error you want ;) In my case, that's what I did anyway. I don't know how a performance problem can occur on an error being thrown anyway -- the process is about to end. -Steve
Re: Truly @nogc Exceptions?
On 9/20/18 11:33 AM, Adam D. Ruppe wrote: On Wednesday, 19 September 2018 at 21:16:00 UTC, Steven Schveighoffer wrote: As Andrei says -- Destroy! Nah, I agree. Actually, I'm of the opinion that string error messages in exceptions ought to be considered harmful: you shouldn't be doing strings at all. All the useful information should be in the type - the class name and the members with details. Well, defining a new class can sometimes be a mild hassle... but for really common ones, we really should just do it, and other ones can be done as templated classes or templated factory functions that define a new class right there and then. http://arsdnet.net/dcode/exception.d That's the proof-of-concept I wrote for this years ago, go to the bottom of the file for the usage example. It uses a reflection mixin to make writing the new classes easy, and I even wrote an enforce thing that can add more info by creating a subclass that stores arguments to functions so it can print it all (assuming they are sane to copy like strings or value things lol):

enforce!fopen("nofile.txt".ptr, "rb".ptr);

MyExceptionBase@exception.d(38): fopen call failed
    filename = nofile.txt
    mode = rb

Awesome! This is just what I was thinking of. In fact, I did something similar locally since I needed to know what the slice parameters that were failing were. I still had to trick the @nogc to get around the "new Exception" piece. The printMembers thing is nice. I think for druntime/phobos, however, we should have a base that just calls a virtual function with the idea that the message is printed, and then a further-derived type could do the printMembers thing if that's what you want. -Steve
Re: Truly @nogc Exceptions?
On 9/20/18 11:06 AM, H. S. Teoh wrote: On Thu, Sep 20, 2018 at 08:48:13AM -0400, Steven Schveighoffer via Digitalmars-d wrote: [...] But this means you still have to build msg when throwing the error/exception. It's not needed until you print it, and there's no reason anyway to make it allocate, even with RAII. For some reason D forces msg to be built, but it doesn't e.g. build the entire stack trace string beforehand, or build the string that shows the exception class name or the file/line beforehand. [...] IIRC, originally the stacktrace was also built at exception construction time. But it was causing a major performance hit, so eventually someone changed the code to construct it lazily (i.e., only when the catcher actually tries to look it up). I think it makes sense to also make .msg lazy, if the exception object is already carrying enough info to build the message when the catcher asks for it. And if the catcher doesn't ask for it, we saved an extra GC allocation (which is a plus even if we're not trying to go @nogc). Except we DON'T construct the stack trace string, even lazily. If you look at the code I posted, it's output directly to the output buffer (via the sink delegate), without ever having allocated. I think we can do that for the message too (why not, it's all supported). But either one (using GC at print time, or lazily outputting to buffer at print time) solves the biggest problem -- being able to construct an exception without the GC. Plus, this nudges developers of exceptions to store more useful data. If you catch an exception that has details in it, possibly it is only going to be in the string, which you now have to *parse* to get out what the problem was. If instead it was standard practice just to store the details, and then construct the string later, more useful information would be available in the form of fields/accessors. Think about this -- every ErrnoException that is thrown allocates its message via the GC on construction. Even if you catch that and just look at the errno code. Even with dip1008: https://github.com/dlang/phobos/blob/6a15dfbe18f9151379f6337f53a3c41d12dee939/std/exception.d#L1625 -Steve
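A sketch of what a field-carrying, lazily-messaged errno exception could look like (a hypothetical type using the message() hook described above, built only from druntime facilities):

class LazyErrnoException : Exception
{
    int code;

    this(int code, string file = __FILE__, size_t line = __LINE__) @nogc nothrow
    {
        super(null, file, line); // no message string built at construction
        this.code = code;
    }

    override const(char)[] message() const
    {
        import core.stdc.string : strerror, strlen;
        auto s = strerror(code); // built only if someone actually asks
        return s[0 .. strlen(s)];
    }
}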
Re: Truly @nogc Exceptions?
On 9/20/18 6:48 AM, Atila Neves wrote: On Wednesday, 19 September 2018 at 21:16:00 UTC, Steven Schveighoffer wrote: Given dip1008, we now can throw exceptions inside @nogc code! This is really cool, and helps make code that uses exceptions or errors @nogc. Except... The mechanism to report what actually went wrong for an exception is a string passed to the exception during *construction*. Given that you likely want to make such an exception inside a @nogc function, you are limited to passing a compile-time-generated string (either a literal or one generated via CTFE). I expressed my concern for DIP1008 and the `msg` field when it was first announced. I think the fix is easy and a one line change to dmd. I also expressed this on that thread but was apparently ignored. What's the fix? Have the compiler insert a call to the exception's destructor at the end of the `catch(scope Exception)` block. That's it. The `msg` field is just a slice, point it to RAII managed memory and you're good to go. Give me deterministic destruction of exceptions caught by scope when using dip1008 and I'll give you @nogc exception throwing immediately. I've even already written the code! I thought it already did that? How is the exception destroyed when dip1008 is enabled? But this means you still have to build msg when throwing the error/exception. It's not needed until you print it, and there's no reason anyway to make it allocate, even with RAII. For some reason D forces msg to be built, but it doesn't e.g. build the entire stack trace string beforehand, or build the string that shows the exception class name or the file/line beforehand. -Steve
Re: Truly @nogc Exceptions?
On 9/19/18 7:53 PM, Seb wrote: On Wednesday, 19 September 2018 at 21:28:56 UTC, Steven Schveighoffer wrote: On 9/19/18 5:16 PM, Steven Schveighoffer wrote: One further thing: I didn't make the sink version of message @nogc, but in actuality, it could be. We recently introduced support for output ranges in the formatting of Phobos: https://dlang.org/changelog/2.079.0.html#toString Output ranges have the advantage that they could be @nogc and because of the templatization also @safe. I don't think that will work here, as Throwable is a class. All I can think of is that you would have 2 versions, a @nogc one that takes a @nogc delegate, and one that is not. Of course, your exception type could define something completely separate, and you can deal with it in your own project as needed. If there was a way to say "this is @nogc if you give it a @nogc delegate, and not if you don't", that would be useful. The compiler could verify it at compile-time. -Steve
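The closest thing available today is attribute inference via a templated sink, which is also exactly why it can't be a virtual method on Throwable (a sketch, with a hypothetical struct standing in for the exception):

struct Details
{
    int code;

    // inferred: this instantiation is @nogc/nothrow/@safe exactly when `sink` is
    void message(Sink)(scope Sink sink) const
    {
        import core.internal.string : unsignedToTempString;
        char[20] tmpBuff = void;
        sink("error code ");
        sink(unsignedToTempString(code, tmpBuff, 10));
    }
}

A @nogc caller passing a @nogc delegate gets a @nogc instantiation; a GC-allocating sink still works too, just without the attribute.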
Re: Truly @nogc Exceptions?
On 9/19/18 5:16 PM, Steven Schveighoffer wrote: One further thing: I didn't make the sink version of message @nogc, but in actuality, it could be. Notice how it allocates using the stack. Even if we needed some indeterminate amount of memory, it would be simple to use C malloc/free, or alloca. But traditionally, we don't put any attributes on these base functions. Would it make sense in this case? Aand, no we can't. Because the sink could actually allocate. -Steve
Truly @nogc Exceptions?
Given dip1008, we now can throw exceptions inside @nogc code! This is really cool, and helps make code that uses exceptions or errors @nogc. Except... The mechanism to report what actually went wrong for an exception is a string passed to the exception during *construction*. Given that you likely want to make such an exception inside a @nogc function, you are limited to passing a compile-time-generated string (either a literal or one generated via CTFE). To demonstrate what I mean, let me give you an example member function inside a type containing 2 fields, x and y:

void foo(int[] arr)
{
    auto s = arr[x .. y];
}

There are 2 ways this can throw a range error: a) x > y b) y > arr.length But which is it? And what are x and y, or even the array length? The error message we get is basic (module name and line number aren't important here):

core.exception.RangeError@testerror.d(6): Range violation

Not good enough -- we have all the information present to give a more detailed message. Why not:

Attempted slice with wrong ordered parameters, 5 .. 4

or

Slice parameter 6 is greater than length 5

All that information is available, yet we don't see anything like that. Let's look at the base of all exception and error types to see why we don't have such a thing. The part which prints this message is the member function toString inside Throwable, repeated here for your reading pleasure [1]:

void toString(scope void delegate(in char[]) sink) const
{
    import core.internal.string : unsignedToTempString;
    char[20] tmpBuff = void;
    sink(typeid(this).name);
    sink("@");
    sink(file);
    sink("(");
    sink(unsignedToTempString(line, tmpBuff, 10));
    sink(")");
    if (msg.length)
    {
        sink(": ");
        sink(msg);
    }
    if (info)
    {
        try
        {
            sink("\n");
            foreach (t; info)
            {
                sink("\n");
                sink(t);
            }
        }
        catch (Throwable)
        {
            // ignore more errors
        }
    }
}

(Side Note: there is an overload for toString which takes no delegate and returns a string. But since this overload is present, doing e.g. writeln(myEx) will use it) Note how this *doesn't* allocate anything. But hang on, what about the part that actually prints the message:

sink(typeid(this).name);
sink("@");
sink(file);
sink("(");
sink(unsignedToTempString(line, tmpBuff, 10));
sink(")");
if (msg.length)
{
    sink(": ");
    sink(msg);
}

Hm... Note how the file name and the line number are all *members* of the exception, and there was no need to allocate a special string to contain the message we saw. So it *is* possible to have a custom message without allocation. It's just that the only interface for details is via the `msg` string member field -- which is only set on construction. We can do better. I noticed that there is a @__future member function inside Throwable called message. This function returns the message that the Throwable is supposed to display (defaulting to return msg). I believe this was inserted at Sociomantic's request, because they need to be able to have a custom message rendered at *print* time, not *construction* time [2]. This makes sense -- why do we need to allocate some string that will never be printed (in the case where an exception is caught and handled)? This helps alleviate the problem a bit, as we could construct our message at print-time when the @nogc requirement is no longer present. But we can do even better. What if we added ALSO a function:

void message(scope void delegate(in char[]) sink)

In essence, this does *exactly* what the const(char)[] returning form of message does, but it doesn't require any allocation, nor storage of the data to print inside the exception.
We can print numbers (and other things) and combine them together with strings just like the toString function does. We can then replace the code for printing the message inside toString with this:

bool printedColon = false;
void subSink(in char[] data)
{
    if (!printedColon && data.length > 0)
    {
        sink(": ");
        printedColon = true;
    }
    sink(data);
}
message(&subSink);

In this case, we then have a MUCH better mechanism to implement our desired output from the slice error:

class RangeSliceError : Throwable
{
    size_t lower;
    size_t upper;
    size_t len;
    ...
    override void message(scope void delegate(in char[]) sink)
    {
        import core.internal.string : unsignedToTempString;
        char[20] tmpBuff = void;
        if (lower > upper)
        {
            sink("Attempted slice with wrong ordered parameters ");
            sink(unsignedToTempString(lower, tmpBuff, 10));
Re: Small @nogc experience report
On 9/19/18 1:13 PM, Shachar Shemesh wrote: There is a catch, though. Writing Mecca with @nogc required re-implementing quite a bit of druntime. Mecca uses its own exception allocations (mkEx, just saw it's not yet documented, it's under mecca.lib.exception). The same module also has "enforceNGC". We also have our own asserts. This is partially to support our internal logging facility, which needs a static list of formats, but it also solves a very important problem with D's @nogc:

void func() @nogc
{
    assert(condition, string); // string is useless without actual info about what went wrong.
    assert(condition, format(string, arg, arg)); // No good - format is not @nogc
    ASSERT!"format"(condition, arg, arg); // @nogc and convenient
}

So, yes, we do use @nogc, but it took a *lot* of work to do it. I'm running into this coincidentally right now, when trying to debug a PR. I found I'm getting a range error deep inside a phobos function. But because Phobos is trying to be pure @nogc nothrow @safe, I can do almost nothing to display what is wrong. What I ended up doing is making an extern(C) hook that had the "right" attributes, even though it's not @nogc (let's face it, you are about to crash anyway). But it got me thinking, what a useless interface to display errors we have! Inside Throwable, there is the function toString(someDelegate sink) which prints out the exception trace. Near the front there is this:

if (msg.length)
{
    sink(": ");
    sink(msg);
}

My, wouldn't it be nice to be able to override this! And forget about the whole msg BS. When an exception trace is printed, there are almost no restrictions as to what can be done. We should delay the generation of the message until then as well! Not to mention that if we can output things piecemeal through the sink, we don't even have to allocate at all. I'm going to write up a more detailed post on this, but it's annoying to throw exceptions without any information EXCEPT what can be converted into a string at runtime at the time of exception. All that is missing is this hook to generate the message. -Steve
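The ASSERT idea sketches out roughly like this (not mecca's actual implementation, just the shape of it; a real version would format the arguments into the string without the GC):

void ASSERT(string fmt, Args...)(bool cond, Args args) @nogc nothrow
{
    if (!cond)
    {
        import core.stdc.stdio : fprintf, stderr;
        // placeholder: print only the raw format string
        fprintf(stderr, "assertion failed: %.*s\n", cast(int) fmt.length, fmt.ptr);
        assert(0);
    }
}

Usage would look like ASSERT!"x=%s y=%s"(x < y, x, y); -- since the format string is a template parameter, it is statically known, which is what makes a static list of formats (as mentioned above) possible.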
Re: extern(C++, ns) is wrong
On 9/18/18 9:49 PM, Jonathan M Davis wrote: On Tuesday, September 18, 2018 6:22:55 PM MDT Manu via Digitalmars-d wrote: https://github.com/dlang/dmd/pull/8667 O_O Thank you Walter for coming to the party! Oh, wow. I sure wasn't expecting that. I thought that he'd made it pretty clear that a DIP was needed, and even then, it didn't seem likely that it would be accepted. This is awesome. I guess that he finally came around. I think a big part is that the implementation was done. I think there's a big difference between "I don't really love this, but crap, I'll have to implement it all" and "I don't really love this, but the implementation isn't too intrusive, and all I have to do is click the merge button, OK." I sure wish I had the skills to hack dmd, there are so many ideas I'd like to see implemented in the language :) Anyways, great to see this merged! -Steve
Re: extern(C++, ns) is wrong
On 9/14/18 6:41 PM, Neia Neutuladh wrote: Specifically, Walter wants this to compile:

module whatever;
extern(C++, foo) void doStuff();
extern(C++, bar) void doStuff();

And he's not too concerned that you might have to use doubly fully qualified names to refer to C++ symbols, like:

import core.stdcpp.sstream;
import core.stdcpp.vector;
core.stdcpp.vector.std.vector v;

This is probably the best explanation of why the current situation sucks. -Steve
Re: int/long Re: DIP 1015--removal of integer & character literal conversion to bool--Final Review
On 9/15/18 8:36 PM, Nicholas Wilson wrote: Without it, I get (possibly quite a lot of) deprecation warnings and I have to insert a cast to the corresponding type, e.g. f(cast(int)E.a)/g(cast(long)(a - b)), to verify the behaviour under the new system and silence the deprecation warning (absolutely necessary if using `-de`). Then I have to delete them after stage 2, but what if I want to support older compilers? Well then I have to wait until they are sufficiently old. As precedent, we do have -transition=intpromote, which disables the requirement for casting smaller integers to int first. So the way I would expect someone to migrate their project:

1. Examine all deprecations, looking for ones where I actually *WANT* the bool version to be called. Insert cast there.
2. Enable the -transition=nobooldemote or whatever we call it.
3. Once the deprecation period is over, remove the -transition switch.

If I wanted a version to be compilable with older versions of dmd, then I would have to cast. Hm... another option is to have a switch identify "useless" casts once the deprecation period is over. Or add it into dfix. -Steve
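In miniature, the migration looks something like this (a sketch with hypothetical overloads, relying on the well-known surprise that a bool overload outcompetes a long overload for the literal 1):

void f(bool b) {}
void f(long l) {}

void main()
{
    f(1);            // today: 1 converts to bool, and f(bool) is the more
                     // specialized match -- this is what gets deprecated
    f(cast(bool) 1); // step 1 remedy, where the bool version is really wanted
    f(1L);           // or make the long version explicit
}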
Re: DIP 1015--removal of integer & character literal conversion to bool--Final Review
On 9/15/18 6:29 PM, Mike Franklin wrote: On Saturday, 15 September 2018 at 20:07:06 UTC, Steven Schveighoffer wrote: Looks pretty good to me. The only question I have is on this part: enum YesNo : bool { no, yes } // Existing implementation: OK // After stage 1: Deprecation warning // After stage 2: Error // Remedy: `enum YesNo : bool { no = false, yes = true }` Why is this necessary? I can't see how there are integer literals being used here, or how implicitly going from `false` to `true` in the 2 items being enumerated is going to be confusing. You're right, I just tested the implementation, and this is not necessary. I'll remove it. Thanks! Then I have no objections, looks like a nice positive change to me! -Steve
Re: DIP 1015--removal of integer & character literal conversion to bool--Final Review
On 9/14/18 6:41 AM, Mike Parker wrote: DIP 1015, "Deprecation and removal of implicit conversion from integer and character literals to bool", is now ready for Final Review. This is a last chance for community feedback before the DIP is handed off to Walter and Andrei for the Formal Assessment. Please read the procedures document for details on what is expected in this review stage: https://github.com/dlang/DIPs/blob/master/PROCEDURE.md#final-review The current revision of the DIP for this review is located here: https://github.com/dlang/DIPs/blob/299f81c2352fae4c7fa097de71308d773dcd9d01/DIPs/DIP1015.md In it you'll find a link to and summary of the previous review round. This round of review will continue until 11:59 pm ET on September 28 unless I call it off before then. Thanks in advance for your participation. Looks pretty good to me. The only question I have is on this part: enum YesNo : bool { no, yes } // Existing implementation: OK // After stage 1: Deprecation warning // After stage 2: Error // Remedy: `enum YesNo : bool { no = false, yes = true }` Why is this necessary? I can't see how there are integer literals being used here, or how implicitly going from `false` to `true` in the 2 items being enumerated is going to be confusing. -Steve
Re: More fun with autodecoding
On 9/15/18 12:04 PM, Neia Neutuladh wrote: On Saturday, 15 September 2018 at 15:31:00 UTC, Steven Schveighoffer wrote: The problem I had was that it wasn't clear to me which constraint was failing. My bias brought me to "it must be autodecoding again!". But objectively, I should have examined all the constraints to see what was wrong. All C++ concepts seem to do (haven't used them) is help identify easier which requirements are failing. They also make it so your automated documentation can post a link to something that describes the type in more cases. std.algorithm would still be relatively horked, but a lot of functions could be declared as yielding, for instance, ForwardRange!(ElementType!(TRange)). True, we currently rely on convention there. But this really is simply documentation at a different (admittedly more verified) level. We can fix all these problems by simply identifying the constraint clauses that fail. By color coding the error message identifying which ones are true and which are false, we can pinpoint the error without changing the language. I wish. I had a look at std.algorithm.searching.canFind as the first thing I thought to check. Its constraints are of the form: bool canFind(Range)(Range haystack) if (is(typeof(find!pred(haystack)))) The compiler can helpfully point out that the specific constraint that failed was is(...), which does absolutely no good in trying to track down the problem. is(typeof(...)) constraints might be useless here, but we have started to move away from such things in general (see for instance isInputRange and friends). But there could actually be a solution -- just recursively play out the items at compile time (probably with the verbose switch) to see what underlying cause there is. Other than that, you can then write find(myrange) and see what comes up. In my case even, the problem was hasSlicing, which itself is a complicated template, and wouldn't have helped me diagnose the real problem. A recursive display of what things failed would help, but even just a way to trigger diagnosis of hasSlicing, instead of copying all the constraints locally, would be a much better situation. I'm really thinking of exploring how this could play out; just toying with the compiler to do this would give me experience in how the thing works. -Steve
Re: Proposal: __not(keyword)
On 9/14/18 11:06 AM, Adam D. Ruppe wrote: It also affects attrs brought through definitions though:

shared class foo
{
    int a; // automatically shared cuz of the above line of code
    __not(shared) int b; // no longer shared
}

Aside from Jonathan's point, which I agree with, that the attr(bool) mechanism (attributes taking a boolean argument) would be preferable in generic code (think not just negating existing attributes, but determining how to forward them), the above is different than just negation. Making something unshared *inside* something that is shared breaks transitivity, and IMO the above simply would be the same as not having any attribute there. In other words, I would expect:

shared foo f;
static assert(is(typeof(f.b) == shared(int)));

I'm not sure how the current behavior works, but definitely wanted to clarify that we can't change something like that without a major language upheaval. -Steve
Re: More fun with autodecoding
On 9/13/18 3:53 PM, H. S. Teoh wrote: On Thu, Sep 13, 2018 at 06:32:54PM -0400, Nick Sabalausky (Abscissa) via Digitalmars-d wrote: On 09/11/2018 09:06 AM, Steven Schveighoffer wrote: Then I found the true culprit was isForwardRange!R. This led me to requestion my sanity, and finally realized I forgot the empty function. This is one reason template-based interfaces like ranges should be required to declare themselves as deliberately implementing said interface. Sure, we can tell people they should always `static assert(isForwardRange!MyType)`, but that's coding by convention and clearly isn't always going to happen. No, please don't. I've used C# and Swift, and this sucks compared to duck typing. Yeah, I find myself writing `static assert(isInputRange!MyType)` all the time these days, because you just never can be too sure you didn't screw up and cause things to mysteriously fail, even though they shouldn't. Although I used to be a supporter of free-form sig constraints (and still am to some extent) and a hater of Concepts like in C++, more and more I'm beginning to realize the wisdom of Concepts rather than free-for-all duck typing. It's one of those things that work well in small programs and fast, one-shot projects, but don't generalize so well as you scale up to larger and larger projects. The problem I had was that it wasn't clear to me which constraint was failing. My bias brought me to "it must be autodecoding again!". But objectively, I should have examined all the constraints to see what was wrong. All C++ concepts seem to do (haven't used them) is help identify easier which requirements are failing. We can fix all these problems by simply identifying the constraint clauses that fail. By color coding the error message identifying which ones are true and which are false, we can pinpoint the error without changing the language. Once you fix the issue, it doesn't error any more, so the idea of duck typing and constraints is sound, it's just difficult to diagnose. -Steve
Re: More fun with autodecoding
On 9/11/18 7:58 AM, jmh530 wrote: Is there any reason why this is not sufficient? [1] https://run.dlang.io/is/lu6nQ0 That's OK if you are the only one defining S. But what if float is handled elsewhere? -Steve
Re: More fun with autodecoding
On 9/10/18 7:00 PM, Nicholas Wilson wrote: On Monday, 10 September 2018 at 20:44:46 UTC, Andrei Alexandrescu wrote: On 9/10/18 12:46 PM, Steven Schveighoffer wrote: On 9/10/18 8:58 AM, Steven Schveighoffer wrote: I'll have to figure out why my specialized range doesn't allow splitting based on " ". And the answer is: I'm an idiot. Forgot to define empty :) Also my slicing operator accepted ints and not size_t. I guess a better error message would be in order. https://github.com/dlang/DIPs/pull/131 will help narrow down the cause. While this would help eventually, I'd prefer something that just transforms all the existing code into useful error messages. See my response to Andrei. -Steve
Re: More fun with autodecoding
On 9/10/18 1:44 PM, Andrei Alexandrescu wrote: On 9/10/18 12:46 PM, Steven Schveighoffer wrote: On 9/10/18 8:58 AM, Steven Schveighoffer wrote: I'll have to figure out why my specialized range doesn't allow splitting based on " ". And the answer is: I'm an idiot. Forgot to define empty :) Also my slicing operator accepted ints and not size_t. I guess a better error message would be in order. A better error message would help prevent the painful diagnosis that I had to do to actually find the issue. So the error I got was this:

source/bufref.d(346,36): Error: template std.algorithm.iteration.splitter cannot deduce function from argument types !()(Result, string), candidates are:
/Users/steves/.dvm/compilers/dmd-2.081.0/osx/bin/../../src/phobos/std/algorithm/iteration.d(3792,6): std.algorithm.iteration.splitter(alias pred = "a == b", Range, Separator)(Range r, Separator s) if (is(typeof(binaryFun!pred(r.front, s)) : bool) && (hasSlicing!Range && hasLength!Range || isNarrowString!Range))
/Users/steves/.dvm/compilers/dmd-2.081.0/osx/bin/../../src/phobos/std/algorithm/iteration.d(4163,6): std.algorithm.iteration.splitter(alias pred = "a == b", Range, Separator)(Range r, Separator s) if (is(typeof(binaryFun!pred(r.front, s.front)) : bool) && (hasSlicing!Range || isNarrowString!Range) && isForwardRange!Separator && (hasLength!Separator || isNarrowString!Separator))
/Users/steves/.dvm/compilers/dmd-2.081.0/osx/bin/../../src/phobos/std/algorithm/iteration.d(4350,6): std.algorithm.iteration.splitter(alias isTerminator, Range)(Range r) if (isForwardRange!Range && is(typeof(unaryFun!isTerminator(r.front))))
/Users/steves/.dvm/compilers/dmd-2.081.0/osx/bin/../../src/phobos/std/algorithm/iteration.d(4573,6): std.algorithm.iteration.splitter(C)(C[] s) if (isSomeChar!C)

This means I had to look at each line, figure out which overload I'm calling, and then copy all the constraints locally, seeing which ones were true and which ones false. But it didn't stop there. The problem was hasSlicing!Range. If you look at hasSlicing, it looks like this:

enum bool hasSlicing(R) = isForwardRange!R
    && !isNarrowString!R
    && is(ReturnType!((R r) => r[1 .. 1].length) == size_t)
    && (is(typeof(lvalueOf!R[1 .. 1]) == R) || isInfinite!R)
    && (!is(typeof(lvalueOf!R[0 .. $])) || is(typeof(lvalueOf!R[0 .. $]) == R))
    && (!is(typeof(lvalueOf!R[0 .. $])) || isInfinite!R
        || is(typeof(lvalueOf!R[0 .. $ - 1]) == R))
    && is(typeof((ref R r) { static assert(isForwardRange!(typeof(r[1 .. 2]))); }));

Now I had to instrument a whole slew of items. I pasted this whole thing into my code, added an alias to my range type for R, and then changed the big boolean expression to a bunch of static asserts. Then I found the true culprit was isForwardRange!R. This led me to requestion my sanity, and finally realized I forgot the empty function. A fabulous fantastic mechanism that would have saved me some time is simply coloring the clauses of the template constraint that failed red, the ones that passed green, and the ones that weren't evaluated grey. Furthermore, it would be good to recursively continue this for red clauses like `hasSlicing` which have so much underneath. Either that, or a way to trigger the colored evaluation on demand. If I were a dmd guru, I'd look at doing this myself. I may still try and hack it in just to see if I can do it. -- Finally, there is a possible bug in the definition of hasSlicing: it doesn't require the slice parameters be size_t, but there are places (e.g.
inside std.algorithm.searching.find) that pass in range.length .. range.length for slicing the range. In my implementation I had used ints as the parameters for opSlice. So I started seeing errors deep inside std.algorithm saying there was no overload for slicing. Again the sanity was questioned, and I figured out the error and now it's actually working. -Steve
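The instrumentation described above looks something like this (a sketch; MyBufferRange is a hypothetical stand-in for the custom range under test):

import std.range.primitives;
import std.traits;

alias R = MyBufferRange; // the range being diagnosed
static assert(isForwardRange!R); // <- the clause that actually failed (no empty!)
static assert(!isNarrowString!R);
static assert(is(ReturnType!((R r) => r[1 .. 1].length) == size_t));
static assert(is(typeof(lvalueOf!R[1 .. 1]) == R) || isInfinite!R);
// ...and so on, one clause of hasSlicing per static assert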
Re: More fun with autodecoding
On 9/10/18 8:58 AM, Steven Schveighoffer wrote: I'll have to figure out why my specialized range doesn't allow splitting based on " ". And the answer is: I'm an idiot. Forgot to define empty :) Also my slicing operator accepted ints and not size_t. -Steve
Re: More fun with autodecoding
On 9/8/18 8:36 AM, Steven Schveighoffer wrote: On 8/9/18 2:44 AM, Walter Bright wrote: On 8/8/2018 2:01 PM, Steven Schveighoffer wrote: Here's where I'm struggling -- because a string provides indexing, slicing, length, etc. but Phobos ignores that. I can't make a new type that does the same thing. Not only that, but I'm finding the specializations of algorithms only work on the type "string", and nothing else. One of the worst things about autodecoding is it is special, it *only* steps in for strings. Fortunately, however, that specialness enabled us to save things with byCodePoint and byCodeUnit. So it turns out that technically the problem here, even though it seemed like an autodecoding problem, is a problem with splitter. splitter doesn't deal with encodings of character ranges at all. For instance, when you have this: "abc 123".byCodeUnit.splitter; What happens is splitter only has one overload that takes one parameter, and that requires a character *array*, not a range. So the byCodeUnit result is aliased-this to its original, and surprise! the elements from that splitter are string. Next, I tried to use a parameter: "abc 123".byCodeUnit.splitter(" "); Nope, still devolves to string. It turns out it can't figure out how to split character ranges using a character array as input. Hm... I made some erroneous assumptions in determining these problems. 1. There is no alias this for the source in ByCodeUnitImpl. I'm not sure how it was working when I tested before, but byCodeUnit definitely doesn't have it, and doesn't compile with the no-arg splitter call. 2. The .splitter(" ") does actually work and return a range of ByCodeUnitImpl elements. So some of my analysis must have been based on bad testing. However, the issue with the no-arg splitter is still there, and I still think it should be fixed. I'll have to figure out why my specialized range doesn't allow splitting based on " ". -Steve
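A quick compilable check of the corrected behavior, as I understand the post above:

import std.algorithm.comparison : equal;
import std.algorithm.iteration : splitter;
import std.utf : byCodeUnit;

void main()
{
    // With an explicit separator, splitter does handle the code-unit
    // range, and the elements are ByCodeUnitImpl slices, not strings:
    auto parts = "abc 123".byCodeUnit.splitter(" ");
    assert(parts.front.equal("abc"));

    // The no-argument overload (split on whitespace) is the one that
    // still accepts only character arrays:
    // auto ws = "abc 123".byCodeUnit.splitter; // does not compile
}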
Re: More fun with autodecoding
On 9/8/18 8:36 AM, Steven Schveighoffer wrote: I'll work on adding some issues to the tracker, and potentially doing some PRs so they can be fixed. https://issues.dlang.org/show_bug.cgi?id=19238 https://github.com/dlang/phobos/pull/6700 -Steve
Re: Will the core.stdc module be updated for newer versions of C?
On 9/7/18 6:12 PM, solidstate1991 wrote: While for the most part it still works very well, when porting Mago I found a few functions that are not present in C99 (most notably wcsncpy_s). It will be updated when you update it ;) There is just so much in the stdc libraries that it's difficult to achieve 100% coverage. The intention is that any time you have a #include for some C standard header, you can do import core.stdc.someFilePath in D. If there are missing functions, and they aren't OS specific, please file a bug report, and if you're up to it, add the function in a PR. -Steve
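A trivial illustration of the header mapping:

// C:  #include <string.h>
// D:  import core.stdc.string;
import core.stdc.string : memcmp, strlen;

void main()
{
    // D string literals are null-terminated, so .ptr can be passed to C
    assert(strlen("hello".ptr) == 5);
    assert(memcmp("abc".ptr, "abd".ptr, 3) < 0);
}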
Re: More fun with autodecoding
On 9/10/18 1:45 AM, Chris wrote: After a while your code will be cluttered with absurd stuff like this. `.byCodeUnit`, `.byGrapheme`, `.array` etc. Due to my experience with `splitter` et al. I tried to create my own parser to have better control over every step. I considered that, but I'm still trying to make this buffer reference thing work. Phobos just needs to be fixed. This is actually not as hopeless as I once thought. But what needs to happen is that all of Phobos's algorithms need to be tested with byCodeUnit et al. After a few *minutes* of testing things I ran into this bug [1] that didn't get fixed till early 2018. I never started to write my own step-by-step parser. I'm glad I didn't. It actually was fixed accidentally in 2017 in this PR: https://github.com/dlang/druntime/pull/1952. The bug was closed in 2018 when someone noticed the code no longer failed. Essentially, the whole string switch algorithm was replaced with a completely rewritten, better approach. This is a great example of why we should be moving more of the compiler magic into the library -- it's just easier to write and understand there. I wish people would begin to realize that string handling is a basic necessity and that the correct handling of strings is of utmost importance. Please keep us updated on how things work out (or not) for you. Absolutely, D needs to have great support for string parsing and manipulation. The potential is awesome. I will keep it up. What I'm trying to fix is the fact that using std.algorithm to extract pieces from a buffer, but then using the position in that buffer to determine things (i.e. parsing), is really difficult without some stupid requirements like pointer math. [Please, nobody answer my post pointing out that a) we don't understand Unicode and b) that it's an insult to the Universe to draw attention to flaws that keep pestering us on an almost daily basis - without trying to fix them ourselves stante pede. As is clear from Steve's efforts, the Universe doesn't seem to care.] I don't characterize it as the universe not caring. Phobos has a legacy problem with string handling, and it needs to somehow be addressed -- either by painfully extracting the problem, or painfully working around it. I don't think anyone here thinks there isn't a problem or that it's insulting to bring it up. But anything that needs to be done is painful either way, which is why it's not happening very fast. -Steve
Re: More fun with autodecoding
On 8/9/18 2:44 AM, Walter Bright wrote: On 8/8/2018 2:01 PM, Steven Schveighoffer wrote: Here's where I'm struggling -- because a string provides indexing, slicing, length, etc. but Phobos ignores that. I can't make a new type that does the same thing. Not only that, but I'm finding the specializations of algorithms only work on the type "string", and nothing else. One of the worst things about autodecoding is it is special, it *only* steps in for strings. Fortunately, however, that specialness enabled us to save things with byCodePoint and byCodeUnit. So it turns out that technically the problem here, even though it seemed like an autodecoding problem, is a problem with splitter. splitter doesn't deal with encodings of character ranges at all. For instance, when you have this: "abc 123".byCodeUnit.splitter; What happens is splitter only has one overload that takes one parameter, and that requires a character *array*, not a range. So the byCodeUnit result is aliased-this to its original, and surprise! the elements from that splitter are string. Next, I tried to use a parameter: "abc 123".byCodeUnit.splitter(" "); Nope, still devolves to string. It turns out it can't figure out how to split character ranges using a character array as input. The only thing that does seem to work is this: "abc 123".byCodeUnit.splitter(" ".byCodeUnit); But this goes against most algorithms in Phobos that deal with character ranges -- generally you can use any width character range, and it just works. Having a drop-in replacement for string would require splitter to handle these transcodings (and I think in general, algorithms should be able to handle them as well). Not only that, but the specialized splitter that takes no separator can split on multiple spaces, a feature I want to have for my drop-in replacement. I'll work on adding some issues to the tracker, and potentially doing some PRs so they can be fixed. -Steve
Re: More fun with autodecoding
On 9/8/18 8:36 AM, Steven Schveighoffer wrote: Sent this when I was on a plane, and for some reason it posted with the timestamp when I hit "send later", not when I connected just now. So this is to bring the previous message back to the forefront. -Steve
Re: This is why I don't use D.
On 9/5/18 4:40 PM, Nick Sabalausky (Abscissa) wrote: On 09/04/2018 09:58 PM, Jonathan M Davis wrote: On Tuesday, September 4, 2018 7:18:17 PM MDT James Blachly via Digitalmars-d wrote: Are you talking about this? https://github.com/clinei/3ddemo which hasn't been updated since February 2016? This is part of why it's sometimes been discussed that we need a way to indicate which dub packages are currently maintained and work. What we need is for DUB to quit pretending the compiler (and DUB itself, for that matter) isn't a dependency just like any other. I pointed this out years ago over at DUB's GitHub project, but pretty much just got silence. The compiler doesn't change all that often, and when it does, it's usually a long deprecation cycle. The problem really is that phobos/druntime change all the time. And they are tied to the compiler. I think some sort of semver scheme really should be implemented for the compiler and phobos. But we need more manpower to handle that. -Steve
Re: This is why I don't use D.
On 9/5/18 11:46 AM, Dennis wrote: On Wednesday, 5 September 2018 at 13:27:48 UTC, Steven Schveighoffer wrote: 3ddemo has one commit. In February 2016. I think it would be an amazing feat indeed if a project with one version builds for more than 2 years in any language. This problem is not about 3ddemo. I can totally relate to the OP: when I started learning D (we're talking April 2017 here), I tried many OpenGL demos and GUI libraries. I like learning by example, so I tried a lot of them on both Ubuntu and Windows. My success rate of building them was below 20%, and even if it did succeed, it often still had deprecation warnings, or linking errors when loading the required shared libraries, or glitches like messed up text rendering. I would try to fix it myself, but the error messages were not clear at all for a beginner and Googling them yielded few results. I should say, I have little experience with or understanding of building opengl stuff. My experience with trying stuff is that it's very finicky about which libraries are installed or how the environment has to be properly set up. Even after I got 3ddemo to compile, it didn't run; it wouldn't open certain libraries. So I think it would be nice if these experiences were better, but I don't know what the core D projects need to do here. My guess is that there is not a lot of manpower making proper easy-to-use 3d libraries. We're not even only talking about small unmaintained projects here: at the time I tried it, Gtk-D was broken.[1] Out of frustration I carried on in C# for a while, and guess what: the first OpenTK demo I found basically worked on the first try. Now I didn't give up on D, but I can totally understand that others (like OP) don't have the patience to put up with this. I think Gtk-D has gotten a lot better (not my experience, but Tilix seems to be doing well with it) since then. While we can't force volunteers to keep their D projects up to date, we could try to give some incentive by notifying them via code.dlang.org, or give users information on what compiler / environment is required for dub packages to build. It might prevent some new users from leaving D out of frustration. I think a "known good" configuration entry, even if it's manual, would be a good thing to add. -Steve
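For what it's worth, dub's toolchainRequirements setting is one shape such an entry could take (a sketch only; check the dub documentation for the exact syntax):

# dub.sdl -- declare the toolchain this package is known to build with
name "somedemo"
toolchainRequirements frontend=">=2.076 <2.082"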
Re: This is why I don't use D.
On 9/4/18 8:49 PM, Everlast wrote: I downloaded 3ddemo, extracted, built and I get these errors: logger 2.66.0: building configuration "library"... \dub\packages\logger-2.66.0\logger\std\historical\logger\core.d(1717,16): Error: cannot implicitly convert expression logger of type shared(Logger) to std.historical.logger.core.Logger \dub\packages\logger-2.66.0\logger\std\historical\logger\core.d(261,21): Error: no property fracSec for type const(SysTime), did you mean std.datetime.systime.SysTime.fracSecs? \dub\packages\logger-2.66.0\logger\std\historical\logger\filelogger.d(86,27): Error: template instance `std.historical.logger.core.systimeToISOString!(LockingTextWriter)` error instantiating dmd.exe failed with exit code 1. This is typical with most of my trials with D... something is always broken and I'm expected to jump through a bunch of hoops to get it to work. File an issue, fix it myself, use a different library, etc. I'm expected to waste my time fixing a problem that really should not exist or should have a high degree of automation to help fix it. I really have better things to do with my time so I won't invest it in D. 3ddemo has one commit. In February 2016. I think it would be an amazing feat indeed if a project with one version builds for more than 2 years in any language. I built it successfully with DMD 2.076 (I just picked a random old version). So it's still usable, you just have to know what version of the compiler to use. I'd say it would be nice to record which version it builds with in some way on code.dlang.org. This attitude of "It's your problem" is going to kill D. I wouldn't say the attitude is "It's your problem", but more that you can't expect a completely unmaintained, scantily tested piece of software to magically work because it's written in D. In this phase of D's life, things just aren't going to stay buildable. We are making too many changes to the language and the standard library to say that D is going to build things today that were buildable 2+ years ago. In time, this will settle down, and D will be much more stable. I'd recommend coming back and checking again later. But I would definitely suggest not looking for older projects to test with. There is really no incentive for me to use D except for its language features... everything else it does, besides performance, is shit compared to what most other languages do. Really, D wins on very few metrics but the D fanboys will only focus on those. Sounds like you have other problems than buildability, so maybe D just isn't right for you. Thanks for stopping by! -Steve
Re: This thread on Hacker News terrifies me
On 9/1/18 6:29 AM, Shachar Shemesh wrote: On 31/08/18 23:22, Steven Schveighoffer wrote: On 8/31/18 3:50 PM, Walter Bright wrote: https://news.ycombinator.com/item?id=17880722 Typical comments: "`assertAndContinue` crashes in dev and logs an error and keeps going in prod. Each time we want to verify a runtime assumption, we decide which type of assert to use. We prefer `assertAndContinue` (and I push for it in code review)," e.g. D's assert. Well, actually, D doesn't log an error in production. I think it's the music of the thing rather than the thing itself. Mecca has ASSERT, which is a condition always checked and that always crashes the program if it fails, and DBG_ASSERT, which, like D's built-in assert, is skipped in release mode (essentially, an assert where you can log what went wrong without using the GC-dependent format). When you compare this to what Walter was quoting, you get the same end result, but a vastly different intention. It's one thing to say "this ASSERT is cheap enough to be tested in production, while this DBG_ASSERT one is optimized out". It's another to say "well, in production we want to keep going no matter what, so we'll just ignore the asserts". Which is exactly what Phobos and Druntime do (ignore asserts in production). I'm not sure how the intention makes any difference. The obvious position of D is that asserts and bounds checks shouldn't be used in production -- that is how we ship our libraries. It is what the "-release" switch does. How else could it be interpreted? -Steve
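In outline, the distinction Shachar describes might look like this (a sketch of the shape only, not Mecca's actual implementation):

import core.exception : AssertError;

// Always checked, in debug and production alike; always crashes on failure.
void ASSERT(string file = __FILE__, size_t line = __LINE__)
           (bool condition, string msg = "assertion failed")
{
    if (!condition)
        throw new AssertError(msg, file, line);
}

// Like the built-in assert: compiled out (and the condition never even
// evaluated, thanks to lazy) when asserts are disabled, e.g. with -release.
void DBG_ASSERT(string file = __FILE__, size_t line = __LINE__)
               (lazy bool condition, string msg = "assertion failed")
{
    version (assert)
    {
        if (!condition)
            throw new AssertError(msg, file, line);
    }
}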
Re: This thread on Hacker News terrifies me
On 8/31/18 3:50 PM, Walter Bright wrote: https://news.ycombinator.com/item?id=17880722 Typical comments: "`assertAndContinue` crashes in dev and logs an error and keeps going in prod. Each time we want to verify a runtime assumption, we decide which type of assert to use. We prefer `assertAndContinue` (and I push for it in code review)," e.g. D's assert. Well, actually, D doesn't log an error in production. -Steve
Re: std.encoding:2554 - Unrecognized Encoding: big5 - Please help!
On 8/28/18 9:38 AM, spikespaz wrote: I have a user who submitted a bug report for one of my projects. The error is in std\encoding.d on line 2554. The problem arises when he types "google.com.hk/search?q={{query}}" (exact string) into this function: https://github.com/spikespaz/search-deflector/blob/master/source/setup.d#L253-L278 Here is the issue on GH: https://github.com/spikespaz/search-deflector/issues/12 I really can't figure this one out and any help would be appreciated. Thanks. The answer is: the encoding "big5" isn't supported by std.encoding. You can add a new subclass and register it, to handle the encoding. Maybe this article can help you write one: https://en.wikipedia.org/wiki/Big5 You want to subclass EncodingScheme. See instructions at the top of https://dlang.org/phobos/std_encoding.html -Steve
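To make the failure mode concrete, and to show where a registered subclass would slot in (a small sketch built around the std.encoding lookup the error comes from):

import std.encoding : EncodingScheme;

void main()
{
    // Phobos registers its own schemes (ASCII, latin-1, UTF-8, ...):
    auto utf8 = EncodingScheme.create("utf-8");

    // ...but nothing is registered under "big5", so this throws -- which
    // is what the std\encoding.d line 2554 error boils down to. A Big5
    // subclass of EncodingScheme that registers itself would make it work.
    try
    {
        auto big5 = EncodingScheme.create("big5");
    }
    catch (Exception e)
    {
        // "Unrecognized Encoding: big5"
    }
}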
Re: Is @safe still a work-in-progress?
On 8/24/18 10:25 PM, Walter Bright wrote: On 8/23/2018 6:32 AM, Steven Schveighoffer wrote: Furthermore any member function (or UFCS function for that matter) REQUIRES the first parameter to be the aggregate. How do you make a member function that stuffs the return into a different parameter properly typecheck? What I propose is that the function interface be refactored so it does fit into these patterns. Is that an unreasonable requirement? I don't know. But it doesn't seem to be, as I haven't run into it yet. So this would mean a member function would have to be refactored into a different function with a different calling syntax. i.e.: x.foo(target); would have to be refactored to: target.foo(x); or foo(target, x); Aside from the adjustment in name that is necessary to make this read correctly, that may cause other problems (sometimes non-member functions aren't available if it's a template instantiation). Phobos doesn't do this by accident. It's how constructors work (see above) and how pipeline programming works. Constructors I agree are reasonable to consider `this` to be the return value. On that point, I would say we should definitely go ahead with making that rule, and I think it will lead to no confusion whatsoever. Pipeline programming depends on returning something other than `void`, so I don't see how this applies. grep Phobos for instances of put() and see its signature. It's part of pipeline programming, and it's all over the place. I stand partly corrected! Indeed you can put a void-returning function at the *end* of a pipeline call; I hadn't thought of that. But in terms of put, strictly speaking, any call of some.pipeline.put(x) is wrong. It should be put(some.pipeline, x), to avoid issues with how put was designed. It would restrict your legitimate calls. Maybe that's a good thing. Having multiple simultaneous routes of data out of a function is not good practice (note that it is impossible with functional programming). If you absolutely must have it, the exit routes can be aggregated into a struct, then pass that struct as the first argument. Maybe it's better to designate one sink, and have that be the result. I know that after inout was implemented, there were definitely cases where one wanted to have multiple inout routes (i.e. independent traces between multiple parameters for copying mutability). It may be the same for this, I don't know. I want to stress that it may be a valid solution, but we should strive to prove the solutions are the best possible rather than just use duct-tape methodology. I don't know how to prove anything with programming languages. I don't mean prove like mathematical proof. I mean try to consider how this affects all cases instead of just the one case that will make phobos compile. "show" is a better verb than "prove". It should even be considered that perhaps there are better solutions even than the approach dip1000 has taken. People have hypothesized that for several years, and so far none have been forthcoming beyond a few hand-wavy generalities. I'm just saying if dip1000 cannot fix all the problems, that instead of adding weird exceptions, or the classic "you're just doing it wrong", maybe we should reconsider the approach. Another case which was brought up and pretty much ignored was this one: https://forum.dlang.org/post/qkrdpmdqaxjadgvso...@forum.dlang.org I also want to point out that the attitude of 'we could just fix it, but nobody will pull my request' is unhelpful.
We want to make sure we have the best solution possible, don't take criticism as meaningless petty bickering. People are genuinely trying to make sure D is improved. Hostility towards reviews or debate doesn't foster that. I'm not hostile to debate. I just don't care for "this is uncharted territory, so let's do nothing" which has been going on for probably 4 years now, coincident with "scope is incomplete, D sux". I.e. lead, follow, or get out of the way :-) I'm opting for the latter, as the idea of band-aid PRs to get Phobos compiling with dip1000 just to see if dip1000 is going to work seems like the wrong approach to me. The consequence of this is that getting out of the way means your PRs don't get pulled. -Steve
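To make the put() point above concrete (hypothetical sink type):

import std.range.primitives : put;

struct Sink
{
    char[] data;
    void put(char c) { data ~= c; }
}

void main()
{
    Sink s;
    put(s, "hello");    // fine: the free function feeds the member one
                        // element at a time, doing UTF conversion as needed
    // s.put("hello");  // error: the member itself only accepts a single char
    assert(s.data == "hello");
}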
Re: Is @safe still a work-in-progress?
On 8/24/18 10:28 PM, Walter Bright wrote: On 8/23/2018 8:14 AM, Steven Schveighoffer wrote: If I had to design a specific way to allow the common case to be easy, but still provide a mechanism for the uncommon cases, I would say: 1. define a compiler-recognized attribute (e.g. @__sink). 2. If @__sink is applied to any parameter, that is effectively the return value. 3. In the absence of a @__sink designation on non-void-returning functions, it applies to the return value. 4. In the absence of a @__sink designation on void returning functions, it applies to the first parameter. 5. Inference of @__sink happens even on non-templates. 6. If @__sink is attributed on multiple parameters, you assume all return parameters are assigned to all @__sink parameters for the purposes of verifying lifetimes are not exceeded. 'ref' is already @__sink. No, otherwise we wouldn't need the patch you are pushing. -Steve
Re: D is dead
On 8/24/18 6:16 PM, Jonathan M Davis wrote: On Friday, August 24, 2018 7:46:57 AM MDT Mike Franklin via Digitalmars-d wrote: On Friday, 24 August 2018 at 13:21:25 UTC, Jonathan M Davis wrote: I think that you're crazy. No, I just see more potential in D than you do. To be clear, I'm not calling you crazy in general. I'm calling the idea of bypassing libc to call syscalls directly under any kind of normal circumstances crazy. There is tons of work to be done around here to improve D, and IMHO, reimplementing OS functions just because they're written in C is a total waste of time and an invitation for bugs - in addition to making the druntime code that much less portable, since it bypasses the API layer that was standardized for POSIX systems. Let me say that I agree with both Jonathan and Mike. I think we should reduce Phobos's dependence on the user-library part of libc, while at the same time not reinventing how the OS bindings are called. For example, using Martin's std.io library instead of C's stdio. I really don't want to see dlang have to maintain posix system calls on all supported OSes when that's already being done for us. Windows makes this simpler -- the system calls are separate from the C runtime. It would be nice if Posix systems were that way, but it's both silly to reinvent the system calls (they are on every OS anyways, and in shared-library form), and a maintenance nightmare. For platforms that DON'T have an OS abstraction split out from the user-library part of libc, it would be perfectly acceptable to write a shim there if needed. I'd be surprised if it's not already present in C form. -Steve
Re: Embrace the from template?
On 8/24/18 6:29 PM, Jonathan Marler wrote: On Friday, 24 August 2018 at 20:36:06 UTC, Seb wrote: On Friday, 24 August 2018 at 20:04:22 UTC, Jonathan Marler wrote: I'd gladly fix it but alas, my pull requests are ignored :( They aren't! It's just that sometimes the review queue is pretty full. I have told you before that your contributions are very welcome (like they are from everyone else) and if there's anything blocking your productivity you can always ping me on Slack. Don't tempt me to start contributing again :) I had months where I got almost no attention on a dozen or so PRs... I love to contribute but I'd have to be mad to continue throwing dozens of hours of work away. I thought we were going to get the unittest import problem solved, but then you closed the PR abruptly (we did get phobos to stop compiling with -dip1000 so we could work around the linker errors). In any case, I can understand the feeling of frustration. I also have no power to force reviews from the people who make important decisions, so I can't guarantee it won't happen again. I myself would love to have the time to get more reps with the compiler code, but I'm hopelessly lost when reviewing dmd stuff currently. If the problem gets solved I'll willingly start working again, but I don't think anything's changed. I'll just be blunt -- I don't think "the problem" is ever going to get solved. This is the world of volunteer OSS development, and nobody has control over anyone's time but themselves. Things could go great for a month and then stagnate. If you hit on something that some VIP is looking to solve, it may get a lot of attention. But I would recommend letting a PR stay open, pinging reviewers, etc. instead of closing them. Don't give up hope that it will ever be merged. -Steve
Re: Dicebot on leaving D: It is anarchy driven development in all its glory.
On 8/24/18 5:12 PM, Meta wrote: On Friday, 24 August 2018 at 17:12:53 UTC, H. S. Teoh wrote: I got bitten by this just yesterday. Update dmd git master, update vibe.d git master, now my vibe.d project doesn't compile anymore due to some silly string.d error somewhere in one of vibe.d's dependencies. :-/ While we're airing grievances about code breakages, I hit this little gem the other day, and it annoyed me enough to complain about it: https://github.com/dlang/phobos/pull/5291#issuecomment-414196174 What really gets me is the actual removal of the symbol. If it had been left there with a deprecation message, I would've caught the problem immediately at the source and fixed it in a few minutes. Instead, I spent an hour or so tracing "execution" paths through a codebase that I'm unfamiliar with to figure out why a static if branch is no longer being taken. According to this comment: https://github.com/dlang/phobos/pull/5291#issuecomment-360929553 There was no way to get a deprecation to work. When we can't get a deprecation to work, we face a hard decision -- actually break code right away, print lots of crappy errors, or just leave the bug unfixed. -Steve
Re: D is dead
On 8/23/18 12:22 PM, Shachar Shemesh wrote: On 23/08/18 17:01, Steven Schveighoffer wrote: I'm not saying all bugs you file will be fixed, but all bugs you *don't* file will definitely not be fixed. So far, my experience is that it has about the same chances of being fixed both ways, and not filing takes less effort. I have had much better success with bugs being fixed for issues that I file vs. hoping someone fixes it without a report, not just in D's ecosystem, but pretty much anywhere. But that's the choice you make. We'll have to disagree on that one. It's hard to fix bugs without reports, but hey, maybe you will get lucky and someone fixes them by accident. -Steve
Re: D is dead
On 8/23/18 12:27 PM, Shachar Shemesh wrote: On 23/08/18 17:01, Steven Schveighoffer wrote: So interestingly, you are accepting the sockaddr by VALUE. Indeed. The reason is that I cannot accept them by reference, as then you wouldn't be able to pass lvalues* in. Another controversial decision by D. *rvalues you meant. One that is actively being addressed, at least by the community: https://github.com/dlang/DIPs/blob/master/DIPs/DIP1016.md No guarantees it gets through, but this is probably further than anyone has ever gotten before on this topic (and it's a very old topic). Had that been C++, I'd definitely get a const ref instead. If you want to use inheritance, this is a given, in D or in C++. What this means is your identification of the problem is simply wrong -- it's not that you can't make subtypes with structs (you can), it's that you can't accept rvalues by reference, and accepting by reference is required for inheritance. -Steve
Re: Is @safe still a work-in-progress?
On 8/23/18 9:32 AM, Steven Schveighoffer wrote: On 8/23/18 4:58 AM, Walter Bright wrote: On 8/22/2018 6:50 AM, Steven Schveighoffer wrote: As for things being made "more flexible in the future", this basically translates to code breakage. For example, if you are depending on only the first parameter being considered the "return" value, and all of a sudden it changes to encompass all your parameters, your existing code may fail to compile, even if it's correctly safe and properly annotated. It's a good point. But I don't see an obvious use case for considering all the ref parameters as being returns. You would have to consider the shortest lifetime and assume everything goes there. It would restrict your legitimate calls. The only mitigating factor may be if you take the ones you aren't going to modify as const or inout. Actually, thinking about this, the shortest lifetime is dictated by how it is called, so there is no valid way to determine which one makes sense when compiling the function. In order for this to work, you'd have to attribute it somehow. I can see that is likely going to be way more cumbersome than it's worth. If I had to design a specific way to allow the common case to be easy, but still provide a mechanism for the uncommon cases, I would say: 1. define a compiler-recognized attribute (e.g. @__sink). 2. If @__sink is applied to any parameter, that is effectively the return value. 3. In the absence of a @__sink designation on non-void-returning functions, it applies to the return value. 4. In the absence of a @__sink designation on void returning functions, it applies to the first parameter. 5. Inference of @__sink happens even on non-templates. 6. If @__sink is attributed on multiple parameters, you assume all return parameters are assigned to all @__sink parameters for the purposes of verifying lifetimes are not exceeded. Ugly to specify, but might actually be pretty non-intrusive to use. -Steve
Re: D is dead
On 8/23/18 9:22 AM, Shachar Shemesh wrote: On the other hand, look at ConnectedSocket.connect: https://weka-io.github.io/mecca/docs/mecca/reactor/io/fd/ConnectedSocket.connect.html Why do I need two forms? What good is that? Why is the second form a template? Answer: Because in D, structs can't inherit, and I cannot define an implicit cast. What I'd really want to do is to have SockAddrIPv4 be implicitly castable to SockAddr, so that I can pass a SockAddrIPv4 to any function that expects SockAddr. Except what I'd _really_ like to do is for them to be the same thing. I'd like inheritance. Except I can't do that for structs, and if I defined SockAddr as a class, I'd mandate allocating it on the GC, violating the whole point behind writing Mecca to begin with. So interestingly, you are accepting the sockaddr by VALUE. Which eliminates any possibility of using inheritance meaningfully anyway (except that, depending how you define SockAddr, it may include all the data of the full derived address -- sockaddr is quirky that way, and NOT like true inheritance). You CAN use inheritance, just like you would with classes, but you have to pass by reference for it to make sense:

struct SockAddr
{
    int addressFamily; // forget what this really is called
    ...
}

struct SockAddrIPv4
{
    SockAddr base;
    ref SockAddr getBase() { return base; }
    alias getBase this;
    ...
}

Now, you can pass SockAddrIPv4 into a ref SockAddr, check the address family, and cast to the correct thing. Just like you would with classes and inheritance. You can even define nice mechanisms for this. e.g.:

struct SockAddr
{
    ...
    ref T opCast(T)() if (isSomeSockaddr!T)
    {
        assert(addressFamily == T.requiredAddressFamily);
        return *cast(T*)&this;
    }
}

To summarize: Weka isn't ditching D, and people aren't even particularly angry about it. It has problems, and we've learned to live with them, and that's that. This sounds more like what I would have expected, so thank you for clarifying. The general consensus, however, is that these problems will not be resolved (we used to file bugs in Bugzilla. We stopped doing that because we saw nothing happen with them), and as far as the future of the language goes, that's bad news. Bugs do get fixed, there is just no assigned timeframe for having them fixed. An all-volunteer workforce has this issue. It took 10 (I think) years for bug 304 to get fixed. It was a huge pain in the ass, but it did get fixed. I wouldn't stop filing them, definitely file them. If they are blocking your work, complain about them loudly, every day. But not filing them doesn't help anyone. I'm not saying all bugs you file will be fixed, but all bugs you *don't* file will definitely not be fixed. -Steve
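Condensing the SockAddr pattern from the post above into a runnable sketch (the field and constant names are made up, not the real sockaddr layout):

struct SockAddr
{
    int addressFamily;

    // reinterpret as a concrete address type, checking the discriminator
    ref T as(T)()
        if (is(typeof(T.init.base) == SockAddr))
    {
        assert(addressFamily == T.family);
        return *cast(T*)&this;
    }
}

struct SockAddrIPv4
{
    enum family = 2;                  // stand-in for AF_INET
    SockAddr base = SockAddr(family); // must be the first member
    uint addr;
    ushort port;

    ref SockAddr getBase() { return base; }
    alias getBase this;
}

// written against the "base class", taking it by reference
ushort v4Port(ref SockAddr sa)
{
    assert(sa.addressFamily == SockAddrIPv4.family);
    return sa.as!SockAddrIPv4.port; // "downcast" back to the concrete type
}

void main()
{
    SockAddrIPv4 addr;
    addr.port = 80;
    assert(v4Port(addr) == 80); // implicit "upcast" via alias this
}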
Re: Is @safe still a work-in-progress?
On 8/23/18 4:58 AM, Walter Bright wrote: On 8/22/2018 6:50 AM, Steven Schveighoffer wrote: What about: size_t put(sink, parameters...) Does this qualify as the sink being the "return" type? Obviously the real return can't contain any references, so it trivially can be ruled out as the destination of any escaping parameters. Your reasoning is correct, but currently it only applies with 'void' return types. Or how about a member function that takes a ref parameter? Is `this` the "return" or is the ref parameter the "return"? `this` is the ref parameter. In particular, consider a constructor: struct S { int* p; this(return scope int* p) { this.p = p; } } int i; S s = S(&i); This code appears in Phobos, and it is very reasonable to expect it to check as safe. What I mean to say is, we have a semantic today -- the return value is hooked to any `return` parameters, end of story. This is clear, concise, and easy to understand. You are saying that in some cases, the return value is actually deposited in the `this` parameter. In cases where the actual return type is void, OK, I see that we can tack on that rule without issues. Furthermore any member function (or UFCS function for that matter) REQUIRES the first parameter to be the aggregate. How do you make a member function that stuffs the return into a different parameter properly typecheck? What rule do we tack on then? It's going to be confusing to anyone who writes their API thinking about how its call syntax reads, not how the compiler wants to do flow analysis. Not to mention, the keyword is `return`, not `returnorfirstparam`. It's still going to be confusing no matter how you look at it. My problem with the idea is that it is going to seem flaky -- we are using convention to dictate what is actually the return parameter, vs. what semantically happens inside the function. It's going to confuse anyone trying to do it a different way. I've experienced this in the past with things like toHash, where if you didn't define it with the exact signature, it wouldn't actually be used. I realize, obviously, that `put` is already specified. But as I said in the bug report, we should think twice about defining rules based solely on how Phobos does things, and calling that the solution. Phobos doesn't do this by accident. It's how constructors work (see above) and how pipeline programming works. Constructors I agree are reasonable to consider `this` to be the return value. On that point, I would say we should definitely go ahead with making that rule, and I think it will lead to no confusion whatsoever. pipeline programming depends on returning something other than `void`, so I don't see how this applies. It's more when you are setting members via properties where this comes into play. We need it -- we need this ability to tell the compiler "this parameter connects to this other parameter". I just don't know if the proposed rules are a) good enough for the general case, and b) don't cause more confusion than they are worth. As for things being made "more flexible in the future", this basically translates to code breakage. For example, if you are depending on only the first parameter being considered the "return" value, and all of a sudden it changes to encompass all your parameters, your existing code may fail to compile, even if it's correctly safe and properly annotated. It's a good point. But I don't see an obvious use case for considering all the ref parameters as being returns.
You would have to consider the shortest lifetime and assume everything goes there. It would restrict your legitimate calls. The only mitigating factor may be if you take the ones you aren't going to modify as const or inout. I want to ensure Atila is successful with this. But that means Phobos has to compile with dip1000. So I need to make it work. I think it's a very worthy goal to make Phobos work, and a great proof of concept for dip1000's viability. However, one-off rules just to make it work with existing code go against that goal IMO. Rules that stand on their own I think will fare better than ones that are loopholes to allow existing code to compile. I couldn't come up with a better idea than this, and this one works. I want to stress that it may be a valid solution, but we should strive to prove the solutions are the best possible rather than just use duct-tape methodology. It should even be considered that perhaps there are better solutions even than the approach dip1000 has taken. I also want to point out that the attitude of 'we could just fix it, but nobody will pull my request' is unhelpful. We want to make sure we have the best solution possible; don't take criticism as meaningless petty bickering. People are genuinely trying to make sure D is improved. Hostility towards reviews or debate doesn't foster that. -Steve
Re: D is dead
On 8/23/18 8:03 AM, Walter Bright wrote: On 8/23/2018 4:31 AM, Shachar Shemesh wrote: This is in the language spec: How many people know that without resorting to the specs. This is a little unfair. It's plainly stated in the documentation for foreach. Heck, I wrote a C compiler and the library for it, and yesterday I had to look up again how strncmp worked. I refer to the documentation regularly. Back when I designed digital circuits, I had a well-worn TTL data book on my desk, too. If it wasn't documented, or documented confusingly, it would be a fair point. On the point of opApply, the choice is quite obvious. Why would you put opApply in an aggregate if you didn't want to control foreach behavior? Once you think about it, there shouldn't really be any more discussion. Does it matter if it allows copying or not? For the preference for opApply, no. But it does for empty/front/popFront, which is exactly my point. If front() returns by ref, then no copying happens. If front() returns by value, then a copy is made. This should not be surprising behavior. I think he means, if the range ITSELF doesn't allow copying, it won't work with foreach (because foreach makes a copy), but it will work with opApply. If you're referring to #14246, I posted a PR for it. I don't see how that is pretending it isn't a problem. It is. When I first reported this, about 3 and a half years ago, the forum explained to me that this is working as expected. The forum can be anyone saying anything. A more reliable answer would be the bugzilla entry being closed as "invalid", which did not happen. There have been several people who I have spoken with in person, and also seen posted here, that say the forum is unfriendly or not open to criticism of D. I feel it's the opposite (in fact, most of the die-hard supporters are very critical of D), but everyone has their own experiences. There are many people who post short curt answers, maybe even cynical. But this isn't necessarily the authoritative answer. Where I see this happening, I usually try to respond with a more correct answer (even though my voice isn't authoratative exactly), but the sad truth is that we can't spend all our day making sure we have a super-pleasant forum where every answer is valid and nobody is rude. In reply to Shachar's general point: This whole thread seems very gloomy and final, but I feel like the tone does not match in my mind how D is progressing. "Every single one of the people [at Weka] rushing to defend D at the time has since come around." Seems like you all have decided to either ditch D internally, maybe moving forward, or accepted that Weka will fail eventually due to the choice of D? It sure reads that way. This is in SHARP contrast to the presentation that Liran gave at Dconf this year, touting D as a major reason Weka was able to develop what they did, and to some degree, your showcase of how Mecca works. My experience with D is that it has gotten much better over the years. I suppose that having worked with the earlier versions, and seeing what has happened gives me a different perspective. I guess I just don't have that feeling that there are some unfixable problems that will "kill" the language. Everything in a programming language is fixable, it just matters how much pain you are willing to deal with to fix it. If we get to a point where there really is a sticking point, D3 can be born. I do feel that we need, in general, more developers working on the compiler itself. 
So many of the problems need compiler changes, and the learning curve to me just seems so high to get into it. -Steve
Re: Friends don't let friends use inout with scope and -dip1000
On 8/22/18 4:17 AM, Kagamin wrote: On Tuesday, 21 August 2018 at 14:04:15 UTC, Steven Schveighoffer wrote: I would guess it's no different than other inferred attributes. I would also guess that it only gets promoted to a return parameter if it's actually returned. If we can't have properly typed parameters, it feels like it has potential to prevent some patterns. But scope is not part of the type, nor is return. One of my biggest concerns about dip1000 is that the "scope-ness" or "return-ness" of a variable is hidden from the type system. It's just the compiler doing flow analysis and throwing you an error when it can't work the thing out. I'm more worried about not being able to express the flow in a way that the compiler understands, and having it complain about things that are actually safe. This prevents automatic scope promotion: template escape(T) { int[] escape1(scope int[] r) { return r; } alias escape=escape1; } But that's not valid dip1000 code. If you call it, it should give a compiler error (r *does* escape its scope). -Steve
Re: Is @safe still a work-in-progress?
On 8/22/18 5:23 AM, Walter Bright wrote: On 8/21/2018 6:07 PM, Mike Franklin wrote: The proposed idea wants to make the first parameter, if it's `ref`, special. This is because Phobos is written with functions of the form: void put(sink, parameters...) which corresponds to: sink.put(parameters...) The two forms are fairly interchangeable, made more so by the Uniform Function Call Syntax. Why not the first `ref` parameter regardless of whether it's the absolute first in the list? Why not the last `ref` parameter? Why not all `ref` parameters? Good question. If this fairly restricted form solves the problems, then there is no need for the more flexible form. Things can always be made more flexible in the future, but tightening things can be pretty disruptive. Hence, unless there is an obvious and fairly strong case for the flexibility, it should be avoided for now. What about: size_t put(sink, parameters...) Does this qualify as the sink being the "return" type? Obviously the real return can't contain any references, so it trivially can be ruled out as the destination of any escaping parameters. Or how about a member function that takes a ref parameter? Is `this` the "return" or is the ref parameter the "return"? My problem with the idea is that it is going to seem flaky -- we are using convention to dictate what is actually the return parameter, vs. what semantically happens inside the function. It's going to confuse anyone trying to do it a different way. I've experienced this in the past with things like toHash, where if you didn't define it with the exact signature, it wouldn't actually be used. I realize, obviously, that `put` is already specified. But as I said in the bug report, we should think twice about defining rules based solely on how Phobos does things, and calling that the solution. As for things being made "more flexible in the future", this basically translates to code breakage. For example, if you are depending on only the first parameter being considered the "return" value, and all of a sudden it changes to encompass all your parameters, your existing code may fail to compile, even if it's correctly safe and properly annotated. I want to ensure Atila is successful with this. But that means Phobos has to compile with dip1000. So I need to make it work. I think it's a very worthy goal to make Phobos work, and a great proof of concept for dip1000's viability. However, one-off rules just to make it work with existing code go against that goal IMO. Rules that stand on their own I think will fare better than ones that are loopholes to allow existing code to compile. -Steve
Re: Engine of forum
On 8/21/18 10:08 AM, Ali wrote: On Tuesday, 21 August 2018 at 05:30:07 UTC, Walter Bright wrote: Ask 10 people, and you'll get 10 different answers on what a better forum would be. Actually I think we can get 8 out of those 10 to agree: rust, ocaml, fsharp, nim, scala, clojure... all use https://www.discourse.org/ I think this software is nowadays regarded as the best. Cool! Does it support an interface on top of a newsgroup server? Priority #1 in these parts. If people leave because of the forum software, changing it won't change that. I also agree with that, most people who leave probably leave for more objective reasons, like that the language doesn't answer their needs, or they didn't find the libraries they needed within the ecosystem etc... But what I really meant is that out of those who leave, there is possibly a very small percentage who left because they couldn't communicate effectively with the community, and that better communication channels in general (and a better forum software as an example) could have kept them around for longer. Replacing the forum software is a small change, a small win, and I expect small returns. But a small win is a win. On the contrary, many of the regular contributors here don't give a lick about the forum software, as long as it's primarily backed by the newsgroup server. Many, including myself use the NG server, many others use the mailing list interface. If the NG was ditched, I would have a big problem communicating, as I hate dealing with web forums. The forum software probably could be better in terms of formatting code (see for example vibe.d's forums which are ALSO NG backed and have code formatting features). Other than that, editing posts just doesn't make sense in terms of a mailing list or newsgroup. And it also doesn't make sense in terms of a discussion where things you thought you read mysteriously change. -Steve
Re: Friends don't let friends use inout with scope and -dip1000
On 8/21/18 9:42 AM, Kagamin wrote: except for templated functions: int[] escape(scope int[] r) { return r; //error, can't return scoped argument } int[] escape(return int[] r) { return r; //ok, just as planned } int[] escape(return scope int[] r) { return r; //ok, `return scope` reduced to just `return` } int[] escape(T)(scope int[] r) { return r; //ok! `scope` silently promoted to `return` } You can't have strictly scoped parameter in a templated function - it's silently promoted to return parameter. Is this intended? I would guess it's no different than other inferred attributes. I would also guess that it only gets promoted to a return parameter if it's actually returned. As long as the *result* is scoped like the parameter. In the case of the OP in this thread, there is definitely a problem with inout and the connection to the return value. -Steve
Re: Is @safe still a work-in-progress?
On 8/17/18 11:04 PM, Walter Bright wrote: On 8/17/2018 11:17 AM, bachmeier wrote: This is a good example of D needing to evolve or fizzle out. I don't see evidence that the community has yet figured out how to evolve the language. If it had, these problems would not be around for so many years. We deprecate features of D all the time. (Remember the D1 => D2 wrenching change?) Hm... if you are going for "all the time", the example of D1 to D2 transition is pretty dated. I'd say more like the addition of UDAs was a big evolution. Or maybe UFCS. The reason @safe cannot be default at the moment it because -dip1000 needs work, and nobody is willing to pitch in and review/pull my PRs on it. I would, but I have no idea how dip1000 is supposed to work. I think only you understand it. Even looking at the PR that you have been citing over and over, I can't make heads or tails of what it does or what it allows. -Steve
Re: Friends don't let friends use inout with scope and -dip1000
On 8/20/18 5:43 AM, Nicholas Wilson wrote: On Monday, 20 August 2018 at 09:31:09 UTC, Atila Neves wrote: On Friday, 17 August 2018 at 13:39:29 UTC, Steven Schveighoffer wrote: // used to be scope int* ptr() { return ints; } scope inout(int)* ptr() inout { return ints; } Does scope apply to the return value or the `this` reference? I assumed the return value. I think I've read DIP1000 about a dozen times now and I still get confused. As opposed to `const` or `immutable`, `scope(T)` isn't a thing so... I don't know? A type constructor affects the type of something. So const(int) is an int that is const. const int is actually NOT a type constructor, but a storage class. Its main effect is to make the int actually const(int), but it can have other effects (e.g. if it's a global, it may be put into global storage instead of thread-local). scope is not a type constructor, ever. So how do you specify the return type is scope? How do you specify a difference between the scope of the 'this' pointer, and the scope of the return value? I'm super-confused as to what dip1000 actually is doing, and how to use it. What usually happens is that qualifiers to the left of the name apply to the return type and those to the right apply to `this`. Not that that _should_ make any difference since lifetime ints == lifetime this. No: const int* foo() const { return null; } Error: redundant const attribute. Up until 2.080, this was a deprecation, and the result was int* What happens if you remove the return type? (i.e. scope auto) And write what instead? scope ptr() inout { return ints; } ? Yes, this is what I was thinking. -Steve
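For reference, the storage class vs. type constructor distinction mentioned above, in a few lines:

void main()
{
    int x = 1;
    const(int)* p1 = &x; // type constructor: mutable pointer to const int
    const int* p2 = &x;  // storage class: the whole thing is const(int*)

    p1 = null;           // fine, the pointer itself is mutable
    // p2 = null;        // error: p2 itself is const
    // *p1 = 2;          // error: the target is const either way
}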
Re: Is @safe still a work-in-progress?
On 8/17/18 1:26 PM, jmh530 wrote: On Friday, 17 August 2018 at 14:26:07 UTC, H. S. Teoh wrote: [...] And that is exactly why the whole implementation of @safe is currently rather laughable. By blacklisting rather than whitelisting, we basically open the door wide open to loopholes -- anything that we haven't thought of yet could potentially be a @safe-breaking combination, and we wouldn't know until somebody discovers and reports it. Sadly, it seems there is little interest in reimplementing @safe to use whitelisting instead of blacklisting. T Fundamentally, I see it as a good idea. Walter has talked about how important memory safety is for D. People thinking their @safe code is safe is a big problem when that turns out to not be the case. Imagine the black eye D would have if a company was hacked because of something like this? This will always be a possibility thanks to @trusted. IMO, the problem is that you can't just replace @safe as it is now. You could introduce something like @whitelist or @safewhitelist and begin implementing it, but it would probably be some time before it could replace @safe. Like when @whitelist is only breaking unsafe code. I have to say, I don't see how this all helps. In theory, black-listing and white-listing will get you to the same position. Mechanisms to get or use pointers aren't really being added to the language any more, so the set of "things" to "list" either black or white is finite. In this thread, we are talking about something that should have been black-listed LONG ago, but was not because it was "too useful" (i.e. would break too much code). If @safe code was white-listed, nobody would use it until it was finished, so it would be theoretical anyway. Nobody wants a feature that is @safe, but not useful. However, a bigger problem is that we have a bug that is "fixed" (slicing static arrays) but only if you use a feature that doesn't work correctly (dip1000). Why? I think the bug should be reopened until dip1000 is the default, or it simply gets fixed (i.e. without requiring dip1000). -Steve
Re: Friends don't let friends use inout with scope and -dip1000
On 8/17/18 3:36 AM, Atila Neves wrote: Here's a struct:

struct MyStruct
{
    import core.stdc.stdlib;
    int* ints;

    this(int size) @trusted { ints = cast(int*) malloc(size); }
    ~this() @trusted { free(ints); }

    scope int* ptr() { return ints; }
}

Let's try and be evil with -dip1000:

@safe:
// struct MyStruct ...
const(int)* gInt;

void main()
{
    auto s = MyStruct(10);
    gInt = s.ptr;
}

% dmd -dip1000 scope_inout.d
scope_inout.d(26): Error: scope variable this may not be returned

Yay! What if instead of `auto` I write `const` (or immutable)? This is D we're talking about, so none of this boilerplate nonsense of writing two (or three) basically identical functions. So:

// used to be scope int* ptr() { return ints; }
scope inout(int)* ptr() inout { return ints; }

Does scope apply to the return value or the `this` reference? What happens if you remove the return type? (i.e. scope auto)

% dmd -dip1000 scope_inout.d
% echo $?
0  # nope, no error here

Wait, what? Turns out now it compiles. After some under-the-breath mumbling I go hit issues.dlang.org and realise that the issue already exists: https://issues.dlang.org/show_bug.cgi?id=17935 I don't see what this bug report has to do with the given case. For reasons unfathomable to me, this is considered the _correct_ behaviour. Weirder still, writing out the boilerplate that `inout` is supposed to save us (mutable, const and immutable versions) doesn't compile, which is what one would expect. So: @safe + inout + scope + dip1000 + custom memory allocation in D gets us to the usability of C++ circa 1998. At least now we have valgrind and asan I guess. "What about template this?", I hear you ask. It kinda works. Sorta. Kinda. Behold: scope auto ptr(this T)() { return ints; } After changing the definition of `ptr` this way the code compiles fine and `ints` is escaped. Huh. However, if you change `auto s` to `scope s`, it fails to compile as intended. Very weird. This seems like a straight-up bug. If you change the destructor to `scope` then it also fails to compile even if it's `auto s`. Because, _obviously_, that's totally different. I'd file an issue but given that the original one is considered not a bug for some reason, I have no idea whether what I just wrote is right or not. What I do know is I found multiple ways to do nasty things to memory under the guise of @safe and -dip1000, and my understanding was that the compiler would save me from myself. In the meanwhile I'm staying away from `inout` and putting `scope` on my destructors even if I don't quite understand when destructors should be `scope`. Probably always? I have no idea. This doesn't surprise me. I'm beginning to question whether scope shouldn't have been a type constructor instead of a storage class. It's treated almost like a type constructor in most places, but the language grammar makes it difficult to be specific as to which part it applies to. -Steve
Re: More fun with autodecoding
On 8/8/18 4:13 PM, Walter Bright wrote: On 8/6/2018 6:57 AM, Steven Schveighoffer wrote: But I'm not sure if the performance is going to be the same, since now it will likely FORCE autodecoding on the algorithms that have specialized versions to AVOID autodecoding (I think). Autodecoding is expensive which is why the algorithms defeat it. Nearly none actually need it. You can get decoding if needed by using .byDchar or .by!dchar (forgot which it was). There is byCodePoint and byCodeUnit, whereas byCodePoint forces auto decoding. The problem is, I want to use this wrapper just like it was a string in all respects (including the performance gains had by ignoring auto-decoding). Not trying to give too much away about the library I'm writing, but the problem I'm trying to solve is parsing out tokens from a buffer. I want to delineate the whole, as well as the parts, but it's difficult to get back to the original buffer once you split and slice up the buffer using phobos functions. Consider that you are searching for something in a buffer. Phobos provides all you need to narrow down your range to the thing you are looking for. But it doesn't give you a way to figure out where you are in the whole buffer. Up till now, I've done it by weird length math, but it gets tiring (see for instance: https://github.com/schveiguy/fastaq/blob/master/source/fasta/fasta.d#L125). I just want to know where the darned thing I've narrowed down is in the original range! So this wrapper I thought would be a way to use things like you always do, but at any point, you just extract a piece of information (a buffer reference) that shows where it is in the original buffer. It's quite easy to do that part, the problem is getting it to be a drop-in replacement for the original type. Here's where I'm struggling -- because a string provides indexing, slicing, length, etc. but Phobos ignores that. I can't make a new type that does the same thing. Not only that, but I'm finding the specializations of algorithms only work on the type "string", and nothing else. I'll try using byCodeUnit and see how it fares. -Steve