Re: shared - i need it to be useful

2018-10-19 Thread Steven Schveighoffer via Digitalmars-d

On 10/18/18 9:09 PM, Manu wrote:

On Thu, Oct 18, 2018 at 5:30 PM Timon Gehr via Digitalmars-d wrote:


On 18.10.18 23:34, Erik van Velzen wrote:

If you have an object which can be used in both a thread-safe and a
thread-unsafe way that's a bug or code smell.


Then why do you not just make all members shared? Because with Manu's
proposal, as soon as you have a shared method, all members effectively
become shared.


No they don't, only facets that overlap with the shared method.
I tried to present an example before:

struct Threadsafe
{
   int x;
   Atomic!int y;
   void foo() shared { ++y; } // <- shared interaction only affects 'y'
   void bar() { ++x; ++y; } // <- not threadsafe, but does not violate foo's
                            // commitment; only interaction with 'y' has any
                            // commitment associated with it
   void unrelated() { ++x; } // <- no responsibilities are transposed here,
                             // you can continue to do whatever you like
                             // throughout the class where 'y' is not concerned
}

In practise, and in my direct experience, classes tend to have exactly
one 'y', and either zero (pure utility), or many such 'x' members.
Threadsafe API interacts with 'y', and the rest is just normal
thread-local methods which interact with all members thread-locally,
and may also interact with 'y' while not violating any threadsafety
commitments.


I promised I wouldn't respond, I'm going to break that (obviously).

But that's because after reading this description I ACTUALLY understand 
what you are looking for.


I'm going to write a fuller post later, but I can't right now. But the 
critical thing here is, you want a system where you can divvy up a type 
into pieces you share and pieces you don't. But then you *don't* want to 
have to share only the shared pieces. You want to share the whole thing 
and be sure that it can't access your unshared pieces.
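For concreteness, a minimal sketch of that model as I read it. The names, the plain shared int standing in for Atomic!int, and the worker function are my own illustration, not code from the thread: the whole object is handed out as shared, and the shared view can only reach the thread-safe facet.

import core.atomic;
import std.concurrency;

struct Threadsafe
{
    int x;               // unshared facet: only the owning thread touches this
    shared int y;        // shared facet (stand-in for Atomic!int)

    void foo() shared { atomicOp!"+="(y, 1); } // threadsafe: only touches 'y'
    void bar() { ++x; }                        // thread-local only
}

void worker(shared(Threadsafe)* t)
{
    t.foo();      // fine: shared method
    // t.bar();   // rejected: non-shared method on a shared instance
}

void main()
{
    // Allocated shared so this compiles under today's rules; under Manu's
    // proposal a thread-local instance could be passed to worker implicitly.
    auto t = new shared(Threadsafe)();
    spawn(&worker, t);
    t.foo();
}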


This critical requirement makes things a bit more interesting. For the 
record, the most difficult thing about reaching this understanding was that 
whenever I proposed anything, your answer was something like 'I just 
can't work with that', and when I asked why, you said 'because it's 
useless', etc. Fully explaining this point is key to understanding 
your thinking.


To be continued...

-Steve


Re: shared - i need it to be useful

2018-10-18 Thread Steven Schveighoffer via Digitalmars-d

On 10/18/18 5:22 PM, Manu wrote:

On Thu, Oct 18, 2018 at 12:15 PM Steven Schveighoffer via Digitalmars-d wrote:


On 10/18/18 2:55 PM, Manu wrote:

On Thu, Oct 18, 2018 at 7:20 AM Steven Schveighoffer via Digitalmars-d wrote:


On 10/18/18 10:11 AM, Simen Kjærås wrote:

On Thursday, 18 October 2018 at 13:35:22 UTC, Steven Schveighoffer wrote:

struct ThreadSafe
{
 private int x;
 void increment()
 {
++x; // I know this is not shared, so no reason to use atomics
 }
 void increment() shared
 {
atomicIncrement(&x); // use atomics, to avoid races
 }
}


But this isn't thread-safe, for the exact reasons described elsewhere in
this thread (and in fact, incorrectly leveled at Manu's proposal).
Someone could write this code:

void foo() {
   ThreadSafe* a = new ThreadSafe();
   shareAllOver(a);


Error: cannot call function shareAllOver(shared(ThreadSafe) *) with type
ThreadSafe *


And here you expect a user to perform an unsafe-cast (which they may
not understand), and we have no language semantics to enforce the
transfer of ownership. How do you assure that the user yields the
thread-local instance?


No, I expect them to do:

auto a = new shared(ThreadSafe)();


I don't have any use for this design in my application.
I can't use the model you prescribe, at all.


Huh? This is the same thing you are asking for. How were you intending 
to make a thread-safe thing sharable? Surely it will be typed as shared, 
right? How else will you pass it to multiple threads?





I think requiring the cast is un-principled in every way that D values.


No cast is required. If you have shared data, it's shared. If you have
thread local data, it's unshared. Allocate the data the way you expect
to use it.


All data is thread-local, and occasionally becomes shared during periods.
I can't make use of the model you describe.


If data is shared, it is shared. Once it is shared, it never goes back.

In your model, everything is *assumed* shared, so that's what you need 
to do, initialize it as shared. It still works just as you like. Even if 
you never actually share it, or share it periodically.
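A minimal compilable sketch of that suggestion, using core.atomic.atomicOp in place of the pseudocode atomicIncrement used throughout this thread:

import core.atomic;

struct ThreadSafe
{
    private int x;
    void increment() shared { atomicOp!"+="(x, 1); }
}

void main()
{
    // Typed shared from construction onward, whether or not it is ever
    // actually handed to another thread.
    auto a = new shared(ThreadSafe)();
    a.increment();
}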



My proposal is more permissive, and allows a wider range of
application designs. What are the disadvantages?


The opposite is true. More designs are allowed by restricting casting as 
I have demonstrated many times.



It's only if you intend to turn unshared data into shared data where you
need an unsafe cast.


It's unnecessary though, because threadsafe functions are threadsafe!
You're pointlessly forcing un-safety. Why would I prefer a design that
forces unsafe interactions to perform safe operations?


No unsafe interactions are required for a type that defensively is 
shared. Just make it always shared, and you don't have any problems.



It's not even as difficult as immutable, because you can still modify
shared data. For instance, the shared constructor doesn't have to have
special rules about initialization, it can just assume shared from the
beginning.


Your design is immutable, mine is const.


No, your design is not const, const works on normal types. It's 
applicable to anything.


Your design is only applicable to special types that experts write. It's 
not applicable to int, for instance. It feels more like a special 
library than a compiler feature.



Tell me, how many occurrences of 'immutable' can you find in your
software? ... how about const?


I generally use inout whenever possible, or const when that is more 
appropriate. But that is for methods.


For data, I generally use immutable when I want a constant.

But like I said, something can't be both shared and unshared. So having 
shared pointers point at unshared data makes no sense -- once it's 
shared, it's shared. So shared really can't be akin to const.



Which is more universally useful? If you had to choose one or the
other, which one could you live without?


I would hate to have a const where you couldn't read the data, I 
probably would rather have immutable.


I said I would stop commenting on this thread, and I didn't keep that 
promise. I really am going to stop now. I'm pretty sure Walter will not 
agree with this mechanism, so until you convince him, I don't really 
need to be spending time on this.


We seem to be completely understanding each other's mechanisms, but not 
agreeing on which one is correct, based on (from both sides) hypothetical 
types and usages.


-Steve


Re: shared - i need it to be useful

2018-10-18 Thread Steven Schveighoffer via Digitalmars-d

On 10/18/18 2:42 PM, Stanislav Blinov wrote:

On Thursday, 18 October 2018 at 18:26:27 UTC, Steven Schveighoffer wrote:

On 10/18/18 1:47 PM, Stanislav Blinov wrote:

On Thursday, 18 October 2018 at 17:17:37 UTC, Atila Neves wrote:

On Monday, 15 October 2018 at 18:46:45 UTC, Manu wrote:
1. shared should behave exactly like const, except in addition to 
inhibiting write access, it also inhibits read access.


How is this significantly different from now?

-
shared int i;
++i;

Error: read-modify-write operations are not allowed for shared 
variables. Use core.atomic.atomicOp!"+="(i, 1) instead.

-

There's not much one can do to modify a shared value as it is.


i = 1;
int x = i;
shared int y = i;


This should be fine, y is not shared when being created.


'y' isn't, but 'i' is. It's fine on amd64, but that's incidental.


OH, I didn't even notice that `i` didn't have a type, so it was a 
continuation of the original example! I read it as declaring y as shared 
and initializing it from a thread-local (which `i` actually isn't).


My bad.

-Steve


Re: shared - i need it to be useful

2018-10-18 Thread Steven Schveighoffer via Digitalmars-d

On 10/18/18 2:55 PM, Manu wrote:

On Thu, Oct 18, 2018 at 7:20 AM Steven Schveighoffer via Digitalmars-d wrote:


On 10/18/18 10:11 AM, Simen Kjærås wrote:

On Thursday, 18 October 2018 at 13:35:22 UTC, Steven Schveighoffer wrote:

struct ThreadSafe
{
private int x;
void increment()
{
   ++x; // I know this is not shared, so no reason to use atomics
}
void increment() shared
{
   atomicIncrement(&x); // use atomics, to avoid races
}
}


But this isn't thread-safe, for the exact reasons described elsewhere in
this thread (and in fact, incorrectly leveled at Manu's proposal).
Someone could write this code:

void foo() {
  ThreadSafe* a = new ThreadSafe();
  shareAllOver(a);


Error: cannot call function shareAllOver(shared(ThreadSafe) *) with type
ThreadSafe *


And here you expect a user to perform an unsafe-cast (which they may
not understand), and we have no language semantics to enforce the
transfer of ownership. How do you assure that the user yields the
thread-local instance?


No, I expect them to do:

auto a = new shared(ThreadSafe)();


I think requiring the cast is un-principled in every way that D values.


No cast is required. If you have shared data, it's shared. If you have 
thread local data, it's unshared. Allocate the data the way you expect 
to use it.


It's only if you intend to turn unshared data into shared data where you 
need an unsafe cast.


It's not even as difficult as immutable, because you can still modify 
shared data. For instance, the shared constructor doesn't have to have 
special rules about initialization, it can just assume shared from the 
beginning.


-Steve


Re: shared - i need it to be useful

2018-10-18 Thread Steven Schveighoffer via Digitalmars-d

On 10/18/18 2:59 PM, Manu wrote:

On Thu, Oct 18, 2018 at 7:20 AM Steven Schveighoffer via Digitalmars-d wrote:


On 10/18/18 10:11 AM, Simen Kjærås wrote:

  a.increment(); // unsafe, non-shared method call
}

When a.increment() is being called, you have no idea if anyone else is
using the shared interface.


I do, because unless you have cast the type to shared, I'm certain there
is only thread-local aliasing to it.


No, you can never be sure. Your assumption depends on the *user*
engaging in an unsafe operation (the cast), and correctly performing a
conventional act; they must correctly and safely transfer ownership.


Not at all. No transfer of ownership is needed, no cast is needed. If 
you want to share something declare it shared.



My proposal puts all requirements on the author, not the user. I think
this is a much more trustworthy relationship, and in terms of
cognitive load, author:users is a 1:many relationship, and I place the
load on the '1', not the 'many'.


Sure, but we can create a system today where smart people make objects 
that do the right thing without compiler help. We don't need to break 
the guarantees of shared to do it.


-Steve


Re: shared - i need it to be useful

2018-10-18 Thread Steven Schveighoffer via Digitalmars-d

On 10/18/18 2:24 PM, Manu wrote:

I understand your argument, and I used to think this too... but I
concluded differently for 1 simple reason: usability.


You have not demonstrated why your proposal is usable, and the proposal 
to simply make shared not accessible while NOT introducing implicit 
conversion is somehow not usable.


I find quite the opposite -- the implicit conversion introduces more 
pitfalls and less guarantees from the compiler.



I have demonstrated these usability considerations in production. I am
confident it's the right balance.


Are these considerations the list below, or are they something else? If 
so, can you list them?



I propose:
  1. Normal people don't write thread-safety, a very small number of
unusual people do this. I feel very good about biasing 100% of the
cognitive load INSIDE the shared method. This means the expert, and
ONLY the expert, must make decisions about thread-safety
implementation.


Thread safety is not easy. But it's also not generic.

In terms of low-level things like atomics and lock-free implementations, 
those ARE generic and SHOULD only be written by experts. But other than 
that, you can't know how someone has designed all the conditions in 
their code.


For example, you can have an expert write mutex locks and semaphores. 
But they can't tell you the proper order to lock different objects to 
ensure there's no deadlock. That's application specific.
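A short sketch of the kind of application-specific hazard meant here; the names are illustrative. Each mutex is used correctly on its own, but only the application can know the global acquisition order:

import core.sync.mutex;

__gshared Mutex lockA, lockB;

shared static this() { lockA = new Mutex; lockB = new Mutex; }

void taskOne()
{
    lockA.lock(); scope(exit) lockA.unlock();
    lockB.lock(); scope(exit) lockB.unlock();  // acquires A, then B
}

void taskTwo()
{
    lockB.lock(); scope(exit) lockB.unlock();
    lockA.lock(); scope(exit) lockA.unlock();  // acquires B, then A -> possible deadlock
}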



  2. Implicit conversion allows users to safely interact with safe
things without doing unsafe casts. I think it's a complete design fail
if you expect any user anywhere to perform an unsafe cast to call a
perfectly thread-safe function. The user might not properly understand
their obligations.


I also do not expect anyone to perform unsafe casts in normal use. I 
expect them to use more generic well-written types in a shared-object 
library. Casting should be very rare.



  3. The practical result of the above is, any complexity relating to
safety is completely owned by the threadsafe author, and not cascaded
to the user. You can't expect users to understand, and make correct
decisions about threadsafety. Safety should be the default position.


I think these are great rules, and none are broken by keeping the 
explicit cast requirement in place.



I recognise the potential loss of an unsafe optimised thread-local path.
1. This truly isn't a big deal. If this is really hurting you, you
will notice on the profiler, and deploy a thread-exclusive path
assuming the context supports it.


This is a mischaracterization. The thread-local path is perfectly safe 
because only one thread can be accessing the data. That's why it's 
thread-local and not shared.



2. I will trade that for confidence in safe interaction every day of
the week. Safety is the right default position here.


You can be confident that any shared data is properly synchronized via 
the API provided. No confidence should be lost here.



2. You just need to make the unsafe thread-exclusive variant explicit, eg:


It is explicit, the thread-exclusive variant is not marked shared, and 
cannot be called on data that is actually shared and needs synchronization.





struct ThreadSafe
{
 private int x;
 void unsafeIncrement() // <- make it explicit
 {
++x; // User has asserted that no sharing is possible, no reason to use atomics
 }
 void increment() shared
 {
atomicIncrement(&x); // object may be shared
 }
}


This is more design by convention.



I think this is quite a reasonable and clearly documented compromise.
I think absolutely-reliably-threadsafe-by-default is the right default
position. And if you want to accept unsafe operations for optimisation
circumstances, then you're welcome to deploy that in your code as you
see fit.


All thread-local operations are thread-safe by default, because there 
can be only one thread using it. That is the beauty of the current 
regime, regardless of how broken shared is -- unshared is solid. We 
shouldn't want to break that guarantee.



If the machinery is not a library for distribution and local to your
application, and you know for certain that your context is such that
thread-local and shared are mutually exclusive, then you're free to
make the unshared overload not-threadsafe; you can do this because you
know your application context.
You just shouldn't make widely distributed tooling this way.


I can make widely distributed tooling that does both shared and unshared 
versions of the code, and ALL are thread safe. No choices are necessary, 
no compromise on performance, and no design by convention.



I will indeed do this myself in some cases, because I know those facts
about my application.
But I wouldn't compromise the default design of shared for this
optimisation potential... deliberately deployed optimisation is okay
to be unsafe when taken in context.



Except it's perfectly thread safe to use data without synchronization in 
one thread

Re: shared - i need it to be useful

2018-10-18 Thread Steven Schveighoffer via Digitalmars-d

On 10/18/18 1:47 PM, Stanislav Blinov wrote:

On Thursday, 18 October 2018 at 17:17:37 UTC, Atila Neves wrote:

On Monday, 15 October 2018 at 18:46:45 UTC, Manu wrote:
1. shared should behave exactly like const, except in addition to 
inhibiting write access, it also inhibits read access.


How is this significantly different from now?

-
shared int i;
++i;

Error: read-modify-write operations are not allowed for shared 
variables. Use core.atomic.atomicOp!"+="(i, 1) instead.

-

There's not much one can do to modify a shared value as it is.


i = 1;
int x = i;
shared int y = i;


This should be fine, y is not shared when being created.

However, this still is allowed, and shouldn't be:

y = 5;

-Steve


Re: shared - i need it to be useful

2018-10-18 Thread Steven Schveighoffer via Digitalmars-d

On 10/18/18 1:17 PM, Atila Neves wrote:

On Monday, 15 October 2018 at 18:46:45 UTC, Manu wrote:
1. shared should behave exactly like const, except in addition to 
inhibiting write access, it also inhibits read access.


How is this significantly different from now?

-
shared int i;
++i;


 i = i + 1; // OK(!)

-Steve


Re: shared - i need it to be useful

2018-10-18 Thread Steven Schveighoffer via Digitalmars-d

On 10/18/18 10:11 AM, Simen Kjærås wrote:

On Thursday, 18 October 2018 at 13:35:22 UTC, Steven Schveighoffer wrote:

struct ThreadSafe
{
   private int x;
   void increment()
   {
  ++x; // I know this is not shared, so no reason to use atomics
   }
   void increment() shared
   {
  atomicIncrement(&x); // use atomics, to avoid races
   }
}


But this isn't thread-safe, for the exact reasons described elsewhere in 
this thread (and in fact, incorrectly leveled at Manu's proposal). 
Someone could write this code:


void foo() {
     ThreadSafe* a = new ThreadSafe();
     shareAllOver(a);


Error: cannot call function shareAllOver(shared(ThreadSafe) *) with type 
ThreadSafe *



     a.increment(); // unsafe, non-shared method call
}

When a.increment() is being called, you have no idea if anyone else is 
using the shared interface.


I do, because unless you have cast the type to shared, I'm certain there 
is only thread-local aliasing to it.


This is one of the issues that MP (Manu's Proposal) tries to deal with. 
Under MP, your code would *not* be considered thread-safe, because the 
non-shared portion may interfere with the shared portion. You'd need to 
write two types:


struct ThreadSafe {
     private int x;
     void increment() shared {
     atomicIncrement(&x);
     }
}

struct NotThreadSafe {
     private int x;
     void increment() {
     ++x;
     }
}

These two are different types with different semantics, and forcing them 
both into the same struct is an abomination.


Why? What if I wanted to have an object that is local for a while, but 
then I want it to be shared (and I ensure carefully when I cast to 
shared that there are no other aliases to that)?


In your case, the user of your type will need to ensure thread-safety. 


No, the contract the type provides is: if you DON'T cast unshared to 
shared or vice versa, the type is thread-safe.


If you DO cast unshared to shared, then the type is thread-safe as long 
as you no longer use the unshared reference.


This is EXACTLY how immutable works.
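The immutable discipline being referenced, as a sketch: build through a mutable reference, freeze with a cast, and never write through the mutable alias again. The argument is that casting thread-local to shared should carry the same obligation.

void main()
{
    int[] tmp = [1, 2, 3];
    tmp[0] = 42;                                  // mutate freely while still private
    immutable(int)[] frozen = cast(immutable) tmp;
    // From here on, writing through 'tmp' would break the immutable guarantee,
    // just as using a thread-local alias after casting it to shared would break
    // the thread-safety guarantee.
    assert(frozen[0] == 42);
}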

You may not have any control over how he's doing things, while you *do* 
control the code in your own type (and module, since that also affects 
things). Under MP, the type is what needs to be thread-safe, and once it 
is, the chance of a user mucking things up is much lower.


Under MP, the type is DEFENSIVELY thread-safe, locking or using atomics 
unnecessarily when it's thread-local.


-Steve


Re: shared - i need it to be useful

2018-10-18 Thread Steven Schveighoffer via Digitalmars-d

On 10/18/18 9:35 AM, Steven Schveighoffer wrote:


struct NotThreadsafe
{
   private int x;
   void local()
   {
     ++x; // <- invalidates the method below, you violate the other
          // function's `shared` promise
   }
   void notThreadsafe() shared
   {
     atomicIncrement(&x);
   }
}



[snip]

But on top of that, if I can't implicitly cast mutable to shared, then 
this ACTUALLY IS thread safe, as long as all the casting in the module 
is sound (easy to search and verify), and hopefully all the casting is 
encapsulated in primitives like you have written. Because someone on the 
outside would have to cast a mutable item into a shared item, and this 
puts the responsibility on them to make sure it works.




Another thing to point out -- I can make x public (not private), and 
it's STILL THREAD SAFE.


-Steve


Re: shared - i need it to be useful

2018-10-18 Thread Steven Schveighoffer via Digitalmars-d

On 10/18/18 2:20 AM, Manu wrote:

On Wed, Oct 17, 2018 at 5:05 AM Timon Gehr via Digitalmars-d wrote:


[... all text ...]


OMFG, I just spent about 3 hours writing a super-detailed reply to all
of Timon's posts in aggregate... I clicked send... and it's gone.
I don't know if this is a gmail thing, a mailing list thing... no
idea... but it's... gone.
I can't repeat that effort :(



If it's gmail, it should be in sent folder, no?

I've never had a gmail message that got sent fail to go into the sent box.

-Steve


Re: shared - i need it to be useful

2018-10-18 Thread Steven Schveighoffer via Digitalmars-d

On 10/17/18 10:26 PM, Manu wrote:

On Wed, Oct 17, 2018 at 6:50 PM Steven Schveighoffer via Digitalmars-d wrote:

The implicit cast means that you have to look at more than just your
method. You have to look at the entire module, and figure out all the
interactions, to see if the thread safe method actually is thread safe.
That's programming by convention, and fully trusting the programmer.


I don't understand... how can the outer context affect the
threadsafety of a properly encapsulated thing?


[snip]


You need to take it for an intellectual spin. Show me how it's corrupt
rather than just presenting discomfort with the idea in theory.
You're addicted to some concepts that you've carried around for a long
time. There is no value in requiring casts, they're just a funky
smell, and force the user to perform potentially unsafe manual
conversions, or interactions that they don't understand.


For example (your example):

struct NotThreadsafe
{
  private int x;
  void local()
  {
++x; // <- invalidates the method below, you violate the other
     // function's `shared` promise
  }
  void notThreadsafe() shared
  {
atomicIncrement(&x);
  }
}

First, note the comment. I can't look ONLY at the implementation of 
"notThreadSafe" (assuming the function name is less of a giveaway) in 
order to guarantee that it's actually thread safe. I have to look at the 
WHOLE MODULE. Anything could potentially do what local() does. I added 
private to x to at least give the appearance of thread safety.


But on top of that, if I can't implicitly cast mutable to shared, then 
this ACTUALLY IS thread safe, as long as all the casting in the module 
is sound (easy to search and verify), and hopefully all the casting is 
encapsulated in primitives like you have written. Because someone on the 
outside would have to cast a mutable item into a shared item, and this 
puts the responsibility on them to make sure it works.


I'm ALL FOR having shared be completely unusable as-is unless you cast 
(thanks for confirming what I suspected in your last post). It's the 
implicit casting which I think makes things way more difficult, and 
completely undercuts the utility of the compiler's mechanical checking.


And on top of that, I WANT that implementation. If I know something is 
not shared, why would I ever want to use atomics on it? I don't like 
needlessly throwing away performance. This is how I would write it:


struct ThreadSafe
{
   private int x;
   void increment()
   {
  ++x; // I know this is not shared, so no reason to use atomics
   }
   void increment() shared
   {
  atomicIncrement(&x); // use atomics, to avoid races
   }
}

The beauty of shared not being implicitly castable, is it allows you to 
focus on the implementation at hand, with the knowledge that nothing 
else can meddle with it. The goal of mechanical checking should be to 
narrow the focus of what needs to be proven correct.


-Steve


Re: shared - i need it to be useful

2018-10-17 Thread Steven Schveighoffer via Digitalmars-d

On 10/17/18 6:37 PM, Manu wrote:

On Wed, Oct 17, 2018 at 12:35 PM Steven Schveighoffer via Digitalmars-d wrote:


On 10/17/18 2:46 PM, Manu wrote:

On Wed, Oct 17, 2018 at 10:30 AM Steven Schveighoffer via Digitalmars-d wrote:


What the example demonstrates is that while you are trying to disallow
implicit casting of a shared pointer to an unshared pointer, you have
inadvertently allowed it by leaving behind an unshared pointer that is
the same thing.


This doesn't make sense... you're showing a thread-local program.
The thread owning the unshared pointer is entitled to the unshared
pointer. It can make as many copies as it likes. They are all
thread-local.


It's assumed that shared int pointer can be passed to another thread,
right? Do I have to write a full program to demonstrate?


And that shared(int)* provides no access. No other thread with that
pointer can do anything with it.


So then it's a misnomer -- it's not really shared, because I can't do 
anything with it.





There's only one owning thread, and you can't violate that without unsafe casts.


Then what is the point of shared? Like why would you share data that
NOBODY CAN USE?


You can call shared methods. They promise threadsafety.
That's a small subset of the program, but that's natural; only a very
small subset of the program is safe to be called from a shared
context.


All I can see is that a shared method promises to be callable on shared 
or unshared data. In essence, it promises nothing.


It's the programmer who must implement the thread safety, and there 
really is no help at all from the compiler for this. At some level, 
there will be either casts, or intrinsics, both of which are unsafe 
without knowing all the context of the object. In any case, it's simply 
a false guarantee of thread safety, which might as well be a convention 
of "any function which starts with TS_ is supposed to be thread safe".


shared in the current form promises one thing and one thing only -- data 
marked as shared is actually sharable between threads, and data not 
marked as shared is actually not shared between threads. This new regime 
you are proposing does nothing extra or new, except break that guarantee.



At SOME POINT, shared data needs to be readable and writable. Any
correct system is going to dictate how that works. It's a good start to
make shared data unusable unless you cast. But then to make it
implicitly castable from unshared defeats the whole purpose.


No. No casting! This is antiquated workflow.. I'm not trying to take
it away from you, but it's not an interesting model for the future.
`shared` can model more than just that.
You can call threadsafe methods. Shared methods explicitly dictate how
the system works, and in a very clear and obvious/intuitive way.

The implicit cast makes using threadsafe objects more convenient when
you only have one, which is extremely common.


The implicit cast means that you have to look at more than just your 
method. You have to look at the entire module, and figure out all the 
interactions, to see if the thread safe method actually is thread safe. 
That's programming by convention, and fully trusting the programmer.


I don't think this thread is going anywhere, so I'll just have to wait 
and see if someone else can explain it better. I'm a firm no on implicit 
casting from mutable to shared.


-Steve


Re: shared - i need it to be useful

2018-10-17 Thread Steven Schveighoffer via Digitalmars-d

On 10/17/18 2:46 PM, Manu wrote:

On Wed, Oct 17, 2018 at 10:30 AM Steven Schveighoffer via Digitalmars-d wrote:


What the example demonstrates is that while you are trying to disallow
implicit casting of a shared pointer to an unshared pointer, you have
inadvertently allowed it by leaving behind an unshared pointer that is
the same thing.


This doesn't make sense... you're showing a thread-local program.
The thread owning the unshared pointer is entitled to the unshared
pointer. It can make as many copies as it likes. They are all
thread-local.


It's assumed that shared int pointer can be passed to another thread, 
right? Do I have to write a full program to demonstrate?



There's only one owning thread, and you can't violate that without unsafe casts.


Then what is the point of shared? Like why would you share data that 
NOBODY CAN USE?


At SOME POINT, shared data needs to be readable and writable. Any 
correct system is going to dictate how that works. It's a good start to 
make shared data unusable unless you cast. But then to make it 
implicitly castable from unshared defeats the whole purpose.



In order for a datum to be
safely shared, it must be accessed with synchronization or atomics by
ALL parties.


** Absolutely **


If you have one party that can simply change it without
those, you will get races.


*** THIS IS NOT WHAT I'M PROPOSING ***

I've explained it a few times now, but people aren't reading what I
actually write, and just assume based on what shared already does that
they know what I'm suggesting.
You need to eject all presumptions from your mind, take the rules I
offer as verbatim, and do thought experiments from there.


What seems to be a mystery here is how one is to actually manipulate 
shared data. If it's not usable as shared data, how does one use it?





That's why shared/unshared is more akin to mutable/immutable than
mutable/const.


Only if you misrepresent my suggestion.


It's not misrepresentation, I'm trying to fill in the holes with the 
only logical possibilities I can think of.





It's true that only one thread will have thread-local access. It's not
valid any more than having one mutable alias to immutable data.


And this is why the immutable analogy is invalid. It's like const.
shared offers restricted access (like const), not a different class of
thing.


No, not at all. Somehow one must manipulate shared data. If shared data 
cannot be read or written, there is no reason to share it.


So LOGICALLY, we have to assume, yes there actually IS a way to 
manipulate shared data through these very carefully constructed and 
guarded things.



There is one thread with thread-local access, and many threads with
shared access.

If a shared (threadsafe) method can be defeated by threadlocal access,
then it's **not threadsafe**, and the program is invalid.

struct NotThreadsafe
{
   int x;
   void local()
   {
  ++x; // <- invalidates the method below, you violate the other
       // function's `shared` promise
   }
   void notThreadsafe() shared
   {
 atomicIncrement(&x);
   }
}


So the above program is invalid. Is it compilable with your added 
allowance of implicit casting to shared? If it's not compilable, why 
not? If it is compilable, how in the hell does your proposal help 
anything? I get the exact behavior today without any changes (except 
today, I need to explicitly cast, which puts the onus on me).




struct Atomic(T)
{
   void opUnary(string op : "++")() shared { atomicIncrement(&val); }
   private T val;
}
struct Threadsafe
{
   Atomic!int x;
   void local()
   {
 ++x;
   }
   void threadsafe() shared
   {
 ++x;
   }
}

Naturally, local() is redundant, and it's perfectly fine for a
thread-local to call threadsafe() via implicit conversion.


In this case, yes. But that's not because of anything the compiler can 
prove.


How does Atomic work? I thought shared data was not usable? I'm being 
pedantic because every time I say "well at some point you must be able 
to modify things", you explode.


Complete the sentence: "In order to read or write shared data, you have 
to ..."
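The two conventional completions of that sentence, sketched with druntime facilities; the code is my illustration, not an answer given in the thread:

import core.atomic;
import core.sync.mutex;

shared int counter;

void withAtomics()
{
    atomicStore(counter, 1);             // "... use atomics"
    int v = atomicLoad(counter);
    atomicOp!"+="(counter, v);
}

shared int guarded;
__gshared Mutex guardLock;
shared static this() { guardLock = new Mutex; }

void withLockAndCast()
{
    guardLock.lock();
    scope(exit) guardLock.unlock();
    auto p = cast(int*) &guarded;        // "... hold the lock and cast shared away"
    *p += 1;
}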




Here's another one, where only a subset of the object is modeled to be
threadsafe (this is particularly interesting to me):

struct Threadsafe
{
   int x;
   Atomic!int y;

   void notThreadsafe()
   {
 ++x;
 ++y;
   }
   void threadsafe() shared
   {
 ++y;
   }
}

In these examples, the thread-local function *does not* undermine the
threadsafety of threadsafe(), it MUST NOT undermine the threadsafety
of threadsafe(), or else threadsafe() **IS NOT THREADSAFE**.
In the second example, you can see how it's possible and useful to do
thread-local work without invalidating the object's threadsafety
commitments.


I've said this a bunch of times, there are 2 rules:
1. shared inhibits read and write access to members
2. `shared` methods must be threadsafe


From there, shared becomes interesting and useful.




Given rule 1, how does Atomic!int actually work, if it can't read or 
write

Re: shared - i need it to be useful

2018-10-17 Thread Steven Schveighoffer via Digitalmars-d

On 10/17/18 12:27 PM, Nicholas Wilson wrote:

On Wednesday, 17 October 2018 at 15:51:04 UTC, Steven Schveighoffer wrote:

On 10/17/18 9:58 AM, Nicholas Wilson wrote:
On Wednesday, 17 October 2018 at 13:25:28 UTC, Steven Schveighoffer 
wrote:
It's identical to the top one. You now have a new unshared reference 
to shared data. This is done WITHOUT any agreed-upon synchronization.


It isn't, you typo'd it (I originally missed it too).

int *p3 = cast(int*)p2;


vs


int *p3 = p;


It wasn't a typo.


The first example assigns p2, the second assigns p (which is thread 
local) _not_ p2 (which is shared), I'm confused.




Here they are again:

int *p;
shared int *p2 = p;
int *p3 = cast(int*)p2;

int *p;
shared int *p2 = p;
int *p3 = p;


I'll put some asserts in that show they accomplish the same thing:

assert(p3 is p2);
assert(p3 is p);
assert(p2 is p);

What the example demonstrates is that while you are trying to disallow 
implicit casting of a shared pointer to an unshared pointer, you have 
inadvertently allowed it by leaving behind an unshared pointer that is 
the same thing.


While we do implicitly allow mutable to cast to const, it's because 
const is a weak guarantee. It's a guarantee that the data may not change 
via *this* reference, but could change via other references.


Shared doesn't have the same characteristics. In order for a datum to be 
safely shared, it must be accessed with synchronization or atomics by 
ALL parties. If you have one party that can simply change it without 
those, you will get races.


That's why shared/unshared is more akin to mutable/immutable than 
mutable/const.


It's true that only one thread will have thread-local access. It's not 
valid any more than having one mutable alias to immutable data.


-Steve


Re: shared - i need it to be useful

2018-10-17 Thread Steven Schveighoffer via Digitalmars-d

On 10/17/18 9:58 AM, Nicholas Wilson wrote:

On Wednesday, 17 October 2018 at 13:25:28 UTC, Steven Schveighoffer wrote:
It's identical to the top one. You now have a new unshared reference 
to shared data. This is done WITHOUT any agreed-upon synchronization.


It isn't, you typo'd it (I originally missed it too).

int *p3 = cast(int*)p2;


vs


int *p3 = p;


It wasn't a typo.

It's identical in that both result in a thread-local pointer equivalent 
to p. Effectively, you can "cast" away shared without having to write a 
cast.


I was trying to demonstrate the ineffectiveness of preventing implicit 
casting from shared to mutable if you allow unshared data to implicitly 
cast to shared.


It's the same problem with mutable and immutable. It's why we can't 
allow the implicit casting. Explicit casting is OK as long as you don't 
later modify the data.


In the same vein, explicit casting of local to shared is OK as long as 
you don't ever treat the data as local again. Which should require a 
cast to say "I know what I'm doing, compiler".
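A sketch of that discipline with std.concurrency; the worker arrangement is my own illustration. Cast once, hand the shared pointer off, and never touch the thread-local alias again.

import std.concurrency;
import core.atomic;

void worker()
{
    auto p = receiveOnly!(shared(int)*)();
    atomicStore(*p, 42);                 // the receiving side synchronizes too
}

void main()
{
    int* local = new int;                // built and initialized thread-locally
    *local = 1;
    shared(int)* published = cast(shared(int)*) local;

    auto tid = spawn(&worker);
    send(tid, published);
    // From this point on, 'local' must never be used again; only 'published',
    // accessed with proper synchronization, is valid.
}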


-Steve


Re: shared - i need it to be useful

2018-10-17 Thread Steven Schveighoffer via Digitalmars-d

On 10/17/18 10:33 AM, Nicholas Wilson wrote:

On Wednesday, 17 October 2018 at 14:26:43 UTC, Timon Gehr wrote:

On 17.10.2018 16:14, Nicholas Wilson wrote:


I was thinking that mutable -> shared const as opposed to mutable -> 
shared would get around the issues that Timon posted.


Unfortunately not. For example, the thread with the mutable reference 
is not obliged to actually make the changes that are performed on that 
reference visible to other threads.


Yes, but that is covered by not being able to read non-atomically from a 
shared reference.


All sides must participate in synchronization for it to make sense. The 
mutable side has no obligation to use atomics. It can use ++data, and 
race conditions will happen.


-Steve


Re: shared - i need it to be useful

2018-10-17 Thread Steven Schveighoffer via Digitalmars-d

On 10/17/18 10:18 AM, Timon Gehr wrote:

On 17.10.2018 15:40, Steven Schveighoffer wrote:

On 10/17/18 8:02 AM, Timon Gehr wrote:
Now, if a class has only shared members, that is another story. In 
this case, all references should implicitly convert to shared. 
There's a DIP I meant to write about this. (For all qualifiers, not 
just shared).


When you say "shared members", you mean all the data is shared too or 
just the methods are shared?


If not the data, D has a problem with encapsulation. Not only all the 
methods on the class must be shared, but ALL code in the entire module 
must be marked as using a shared class instance. Otherwise, other 
functions could modify the private data without using the proper synch 
mechanisms.


We are better off requiring the cast, or enforcing that one must use a 
shared object to begin with.


I think any sometimes-shared object is in any case going to benefit 
from parallel implementations for when the thing is unshared.


-Steve


The specific proposal was that, for example, if a class is defined like 
this:


shared class C{
     // ...
}

then shared(C) and C are implicitly convertible to each other. The 
change is not fully backwards-compatible, because right now, this 
annotation just makes all members (data and methods) shared, but child 
classes may introduce unshared members.


OK, so the proposal is that all data and function members are shared. 
That makes sense.


In one sense, because the class reference is conflated with the type 
modifier, having a C reference that isn't shared while its class data 
actually is shared would be useful.


-Steve



Re: shared - i need it to be useful

2018-10-17 Thread Steven Schveighoffer via Digitalmars-d

On 10/17/18 8:02 AM, Timon Gehr wrote:
Now, if a class has only shared members, that is another story. In this 
case, all references should implicitly convert to shared. There's a DIP 
I meant to write about this. (For all qualifiers, not just shared).


When you say "shared members", you mean all the data is shared too or 
just the methods are shared?


If not the data, D has a problem with encapsulation. Not only all the 
methods on the class must be shared, but ALL code in the entire module 
must be marked as using a shared class instance. Otherwise, other 
functions could modify the private data without using the proper synch 
mechanisms.


We are better off requiring the cast, or enforcing that one must use a 
shared object to begin with.


I think any sometimes-shared object is in any case going to benefit from 
parallel implementations for when the thing is unshared.


-Steve


Re: shared - i need it to be useful

2018-10-17 Thread Steven Schveighoffer via Digitalmars-d

On 10/16/18 8:26 PM, Manu wrote:

On Tue, Oct 16, 2018 at 2:20 PM Steven Schveighoffer via Digitalmars-d wrote:


On 10/16/18 4:26 PM, Manu wrote:

On Tue, Oct 16, 2018 at 11:30 AM Steven Schveighoffer via Digitalmars-d wrote:

int x;

shared int *p = &x; // allow implicit conversion, currently error

passToOtherThread(p);

useHeavily(&x);


What does this mean? It can't do anything... that's the whole point here.
I think I'm struggling here with people bringing presumptions to the
thread. You need to assume the rules I define in the OP for the
experiment to work.


OK, I wrote a whole big response to this, and I went and re-quoted the
above, and now I think I understand what the point of your statement is.

I'll first say that if you don't want to allow implicit casting of
shared to mutable,


It's critical that this is not allowed. It's totally unreasonable to
cast from shared to thread-local without synchronisation.


OK, so even with synchronization in the second thread when you cast, you 
still have a thread-local pointer in the originating thread WITHOUT 
synchronization.



It's as bad as casting away const.


Of course! But shared has a different problem from const. Const allows 
the data to change through another reference, shared cannot allow 
changes without synchronization.


Changes without synchronization are *easy* with an unshared reference. 
Data can't be shared and unshared at the same time.



then you can't allow implicit casting from mutable to
shared. Because it's mutable, races can happen.


I don't follow...


You seem to be saying that shared data is unusable. But why the hell 
have it then? At some point it has to be usable. And the agreed-upon use 
is totally defeated if you also have some stray non-shared reference to it.





There is in fact, no difference between:

int *p;
shared int *p2 = p;
int *p3 = cast(int*)p2;


Totally illegal!! You casted away shared. That's as bad as casting away const.


But if you can't do anything with shared data, how do you use it?




and this:

int *p;
shared int *p2 = p;
int *p3 = p;


There's nothing wrong with this... I don't understand the point?


It's identical to the top one. You now have a new unshared reference to 
shared data. This is done WITHOUT any agreed-upon synchronization.



So really, the effort to prevent the reverse cast is defeated by
allowing the implicit cast.


Only the caller has the thread-local instance. You can take a
thread-local pointer to a thread-local within the context of a single
thread.
So, it's perfectly valid for `p` and `p3` to exist in a single scope.
`p2` is fine here too... and if that shared pointer were to escape to
another thread, it wouldn't be a threat, because it's not readable or
writable, and you can't make it back into a thread-local pointer
without carefully/deliberately deployed machinery.


Huh? If shared data can never be used, why have it?

Pretend that p is not a pointer to an int, but a pointer to an UNSHARED 
type that has shared methods on it and unshared methods (for when you 
don't need any sync).


Now the shared methods will obey the sync, but the unshared ones won't. 
The result is races. I can't understand how you don't see that.



There is a reason we disallow assigning from mutable to immutable
without a cast. Yet, it is done in many cases, because you are sometimes
building an immutable object with mutable pieces, and want to cast the
final result.


I don't think analogy to immutable has a place in this discussion, or
at least, I don't understand the relevance...
I think the reasonable analogy is const.


No, immutable is more akin to shared because immutable and mutable are 
completely different. const can point at mutable or immutable data. 
shared can't be both shared and unshared. There's no comparison. Data is 
either shared or not shared, there is no middle ground. There is no 
equivalent of const to say "this data could be shared, or could be 
unshared".



In this case, it's ON YOU to make sure it's correct, and the traditional
mechanism for the compiler giving you the responsibility is to require a
cast.


I think what you're talking about are behaviours relating to casting
shared *away*, and that's some next-level shit. Handling in that case
is no different to the way it exists today. You must guarantee that
the pointer you possess becomes thread-local before casting it to a
thread-local pointer.
In my application framework, I will never cast shared away under my
proposed design. We don't have any such global locks.


OK, so how does shared data actually operate? Somewhere, the magic has 
to turn into real code. If not casting away shared, what do you suggest?



-

OK, so here is where I think I misunderstood your point. When you said a
lock-free queue would be unusable if it wasn't shared, I thought you 
meant it would be unusable if we didn't allow the implicit cast.

Re: shared - i need it to be useful

2018-10-17 Thread Steven Schveighoffer via Digitalmars-d

On 10/16/18 6:24 PM, Nicholas Wilson wrote:

On Tuesday, 16 October 2018 at 21:19:26 UTC, Steven Schveighoffer wrote:
OK, so here is where I think I misunderstood your point. When you said 
a lock-free queue would be unusable if it wasn't shared, I thought you 
meant it would be unusable if we didn't allow the implicit cast. But I 
realize now, you meant you should be able to use a lock-free queue 
without it being actually shared anywhere.


What I say to this is that it doesn't need to be usable. I don't care 
to use a lock-free queue in a thread-local capacity. I'll just use a 
normal queue, which is easy to implement, and doesn't have to worry 
about race conditions or using atomics. A lock free queue is a special 
thing, very difficult to get right, and only really necessary if you 
are going to share it. And used for performance reasons!


I think this comes up where the queue was originally shared, you 
acquired a lock on the thing it is a member of, and you want to continue 
using it through your exclusive reference.




Isn't that a locking queue? I thought we were talking lock-free?

-Steve


Re: shared - i need it to be useful

2018-10-16 Thread Steven Schveighoffer via Digitalmars-d

On 10/16/18 4:26 PM, Manu wrote:

On Tue, Oct 16, 2018 at 11:30 AM Steven Schveighoffer via Digitalmars-d wrote:


On 10/16/18 2:10 PM, Manu wrote:

On Tue, Oct 16, 2018 at 6:35 AM Steven Schveighoffer via Digitalmars-d wrote:


On 10/16/18 9:25 AM, Steven Schveighoffer wrote:

On 10/15/18 2:46 PM, Manu wrote:



  From there, it opens up another critical opportunity; T* -> shared(T)*

promotion.
Const would be useless without T* -> const(T)* promotion. Shared
suffers a similar problem.
If you write a lock-free queue for instance, and all the methods are
`shared` (ie, threadsafe), then under the current rules, you can't
interact with the object when it's not shared, and that's fairly
useless.



Oh, I didn't see this part. Completely agree with Timon on this, no
implicit conversions should be allowed.


Why?


int x;

shared int *p = &x; // allow implicit conversion, currently error

passToOtherThread(p);

useHeavily(&x);


What does this mean? It can't do anything... that's the whole point here.
I think I'm struggling here with people bringing presumptions to the
thread. You need to assume the rules I define in the OP for the
experiment to work.


OK, I wrote a whole big response to this, and I went and re-quoted the 
above, and now I think I understand what the point of your statement is.


I'll first say that if you don't want to allow implicit casting of 
shared to mutable, then you can't allow implicit casting from mutable to 
shared. Because it's mutable, races can happen.


There is in fact, no difference between:

int *p;
shared int *p2 = p;
int *p3 = cast(int*)p2;

and this:

int *p;
shared int *p2 = p;
int *p3 = p;

So really, the effort to prevent the reverse cast is defeated by 
allowing the implicit cast.


There is a reason we disallow assigning from mutable to immutable 
without a cast. Yet, it is done in many cases, because you are sometimes 
building an immutable object with mutable pieces, and want to cast the 
final result.


In this case, it's ON YOU to make sure it's correct, and the traditional 
mechanism for the compiler giving you the responsibility is to require a 
cast.


-

OK, so here is where I think I misunderstood your point. When you said a 
lock-free queue would be unusable if it wasn't shared, I thought you 
meant it would be unusable if we didn't allow the implicit cast. But I 
realize now, you meant you should be able to use a lock-free queue 
without it being actually shared anywhere.


What I say to this is that it doesn't need to be usable. I don't care to 
use a lock-free queue in a thread-local capacity. I'll just use a normal 
queue, which is easy to implement, and doesn't have to worry about race 
conditions or using atomics. A lock free queue is a special thing, very 
difficult to get right, and only really necessary if you are going to 
share it. And used for performance reasons!


Why would I want to incur performance penalties when using a lock-free 
queue in an unshared mode? I would actually expect 2 separate 
implementations of the primitives, one for shared one for unshared.


What about primitives that would be implemented the same? In that case, 
the shared method becomes:


auto method() { return (cast(Queue*)&this).method; }

Is this "unusable"? Without a way to say "you can call this on shared or 
unshared instances", we need to do it this way.
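Spelled out slightly, with a hypothetical Queue standing in for the container; the forwarding is only sound when the caller genuinely has exclusive access, which is exactly the onus being argued about here.

struct Queue
{
    private int[] items;

    void push(int v)             // thread-local implementation
    {
        items ~= v;
    }

    void push(int v) shared      // shared overload forwards by casting 'this' back
    {
        (cast(Queue*) &this).push(v);
    }
}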


But I would trust the queue to handle this properly depending on whether 
it was typed shared or not.


-Steve


Re: shared - i need it to be useful

2018-10-16 Thread Steven Schveighoffer via Digitalmars-d

On 10/16/18 2:10 PM, Manu wrote:

On Tue, Oct 16, 2018 at 6:35 AM Steven Schveighoffer via Digitalmars-d wrote:


On 10/16/18 9:25 AM, Steven Schveighoffer wrote:

On 10/15/18 2:46 PM, Manu wrote:



 From there, it opens up another critical opportunity; T* -> shared(T)*

promotion.
Const would be useless without T* -> const(T)* promotion. Shared
suffers a similar problem.
If you write a lock-free queue for instance, and all the methods are
`shared` (ie, threadsafe), then under the current rules, you can't
interact with the object when it's not shared, and that's fairly
useless.



Oh, I didn't see this part. Completely agree with Timon on this, no
implicit conversions should be allowed.


Why?


int x;

shared int *p = &x; // allow implicit conversion, currently error

passToOtherThread(p);

useHeavily(&x);

How is this safe? Thread1 is using x without locking, while the other 
thread has to lock. In order for synchronization to work, both sides 
have to agree on a synchronization technique and abide by it.



If you want to have a lock-free implementation of something, you can
abstract the assignments and reads behind the proper mechanisms anyway,
and still avoid locking (casting is not locking).


Sorry, I don't understand what you're saying. Can you clarify?



I'd still mark a lock-free implementation shared, and all its methods 
shared. shared does not mean you have to lock, just cast away shared. A 
lock-free container still has to do some special things to make sure it 
avoids races, and having an "unusable" state aids in enforcing this.


-Steve


Re: shared - i need it to be useful

2018-10-16 Thread Steven Schveighoffer via Digitalmars-d

On 10/16/18 9:25 AM, Steven Schveighoffer wrote:

On 10/15/18 2:46 PM, Manu wrote:



From there, it opens up another critical opportunity; T* -> shared(T)*

promotion.
Const would be useless without T* -> const(T)* promotion. Shared
suffers a similar problem.
If you write a lock-free queue for instance, and all the methods are
`shared` (ie, threadsafe), then under the current rules, you can't
interact with the object when it's not shared, and that's fairly
useless.



Oh, I didn't see this part. Completely agree with Timon on this, no 
implicit conversions should be allowed.


If you want to have a lock-free implementation of something, you can 
abstract the assignments and reads behind the proper mechanisms anyway, 
and still avoid locking (casting is not locking).


-Steve


Re: shared - i need it to be useful

2018-10-16 Thread Steven Schveighoffer via Digitalmars-d

On 10/15/18 2:46 PM, Manu wrote:

Okay, so I've been thinking on this for a while... I think I have a
pretty good feel for how shared is meant to be.

1. shared should behave exactly like const, except in addition to
inhibiting write access, it also inhibits read access.

I think this is the foundation for a useful definition for shared, and
it's REALLY easy to understand and explain.

Current situation where you can arbitrarily access shared members
undermines any value it has. Shared must assure you don't access
members unsafely, and the only way to do that with respect to data
members, is to inhibit access completely.
I think shared is just const without read access.

Assuming this world... how do you use shared?

1. traditional; assert that the object become thread-local by
acquiring a lock, cast shared away
2. object may have shared methods; such methods CAN be called on
shared instances. such methods may internally implement
synchronisation to perform their function. perhaps methods of a
lock-free queue structure for instance, or operator overloads on
`Atomic!int`, etc.

In practise, there is no functional change in usage from the current
implementation, except we disallow unsafe accesses (which will make
the thing useful).


From there, it opens up another critical opportunity; T* -> shared(T)*

promotion.
Const would be useless without T* -> const(T)* promotion. Shared
suffers a similar problem.
If you write a lock-free queue for instance, and all the methods are
`shared` (ie, threadsafe), then under the current rules, you can't
interact with the object when it's not shared, and that's fairly
useless.

Assuming the rules above: "can't read or write to members", and the
understanding that `shared` methods are expected to have threadsafe
implementations (because that's the whole point), what are the risks
from allowing T* -> shared(T)* conversion?

All the risks that I think have been identified previously assume that
you can arbitrarily modify the data. That's insanity... assume we fix
that... I think the promotion actually becomes safe now...?

Destroy...



This is a step in the right direction. But there is still one problem -- 
shared is inherently transitive.


So casting away shared is super-dangerous, even if you lock the shared 
data, because any of the subreferences will become unshared and 
read/writable.


For instance:

struct S
{
   int x;
   int *y;
}

shared int z;

auto s1 = shared(S)(1, &z);

auto s2 = shared(S)(2, &z);

S* s1locked = s1.lock;

Now I have access to z via s1locked as an unshared int, and I never 
locked z. Potentially one could do the same thing via s2, and now there 
are 2 mutable references, potentially in 2 threads.
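To make the hazard concrete, here is one way the hypothetical `lock` used above could be written for the `S` defined earlier in this message; the helper itself is my illustration. The cast strips shared transitively, so the returned S* exposes `y` as a plain int* even though `z` was never locked:

import core.sync.mutex;

__gshared Mutex sMutex;
shared static this() { sMutex = new Mutex; }

S* lock(ref shared S s)
{
    sMutex.lock();               // guards the S instance, but knows nothing about *s.y
    return cast(S*) &s;          // strips shared from S and, transitively, from s.y
}

void unlock(ref shared S s)
{
    sMutex.unlock();
}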


All of this, of course, is manual. So technically we could manually 
implement it properly inside S. But this means shared doesn't help us much.


We really need on top of shared, a way to specify something is 
tail-shared. That is, all the data in S is unshared, but anything it 
points to is still shared. That at least helps the person implementing 
the manual locking from doing stupid things himself.
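There is no such qualifier today, but the intent can be approximated by hand: declare the indirections explicitly shared, so that casting or locking the head does not strip the qualifier from what it points to. A rough sketch, my own and not a proposal from the thread:

import core.atomic;

shared int z;

struct TailSharedS
{
    int x;               // head data: owned by whoever holds the lock
    shared(int)* y;      // the pointee keeps its shared qualifier regardless
}

void useAfterLocking(ref TailSharedS s)
{
    s.x = 5;                     // fine: the head is thread-local here
    atomicStore(*s.y, 5);        // the pointee is still shared, so synchronize
}

void main()
{
    auto s = TailSharedS(1, &z);
    useAfterLocking(s);
}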


-Steve


Re: fork vs. posix_spawn (vfork)

2018-10-15 Thread Steven Schveighoffer via Digitalmars-d

On 10/14/18 7:36 AM, notna wrote:

Hi D gurus.

I read an interesting post from GitLab [1] about how they improved 
performance by 30x just by going to Go 1.9... because Go went 
from "fork" to "posix_spawn"...


I've searched the GitHub DLANG org for "posix_spawn" and didn't find a 
hit... so I'm asking myself and you: is DLANG still on "fork", and could 
there be some performance improvement potential?


[1] 
https://about.gitlab.com/2018/01/23/how-a-fix-in-go-19-sped-up-our-gitaly-service-by-30x/ 





Related:

https://issues.dlang.org/show_bug.cgi?id=14770

-Steve


Re: D Logic bug

2018-10-12 Thread Steven Schveighoffer via Digitalmars-d

On 10/12/18 6:06 AM, Kagamin wrote:

On Thursday, 11 October 2018 at 23:17:15 UTC, Jonathan Marler wrote:
I had a look at the table again, looks like the ternary operator is on 
there, just called the "conditional operator". And to clarify, D's 
operator precedence is close to C/C++ but doesn't match exactly.  This 
is likely a result of the grammar differences rather than an intentional 
one.  For example, the "Conditional operator" in D actually has a 
higher priority than an assignment, but in C++ it's the same and is 
evaluated right-to-left.  So this expression would be different in C++ 
and D:


a ? b : c = d

In D it would be:

(a ? b : c ) = d

And in C++ would be:

a ? b : (c = d)


That's https://issues.dlang.org/show_bug.cgi?id=14186


Wow, interesting that C precedence is different from C++ here.

-Steve


Re: D Logic bug

2018-10-11 Thread Steven Schveighoffer via Digitalmars-d

On 10/11/18 9:16 PM, Jonathan Marler wrote:

On Thursday, 11 October 2018 at 23:29:05 UTC, Steven Schveighoffer wrote:

On 10/11/18 7:17 PM, Jonathan Marler wrote:

I had a look at the table again, looks like the ternary operator is 
on there, just called the "conditional operator". And to clarify, D's 
operator precedence is close to C/C++ but doesn't match exactly.  
This is likely a result of the grammar differences rather than an 
intention one.  For example, the "Conditional operator" in D actually 
has a higher priority than an assignment, but in C++ it's the same 
and is evaluated right-to-left.  So this expression would be 
different in C++ and D:




Not in my C/D code. It would have copious parentheses everywhere :)



Good :)


Yep. General rule of thumb for me after having been burned many many 
times -- Always use parentheses to define order of operations when 
dealing with bitwise operations (and, or, xor) and for the ternary operator.


I think I do make an exception when it's a simple assignment. i.e.:

a = cond ? 1 : 2;



That case is actually very strange, I don't know if it's something 
that's really common.




Yes, that explains why myself, Jonathan Davis and certainly others 
didn't know there were actually differences between C++ and D Operator 
precedence :)  I wasn't sure myself but having a quick look at each's 
operator precedence table made it easy to find an expression that 
behaves differently in both.




I actually was curious whether DMC followed the rules (hey, maybe Walter 
just copied his existing code!), but it does follow C's rules.


-Steve


Re: D Logic bug

2018-10-11 Thread Steven Schveighoffer via Digitalmars-d

On 10/11/18 7:17 PM, Jonathan Marler wrote:

I had a look at the table again, looks like the ternary operator is on 
there, just called the "conditional operator".  And to clarify, D's 
operator precedence is close to C/C++ but doesn't match exactly.  This 
is likely a result of the grammar differences rather than an intentional 
one.  For example, the "Conditional operator" in D actually has a higher 
priority than an assignment, but in C++ it's the same and is evaluated 
right-to-left.  So this expression would be different in C++ and D:




Not in my C/D code. It would have copious parentheses everywhere :)

That case is actually very strange, I don't know if it's something 
that's really common.


-Steve


Re: LDC2 1.9.0 beta 1 bug

2018-10-05 Thread Steven Schveighoffer via Digitalmars-d

On 10/5/18 5:41 AM, Kagamin wrote:

On Thursday, 4 October 2018 at 12:51:27 UTC, Shachar Shemesh wrote:
More to the point, however, expanding the call to the second form 
means that I can *never* supply non-default values to arg1 and arg2.


You wrote it yourself: f!()(true, 'S')


This is a terrible workaround. It looks OK with no vararg parameters, 
but lousy if you have any.


i.e.:

f(arg1, arg2, arg3, true, 'S')

becomes:

f!(typeof(arg1), typeof(arg2), typeof(arg3))(arg1, arg2, arg3, true, 'S');

-Steve


Re: LDC2 1.9.0 beta 1 bug

2018-10-04 Thread Steven Schveighoffer via Digitalmars-d

On 10/4/18 8:51 AM, Shachar Shemesh wrote:
I got this as a report from a user, not directly running this, which is 
why I'm not opening a bug report.


Consider the following function:
void f(ARGS...)(ARGS args, bool arg1 = true, char arg2 = 'H');

Now consider the following call to it:
   f(true, 'S');

Theoretically, this can either be calling f!()(true, 'S') or f!(bool, 
char)(true, 'S', true, 'H');


Under 1.8.0, it would do the former. Under 1.9.0-beta1, the latter.

Why is this a bug?
Two reasons. First, this is a change of behavior.

More to the point, however, expanding the call to the second form means 
that I can *never* supply non-default values to arg1 and arg2.


You are correct that it's a change in behavior. Johan brought this up 
earlier when the release happened [1], and I agree with both you and him 
that the behavior change requires at least a deprecation cycle.


But it doesn't seem to be getting traction with the people who have made 
the decision in the first place (and Walter simply said to post a bug 
report, which has happened).


I will point out a couple things:

1. Yes, you can supply non-default values to arg1 and arg2, you just 
can't use IFTI (implicit function template instantiation). I can't begin 
to describe how useless this workaround is.
2. The problem with the original behavior is that you couldn't *actually 
use* the default parameters. In other words, this doesn't compile:


f();

So technically, it was simply an error to provide default parameters in 
a template variadic (the explicit instantiation workaround was allowed, 
but again, useless).


My argument in the bug report is that the whole reason it was added (to 
allow file and line numbers to be runtime parameters in exception 
constructors) is more correctly fixed by fixing another issue, 
https://issues.dlang.org/show_bug.cgi?id=18919, and the expected way 
that default parameters behave should be implemented instead. Both the 
old way and the new way have large inconsistency problems.


-Steve

[1] https://forum.dlang.org/post/myuhmpfygyufxpucv...@forum.dlang.org


Re: `shared`...

2018-10-01 Thread Steven Schveighoffer via Digitalmars-d

On 10/1/18 7:09 PM, Manu wrote:

On Mon, Oct 1, 2018 at 8:55 AM Timon Gehr via Digitalmars-d
 wrote:


On 01.10.2018 04:29, Manu wrote:

struct Bob
{
void setThing() shared;
}

As I understand, `shared` attribution intends to guarantee that I handle
synchronisation internally.
This method is declared shared, so if I have shared instances, I can
call it... because it must handle thread-safety internally.

void f(ref shared Bob a, ref Bob b)
{
a.setThing(); // I have a shared object, can call shared method

b.setThing(); // ERROR
}

This is the bit of the design that doesn't make sense to me...
The method is shared, which suggests that it must handle
thread-safety. My instance `b` is NOT shared, that is, it is
thread-local.
So, I know that there's not a bunch of threads banging on this
object... but the shared method should still work! A method that
handles thread-safety doesn't suddenly not work when it's only
accessed from a single thread.
...


shared on a method does not mean "this function handles thread-safety".
It means "the `this` pointer of this function is not guaranteed to be
thread-local". You can't implicitly create an alias of a reference that
is supposed to be thread-local such that the resulting reference can be
freely shared among threads.


I don't understand. That's the point of `scope`... is that it won't
escape the reference. 'freely shared' is the antithesis of `scope`.


I feel like I don't understand the design...
mutable -> shared should work the same as mutable -> const... because
surely that's safe?


No. The main point of shared (and the main thing you need to understand)
is that it guarantees that if something is _not_ `shared`, it is not
shared among threads. Your analogy is not correct, going from
thread-local to shared is like going from mutable to immutable.


We're talking about `mutable` -> `shared scope`. That's like going
from mutable to const.
`shared scope` doesn't say "I can share this", what it says is "this
may be shared, but *I won't share it*", and that's the key.
By passing a thread-local as `shared scope`, the receiver accepts that
the argument _may_ be shared (it's not in this case), but it will not
become shared in the call. That's the point of scope, no?


If the suggested typing rule was implemented, we would have the
following way to break the type system, allowing arbitrary aliasing
between mutable and shared references, completely defeating `shared`:

class C{ /*...*/ }

shared(C) sharedGlobal;
struct Bob{
  C unshared;
  void setThing() shared{
      sharedGlobal = unshared;
  }
}

void main(){
  C c = new C(); // unshared!
  Bob(c).setThing();
  shared(C) d = sharedGlobal; // shared!
  assert(c !is d); // would fail (currently does not even compile)
  // sendToOtherThread(d);
  // c.someMethod(); // (potential) race condition on unshared data
}


Your entire example depends on escaping references. I think you missed
the point?



The problem with mutable wildcards is that you can assign them.

This exposes the problem in your design. The reason const works is 
because you can't mutate it. Shared is not the same.


simple example:

void foo(scope shared int *a, scope shared int *b)
{
   a = b;
}

If I can bind a to a local mutable int pointer, and b as a pointer to 
global shared int, the assignment is now considered OK (types and scopes 
are the same), but now my local points at a shared int without the 
shared adornments.


The common wildcard you need between shared and mutable is *unique*. 
That is, even though it's typed as shared or unshared, the compiler has 
guaranteed there is no other reference to that data. In that case, you 
can move data from one place to another without compromising the system 
(as you assign from one unique pointer to another, the original must 
have to be nullified, otherwise the wildcard still would not work, and 
the unique property would cease to be accurate).


IMO, the correct way to deal with shared would be to make it 100% 
unusable. Not readable, or writable. And then you have to cast away 
shared to make it work (and hopefully performing the correct locking to 
make sure your changes are defined). I don't think there's a magic 
bullet that can fix this.
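
For what it's worth, a minimal sketch of the cast-away-shared-under-a-lock 
idiom I mean (the names are made up, and this assumes core.sync.mutex):

import core.sync.mutex : Mutex;

shared int counter;
__gshared Mutex counterLock;

shared static this() { counterLock = new Mutex; }

void bump()
{
    counterLock.lock();
    scope(exit) counterLock.unlock();
    // cast away shared *only* while the lock is held
    auto p = cast(int*)&counter;
    ++*p;
}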


-Steve


Re: `shared`...

2018-10-01 Thread Steven Schveighoffer via Digitalmars-d

On 10/1/18 7:56 PM, Steven Schveighoffer wrote:

On 10/1/18 7:09 PM, Manu wrote:

Your entire example depends on escaping references. I think you missed
the point?



The problem with mutable wildcards is that you can assign them.

This exposes the problem in your design. The reason const works is 
because you can't mutate it. Shared is not the same.


simple example:

void foo(scope shared int *a, scope shared int *b)
{
    a = b;
}


Haha, of course, this has no effect!

In order for it to show the problem, a has to be ref'd.
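
i.e. a hedged sketch of the ref'd version, under the proposed implicit 
conversion:

void foo(ref scope shared(int)* a, scope shared(int)* b)
{
    a = b; // escapes back to the caller's pointer
}

// int* local; shared int g;
// foo(local, &g); // if T* -> shared(T)* also bound by ref, `local` would
//                 // now point at shared data with no shared adornment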

-Steve


Re: Updating D beyond Unicode 2.0

2018-09-26 Thread Steven Schveighoffer via Digitalmars-d

On 9/26/18 4:43 PM, Walter Bright wrote:
But expanding it seems of vanishingly little value. Note that each thing 
that gets added to D adds weight to it, and it needs to pull its weight. 
Nothing is free.


It may be the weight is already there in the form of unicode symbol 
support, just the range of the characters supported isn't good enough 
for some languages. It might be like replacing your refrigerator -- you 
get an upgrade, but it's not going to take up any more space because you 
get rid of the old one. I would like to see the PR before passing 
judgment on the heft of the change.


The value is simply in the consistency -- when some of the words for 
your language can be valid symbols but others can't, then it becomes a 
weird guessing game as to what is supported. It would be like saying all 
identifiers can have any letters except `q`. Sure, you can get around 
that, but it's weirdly exclusive.


I claim complete ignorance as to what is required, it hasn't been 
technically laid out what is at stake, and I'm not bilingual anyway. It 
could be true that I'm completely misunderstanding the positions of others.


-Steve


Re: BetterC and CTFE mismatch

2018-09-26 Thread Steven Schveighoffer via Digitalmars-d

On 9/26/18 5:08 AM, Sebastiaan Koppe wrote:

On Wednesday, 26 September 2018 at 08:22:26 UTC, Simen Kjærås wrote:
This is essentially an arbitrary restriction. The basic reason is if a 
function is compiled (even just for CTFE), it ends up in the object 
files, and you've asked for only betterC functions to end up in the 
object files.


--
  Simen


So anything I do at CTFE has to be betterC as well? That is a bummer.


This is an artificial, and not really intended, limitation. Essentially, 
anything run during CTFE has to be a real function. If it's defined, it's 
expected to be callable at runtime as well as during CTFE.


But I can't see why it should matter if you never call it at runtime. I 
think this has to do with the places where betterC is enforced in the 
compiler.
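
A hedged illustration of the limitation as I understand it (assumed 
behavior of the current implementation, not anything the spec promises):

// compile with -betterC
int sumSquares(int n)
{
    int[] squares;          // appending needs the GC at run time
    foreach (i; 0 .. n)
        squares ~= i * i;
    int total = 0;
    foreach (s; squares)
        total += s;
    return total;
}

enum total = sumSquares(10); // only ever evaluated during CTFE...
// ...but sumSquares is still compiled and emitted as a normal function,
// so -betterC rejects the GC-using code in its body anyway.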




I'll try to workaround this, but I would like to see this fixed. Is 
there anything I can do to move this forward?


I'd suggest a bug report if one hasn't been made.

-Steve


Re: Updating D beyond Unicode 2.0

2018-09-26 Thread Steven Schveighoffer via Digitalmars-d

On 9/26/18 5:54 AM, rjframe wrote:

On Fri, 21 Sep 2018 16:27:46 +, Neia Neutuladh wrote:


I've got this coded up and can submit a PR, but I thought I'd get
feedback here first.

Does anyone see any horrible potential problems here?

Or is there an interestingly better option?

Does this need a DIP?


I just want to point out since this thread is still living that there have
been very few answers to the actual question ("should I submit my PR?").

Walter did answer the question, with the reasons that Unicode identifier
support is not useful/helpful and could cause issues with tooling. Which
is likely correct; and if we really want to follow this logic, Unicode
identifier support should be removed from D entirely.


This is a non-starter. We can't break people's code, especially for 
trivial reasons like 'you shouldn't code that way because others don't 
like it'. I'm pretty sure Walter would be against removing Unicode 
support for identifiers.




I don't recall seeing anyone in favor providing technical reasons, save
the OP.


There doesn't necessarily need to be a technical reason. In fact, there 
really isn't one -- people can get by with using ASCII identifiers just 
fine (and many/most people do). Supporting Unicode would be purely for 
social or inclusive reasons (it may make D more approachable to 
non-English speaking schoolchildren for instance).


As an only-English speaking person, it doesn't bother me either way to 
have Unicode identifiers. But the fact that we *already* support Unicode 
identifiers leads me to expect that we support *all* Unicode 
identifiers. It doesn't make a whole lot of sense to only support some 
of them.




Especially since the work is done, it makes sense to me to ask for the PR
for review. Worst case scenario, it sits there until we need it.


I suggested this as well.

https://forum.dlang.org/post/poaq1q$its$1...@digitalmars.com

I think it stands a good chance of getting incorporated, just for the 
simple fact that it's enabling and not disruptive.


-Steve


Re: Updating D beyond Unicode 2.0

2018-09-26 Thread Steven Schveighoffer via Digitalmars-d

On 9/26/18 2:50 AM, Shachar Shemesh wrote:

On 25/09/18 15:35, Dukc wrote:
Another reason is that something may not have a good translation to 
English. If there is an enum type listing city names, it is IMO better 
to write them as normal, using Unicode. CityName.seinäjoki, not 
CityName.seinaejoki.


This sounded like a very compelling example, until I gave it a second 
thought. I now fail to see how this example translates to a real-life 
scenario.


City names (data, changes over time) as enums (compile time set) seem 
like a horrible idea.


That may sound like a very technical objection to an otherwise valid 
point, but it really think that's not the case. The properties that 
cause city names to be poor candidates for enum values are the same as 
those that make them Unicode candidates.


Hm... I could actually see some "clever" use of opDispatch being used to 
define cities or other such names.
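
Something like this hedged sketch, say (CityName is made up):

struct CityName
{
    string value;

    static CityName opDispatch(string name)()
    {
        return CityName(name);
    }
}

void main()
{
    auto c = CityName.seinäjoki; // any identifier becomes a "city", no enum needed
    assert(c.value == "seinäjoki");
}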


In any case, I think the biggest pro for supporting Unicode symbol names 
is -- we already support Unicode symbol names. It doesn't make a whole 
lot of sense to only support some of them.


-Steve


Re: Forums intermittently going down?

2018-09-25 Thread Steven Schveighoffer via Digitalmars-d

On 9/25/18 5:05 PM, H. S. Teoh wrote:

On Tue, Sep 25, 2018 at 08:41:51PM +, Vladimir Panteleev via Digitalmars-d 
wrote:

On Tuesday, 25 September 2018 at 18:26:58 UTC, CharlesM wrote:

Yeah it happened again today. I heard this site was made in D, maybe
is because the GC?


No, just old server hardware and database fragmentation.


Wow, that's GC-phobia like I've never seen before!


Well, I thought it might be GC related also. It behaves similarly to how 
you would expect a GC pause to behave (several fast responses, then one 
that takes 5 seconds to come back).


But lately, I've noticed I just get the "down for maintenance" message 
more than a delayed response.


In any case, I generally don't use the forum except read-only mode on my 
phone. For posting, I'm generally using NNTP.


I'll note that when I started running into DB slowdowns on a system (not 
related to D), adding one index fixed the issue. Sometimes linear 
searches are fast enough to hide in plain sight :)


-Steve


Re: Updating D beyond Unicode 2.0

2018-09-24 Thread Steven Schveighoffer via Digitalmars-d

On 9/24/18 3:18 PM, Patrick Schluter wrote:

On Monday, 24 September 2018 at 13:26:14 UTC, Steven Schveighoffer wrote:
2. There are no rules about what *encoding* is acceptable, it's 
implementation defined. So various compilers have different rules as 
to what will be accepted in the actual source code. In fact, I read 
somewhere that not even ASCII is guaranteed to be supported.


Indeed. IBM mainframes have C compilers too but not ASCII. They code in 
EBCDIC. That's why for instance it's not portable to do things like


  if(c >= 'A' && c <= 'Z') printf("CAPITAL LETTER\n");

because the assumption that the letters are contiguous is not true in EBCDIC.


Right. But it's just a side-note -- I'd guess all modern compilers 
support ASCII, and definitely ones that we would want to interoperate with.


Besides, that example is more concerned about *input data* encoding, not 
*source code* encoding. If the above is written in ASCII, then I would 
assume that the bytes in the source file are the ASCII bytes, and 
probably the IBM compilers would not know what to do with such files (it 
would all be gibberish if you opened on an EBCDIC editor). You'd first 
have to translate it to EBCDIC, which is a red flag that likely this 
isn't going to work :)


-Steve


Re: Updating D beyond Unicode 2.0

2018-09-24 Thread Steven Schveighoffer via Digitalmars-d

On 9/24/18 2:20 PM, Martin Tschierschke wrote:

On Monday, 24 September 2018 at 14:34:21 UTC, Steven Schveighoffer wrote:

On 9/24/18 10:14 AM, Adam D. Ruppe wrote:
On Monday, 24 September 2018 at 13:26:14 UTC, Steven Schveighoffer 
wrote:
Part of the reason, which I haven't read here yet, is that all the 
keywords are in English.


Eh, those are kinda opaque sequences anyway, since the meanings 
aren't quite what the normal dictionary definition is anyway. Look up 
"int" in the dictionary... or "void", or even "string". They are just 
a handful of magic sequences we learn with the programming language. 
(And in languages like Rust, "fn", lol.)


Well, even on top of that, the standard library is full of English 
words that read very coherently when used together (if you understand 
English).


I can't imagine a long chain of English algorithms with some Chinese 
one pasted in the middle looks very good :) I suppose you could alias 
them all...



You might get really funny error messages.

🙂 can't be casted to int.


Haha, it could be cynical as well

int can’t be casted to int🤔

Oh, the games we could play.

-Steve


Re: Updating D beyond Unicode 2.0

2018-09-24 Thread Steven Schveighoffer via Digitalmars-d

On 9/24/18 10:14 AM, Adam D. Ruppe wrote:

On Monday, 24 September 2018 at 13:26:14 UTC, Steven Schveighoffer wrote:
Part of the reason, which I haven't read here yet, is that all the 
keywords are in English.


Eh, those are kinda opaque sequences anyway, since the meanings aren't 
quite what the normal dictionary definition is anyway. Look up "int" in 
the dictionary... or "void", or even "string". They are just a handful 
of magic sequences we learn with the programming language. (And in 
languages like Rust, "fn", lol.)


Well, even on top of that, the standard library is full of English words 
that read very coherently when used together (if you understand English).


I can't imagine a long chain of English algorithms with some Chinese one 
pasted in the middle looks very good :) I suppose you could alias them 
all...


-Steve


Re: Updating D beyond Unicode 2.0

2018-09-24 Thread Steven Schveighoffer via Digitalmars-d

On 9/22/18 12:56 PM, Neia Neutuladh wrote:

On Saturday, 22 September 2018 at 12:35:27 UTC, Steven Schveighoffer wrote:
But aren't we arguing about the wrong thing here? D already accepts 
non-ASCII identifiers.


Walter was doing that thing that people in the US who only speak English 
tend to do: forgetting that other people speak other languages, and that 
people who speak English can learn other languages to work with people 
who don't speak English.


I don't think he was doing that. I think what he was saying was, D tried 
to accommodate users who don't normally speak English, and they still 
use English (for the most part) for coding.


I'm actually surprised there isn't much code out there that is written 
with other identifiers besides ASCII, given that C99 supported them. I 
assumed it was because they weren't supported. Now I learn that they are 
supported, yet almost all C code I've ever seen is written in English. 
Perhaps that's just because I don't frequent foreign language sites 
though :) But many people here speak English as a second language, and 
vouch for their cultures still using English to write code.


He was saying it's inevitably a mistake to use 
non-ASCII characters in identifiers and that nobody does use them in 
practice.


I would expect people probably do try to use them in practice, it's just 
that the problems they run into aren't worth the effort 
(tool/environment support). But I have no first or even second hand 
experience with this. It does seem like Walter has a lot of experience 
with it though.


Walter talking like that sounds like he'd like to remove support for 
non-ASCII identifiers from the language. I've gotten by without 
maintaining a set of personal patches on top of DMD so far, and I'd like 
it if I didn't have to start.


I don't think he was saying that. I think he was against expanding 
support for further Unicode identifiers because the first effort did 
not produce any measurable benefit. I'd be shocked from the recent 
positions of Walter and Andrei if they decided to remove non-ASCII 
identifiers that are currently supported, thereby breaking any existing 
code.


What languages need an upgrade to unicode symbol names? In other 
words, what symbols aren't possible with the current support?


Chinese and Japanese have gained about eleven thousand symbols since 
Unicode 2.


Unicode 2 covers 25 writing systems, while Unicode 11 covers 146. Just 
updating to Unicode 3 would give us Cherokee, Ge'ez (multiple 
languages), Khmer (Cambodian), Mongolian, Burmese, Sinhala (Sri Lanka), 
Thaana (Maldivian), Canadian aboriginal syllabics, and Yi (Nuosu).


Very interesting! I would agree that we should at least add support for 
unicode symbols that are used in spoken languages, especially if we 
already have support for symbols that aren't ASCII already. I don't see 
the downside, especially if you can already use Unicode 2.0 symbols for 
identifiers (the ship has already sailed).


It could be a good incentive to get kids in countries where English 
isn't commonly spoken to try D out as a first programming language ;) 
Using your native language to show example code could be a huge benefit 
for teaching coding.


My recommendation is to put the PR up for review (that you said you had 
ready) and see what happens. Having an actual patch to talk about could 
change minds. At the very least, it's worth not wasting your efforts 
that you have already spent. Even if it does need a DIP, the PR can show 
that one less piece of effort is needed to get it implemented.


-Steve


Re: Updating D beyond Unicode 2.0

2018-09-24 Thread Steven Schveighoffer via Digitalmars-d

On 9/24/18 12:23 AM, Neia Neutuladh wrote:

On Monday, 24 September 2018 at 01:39:43 UTC, Walter Bright wrote:

On 9/23/2018 3:23 PM, Neia Neutuladh wrote:
Okay, that's why you previously selected C99 as the standard for what 
characters to allow. Do you want to update to match C11? It's been 
out for the better part of a decade, after all.


I wasn't aware it changed in C11.


http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf page 522 (PDF 
numbering) or 504 (internal numbering).


Outside the BMP, almost everything is allowed, including many things 
that are not currently mapped to any Unicode value. Within the BMP, a 
heck of a lot of stuff is allowed, including a lot that D doesn't 
currently allow.


GCC hasn't even updated to the C99 standard here, as far as I can tell, 
but clang-5.0 is up to date.


I searched around for the current state of symbol names in C, and found 
some really crappy rules, though maybe this site isn't up to date?:


https://en.cppreference.com/w/c/language/identifier

What I understand from that is:

1. Yes, you can use any unicode character you want in C/C++ (seemingly 
since C99)
2. There are no rules about what *encoding* is acceptable, it's 
implementation defined. So various compilers have different rules as to 
what will be accepted in the actual source code. In fact, I read 
somewhere that not even ASCII is guaranteed to be supported.


The result being, that you have to write the identifiers with an ASCII 
escape sequence in order for it to be actually portable. Which to me, 
completely defeats the purpose of using such identifiers in the first place.


For example, on that page, they have a line that works in clang, not in 
GCC (tagged as implementation defined):


char *🐱 = "cat";

The portable version looks like this:

char *\U0001f431 = "cat";

Seriously, who wants to use that?

Now, D can potentially do better (especially when all front-ends are the 
same) and support such things in the spec, but I think the argument 
"because C supports it" is kind of bunk.


Or am I reading it wrong?

In any case, I would expect that symbol name support should be focused 
only on languages which people use, not emojis. If there are words in 
Chinese or Japanese that can't be expressed using D, while other words 
can, it would seem inconsistent to a Chinese or Japanese speaking user, 
and I think we should work to fix that. I just have no idea what the 
state of that is.


I also tend to agree that most code is going to be written in English, 
even when the primary language of the user is not. Part of the reason, 
which I haven't read here yet, is that all the keywords are in English. 
Someone has to kind of understand those to get the meaning of some 
constructs, and it's going to read strangely with the non-english words.


One group which I believe hasn't spoken up yet is the group making the 
hunt framework, whom I believe are all Chinese? At least their web site 
is. It would be good to hear from a group like that which has large 
experience writing mature D code (it appears all to be in English) and 
how they feel about the support.


-Steve


Re: Updating D beyond Unicode 2.0

2018-09-24 Thread Steven Schveighoffer via Digitalmars-d

On 9/22/18 8:58 AM, Jonathan M Davis wrote:

On Saturday, September 22, 2018 6:37:09 AM MDT Steven Schveighoffer via
Digitalmars-d wrote:

On 9/22/18 4:52 AM, Jonathan M Davis wrote:

I was laughing out loud when reading about composing "family"
emojis with zero-width joiners. If you told me that was a tech
parody, I'd have believed it.


Honestly, I was horrified to find out that emojis were even in Unicode.
It makes no sense whatsoever. Emojis are supposed to be sequences of
characters that can be interpreted as images. Treating them like
Unicode symbols is like treating entire words like Unicode symbols.
It's just plain stupid and a clear sign that Unicode has gone
completely off the rails (if it was ever on them). Unfortunately, it's
the best tool that we have for the job.

But aren't some (many?) Chinese/Japanese characters representing whole
words?


It's true that they're not characters in the sense that Roman characters are
characters, but they're still part of the alphabets for those languages.
Emojis are specifically formed from sequences of characters - e.g. :) is two
characters which are already expressible on their own. They're meant to
represent a smiley face, but it's a sequence of characters already. There's
no need whatsoever to represent anything extra in Unicode. It's already enough
of a disaster that there are multiple ways to represent the same character
in Unicode without nonsense like emojis. It's stuff like this that really
makes me wish that we could come up with a new standard that would replace
Unicode, but that's likely a pipe dream at this point.


But there are tons of emojis that have nothing to do with sequences of 
characters. Like houses, or planes, or whatever. I don't even know what 
the sequences of characters are for them.


I think it started out like that, but turned into something else.

Either way, I can't imagine any benefit from using emojis in symbol names.

-Steve


Re: Updating D beyond Unicode 2.0

2018-09-22 Thread Steven Schveighoffer via Digitalmars-d

On 9/22/18 4:52 AM, Jonathan M Davis wrote:

I was laughing out loud when reading about composing "family"
emojis with zero-width joiners. If you told me that was a tech
parody, I'd have believed it.


Honestly, I was horrified to find out that emojis were even in Unicode. It
makes no sense whatsoever. Emojis are supposed to be sequences of characters
that can be interpreted as images. Treating them like Unicode symbols is
like treating entire words like Unicode symbols. It's just plain stupid and
a clear sign that Unicode has gone completely off the rails (if it was ever
on them). Unfortunately, it's the best tool that we have for the job.


But aren't some (many?) Chinese/Japanese characters representing whole 
words?


-Steve


Re: Updating D beyond Unicode 2.0

2018-09-22 Thread Steven Schveighoffer via Digitalmars-d

On 9/21/18 9:08 PM, Neia Neutuladh wrote:

On Friday, 21 September 2018 at 20:25:54 UTC, Walter Bright wrote:
But identifiers? I haven't seen hardly any use of non-ascii 
identifiers in C, C++, or D. In fact, I've seen zero use of it outside 
of test cases. I don't see much point in expanding the support of it. 
If people use such identifiers, the result would most likely be 
annoyance rather than illumination when people who don't know that 
language have to work on the code.


you *do* know that not every codebase has people working on it who 
only know English, right?


If I took a software development job in China, I'd need to learn 
Chinese. I'd expect the codebase to be in Chinese. Because a Chinese 
company generally operates in Chinese, and they're likely to have a lot 
of employees who only speak Chinese.


And no, you can't just transcribe Chinese into ASCII.

Same for Spanish, Norwegian, German, Polish, Russian -- heck, it's 
almost easier to list out the languages you *don't* need non-ASCII 
characters for.


Anyway, here's some more D code using non-ASCII identifiers, in case you 
need examples: https://git.ikeran.org/dhasenan/muzikilo


But aren't we arguing about the wrong thing here? D already accepts 
non-ASCII identifiers. What languages need an upgrade to unicode symbol 
names? In other words, what symbols aren't possible with the current 
support?


Or maybe I'm misunderstanding something.

-Steve


Re: Jai compiles 80,000 lines of code in under a second

2018-09-21 Thread Steven Schveighoffer via Digitalmars-d

On 9/21/18 10:19 AM, Nicholas Wilson wrote:
On Friday, 21 September 2018 at 09:21:34 UTC, Petar Kirov [ZombineDev] 
wrote:
I have been watching Jonathan Blow's Jai for a while myself. There are 
many interesting ideas there, and many of them are what made me like D 
so much in the first place. It's very important to note that the speed 
claims he has been making are all a matter of developer discipline. 
You can have an infinite loop executed at compile-time in both D and Jai.


You're going to OOM pretty fast in D if you try :)


I can see the marketing now, "D finds infinite loops in compile-time 
code way faster than Jai!".


-Steve


Re: Truly @nogc Exceptions?

2018-09-20 Thread Steven Schveighoffer via Digitalmars-d

On 9/20/18 1:58 PM, Adam D. Ruppe wrote:

On Thursday, 20 September 2018 at 17:14:12 UTC, Steven Schveighoffer wrote:
I don't know how a performance problem can occur on an error being 
thrown anyway -- the process is about to end.


Walter's objection was code size - it would throw stuff out of cache 
lines, even if it doesn't need to actually run.


So like this line:

int[] a;
a = a[1 .. $];

With no bounds checking is just

inc a.ptr;

but with bounds checking it becomes something more like

mov ecx, a.length
cmp ecx, 1 // in other words, if length >= offset
jae proceed
push line
push file
call d_arraybounds // throws the error
proceed:
inc a.ptr


Now, what my patch did was just, right before push line, it inserted 
"push length; push offset;". I believe this to be trivial since they are 
already loaded in registers or immediates and thus just a couple bytes 
for those instructions, but Walter (as I recall, this was a while ago 
and I didn't look up his exact words when writing this) said even a 
couple bytes are important for such a common operation as it throws off 
the L1 caches. I never got around to actually measuring the performance 
impact to prove one way or another.


But... even if that is a problem, dmd -O will usually rearrange that to 
avoid the jump on the in-bounds case, and I'm sure ldc/gdc do too, 
so the extra pushes' instruction bytes are off the main execution 
path anyway and thus shouldn't waste cache space.



idk though. regardless, to me, the extra info is *well* worth the cost 
anyway.


Sounds like a case of premature optimization at best.

Besides, if it is a performance issue, you aren't doing bounds checks on 
every slice/index anyway. I know in iopipe, to squeeze out every bit of 
performance, I avoid bounds checks when I know from previous asserts the 
bounds are correct.
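
For example, a hedged sketch of the kind of elision I mean (not the actual 
iopipe code):

void consume(ubyte[] buf, size_t n)
{
    assert(n <= buf.length);       // validated once up front...
    auto window = buf.ptr[0 .. n]; // ...then slice via .ptr: no bounds check emitted
    // ... use window ...
}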


-Steve


Re: Truly @nogc Exceptions?

2018-09-20 Thread Steven Schveighoffer via Digitalmars-d

On 9/20/18 12:24 PM, Adam D. Ruppe wrote:

On Thursday, 20 September 2018 at 15:52:03 UTC, Steven Schveighoffer wrote:

I needed to know what the slice parameters that were failing were.


Aye. Note that RangeError is called by the compiler though, so you gotta 
patch dmd to make it pass the arguments to it for index. Ugh. I did a PR 
for this once but it got shot down because of an alleged (without 
evidence btw) performance degradation. Ugh.


Well, you can always override that. Just do the check yourself and throw 
the error you want ;)
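
e.g. a hedged sketch of "do the check yourself" (SliceError is a made-up 
type, not anything in druntime):

class SliceError : Error
{
    size_t lo, hi, len;
    this(size_t lo, size_t hi, size_t len,
            string file = __FILE__, size_t line = __LINE__)
    {
        super("slice out of range", file, line);
        this.lo = lo; this.hi = hi; this.len = len;
    }
}

T[] checkedSlice(T)(T[] arr, size_t lo, size_t hi,
        string file = __FILE__, size_t line = __LINE__)
{
    if (lo > hi || hi > arr.length)
        throw new SliceError(lo, hi, arr.length, file, line); // carries the values
    return arr.ptr[lo .. hi]; // already validated, skip the built-in check
}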


In my case, that's what I did anyway.

I don't know how a performance problem can occur on an error being 
thrown anyway -- the process is about to end.


-Steve


Re: Truly @nogc Exceptions?

2018-09-20 Thread Steven Schveighoffer via Digitalmars-d

On 9/20/18 11:33 AM, Adam D. Ruppe wrote:
On Wednesday, 19 September 2018 at 21:16:00 UTC, Steven Schveighoffer 
wrote:

As Andrei says -- Destroy!


Nah, I agree. Actually, I'm of the opinion that string error messages in 
exceptions ought to be considered harmful: you shouldn't be doing 
strings at all. All the useful information should be in the type - the 
class name and the members with details.


Well, defining a new class can sometimes be a mild hassle... but for 
really common ones, we really should just do it, and other ones can be 
done as templated classes or templated factory functions that define a 
new class right there and then.


http://arsdnet.net/dcode/exception.d

That's the proof-of-concept I wrote for this years ago, go to the bottom 
of the file for the usage example. It uses a reflection mixin to make 
writing the new classes easy, and I even wrote an enforce thing that can 
add more info by creating a subclass that stores arguments to functions 
so it can print it all (assuming they are sane to copy like strings or 
value things lol)


 enforce!fopen("nofile.txt".ptr, "rb".ptr);

MyExceptionBase@exception.d(38): fopen call failed
     filename = nofile.txt
     mode = rb




Awesome! This is just what I was thinking of. In fact, I did something 
similar locally since I needed to know what the slice parameters that 
were failing were. I still had to trick the @nogc to get around the "new 
Exception" piece.


The printMembers thing is nice. I think for druntime/phobos, however, we 
should have a base that just calls a virtual function with the idea that 
the message is printed, and then a further-derived type could do the 
printMembers thing if that's what you want.


-Steve


Re: Truly @nogc Exceptions?

2018-09-20 Thread Steven Schveighoffer via Digitalmars-d

On 9/20/18 11:06 AM, H. S. Teoh wrote:

On Thu, Sep 20, 2018 at 08:48:13AM -0400, Steven Schveighoffer via 
Digitalmars-d wrote:
[...]

But this means you still have to build msg when throwing the
error/exception. It's not needed until you print it, and there's no
reason anyway to make it allocate, even with RAII. For some reason D
forces msg to be built, but it doesn't e.g. build the entire stack
trace string beforehand, or build the string that shows the exception
class name or the file/line beforehand.

[...]

IIRC, originally the stacktrace was also built at exception construction
time. But it was causing a major performance hit, so eventually someone
changed the code to construct it lazily (i.e., only when the catcher
actually tries to look it up).

I think it makes sense to also make .msg lazy, if the exception object
is already carrying enough info to build the message when the catcher
asks for it. And if the catcher doesn't ask for it, we saved an extra GC
allocation (which is a plus even if we're not trying to go @nogc).


Except we DON'T construct the stack trace string, even lazily. If you 
look at the code I posted, it's output directly to the output buffer 
(via the sink delegate), without ever having allocated.


I think we can do that for the message too (why not, it's all 
supported). But either one (using GC at print time, or lazily outputting 
to buffer at print time) solves the biggest problem -- being able to 
construct an exception without the GC.


Plus, this nudges developers of exceptions to store more useful data. If 
you catch an exception that has details in it, possibly it is only going 
to be in the string, which you now have to *parse* to get out what the 
problem was. If instead it was standard practice just to store the 
details, and then construct the string later, more useful information 
would be available in the form of fields/accessors.
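
i.e. a hedged sketch of "store the details, build the string later" 
(FileOpenException is made up, and ENOENT/createAndRetry are shown only 
for flavor):

class FileOpenException : Exception
{
    string path;
    int errnoCode;

    this(string path, int errnoCode)
    {
        super("failed to open file"); // generic text, no formatting needed here
        this.path = path;
        this.errnoCode = errnoCode;
    }
}

// a catcher can branch on the fields without parsing any string:
//   catch (FileOpenException e)
//   {
//       if (e.errnoCode == ENOENT) createAndRetry(e.path);
//   }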


Think about this -- every ErrnoException that is thrown allocates its 
message via the GC on construction. Even if you catch that and just look 
at the errno code. Even with dip1008:


https://github.com/dlang/phobos/blob/6a15dfbe18f9151379f6337f53a3c41d12dee939/std/exception.d#L1625

-Steve


Re: Truly @nogc Exceptions?

2018-09-20 Thread Steven Schveighoffer via Digitalmars-d

On 9/20/18 6:48 AM, Atila Neves wrote:
On Wednesday, 19 September 2018 at 21:16:00 UTC, Steven Schveighoffer 
wrote:
Given dip1008, we now can throw exceptions inside @nogc code! This is 
really cool, and helps make code that uses exceptions or errors @nogc. 
Except...


The mechanism to report what actually went wrong for an exception is a 
string passed to the exception during *construction*. Given that you 
likely want to make such an exception inside a @nogc function, you are 
limited to passing a compile-time-generated string (either a literal 
or one generated via CTFE).




I expressed my concern for DIP1008 and the `msg` field when it was first 
announced. I think the fix is easy and a one line change to dmd. I also 
expressed this on that thread but was apparently ignored. What's the 
fix? Have the compiler insert a call to the exception's destructor at 
the end of the `catch(scope Exception)` block. That's it. The `msg` 
field is just a slice, point it to RAII managed memory and you're good 
to go.


Give me deterministic destruction of exceptions caught by scope when 
using dip1008 and I'll give you @nogc exception throwing immediately. 
I've even already written the code!


I thought it already did that? How is the exception destroyed when 
dip1008 is enabled?


But this means you still have to build msg when throwing the 
error/exception. It's not needed until you print it, and there's no 
reason anyway to make it allocate, even with RAII. For some reason D 
forces msg to be built, but it doesn't e.g. build the entire stack trace 
string beforehand, or build the string that shows the exception class 
name or the file/line beforehand.


-Steve


Re: Truly @nogc Exceptions?

2018-09-19 Thread Steven Schveighoffer via Digitalmars-d

On 9/19/18 7:53 PM, Seb wrote:
On Wednesday, 19 September 2018 at 21:28:56 UTC, Steven Schveighoffer 
wrote:

On 9/19/18 5:16 PM, Steven Schveighoffer wrote:
One further thing: I didn't make the sink version of message @nogc, 
but in actuality, it could be.


We recently introduced support for output ranges in the formatting of 
Phobos:


https://dlang.org/changelog/2.079.0.html#toString

Output ranges have the advantage that they could be @nogc and because of 
the templatization also @safe.


I don't think that will work here, as Throwable is a class.

All I can think of is that you would have 2 versions, a @nogc one that 
takes a @nogc delegate, and one that is not.
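
Something like this hedged sketch (not an actual druntime API):

class MyException : Exception
{
    this() { super("something failed"); }

    // for @nogc callers, with a @nogc sink
    void message(scope void delegate(in char[]) @nogc sink) const @nogc
    {
        sink("something failed, no allocation needed");
    }

    // for everyone else
    void message(scope void delegate(in char[]) sink) const
    {
        sink("something failed, no allocation needed");
    }
}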


Of course, your exception type could define something completely 
separate, and you can deal with it in your own project as needed.


If there was a way to say "this is @nogc if you give it a @nogc 
delegate, and not if you don't", that would be useful. The compiler 
could verify it at compile-time.


-Steve


Re: Truly @nogc Exceptions?

2018-09-19 Thread Steven Schveighoffer via Digitalmars-d

On 9/19/18 5:16 PM, Steven Schveighoffer wrote:
One further thing: I didn't make the sink version of message @nogc, but 
in actuality, it could be. Notice how it allocates using the stack. Even 
if we needed some indeterminate amount of memory, it would be simple to 
use C malloc/free, or alloca. But traditionally, we don't put any 
attributes on these base functions. Would it make sense in this case?


Aand, no we can't. Because the sink could actually allocate.
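
e.g. a caller is perfectly entitled to hand in a sink that appends to a GC 
array (hedged sketch, assuming the sink overload of message() proposed in 
the original post):

void collect(Throwable ex)
{
    char[] buf;
    ex.message((in char[] chunk) { buf ~= chunk; }); // the sink itself allocates
}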

-Steve


Truly @nogc Exceptions?

2018-09-19 Thread Steven Schveighoffer via Digitalmars-d
Given dip1008, we now can throw exceptions inside @nogc code! This is 
really cool, and helps make code that uses exceptions or errors @nogc. 
Except...


The mechanism to report what actually went wrong for an exception is a 
string passed to the exception during *construction*. Given that you 
likely want to make such an exception inside a @nogc function, you are 
limited to passing a compile-time-generated string (either a literal or 
one generated via CTFE).


To demonstrate what I mean, let me give you an example member function 
inside a type containing 2 fields, x and y:


void foo(int[] arr)
{
   auto s = arr[x .. y];
}

There are 2 ways this can throw a range error:

a) x > y
b) y > arr.length

But which is it? And what are x and y, or even the array length?

The error message we get is basic (module name and line number aren't 
important here):


   core.exception.RangeError@testerror.d(6): Range violation

Not good enough -- we have all the information present to give a more 
detailed message. Why not:


   Attempted slice with wrong ordered parameters, 5 .. 4

or

   Slice parameter 6 is greater than length 5

All that information is available, yet we don't see anything like that.

Let's look at the base of all exception and error types to see why we 
don't have such a thing. The part which prints this message is the 
member function toString inside Throwable, repeated here for your 
reading pleasure [1]:


void toString(scope void delegate(in char[]) sink) const
{
    import core.internal.string : unsignedToTempString;

    char[20] tmpBuff = void;

    sink(typeid(this).name);
    sink("@"); sink(file);
    sink("("); sink(unsignedToTempString(line, tmpBuff, 10)); sink(")");

    if (msg.length)
    {
        sink(": "); sink(msg);
    }
    if (info)
    {
        try
        {
            sink("\n");
            foreach (t; info)
            {
                sink("\n"); sink(t);
            }
        }
        catch (Throwable)
        {
            // ignore more errors
        }
    }
}

(Side Note: there is an overload for toString which takes no delegate 
and returns a string. But since this overload is present, doing e.g. 
writeln(myEx) will use it)


Note how this *doesn't* allocate anything.

But hang on, what about the part that actually prints the message:

    sink(typeid(this).name);
    sink("@"); sink(file);
    sink("("); sink(unsignedToTempString(line, tmpBuff, 10)); sink(")");

    if (msg.length)
    {
        sink(": "); sink(msg);
    }

Hm... Note how the file name, and the line number are all *members* of 
the exception, and there was no need to allocate a special string to 
contain the message we saw. So it *is* possible to have a custom message 
without allocation. It's just that the only interface for details is via 
the `msg` string member field -- which is only set on construction.


We can do better.

I noticed that there is a @__future member function inside Throwable 
called message. This function returns the message that the Throwable is 
supposed to display (defaulting to return msg). I believe this was 
inserted at Sociomantic's request, because they need to be able to have 
a custom message rendered at *print* time, not *construction* time [2]. 
This makes sense -- why do we need to allocate some string that will 
never be printed (in the case where an exception is caught and handled)? 
This helps alleviate the problem a bit, as we could construct our 
message at print-time when the @nogc requirement is no longer present.


But we can do even better.

What if we added ALSO a function:

void message(scope void delegate(in char[]) sink)

In essence, this does *exactly* what the const(char)[] returning form of 
message does, but it doesn't require any allocation, nor storage of the 
data to print inside the exception. We can print numbers (and other 
things) and combine them together with strings just like the toString 
function does.


We can then replace the code for printing the message inside toString 
with this:


   bool printedColon = false;
   void subSink(in char[] data)
   {
       if (!printedColon && data.length > 0)
       {
           sink(": ");
           printedColon = true;
       }
       sink(data);
   }
   message(&subSink);

In this case, we then have a MUCH better mechanism to implement our 
desired output from the slice error:


class RangeSliceError : Throwable
{
    size_t lower;
    size_t upper;
    size_t len;

    ...

    override void message(scope void delegate(in char[]) sink)
    {
        import core.internal.string : unsignedToTempString;

        char[20] tmpBuff = void;

        if (lower > upper)
        {
            sink("Attempted slice with wrong ordered parameters ");
            sink(unsignedToTempString(lower, tmpBuff, 10));
 

Re: Small @nogc experience report

2018-09-19 Thread Steven Schveighoffer via Digitalmars-d

On 9/19/18 1:13 PM, Shachar Shemesh wrote:

There is a catch, though. Writing Mecca with @nogc required 
re-implementing quite a bit of druntime. Mecca uses its own exception 
allocations (mkEx, just saw it's not yet documented, it's under 
mecca.lib.exception). The same module also has "enforceNGC". We also 
have our own asserts. This is partially to support our internal logging 
facility, that needs a static list of formats, but it also solves a very 
important problem with D's @nogc:


void func() @nogc {
   assert(condition, string); // string is useless without actual info 
about what went wrong.
   assert(condition, format(string, arg, arg)); // No good - format is 
not @nogc

   ASSERT!"format"(condition, arg, arg); // @nogc and convenient
}

So, yes, we do use @nogc, but it took a *lot* of work to do it.


I'm running into this coincidentally right now, when trying to debug a 
PR. I found I'm getting a range error deep inside a phobos function. But 
because Phobos is trying to be pure @nogc nothrow @safe, I can do almost 
nothing to display what is wrong.


What I ended up doing is making an extern(C) hook that had the "right" 
attributes, even though it's not @nogc (let's face it, you are about to 
crash anyway).
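
i.e. a hedged sketch of the kind of hook I mean (the names are made up):

// In the attribute-constrained module: just a prototype, claiming whatever
// attributes the surrounding code demands. They aren't mangled into the
// extern(C) name, so the linker doesn't care.
extern(C) void debugDump(size_t lower, size_t upper) nothrow @nogc @safe;

// In some other, unconstrained module: the actual implementation.
extern(C) void debugDump(size_t lower, size_t upper)
{
    import std.stdio : writeln;
    try writeln("slice was ", lower, " .. ", upper); catch (Exception) {}
}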


But it got me thinking, what a useless interface to display errors we 
have! Inside Throwable, there is the function toString(someDelegate 
sink) which prints out the exception trace.


Near the front there is this:

if (msg.length)
{
sink(": "); sink(msg);
}

My, wouldn't it be nice to be able to override this! And forget about 
the whole msg BS. When an exception trace is printed, there are almost 
no restrictions as to what can be done. We should delay the generation 
of the message until then as well! Not to mention that if we can output 
things piecemeal through the sink, we don't even have to allocate at all.


I'm going to write up a more detailed post on this, but it's annoying to 
throw exceptions without any information EXCEPT what can be converted 
into a string at runtime at the time of exception. All that is missing 
is this hook to generate the message.


-Steve


Re: extern(C++, ns) is wrong

2018-09-19 Thread Steven Schveighoffer via Digitalmars-d

On 9/18/18 9:49 PM, Jonathan M Davis wrote:

On Tuesday, September 18, 2018 6:22:55 PM MDT Manu via Digitalmars-d wrote:

https://github.com/dlang/dmd/pull/8667

O_O

Thank you Walter for coming to the party!


Oh, wow. I sure wasn't expecting that. I thought that he'd made it pretty
clear that a DIP was needed, and even then, it didn't seem likely that it
would be accepted. This is awesome. I guess that he finally came around.


I think a big part is that the implementation was done. I think there's 
a big difference between "I don't really love this, but crap, I'll have 
to implement it all" and "I don't really love this, but the 
implementation isn't too intrusive, and all I have to do is click the 
merge button, OK." I sure wish I had the skills to hack dmd, there are 
so many ideas I'd like to see implemented in the language :)


Anyways, great to see this merged!

-Steve


Re: extern(C++, ns) is wrong

2018-09-16 Thread Steven Schveighoffer via Digitalmars-d

On 9/14/18 6:41 PM, Neia Neutuladh wrote:


Specifically, Walter wants this to compile:

module whatever;
extern(C++, foo) void doStuff();
extern(C++, bar) void doStuff();

And he's not too concerned that you might have to use doubly fully 
qualified names to refer to C++ symbols, like:


import core.stdcpp.sstream;
import core.stdcpp.vector;
core.stdcpp.vector.std.vector v;


This is probably the best explanation of why the current situation sucks.

-Steve


Re: int/longRe: DIP 1015--removal of integer & character literal conversion to bool--Final Review

2018-09-16 Thread Steven Schveighoffer via Digitalmars-d

On 9/15/18 8:36 PM, Nicholas Wilson wrote:

Without it, I get (possibly quite a lot of) deprecation warnings and I 
have to insert a cast to the corresponding type, e.g. 
f(cast(int)E.a)/g(cast(long)(a - b)), to verify the behaviour under the 
new system and silence the deprecation warning (absolutely necessary if 
using `-de`). Then I have to delete them after stage 2, but what if I 
want to support older compilers? Well then I have to wait until they are 
sufficiently old enough.




As precedent, we do have -transition=intpromote, which disables the 
requirement for casting smaller integers to int first.


So the way I would expect someone to migrate their project:

1. Examine all deprecations, looking for ones where I actually *WANT* 
the bool version to be called. Insert cast there.

2. Enable the -transition=nobooldemote or whatever we call it.
3. Once the deprecation period is over, remove the -transition switch.

If I wanted a version to be compilable with older versions of dmd, then 
I would have to cast.


Hm... another option is to have a switch that identifies "useless" casts once 
the deprecation period is over. Or add it into dfix.


-Steve


Re: DIP 1015--removal of integer & character literal conversion to bool--Final Review

2018-09-16 Thread Steven Schveighoffer via Digitalmars-d

On 9/15/18 6:29 PM, Mike Franklin wrote:

On Saturday, 15 September 2018 at 20:07:06 UTC, Steven Schveighoffer wrote:


Looks pretty good to me. The only question I have is on this part:

enum YesNo : bool { no, yes } // Existing implementation: OK
                              // After stage 1: Deprecation warning
                              // After stage 2: Error
                              // Remedy: `enum YesNo : bool { no = false, yes = true }`


Why is this necessary? I can't see how there are integer literals 
being used here, or how implicitly going from `false` to `true` in the 
2 items being enumerated is going to be confusing.


You're right, I just tested the implementation, and this is not 
necessary.  I'll remove it.  Thanks!


Then I have no objections, looks like a nice positive change to me!

-Steve


Re: DIP 1015--removal of integer & character literal conversion to bool--Final Review

2018-09-15 Thread Steven Schveighoffer via Digitalmars-d

On 9/14/18 6:41 AM, Mike Parker wrote:
DIP 1015, "Deprecation and removal of implicit conversion from integer 
and character literals to bool", is now ready for Final Review. This is 
a last chance for community feedback before the DIP is handed off to 
Walter and Andrei for the Formal Assessment. Please read the procedures 
document for details on what is expected in this review stage:


https://github.com/dlang/DIPs/blob/master/PROCEDURE.md#final-review

The current revision of the DIP for this review is located here:

https://github.com/dlang/DIPs/blob/299f81c2352fae4c7fa097de71308d773dcd9d01/DIPs/DIP1015.md 



In it you'll find a link to and summary of the previous review round. 
This round of review will continue until 11:59 pm ET on September 28 
unless I call it off before then.


Thanks in advance for your participation.


Looks pretty good to me. The only question I have is on this part:

enum YesNo : bool { no, yes } // Existing implementation: OK
                              // After stage 1: Deprecation warning
                              // After stage 2: Error
                              // Remedy: `enum YesNo : bool { no = false, yes = true }`


Why is this necessary? I can't see how there are integer literals being 
used here, or how implicitly going from `false` to `true` in the 2 items 
being enumerated is going to be confusing.


-Steve


Re: More fun with autodecoding

2018-09-15 Thread Steven Schveighoffer via Digitalmars-d

On 9/15/18 12:04 PM, Neia Neutuladh wrote:

On Saturday, 15 September 2018 at 15:31:00 UTC, Steven Schveighoffer wrote:
The problem I had was that it wasn't clear to me which constraint was 
failing. My bias brought me to "it must be autodecoding again!". But 
objectively, I should have examined all the constraints to see what 
was wrong. All C++ concepts seem to do (haven't used them) is help 
identify easier which requirements are failing.


They also make it so your automated documentation can post a link to 
something that describes the type in more cases. std.algorithm would 
still be relatively horked, but a lot of functions could be declared as 
yielding, for instance, ForwardRange!(ElementType!(TRange)).


True, we currently rely on convention there. But this really is simply 
documentation at a different (admittedly more verified) level.




We can fix all these problems by simply identifying the constraint 
clauses that fail. By color coding the error message identifying which 
ones are true and which are false, we can pinpoint the error without 
changing the language.


I wish. I had a look at std.algorithm.searching.canFind as the first 
thing I thought to check. Its constraints are of the form:


     bool canFind(Range)(Range haystack)
     if (is(typeof(find!pred(haystack))))

The compiler can helpfully point out that the specific constraint that 
failed was is(...), which does absolutely no good in trying to track 
down the problem.


is(typeof(...)) constraints might be useless here, but we have started 
to move away from such things in general (see for instance isInputRange 
and friends).


But there could actually be a solution -- just recursively play out the 
items at compile time (probably with the verbose switch) to see what 
underlying cause there is.


Other than that, you can then write find(myrange) and see what comes up.

In my case even, the problem was hasSlicing, which itself is a 
complicated template, and wouldn't have helped me diagnose the real 
problem. A recursive display of what things failed would help; even being 
able to trigger a diagnosis of hasSlicing on its own, instead of copying 
all the constraints locally, would still be a much better situation.


I'm really thinking of exploring how this could play out, just toying 
with the compiler to do this would give me experience in how the thing 
works.


-Steve


Re: Proposal: __not(keyword)

2018-09-15 Thread Steven Schveighoffer via Digitalmars-d

On 9/14/18 11:06 AM, Adam D. Ruppe wrote:


It also affects attrs brought through definitions though:

shared class foo {
    int a; // automatically shared cuz of the above line of code
    __not(shared) int b; // no longer shared
}


Aside from Jonathan's point, which I agree with, that the cost(bool) 
mechanism would be preferable in generic code (think not just negating 
existing attributes, but determining how to forward them), the above is 
different from just negation.


Making something unshared *inside* something that is shared breaks 
transitivity, and IMO the above simply would be the same as not having 
any attribute there.


In other words, I would expect:

shared foo f;

static assert(is(typeof(f.b) == shared(int)));

I'm not sure how the current behavior works, but definitely wanted to 
clarify that we can't change something like that without a major 
language upheaval.


-Steve


Re: More fun with autodecoding

2018-09-15 Thread Steven Schveighoffer via Digitalmars-d

On 9/13/18 3:53 PM, H. S. Teoh wrote:

On Thu, Sep 13, 2018 at 06:32:54PM -0400, Nick Sabalausky (Abscissa) via 
Digitalmars-d wrote:

On 09/11/2018 09:06 AM, Steven Schveighoffer wrote:


Then I found the true culprit was isForwardRange!R. This led me to
requestion my sanity, and finally realized I forgot the empty
function.


This is one reason template-based interfaces like ranges should be
required to declare themselves as deliberately implementing said
interface. Sure, we can tell people they should always `static
assert(isForwardRange!MyType)`, but that's coding by convention and
clearly isn't always going to happen.


No, please don't. I've used C# and Swift, and this sucks compared to 
duck typing.



Yeah, I find myself writing `static assert(isInputRange!MyType)` all the
time these days, because you just never can be too sure you didn't screw
up and cause things to mysteriously fail, even though they shouldn't.

Although I used to be a supporter of free-form sig constraints (and
still am to some extent) and a hater of Concepts like in C++, more and
more I'm beginning to realize the wisdom of Concepts rather than
free-for-all ducktyping.  It's one of those things that work well in
small programs and fast, one-shot projects, but don't generalize so well
as you scale up to larger and larger projects.


The problem I had was that it wasn't clear to me which constraint was 
failing. My bias brought me to "it must be autodecoding again!". But 
objectively, I should have examined all the constraints to see what was 
wrong. All C++ concepts seem to do (haven't used them) is help identify 
easier which requirements are failing.


We can fix all these problems by simply identifying the constraint 
clauses that fail. By color coding the error message identifying which 
ones are true and which are false, we can pinpoint the error without 
changing the language.


Once you fix the issue, it doesn't error any more. So the idea of duck 
typing and constraints is sound; it's just difficult to diagnose.


-Steve


Re: More fun with autodecoding

2018-09-12 Thread Steven Schveighoffer via Digitalmars-d

On 9/11/18 7:58 AM, jmh530 wrote:


Is there any reason why this is not sufficient?

[1] https://run.dlang.io/is/lu6nQ0


That's OK if you are the only one defining S. But what if float is 
handled elsewhere?


-Steve


Re: More fun with autodecoding

2018-09-11 Thread Steven Schveighoffer via Digitalmars-d

On 9/10/18 7:00 PM, Nicholas Wilson wrote:

On Monday, 10 September 2018 at 20:44:46 UTC, Andrei Alexandrescu wrote:

On 9/10/18 12:46 PM, Steven Schveighoffer wrote:

On 9/10/18 8:58 AM, Steven Schveighoffer wrote:
I'll have to figure out why my specialized range doesn't allow 
splitting based on " ".


And the answer is: I'm an idiot. Forgot to define empty :) Also my 
slicing operator accepted ints and not size_t.


I guess a better error message would be in order.


https://github.com/dlang/DIPs/pull/131 will help narrow down the cause.


While this would help eventually, I'd prefer something that just 
transforms all the existing code into useful error messages. See my 
response to Andrei.


-Steve


Re: More fun with autodecoding

2018-09-11 Thread Steven Schveighoffer via Digitalmars-d

On 9/10/18 1:44 PM, Andrei Alexandrescu wrote:

On 9/10/18 12:46 PM, Steven Schveighoffer wrote:

On 9/10/18 8:58 AM, Steven Schveighoffer wrote:
I'll have to figure out why my specialized range doesn't allow 
splitting based on " ".


And the answer is: I'm an idiot. Forgot to define empty :) Also my 
slicing operator accepted ints and not size_t.


I guess a better error message would be in order.



A better error message would help prevent the painful diagnosis that I 
had to do to actually find the issue.


So the error I got was this:

source/bufref.d(346,36): Error: template 
std.algorithm.iteration.splitter cannot deduce function from argument 
types !()(Result, string), candidates are:
/Users/steves/.dvm/compilers/dmd-2.081.0/osx/bin/../../src/phobos/std/algorithm/iteration.d(3792,6): 
   std.algorithm.iteration.splitter(alias pred = "a == b", Range, 
Separator)(Range r, Separator s) if (is(typeof(binaryFun!pred(r.front, 
s)) : bool) && (hasSlicing!Range && hasLength!Range || 
isNarrowString!Range))
/Users/steves/.dvm/compilers/dmd-2.081.0/osx/bin/../../src/phobos/std/algorithm/iteration.d(4163,6): 
   std.algorithm.iteration.splitter(alias pred = "a == b", Range, 
Separator)(Range r, Separator s) if (is(typeof(binaryFun!pred(r.front, 
s.front)) : bool) && (hasSlicing!Range || isNarrowString!Range) && 
isForwardRange!Separator && (hasLength!Separator || 
isNarrowString!Separator))
/Users/steves/.dvm/compilers/dmd-2.081.0/osx/bin/../../src/phobos/std/algorithm/iteration.d(4350,6): 
   std.algorithm.iteration.splitter(alias isTerminator, 
Range)(Range r) if (isForwardRange!Range && 
is(typeof(unaryFun!isTerminator(r.front
/Users/steves/.dvm/compilers/dmd-2.081.0/osx/bin/../../src/phobos/std/algorithm/iteration.d(4573,6): 
   std.algorithm.iteration.splitter(C)(C[] s) if (isSomeChar!C)


This means I had to look at each line, figure out which overload I'm 
calling, and then copy all the constraints locally, seeing which ones 
were true and which ones false.


But it didn't stop there. The problem was hasSlicing!Range. If you look 
at hasSlicing, it looks like this:


enum bool hasSlicing(R) = isForwardRange!R
    && !isNarrowString!R
    && is(ReturnType!((R r) => r[1 .. 1].length) == size_t)
    && (is(typeof(lvalueOf!R[1 .. 1]) == R) || isInfinite!R)
    && (!is(typeof(lvalueOf!R[0 .. $])) || is(typeof(lvalueOf!R[0 .. $]) == R))
    && (!is(typeof(lvalueOf!R[0 .. $])) || isInfinite!R
        || is(typeof(lvalueOf!R[0 .. $ - 1]) == R))
    && is(typeof((ref R r)
    {
        static assert(isForwardRange!(typeof(r[1 .. 2])));
    }));

Now I had to instrument a whole slew of items. I pasted this whole thing 
into my code, added an alias R for my range type, and then changed the 
big boolean expression into a bunch of static asserts.
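
The instrumentation looked roughly like this (a reconstruction of the technique rather than my original code; int[] stands in for the range type being checked, so the snippet compiles on its own):

import std.range.primitives;   // isForwardRange, isInfinite, ...
import std.traits;             // ReturnType, lvalueOf, isNarrowString, ...

alias R = int[];   // substitute the range type being debugged

// hasSlicing's big && expression, split into separate checks so the compiler
// reports exactly which clause is the false one:
static assert(isForwardRange!R);   // <- in my case, this was the clause that fired
static assert(!isNarrowString!R);
static assert(is(ReturnType!((R r) => r[1 .. 1].length) == size_t));
static assert(is(typeof(lvalueOf!R[1 .. 1]) == R) || isInfinite!R);
static assert(!is(typeof(lvalueOf!R[0 .. $])) || is(typeof(lvalueOf!R[0 .. $]) == R));
static assert(!is(typeof(lvalueOf!R[0 .. $])) || isInfinite!R
    || is(typeof(lvalueOf!R[0 .. $ - 1]) == R));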


Then I found the true culprit was isForwardRange!R. This led me to 
requestion my sanity, and finally realized I forgot the empty function.


A fabulous fantastic mechanism that would have saved me some time is 
simply coloring the clauses of the template constraint that failed red, 
the ones that passed green, and the ones that weren't evaluated grey.


Furthermore, it would be good to recursively continue this for red 
clauses like `hasSlicing`, which have so much underneath. Either that, 
or provide a way to trigger the colored evaluation on demand.


If I were a dmd guru, I'd look at doing this myself. I may still try and 
hack it in just to see if I can do it.


--

Finally, there is a possible bug in the definition of hasSlicing: it 
doesn't require the slice parameters to be size_t, but there are places 
(e.g. inside std.algorithm.searching.find) that pass range.length .. 
range.length when slicing the range. In my implementation I had used ints 
as the parameters for opSlice, so I started seeing errors deep inside 
std.algorithm saying there was no overload for slicing. Again my sanity 
was questioned, but I figured out the error, and now it's actually working.
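
For reference, here is a minimal sketch (my own, not the actual buffer range) of a character range that passes these checks, with the two pitfalls called out: `empty` must be defined, and `length`/`opSlice` must be in terms of size_t rather than int.

import std.range.primitives : hasLength, hasSlicing, isForwardRange;

struct MyChars
{
    string data;   // hypothetical backing store

    // Forgetting this property is what silently failed isForwardRange.
    @property bool empty() const { return data.length == 0; }
    @property char front() const { return data[0]; }
    void popFront() { data = data[1 .. $]; }
    @property MyChars save() { return this; }

    @property size_t length() const { return data.length; }
    alias opDollar = length;

    // Must take size_t: find() slices with range.length .. range.length,
    // and int parameters won't reliably accept those values.
    MyChars opSlice(size_t lo, size_t hi) { return MyChars(data[lo .. hi]); }
}

static assert(isForwardRange!MyChars);
static assert(hasLength!MyChars);
static assert(hasSlicing!MyChars);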


-Steve


Re: More fun with autodecoding

2018-09-10 Thread Steven Schveighoffer via Digitalmars-d

On 9/10/18 8:58 AM, Steven Schveighoffer wrote:
I'll have to figure out why my specialized range doesn't allow splitting 
based on " ".


And the answer is: I'm an idiot. Forgot to define empty :) Also my 
slicing operator accepted ints and not size_t.


-Steve



Re: More fun with autodecoding

2018-09-10 Thread Steven Schveighoffer via Digitalmars-d

On 9/8/18 8:36 AM, Steven Schveighoffer wrote:

On 8/9/18 2:44 AM, Walter Bright wrote:

On 8/8/2018 2:01 PM, Steven Schveighoffer wrote:
Here's where I'm struggling -- because a string provides indexing, 
slicing, length, etc. but Phobos ignores that. I can't make a new 
type that does the same thing. Not only that, but I'm finding the 
specializations of algorithms only work on the type "string", and 
nothing else.


One of the worst things about autodecoding is it is special, it *only* 
steps in for strings. Fortunately, however, that specialness enabled 
us to save things with byCodePoint and byCodeUnit.


So it turns out that technically the problem here, even though it seemed 
like an autodecoding problem, is a problem with splitter.


splitter doesn't deal with encodings of character ranges at all.

For instance, when you have this:

"abc 123".byCodeUnit.splitter;

What happens is splitter only has one overload that takes one parameter, 
and that requires a character *array*, not a range.


So the byCodeUnit result is aliased-this to its original, and surprise! 
the elements from that splitter are string.


Next, I tried to use a parameter:

"abc 123".byCodeUnit.splitter(" ");

Nope, still devolves to string. It turns out it can't figure out how to 
split character ranges using a character array as input.


Hm... I made some erroneous assumptions in determining these problems.

1. There is no alias this for the source in ByCodeUnitImpl. I'm not sure 
how it was working when I tested before, but byCodeUnit definitely 
doesn't have it, and doesn't compile with the no-arg splitter call.
2. The .splitter(" ") does actually work and return a range of 
ByCodeUnitImpl elements.


So some of my analysis must have been based on bad testing.

However, the issue with the no-arg splitter is still there, and I still 
think it should be fixed.
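
A small compile-and-run sketch of where things stand, as I understand it at the time of this thread (my own example, not from the original message):

import std.algorithm.iteration : splitter;
import std.stdio : writeln;
import std.utf : byCodeUnit;

void main()
{
    // With an explicit separator, the code-unit range type is preserved:
    foreach (word; "abc 123".byCodeUnit.splitter(" "))
        writeln(word);                   // elements are byCodeUnit slices, not string

    // The no-argument form (split on runs of whitespace) still only
    // accepts character arrays:
    foreach (word; "abc  123".splitter)  // fine for a plain string
        writeln(word);
    // foreach (word; "abc  123".byCodeUnit.splitter) {} // the case that doesn't compile
}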


I'll have to figure out why my specialized range doesn't allow splitting 
based on " ".


-Steve


Re: More fun with autodecoding

2018-09-10 Thread Steven Schveighoffer via Digitalmars-d

On 9/8/18 8:36 AM, Steven Schveighoffer wrote:
I'll work on adding some issues to the tracker, and potentially doing 
some PRs so they can be fixed.


https://issues.dlang.org/show_bug.cgi?id=19238
https://github.com/dlang/phobos/pull/6700

-Steve



Re: Will the core.stdc module be updated for newer versions of C?

2018-09-10 Thread Steven Schveighoffer via Digitalmars-d

On 9/7/18 6:12 PM, solidstate1991 wrote:
While for the most part it still works very well, when porting Mago I 
found a few functions that are not present in C99 (most notably 
wcsncpy_s).


It will be updated when you update it ;)

There is just so much in the stdc libraries that it's difficult to 
achieve 100% coverage. The intention is that any time you have a 
#include <someFilePath.h> for some C standard header, you can do import 
core.stdc.someFilePath in D. If there are missing functions, and they 
aren't OS specific, please file a bug report, and if you're up to it, 
add the function in a PR.
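
For what it's worth, such an addition is usually just the C prototype transcribed. A hedged sketch for the wcsncpy_s example mentioned above (the module name and the errno_t/rsize_t translations are my own assumptions; where it actually lands would be decided in review):

module core.stdc.wchar_annexk;   // hypothetical module name, for illustration only

import core.stdc.stddef : wchar_t;

extern (C) nothrow @nogc:

// C11 Annex K: errno_t wcsncpy_s(wchar_t * restrict s1, rsize_t s1max,
//                                const wchar_t * restrict s2, rsize_t n);
// errno_t and rsize_t are rendered here as int and size_t.
int wcsncpy_s(wchar_t* s1, size_t s1max, const(wchar_t)* s2, size_t n);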


-Steve


Re: More fun with autodecoding

2018-09-10 Thread Steven Schveighoffer via Digitalmars-d

On 9/10/18 1:45 AM, Chris wrote:

After a while your code will be cluttered with absurd stuff like this. 
`.byCodeUnit`, `.byGrapheme`, `.array` etc. Due to my experience with 
`splitter` et. al. I tried to create my own parser to have better 
control over every step.


I considered that, but I'm still trying to make this buffer reference 
thing work. Phobos just needs to be fixed. This is actually not as 
hopeless as I once thought. But what needs to happen is that all of 
Phobos' algorithms get tested with byCodeUnit et al.


After a few *minutes* of testing things I ran 
into this bug [1] that didn't get fixed till early 2018. I never started 
to write my own step-by-step parser. I'm glad I didn't.


It actually was fixed accidentally in 2017 in this PR: 
https://github.com/dlang/druntime/pull/1952. The bug was closed in 2018 
when someone noticed the code no longer failed.


Essentially, the whole string switch algorithm was replaced with a 
completely rewritten better approach. This is a great example of why we 
should be moving more of the compiler magic into the library -- it's 
just easier to write and understand there.


I wish people began to realize that string handling is a basic necessity 
and that the correct handling of strings is of utmost importance. Please 
keep us updated on how things work out (or not) for you.


Absolutely, D needs to have great support for string parsing and 
manipulation. The potential is awesome.


I will keep it up. What I'm trying to fix is the fact that using 
std.algorithm to extract pieces from a buffer, and then using the 
position in that buffer to determine things (i.e. parsing), is really 
difficult without some stupid requirements like pointer math.
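
A tiny sketch of the annoyance (my own example): recovering where an extracted slice sits in the original buffer currently takes pointer arithmetic.

import std.algorithm.searching : find;

void main()
{
    string buffer = "key=value";
    auto rest = buffer.find('=');            // a slice into buffer, starting at '='
    size_t offset = rest.ptr - buffer.ptr;   // the "stupid pointer math"
    assert(offset == 3);
    assert(buffer[0 .. offset] == "key");
    assert(rest[1 .. $] == "value");
}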


[Please, nobody answer my post pointing out that a) we don't understand 
Unicode and b) that it's an insult to the Universe to draw attention to 
flaws that keep pestering us on an almost daily basis - without trying 
to fix them ourselves stante pede. As is clear from Steve's efforts, the 
Universe doesn't seem to care.)


I don't characterize it as the universe not caring. Phobos has a legacy 
problem with string handling, and it needs to somehow be addressed -- 
either by painfully extracting the problem, or painfully working around 
it. I don't think anyone here thinks there isn't a problem or that it's 
insulting to bring it up. But anything that needs to be done is painful 
either way, which is why it's not happening very fast.


-Steve


Re: More fun with autodecoding

2018-09-09 Thread Steven Schveighoffer via Digitalmars-d

On 8/9/18 2:44 AM, Walter Bright wrote:

On 8/8/2018 2:01 PM, Steven Schveighoffer wrote:
Here's where I'm struggling -- because a string provides indexing, 
slicing, length, etc. but Phobos ignores that. I can't make a new type 
that does the same thing. Not only that, but I'm finding the 
specializations of algorithms only work on the type "string", and 
nothing else.


One of the worst things about autodecoding is it is special, it *only* 
steps in for strings. Fortunately, however, that specialness enabled us 
to save things with byCodePoint and byCodeUnit.


So it turns out that technically the problem here, even though it seemed 
like an autodecoding problem, is a problem with splitter.


splitter doesn't deal with encodings of character ranges at all.

For instance, when you have this:

"abc 123".byCodeUnit.splitter;

What happens is splitter only has one overload that takes one parameter, 
and that requires a character *array*, not a range.


So the byCodeUnit result is aliased-this to its original, and surprise! 
the elements from that splitter are string.


Next, I tried to use a parameter:

"abc 123".byCodeUnit.splitter(" ");

Nope, still devolves to string. It turns out it can't figure out how to 
split character ranges using a character array as input.


The only thing that does seem to work is this:

"abc 123".byCodeUnit.splitter(" ".byCodeUnit);

But this goes against most algorithms in Phobos that deal with character 
ranges -- generally you can use any width character range, and it just 
works. Having a drop-in replacement for string would require splitter to 
handle these transcodings (and I think in general, algorithms should be 
able to handle them as well). Not only that, but the specialized 
splitter that takes no separator can split on multiple spaces, a feature 
I want to have for my drop-in replacement.


I'll work on adding some issues to the tracker, and potentially doing 
some PRs so they can be fixed.


-Steve


Re: More fun with autodecoding

2018-09-09 Thread Steven Schveighoffer via Digitalmars-d

On 9/8/18 8:36 AM, Steven Schveighoffer wrote:


Sent this when I was on a plane, and for some reason it posted with the 
timestamp when I hit "send later", not when I connected just now. So 
this is to bring the previous message back to the forefront.


-Steve


Re: This is why I don't use D.

2018-09-06 Thread Steven Schveighoffer via Digitalmars-d

On 9/5/18 4:40 PM, Nick Sabalausky (Abscissa) wrote:

On 09/04/2018 09:58 PM, Jonathan M Davis wrote:
On Tuesday, September 4, 2018 7:18:17 PM MDT James Blachly via 
Digitalmars-d wrote:


Are you talking about this?

https://github.com/clinei/3ddemo

which hasn't been updated since February 2016?


This is part of why it's sometimes been discussed that we need a way to
indicate which dub packages are currently maintained and work.


What we need is for DUB to quit pretending the compiler (and DUB itself, 
for that matter) isn't a dependency just like any other. I pointed this 
out years ago over at DUB's GitHub project, but pretty much just got 
silence.



The compiler doesn't change all that often, and when it does, there's 
usually a long deprecation cycle.


The problem really is that phobos/druntime change all the time. And they 
are tied to the compiler.


I think some sort of semver scheme really should be implemented for the 
compiler and phobos. But we need more manpower to handle that.


-Steve


Re: This is why I don't use D.

2018-09-05 Thread Steven Schveighoffer via Digitalmars-d

On 9/5/18 11:46 AM, Dennis wrote:

On Wednesday, 5 September 2018 at 13:27:48 UTC, Steven Schveighoffer wrote:
3ddemo has one commit. In February 2016. I think it would be an 
amazing feat indeed if a project with one version builds for more than 
2 years in any language.


This problem is not about 3ddemo. I can totally relate to the OP, when I 
started learning D (we're talking April 2017 here) I tried many OpenGL 
demos and GUI libraries. I like learning by example, so I tried a lot of 
them on both Ubuntu and Windows. My success rate of building them was 
below 20%, and even if it did succeed, it often still had deprecation 
warnings, or linking errors when loading the required shared libraries, 
or glitches like messed up text rendering. I would try to fix it myself, 
but the error messages were not clear at all for a beginner and Googling 
them yielded few results.


I should say, I have little experience with or understanding of building 
OpenGL stuff. My experience with trying stuff is that it's very finicky 
about which libraries are installed or how the environment has to be 
properly set up.


Even after I got 3ddemo to compile, it didn't run; it wouldn't open 
certain libraries.


So I think it would be nice if these experiences were better, but I 
don't know what the core D projects need to do here. My guess is that 
there is not a lot of manpower making proper easy-to-use 3d libraries.


We're not even only talking about small unmaintained projects here: at 
the time I tried it, Gtk-D was broken.[1] Out of frustration I carried 
on in C# for a while, and guess what: the first best OpenTK demo I found 
basically worked first try. Now I didn't give up on D, but I can totally 
understand that others (like OP) don't have the patience to put up with 
this.


I think Gtk-D has gotten a lot better (not my experience, but Tilix 
seems to be doing good with it) since then.


While we can't force volunteers to keep their D projects up to date, we 
could try to give some incentive by notifying them via code.dlang.org, 
or give users information on what compiler / environment is required for 
dub packages to build. It might prevent some new users from leaving D 
out of frustration.


I think a "known good" configuration entry, even if it's manual, would 
be a good thing to add.


-Steve


Re: This is why I don't use D.

2018-09-05 Thread Steven Schveighoffer via Digitalmars-d

On 9/4/18 8:49 PM, Everlast wrote:

I downloaded 3ddemo, extracted, built and I get these errors:

logger 2.66.0: building configuration "library"...
\dub\packages\logger-2.66.0\logger\std\historical\logger\core.d(1717,16): Error: 
cannot implicitly convert expression logger of type shared(Logger) to 
std.historical.logger.core.Logger
\dub\packages\logger-2.66.0\logger\std\historical\logger\core.d(261,21): 
Error: no property fracSec for type const(SysTime), did you mean 
std.datetime.systime.SysTime.fracSecs?
\dub\packages\logger-2.66.0\logger\std\historical\logger\filelogger.d(86,27): 
Error: template instance 
`std.historical.logger.core.systimeToISOString!(LockingTextWriter)` 
error instantiating

dmd.exe failed with exit code 1.

This is typical with most of my trials with D... something is always 
broken all the time and I'm expected to jump through a bunch of hoops to 
get it to work. File a issue, fix it myself, use a different library, 
etc. I'm expected to waste my time fixing a problem that really should 
not exist or should have a high degree of automation to help fix it. I 
really have better things to do with my time so I won't invest it in D.


3ddemo has one commit. In February 2016. I think it would be an amazing 
feat indeed if a project with one version builds for more than 2 years 
in any language.


I built it successfully with DMD 2.076 (I just picked a random old 
version). So it's still usable, you just have to know what version of 
the compiler to use. I'd say it would be nice to record which version it 
builds with in some way on code.dlang.org.




This attitude of "It's your problem" is going to kill D.


I wouldn't say the attitude is "It's your problem", but more that you 
can't expect a completely unmaintained, scantily tested piece of 
software to magically work because it's written in D.


In this phase of D's life, things just aren't going to stay buildable. 
We are making too many changes to the language and the standard library 
to say that D is going to build things today that were buildable 2+ 
years ago.


In time, this will settle down, and D will be much more stable. I'd 
recommend coming back and checking again later. But I would definitely 
suggest not looking for older projects to test with.


There is really no incentive for me to use D except for its language 
features... everything else it does, besides performance, is shit 
compared to what most other languages do. Really, D wins on very few 
metrics but the D fanboys will only focus on those.


Sounds like you have other problems than buildability, so maybe D just 
isn't right for you. Thanks for stopping by!


-Steve


Re: This thread on Hacker News terrifies me

2018-09-02 Thread Steven Schveighoffer via Digitalmars-d

On 9/1/18 6:29 AM, Shachar Shemesh wrote:

On 31/08/18 23:22, Steven Schveighoffer wrote:

On 8/31/18 3:50 PM, Walter Bright wrote:

https://news.ycombinator.com/item?id=17880722

Typical comments:

"`assertAndContinue` crashes in dev and logs an error and keeps going 
in prod. Each time we want to verify a runtime assumption, we decide 
which type of assert to use. We prefer `assertAndContinue` (and I 
push for it in code review),"


e.g. D's assert. Well, actually, D doesn't log an error in production.



I think it's the music of the thing rather than the thing itself.

Mecca has ASSERT, which is a condition always checked and that always 
crashes the program if it fails, and DBG_ASSERT, which, like D's built 
in assert, is skipped in release mode (essentially, an assert where you 
can log what went wrong without using the GC-needing format).


When you compare this to what Walter was quoting, you get the same end 
result, but a vastly different intention. It's one thing to say "this 
ASSERT is cheap enough to be tested in production, while this DBG_ASSERT 
one is optimized out". It's another to say "well, in production we want 
to keep going no matter what, so we'll just ignore the asserts".


Which is exactly what Phobos and Druntime do (ignore asserts in 
production). I'm not sure how the intention makes any difference.


The obvious position of D is that asserts and bounds checks shouldn't be 
used in production -- that is how we ship our libraries. It is what the 
"-release" switch does. How else could it be interpreted?


-Steve


Re: This thread on Hacker News terrifies me

2018-08-31 Thread Steven Schveighoffer via Digitalmars-d

On 8/31/18 3:50 PM, Walter Bright wrote:

https://news.ycombinator.com/item?id=17880722

Typical comments:

"`assertAndContinue` crashes in dev and logs an error and keeps going in 
prod. Each time we want to verify a runtime assumption, we decide which 
type of assert to use. We prefer `assertAndContinue` (and I push for it 
in code review),"


e.g. D's assert. Well, actually, D doesn't log an error in production.

-Steve


Re: std.encoding:2554 - Unrecognized Encoding: big5 - Please help!

2018-08-28 Thread Steven Schveighoffer via Digitalmars-d

On 8/28/18 9:38 AM, spikespaz wrote:
I have a user who submitted a bug report for one of my projects. The 
error is in std\encoding.d on line 2554.


The problem arises when he types "google.com.hk/search?q={{query}}" 
(exact string) into this function:
https://github.com/spikespaz/search-deflector/blob/master/source/setup.d#L253-L278 



Here is the issue on GH:
https://github.com/spikespaz/search-deflector/issues/12

I really can't figure this one out and any help would be appreciated. 
Thanks.


The answer is: the encoding "big5" isn't supported by std.encoding.

You can add a new subclass and register it, to handle the encoding. 
Maybe this article can help you write one: 
https://en.wikipedia.org/wiki/Big5


You want to subclass EncodingScheme. See instructions at the top of 
https://dlang.org/phobos/std_encoding.html


-Steve


Re: Is @safe still a work-in-progress?

2018-08-28 Thread Steven Schveighoffer via Digitalmars-d

On 8/24/18 10:25 PM, Walter Bright wrote:

On 8/23/2018 6:32 AM, Steven Schveighoffer wrote:
Furthermore any member function (or UFCS function for that matter) 
REQUIRES the first parameter to be the aggregate. How do you make a 
member function that stuffs the return into a different parameter 
properly typecheck?


What I propose is that the function interface be refactored so it does 
fit into these patterns. Is that an unreasonable requirement? I don't 
know. But it doesn't seem to be, as I haven't run into it yet.


So this would mean a member function would have to be refactored into a 
different function with a different calling syntax, i.e.:


x.foo(target);

would have to be refactored to:

target.foo(x);

or foo(target, x);

Aside from the adjustment in name that is necessary to make this read 
correctly, that may cause other problems (sometimes non-member functions 
aren't available if it's a template instantiation).


Phobos doesn't do this by accident. It's how constructors work (see 
above) and how pipeline programming works.


Constructors I agree are reasonable to consider `this` to be the 
return value. On that point, I would say we should definitely go ahead 
with making that rule, and I think it will lead to no confusion 
whatsoever.


pipeline programming depends on returning something other than `void`, 
so I don't see how this applies.


grep Phobos for instances of put() and see its signature. It's part of 
pipeline programming, and it's all over the place.


I stand partly corrected! Indeed you can put a void-returning function 
at the *end* of a pipeline call; I hadn't thought of that.


But in terms of put, strictly speaking, any call of some.pipeline.put(x) 
is wrong. It should be put(some.pipeline, x), to avoid issues with how 
put was designed.
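
A short sketch of why the free-function form matters (Sink here is my own toy type, not anything from Phobos):

import std.range.primitives : put;

struct Sink
{
    char[] buf;
    void put(char c) { buf ~= c; }   // the member only handles single chars
}

void main()
{
    Sink s;
    // s.put("hello");     // member overload: no match for a string, and UFCS
    //                     // won't fall back to the free function
    put(s, "hello");        // free function breaks the string into elements
    put(s, 'w');            // and still forwards single elements
    assert(s.buf == "hellow");
}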



It would restrict your legitimate calls.


Maybe that's a good thing. Having multiple simultaneous routes of data 
out of a function is not good practice (note that it is impossible with 
functional programming). If you absolutely must have it, the exit routes 
can be aggregated into a struct, then pass that struct as the first 
argument.


Maybe it's better to designate one sink, and have that be the result. I 
know that after inout was implemented, there were definitely cases where 
one wanted to have multiple inout routes (i.e. independent traces 
between multiple parameters for copying mutability). It may be the same 
for this, I don't know.


I want to stress that it may be a valid solution, but we should strive 
to prove the solutions are the best possible rather than just use 
duct-tape methodology.


I don't know how to prove anything with programming languages.


I don't mean prove like mathematical proof. I mean try to consider how 
this affects all cases instead of just the one case that will make 
phobos compile.


"show" is a better verb than "prove".

It should even be considered that perhaps there are better solutions 
even than the approach dip1000 has taken.


People have hypothesized that for several years, and so far none have 
been forthcoming beyond a few hand-wavy generalities.


I'm just saying if dip1000 cannot fix all the problems, that instead of 
adding weird exceptions, or the classic "you're just doing it wrong", 
maybe we should reconsider the approach.


Another case which was brought up and pretty much ignored was this one: 
https://forum.dlang.org/post/qkrdpmdqaxjadgvso...@forum.dlang.org


I also want to point out that the attitude of 'we could just fix it, 
but nobody will pull my request' is unhelpful. We want to make sure we 
have the best solution possible, don't take criticism as meaningless 
petty bickering. People are genuinely trying to make sure D is 
improved. Hostility towards reviews or debate doesn't foster that.


I'm not hostile to debate. I just don't care for "this is uncharted 
territory, so let's do nothing" which has been going on for probably 4 
years now, coincident with "scope is incomplete, D sux".


I.e. lead, follow, or get out of the way :-)


I'm opting for the latter, as the idea of band-aid PRs to get Phobos 
compiling with dip1000 just to see if dip1000 is going to work seems 
like the wrong approach to me.


The consequence of this is that getting out of the way means your PRs 
don't get pulled.


-Steve


Re: Is @safe still a work-in-progress?

2018-08-28 Thread Steven Schveighoffer via Digitalmars-d

On 8/24/18 10:28 PM, Walter Bright wrote:

On 8/23/2018 8:14 AM, Steven Schveighoffer wrote:
If I had to design a specific way to allow the common case to be easy, 
but still provide a mechanism for the uncommon cases, I would say:


1. define a compiler-recognized attribute (e.g. @__sink).
2. If @__sink is applied to any parameter, that is effectively the 
return value.
3. In the absence of a @__sink designation on non-void-returning 
functions, it applies to the return value.
4. In the absence of a @__sink designation on void returning 
functions, it applies to the first parameter.

5. Inference of @__sink happens even on non-templates.
6. If @__sink is attributed on multiple parameters, you assume all 
return parameters are assigned to all @__sink parameters for the 
purposes of verifying lifetimes are not exceeded.



'ref' is already @__sink.


No, otherwise we wouldn't need the patch you are pushing.

-Steve


Re: D is dead

2018-08-24 Thread Steven Schveighoffer via Digitalmars-d

On 8/24/18 6:16 PM, Jonathan M Davis wrote:

On Friday, August 24, 2018 7:46:57 AM MDT Mike Franklin via Digitalmars-d
wrote:

On Friday, 24 August 2018 at 13:21:25 UTC, Jonathan M Davis wrote:

I think that you're crazy.


No, I just see more potential in D than you do.


To be clear, I'm not calling you crazy in general. I'm calling the idea of
bypassing libc to call syscalls directly under any kind of normal
circumstances crazy. There is tons of work to be done around here to improve
D, and IMHO, reimplementing OS functions just because they're written in C
is a total waste of time and an invitation for bugs - in addition to making
the druntime code that much less portable, since it bypasses the API layer
that was standardized for POSIX systems.


Let me say that I both agree with Jonathan and with Mike.

I think we should reduce Phobos' dependence on the user-library part of 
libc, while at the same time not re-inventing how the OS bindings are 
called. For example, using Martin's std.io library instead of <stdio.h>.


I really don't want to see dlang have to maintain posix system calls on 
all supported OSes when that's already being done for us.


Windows makes this simpler -- the system calls are separate from the C 
runtime. It would be nice if Posix systems were that way, but it's both 
silly to reinvent the system calls (they are on every OS anyways, and in 
shared-library form), and a maintenance nightmare.


For platforms that DON'T have an OS abstraction, or where it's split out 
from the user-library part of libc, it would be perfectly acceptable to write 
a shim there if needed. I'd be surprised if it's not already present in 
C form.


-Steve


Re: Embrace the from template?

2018-08-24 Thread Steven Schveighoffer via Digitalmars-d

On 8/24/18 6:29 PM, Jonathan Marler wrote:

On Friday, 24 August 2018 at 20:36:06 UTC, Seb wrote:

On Friday, 24 August 2018 at 20:04:22 UTC, Jonathan Marler wrote:

I'd gladly fix it but alas, my pull requests are ignored :(


They aren't! It's just that sometimes the review queue is pretty full.
I have told you before that your contributions are very welcome (like 
they are from everyone else) and if there's anything blocking your 
productivity you can always ping me on Slack.


Don't tempt me to start contributing again :)  I had months where I got 
almost no attention on a dozen or so PRs...I love to contribute but I'd 
have to be mad to continue throwing dozens of hours of work away.


I thought we were going to get the unittest import problem solved, but 
then you closed the PR abruptly (we did get phobos to stop compiling 
with -dip1000 so we could work around the linker errors).


In any case, I can understand the feeling of frustration. I also have no 
power to force others to review who make important decisions, so I can't 
guarantee it won't happen again. I myself would love to have the time to 
get more reps with the compiler code, but I'm hopelessly lost when 
reviewing dmd stuff currently.


If the problem gets solved I'll willingly start working again, but I 
don't think anything's changed.


I'll just be blunt -- I don't think "the problem" is ever going to get 
solved. This is the world of volunteer OSS development, and nobody has 
control over anyone's time but themselves. Things could go great for a 
month and then stagnate. If you hit on something that some VIP is 
looking to solve, it may get a lot of attention.


But I would recommend letting a PR stay open, pinging reviewers, etc., 
instead of closing it. Don't give up hope that it will ever be merged.


-Steve


Re: Dicebot on leaving D: It is anarchy driven development in all its glory.

2018-08-24 Thread Steven Schveighoffer via Digitalmars-d

On 8/24/18 5:12 PM, Meta wrote:

On Friday, 24 August 2018 at 17:12:53 UTC, H. S. Teoh wrote:


I got bitten by this just yesterday.  Update dmd git master, update 
vibe.d git master, now my vibe.d project doesn't compile anymore due 
to some silly string.d error somewhere in one of vibe.d's 
dependencies. :-/


While we're airing grievances about code breakages, I hit this little 
gem the other day, and it annoyed me enough to complain about it: 
https://github.com/dlang/phobos/pull/5291#issuecomment-414196174


What really gets me is the actual removal of the symbol. If it had been 
left there with a deprecation message, I would've caught the problem 
immediately at the source and fixed it in a few minutes. Instead, I 
spent an hour or so tracing "execution" paths through a codebase that 
I'm unfamiliar with to figure out why a static if branch is no longer 
being taken.


According to this comment: 
https://github.com/dlang/phobos/pull/5291#issuecomment-360929553


There was no way to get a deprecation to work.

When we can't get a deprecation to work, we face a hard decision -- 
actually break code right away, print lots of crappy errors, or just 
leave the bug unfixed.


-Steve


Re: D is dead

2018-08-23 Thread Steven Schveighoffer via Digitalmars-d

On 8/23/18 12:22 PM, Shachar Shemesh wrote:

On 23/08/18 17:01, Steven Schveighoffer wrote:
I'm not saying all bugs you file will be fixed, but all bugs you 
*don't* file will definitely not be fixed.


So far, my experience is that it has about the same chances of being 
fixed both ways, and not filing takes less effort.


I have had much better success with bugs being fixed for issues that I 
file vs. hoping someone fixes it without a report, not just in D's 
ecosystem, but pretty much anywhere.


But that's the choice you make. We'll have to disagree on that one. It's 
hard to fix bugs without reports, but hey, maybe you will get lucky and 
someone fixes them by accident.


-Steve


Re: D is dead

2018-08-23 Thread Steven Schveighoffer via Digitalmars-d

On 8/23/18 12:27 PM, Shachar Shemesh wrote:

On 23/08/18 17:01, Steven Schveighoffer wrote:

So interestingly, you are accepting the sockaddr by VALUE.


Indeed. The reason is that I cannot accept them by reference, as then 
you wouldn't be able to pass lvalues* in. Another controversial decision 
by D.


*rvalues you meant.

One that is actively being addressed, at least by the community: 
https://github.com/dlang/DIPs/blob/master/DIPs/DIP1016.md


No guarantees it gets through, but this is probably further than anyone 
has ever gotten before on this topic (and it's a very old topic).




Had that been C++, I'd definitely get a const ref instead.


If you want to use inheritance this is a given, in D or in C++.

What this means is that your identification of the problem is simply 
wrong -- it's not that you can't make subtypes with structs (you can), 
it's that you can't accept rvalues by reference, and accepting by 
reference is required for inheritance.


-Steve


Re: Is @safe still a work-in-progress?

2018-08-23 Thread Steven Schveighoffer via Digitalmars-d

On 8/23/18 9:32 AM, Steven Schveighoffer wrote:

On 8/23/18 4:58 AM, Walter Bright wrote:

On 8/22/2018 6:50 AM, Steven Schveighoffer wrote:
As for things being made "more flexible in the future" this basically 
translates to code breakage. For example, if you are depending on 
only the first parameter being considered the "return" value, and all 
of a sudden it changes to encompass all your parameters, your 
existing code may fail to compile, even if it's correctly safe and 
properly annotated.


It's a good point. But I don't see an obvious use case for considering 
all the ref parameters as being returns.


You would have to consider the shortest lifetime and assume everything 
goes there. It would restrict your legitimate calls. Only mitigating 
factor may be if you take the ones you aren't going to modify as const 
or inout.


Actually, thinking about this, the shortest lifetime is dictated by how 
it is called, so there is no valid way to determine which one makes 
sense when compiling the function.


In order for this to work, you'd have to attribute it somehow. I can see 
that is likely going to be way more cumbersome than it's worth.


If I had to design a specific way to allow the common case to be easy, 
but still provide a mechanism for the uncommon cases, I would say:


1. define a compiler-recognized attribute (e.g. @__sink).
2. If @__sink is applied to any parameter, that is effectively the 
return value.
3. In the absence of a @__sink designation on non-void-returning 
functions, it applies to the return value.
4. In the absence of a @__sink designation on void returning functions, 
it applies to the first parameter.

5. Inference of @__sink happens even on non-templates.
6. If @__sink is attributed on multiple parameters, you assume all 
return parameters are assigned to all @__sink parameters for the 
purposes of verifying lifetimes are not exceeded.


Ugly to specify, but might actually be pretty non-intrusive to use.

-Steve


Re: D is dead

2018-08-23 Thread Steven Schveighoffer via Digitalmars-d

On 8/23/18 9:22 AM, Shachar Shemesh wrote:


On the other hand, look at ConnectedSocket.connect:
https://weka-io.github.io/mecca/docs/mecca/reactor/io/fd/ConnectedSocket.connect.html 



Why do I need two forms? What good is that? Why is the second form a 
template? Answer: Because in D, structs can't inherit, and I cannot 
define an implicit cast. What I'd really want to do is to have 
SockAddrIPv4 be implicitly castable to SockAddr, so that I can pass a 
SockAddrIPv4 to any function that expects SockAddr.


Except what I'd _really_ like to do is for them to be the same thing. 
I'd like inheritance. Except I can't do that for structs, and if I 
defined SockAddr as a class, I'd mandate allocating it on the GC, 
violating the whole point behind writing Mecca to begin with.


So interestingly, you are accepting the sockaddr by VALUE. Which 
eliminates any possibility of using inheritance meaningfully anyway 
(except that depending how you define SockAddr, it may include all the 
data of the full derived address, sockaddr is quirky that way, and NOT 
like true inheritance).


You CAN use inheritance, just like you would with classes, but you have 
to pass by reference for it to make sense:


struct SockAddr
{
   int addressFamily; // forget what this really is called
   ...
}

struct SockAddrIPv4
{
   SockAddr base;
   ref SockAddr getBase() { return base; }
   alias getBase this;
   ...
}

Now, you can pass SockAddrIPv4 into a ref SockAddr, check the address 
family, and cast to the correct thing. Just like you would with classes 
and inheritance. You can even define nice mechanisms for this.


e.g.:

struct SockAddr
{
   ...
   ref T opCast(T)() if (isSomeSockaddr!T) // `cast` is a keyword; overload opCast instead
   {
      assert(addressFamily == T.requiredAddressFamily);
      return *cast(T*)&this;
   }
}
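
And to see the whole pattern end to end, a self-contained sketch (field names and connectTo are mine, for illustration):

struct SockAddr
{
    int addressFamily;
}

struct SockAddrIPv4
{
    SockAddr base = SockAddr(2 /* AF_INET-ish, illustrative */);
    uint addr;                     // IPv4-specific payload

    ref SockAddr getBase() { return base; }
    alias getBase this;
}

// A function written against the base type only:
void connectTo(ref SockAddr sa)
{
    // dispatch on sa.addressFamily, cast to the concrete type, etc.
}

void main()
{
    SockAddrIPv4 a;
    connectTo(a);   // alias this hands over the embedded SockAddr by ref
}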

To summarize: Weka isn't ditching D, and people aren't even particularly 
angry about it. It has problems, and we've learned to live with them, 
and that's that.


This sounds more like what I would have expected, so thank you for 
clarifying.


The general consensus, however, is that these problems 
will not be resolved (we used to file bugs in Bugzilla. We stopped doing 
that because we saw nothing happen with them), and as far as the future 
of the language goes, that's bad news.


Bugs do get fixed; there is just no assigned timeframe for having them 
fixed. An all-volunteer workforce has this issue.


It took 10 (I think) years for bug 304 to get fixed. It was a huge pain 
in the ass, but it did get fixed.


I wouldn't stop filing them, definitely file them. If they are blocking 
your work, complain about them loudly, every day. But not filing them 
doesn't help anyone.


I'm not saying all bugs you file will be fixed, but all bugs you *don't* 
file will definitely not be fixed.


-Steve


Re: Is @safe still a work-in-progress?

2018-08-23 Thread Steven Schveighoffer via Digitalmars-d

On 8/23/18 4:58 AM, Walter Bright wrote:

On 8/22/2018 6:50 AM, Steven Schveighoffer wrote:

What about:

size_t put(sink, parameters...)

Does this qualify as the sink being the "return" type? Obviously the 
real return can't contain any references, so it trivially can be ruled 
out as the destination of any escaping parameters.


Your reasoning is correct, but currently it only applies with 'void' 
return types.



Or how about a member function that takes a ref parameter? Is `this` 
the "return" or is the ref parameter the "return"?


`this` is the ref parameter. In particular, consider a constructor:

   struct S {
      int* p;
      this(return scope int* p) { this.p = p; }
   }

   int i;
   S s = S(&i);

This code appears in Phobos, and it is very reasonable to expect it to 
check as safe.


What I mean to say is, we have a semantic today -- the return value is 
hooked to any `return` parameters, end of story. This is clear, concise, 
and easy to understand.


You are saying that in some cases, the return value is actually 
deposited in the `this` parameter. In cases where the actual return type 
is void, OK, I see that we can tack on that rule without issues.


Furthermore any member function (or UFCS function for that matter) 
REQUIRES the first parameter to be the aggregate. How do you make a 
member function that stuffs the return into a different parameter 
properly typecheck? What rule do we tack on then? It's going to be 
confusing to anyone who writes their API thinking about how its call 
syntax reads, not how the compiler wants to do flow analysis.


Not to mention, the keyword is `return`, not `returnorfirstparam`. It's 
still going to be confusing no matter how you look at it.


My problem with the idea is that it is going to seem flaky -- we are 
using convention to dictate what is actually the return parameter, vs. 
what semantically happens inside the function. It's going to confuse 
anyone trying to do it a different way. I've experienced this in the 
past with things like toHash, where if you didn't define it with the 
exact signature, it wouldn't actually be used.


I realize obviously, that `put` is already specified. But as I said in 
the bug report, we should think twice about defining rules based 
solely on how Phobos does things, and calling that the solution.


Phobos doesn't do this by accident. It's how constructors work (see 
above) and how pipeline programming works.


Constructors I agree are reasonable to consider `this` to be the return 
value. On that point, I would say we should definitely go ahead with 
making that rule, and I think it will lead to no confusion whatsoever.


pipeline programming depends on returning something other than `void`, 
so I don't see how this applies.


It's more when you are setting members via properties where this comes 
into play. We need it -- we need this ability to tell the compiler "this 
parameter connects to this other parameter". I just don't know if the 
proposed rules are a) good enough for the general case, and b) don't 
cause more confusion than they are worth.


As for things being made "more flexible in the future" this basically 
translates to code breakage. For example, if you are depending on only 
the first parameter being considered the "return" value, and all of a 
sudden it changes to encompass all your parameters, your existing code 
may fail to compile, even if it's correctly safe and properly annotated.


It's a good point. But I don't see an obvious use case for considering 
all the ref parameters as being returns.


You would have to consider the shortest lifetime and assume everything 
goes there. It would restrict your legitimate calls. Only mitigating 
factor may be if you take the ones you aren't going to modify as const 
or inout.


> I want to ensure Atila is successful with this. But that means Phobos
> has to compile with dip1000. So I need to make it work.



I think it's a very worthy goal to make Phobos work, and a great proof 
of concept for dip1000's veracity.


However, one-off rules just to make it work with existing code go 
against that goal IMO. Rules that stand on their own I think will fare 
better than ones that are loopholes to allow existing code to compile.


I couldn't come up with a better idea than this, and this one works.



I want to stress that it may be a valid solution, but we should strive 
to prove the solutions are the best possible rather than just use 
duct-tape methodology.


It should even be considered that perhaps there are better solutions 
even than the approach dip1000 has taken.


I also want to point out that the attitude of 'we could just fix it, but 
nobody will pull my request' is unhelpful. We want to make sure we have 
the best solution possible, don't take criticism as meaningless petty 
bickering. People are genuinely trying to make sure D is improved. 
Hostility towards reviews or debate doesn't foster that.


-Steve


Re: D is dead

2018-08-23 Thread Steven Schveighoffer via Digitalmars-d

On 8/23/18 8:03 AM, Walter Bright wrote:

On 8/23/2018 4:31 AM, Shachar Shemesh wrote:



This is in the language spec:

How many people know that without resorting to the specs.


This is a little unfair. It's plainly stated in the documentation for 
foreach. Heck, I wrote a C compiler and the library for it, and 
yesterday I had to look up again how strncmp worked. I refer to the 
documentation regularly. Back when I designed digital circuits, I had a 
well-worn TTL data book on my desk, too. If it wasn't documented, or 
documented confusingly, it would be a fair point.


On the point of opApply, the choice is quite obvious. Why would you put 
opApply in an aggregate if you didn't want to control foreach behavior? 
Once you think about it, there shouldn't really be any more discussion.



Does it matter if it allows copying or not?

For the preference for opApply, no.

But it does for empty/front/popFront, which is exactly my point.


If front() returns by ref, then no copying happens. If front() returns 
by value, then a copy is made. This should not be surprising behavior.


I think he means, if the range ITSELF doesn't allow copying, it won't 
work with foreach (because foreach makes a copy), but it will work with 
opApply.
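
A self-contained sketch of that point (my own illustration):

import std.stdio : write, writeln;

// Range interface only: foreach has to copy the range into its hidden variable.
struct NonCopyableRange
{
    int[] data;
    @disable this(this);   // copying forbidden

    bool empty() const { return data.length == 0; }
    int front() const { return data[0]; }
    void popFront() { data = data[1 .. $]; }
}

// Same data exposed through opApply: foreach uses the lvalue by ref, no copy.
struct NonCopyableOpApply
{
    int[] data;
    @disable this(this);

    int opApply(scope int delegate(int) dg)
    {
        foreach (x; data)
            if (auto r = dg(x))
                return r;
        return 0;
    }
}

void main()
{
    auto r = NonCopyableRange([1, 2, 3]);
    // foreach (x; r) {}   // error: foreach would copy r, but copying is disabled

    auto o = NonCopyableOpApply([1, 2, 3]);
    foreach (x; o)         // compiles: opApply receives `this` by ref
        write(x, ' ');
    writeln();
}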


If you're referring to #14246, I posted a PR for it. I don't see how 
that is pretending it isn't a problem. It is.
When I first reported this, about 3 and a half years ago, the forum 
explained to me that this is working as expected.


The forum can be anyone saying anything. A more reliable answer would be 
the bugzilla entry being closed as "invalid", which did not happen.


There have been several people who I have spoken with in person, and 
also seen posted here, that say the forum is unfriendly or not open to 
criticism of D. I feel it's the opposite (in fact, most of the die-hard 
supporters are very critical of D), but everyone has their own experiences.


There are many people who post short curt answers, maybe even cynical. 
But this isn't necessarily the authoritative answer. Where I see this 
happening, I usually try to respond with a more correct answer (even 
though my voice isn't authoritative exactly), but the sad truth is that 
we can't spend all our day making sure we have a super-pleasant forum 
where every answer is valid and nobody is rude.


In reply to Shachar's general point:
This whole thread seems very gloomy and final, but the tone does not 
match how, in my mind, D is progressing. "Every single one of the 
people [at Weka] rushing to defend D at the time has since come around." 
Seems like you all have decided to either ditch D internally, maybe 
moving forward, or accepted that Weka will fail eventually due to the 
choice of D? It sure reads that way.


This is in SHARP contrast to the presentation that Liran gave at Dconf 
this year, touting D as a major reason Weka was able to develop what 
they did, and to some degree, your showcase of how Mecca works.


My experience with D is that it has gotten much better over the years. I 
suppose that having worked with the earlier versions, and seeing what 
has happened gives me a different perspective. I guess I just don't have 
that feeling that there are some unfixable problems that will "kill" the 
language. Everything in a programming language is fixable, it just 
matters how much pain you are willing to deal with to fix it. If we get 
to a point where there really is a sticking point, D3 can be born.


I do feel that we need, in general, more developers working on the 
compiler itself. So many of the problems need compiler changes, and the 
learning curve to me just seems so high to get into it.


-Steve


Re: Friends don't let friends use inout with scope and -dip1000

2018-08-22 Thread Steven Schveighoffer via Digitalmars-d

On 8/22/18 4:17 AM, Kagamin wrote:

On Tuesday, 21 August 2018 at 14:04:15 UTC, Steven Schveighoffer wrote:
I would guess it's no different than other inferred attributes. I 
would also guess that it only gets promoted to a return parameter if 
it's actually returned.


If we can't have properly typed parameters, it feels like it has 
potential to prevent some patterns.


But scope is not part of the type, nor is return. One of my biggest 
concerns about dip1000 is that the "scope-ness" or "return-ness" of a 
variable is hidden from the type system. It's just the compiler doing 
flow analysis and throwing you an error when it can't work the thing 
out. I'm more worried about not being able to express the flow in a way 
that the compiler understands, and having it complain about things that 
are actually safe.




This prevents automatic scope promotion:

template escape(T)
{
    int[] escape1(scope int[] r)
    {
        return r;
    }
    alias escape = escape1;
}


But that's not valid dip1000 code. If you call it, it should give a 
compiler error (r *does* escape its scope).


-Steve


Re: Is @safe still a work-in-progress?

2018-08-22 Thread Steven Schveighoffer via Digitalmars-d

On 8/22/18 5:23 AM, Walter Bright wrote:

On 8/21/2018 6:07 PM, Mike Franklin wrote:
The proposed idea wants to make the first parameter, if it's `ref`, 
special.


This is because Phobos is written with functions of the form:

     void put(sink, parameters...)

which corresponds to:

     sink.put(parameters...)

The two forms are fairly interchangeable, made more so by the Uniform 
Function Call Syntax.





 > Why not the first `ref` parameter regardless of whether it's the 
absolute first in the list.  Why not the last `ref` parameter?  Why not 
all `ref` parameters?


Good question. If this fairly restricted form solves the problems, then 
there is no need for the more flexible form. Things can always be made 
more flexible in the future, but tightening things can be pretty 
disruptive. Hence, unless there is an obvious and fairly strong case 
case for the flexibility, then it should be avoided for now.


What about:

size_t put(sink, parameters...)

Does this qualify as the sink being the "return" type? Obviously the 
real return can't contain any references, so it trivially can be ruled 
out as the destination of any escaping parameters.


Or how about a member function that takes a ref parameter? Is `this` the 
"return" or is the ref parameter the "return"?


My problem with the idea is that it is going to seem flaky -- we are 
using convention to dictate what is actually the return parameter, vs. 
what semantically happens inside the function. It's going to confuse 
anyone trying to do it a different way. I've experienced this in the 
past with things like toHash, where if you didn't define it with the 
exact signature, it wouldn't actually be used.


I realize obviously, that `put` is already specified. But as I said in 
the bug report, we should think twice about defining rules based solely 
on how Phobos does things, and calling that the solution.


As for things being made "more flexible in the future" this basically 
translates to code breakage. For example, if you are depending on only 
the first parameter being considered the "return" value, and all of a 
sudden it changes to encompass all your parameters, your existing code 
may fail to compile, even if it's correctly safe and properly annotated.




I want to ensure Atila is successful with this. But that means Phobos 
has to compile with dip1000. So I need to make it work.




I think it's a very worthy goal to make Phobos work, and a great proof 
of concept for dip1000's veracity.


However, one-off rules just to make it work with existing code go 
against that goal IMO. Rules that stand on their own I think will fare 
better than ones that are loopholes to allow existing code to compile.


-Steve


Re: Engine of forum

2018-08-21 Thread Steven Schveighoffer via Digitalmars-d

On 8/21/18 10:08 AM, Ali wrote:

On Tuesday, 21 August 2018 at 05:30:07 UTC, Walter Bright wrote:
Ask 10 people, and you'll get 10 different answers on what a better 
forum would be.


Actually I think we can get 8 out of those 10 to agree:
rust, ocaml, fsharp, nim, scala, clojure... all use 
https://www.discourse.org/

I think this software is nowadays regarded as the best.


Cool! Does it support an interface on top of a newsgroup server? 
Priority #1 in these parts.




If people leave because of the forum software, changing it won't 
change that.


I also agree with that; most people who leave probably leave for more 
objective reasons, like the language not answering their needs, or 
not finding the libraries they needed within the ecosystem, etc.


But what I really meant is that out of those who leave, there is 
possibly a very small percentage who left because they couldn't 
communicate effectively with the community, and that better 
communication channels in general (and better forum software as an 
example) could have kept them around for longer. Replacing the forum 
software is a small change, a small win, and I expect small returns. But 
a small win is a win.


On the contrary, many of the regular contributors here don't give a 
lick about the forum software, as long as it's primarily backed by the 
newsgroup server. Many, including myself, use the NG server; many others 
use the mailing list interface. If the NG were ditched, I would have a 
big problem communicating, as I hate dealing with web forums.


The forum software probably could be better in terms of formatting code 
(see for example vibe.d's forums which are ALSO NG backed and have code 
formatting features). Other than that, editing posts just doesn't make 
sense in terms of a mailing list or newsgroup. And it also doesn't make 
sense in terms of a discussion where things you thought you read 
mysteriously change.


-Steve


Re: Friends don't let friends use inout with scope and -dip1000

2018-08-21 Thread Steven Schveighoffer via Digitalmars-d

On 8/21/18 9:42 AM, Kagamin wrote:

except for templated functions:

int[] escape(scope int[] r)
{
     return r; //error, can't return scoped argument
}

int[] escape(return int[] r)
{
     return r; //ok, just as planned
}

int[] escape(return scope int[] r)
{
     return r; //ok, `return scope` reduced to just `return`
}

int[] escape(T)(scope int[] r)
{
     return r; //ok! `scope` silently promoted to `return`
}

You can't have a strictly scoped parameter in a templated function - it's 
silently promoted to a return parameter. Is this intended?


I would guess it's no different than other inferred attributes. I would 
also guess that it only gets promoted to a return parameter if it's 
actually returned.
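
In other words, something like this is what I'd expect (a guess on my 
part, I haven't verified it):

int[] noEscape(T)(scope int[] r)
{
    return null; // r is never returned, so I'd expect `scope` to stay as-is
}

int[] doesEscape(T)(scope int[] r)
{
    return r;    // here the promotion to `return` would kick in
}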


As long as the *result* is scoped like the parameter. In the case of the 
OP in this thread, there is definitely a problem with inout and the 
connection to the return value.


-Steve


Re: Is @safe still a work-in-progress?

2018-08-20 Thread Steven Schveighoffer via Digitalmars-d

On 8/17/18 11:04 PM, Walter Bright wrote:

On 8/17/2018 11:17 AM, bachmeier wrote:
This is a good example of D needing to evolve or fizzle out. I don't 
see evidence that the community has yet figured out how to evolve the 
language. If it had, these problems would not be around for so many 
years.


We deprecate features of D all the time. (Remember the D1 => D2 
wrenching change?)


Hm... if you are going for "all the time", the example of D1 to D2 
transition is pretty dated.


I'd say more like the addition of UDAs was a big evolution. Or maybe UFCS.

The reason @safe cannot be the default at the moment is because -dip1000 
needs work, and nobody is willing to pitch in and review/pull my PRs on it.


I would, but I have no idea how dip1000 is supposed to work. I think 
only you understand it. Even looking at the PR that you have been citing 
over and over, I can't make heads or tails of what it does or what it 
allows.


-Steve


Re: Friends don't let friends use inout with scope and -dip1000

2018-08-20 Thread Steven Schveighoffer via Digitalmars-d

On 8/20/18 5:43 AM, Nicholas Wilson wrote:

On Monday, 20 August 2018 at 09:31:09 UTC, Atila Neves wrote:

On Friday, 17 August 2018 at 13:39:29 UTC, Steven Schveighoffer wrote:

// used to be scope int* ptr() { return ints; }
scope inout(int)* ptr() inout { return ints; }


Does scope apply to the return value or the `this` reference?


I assumed the return value. I think I've read DIP1000 about a dozen 
times now and I still get confused. As opposed to `const` or 
`immutable`, `scope(T)` isn't a thing so... I don't know?


A type constructor affects the type of something. So const(int) is an 
int that is const.


const int is actually NOT a type constructor, but a storage class. Its 
main effect is to make the int actually const(int), but it can have other 
effects (e.g. if it's a global, it may be put into global storage 
instead of thread-local storage).
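
To illustrate the difference:

const int a = 1;     // storage class syntax; the type of a is const(int)
const(int) b = 2;    // type constructor syntax; same resulting type here
const(int)* p;       // mutable pointer to const int -- only expressible
                     // with the type constructor form
const int* q;        // the storage class covers the whole declaration:
                     // q has type const(int*), so the pointer is const too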


scope is not a type constructor, ever. So how do you specify that the 
return type is scope? How do you specify a difference between the scope of the 
'this' pointer, and the scope of the return value?
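
Just to spell out the ambiguity (this is the question, not a claim 
about what the compiler actually does):

struct S
{
    int* p;
    scope int* a();   // `scope` on the left...
    int* b() scope;   // ...or on the right: which placement constrains
                      // the result, and which constrains `this`?
}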


I'm super-confused as to what dip1000 actually is doing, and how to use it.



What usually happens is that qualifiers to the left of the name apply to 
the return type and those to the right apply to `this`. Not that that 
_should_ make any difference, since lifetime of ints == lifetime of this.




No:

const int* foo() const { return null; }

Error: redundant const attribute.

Up until 2.080, this was a deprecation, and the result was int *


What happens if you remove the return type? (i.e. scope auto)


And write what instead?



scope ptr() inout { return ints; } ?


Yes, this is what I was thinking.

-Steve


Re: Is @safe still a work-in-progress?

2018-08-17 Thread Steven Schveighoffer via Digitalmars-d

On 8/17/18 1:26 PM, jmh530 wrote:

On Friday, 17 August 2018 at 14:26:07 UTC, H. S. Teoh wrote:

[...]

And that is exactly why the whole implementation of @safe is currently 
rather laughable. By blacklisting rather than whitelisting, we 
basically leave the door wide open to loopholes -- anything that we 
haven't thought of yet could potentially be a @safe-breaking 
combination, and we wouldn't know until somebody discovers and reports 
it.


Sadly, it seems there is little interest in reimplementing @safe to 
use whitelisting instead of blacklisting.



T


Fundamentally, I see it as a good idea. Walter has talked about how 
important memory safety is for D. People thinking their @safe code is 
safe is a big problem when that turns out to not be the case. Imagine 
the black eye D would have if a company was hacked because of something 
like this?


This will always be a possibility thanks to @trusted.

IMO, the problem is that you can't just replace @safe as it is now. You 
could introduce something like @whitelist or @safewhitelist and begin 
implementing it, but it would probably be some time before it could 
replace @safe, i.e. once @whitelist is only breaking unsafe code.


I have to say, I don't see how this all helps.

In theory, black-listing and white-listing will get you to the same 
position. Mechanisms to get or use pointers aren't really being added to 
the language any more, so the set of "things" to "list" either black or 
white is finite.


In this thread, we are talking about something that should have been 
black-listed LONG ago, but was not because it was "too useful" (i.e. 
would break too much code). If @safe code was white-listed, nobody would 
use it until it was finished, so it would be theoretical anyway.


Nobody wants a feature that is @safe, but not useful.

However, a bigger problem is that we have a bug that is "fixed" (slicing 
static arrays) but only if you use a feature that doesn't work correctly 
(dip1000). Why? I think the bug should be reopened until dip1000 is the 
default, or it simply gets fixed (i.e. without requiring dip1000).
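
For reference, the hole I mean is the classic one (from memory, so 
treat the details as approximate):

int[] leaked;

void store(int[] arr) @safe { leaked = arr; }

void oops() @safe
{
    int[4] buf;
    store(buf[]); // a slice of stack memory escapes into a global;
                  // accepted as @safe today, only rejected with -dip1000
}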


-Steve


Re: Friends don't let friends use inout with scope and -dip1000

2018-08-17 Thread Steven Schveighoffer via Digitalmars-d

On 8/17/18 3:36 AM, Atila Neves wrote:

Here's a struct:

-
struct MyStruct {
     import core.stdc.stdlib;
     int* ints;
     this(int size) @trusted { ints = cast(int*) malloc(size * int.sizeof); }
     ~this() @trusted { free(ints); }
     scope int* ptr() { return ints; }
}
-

Let's try and be evil with -dip1000:

-
@safe:

// struct MyStruct ...

const(int) *gInt;

void main() {
     auto s = MyStruct(10);
     gInt = s.ptr;
}
-

% dmd -dip1000 scope_inout.d
scope_inout.d(26): Error: scope variable this may not be returned


Yay!

What if instead of `auto` I write `const` instead (or immutable)? This 
is D we're talking about, so none of this boilerplate nonsense of 
writing two (or three) basically identical functions. So:


-
// used to be scope int* ptr() { return ints; }
scope inout(int)* ptr() inout { return ints; }


Does scope apply to the return value or the `this` reference?

What happens if you remove the return type? (i.e. scope auto)


-

% dmd -dip1000 scope_inout.d
% echo $?
0
# nope, no error here

Wait, what? Turns out now it compiles. After some under-the-breath 
mumbling I go hit issues.dlang.org and realise that the issue already 
exists:



https://issues.dlang.org/show_bug.cgi?id=17935


I don't see what this bug report has to do with the given case.




For reasons unfathomable to me, this is considered the _correct_ 
behaviour. Weirder still, writing out the boilerplate that `inout` is 
supposed to save us (mutable, const and immutable versions) doesn't 
compile, which is what one would expect.


So: @safe + inout + scope + dip1000 + custom memory allocation in D gets 
us to the usability of C++ circa 1998. At least now we have valgrind and 
asan I guess.


"What about template this?", I hear you ask. It kinda works. Sorta. 
Kinda. Behold:



scope auto ptr(this T)() { return ints; }


After changing the definition of `ptr` this way the code compiles fine 
and `ints` is escaped. Huh. However, if you change `auto s` to `scope 
s`, it fails to compile as intended. Very weird.


This seems like a straight up bug.



If you change the destructor to `scope` then it also fails to compile 
even if it's `auto s`. Because, _obviously_, that's totally different.


I'd file an issue, but given that the original one is considered not a 
bug for some reason, I have no idea whether what I just wrote is right or 
not.


What I do know is I found multiple ways to do nasty things to memory 
under the guise of @safe and -dip1000, and my understanding was that the 
compiler would save me from myself. In the meanwhile I'm staying away 
from `inout` and putting `scope` on my destructors even if I don't quite 
understand when destructors should be `scope`. Probably always? I have 
no idea.





This doesn't surprise me. I'm beginning to question whether scope 
shouldn't have been a type constructor instead of a storage class. It's 
treated almost like a type constructor in most places, but the language 
grammar makes it difficult to be specific about which part it applies to.


-Steve


Re: More fun with autodecoding

2018-08-08 Thread Steven Schveighoffer via Digitalmars-d

On 8/8/18 4:13 PM, Walter Bright wrote:

On 8/6/2018 6:57 AM, Steven Schveighoffer wrote:
But I'm not sure if the performance is going to be the same, since now 
it will likely FORCE autodecoding on the algorithms that have 
specialized versions to AVOID autodecoding (I think).


Autodecoding is expensive which is why the algorithms defeat it. Nearly 
none actually need it.


You can get decoding if needed by using .byDchar or .by!dchar (forgot 
which it was).


There are byCodePoint and byCodeUnit, of which byCodePoint forces 
auto-decoding.
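
i.e. (quick illustration):

import std.utf : byCodePoint, byCodeUnit;

void example()
{
    string s = "héllo";
    auto units  = s.byCodeUnit;  // iterates char code units, no decoding
    auto points = s.byCodePoint; // iterates dchar, decodes as it goes
}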


The problem is, I want to use this wrapper just like it was a string in 
all respects (including the performance gains had by ignoring 
auto-decoding).


Not trying to give too much away about the library I'm writing, but the 
problem I'm trying to solve is parsing out tokens from a buffer. I want 
to delineate the whole, as well as the parts, but it's difficult to get 
back to the original buffer once you split and slice up the buffer using 
Phobos functions.


Consider that you are searching for something in a buffer. Phobos 
provides all you need to narrow down your range to the thing you are 
looking for. But it doesn't give you a way to figure out where you are 
in the whole buffer.


Up till now, I've done it by weird length math, but it gets tiring (see 
for instance: 
https://github.com/schveiguy/fastaq/blob/master/source/fasta/fasta.d#L125). 
I just want to know where the darned thing I've narrowed down is in the 
original range!
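
As a trivial example of the kind of math I mean (names invented):

import std.algorithm.searching : find;

void example()
{
    string buf = "key=value";
    auto tail = buf.find('=');                // narrowed down to "=value"
    size_t offset = buf.length - tail.length; // 3, recovered by hand
}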


So this wrapper I thought would be a way to use things like you always 
do, but at any point, you just extract a piece of information (a buffer 
reference) that shows where it is in the original buffer. It's quite 
easy to do that part, the problem is getting it to be a drop-in 
replacement for the original type.


Here's where I'm struggling -- because a string provides indexing, 
slicing, length, etc. but Phobos ignores that. I can't make a new type 
that does the same thing. Not only that, but I'm finding the 
specializations of algorithms only work on the type "string", and 
nothing else.
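
Roughly what I keep running into (a simplified sketch; the real type 
carries more):

struct Tracked
{
    string data;
    size_t offset;    // where `data` sits in the original buffer
    alias data this;  // forward length/indexing/slicing to the string
}

// it quacks like a string, but it isn't the type `string`, so any
// specialisation keyed on `string` itself won't accept it
static assert(!is(Tracked == string));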


I'll try using byCodeUnit and see how it fares.

-Steve


  1   2   3   4   5   6   7   8   9   10   >