Re: Why I Like D

2022-01-11 Thread forkit via Digitalmars-d-announce

On Wednesday, 12 January 2022 at 06:27:47 UTC, forkit wrote:


surely this article needs to be balanced, with another article, 
titled 'why I don't like D' ;-) (..but written by someone who 
really knows D).




oh. btw. I'd love to see Walter (or Andrei, or both) write this 
article ;-)




Re: Why I Like D

2022-01-11 Thread forkit via Digitalmars-d-announce
On Wednesday, 12 January 2022 at 02:37:47 UTC, Walter Bright 
wrote:
"Why I like D" is on the front page of HackerNews at the moment 
at number 11.


https://news.ycombinator.com/news


surely this article needs to be balanced, with another article, 
titled 'why I don't like D' ;-) (..but written by someone who 
really knows D).


IMO... the next generation programming language (that will 
succeed) will be defined by its tooling, and not just the 
language.


Language complexity increases the demands on tooling.

I remember Scott Meyers.. "The Last Thing D Needs".. his DConf 2014 talk.

We really need him now.. more than ever ;-)


Re: Why I Like D

2022-01-11 Thread surlymoor via Digitalmars-d-announce
On Wednesday, 12 January 2022 at 02:37:47 UTC, Walter Bright 
wrote:
"Why I like D" is on the front page of HackerNews at the moment 
at number 11.


https://news.ycombinator.com/news


Nice article, especially this paragraph:

In case you are writing a performance critical piece of software, 
remember you can turn off the garbage collector! People on forums 
like to bash that in such case you cannot use many functions from 
standard library. So what? If performances are essential for your 
system you are likely already writing you own utility library with 
highly optimized algorithms and data structures for your use case, 
so you won’t really miss the standard library much.
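
For reference, a minimal sketch of what "turning off the garbage collector" can look like in practice. The function name `sumSquares` is made up for illustration; `GC.disable` and `GC.enable` come from `core.memory`, and `@nogc` asks the compiler to reject GC allocations inside the annotated function. Note that `GC.disable` only suspends automatic collections; it does not by itself stop code from allocating.

```d
import core.memory : GC;

// Hypothetical hot-path function; @nogc makes the compiler reject
// any GC allocation inside it.
@nogc nothrow pure @safe
int sumSquares(const(int)[] xs)
{
    int total = 0;
    foreach (x; xs)
        total += x * x;
    return total;
}

void main()
{
    GC.disable();             // suspend automatic collections
    scope (exit) GC.enable(); // restore on the way out

    int[4] data = [1, 2, 3, 4]; // fixed-size array, lives on the stack
    assert(sumSquares(data[]) == 30);
}
```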


Good luck to the boys and girls in the HN comments as the 
dumpster fire is already raging.


Re: Why I Like D

2022-01-11 Thread ag0aep6g via Digitalmars-d-announce

On 12.01.22 03:37, Walter Bright wrote:
"Why I like D" is on the front page of HackerNews at the moment at 
number 11.


https://news.ycombinator.com/news


https://news.ycombinator.com/item?id=29863557

https://aradaelli.com/blog/why-i-like-d/


Why I Like D

2022-01-11 Thread Walter Bright via Digitalmars-d-announce

"Why I like D" is on the front page of HackerNews at the moment at number 11.

https://news.ycombinator.com/news


Re: fixedstring: a @safe, @nogc string type

2022-01-11 Thread Paul Backus via Digitalmars-d-announce

On Tuesday, 11 January 2022 at 17:55:28 UTC, H. S. Teoh wrote:
Generally, I'd advise not conflating your containers with 
ranges over your containers: I'd make .opSlice return a 
traditional D slice (i.e., const(char)[]) instead of a 
FixedString, and just require writing `[]` when you need to 
iterate over the string as a range:


FixedString!64 mystr;
foreach (ch; mystr[]) { // <-- iterates over const(char)[]
...
}

This way, no redundant copying of data is done during iteration.


It already does this. In D2, `[]` is handled by a zero-argument 
`opIndex` overload, not by `opSlice`. [1] FixedString has such an 
overload [2], and it does, in fact, return a slice.


[1] https://dlang.org/spec/operatoroverloading.html#slice
[2] 
https://github.com/Moth-Tolias/fixedstring/blob/v1.0.0/source/fixedstring.d#L105
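
A minimal sketch of the zero-argument `opIndex` mechanism described above. `SmallBuf` is a hypothetical stand-in written for this example, not FixedString's actual code; the only point is that `buf[]` lowers to `opIndex()` and can hand back a plain slice of the internal storage.

```d
// Hypothetical fixed-capacity buffer, for illustration only.
struct SmallBuf
{
    private char[16] data = '\0';
    private size_t len;

    this(const(char)[] s) @nogc nothrow pure
    {
        assert(s.length <= data.length);
        data[0 .. s.length] = s[];
        len = s.length;
    }

    // `buf[]` is rewritten to this zero-argument opIndex overload.
    const(char)[] opIndex() const @nogc nothrow pure
    {
        return data[0 .. len];
    }
}

void main()
{
    auto buf = SmallBuf("hello");
    size_t count;
    foreach (ch; buf[]) // iterates over const(char)[]; the buffer is not copied
        ++count;
    assert(count == 5);
    assert(buf[] == "hello");
}
```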


Re: fixedstring: a @safe, @nogc string type

2022-01-11 Thread H. S. Teoh via Digitalmars-d-announce
On Tue, Jan 11, 2022 at 11:16:13AM +0000, Moth via Digitalmars-d-announce wrote:
> On Tuesday, 11 January 2022 at 03:20:22 UTC, Salih Dincer wrote:
> > [snip]
> 
> glad to hear you're finding it useful! =]

One minor usability issue I found just glancing over the code: many of
your methods take char[] as argument. Generally, you want const(char)[]
instead, so that it will work with both char[] and immutable(char)[].
No reason why you can't copy some immutable chars into a FixedString,
for example.
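
A small sketch of the `const(char)[]` point, using a made-up helper (`countSpaces` is not part of FixedString). Both `char[]` and `immutable(char)[]` (i.e. `string`) convert implicitly to `const(char)[]`, so one signature covers both callers:

```d
// Hypothetical helper; the name is illustrative only.
size_t countSpaces(const(char)[] s) @safe @nogc nothrow pure
{
    size_t n;
    foreach (c; s)
        if (c == ' ')
            ++n;
    return n;
}

void main()
{
    char[] mutable = "a b c".dup; // char[]
    string immut = "d e";         // immutable(char)[]
    assert(countSpaces(mutable) == 2);
    assert(countSpaces(immut) == 1);
}
```

Had the parameter been declared as `char[]`, the second call would not compile.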

Another potential issue is with the range interface. Your .popFront is
implemented by copying the entire buffer 1 char forwards, which can
easily become a hidden performance bottleneck. Iteration over a
FixedString currently is O(N^2), which is a problem if performance is
your concern.

Generally, I'd advise not conflating your containers with ranges over
your containers: I'd make .opSlice return a traditional D slice (i.e.,
const(char)[]) instead of a FixedString, and just require writing `[]`
when you need to iterate over the string as a range:

FixedString!64 mystr;
foreach (ch; mystr[]) { // <-- iterates over const(char)[]
...
}

This way, no redundant copying of data is done during iteration.

Another issue is the way concatenation is implemented. Since
FixedStrings have compile-time size, this potentially means every time
you concatenate a string in your code you get another instantiation of
FixedString. This can lead to a LOT of template bloat if you're not
careful, which may quickly outweigh any benefits you may have gained
from not using the built-in strings.
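
To make the template bloat concern concrete, here is a rough sketch with a hypothetical `Fixed` type (not the library's code, and the concatenation operator is my own assumption about how such a type might be written). Every distinct capacity is a distinct type, so each concatenation below forces a new instantiation:

```d
// Hypothetical fixed-capacity type, for illustration only.
struct Fixed(size_t n)
{
    char[n] data = ' ';

    // The result capacity is the sum of both capacities, so the
    // result is a different instantiation of the template.
    auto opBinary(string op, size_t m)(Fixed!m rhs) const
        if (op == "~")
    {
        Fixed!(n + m) result;
        result.data[0 .. n] = data[];
        result.data[n .. $] = rhs.data[];
        return result;
    }
}

void main()
{
    Fixed!4 a;
    Fixed!8 b;
    auto ab = a ~ b;   // instantiates Fixed!12
    auto abb = ab ~ b; // instantiates Fixed!20
    static assert(is(typeof(ab) == Fixed!12));
    static assert(is(typeof(abb) == Fixed!20));
    static assert(!is(typeof(ab) == typeof(abb)));
}
```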


> hm, i'm not sure how i would go about fixing that double character
> issue. i know there's currently some weirdness with wchars / dchars
> equality that needs to be fixed [shouldn't be too much trouble, just
> need to set aside the time for it], but i think being able to tell how
> many chars there are in a glyph requires unicode awareness? i'll look
> into it.
[...]

Yes, you will require Unicode-awareness, and no, it will NOT be as
simple as you imagine.

First of all, you have the wide-character issue: if you're dealing with
anything outside of the ASCII range, you will need to deal with code
points (potentially wchar, dchar).  You can either take the lazy way out
(FixedString!(n, wchar), FixedString!(n, dchar)), but that will
exacerbate your template bloat very quickly. Plus, it wastes a lot of
memory, esp. if you start using dchar[] -- 4 bytes per character
potentially makes ASCII strings use up 4x more memory. (And even if you
decide using dchar[] isn't a concern, there's still the issue of
graphemes -- see below, which requires non-trivial decoding anyway.)

Or you can handle UTF-8, which is a better solution in terms of memory
usage. But then you will immediately run into the encoding/decoding
problem. Your .opSlice, for example, will not work correctly unless you
auto-decode. But that will be a performance hit -- this is one of the
design mistakes in hindsight that's still plaguing Phobos today. IMO the
better approach is to iterate over the string *without* decoding, but
just detecting codepoint boundaries.  Regardless, you will need *some*
way of iterating over code points instead of code units in order to deal
with this properly.
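
As a rough illustration of detecting code point boundaries without decoding: in UTF-8 every continuation byte has the bit pattern 10xxxxxx, so any code unit that is not a continuation byte starts a new code point. The helper name `countCodePoints` is made up, and the sketch assumes the input is already valid UTF-8.

```d
// Counts code points by skipping UTF-8 continuation bytes (10xxxxxx).
// Assumes valid UTF-8; no validation is attempted.
size_t countCodePoints(const(char)[] s) @safe @nogc nothrow pure
{
    size_t n;
    foreach (char c; s)
        if ((c & 0xC0) != 0x80)
            ++n;
    return n;
}

void main()
{
    assert(countCodePoints("abc") == 3);
    assert("\u00E9".length == 2);               // U+00E9 is two code units
    assert(countCodePoints("\u00E9") == 1);     // but one code point
    assert(countCodePoints("a\U0001F600") == 2); // 1 + 4 code units
}
```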

But that's only the beginning of the story. In Unicode, a "code point"
is NOT what most people imagine a "character" is. For most European
languages the two do coincide, but once you go outside of that, you'll
start finding things like accented characters that are composed of
multiple code points.  In Unicode, that's called a Grapheme, and here's
the bad news: the length of a Grapheme is technically unbounded (even
though in practice it's usually 2 or occasionally 3 -- but you *will*
find more on rare occasions). And worst of all, determining the length
of a grapheme requires an expensive, non-trivial algorithm that will
KILL your performance if you blindly do it every time you traverse your
string.

And generally, you don't *want* to do grapheme segmentation anyway --
most code doesn't even care what the graphemes are, it just wants to
treat strings as opaque data that you may occasionally want to segment
into substrings (and substrings don't necessarily require grapheme
segmentation to compute, depending on what the final goal is). But
occasionally you *will* need grapheme segmentation (e.g., if you need to
know how many visual "characters" there are in a string); for that, you
will need std.uni. And no, it's not something you can implement
overnight.  It requires some heavy-duty lookup tables and a (very
careful!) implementation of UAX #29 (Unicode Text Segmentation).

Because of the foregoing, you have at least 4 different definitions of
the length of the string:

1. The number of code units it occupies, i.e., the number of chars /
wchars / dchars.

2. The number of code points it contains, which, in UTF-8, is a
non-trivial quantity that requires iterating over the entire string to
compute. Or 

Re: Error message formatter for range primitives

2022-01-11 Thread WebFreak001 via Digitalmars-d-announce
On Wednesday, 5 January 2022 at 09:32:36 UTC, Robert Schadek 
wrote:
In 
https://forum.dlang.org/post/tfdycnibnxyryizec...@forum.dlang.org I complained
that error messages related to range primitives like isInputRange, especially on
template constraints, are not great.

[...]


cool!

As I'm not a fan of needing to refactor code I made my first DMD 
PR to try to make it possible to include this in phobos here: 
https://github.com/dlang/dmd/pull/13511


```d
source/app.d(43,5): Error: template `app.fun` cannot deduce function from argument types `!()(Sample1)`
source/app.d(22,6):Candidates are: `fun(T)(T t)`
  with `T = Sample1`
  must satisfy the following constraint:
`   isInputRange!T: Sample1 is not an InputRange because:
the function 'popFront' does not exist`
source/app.d(24,6):`fun(T)(T t)`
  with `T = Sample1`
  must satisfy the following constraint:
`   isRandomAccessRange!T: Sample1 is not an RandomAccessRange because
the function 'popFront' does not exist
and the property 'save' does not exist
and must allow for array indexing, aka. [] access`
```
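
For context, a sketch of the kind of code that would trigger a message like the one above. The definitions of `Sample1`, `Sample2` and `fun` here are my guess, reconstructed from the error text, and not the code from the original post; `isInputRange` is the real trait from `std.range.primitives`.

```d
import std.range.primitives : isInputRange;

// Hypothetical type: has `empty` and `front` but no `popFront`,
// so it does not satisfy isInputRange.
struct Sample1
{
    @property bool empty() const { return true; }
    @property int front() const { return 0; }
}

// Hypothetical type with all three input range primitives.
struct Sample2
{
    int i;
    @property bool empty() const { return i >= 3; }
    @property int front() const { return i; }
    void popFront() { ++i; }
}

void fun(T)(T t) if (isInputRange!T) {}

void main()
{
    static assert(!isInputRange!Sample1);
    static assert( isInputRange!Sample2);
    // Calling fun(Sample1()) fails the template constraint.
    static assert(!__traits(compiles, fun(Sample1())));
    fun(Sample2());
}
```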


Re: fixedstring: a @safe, @nogc string type

2022-01-11 Thread WebFreak001 via Digitalmars-d-announce

On Tuesday, 11 January 2022 at 11:16:13 UTC, Moth wrote:

On Tuesday, 11 January 2022 at 03:20:22 UTC, Salih Dincer wrote:

[snip]


glad to hear you're finding it useful! =]

hm, i'm not sure how i would go about fixing that double 
character issue. i know there's currently some weirdness with 
wchars / dchars equality that needs to be fixed [shouldn't be 
too much trouble, just need to set aside the time for it], but 
i think being able to tell how many chars there are in a glyph 
requires unicode awareness? i'll look into it.


[...]


you can relatively easily find out how many bytes a string takes 
up with `std.utf`. You can also iterate by code points (`std.utf`) 
or graphemes (`std.uni`) if you want to translate some kind of 
character index to byte position.


HOWEVER it's not clear what a character is. Sure, for the cases 
posted here it's no problem, but when it comes to languages based 
on combining glyphs together to form new glyphs, it's no longer 
clear what a character is. There are Graphemes (grapheme 
clusters), which are probably the closest to what everybody would 
think a character is, but IIRC they have edge cases a programmer 
wouldn't expect, like appending a character not increasing the 
character count of the string because it merges with the last 
Grapheme. Additionally, there is a performance cost to using 
Graphemes over simpler things like codepoints, which fit 98% of 
use-cases with strings. In D, a codepoint maps 1:1 to a dchar, 
and takes up to 2 wchars or up to 4 chars. You can use `std.utf` 
to compute byte lengths for a codepoint given a string.
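
A short sketch of the different "lengths" discussed here, using Phobos functions that exist as far as I know (`byDchar` and `codeLength` from `std.utf`, `byGrapheme` from `std.uni`). The two helper functions are made-up wrappers for readability.

```d
import std.range : walkLength;
import std.uni : byGrapheme;
import std.utf : byDchar, codeLength;

// Hypothetical convenience wrappers.
size_t codePointCount(string s) { return s.byDchar.walkLength; }
size_t graphemeCount(string s) { return s.byGrapheme.walkLength; }

void main()
{
    // "e" followed by U+0301 COMBINING ACUTE ACCENT: one visible character.
    string s = "e\u0301";

    assert(s.length == 3);          // UTF-8 code units (bytes)
    assert(codePointCount(s) == 2); // code points
    assert(graphemeCount(s) == 1);  // grapheme clusters

    // Code units needed to encode a single code point.
    assert(codeLength!char('\u20AC') == 3);      // EURO SIGN in UTF-8
    assert(codeLength!wchar('\u20AC') == 1);     // in UTF-16
    assert(codeLength!wchar('\U0001F600') == 2); // surrogate pair in UTF-16
}
```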


I would rather suggest you support FixedString with types other 
than `char` (wchar, dchar; heck, users could even use any 
arbitrary type and treat this as a fixed-capacity array). For 
languages that commonly use more than 1 byte per codepoint, or for 
interop with Win32 unicode APIs, JavaScript strings, C# strings, 
UTF16 files in general, etc., programmers might opt to use 
FixedString with wchar then.


With D's templates that should be quite easy to do (add a 
template parameter to the struct like `struct FixedString(size_t 
maxSize, CharT = char)` and replace all usage of char in your 
code with `CharT` in this case)
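
Something along these lines, as a sketch only. The struct layout, constructor and `length` member below are assumptions made for the example and not the library's actual implementation; only the template signature is taken from the suggestion above.

```d
// Sketch of a character-type-generic fixed string; hypothetical layout.
struct FixedString(size_t maxSize, CharT = char)
{
    private CharT[maxSize] data;
    private size_t len;

    this(const(CharT)[] s)
    {
        assert(s.length <= maxSize, "input exceeds capacity");
        data[0 .. s.length] = s[];
        len = s.length;
    }

    const(CharT)[] opIndex() const { return data[0 .. len]; }
    size_t length() const { return len; }
}

void main()
{
    auto a = FixedString!8("hi");           // CharT defaults to char
    auto w = FixedString!(8, wchar)("hi"w); // UTF-16 code units
    static assert(is(typeof(a[]) == const(char)[]));
    static assert(is(typeof(w[]) == const(wchar)[]));
    assert(a.length == 2);
    assert(w[] == "hi"w);
}
```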


Re: fixedstring: a @safe, @nogc string type

2022-01-11 Thread vit via Digitalmars-d-announce

On Tuesday, 11 January 2022 at 11:16:13 UTC, Moth wrote:

On Tuesday, 11 January 2022 at 03:20:22 UTC, Salih Dincer wrote:

[snip]


glad to hear you're finding it useful! =]

... i know there's currently some weirdness with wchars / 
dchars equality that needs to be fixed [shouldn't be too much 
trouble...




If you try mixing char/wchar/dchar, you need encoding/decoding 
for utf-8, utf-16 and utf-32 (maybe even LE/BE). It becomes 
complicated very fast...
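
A minimal sketch of what comparing across encodings involves, assuming well-formed input. `sameCodePoints` is a made-up helper; `byDchar`, `toUTF16` and `toUTF8` are from `std.utf`. Byte order (LE/BE) only matters once the code units are serialized to bytes, which this sketch does not cover.

```d
import std.algorithm.comparison : equal;
import std.utf : byDchar, toUTF16, toUTF8;

// Hypothetical helper: compares two strings of possibly different
// encodings code point by code point.
bool sameCodePoints(A, B)(A a, B b)
{
    return equal(a.byDchar, b.byDchar);
}

void main()
{
    string  s8  = "gr\u00FC\u00DFe";  // UTF-8
    wstring s16 = s8.toUTF16;         // UTF-16
    dstring s32 = "gr\u00FC\u00DFe"d; // UTF-32

    // Same text, different code unit counts.
    assert(s8.length == 7 && s16.length == 5 && s32.length == 5);

    assert(sameCodePoints(s8, s16));
    assert(sameCodePoints(s16, s32));
    assert(s16.toUTF8 == s8);
}
```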






Re: fixedstring: a @safe, @nogc string type

2022-01-11 Thread Moth via Digitalmars-d-announce

On Tuesday, 11 January 2022 at 03:20:22 UTC, Salih Dincer wrote:

[snip]


glad to hear you're finding it useful! =]

hm, i'm not sure how i would go about fixing that double 
character issue. i know there's currently some weirdness with 
wchars / dchars equality that needs to be fixed [shouldn't be too 
much trouble, just need to set aside the time for it], but i 
think being able to tell how many chars there are in a glyph 
requires unicode awareness? i'll look into it.


what's your usecase for usefulCapacity()?