Re: Range functions expand char to dchar

2015-09-09 Thread Freddy via Digitalmars-d

On Tuesday, 8 September 2015 at 18:28:40 UTC, Matt Kline wrote:

> A bit verbose, but I suppose that will do.

You could use map
---
import std.algorithm : map;
import std.utf : byCodeUnit;
import std.array : array;

auto arr = ["foo", "bar", "baz"].map!(a => a.byCodeUnit).array;
---
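
If the intermediate array isn't needed, the mapped range can be fed straight into joiner. What follows is only a sketch, not something from the thread: the "|" separator and the sample input to matchFirst are made up for illustration, and it assumes joiner accepts the map result the same way it accepts an array of byCodeUnit ranges.

```d
import std.algorithm : joiner, map;
import std.array : array;
import std.range : ElementType;
import std.regex : matchFirst, regex;
import std.utf : byCodeUnit;

void main()
{
    // Wrap every string as a code unit range, then join lazily.
    auto joined = ["foo", "bar", "baz"]
        .map!(a => a.byCodeUnit)
        .joiner("|".byCodeUnit);

    // Elements stay chars; nothing is decoded to dchar.
    static assert(is(ElementType!(typeof(joined)) : char));

    // regex() wants an actual string, so materialize the range once.
    auto pattern = joined.array;
    assert(pattern == "foo|bar|baz");

    // Hypothetical input, only to show the pattern is usable.
    auto m = matchFirst("a bar here", regex(pattern));
    assert(!m.empty && m.hit == "bar");
}
```

The only allocation here is the final `.array` call; everything before it is lazy.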


Re: Range functions expand char to dchar

2015-09-08 Thread anonymous via Digitalmars-d
On Tuesday 08 September 2015 20:28, Matt Kline wrote:

> If we have a range of char elements, won't that do? regex() uses 
> the standard isSomeString!S constraint to take any range of chars.

isSomeString!S doesn't check if S is a range. It checks if S is "some 
string", meaning: "Char[], where Char is any of char, wchar or dchar, with 
or without qualifiers".

http://dlang.org/phobos/std_traits.html#isSomeString

Checking for ranges would be done with isInputRange, isForwardRange, etc.

http://dlang.org/phobos/std_range_primitives.html
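
To make the difference concrete, here is a small sketch (the alias name CodeUnits is mine, purely for illustration) that checks both traits at compile time, assuming a Phobos where auto-decoding is still the default:

```d
import std.range.primitives : ElementType, isInputRange;
import std.traits : isSomeString;
import std.utf : byCodeUnit;

// Illustrative alias: the wrapper type byCodeUnit returns for a string.
alias CodeUnits = typeof("foo".byCodeUnit);

// A plain string passes both checks; viewed as a range,
// its element type is the auto-decoded dchar.
static assert(isSomeString!string);
static assert(isInputRange!string);
static assert(is(ElementType!string == dchar));

// The byCodeUnit wrapper is a range of chars,
// but it is a struct, so it is not "some string".
static assert(isInputRange!CodeUnits);
static assert(is(ElementType!CodeUnits : char));
static assert(!isSomeString!CodeUnits);

void main() {}
```

So a template constrained on isSomeString!S rejects the wrapper even though its elements are chars.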


Re: Range functions expand char to dchar

2015-09-08 Thread Dmitry Olshansky via Digitalmars-d

On 08-Sep-2015 20:57, Matt Kline wrote:

> On Tuesday, 8 September 2015 at 17:52:13 UTC, Matt Kline wrote:
>> Whether by design or by oversight, this is quite undesirable.
>
> My apologies for double-posting, but is this intended behavior, or an
> unfortunate consequence of the metaprogramming used to determine the
> resulting type of these range functions?




It is a historical consequence of enabling auto-decoding for arrays of char
and wchar (and only those). Today it's recognized that one should explicitly
wrap an array of char as either a code unit range or a code point range,
using the byUTF helper.
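
A minimal sketch of the two explicit wrappings. I am assuming here that std.utf.byCodeUnit and std.utf.byDchar are the kind of helpers meant; the sample string is made up.

```d
import std.range : walkLength;
import std.utf : byCodeUnit, byDchar;

void main()
{
    // Hypothetical sample: U+00E9 takes two code units in UTF-8.
    string s = "h\u00E9llo";

    // Code unit view: elements are chars, no decoding happens.
    assert(s.byCodeUnit.walkLength == 6);

    // Code point view: elements are dchars, decoded on the fly.
    assert(s.byDchar.walkLength == 5);
}
```

Either way the choice is visible at the call site instead of being made implicitly by front().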


--
Dmitry Olshansky


Range functions expand char to dchar

2015-09-08 Thread Matt Kline via Digitalmars-d
After seeing Walter's DConf presentation from this year, I've 
been making an effort to use range algorithms more, such as using 
chain() and joiner() as an alternative to array concatenation and 
std.array.join.


Unfortunately, doing so with strings has been problematic, as 
these algorithms expand strings into dstrings.


An example:

import std.algorithm;
import std.range;
import std.stdio;
import std.regex;

void main()
{
    // One would expect this to be a range of chars
    auto test = chain("foo", "bar", "baz");
    // prints "dchar"
    writeln(typeid(typeof(test.front)));

    auto arr = ["foo", "bar", "baz"];
    auto joined = joiner(arr, ", ");
    // Also "dchar"
    writeln(typeid(typeof(joined.front)));

    // Problems ensue if one assumes the result of joined is a char string.
    auto r = regex(joined);
    matchFirst("won't compile", r); // Compiler error
}

Whether by design or by oversight, this is quite undesirable. It 
violates the principle of least astonishment (one wouldn't expect 
joining a bunch of strings would result in a dstring), causing 
issues such as the one shown above. And, if I aim to use UTF-8 
consistently throughout my applications (see 
http://utf8everywhere.org/), what am I to do?


Re: Range functions expand char to dchar

2015-09-08 Thread Matt Kline via Digitalmars-d

On Tuesday, 8 September 2015 at 17:52:13 UTC, Matt Kline wrote:


> Whether by design or by oversight, this is quite undesirable.


My apologies for double-posting, but is this intended behavior, 
or an unfortunate consequence of the metaprogramming used to 
determine the resulting type of these range functions?





Re: Range functions expand char to dchar

2015-09-08 Thread anonymous via Digitalmars-d
On Tuesday 08 September 2015 19:52, Matt Kline wrote:

> An example:
> 
> import std.algorithm;
> import std.range;
> import std.stdio;
> import std.regex;
> 
> void main()
> {
>  // One would expect this to be a range of chars
>  auto test = chain("foo", "bar", "baz");
>  // prints "dchar"
>  writeln(typeid(typeof(test.front)));
> 
>  auto arr = ["foo", "bar", "baz"];
>  auto joined = joiner(arr, ", ");
>  // Also "dchar"
>  writeln(typeid(typeof(joined.front)));
> 
>  // Problems ensue if one assumes the result of joined is a char string.
>  auto r = regex(joined);
>  matchFirst("won't compile", r); // Compiler error
> }
> 
> Whether by design or by oversight,

By design with regrets:
http://forum.dlang.org/post/m01r3d$1frl$1...@digitalmars.com

> this is quite undesirable. It 
> violates the principle of least astonishment (one wouldn't expect 
> joining a bunch of strings would result in a dstring),

Strictly speaking, the result is a range of dchars, not a dstring.
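
A quick compile-time check of that distinction, as a sketch only (the two-element input is made up):

```d
import std.algorithm : equal, joiner;
import std.range : ElementType;

void main()
{
    auto joined = joiner(["foo", "bar"], ", ");

    // The elements are dchars...
    static assert(is(ElementType!(typeof(joined)) == dchar));

    // ...but the range itself is a lazy struct, not a dstring.
    static assert(!is(typeof(joined) == dstring));

    // It still yields the expected characters when iterated.
    assert(joined.equal("foo, bar"));
}
```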

> causing 
> issues such as the one shown above. And, if I aim to use UTF-8 
> consistently throughout my applications (see 
> http://utf8everywhere.org/), what am I to do?

You can use std.utf.byCodeUnit to get ranges of chars:


import std.algorithm;
import std.array: array;
import std.range;
import std.stdio;
import std.regex;
import std.utf: byCodeUnit;

void main()
{
    auto test = chain("foo".byCodeUnit, "bar".byCodeUnit, "baz".byCodeUnit);
    pragma(msg, typeof(test.front)); /* "immutable(char)" */

    auto arr = ["foo".byCodeUnit, "bar".byCodeUnit, "baz".byCodeUnit];
    auto joined = joiner(arr, ", ".byCodeUnit);
    pragma(msg, typeof(joined.front)); /* "immutable(char)" */

    /* Having char elements isn't enough. Need to turn the range into an
    array via std.array.array: */
    auto r = regex(joined.array);
    matchFirst("won't compile", r); /* compiles */
}


Alternatively, since you have to materialize `joined` into an array anyway, 
you can use the dchar range and make a string from it when passing to 
`regex`:


import std.algorithm;
import std.conv: to;
import std.stdio;
import std.regex;

void main()
{
    auto arr = ["foo", "bar", "baz"];
    auto joined = joiner(arr, ", ");
    pragma(msg, typeof(joined.front)); /* "dchar" */

    /* to!string now: */
    auto r = regex(joined.to!string);
    matchFirst("won't compile", r); /* compiles */
}



Re: Range functions expand char to dchar

2015-09-08 Thread Matt Kline via Digitalmars-d

On Tuesday, 8 September 2015 at 18:21:34 UTC, anonymous wrote:

> By design with regrets:
> http://forum.dlang.org/post/m01r3d$1frl$1...@digitalmars.com

On Thursday, 25 September 2014 at 19:40:29 UTC, Walter Bright wrote:
> Top of my list would be the auto-decoding behavior of
> std.array.front() on character arrays. Every time I'm faced
> with that I want to throw a chair through the window.


At least I'm not alone. :)


> You can use std.utf.byCodeUnit to get ranges of chars:


A bit verbose, but I suppose that will do.

> /* Having char elements isn't enough. Need to turn the range into an
> array via std.array.array: */
> auto r = regex(joined.array);
> matchFirst("won't compile", r); /* compiles */
> }


If we have a range of char elements, won't that do? regex() uses 
the standard isSomeString!S constraint to take any range of chars.