On 11/21/10 11:59 PM, Rainer Deyke wrote:
On 11/21/2010 21:56, Andrei Alexandrescu wrote:
On 11/21/10 22:09 CST, Rainer Deyke wrote:
On 11/21/2010 17:31, Andrei Alexandrescu wrote:
char[] and wchar[] fail to provide some of the guarantees of all other
instances of T[].

What exactly are those guarantees?

That the range view and the array view provide direct access to the same
data.

Where do ranges state that assumption?

One of the useful features of most arrays is that an array of T can be
treated as a range of T.  However, this feature is missing for arrays of
char and wchar.

This is not a guarantee by ranges, it's just a mistaken assumption.

    - When writing code that uses T[], it is often natural to mix
range-based access and index-based access, with the assumption that both
provide direct access to the same underlying data.  However, with char[]
this assumption is incorrect, as the underlying data is transformed when
viewing the array as a range.  This means that generic code that uses
T[] must take special consideration of char[] or it may unexpectedly
produce incorrect results when T = char.

What you're saying is that you write generic code that requires T[], and
then the code itself uses front, popFront, and other range-specific
functions in conjunction with it.

No, I'm saying that I write generic code that declares T[] and then
passes it off to a function that operates on ranges, or to a foreach loop.

A function that operates on ranges would have an appropriate constraint so it would work properly or not at all. foreach works fine with all arrays.

But this is exactly the problem. If you want to use range primitives,
you submit to the requirement of ranges. So you write the generic
function to ask for ranges (with e.g. isForwardRange etc). Otherwise
your code is incorrect.
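
The point about asking for ranges explicitly can be illustrated with a small sketch. It assumes a current Phobos in which std.range exposes isForwardRange, ElementType and the array range primitives, and in which narrow strings are decoded when viewed as ranges. The name countItems is hypothetical and used only for illustration.

```d
import std.range;

// Hypothetical generic function: it asks for a forward range, so it only
// relies on empty/front/popFront/save and never on indexing.
size_t countItems(R)(R r)
    if (isForwardRange!R)
{
    size_t n;
    for (auto copy = r.save; !copy.empty; copy.popFront())
        ++n;
    return n;
}

void main()
{
    // For an ordinary array the range elements are the array elements.
    static assert(is(ElementType!(int[]) == int));
    assert(countItems([1, 2, 3]) == 3);

    // For a string the range elements are decoded code points (dchar),
    // so the count can differ from .length, which counts code units.
    static assert(is(ElementType!string == dchar));
    string s = "\u00E9!";          // U+00E9 takes two UTF-8 code units
    assert(s.length == 3);
    assert(countItems(s) == 2);
}
```

Because the constraint names the range interface, the function is consistent for every T, including char; it simply sees a string as a range of dchar.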

Again, my generic function declares the array as a local variable or a
member variable.  It cannot declare a generic range.

If you want to work with arrays, use a[0] to access the front, a[$ - 1]
to access the back, and a = a[1 .. $] to chop off the first element of
the array. It is not AT ALL natural to mix those with a.front, a.back
etc. Why not? Because std.range defines them with specific
meanings for arrays in general and for arrays of characters in
particular. If you submit to use std.range's abstraction, you submit to
using it the way it is defined.
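
A minimal sketch of the two interfaces side by side, assuming Phobos's decoding behavior for front, back and popFront on strings. The helper names dropViaSlice and dropViaRange are hypothetical.

```d
import std.range;

// Hypothetical helpers, named only for this illustration.
string dropViaSlice(string s) { return s[1 .. $]; }        // built-in: drops one code unit
string dropViaRange(string s) { s.popFront(); return s; }  // std.range: drops one code point

void main()
{
    string s = "\u00E9!";   // U+00E9 is two UTF-8 code units, '!' is one

    // Built-in array interface: elements are code units.
    static assert(is(typeof(s[0]) == immutable(char)));
    assert(s[0] == 0xC3);          // first byte of the two-byte sequence
    assert(s[$ - 1] == '!');

    // std.range interface: elements are decoded code points.
    static assert(is(typeof(s.front) == dchar));
    assert(s.front == '\u00E9');
    assert(s.back == '!');

    // The two ways of removing "the first element" disagree for strings.
    assert(dropViaSlice(s).length == 2);   // leaves a stray continuation byte
    assert(dropViaRange(s) == "!");
}
```

Each interface is consistent on its own; the results differ only when the two are mixed on the same char[] or wchar[].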

It absolutely is natural to mix these in code that is written without
consideration for strings, especially when you consider that foreach
also uses the range interface.

Let's say I have an array and I want to iterate over the first ten
items.  My first instinct would be to write something like this:

   foreach (item; array[0 .. 10]) {
     doSomethingWith(item);
   }

Simple, natural, readable code.  Broken for arrays of char or wchar, but
in a way that is difficult to detect.

Why is it broken? Please try it to convince yourself of the contrary.
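
For anyone who wants to try it, here is a small sketch, assuming the built-in foreach semantics for arrays of characters. The function names countUnits and countPoints are hypothetical.

```d
// Hypothetical helpers, named only for this illustration.

// foreach with an inferred element type walks the array element by
// element, i.e. code unit by code unit, with no decoding.
size_t countUnits(const(char)[] a)
{
    size_t n;
    foreach (item; a)
        ++n;
    return n;
}

// Decoding happens in foreach only when dchar is requested explicitly.
size_t countPoints(const(char)[] a)
{
    size_t n;
    foreach (dchar c; a)
        ++n;
    return n;
}

void main()
{
    char[] array = "0123456789abc".dup;
    foreach (item; array[0 .. 10])
        static assert(is(typeof(item) == char));
    assert(countUnits(array[0 .. 10]) == 10);

    // Non-ASCII data: the slice still yields exactly as many items
    // as it has code units.
    string s = "\u00E9\u00E9";             // four code units, two code points
    assert(countUnits(s[0 .. 3]) == 3);
    assert(countPoints(s) == 2);
}
```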

So: if you want to use char[] as an array with the built-in array
interface, no problem. If you want to use char[] as a range with the
range interface as defined by std.range, again no problem. But asking
for one and then surreptitiously using the other is simply incorrect
code. You can't use std.range while at the same time complaining you
can't be bothered to read its docs.

This would sound reasonable if I were using char[] directly.  I'm not.
I'm using T[] in a generic context.  I may not have considered the case
of T = char when I wrote the code.  The code may even have originally
used Widget[] before I decided to make it generic.

Fine. Use T[] generically in conjunction with the array primitives. If you plan to use them with the range primitives, you do as ranges do.

I challenge you to define an alternative built-in string that fares
better than string & Comp. Before long you'll be overwhelmed by the
various necessities imposed by your constraints.

Easy:
   - string_t becomes a keyword.
   - Syntactically speaking, string_t!T is the name of a type when T is a
type.
   - For every built-in character type T (including const and immutable
versions), the type currently called T[] is now called string_t!T, but
otherwise maintains all of its current behavior.
   - For every other type T, string_t!T is an error.
   - char[] and wchar[] (including const and immutable versions) are
plain arrays of code units, even when viewed as a range.

It's not my preferred solution, but it's easy to explain, it fixes the
main problem with the current system, and it only costs one keyword.

(I'd rather treat string_t as a library template with compiler support
like and rename it to String, but then it wouldn't be a built-in string.)

I very much prefer the current state of affairs.


Andrei
