Re: VLERange: a range in between BidirectionalRange and RandomAccessRange

Steven Schveighoffer Thu, 13 Jan 2011 14:05:19 -0800

On Thu, 13 Jan 2011 15:51:00 -0500, Andrei Alexandrescu <seewebsiteforem...@erdani.org> wrote:

On 1/13/11 11:35 AM, Steven Schveighoffer wrote:

On Thu, 13 Jan 2011 14:08:36 -0500, Andrei Alexandrescu
<seewebsiteforem...@erdani.org> wrote:

Let's take a look:


// Incorrect string code
void fun(string s) {
    foreach (i; 0 .. s.length) {
        writeln("The character in position ", i, " is ", s[i]);
    }
}

// Incorrect string_t code
void fun(string_t!char s) {
    foreach (i; 0 .. s.codeUnits) {
        writeln("The character in position ", i, " is ", s[i]);
    }
}

Both functions are incorrect, albeit in different ways. The only
improvement I'm seeing is that the user needs to write codeUnits
instead of length, which may make her think twice. Clearly, however,
copiously incorrect code can be written with the proposed interface
because it tries to hide the reality that underneath a variable-length
encoding is being used, but doesn't hide it completely (albeit for
good efficiency-related reasons).


You might be looking at my previous version. The new version (recently
posted) will throw an exception for that code if a multi-code-unit
code-point is found.

I was looking at your latest. It's code that compiles and runs, but dynamically fails on some inputs. I agree that it's often better to fail noisily instead of silently, but in a manner of speaking the string-based code doesn't fail at all - it correctly iterates the code units of a string. This may sometimes not be what the user expected; most of the time they'd care about the code points.

Iterating the code units is possible by accessing the array data, i.e. you could do:


foreach(i, c; s.data)

if you want the code-units.

That is the point of having a separate type. Using string_t tells the library "I'm using this data as a string". Using char[] tells the library "I'm using this data as an array."

The difference here is that you have to *specifically* try to access the code units; the default is code points. All it really does is switch the default.

It also supports this:

foreach(i, d; s)
{
    writeln("The character in position ", i, " is ", d);
}

where i is the index (might not be sequential)

Well, string supports that too, albeit with the nit that you need to specify dchar.
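
To make the dchar nit concrete, here is a minimal sketch using a plain D string. The helper name codePointIndices is hypothetical and only serves the illustration. Without an explicit element type, foreach over a string yields char code units; asking for dchar makes it decode code points, and the index it reports is the position of each code point's first code unit, so it can skip values.

```d
import std.stdio;

// Hypothetical helper: collects the index that foreach reports
// for each decoded code point.
size_t[] codePointIndices(string s)
{
    size_t[] indices;
    foreach (i, dchar d; s) // dchar requests decoding
        indices ~= i;
    return indices;
}

void main()
{
    string s = "a\u00E9b"; // U+00E9 takes two UTF-8 code units

    // No element type given: c is a char, one code unit per step.
    foreach (i, c; s)
        writefln("code unit %s: 0x%02X", i, cast(ubyte) c);

    // With dchar, i is where each code point starts.
    foreach (i, dchar d; s)
        writeln("The character in position ", i, " is ", d);

    size_t[] expected = [0, 1, 3];
    assert(s.length == 4);
    assert(codePointIndices(s) == expected);
}
```

The index jumps from 1 to 3 because the second code point occupies code units 1 and 2, which is the "might not be sequential" behaviour mentioned above.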


This is not a small problem.

isRandomAccessRange requires hasLength (see here:
http://www.dsource.org/projects/phobos/browser/trunk/phobos/std/range.d#L532).
This is not a random access range per that definition.

That's an interesting twist. By the way, I specified that length is required then because I couldn't imagine having random access into something that I can't tell the length of. Apparently I was wrong :o).

Yes, in fact, you could say that specifically defines VLERange ;) But actually, there are two types of VLE ranges, those which can be randomly accessed (where determining the beginning of a code point, given a random index, is possible) and those that cannot (where decoding depends on the exact order of the data). Actually, those would not be bi-directional ranges anyways.
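
UTF-8 is an example of the first kind: continuation code units always have the bit pattern 10xxxxxx, so from an arbitrary index you can back up to the start of the enclosing code point. The following is only a sketch under the assumption that the input is valid UTF-8; codePointStart is a hypothetical helper, not part of Phobos or of the proposed string_t.

```d
// Hypothetical helper. Assumes s is valid UTF-8 and i < s.length.
size_t codePointStart(const(char)[] s, size_t i)
{
    // Continuation code units look like 10xxxxxx; step back past them.
    while (i > 0 && (s[i] & 0xC0) == 0x80)
        --i;
    return i;
}

void main()
{
    string s = "a\u00E9b"; // code units: 'a', 0xC3, 0xA9, 'b'

    assert(codePointStart(s, 1) == 1); // already at a lead unit
    assert(codePointStart(s, 2) == 1); // inside U+00E9, backs up
    assert(codePointStart(s, 3) == 3); // ASCII 'b'
}
```

An encoding where the meaning of a code unit depends on everything before it offers no such local test, which is why it falls into the second category.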

But a string isn't a random access range anyways (it's specifically
disallowed by std.range per that same reference).
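
For reference, the classification can be checked at compile time with the std.range traits. This sketch reflects how Phobos treats narrow strings as far as I know; the exact behaviour depends on the Phobos version in use.

```d
import std.range;

// string (UTF-8) is bidirectional, but has no range length
// and is not random access.
static assert( isBidirectionalRange!string);
static assert(!hasLength!string);
static assert(!isRandomAccessRange!string);

// dstring (UTF-32) has fixed-width elements, so it qualifies.
static assert( isRandomAccessRange!dstring);

void main() {}
```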


It isn't and it isn't supposed to be.


I agree with that assessment, which is why I omitted length.

-Steve
