Re: Major performance problem with std.array.front()

Andrei Alexandrescu Fri, 07 Mar 2014 12:01:24 -0800

On 3/6/14, 6:37 PM, Walter Bright wrote:

In "Lots of low hanging fruit in Phobos" the issue came up about the
automatic encoding and decoding of char ranges.

[snip]

Is there any hope of fixing this?


There's nothing to fix.

Allow me to enumerate the functions of std.algorithm and how they work today and how they'd work with the proposed change. Let s be a variable of some string type.


1.

s.all!(x => x == 'é') currently works as expected. Proposed: fails silently.

2.

s.any!(x => x == 'é') currently works as expected. Proposed: fails silently.

3.

s.canFind!(x => x == 'é') currently works as expected. Proposed: fails silently.


4.

s.canFind('é') currently works as expected. Proposed: fails silently.

5.

s.count() currently works as expected. Proposed: fails silently.

6.

s.count!((a, b) => std.uni.toLower(a) == std.uni.toLower(b))("é") currently works as expected (with the known issues of lowercase conversion). Proposed: fails silently.


7.

s.count('é') currently works as expected. Proposed: fails silently.

8.

s.countUntil("a") currently works as expected. Proposed: fails silently. This applies to all variations of countUntil.


9.

s.endsWith('é') currently works as expected. Proposed: fails silently.

10.

s.find('é') currently works as expected. Proposed: fails silently. This applies to other variations of find that include custom predicates.


11.

...

I went down std.algorithm in the order listed in its documentation and found pernicious issues with almost every single algorithm.
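To make the "fails silently" cases concrete, here is a minimal sketch of the difference between searching decoded code points and searching raw code units. The helpers hasCodePoint and hasCodeUnit are hypothetical names invented for this illustration, and std.string.representation (which yields ubyte, not char) is used only as an approximation of per-code-unit iteration; the actual proposal may differ in detail.

```d
import std.algorithm : any, count;
import std.range : walkLength;
import std.string : representation;

// Hypothetical helper: compares against decoded code points,
// which is what string ranges yield today.
bool hasCodePoint(string s, dchar c)
{
    return s.any!(x => x == c);
}

// Hypothetical helper: the same comparison applied to raw UTF-8
// code units (an approximation of the proposed behavior).
bool hasCodeUnit(string s, dchar c)
{
    return s.representation.any!(u => u == c);
}

void main()
{
    // "café", with é written as an escape so the source encoding does not matter.
    // U+00E9 is encoded in UTF-8 as the two code units 0xC3 0xA9.
    string s = "caf\u00E9";

    assert(s.length == 5);       // five code units
    assert(s.walkLength == 4);   // four code points

    assert(s.count('\u00E9') == 1);
    assert(hasCodePoint(s, '\u00E9'));

    // No single code unit equals 0xE9, so the search quietly finds nothing.
    assert(!hasCodeUnit(s, '\u00E9'));

    // ASCII characters are unaffected, which is why the failure goes unnoticed.
    assert(hasCodeUnit(s, 'a'));
}
```

Nothing in the code-unit version fails to compile or throws; it simply returns a different answer for non-ASCII input.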

I designed the range behavior of strings after much thinking and consideration back in the day when I designed std.algorithm. It was painfully obvious (but it seems to have been forgotten now that it's working so well) that approaching strings as arrays of char[] would break almost every single algorithm leaving us essentially in the pre-UTF C++aveman era.

Making strings bidirectional ranges has been a very good choice within the constraints. There was already a string type, and that was immutable(char)[], and a bunch of code depended on that definition.
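The properties described above can be checked at compile time. The following is a small sketch against Phobos with auto-decoding as it stands at the time of this post; the helper lastCodePoint is a hypothetical name used only for illustration.

```d
import std.array : back, front;
import std.range : ElementType, isBidirectionalRange, isRandomAccessRange;

// string is an alias for an array of immutable UTF-8 code units...
static assert(is(string == immutable(char)[]));

// ...but as a range it is bidirectional, not random access,
// and its elements are decoded code points.
static assert(isBidirectionalRange!string);
static assert(!isRandomAccessRange!string);
static assert(is(ElementType!string == dchar));

// Hypothetical helper: decodes the last code point of a non-empty string.
dchar lastCodePoint(string s)
{
    return s.back;
}

void main()
{
    string s = "\u00E9t\u00E9";   // "été": 3 code points, 5 code units

    assert(s.front == '\u00E9');  // range primitives decode
    assert(lastCodePoint(s) == '\u00E9');
    assert(s[0] == 0xC3);         // indexing still sees the raw code unit
}
```

Indexing and .length keep their array meaning, which is how existing code that depended on immutable(char)[] continued to work.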

Clearly one might argue that their app has no business dealing with diacriticals or Asian characters. But that's the typical provincial view that marred many languages' approach to UTF and internationalization. If you know your string is ASCII, the remedy is simple - don't use char[] and friends. From day 1, the type "char" was meant to mean "code unit of UTF characters".
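One way to follow that remedy, sketched under the assumption that std.string.representation is an acceptable way to get at the bytes (the post itself does not name a specific function), is to run the algorithm over ubyte data so no decoding takes place. The helper countByte is a hypothetical name for this example.

```d
import std.algorithm : count;
import std.string : representation;

// Hypothetical helper: counts occurrences of a byte value with no UTF decoding.
size_t countByte(string s, ubyte b)
{
    // representation reinterprets the string as immutable(ubyte)[] without copying.
    return s.representation.count(b);
}

void main()
{
    // Assumed to be pure ASCII by the caller; nothing here verifies that.
    string header = "GET /index.html HTTP/1.1";

    assert(countByte(header, cast(ubyte) ' ') == 2);
}
```

The responsibility for the "known to be ASCII" claim stays with the caller, which is the trade the paragraph above describes.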

So please ponder the above before going to do surgery on the patient that's going to kill him.



Andrei
