On Saturday, 8 March 2014 at 20:05:36 UTC, Andrei Alexandrescu
wrote:
> Searching for characters in strings would be difficult to deem
> inappropriate.
The notion of "character" exists only in certain writing systems.
Searching by character is thus a flawed practice, and I think it
should not be encouraged, as it will only make writing truly
international software more difficult. A more correct approach is
searching for a certain substring. If non-exact matching is needed
(normalization, case insensitivity, etc.), then the appropriate
solution is to use the Unicode algorithms.
If you look at the situation from this point of view, single code
points become merely an implementation detail.
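To make the point concrete, here is a minimal sketch (in Python rather than D, purely because the Unicode behavior is language-independent): two canonically equivalent strings can use different code point sequences, so a code-point-level search misses a match that the Unicode normalization algorithm finds.

```python
import unicodedata

hay = "re\u0301sume\u0301"       # "résumé" with combining accents
needle = "r\u00e9sum\u00e9"      # "résumé" with precomposed U+00E9

# A naive code-point search fails: the strings are canonically
# equivalent but their code point sequences differ.
assert needle not in hay

# The Unicode way: normalize both sides first (NFC here).
nfc = lambda s: unicodedata.normalize("NFC", s)
assert nfc(needle) in nfc(hay)
```

The same comparison done after normalization treats the code point representation as exactly what it is: an implementation detail.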
> 1. All algorithms would by default operate on strings at
> char/wchar level (i.e. code unit). That would cause the usual
> issues and confusions I was aware of from C++. Certain
> algorithms would require specialization and/or the user using
> byDchar for correctness.
As previously discussed, "correctness" here is conditional. I
would not use that word; it is another extreme.
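The code-unit-level confusion mentioned in the quote, and the sense in which code-point-level processing is only conditionally "correct", can both be sketched as follows (Python is used for illustration; the same applies to operating on D's char[] before decoding):

```python
s = "h\u00e9llo"            # "héllo"
units = s.encode("utf-8")   # code units (bytes, in UTF-8)

# Reversing at the code-unit level splits the two-byte sequence
# for "é", producing invalid UTF-8.
try:
    units[::-1].decode("utf-8")
    broken = False
except UnicodeDecodeError:
    broken = True
assert broken

# Reversing at the code-point level keeps each code point intact...
assert s[::-1] == "oll\u00e9h"
# ...yet still splits combining sequences (grapheme clusters):
g = "e\u0301"                  # "é" as base letter + combining accent
assert g[::-1] == "\u0301e"    # the accent now precedes its base
```

The last two lines are why decoding to code points fixes one class of bugs while leaving another intact.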
> From experience with C++ I knew (1) had a bad track record, and
> (2) "generically conservative, specialize for speed" was a
> successful pattern.
>
> What would you have chosen given that context?
Ideally, we would have the Unicode algorithms in the standard
library from day 1, and advocated their use throughout the
documentation.
>> I'm inclined to say that the correct approach is to state that
>> algorithms operate explicitly on a T.sizeof basis, and that if
>> the data contained in a particular range has some multi-element
>> encoding, then separate, specialized routines should be used,
>> as the T.sizeof behavior will not produce the desired result.
> That sounds quite like C++ plus ICU. It doesn't strike me as
> the golden standard for Unicode integration.
Why not? Because it sounds like D needs exactly that. Plus its
amazing slicing and range capabilities, of course.
>> So the problem to me is that we're stuck not fixing something
>> that's horribly broken just because it's broken in a way that
>> people presumably now expect.
> Clearly I'm being subjective here but again I'd find it
> difficult to get convinced we have something horribly broken
> from the evidence I gathered inside and outside Facebook.
Have you or anyone you personally know tried to process text in D
containing a writing system such as Sanskrit's?
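For context on why a script like Devanagari (used for Sanskrit) stresses code-point-level APIs, here is a small illustration (Python; full grapheme cluster boundaries follow UAX #29 and are not reproduced here):

```python
# "ksha", a common Devanagari conjunct: KA + VIRAMA + SSA
s = "\u0915\u094d\u0937"

# One user-perceived character, but three code points, so
# code-point-level "character" counts mislead here...
assert len(s) == 3
# ...and code-point slicing happily splits the cluster mid-way.
assert s[:1] == "\u0915"
```

Code-point iteration is therefore no more "one element per character" for such text than code-unit iteration is for accented Latin text.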
>> I'd personally like to see this fixed and I think the new
>> behavior is preferable overall, but I do share Andrei's concern
>> that such a big change might hurt the language anyway.
> I've said this once and I'm saying it again: the best way to
> convert this discussion into something useful is to devise
> ideas for useful non-breaking additions.
I disagree. As I've argued, I believe that currently most uses of
dchars in an application are incorrect, and ultimately a time
bomb for proper internationalization support. We should apply
the same procedure that we do with any language construct that
was deemed to have been a poor decision: put it through a
deprecation cycle and fix it.