On 05/30/2016 03:00 PM, Jack Stouffer wrote:
On Monday, 30 May 2016 at 18:24:23 UTC, Andrei Alexandrescu wrote:
That kind of makes this thread less productive than "How to improve
autodecoding?" -- Andrei
Please don't misunderstand, I'm for fixing string behavior.
Surely the
On 30.05.2016 18:01, Andrei Alexandrescu wrote:
On 05/28/2016 03:04 PM, Walter Bright wrote:
On 5/28/2016 5:04 AM, Andrei Alexandrescu wrote:
So it harkens back to the original mistake: strings should NOT be
arrays with
the respective primitives.
An array of code units provides consistency,
On Monday, 30 May 2016 at 18:24:23 UTC, Andrei Alexandrescu wrote:
That kind of makes this thread less productive than "How to
improve autodecoding?" -- Andrei
Please don't misunderstand, I'm for fixing string behavior. But,
let's not pretend that this wouldn't be one of the (if not the)
On Monday, 30 May 2016 at 16:03:03 UTC, Marco Leise wrote:
When on the other hand you work with real world international
text, you'll want to work with graphemes.
Actually, my main rule of thumb is: don't mess with strings. Get
them from the user, store them without modification, spit them
On Monday, 30 May 2016 at 17:14:47 UTC, Andrew Godfrey wrote:
I like "make string iteration explicit" but I wonder about
other constructs. E.g. What about "sort an array of strings"?
How would you tell a generic sort function whether you want it
to interpret strings by code unit vs code point
On 30-May-2016 21:24, Andrei Alexandrescu wrote:
On 05/30/2016 12:34 PM, Jack Stouffer wrote:
On Monday, 30 May 2016 at 16:25:20 UTC, Nick Sabalausky wrote:
D1 -> D2 was a vastly more disruptive change than getting rid of
auto-decoding would be.
Don't be so sure. All string handling code
On Monday, 30 May 2016 at 14:35:03 UTC, Seb wrote:
That's a great idea - the compiler should also issue
deprecation warnings when I try to do things like:
I don't agree on changing those. Indexing and slicing a char[] is
really useful and actually not hard to do correctly (at least
with
On 05/30/2016 12:34 PM, Jack Stouffer wrote:
On Monday, 30 May 2016 at 16:25:20 UTC, Nick Sabalausky wrote:
D1 -> D2 was a vastly more disruptive change than getting rid of
auto-decoding would be.
Don't be so sure. All string handling code would become broken, even if
it appears to work at
On 05/30/2016 12:25 PM, Nick Sabalausky wrote:
On 05/29/2016 09:58 PM, Jack Stouffer wrote:
The problem is not active users. The problem is companies who have > 10K
LOC and libraries that are no longer maintained. E.g. It took
Sociomantic eight years after D2's release to switch only a few
On Monday, 30 May 2016 at 16:03:03 UTC, Marco Leise wrote:
*** http://site.icu-project.org/home#TOC-What-is-ICU-
I was actually talking about ICU with a colleague today. Could it
be that Unicode itself is broken? I've often heard criticism of
Unicode but never looked into it.
I like "make string iteration explicit" but I wonder about other
constructs. E.g. What about "sort an array of strings"? How would
you tell a generic sort function whether you want it to interpret
strings by code unit vs code point vs grapheme?
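One partial answer to the sorting question, as a minimal sketch: for UTF-8, ordering by code unit and ordering by code point give the same result, so the choice only starts to matter at the grapheme (or collation) level. The helper names `sortedByCodeUnit` and `sortedByCodePoint` are made up for illustration and are not Phobos functions.

```d
import std.algorithm.comparison : cmp;
import std.algorithm.sorting : sort;
import std.string : representation;
import std.utf : byDchar;

// Hypothetical helper: order strings by their raw UTF-8 code units.
string[] sortedByCodeUnit(string[] words)
{
    auto copy = words.dup;
    copy.sort!((a, b) => a.representation < b.representation);
    return copy;
}

// Hypothetical helper: order strings by their decoded code points.
string[] sortedByCodePoint(string[] words)
{
    auto copy = words.dup;
    copy.sort!((a, b) => cmp(a.byDchar, b.byDchar) < 0);
    return copy;
}

void main()
{
    // "\u00E9" is precomposed é; "e\u0301" is e + combining acute accent.
    auto words = ["z", "\u00E9", "e\u0301", "a"];
    auto expected = ["a", "e\u0301", "z", "\u00E9"];

    // Both orderings agree, and both put precomposed é after z.
    assert(sortedByCodeUnit(words) == expected);
    assert(sortedByCodePoint(words) == expected);
}
```

Neither ordering treats the two spellings of é as equal, which is the part a generic sort cannot decide on the caller's behalf.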
On Monday, 30 May 2016 at 16:25:20 UTC, Nick Sabalausky wrote:
D1 -> D2 was a vastly more disruptive change than getting rid
of auto-decoding would be.
Don't be so sure. All string handling code would become broken,
even if it appears to work at first.
On Thu, 26 May 2016 16:23:16 -0700, "H. S. Teoh via Digitalmars-d" wrote:
> On Thu, May 26, 2016 at 12:00:54PM -0400, Andrei Alexandrescu via
> Digitalmars-d wrote:
> [...]
> > s.walkLength
> > s.count!(c => !"!()-;:,.?".canFind(c)) // non-punctuation
> >
On 05/29/2016 09:58 PM, Jack Stouffer wrote:
The problem is not active users. The problem is companies who have > 10K
LOC and libraries that are no longer maintained. E.g. It took
Sociomantic eight years after D2's release to switch only a few parts of
their projects to D2. With the loss of old
On Mon, 30 May 2016 09:26:09 +, Chris wrote:
> If it's true that auto decode is unnecessary in many cases, then
> it shouldn't affect the whole code base. But I might be mistaken
> here. Maybe we should make a list of the functions where auto
> decode does make a
On 05/28/2016 03:04 PM, Walter Bright wrote:
On 5/28/2016 5:04 AM, Andrei Alexandrescu wrote:
So it harkens back to the original mistake: strings should NOT be
arrays with
the respective primitives.
An array of code units provides consistency, predictability,
flexibility, and performance.
On 05/29/2016 04:47 PM, H. S. Teoh via Digitalmars-d wrote:
It depends on what you're trying to accomplish. That's the point we're
trying to get at. For some operations, working with code points makes
the most sense. But for other operations, it does not. There is no one
representation that is
On Monday, 30 May 2016 at 14:56:36 UTC, ag0aep6g wrote:
All this is only sensible when we move to a dedicated string
type that's not just an alias of `immutable(char)[]`.
`immutable(char)[]` explicitly is an array of code units. It
would not be acceptable, in my opinion, if the normal array
On 05/30/2016 04:35 PM, Seb wrote:
That's a great idea - the compiler should also issue deprecation
warnings when I try to do things like:
string a = "你好";
a[1]; // deprecation: direct access to a Unicode string is highly
error-prone. Please specify the type of access. More details
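For context on what `a[1]` does today, here is a small sketch: indexing a `string` yields a single UTF-8 code unit, while `std.utf.decode` yields the code point that starts at a given index. The helper name `codePointAt` is an assumption made for this example only.

```d
import std.utf : decode, stride;

// Hypothetical helper: decode the code point that starts at `index`.
// `index` is taken by value, so the caller's variable is not advanced.
dchar codePointAt(string s, size_t index)
{
    return decode(s, index);
}

void main()
{
    string a = "你好";

    assert(a.length == 6);        // six UTF-8 code units, not two characters
    assert(a[1] == 0xBD);         // middle code unit of '你' (E4 BD A0)
    assert(stride(a, 0) == 3);    // '你' occupies three code units
    assert(a[0 .. 3] == "你");    // slicing on a code point boundary is fine

    assert(codePointAt(a, 0) == '\u4F60'); // 你
    assert(codePointAt(a, 3) == '\u597D'); // 好
}
```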
On Monday, 30 May 2016 at 12:59:08 UTC, Adam D. Ruppe wrote:
On Monday, 30 May 2016 at 12:45:27 UTC, Andrei Alexandrescu
wrote:
That's... what I said. -- Andrei
You said "not arrays", he said "not ranges".
So that just means making the std.range.primitives.popFront and
front add a
On Monday, 30 May 2016 at 12:45:27 UTC, Andrei Alexandrescu wrote:
That's... what I said. -- Andrei
You said "not arrays", he said "not ranges".
So that just means making the std.range.primitives.popFront and
front add a constraint if(!isSomeString()).
Language built-ins still work, but
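A sketch of what explicit iteration looks like with facilities that already exist: the built-in `foreach` picks the unit from the declared loop variable type, and the range adapters `byCodeUnit`, `byDchar` and `byGrapheme` name the level directly. The four helper functions are hypothetical names for illustration.

```d
import std.range : walkLength;
import std.uni : byGrapheme;
import std.utf : byCodeUnit, byDchar;

// Built-in foreach over char: iterates code units, no decoding.
size_t unitsViaForeach(string s)
{
    size_t n;
    foreach (char c; s) ++n;
    return n;
}

// Built-in foreach over dchar: the language decodes code points.
size_t pointsViaForeach(string s)
{
    size_t n;
    foreach (dchar c; s) ++n;
    return n;
}

// Range adapters that state the iteration level explicitly.
size_t pointsViaRange(string s)    { return s.byDchar.walkLength; }
size_t graphemesViaRange(string s) { return s.byGrapheme.walkLength; }

void main()
{
    string s = "e\u0301!"; // e + combining acute accent, then '!'

    assert(unitsViaForeach(s) == 4);
    assert(s.byCodeUnit.walkLength == 4);
    assert(pointsViaForeach(s) == 3);
    assert(pointsViaRange(s) == 3);
    assert(graphemesViaRange(s) == 2);
}
```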
On 05/30/2016 07:58 AM, Marc Schütz wrote:
On Saturday, 28 May 2016 at 12:04:20 UTC, Andrei Alexandrescu wrote:
On 5/28/16 6:59 AM, Marc Schütz wrote:
The fundamental problem is choosing one of those possibilities over the
others without knowing what the user actually wants, which is what both
On Saturday, 28 May 2016 at 12:04:20 UTC, Andrei Alexandrescu
wrote:
On 5/28/16 6:59 AM, Marc Schütz wrote:
The fundamental problem is choosing one of those possibilities
over the
others without knowing what the user actually wants, which is
what both
BEFORE and AFTER do.
OK, that's a fair
On Sunday, 29 May 2016 at 17:35:35 UTC, Nick Sabalausky wrote:
On 05/12/2016 08:47 PM, Jack Stouffer wrote:
As much as I agree on the importance of a good smooth migration
path, I don't think the "Python 2 vs 3" situation is really all
that comparable here. Unlike Python, we wouldn't be
On 5/29/2016 5:56 PM, H. S. Teoh via Digitalmars-d wrote:
As far as Unicode is concerned, it is a standard for representing
*written* text, not spoken language, so concepts like phonemes aren't
even relevant in the first place. Let's not get derailed from the
present discussion by confusing the
On Sunday, 29 May 2016 at 17:35:35 UTC, Nick Sabalausky wrote:
Unlike Python, we wouldn't be maintaining a "with
auto-decoding" fork for years and years and years, ensuring
nobody ever had a pressing reason to bother migrating.
If it happens, they better. The D1 fork was maintained for almost
On Sun, May 29, 2016 at 01:13:36PM +, Tobias M via Digitalmars-d wrote:
> On Sunday, 29 May 2016 at 12:41:50 UTC, Chris wrote:
> > Ok, you have a point there, to be precise is a multigraph (a
> > digraph)(cf. [1]). In French you can have multigraphs consisting of
> > three or more characters
On 5/29/2016 4:47 AM, Tobias Müller wrote:
No, this is well established terminology, you are confusing several things here:
For D, we should stick with the terminology as defined by Unicode.
On 05/12/2016 10:15 PM, Walter Bright wrote:
> On 5/12/2016 9:29 AM, Andrei Alexandrescu wrote:
>> I am as unclear about the problems of autodecoding as I am about the
> necessity
>> to remove curl. Whenever I ask I hear some arguments that work well
> emotionally
>> but are scant on reason and
On Sun, May 29, 2016 at 03:55:22PM -0400, Andrei Alexandrescu via Digitalmars-d
wrote:
> On 05/29/2016 09:42 AM, Tobias M wrote:
> > On Friday, 27 May 2016 at 19:43:16 UTC, H. S. Teoh wrote:
> > > On Fri, May 27, 2016 at 03:30:53PM -0400, Andrei Alexandrescu via
> > > Digitalmars-d wrote:
> > > >
On 05/29/2016 09:42 AM, Tobias M wrote:
On Friday, 27 May 2016 at 19:43:16 UTC, H. S. Teoh wrote:
On Fri, May 27, 2016 at 03:30:53PM -0400, Andrei Alexandrescu via
Digitalmars-d wrote:
On 5/27/16 3:10 PM, ag0aep6g wrote:
> I don't think there is value in distinguishing by language. > The
point
On 05/12/2016 08:47 PM, Jack Stouffer wrote:
If you're serious about removing auto-decoding, which I think you and
others have shown has merits, you have to have THE SIMPLEST migration
path ever, or you will kill D. I'm talking a simple press of a button.
I'm not exaggerating here. Python, a
On Sunday, 29 May 2016 at 13:04:18 UTC, Tobias M wrote:
On Sunday, 29 May 2016 at 12:08:52 UTC, default0 wrote:
I am pretty sure that a single grapheme in unicode does not
correspond to your notion of "character". I am pretty sure
that what you think of as a "character" is officially called
On Friday, 27 May 2016 at 19:43:16 UTC, H. S. Teoh wrote:
On Fri, May 27, 2016 at 03:30:53PM -0400, Andrei Alexandrescu
via Digitalmars-d wrote:
On 5/27/16 3:10 PM, ag0aep6g wrote:
> I don't think there is value in distinguishing by language.
> The point of Unicode is that you shouldn't need
On Sunday, 29 May 2016 at 12:41:50 UTC, Chris wrote:
Ok, you have a point there, to be precise is a multigraph
(a digraph)(cf. [1]). In French you can have multigraphs
consisting of three or more characters /o/, as in Irish
=> /i:/. However, a phoneme is not necessarily a spoken
On Sunday, 29 May 2016 at 12:08:52 UTC, default0 wrote:
I am pretty sure that a single grapheme in unicode does not
correspond to your notion of "character". I am pretty sure that
what you think of as a "character" is officially called
"Grapheme Cluster" not "Grapheme".
Grapheme is a
On Sunday, 29 May 2016 at 11:47:30 UTC, Tobias Müller wrote:
On Sunday, 29 May 2016 at 11:25:11 UTC, Chris wrote:
Unicode graphemes are not always the same as graphemes in
natural (written) languages. If <é> is composed in Unicode, it
is still one grapheme in a written language, not two
On Sunday, 29 May 2016 at 11:25:11 UTC, Chris wrote:
Unicode graphemes are not always the same as graphemes in
natural (written) languages. If <é> is composed in Unicode, it
is still one grapheme in a written language, not two distinct
characters. However, in natural languages two characters
On Saturday, 28 May 2016 at 22:29:12 UTC, Andrew Godfrey wrote:
[snip]
From all the detail in this thread, I wonder now if "a
grapheme" is even an unambiguous concept across different
environments.
Unicode graphemes are not always the same as graphemes in natural
(written) languages. If
On 05/28/2016 03:04 PM, Andrei Alexandrescu wrote:
> On 5/28/16 6:59 AM, Marc Schütz wrote:
>> The fundamental problem is choosing one of those possibilities over the
>> others without knowing what the user actually wants, which is what both
>> BEFORE and AFTER do.
>
> OK, that's a fair argument,
On Saturday, 28 May 2016 at 12:04:20 UTC, Andrei Alexandrescu
wrote:
OK, that's a fair argument, thanks. So it seems there should be
no "default" way to iterate a string
Yes!
So it harkens back to the original mistake: strings should NOT
be arrays with the respective primitives.
If you're
On Saturday, 28 May 2016 at 19:04:14 UTC, Walter Bright wrote:
On 5/28/2016 5:04 AM, Andrei Alexandrescu wrote:
So it harkens back to the original mistake: strings should NOT
be arrays with
the respective primitives.
An array of code units provides consistency, predictability,
flexibility,
On 5/28/2016 5:04 AM, Andrei Alexandrescu wrote:
So it harkens back to the original mistake: strings should NOT be arrays with
the respective primitives.
An array of code units provides consistency, predictability, flexibility, and
performance. It's a solid base upon which the programmer can
On Friday, 27 May 2016 at 18:11:22 UTC, Andrei Alexandrescu wrote:
On 5/27/16 10:15 AM, Chris wrote:
It has happened to me that characters like "é" return length
== 2
Would normalization make length 1? -- Andrei
No, I've tried it. I think dchar[] returns one or you check by
grapheme.
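The numbers in this exchange can be reproduced with a short sketch. It assumes the "é" in question is either the precomposed U+00E9 or the decomposed pair e + U+0301; the thread does not say which. The helper names are invented for the example.

```d
import std.range : walkLength;
import std.uni : byGrapheme, normalize, NFC;

// Hypothetical helpers naming the three levels of "length".
size_t codeUnits(string s)  { return s.length; }
size_t codePoints(string s) { return s.walkLength; }            // autodecoded
size_t graphemes(string s)  { return s.byGrapheme.walkLength; }

void main()
{
    string precomposed = "\u00E9";   // single code point
    string decomposed  = "e\u0301";  // base letter + combining accent

    assert(codeUnits(precomposed) == 2);
    assert(codePoints(precomposed) == 1);
    assert(codeUnits(decomposed) == 3);
    assert(codePoints(decomposed) == 2);
    assert(graphemes(decomposed) == 1);

    // NFC composes the pair, but the UTF-8 length is still 2, not 1.
    assert(normalize!NFC(decomposed) == precomposed);
    assert(codeUnits(normalize!NFC(decomposed)) == 2);

    // A dstring stores one code point per element.
    dstring wide = "\u00E9"d;
    assert(wide.length == 1);
}
```

So normalization can reduce the code point count, but `.length` on a `string` keeps counting UTF-8 code units either way.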
On 5/28/16 6:59 AM, Marc Schütz wrote:
The fundamental problem is choosing one of those possibilities over the
others without knowing what the user actually wants, which is what both
BEFORE and AFTER do.
OK, that's a fair argument, thanks. So it seems there should be no
"default" way to
On Friday, 27 May 2016 at 13:34:33 UTC, Andrei Alexandrescu wrote:
On 5/27/16 6:56 AM, Marc Schütz wrote:
It is not, which has been shown by various posts in this
thread.
Couldn't quite find strong arguments. Could you please be more
explicit on which you found most convincing? -- Andrei
On 28-May-2016 01:04, tsbockman wrote:
On Friday, 27 May 2016 at 20:42:13 UTC, Andrei Alexandrescu wrote:
On 05/27/2016 03:39 PM, Dmitry Olshansky wrote:
No, this is not the point of normalization.
What is? -- Andrei
1) A grapheme may include several combining characters (such as
On Fri, May 27, 2016 at 04:41:09PM -0400, Andrei Alexandrescu via Digitalmars-d
wrote:
> On 05/27/2016 03:43 PM, H. S. Teoh via Digitalmars-d wrote:
> > That's what we've been trying to say all along!
>
> If that's the case things are pretty dire, autodecoding or not. --
> Andrei
Like it or
On 5/27/2016 11:27 AM, Andrei Alexandrescu wrote:
On 5/27/16 1:11 PM, Walter Bright wrote:
They mean code units.
Always valid or potentially invalid as well? -- Andrei
Some years ago I would have said always valid. Experience, however, says that
Unicode is often dirty and code should be
On Friday, 27 May 2016 at 22:12:57 UTC, Minas Mina wrote:
Those should be the same though, i.e. compare the same. In order
to do that, there is normalization. What it does is _expand_
the single codepoint Ä into A + ¨
Unless I'm mistaken, this depends on the form used. For example,
in NFKC
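To make the point about forms concrete, a minimal sketch using `std.uni.normalize`: NFD expands Ä into A + U+0308, NFC composes it back, and two strings compare equal once both are brought to the same form. `canonicallyEqual` is a hypothetical helper, and the choice of NFD for the comparison is an assumption; any single canonical form would do.

```d
import std.uni : normalize, NFC, NFD;

// Hypothetical helper: compare two strings after normalizing both to NFD.
bool canonicallyEqual(string a, string b)
{
    return normalize!NFD(a) == normalize!NFD(b);
}

void main()
{
    string composed   = "\u00C4";   // Ä as one code point
    string decomposed = "A\u0308";  // A + combining diaeresis

    assert(composed != decomposed);                 // raw comparison differs
    assert(normalize!NFD(composed) == decomposed);  // NFD expands
    assert(normalize!NFC(decomposed) == composed);  // NFC composes
    assert(canonicallyEqual(composed, decomposed));
}
```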
On Friday, 27 May 2016 at 20:42:13 UTC, Andrei Alexandrescu wrote:
On 05/27/2016 03:39 PM, Dmitry Olshansky wrote:
On 27-May-2016 21:11, Andrei Alexandrescu wrote:
On 5/27/16 10:15 AM, Chris wrote:
It has happened to me that characters like "é" return length
== 2
Would normalization make
On Friday, 27 May 2016 at 20:42:13 UTC, Andrei Alexandrescu wrote:
On 05/27/2016 03:39 PM, Dmitry Olshansky wrote:
No, this is not the point of normalization.
What is? -- Andrei
1) A grapheme may include several combining characters (such as
diacritics) whose order is not supposed to be
On 05/27/2016 03:39 PM, Dmitry Olshansky wrote:
On 27-May-2016 21:11, Andrei Alexandrescu wrote:
On 5/27/16 10:15 AM, Chris wrote:
It has happened to me that characters like "é" return length == 2
Would normalization make length 1? -- Andrei
No, this is not the point of normalization.
On 05/27/2016 03:43 PM, H. S. Teoh via Digitalmars-d wrote:
That's what we've been trying to say all along!
If that's the case things are pretty dire, autodecoding or not. -- Andrei
On Fri, May 27, 2016 at 07:53:30PM +, Adam D. Ruppe via Digitalmars-d wrote:
> On Friday, 27 May 2016 at 19:30:53 UTC, Andrei Alexandrescu wrote:
> > It seems code points are kind of useless because they don't really
> > mean anything, would that be accurate? -- Andrei
>
> It might help to
On 5/27/16 3:30 PM, Andrei Alexandrescu wrote:
On 5/27/16 3:10 PM, ag0aep6g wrote:
I don't think there is value in distinguishing by language. The point of
Unicode is that you shouldn't need to do that.
It seems code points are kind of useless because they don't really mean
anything, would
On Friday, 27 May 2016 at 19:30:53 UTC, Andrei Alexandrescu wrote:
It seems code points are kind of useless because they don't
really mean anything, would that be accurate? -- Andrei
It might help to think of code points as being a kind of byte
code for a text-representing VM.
It's not
On Fri, May 27, 2016 at 03:30:53PM -0400, Andrei Alexandrescu via Digitalmars-d
wrote:
> On 5/27/16 3:10 PM, ag0aep6g wrote:
> > I don't think there is value in distinguishing by language. The
> > point of Unicode is that you shouldn't need to do that.
>
> It seems code points are kind of
On Fri, May 27, 2016 at 02:42:27PM -0400, Andrei Alexandrescu via Digitalmars-d
wrote:
> On 5/27/16 12:40 PM, H. S. Teoh via Digitalmars-d wrote:
> > Exactly. And we just keep getting stuck on this point. It seems that
> > the message just isn't getting through. The unfounded assumption
> >
On 05/27/2016 09:30 PM, Andrei Alexandrescu wrote:
It seems code points are kind of useless because they don't really mean
anything, would that be accurate? -- Andrei
I think so, yeah.
Due to combining characters, code points are similar to code units: a
Unicode thing that you need to know
On 27-May-2016 21:11, Andrei Alexandrescu wrote:
On 5/27/16 10:15 AM, Chris wrote:
It has happened to me that characters like "é" return length == 2
Would normalization make length 1? -- Andrei
No, this is not the point of normalization.
--
Dmitry Olshansky
On 5/27/16 1:11 PM, Walter Bright wrote:
The std.string algorithms I wrote all work much better (i.e. faster)
without autodecoding, while maintaining proper Unicode support.
Violent agreement is occurring here. We have plenty of those and need
more. -- Andrei
On 5/27/16 3:10 PM, ag0aep6g wrote:
I don't think there is value in distinguishing by language. The point of
Unicode is that you shouldn't need to do that.
It seems code points are kind of useless because they don't really mean
anything, would that be accurate? -- Andrei
On 05/27/2016 08:42 PM, Andrei Alexandrescu wrote:
Which languages are covered by code points, and which languages require
graphemes consisting of multiple code points? How does normalization
play into this? -- Andrei
I don't think there is value in distinguishing by language. The point of
On 5/27/16 12:40 PM, H. S. Teoh via Digitalmars-d wrote:
Exactly. And we just keep getting stuck on this point. It seems that the
message just isn't getting through. The unfounded assumption continues
to be made that iterating by code point is somehow "correct" by
definition and nobody can
On 5/27/16 1:11 PM, Walter Bright wrote:
They mean code units.
Always valid or potentially invalid as well? -- Andrei
On Friday, 27 May 2016 at 18:11:22 UTC, Andrei Alexandrescu wrote:
Would normalization make length 1? -- Andrei
In some, but not all cases.
On 5/27/16 10:15 AM, Chris wrote:
It has happened to me that characters like "é" return length == 2
Would normalization make length 1? -- Andrei
On 5/26/2016 9:00 AM, Andrei Alexandrescu wrote:
My thesis: the D1 design decision to represent strings as char[] was disastrous
and probably one of the largest weaknesses of D1. The decision in D2 to use
immutable(char)[] for strings is a vast improvement but still has a number of
issues.
The
On Fri, May 27, 2016 at 03:47:32PM +0200, ag0aep6g via Digitalmars-d wrote:
> On 05/27/2016 03:32 PM, Andrei Alexandrescu wrote:
> > > > However the following do require autodecoding:
> > > >
> > > > s.walkLength
> > > > s.count!(c => !"!()-;:,.?".canFind(c)) // non-punctuation
> > > > s.count!(c
On Friday, 27 May 2016 at 13:47:32 UTC, ag0aep6g wrote:
Misunderstanding. All examples work properly today because of
autodecoding. -- Andrei
They only work "properly" if you define "properly" as "in terms
of code points". But working in terms of code points is usually
wrong. If you want to
On 05/27/2016 03:32 PM, Andrei Alexandrescu wrote:
However the following do require autodecoding:
s.walkLength
s.count!(c => !"!()-;:,.?".canFind(c)) // non-punctuation
s.count!(c => c >= 32) // non-control characters
Currently the standard library operates at code point level even
though
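A runnable version of the quoted calls, as a sketch: with autodecoding the lambdas receive `dchar` code points, while the same predicate over `byCodeUnit` sees individual UTF-8 code units. The wrapper function names are assumptions for the example.

```d
import std.algorithm.searching : canFind, count;
import std.range : walkLength;
import std.utf : byCodeUnit;

// Code points outside the punctuation set (relies on autodecoding).
size_t nonPunctuation(string s)
{
    return s.count!(c => !"!()-;:,.?".canFind(c));
}

// Code points that are not control characters (relies on autodecoding).
size_t nonControl(string s)
{
    return s.count!(c => c >= 32);
}

// Same predicate applied to code units instead of code points.
size_t nonControlUnits(string s)
{
    return s.byCodeUnit.count!(c => c >= 32);
}

void main()
{
    string s = "\u00E9t\u00E9!"; // "été!"

    assert(s.walkLength == 4);        // four code points
    assert(s.length == 6);            // six code units
    assert(nonPunctuation(s) == 3);
    assert(nonControl(s) == 4);
    assert(nonControlUnits(s) == 6);  // each é contributes two units
}
```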
On 5/27/16 6:26 AM, Kagamin wrote:
As I understand, design rationale
behind strings being plain arrays of code units is that it's impractical
for the string to be smarter than an array of code units - it just won't cut
it, while plain array provides simple and easy to understand
implementation of
On 5/27/16 6:56 AM, Marc Schütz wrote:
It is not, which has been shown by various posts in this thread.
Couldn't quite find strong arguments. Could you please be more explicit
on which you found most convincing? -- Andrei
On 5/27/16 7:19 AM, Chris wrote:
On Thursday, 26 May 2016 at 16:00:54 UTC, Andrei Alexandrescu wrote:
[snip]
I would agree only with the amendment "...if used naively", which is
important. Knowledge of how autodecoding works is a prerequisite for
writing fast string code in D. Also, little
On Thursday, 26 May 2016 at 16:00:54 UTC, Andrei Alexandrescu
wrote:
[snip]
I would agree only with the amendment "...if used naively",
which is important. Knowledge of how autodecoding works is a
prerequisite for writing fast string code in D. Also, little
code should deal with one code
On Thursday, 26 May 2016 at 16:00:54 UTC, Andrei Alexandrescu
wrote:
This might be a good time to discuss this a tad further. I'd
appreciate if the debate stayed on point going forward. Thanks!
My thesis: the D1 design decision to represent strings as
char[] was disastrous and probably one of
On Thursday, 26 May 2016 at 16:00:54 UTC, Andrei Alexandrescu
wrote:
11. Indexing an array produces different results than
autodecoding,
another glaring special case.
This is a direct consequence of the fact that string is
immutable(char)[] and not a specific type. That error predates
On Thursday, 26 May 2016 at 16:00:54 UTC, Andrei Alexandrescu
wrote:
4. Autodecoding is slow and has no place in high speed string
processing.
I would agree only with the amendment "...if used naively",
which is important. Knowledge of how autodecoding works is a
prerequisite for writing
On 05/26/2016 07:23 PM, H. S. Teoh via Digitalmars-d wrote:
Therefore, instead of:
myString.splitter!"abc".joiner!"def".count;
we have to write:
myString.representation
.splitter!("abc".representation)
.joiner!("def".representation)
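A compilable sketch of the same idea. It passes the separators as runtime arguments rather than template arguments and ends with `walkLength`, both of which are choices made for this example rather than a reconstruction of the quoted pipeline; the function names are hypothetical.

```d
import std.algorithm.iteration : joiner, splitter;
import std.range : walkLength;
import std.string : representation;

// Split on "abc", join with "def", count elements; operates on code units.
size_t replacedLengthUnits(string s)
{
    return s.representation
            .splitter("abc".representation)
            .joiner("def".representation)
            .walkLength;
}

// Same pipeline on the string itself; every step autodecodes to dchar.
size_t replacedLengthPoints(string s)
{
    return s.splitter("abc").joiner("def").walkLength;
}

void main()
{
    // Pure ASCII: both pipelines count the same thing.
    assert(replacedLengthUnits("xabcy") == 5);
    assert(replacedLengthPoints("xabcy") == 5);

    // Non-ASCII: the results differ because the units differ.
    string s = "\u00E9abc\u00E9";
    assert(replacedLengthUnits(s) == 7);  // 2 + 3 + 2 code units
    assert(replacedLengthPoints(s) == 5); // 1 + 3 + 1 code points
}
```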
On Thu, May 26, 2016 at 12:00:54PM -0400, Andrei Alexandrescu via Digitalmars-d
wrote:
[...]
> On 05/12/2016 04:15 PM, Walter Bright wrote:
[...]
> > 4. Autodecoding is slow and has no place in high speed string processing.
>
> I would agree only with the amendment "...if used naively", which is
On Thursday, 26 May 2016 at 16:00:54 UTC, Andrei Alexandrescu
wrote:
instead, it should use standard library algorithms for
searching,
matching etc. When needed, iterating every code unit is
trivially
done through indexing.
For an example where the std.algorithm/range functions don't cut
This might be a good time to discuss this a tad further. I'd appreciate
if the debate stayed on point going forward. Thanks!
My thesis: the D1 design decision to represent strings as char[] was
disastrous and probably one of the largest weaknesses of D1. The
decision in D2 to use
On Tuesday, 17 May 2016 at 09:53:17 UTC, Kagamin wrote:
With UTF-8 problems happened on a massive scale in LAMP setups:
mysql used latin1 as a default encoding and almost everything
worked fine.
^ latin-1 with Swedish collation rules.
And even if you set the encoding to "utf8", almost
On Friday, 13 May 2016 at 21:46:28 UTC, Jonathan M Davis wrote:
The history of why UTF-16 was chosen isn't really relevant to
my point (Win32 has the same problem as Java and for similar
reasons).
My point was that if you use UTF-8, then it's obvious _really_
fast when you screwed up
On Sunday, 15 May 2016 at 23:10:38 UTC, Jon D wrote:
Runs for each combination were done five times and the median
times used. The median times and the char[] to ubyte[] ratio
are below:
|          |           | char[]    | ubyte[]   |       |
| Compiler | Text type | time (ms) | time (ms) | ratio |
On Mon, May 16, 2016 at 12:31:04AM +, Jack Stouffer via Digitalmars-d wrote:
> On Sunday, 15 May 2016 at 23:10:38 UTC, Jon D wrote:
> >Given the importance of performance in the auto-decoding topic, it
> >seems reasonable to quantify it. I took a stab at this. It would of
> >course be prudent
On Sunday, 15 May 2016 at 23:10:38 UTC, Jon D wrote:
Given the importance of performance in the auto-decoding topic,
it seems reasonable to quantify it. I took a stab at this. It
would of course be prudent to have others conduct similar
analysis rather than rely on my numbers alone.
Here is
On Thursday, 12 May 2016 at 20:15:45 UTC, Walter Bright wrote:
On 5/12/2016 9:29 AM, Andrei Alexandrescu wrote:
> I am as unclear about the problems of autodecoding as I am
about the necessity
> to remove curl. Whenever I ask I hear some arguments that
work well emotionally
> but are scant on
On Sunday, 15 May 2016 at 01:45:25 UTC, Bill Hicks wrote:
From a technical point, D is not successful, for the most part.
C/C++ at least can use the excuse that they were created
during a time when we didn't have the experience and the
knowledge that we do now.
Not really. The dominating
On Friday, 13 May 2016 at 09:28:45 UTC, Chris wrote:
PS I wonder does Bill Hicks know you're using his name? But I
guess he's lost interest in this planet and happily lives on
Mars now.
Maybe I'm using the name to avoid being harassed. Or maybe,
there are thousands of people in the world
On Friday, 13 May 2016 at 07:26:53 UTC, poliklosio wrote:
Also, you are missing the point by claiming that a technical
problem is sure to kill D. Note that very successful languages
like C++, python and so on also have undergone heated
discussions about various features, and often live
On 5/12/16 4:15 PM, Walter Bright wrote:
10. Autodecoded arrays cannot be RandomAccessRanges, losing a key
benefit of being arrays in the first place.
I'll repeat what I said in the other thread.
The problem isn't auto-decoding. The problem is hijacking the char[] and
wchar[] (and variants)
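Point 10 can be checked directly against the range traits. This is a sketch assuming a Phobos build with autodecoding enabled, which is the default.

```d
import std.range.primitives : ElementType, hasLength, isRandomAccessRange;
import std.string : representation;
import std.utf : byCodeUnit;

// As a range, string yields dchar and gives up random access and length.
static assert(is(ElementType!string == dchar));
static assert(!isRandomAccessRange!string);
static assert(!hasLength!string);

// The same bytes viewed as code units are ordinary random-access ranges.
static assert(isRandomAccessRange!(immutable(ubyte)[]));
static assert(isRandomAccessRange!(typeof("".byCodeUnit)));

void main()
{
    string s = "abc";
    assert(s.byCodeUnit[1] == 'b');      // indexing through the adapter
    assert(s.representation[1] == 'b');  // indexing the raw code units
    assert(s[1] == 'b');                 // the built-in array still indexes
}
```

The last assertion is the special case being complained about: the array itself supports indexing, but the range primitives layered on top of it pretend it does not.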
On Friday, 13 May 2016 at 14:06:28 UTC, Vladimir Panteleev wrote:
On Friday, 13 May 2016 at 13:41:30 UTC, Chris wrote:
PS Why do I get a "StopForumSpam error" every time I post
today? Has anyone else experienced the same problem:
"StopForumSpam error: Socket error: Lookup error:
On Friday, 13 May 2016 at 13:41:30 UTC, Chris wrote:
PS Why do I get a "StopForumSpam error" every time I post
today? Has anyone else experienced the same problem:
"StopForumSpam error: Socket error: Lookup error: getaddrinfo
error: Name or service not known. Please solve a CAPTCHA to
On Friday, 13 May 2016 at 13:17:44 UTC, Walter Bright wrote:
On 5/13/2016 2:12 AM, Chris wrote:
If autodecode is killed, could we have a test version asap?
I'd be willing to
test my programs with autodecode turned off and see what
happens. Others should
do likewise and we could come up with a
On 5/13/2016 3:43 AM, Marc Schütz wrote:
On Thursday, 12 May 2016 at 20:15:45 UTC, Walter Bright wrote:
7. Autodecode cannot be used with unicode path/filenames, because it is legal
(at least on Linux) to have invalid UTF-8 as filenames. It turns out in the
wild that pure Unicode is not
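A small sketch of the failure mode: a name containing the byte 0xFF is not valid UTF-8, so strict decoding rejects it, while code-unit access handles it like any other byte sequence. The helper `isValidUtf8` and the sample bytes are assumptions for illustration; no real filesystem call is involved.

```d
import std.range : walkLength;
import std.utf : UTFException, byCodeUnit, validate;

// Hypothetical helper: true if `s` is well-formed UTF-8.
bool isValidUtf8(const(char)[] s)
{
    try
    {
        validate(s);
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

void main()
{
    // 0xFF never occurs in well-formed UTF-8.
    const(char)[] name = ['a', cast(char) 0xFF, 'z'];

    assert(!isValidUtf8(name));
    assert(isValidUtf8("plain.txt"));

    // Code-unit access does not decode, so nothing is rejected.
    assert(name.length == 3);
    assert(name.byCodeUnit.walkLength == 3);
    assert(name[1] == 0xFF);
}
```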
On 5/12/2016 11:50 PM, Bill Hicks wrote:
And I get called a troll and
other names when I list half a dozen things wrong with D, my posts get
removed/censored, etc, all because I try to inform people not to waste time with
D because it's a broken and failed language.
Posts that engage in