On Wednesday, 1 June 2016 at 16:45:04 UTC, Joakim wrote:
On Wednesday, 1 June 2016 at 15:02:33 UTC, Wyatt wrote:
It's not hard. I think a lot of us remember when a 14.4 modem
was cutting-edge.
Well, then apparently you're unaware of how bloated web pages
are nowadays. It used to take me minutes to download popular
web pages _back then_ at _top speed_, and those pages were a
_lot_ smaller.
It's telling that you think the encoding of the text is anything
but the tiniest fraction of the problem. You should look at
where the actual weight of a "modern" web page comes from.
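To put even rough numbers on that, here's a quick sanity check worth trying (a Python sketch with a toy repeated Cyrillic sample; the exact figures will vary with real text):

import gzip

# Toy sample: a repeated Russian pangram.  In KOI8-R every letter is
# one byte; in UTF-8 every Cyrillic letter takes two.
text = "Съешь же ещё этих мягких французских булок, да выпей чаю. " * 50

utf8 = text.encode("utf-8")
koi8 = text.encode("koi8-r")

print(len(utf8), len(koi8))            # raw: roughly a 2:1 gap
print(len(gzip.compress(utf8)),        # after the gzip transfer encoding servers
      len(gzip.compress(koi8)))        # routinely apply, the gap shrinks to a sliver

Either way you're talking kilobytes of text next to the scripts, stylesheets and images that actually make up the weight of a "modern" page.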
Codepages and incompatible encodings were terrible then, too.
Never again.
This only shows you probably don't know the difference between
an encoding and a code page,
"I suggested a single-byte encoding for most languages, with
double-byte for the ones which wouldn't fit in a byte. Use some
kind of header or other metadata to combine strings of different
languages, _rather than encoding the language into every
character!_"
Yeah, that? That's codepages. And your exact proposal to put
encodings in the header was ALSO tried around the time that
Unicode was getting hashed out. It sucked. A lot. (Not as bad
as storing it in the directory metadata, though.)
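The core problem is easy to show concretely (Python here, purely for illustration): the same byte means a different letter under every single-byte code page, so something out-of-band, your header, has to ride along with every string and has to switch whenever the language does.

data = bytes([0xE9])

print(data.decode("latin-1"))   # 'é'  -- Western European code page
print(data.decode("koi8-r"))    # 'И'  -- Cyrillic code page
print(data.decode("cp1253"))    # 'ι'  -- Greek code page

# The byte alone is meaningless; out-of-band state (a header, an escape
# sequence, directory metadata...) has to say which mapping applies.
print("é И ι".encode("utf-8"))  # in UTF-8 each character gets its own
                                # unambiguous byte sequence; no state at all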
Well, when you _like_ a ludicrous encoding like UTF-8, not
sure your opinion matters.
It _is_ kind of ludicrous, isn't it? But it really is the
least-bad option for the most text. Sorry, bub.
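For what it's worth, here's part of why it's the least-bad option: ASCII is already valid UTF-8, and continuation bytes are recognizable on sight (they always match 10xxxxxx), so you can land at any byte offset and re-synchronize without a header or any stateful switching. A small sketch (Python, with a made-up helper name):

def char_start(buf: bytes, i: int) -> int:
    """Back up from an arbitrary byte offset to the start of the
    UTF-8 sequence containing it (illustrative only)."""
    while i > 0 and (buf[i] & 0xC0) == 0x80:   # 0b10xxxxxx = continuation byte
        i -= 1
    return i

s = "καφές".encode("utf-8")      # Greek text, two bytes per letter
print(char_start(s, 5))          # 4: byte 5 is the tail of the letter at byte 4
print(b"abc".decode("utf-8"))    # abc: plain ASCII is already valid UTF-8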
I think we can do a lot better.
Maybe. But no one's done it yet.
The vast majority of software is written for _one_ language,
the local one. You may think otherwise because the software
that sells the most and makes the most money is
internationalized software like Windows or iOS, because it can
be resold into many markets. But as a percentage of lines of
code written, such international code is almost nothing.
I'm surprised you think this even matters after talking about web
pages. The browser is your most common string processing
situation. Nothing else even comes close.
largely ignoring the possibilities of the header scheme I
suggested.
"Possibilities" that were considered and discarded decades ago by
people with way better credentials. The era of single-byte
encodings is gone, it won't come back, and good riddance to bad
rubbish.
I could call that "trolling" by all of you, :) but I'll instead
call it what it likely is, reactionary thinking, and move on.
It's not trolling to call you out for clearly not doing your
homework.
I don't think you understand: _you_ are the special case.
Oh, I understand perfectly. _We_ (whoever "we" are) can handle
any sequence of glyphs and combining characters (correctly-formed
or not) in any language at any time, so we're the special case...?
Yeah, it sounds funny to me, too.
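To be concrete about "correctly-formed or not": the same visible text can arrive either as precomposed code points or as base letters plus combining marks, and anything that claims to handle "any language" has to cope with both. A quick illustration (Python, for concreteness):

import unicodedata

single   = "\u00e9"      # 'é' as one precomposed code point
combined = "e\u0301"     # 'e' followed by a combining acute accent

print(single == combined)                                # False: different sequences
print(unicodedata.normalize("NFC", combined) == single)  # True: same text once normalized
print(single.encode("utf-8"), combined.encode("utf-8"))  # b'\xc3\xa9' vs b'e\xcc\x81'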
The 5 billion people outside the US and EU are _not the special
case_.
Fortunately, it works for them too.
The problem is all the rest, and those just below who cannot
afford it at all, in part because the tech is not as efficient
as it could be yet. Ditching UTF-8 will be one way to make it
more efficient.
All right, now you've found the special case; the case where the
generic, unambiguous encoding may need to be lowered to something
else: people for whom that encoding is suboptimal because of
_current_ network constraints.
I fully acknowledge it's a couple billion people and that's
nothing to sneeze at, but I also see that it's a situation that
will become less relevant over time.
-Wyatt