[Issue 7054] std.format.formattedWrite uses code units count as width instead of characters count

2018-01-08 Thread d-bugmail--- via Digitalmars-d-bugs

https://issues.dlang.org/show_bug.cgi?id=7054

hst...@quickfur.ath.cx changed:

       What|Removed |Added
         CC|        |dran...@gmail.com

--- Comment #13 from hst...@quickfur.ath.cx ---
*** Issue 18205 has been marked as a duplicate of this issue. ***

--

[Issue 7054] std.format.formattedWrite uses code units count as width instead of characters count

2017-09-07 Thread via Digitalmars-d-bugs

https://issues.dlang.org/show_bug.cgi?id=7054
Issue 7054 depends on issue 13348, which changed state.

Issue 13348 Summary: std.uni.Grapheme is impure due to using C malloc and friends
https://issues.dlang.org/show_bug.cgi?id=13348

       What|Removed |Added
     Status|NEW     |RESOLVED
 Resolution|---     |FIXED

--

[Issue 7054] std.format.formattedWrite uses code units count as width instead of characters count

2016-10-15 Thread via Digitalmars-d-bugs

https://issues.dlang.org/show_bug.cgi?id=7054

Andrei Alexandrescu changed:

       What|Removed |Added
   Keywords|        |bootcamp
         CC|        |and...@erdani.com

--

[Issue 7054] std.format.formattedWrite uses code units count as width instead of characters count

2016-02-24 Thread via Digitalmars-d-bugs

https://issues.dlang.org/show_bug.cgi?id=7054

--- Comment #12 from Marco Leise ---
(In reply to hsteoh from comment #10)
> Even if we concede that modern terminals ought to be Unicode-aware (if not
> fully supporting Unicode), there is still the slippery slope of how to print
> bidirectional text, vertical text, scripts that require glyph mutation,
> etc. Where does one draw the line as to what writefln ought/ought not
> handle?

I tend to think like Stewart. If I were using a script other than Latin,
Cyrillic and similarly simple scripts, I would most likely expect writefln's
output on a terminal to look the same as when I print a text file of the same
script to the terminal. Mixing vertical and horizontal text on a terminal is
painfully hard, and my expectation is that there is at most an option to render
either horizontally or vertically (transposed). In that case "minimal width"
would become "minimal height" and we are out of trouble.

What exactly do you mean by glyph mutation? In most cases it is probably a task
for the text layout engine the terminal uses. In other cases the user of
writefln should be aware of how their script will display on a terminal and
prepare their text accordingly before printing. There is no simple way to make
plurals work in all languages either:
http://localization-guide.readthedocs.org/en/latest/l10n/pluralforms.html
Is that comparable to what you had in mind?

--

[Issue 7054] std.format.formattedWrite uses code units count as width instead of characters count

2016-02-18 Thread via Digitalmars-d-bugs

https://issues.dlang.org/show_bug.cgi?id=7054

--- Comment #11 from Stewart Gordon ---
What would supporting bidirectional text entail, exactly?  It seems to me it's
the job of the terminal to render characters in the correct order.

--

[Issue 7054] std.format.formattedWrite uses code units count as width instead of characters count

2016-02-17 Thread via Digitalmars-d-bugs

https://issues.dlang.org/show_bug.cgi?id=7054

--- Comment #10 from hst...@quickfur.ath.cx ---
Even if we concede that modern terminals ought to be Unicode-aware (if not
fully supporting Unicode), there is still the slippery slope of how to print
bidirectional text, vertical text, scripts that require glyph mutation, etc.
Where does one draw the line as to what writefln ought/ought not handle?

--

[Issue 7054] std.format.formattedWrite uses code units count as width instead of characters count

2016-02-15 Thread via Digitalmars-d-bugs

https://issues.dlang.org/show_bug.cgi?id=7054

--- Comment #9 from Marco Leise ---
I always regarded it as merely a means to print stuff with a non-proportional
font for humans to read, one that extends to text files. The match-up of bytes
and visual characters in the early days of printf is only a historical
coincidence.

Most terminals - like programming languages and GUI toolkits - have to adapt to
the Unicode reality and I believe it is safe to assume that when someone calls
writefln or format with full-width symbols they use a terminal that can handle
them. The popular VTE library used by many recent Linux terminal emulators
works great for example.

That said, printf is no better, and we could just claim that the width is meant
to mean bytes or ASCII characters and you are supposed to use writefln only for
English text in debugging output and not user interaction. std.stdio never
cared about the user locale anyways. For all we know the output terminal might
expect KOI-8 (Cyrillic) or some Indian script. In Java for example you are
supposed to use an encoding wrapper if your stdout goes to a terminal, IIRC.
But as Unicode is kind of ubiquitous now, we might as well say that Dlang only
works on Unicode enabled systems. Sorry for the derail ... :)

--

[Issue 7054] std.format.formattedWrite uses code units count as width instead of characters count

2016-02-12 Thread via Digitalmars-d-bugs

https://issues.dlang.org/show_bug.cgi?id=7054

--- Comment #8 from Stewart Gordon ---
(In reply to Marco Leise from comment #6)
> https://en.wikipedia.org/wiki/Halfwidth_and_fullwidth_forms

So "halfwidth" means the width of a character cell, and "fullwidth" means
double that width.  Seems counter-intuitive.  I would have expected them to be
something like "singlewidth" and "doublewidth" respectively.

So there are a few different units at work here:
- code units
- codepoints
- graphemes
- width units

A further complication is whether formattedWrite should be geared towards text
terminals, writing data to a text file designed for human reading, writing data
to a text file that follows a rigid format for machine processing or what.  So
it looks like there's no simple solution.  But in 99% of cases, using code
units (as it does at the moment) is bound to be wrong.
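
To make the four units listed above concrete, here is a minimal D sketch that counts each of them for one short string. The first three counts use Phobos calls (`length`, `std.utf.byDchar`, `std.uni.byGrapheme`). The fourth relies on `roughColumns`, a hypothetical helper written only for this illustration: it assumes that just the fullwidth forms U+FF01..U+FF60 occupy two cells, which is far cruder than the real East Asian Width data.

```d
import std.range : walkLength;
import std.stdio : writefln;
import std.uni : byGrapheme;
import std.utf : byDchar;

/// Hypothetical helper, not part of Phobos: a crude column estimate that
/// treats only U+FF01..U+FF60 (fullwidth forms) as two cells wide.
size_t roughColumns(string s)
{
    size_t cols = 0;
    foreach (g; s.byGrapheme)
    {
        immutable dchar first = g[0]; // base code point of the cluster
        cols += (first >= 0xFF01 && first <= 0xFF60) ? 2 : 1;
    }
    return cols;
}

void main()
{
    // 'e' + combining acute accent, followed by fullwidth 'A' (U+FF21)
    string s = "e\u0301\uFF21";
    writefln("code units:  %s", s.length);                // 6 (UTF-8 bytes)
    writefln("code points: %s", s.byDchar.walkLength);    // 3
    writefln("graphemes:   %s", s.byGrapheme.walkLength); // 2
    writefln("width units: %s", roughColumns(s));         // 3 under the assumption above
}
```

All four numbers differ for this one string, which is why the choice of unit for the width field matters.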

--

[Issue 7054] std.format.formattedWrite uses code units count as width instead of characters count

2016-02-11 Thread via Digitalmars-d-bugs

https://issues.dlang.org/show_bug.cgi?id=7054

--- Comment #7 from hst...@quickfur.ath.cx ---
Argh. Welcome to Unicode, where exceptions *are* the norm, and no simple
algorithm is simple in practice.

And this is a double-argh, because when it comes to double-width characters,
whether or not the output will even *look* right depends on what kind of
terminal you're using, and how it handles double-width characters. Older
terminals may not recognize double-width characters, and such characters may
end up formatted as if they were single-width. (But then again, such terminals
will already make a big unreadable mess of double-width characters anyway, so
perhaps it's not so important to cater to them.)

But once you start down this slippery slope, the next thing that will come up
is making `writefln` support right-to-left text, then vertical text, etc., and
before you know it, we'll be reinventing libpango except poorly (and for a text
terminal where it's questionable whether such things are even relevant
anymore).

--

[Issue 7054] std.format.formattedWrite uses code units count as width instead of characters count

2016-02-09 Thread via Digitalmars-d-bugs

https://issues.dlang.org/show_bug.cgi?id=7054

Marco Leise changed:

       What|Removed |Added
         CC|        |marco.le...@gmx.de

--- Comment #6 from Marco Leise ---
Graphemes work until you meet full-width characters.
Ｇｒａｐｈｅｍｅｓ  ｗｏｒｋ  ｕｎｔｉｌ  ｙｏｕ  ｍｅｅｔ  ｆｕｌｌ－ｗｉｄｔｈ  ｃｈａｒａｃｔｅｒｓ．

From Wikipedia: "With fixed-width fonts, a halfwidth character occupies half
the width of a fullwidth character, hence the name."

https://en.wikipedia.org/wiki/Halfwidth_and_fullwidth_forms

We need UTF decoding, grapheme clustering, character categorizing,
super-cow-power width specifiers in our writeln.

--

[Issue 7054] std.format.formattedWrite uses code units count as width instead of characters count

2014-08-29 Thread via Digitalmars-d-bugs

https://issues.dlang.org/show_bug.cgi?id=7054

Dmitry Olshansky dmitry.o...@gmail.com changed:

       What|Removed |Added
         CC|        |dmitry.o...@gmail.com

--- Comment #4 from Dmitry Olshansky dmitry.o...@gmail.com ---
(In reply to hsteoh from comment #3)
> Tried to fix this today, unfortunately it's blocked by std.uni.byGrapheme
> being impure, which causes a ripple of impurity down the call chain causing
> several unittest compile errors and CTFE errors.

Why should it call byGrapheme? It doesn't seem likely that we are doing
grapheme clustering only to output some damn text.

--

[Issue 7054] std.format.formattedWrite uses code units count as width instead of characters count

2014-08-29 Thread via Digitalmars-d-bugs

https://issues.dlang.org/show_bug.cgi?id=7054

--- Comment #5 from hst...@quickfur.ath.cx ---
Because grapheme clustering is the only sane way to handle output to a field of
fixed length. For example, writefln("%5s", "a\u0301") should treat "a\u0301" as
occupying only a single position in the 5-position wide output field.

Any other solution would introduce further problems, e.g. if we count code
points instead, then the width field in the format string would be basically
useless (the caller will have to manually count output positions -- with
byGrapheme -- and adjust the width accordingly). Furthermore, it would
introduce more special cases (precomposed characters will format differently
from base char + combining diacritic; non-spacing characters will consume field
width but occupy no space in the actual output, etc.).
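
As an illustration of the behaviour argued for here, the following D sketch pads a string by grapheme count instead of code unit count. `padLeftByGrapheme` is a hypothetical helper made up for this example, not something in std.format, and it deliberately ignores double-width characters.

```d
import std.array : replicate;
import std.range : walkLength;
import std.stdio : writefln;
import std.uni : byGrapheme;

/// Hypothetical helper: right-align `s` in a field that is `width`
/// graphemes wide, padding on the left with spaces.
string padLeftByGrapheme(string s, size_t width)
{
    immutable n = s.byGrapheme.walkLength; // user-perceived characters
    return n >= width ? s : replicate(" ", width - n) ~ s;
}

void main()
{
    string s = "a\u0301"; // 'a' + combining acute accent: 3 code units, 1 grapheme
    // Four spaces of padding followed by the accented letter.
    writefln("[%s]", padLeftByGrapheme(s, 5));
}
```

Padding by `s.length` instead would add only two spaces here, because the string is three UTF-8 code units long.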

--

[Issue 7054] std.format.formattedWrite uses code units count as width instead of characters count

2014-08-20 Thread via Digitalmars-d-bugs

https://issues.dlang.org/show_bug.cgi?id=7054

hst...@quickfur.ath.cx changed:

       What|Removed |Added
         CC|        |hst...@quickfur.ath.cx

--

[Issue 7054] std.format.formattedWrite uses code units count as width instead of characters count

2014-08-20 Thread via Digitalmars-d-bugs

https://issues.dlang.org/show_bug.cgi?id=7054

hst...@quickfur.ath.cx changed:

       What|Removed |Added
 Depends on|        |13348

--

[Issue 7054] std.format.formattedWrite uses code units count as width instead of characters count

2014-08-20 Thread via Digitalmars-d-bugs

https://issues.dlang.org/show_bug.cgi?id=7054

--- Comment #3 from hst...@quickfur.ath.cx ---
Tried to fix this today, unfortunately it's blocked by std.uni.byGrapheme being
impure, which causes a ripple of impurity down the call chain causing several
unittest compile errors and CTFE errors.

--

[Issue 7054] std.format.formattedWrite uses code units count as width instead of characters count

2012-07-06 Thread d-bugmail

http://d.puremagic.com/issues/show_bug.cgi?id=7054


Denis Shelomovskij verylonglogin@gmail.com changed:

       What|Removed                     |Added
   Priority|P5                          |P4
         CC|                            |verylonglogin@gmail.com
    Summary|writef formats strings      |std.format.formattedWrite
           |containing UTF-8 multibyte  |uses code units count as
           |characters to the wrong     |width instead of characters
           |width                       |count

