On 2018-11-09 at 13:42, Mark E. Shoulson via Unicode wrote:
Noticed something really fascinating in an old pamphlet I was reading
really interesting, thanks!
(Link is
?
this character is intended for invisible word
separation and for line break control; it has no
width, but its presence between two characters
does not prevent increased letter spacing in
justification
Best wishes,
Otto Stolz
ital sharp-S
is an optional alternative, but not the normal way of
uppercasing the sharp-S.
Best wishes,
Otto Stolz
the language.
I wonder how English and French ever could
be made to use a single script, let alone
German (“ß”), Icelandic (“þ”), Swedish (“å”),
Latvian (“ē”), Czech (“č”) or – you name it.
Best wishes,
Otto Stolz
2018-03-09 12:09 GMT+01:00 Mark Davis ☕️ via Unicode
:
De Papscht hät z’Schpiäz s’Schpäkchbschtekch z’schpaat bschtellt.
literally: The Pope has [in Spiez] [the bacon cutlery] [too late]
ordered.
Am 2018-03-09 um 12:52 schrieb
Hello,
on 2017-07-07 at 20:45, Asmus Freytag wrote:
I also checked whether there are accessible homework assignments that
mention Unicode ("Hausaufgabe Unicode"). I didn't go very deep, but it
seems that it's not untypical to relegate Unicode to a sidebar,
explaining the "\u"
Hello,
on 2017-07-07 at 17:14, William_J_G Overington wrote:
I found that the character a tilde as I now know it to be called is only used
in Portuguese.
Just for the record:
“Ã” is used in Portuguese, Kashubian;
“Ñ” is used in Galician, Spanish, Mirandese, Catalan (only for
Hello,
on 03.07.2017 19:01, Otto Stolz via Unicode wrote:
Since German is the only language using “ß” (if I am not mistaken), […]
On 2017-07-03 at 20:15, Gerrit Ansmann wrote:
Some old Sorbian (blackletter) orthographies also employed the ß. It was
also used at the beginning
Hello,
on 2017-06-30 at 17:34, Michael Everson wrote:
It would be sensible to case-map ß to ẞ however.
Since German is the only language using “ß” (if I am
not mistaken), Unicode should comply with the official
German orthographic rules with respect to this letter.
As I have
Hello,
on 2017-07-03 at 18:16, I wrote:
This rule did hold for all consonants, there’s nothing
particular about double-s.
On 2017-07-03 at 18:05 Jörg Knappen had written:
the hyphenation oddity … never affected the letter s.
Jörg is right. I forgot the additional rule that
Hello,
The Rat für deutsche Rechtschreibung (Council for German Orthography),
which is responsible for the further development of the official
German orthography, has finally recognized LATIN CAPITAL LETTER SHARP S
as a possible upper-case equivalent for the LATIN SMALL
LETTER SHARP S.
The report announcing the change is dated
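
A minimal sketch of what this means for case mapping, assuming Python's built-in string methods as a stand-in for the default Unicode mappings. The helper name upper_with_capital_sharp_s is hypothetical and only illustrates the optional alternative; it is not taken from any standard or library.

```python
# Default case mappings as implemented by Python's str methods.
SMALL_SHARP_S = "\u00df"    # LATIN SMALL LETTER SHARP S
CAPITAL_SHARP_S = "\u1e9e"  # LATIN CAPITAL LETTER SHARP S


def upper_with_capital_sharp_s(text: str) -> str:
    """Hypothetical helper: uppercase text, mapping ß to ẞ instead of SS."""
    return text.replace(SMALL_SHARP_S, CAPITAL_SHARP_S).upper()


# The default uppercase mapping expands the small sharp s to two letters.
print(SMALL_SHARP_S.upper())
# The capital sharp s lowercases to the small sharp s.
print(CAPITAL_SHARP_S.lower())
# The optional alternative keeps a single letter.
print(upper_with_capital_sharp_s("Straße"))
```

The default mapping stays untouched here; the alternative is opt-in, which matches its status as a possible rather than mandatory upper-case form.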
Hello,
On 31.03.2017 at 09:57, Eli Zaretskii wrote:
Arial Unicode MS supports that character [U+23E8], FWIW.
From: Otto Stolz<otto.st...@uni-konstanz.de>
Date: Tue, 4 Apr 2017 15:21:02 +0200
Not on my good ole Windows XP SP3 system.
On 4/4/2017 7:58 AM, Eli Zaretskii wrote:
Thi
On 31.03.2017 at 09:57, Eli Zaretskii wrote:
Arial Unicode MS supports that character [U+23E8], FWIW.
Not on my good ole Windows XP SP3 system.
Best wishes,
Otto
Hello Michael, others,
On 2017/03/23 09:03, Michael Everson wrote:
It's the same diphthong (a sound) written with different
letters.
On 23.03.2017 at 06:54, Martin J. Dürst wrote:
I think this may well be the *historically* correct analysis. And that
may have some influence on how to encode
On 18.11.2016 at 00:31, Doug Ewell wrote:
Or, people could just say what they mean, using language.
This is not so easy, as Lewis Carroll had already seen,
cf. this snippet from “Alice in Wonderland”:
“Then you should say what you mean,” the March Hare went on.
“I do,” Alice hastily replied;
/codes.asp>
may provide partial answers.
Best wishes,
Otto Stolz
Ciao,
on 2016-06-22 at 00:02, announceme...@unicode.org wrote:
Version 9.0 of the Unicode Standard is now available.
…
MOTOR SCOOTER
Almost exactly 70 years after its invention, “la vespa” has
found her way into Unicode. I have related that important news,
immediately, to the members
ending on the following key; the »~« key works as tilde accent
on the letter typed subsequently; and so on. This scheme allows the
conventional QWERTZ hardware to be used for multilingual typing –
with minimal re-learning and training. And still the »µ« key produces
the »µ« character :-)
Best wishes,
Otto Stolz
otonic Greek, and Yiddish) for personal use,
long ago.
Best wishes,
Otto Stolz
Hello,
On 19.03.2016 at 17:40, Doug Ewell wrote:
As one anecdote (which is even less like "data" than two anecdotes), I
could not find any of the characters IJ ij DŽ Dž dž LJ Lj lj NJ Nj nj or their hex
equivalents in any of the CLDR keyboard definitions. I'd imagine that
users just type the two
Hello,
I am wondering how U+02B9 MODIFIER LETTER PRIME made
its way into the Unicode repertoire, and how it
acquired its comment “transliteration of mjagkij znak
(Cyrillic soft sign: palatalization)”.
ISO/R 9:1954 through ISO/R 9:1986 map the mjagkij znak
“ь” to the apostrophe, and so does DIN
with the standard.
Best wishes,
Otto Stolz
On 10 September 2015 at 20:04, Peter Constable wrote:
[…] creating a Web page containing (say) some Latin characters
- not obscure, […] to use (say) Notepad and entering HTML
numeric character references; and that my findings were that
it worked.
Q1: Would you find that to be an
evidence that
an additional character really needs to be encoded, e. g.
because it is already widely used in print and cannot be
represented in Unicode.
Best wishes,
Otto Stolz
) that
the problem manifests itself (mainly or only) with italic style
letters; hence there remains virtually no problem with normal
(non-italic) style.
Best wishes,
Otto Stolz
___
Unicode mailing list
Unicode@unicode.org
http://unicode.org/mailman/listinfo
for you, feel free to ask me privately, for a copy
of the font, configured for Serbian/Macedonian. I can send you
that copy, without any obligation to maintain it or to adapt
forthcoming versions.
Best wishes,
Otto Stolz
described.
But how can they be described within the framework of Unicode?
Bemused,
Otto Stolz
already been mentioned
by other contributions to this thread.
Best wishes,
Otto Stolz
than 30 different dead keys).
I guess that sequences of dead keys would also come in handy
for a Vietnamese keyboard layout.
Best wishes,
Otto Stolz
, significant).
Best wishes,
Otto Stolz
Hello,
On 16.02.2013 at 11:48, Stephan Stiller wrote:
Or a non-name example: Buße (repentance)
vs Busse (buses). But then, non-name examples are far less likely to
remain ambiguous in context.
Years ago, I saw with my own eyes, in a Swiss magazine
(where they consistently replace “ß”
with the UTF-32 specs;
so, if that idea would ever materialise, it would have to sail
under new colours.
Best wishes,
Otto Stolz
much clearer glyphs than the mimeograph
technique.
Best wishes,
Otto Stolz
Hello,
Leo Broukhis had written:
In Russian, the difference between Е and Ё is primary at the beginning
of a word as they are considered distinct letters of the alphabet, yet
secondary in the middle of a word, as the dieresis over Ё is not
mandatory. As an example, ель ёлка, but тёлка
the surrogates from my UTF-8 MPE
was really that I needed additional space for the user’s
guide on the reverse side.
Cheers,
Otto Stolz
Hello,
2012/12/16 Otto Stolz otto.st...@uni-konstanz.de
The reason I excluded the surrogates from my UTF-8 MPE
was really that I needed additional space for the user’s
guide on the reverse side.
Sorry, typo; I meant: “my UTF-16 MPE”. I added that
extra row (with the branch excluding
), and its two followers for
more UTFs. Display or print with a fixed-pitch font,
such as Lucida Console or Courier New. Enjoy!
Cheers,
Otto Stolz
Side 1 (print and cut out):
++---+---+--+
| U+ | yy zz |Cima's UTF-8 Magic | Hex= |
| U+007F
Hello Manuel,
on 2012-08-20 at 01:05, Manuel Strehl wrote:
I'm looking for a data source, that maps countries to scripts used in
them. The target application is a visualization in the context of my
codepoints.net site, namely http://codepoints.net/scripts.
At the moment I've extracted the
tell the difference.
Best wishes,
Otto Stolz
,
Otto Stolz
(or shortcomings) in existing fonts, editors,
keyboard drivers, and other software and suggest
(or even provide) better solutions. Or you could
publish tutorials or examples of good practice.
In any case, it would be wise to know the existing
standards and comply with them.
Best wishes,
Otto Stolz
://www.unicode.org/faq/char_combmark.html#11,
and related FAQs.
Before asking for “improvements”, you should be familiar
with the pertinent FAQ collection, at the very least.
Sincerely,
Otto Stolz
/TR/html401/charset.html#h-5.3.
These may refer to any Unicode character, whatsoever.
However, they will take considerably more storage space
(and transmission bandwidth) than the UTF-8 encoded
characters would take.
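
A rough sketch of the size difference, assuming decimal numeric character references and a single sample character. The function name as_ncr is hypothetical and used only for illustration.

```python
def as_ncr(text: str) -> str:
    """Write every non-ASCII character as a decimal numeric character reference."""
    return "".join(ch if ord(ch) < 128 else f"&#{ord(ch)};" for ch in text)


sample = "\u0647"  # ARABIC LETTER HEH, written &#1607; as a reference

# Bytes needed when the character is written as a reference in ASCII.
ncr_bytes = len(as_ncr(sample).encode("ascii"))
# Bytes needed when the character itself is stored as UTF-8.
utf8_bytes = len(sample.encode("utf-8"))

print(as_ncr(sample), ncr_bytes, utf8_bytes)
```

For text that is mostly ASCII the overhead hardly matters; it grows with the share of non-ASCII characters.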
Good luck,
Otto Stolz
: Mangal, Tahoma
&#1607;&zwj; &zwj;&#1607;&zwj; &zwj;&#1607;</h1>
</html>
I. e., U+0647, with ZWJ on either, or both, sides, in various
font families.
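
For anyone who wants to rebuild such test strings outside HTML, here is a small sketch, assuming only the two code points named above (U+0647 and ZWJ, U+200D). The labels in the dictionary are my own and merely describe where the joiner sits.

```python
import unicodedata

HEH = "\u0647"  # ARABIC LETTER HEH
ZWJ = "\u200d"  # ZERO WIDTH JOINER

# Illustrative labels; how each sequence is shaped depends on the renderer and font.
variants = {
    "no joiner": HEH,
    "joiner after": HEH + ZWJ,
    "joiner on both sides": ZWJ + HEH + ZWJ,
    "joiner before": ZWJ + HEH,
}

for label, text in variants.items():
    code_points = " ".join(f"U+{ord(ch):04X}" for ch in text)
    print(f"{label}: {code_points}")

print(unicodedata.name(HEH), "/", unicodedata.name(ZWJ))
```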
Best regards,
Otto Stolz
be this: do you know of any browser that can handle
your test case?
Firefox 10.0.2 apparently does it right;
Opera 10.52 does a half-hearted job;
and Internet Explorer 8.0.6001.18702 completely screws it up.
Cf. attached screen shots.
Best wishes,
Otto Stolz
, that in the former case, the upside-down text will
overlap the following line (in this test a horizontal ruler).
Best wishes,
Otto Stolz
attachment: Test_UpsideDown.pngTitle: Test txtUpsideDown
福福福
福福
Van Anderson Van Anderson Van Anderson
Van Anderson Van Anderson
as a
distinct pair of letters, to go between SS and ST.
Best wishes,
Otto Stolz
to single Unicode characters, and that keystroke sequences
are not necessarily mapped to character sequences. I hope well,
that Arns can exploit this freedom of the mapping to find an
ergonomically, and linguistically, pleasing keyboard layout for
the purpose at hand.
Best wishes,
Otto Stolz
community – and you will have to find a decent font
to display the Unicode characters (and sequences thereof), according
to the rules of your orthography.
Best wishes,
Otto Stolz
Hello,
On 2010-08-31 at 16:57, Janusz S. Bień wrote:
Can the diacritic be interpreted
as an already existing combining character?
Perhaps:
0326 Combining comma below
0329 Combining vertical line below
0337 Combining short solidus overlay
Cheers,
Otto Stolz
out how to best
represent the other two with Unicode.
The Daler sign resembles closely the GERMAN PENNY SIGN (U+20B0).
Best wishes,
Otto Stolz
Hello,
on 2010-08-08 at 18:56, António MARTINS-Tuválkin wrote:
We all know why it is good to have U+02BC separated from U+2019,
Which one is recommended, when transliterating, as the Latin equivalent
of the Cyrillic letter Soft Sign (044C)?
Thanks for any hints,
Otto Stolz
On 2010-08-07 at 04:19, Murray Sargent wrote:
In general to type in a character by its Unicode value,
type in the hex value and then alt+x.
In some MS programs, e. g. the German version of MS Word,
it’s rather Alt-C, as Alt-X is endowed with some other meaning.
Best wishes,
Otto Stolz
such, in the
versions to come).
• To find particular character assignments, start at
http://www.unicode.org/charts/.
Good luck,
Otto Stolz
Hello Tulasi,
on 2010-06-18 04:24, you have asked:
Or do Unicode ISO/IEC use different number name for same letter/symbol?
You might find enlightening the FAQ on “Unicode and ISO 10646”
http://www.unicode.org/faq/unicode_iso.html.
Best wishes,
Otto Stolz
Hello,
on 2010-06-09 at 15:10, Frédéric Grosshans wrote:
I think adding the relevant few lines in the Archive of
Notices of Non-Approval http://www.unicode.org/alloc/nonapprovals.html
might be useful
Also an FAQ entry might be useful. I just have submitted a suggestion.
Best wishes,
Otto
”; there is no language
known that has “hexadecim”, or anything alike, for 16.
Best wishes,
Otto Stolz
as the common, decimal
digit “9”, which would render any special Nystrom font
rather misleading to the reader.
Best wishes,
Otto Stolz
marks it as either a single ASCII byte,
a starting, or a continuation byte; hence you do not have to
go back to the beginning of the whole data stream to recognize,
and decode, a group of bytes.
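
A minimal sketch of this property, assuming well-formed UTF-8 input. The function names classify and char_start are hypothetical, and the classification is simplified: it does not reject byte values that can never occur in valid UTF-8.

```python
def classify(byte: int) -> str:
    """Classify one byte of UTF-8 by its high bits alone."""
    if byte < 0x80:
        return "ascii"         # 0xxxxxxx
    if byte < 0xC0:
        return "continuation"  # 10xxxxxx
    return "start"             # 11xxxxxx (simplified, no validity check)


def char_start(data: bytes, pos: int) -> int:
    """Step back from pos to the first byte of the character containing it."""
    while pos > 0 and classify(data[pos]) == "continuation":
        pos -= 1
    return pos


data = "aß€".encode("utf-8")  # 61 | C3 9F | E2 82 AC

print([classify(b) for b in data])
print(char_start(data, 5))  # index of the byte that starts the last character
```

Because each byte announces its own role, a decoder dropped into the middle of a stream needs to look back by only a few bytes.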
Best wishes,
Otto Stolz
of every number
you are going to parse. This stems from the fact that
the same digits are used for all number systems. Note
that Unicode is a character-encoding standard, hence
cannot do anything about this sort of ambiguity.
Best wishes,
Otto Stolz
the pertinent passage in the Unicode standard,
http://www.unicode.org/versions/Unicode5.2.0/ch05.pdf#G29675.
Good luck,
Otto Stolz
Mark Davis wrote:
This is just a confusion among the hoi polloi.
And here we have yet another example: hoi is Greek for the
(hoi polloi = the many).
Best wishes,
Otto Stolz
Or, perhaps, you could (ab)use asterisks, or dots instead, e. g.
U+2042 Asterism (i. e. 3 asterisks, in a triangle shape)
U+2051 Two asterisks aligned vertically
U+22EE Vertical ellipsis
Best wishes,
Otto Stolz
in),
but I am sure I could find better examples if I would try in earnest.
Best wishes,
Otto Stolz
(growth tube)
(waxtube)
Best wishes,
Otto Stolz
is also UCS-2LE encoded).
Best wishes,
Otto Stolz
.
Best wishes,
Otto Stolz
://jigsaw.w3.org/css-validator,
and http://validator.w3.org/checklink.
Good luck,
Otto Stolz
: It does not help much
to simply add the proper Unicode-related HTML code at the top;
rather, you have to make sure that your HTML code is encoded properly,
and that the reader's browser will know about its encoding.
Best wishes,
Otto Stolz
compare http://www.microsoft.com/OpenType/OTSpec/WGL4E.HTM
to the pertinent ranges in THE BOOK.
When I wrote the advice quoted supra, I was mainly thinking of
some Punctuation and Symbols I'd like to use, which are not in WGL4.
Best wishes,
Otto Stolz
).
Best wishes,
Otto Stolz
| not found
---+++-
http://www.??.net/
http://www.%C9%99%C9%9B.net/
http://www./??.net/
http://%c5%bc%c3%b3%c5%82w.pl/
* Not found on 1st try after IE start,
IE 6 hung on subsequent tries
Best wishes,
Otto Stolz
Peter Kirk wrote:
There appear to be two errors (not listed in the errata page
http://www.unicode.org/errata/) in Figure 15.2 on page 391 of The
Unicode Standard 4.0, the online version at
http://www.unicode.org/versions/Unicode4.0.0/ch15.pdf.
The fourth and last column of the table appears
will consider this approach for my own work ;-)
Normally, I use Notepad, and Command Debug, on my XP system
for quick conversions.
Best wishes,
Otto Stolz
the discussion on MPEs, which are toys,
after all (though they could also be used to visualize the
three UTF encodings).
Cheers,
Otto Stolz
Side 1 (print and cut out):
╔╦═══╦═══╦══╗
║ U+ ║ yy zz ║Cima’s UTF-8 Magic ║ Hex↔ ║
║ U+007F
easily add a final step
to compute UTF-16LE, or to add a BOM).
Definitely, the world has longed for this, for years ;-) Enjoy!
Cheers,
Otto Stolz
Avers: Print with a fixed-width font, such as Lucida Console,
and cut out
replaced with an ß letter.
The rationale for the Swiss keyboard design is that the accented
characters (for French and Italian) were less dispensable than the
ß (only used in German, and easily replaced with the ss Digraph).
Again, best wishes,
Otto Stolz
books or love poetry can also render 'Mein Kampf'.
You cannot devise an alphabet incapable of spelling swear-words.
Cheers,
Otto Stolz
Azzedine Ait Khelifa wrote:
how i can send
mail, write word document using Arial Unicode MS font ?
Cf. http://www.alanwood.net/unicode/.
After having read, tested, and understood (at large) these pages,
try and ask more specific questions on the Unicode list.
Good luck,
Otto Stolz
. http://support.microsoft.com/default.aspx?scid=KB;en-us;q241538.
Karljürgen Feuerherm continued:
I'm looking for some other product which might suit the purpose, either free
or at least not expensive.
Mozilla 1.3 comprises a very reasonable e-mail client.
Best wishes,
Otto Stolz
.
Again, I recommend reading
http://support.microsoft.com/default.aspx?scid=KB;en-us;q241538.
Best wishes,
Otto Stolz
-7.2
and http://www.w3.org/TR/html401/charset.html#h-5.2.2.
Cheers,
Otto Stolz
, no comprehensive list of openers vs. closers is possible.
Best wishes,
Otto Stolz
PS. In these two languages, the quotation marks are paired thus:
en_US: U+201C ... U+201D, and U+2018 ... U+2019
de_DE: U+201E ... U+201C, and U+201A ... U+2018
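
The pairing above could be captured in a small lookup table. This is a sketch only, using just the two locales listed; the names QUOTES and quote are hypothetical.

```python
# (opener, closer) pairs per locale: primary first, then secondary.
QUOTES = {
    "en_US": (("\u201c", "\u201d"), ("\u2018", "\u2019")),
    "de_DE": (("\u201e", "\u201c"), ("\u201a", "\u2018")),
}


def quote(text: str, locale: str, level: int = 0) -> str:
    """Wrap text in the quotation marks of the given locale and nesting level."""
    opener, closer = QUOTES[locale][level]
    return f"{opener}{text}{closer}"


print(quote("Unicode", "en_US"))
print(quote("Unicode", "de_DE"))

# U+201C opens a quotation in en_US but closes one in de_DE.
assert QUOTES["en_US"][0][0] == QUOTES["de_DE"][0][1] == "\u201c"
```

The final check shows why the role of a quotation mark cannot be decided from the character alone, without knowing the language.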
/doc/806-0634/6j9vo5akm?a=view
- http://docs.sun.com/db/doc/806-0624/6j9vek59d?a=view
Best wishes,
Otto Stolz
Yung-Fong Tang wrote:
could you point out which symbols in those two images need to be proposed?
either by using a red circle on the image or by telling us the surrounding text.
Thanks
http://www.rz.uni-konstanz.de/Antivirus/tests/genealog/
unter Mitw. von Dieter Berger ...]. - 19., neu bearb.
u. erw. Aufl. ISBN: 3-411-20900-3.
Best wishes,
Otto Stolz
askq1 askq1 wrote:
Actually my requirement is straightforward/common and I believe it
should be available somewhere on net.
In particular I need source code (or some way) for following requirements:
- Convert Unicode code-point to UTF8 encoding and vice-versa.
- Convert Unicode code-point to
I had written:
CP 1250 contains the ISO 8859-1 characters, hence it is not
suited for Slavic languages.
Eric Muller wrote:
I suspect that Otto meant to type CP 1252 contains...
Of course. Thanks for the correction.
Cheers,
OS
/docs/mod/core.html#adddefaultcharset,
http://httpd.apache.org/docs/mod/mod_mime.html#addcharset,
http://httpd.apache.org/docs/configuring.html, and
http://httpd.apache.org/docs/configuring.html#htaccess.
Best wishes,
Otto Stolz
Kenneth Whistler wrote:
we can
calculate the weight as being *approximately* 9.05 pounds
(avoirdupois) [or 10.99 troy pounds].
Apparently a weighty publication, that forthcoming Unicode standard...
Cheers,
Otto Stolz
, the relativistic
effects must be taken into account. I hope that the editors took
pains to find a wording that will not upset anybody to the extent
that he would throw the book away at a considerable fraction of
the speed of light...
Best wishes,
Otto Stolz
Best wishes,
Otto Stolz
://czyborra.com/charsets/codepages.html#CP1250
http://czyborra.com/charsets/iso8859.html#ISO-8859-2
Best wishes,
Otto Stolz
,
or the WWW page to be displayed requires a particular, yet
unsuitable font.
Cf. http://www.alanwood.net/unicode/ for more info on fonts
and browsers.
Best wishes,
Otto Stolz
.
F725 resembles U+20B0 GERMAN PENNY SIGN, which is probably a script d,
derived from the Latin word denarius. (Just add an upstroke on the
left hand of the Verdana PUA character.)
This is not convincing either, I know. Just my 0,02 ¤.
Best wishes,
Otto Stolz
(plain), or
- mark the plain (unabbreviated) occurrence of the characters,
e. g., U+006D U+200B U+006D (plain) vs. U+006D U+006D (abbr.).
I'd prefer the former one, because it marks the deviation from the
prevalent usage.
Best wishes,
Otto Stolz
).
This problem was even worse with the medieval Textura font.
Hence, medieval scribes developed a rich set of abbreviations,
including the overbar for an omitted m or n. The latter has
survived into German handwriting, at least until the 1st half of
the 20th century.
Best wishes,
Otto Stolz
PS
Dominikus Scherkl wrote:
i. e. is an latin abbreviation for in exemplum meaning for example
not that is.
i. e. = id est = that is
e. g. = exempli gratia = for example
Cassel's English-German Dictionary, ISBN 0-02-522920-6, also says so.
Best wishes,
Otto Stolz
that recodes your 8-bit mail, you cannot claim that 8-bit mail is
supported everywhere; you can only claim that your server compensates
for the incompatibility of your MUA and the world at large.
Best wishes,
Otto Stolz
4.01 is defined in http://www.w3.org/TR/html401/;
recommended reading for every WWW author!
Best wishes,
Otto Stolz