Mijan scripsit:
>
>> Let's consider the ra+virama+ya case. For the most part,
>> ra+virama+ya is displayed as ya+reph. This obviously seems to be an
>> instance of ambiguous interpretation, because ra+virama+ya could
>> also represent ra+ja-phalaa. ya+reph and ra+ja-phalaa are used in different
Ram Viswanadha wrote:
There is also some information at
http://oss.software.ibm.com/icu/docs/papers/binary_ordered_compression_for_unicode.html#Test_Results
Not sure if this is what you are looking
for.
Thanks, not really. I am not looking into the r
> -Original Message-
>
> Date/Time:Fri Mar 7 12:44:47 EST 2003
> Contact: [EMAIL PROTECTED]
> Report Type: Other Question, Problem, or Feedback
>
> I was wondering when writing code for a program in Visual
> Basic.NET. Just a very very simple code that converts
> characte
On Fri, Mar 07, 2003 at 17:27:08 +0100, David Oftedal wrote:
> We're not necessarily talking about Latin here. In Norwegian and Danish,
> æ is not a ligature, but a separate sound almost unpronounceable by
> English speakers.
I believe æ is also a character in the IPA.
Noah
John Hudson schreef:
> The most problematical part of this is that 8-bit codepages supporting
> Romanian use the old S and T with *cedilla* codepoints, not the new S and T
> with comma codepoints.
Apple updated their Romanian codepage shortly after those new characters
appeared, five years ago.
N
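
The cedilla versus comma-below distinction discussed in this message can be made concrete with a short Python sketch. It is only an illustration: the table name CEDILLA_TO_COMMA and the helper to_comma_below are hypothetical and do not come from the thread, and whether such a remapping is appropriate depends on the language of the text (the cedilla code points are not wrong for every language).

```python
import unicodedata

# Hypothetical helper table (not from the thread): the older S/T WITH CEDILLA
# code points mapped to the newer S/T WITH COMMA BELOW code points.
CEDILLA_TO_COMMA = {
    0x015E: 0x0218,  # capital S
    0x015F: 0x0219,  # small s
    0x0162: 0x021A,  # capital T
    0x0163: 0x021B,  # small t
}


def to_comma_below(text):
    """Replace S/T with cedilla by S/T with comma below; leave the rest alone."""
    return text.translate(CEDILLA_TO_COMMA)


if __name__ == "__main__":
    # Print both character names side by side so the difference is visible
    # even when the font draws the two forms identically.
    for old, new in CEDILLA_TO_COMMA.items():
        print(f"U+{old:04X} {unicodedata.name(chr(old))}"
              f" -> U+{new:04X} {unicodedata.name(chr(new))}")
```

Only these four code points are touched; other letters with cedilla, such as the c in French, pass through unchanged.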
At 01:49 AM 3/7/2003, Pim Blokland wrote:
Ah yes, the cedillas; now these are ambiguous!
What is the "correct form" for cedillas under N, K, L, R, S and T? What
should these look like? The fonts I've seen disagree on all of them: some
have commas, others have "real" cedillas.
Since Unicode 3.0 cam
At 08:23 -0800 2003-03-07, Doug Ewell wrote:
The names themselves are normative, of course. What is not normative is
the distinction between the terms LETTER, LIGATURE, and DIGRAPH used in
the names. Just wanted to clarify that for Pim.
I didn't say the names are not normative. I said the terms
> Actually, it is of orthographic significance: it is not
> uncommon for good fonts to have an fj ligature.
That's typography, not orthography.
But I would appreciate it if more fonts had an fj ligature, and
(e.g.) a gj ligature too (in some fonts gj otherwise has
overlapping glyphs).
/ken
What an interesting character ij, or y is. It really shows how languages
evolve over time. As for the æ:
How do you know that? Either "Caesar" or "Cæsar" is good Latin.
We're not necessarily talking about Latin here. In Norwegian and Danish,
æ is not a ligature, but a separate sound almost
Kent Karlsson scripsit:
> Ligating ae into æ works for Latin
> and sometimes English (could be done via a "smart" font).
Always for English, I think: if someone finds a counterexample, let them
use a + ZWNJ + e.
> Note that e.g. an fj
> ligature is just as legitimate and useful as an fi ligatu
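
The "a + ZWNJ + e" suggestion above can be sketched in Python as follows. The helper name unligated is hypothetical, and the sketch only builds the character sequence; whether a given font and rendering engine actually honour the ZERO WIDTH NON-JOINER is outside what the code can show.

```python
import unicodedata

ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER


def unligated(first, second):
    """Join two letters with ZWNJ between them to request no ligation."""
    return first + ZWNJ + second


if __name__ == "__main__":
    pair = unligated("a", "e")
    # The sequence is three code points long: a, ZWNJ, e.
    for ch in pair:
        print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```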
Michael Everson wrote:
>> You mean that both ae and ij should be called ligatures, although one
>> is fused and the other isn't?
>> OK, I can live with that. I'd rather the ij were called a digraph,
>> though.
>
> These terms are not normative. Get used to it.
The names themselves are normative,
Pim Blokland scripsit:
> The ij is considered by some to be one letter in Dutch, and when written
> down, an "i" and a "j" together look very much like a written y with
> diaeresis. (See fonts like Script MT.) So I can understand foreigners
> getting confused and encoding it that way (as a y with
> > Typographically, it's a ligature either way.
>
> You mean that both ae and ij should be called ligatures,
> although one is fused and the other isn't?
No. What I'm trying to say is that the names do not really matter.
While there is an effort to give "good" names to characters,
they sometimes
> > E.g., it is quite legitimate to render, e.g. LIGATURE FI as an f followed
> > by an i, no ligation, whereas that is not allowed for the ae
> > ligature/letter, nor for the oe ligature.
>
> How do you know that? Either "Caesar" or "Cæsar" is good Latin.
That's the other way around. Ligatin
Kent Karlsson scripsit:
> E.g., it is quite legitimate to render, e.g. LIGATURE FI as an f followed
> by an i, no ligation, whereas that is not allowed for the ae
> ligature/letter, nor for the oe ligature.
How do you know that? Either "Caesar" or "Cæsar" is good Latin.
--
After fixing the Y2K
At 15:36 +0100 2003-03-07, Pim Blokland wrote:
Kent Karlsson schreef:
Typographically, it's a ligature either way.
You mean that both ae and ij should be called ligatures, although one is
fused and the other isn't?
OK, I can live with that. I'd rather the ij were called a digraph, though.
These t
Kent Karlsson schreef:
> Typographically, it's a ligature either way.
You mean that both ae and ij should be called ligatures, although one is
fused and the other isn't?
OK, I can live with that. I'd rather the ij were called a digraph, though.
The ij is considered by some to be one letter in Du
On Fri, 7 Mar 2003, John H. Jenkins wrote:
> since different people speaking different languages
> often have different perceptions of what a symbol is.
Reminds me of ISIRI 3342 that officially considered symbol and character
the same thing and used one word ("namaad", Noon, Meem, Alef, Dal) for
On Thu, Mar 06, 2003 at 02:25:19PM -0500, Dean Snyder wrote:
> Ben Yehuda is a "modern" Hebrew dictionary, and, as I noted in my
> original email, I have little experience in modern, Israeli Hebrew -
> maybe the orthography is different there, I just don't know. Which is why
> I was limiting my rem
David Oftedal schreef:
> Hm, this whole concept seems stupid if you ask me.
That's beside the point. The issue in this discussion is not how stupid this
all is, but how consistent the descriptions in the UnicodeData.txt file are.
So I DO care whether I should call something a digraph or a ligature.
On Friday, March 7, 2003, at 04:26 AM, Pim Blokland wrote:
Oh, in that case I must say I think the UnicodeData.txt file doesn't do a
very good job.
For instance, the Danish ae (U+00E6) is not designated a ligature, but the
Dutch ij (U+0133) is, even though the "a" and "e" are clearly fused
toget
> > For instance, the Danish ae (U+00E6) is not designated a ligature,
>
> It was in Unicode 1.0; I think politics were involved in that one.
> In Latin use, ae is most certainly a ligature, and likewise in the
> languages (including English) that have borrowed words involving it.
> In Danish use,
Oh, in that case I must say I think the UnicodeData.txt file doesn't do a
very good job.
For instance, the Danish ae (U+00E6) is not designated a ligature, but the
Dutch ij (U+0133) is, even though the "a" and "e" are clearly fused
together, while the "i" and "j" aren't.
Hm, this whole concept se
The names do NOT always provide correct descriptions of the
characters. This is especially true for "digraph" and "ligature"
(and in the case of U+00E6 too), as well as (e.g.) SCRIPT CAPITAL P,
which is neither script, nor capital (it's lowercase), though
it is a p... In addition, there are diff
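
The point that names are labels rather than descriptions can be checked against the character database that ships with Python. This is a minimal sketch, assuming the standard unicodedata module; the helper describe and the choice of sample characters are illustrative only and not part of the thread.

```python
import unicodedata

# Characters mentioned in the discussion: ae, ij, the fi ligature, and
# SCRIPT CAPITAL P.
SAMPLES = ["\u00e6", "\u0133", "\ufb01", "\u2118"]


def describe(ch):
    """Return (code point, character name, decomposition mapping) for ch."""
    return (f"U+{ord(ch):04X}",
            unicodedata.name(ch),
            unicodedata.decomposition(ch))


if __name__ == "__main__":
    for ch in SAMPLES:
        code, name, decomp = describe(ch)
        print(code, name, decomp or "(no decomposition)")
```

In this data U+0133 and U+FB01 both carry LIGATURE in their names and have compatibility decompositions, while U+00E6 is named LETTER and has no decomposition, which matches the inconsistency being discussed.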
Pim Blokland scripsit:
> For instance, the Danish ae (U+00E6) is not designated a ligature,
It was in Unicode 1.0; I think politics were involved in that one.
In Latin use, ae is most certainly a ligature, and likewise in the
languages (including English) that have borrowed words involving it.
In
> > By the way, although Unicode calls it a cedilla, the correct form to use
> > with G is the disconnected, 'under comma' form.
>
> Ah yes, the cedillas; now these are ambiguous!
> What is the "correct form" for cedillas under N, K, L, R, S and T?
> What should these look like?
Well, the e
Pim Blokland scripsit:
> Now I must admit, I haven't come across many texts which used Ts with
> cedillas. Not in printed form, that is; the only ones I have seen were in
> electronic form, where their appearance depends on the font used.
T with cedilla should never have existed. When s with com
John Cowan schreef:
> Digraphs and ligatures are both made by combining two glyphs. In a digraph,
> the glyphs remain separate but are placed close together. In a ligature,
> the glyphs are fused into a single glyph.
Oh, in that case I must say I think the UnicodeData.txt file doesn't do a
very
John Hudson schreef:
> By the way, although Unicode calls it a cedilla, the correct form to use
> with G is the disconnected, 'under comma' form.
Ah yes, the cedillas; now these are ambiguous!
What is the "correct form" for cedillas under N, K, L, R, S and T? What
should these look like? The font