Elaine Keown
Tucson
Hi,
Asmus wrote:
>Only very few foldings make sense to apply on a
>permanent basis. Think of casefolding for example.
>Such a folding is mostly useful for searches, where
>it is applied *transiently*.
Is it possible that Hebrew script needs more than one
> [mailto:[EMAIL PROTECTED] On Behalf Of Peter Kirk
> Sent: Monday, July 19, 2004 8:53 PM
> To: Mark E. Shoulson
> Cc: Jony Rosenne; 'Unicode List'
> Subject: Re: Folding algorithm and canonical equivalence
>
>
> On 19/07/2004 03:20, Mark E. Shoulson wrote:
>
>
Peter Kirk wrote:
On 19/07/2004 03:20, Mark E. Shoulson wrote:
...
Jony's right: when it's down to brass tacks in Hebrew, it's
consonants and whitespace (and punctuation, I guess).
Agreed. But then there are a few characters which are not combining
marks but which are really part of the accent s
Mark E. Shoulson wrote:
Even so, there's probably some language out there that requires some
diacritics left in place on Hebrew letters (I don't know much about
other languages written in Hebrew letters; Elaine Keown knows that
better).
I have printed texts in Ladino and Arabic in Hebrew script w
Elaine Keown
Tucson
Dear Mark and List:
I have even less of an idea than usual what on earth
you are all talking about, but
Today I am working on the 6th set of Hebrew
diacritics. They are called 'Palestinian' and are
found exclusively in the Cairo Genizah material.
The 'Cai
On 19/07/2004 23:23, Asmus Freytag wrote:
At 01:56 PM 7/19/2004, Mark Davis wrote:
You did point out an oversight; Asmus and I have been working on the
issue.
Mark
As Mark wrote, your point is taken and we've taken that onboard.
However, we won't try to *edit* text on the list, that's why we
At 01:56 PM 7/19/2004, Mark Davis wrote:
You did point out an oversight; Asmus and I have been working on the issue.
Mark
As Mark wrote, your point is taken and we've taken that onboard. However,
we won't try to *edit* text on the list, that's why we are not engaging in
a long discussion on the
You did point out an oversight; Asmus and I have been working on the issue.
Mark
- Original Message -
From: "Peter Kirk" <[EMAIL PROTECTED]>
To: "Unicode List" <[EMAIL PROTECTED]>
Sent: Monday, July 19, 2004 13:21
Subject: Back to the subject:
There has been extensive discussion in this thread on the specifics of
accent and diacritic folding. But no one has answered my point, repeated
below, that there seems to be a conflict between the folding algorithm
(rather than the details of specific foldings) and the principle of
canonical eq
ECTED] On Behalf Of Peter Kirk
> Sent: Monday, July 19, 2004 8:53 PM
> To: Mark E. Shoulson
> Cc: Jony Rosenne; 'Unicode List'
> Subject: Re: Folding algorithm and canonical equivalence
>
>
> On 19/07/2004 03:20, Mark E. Shoulson wrote:
>
> > ...
> >
>
At 02:38 AM 7/19/2004, Michael Everson wrote:
At 22:25 -0400 2004-07-18, Mark E. Shoulson wrote:
Though for all that, a lot of Yiddish I've seen is also written without
vowel-points. So the patah-alef and qamats-alef vowels, and the
yod-yod-patah vs. yod yod diphthongs, must be distinguished fro
On 19/07/2004 03:20, Mark E. Shoulson wrote:
...
Jony's right: when it's down to brass tacks in Hebrew, it's consonants
and whitespace (and punctuation, I guess).
Agreed. But then there are a few characters which are not combining
marks but which are really part of the accent system and so shoul
At 22:25 -0400 2004-07-18, Mark E. Shoulson wrote:
Though for all that, a lot of Yiddish I've seen is also written
without vowel-points. So the patah-alef and qamats-alef vowels, and
the yod-yod-patah vs. yod yod diphthongs, must be distinguished from
context, like everything else.
For much of
At 07:53 PM 7/18/2004, Jony Rosenne wrote:
By this logic, I cannot see why you lump Latin/Greek/Cyrillic together.
Latin/Greek/Cyrillic share the fact that for searches you may want to
remove accents, but, except for very unusual circumstances, it's not a good
idea to transform text permanently.
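The distinction between transient and permanent folding can be sketched in a few lines of Python. This is only an illustration, not the AccentFolding defined in the UTR #30 draft: it assumes that "removing accents" means dropping every nonspacing mark (gc=Mn) after canonical decomposition, and the names `fold_accents` and `matches` are made up for the example.

```python
import unicodedata

def fold_accents(text: str) -> str:
    """Build a search key by dropping nonspacing marks (gc=Mn).

    Assumption: this blanket rule stands in for a real accent folding;
    it is not the mapping table from the UTR #30 draft.
    """
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )
    return unicodedata.normalize("NFC", stripped)

def matches(query: str, document: str) -> bool:
    """Compare folded keys only; the stored text is never modified."""
    return fold_accents(query).casefold() in fold_accents(document).casefold()

stored = "R\u00E9sum\u00E9"          # "Résumé", kept as written
print(matches("resume", stored))     # the folded keys are compared
print(stored == "R\u00E9sum\u00E9")  # the original text is untouched
```

Note that the same blanket rule turns Cyrillic i kratkoe (U+0439) into plain U+0438, which is the kind of script-specific case raised elsewhere in this thread.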
Sent: Monday, July 19, 2004 12:16 AM
> To: Peter Kirk
> Cc: John Cowan; Unicode List; jony Rosenne
> Subject: Re: Folding algorithm and canonical equivalence
>
>
> At 05:25 AM 7/18/2004, Peter Kirk wrote:
> >I accept that there might be some script-specific cases in which
Michael Everson wrote:
At 13:00 +0300 2004-07-18, Jony Rosenne wrote:
> Jony is arguing to extend AccentFolding to Hebrew (fold to
unpointed). His
suggestion is to fold *all* combining marks used with Hebrew
in that case.
I want to double check that he really means all combining
marks in the
Jony Rosenne wrote:
-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Asmus Freytag
Sent: Sunday, July 18, 2004 10:53 AM
To: John Cowan
Cc: Peter Kirk; Unicode List; jony Rosenne
Subject: Re: Folding algorithm and canonical equivalence
Jony is
Asmus Freytag scripsit:
> There are two options for a starting set:
> select all 'accents' (note, not baseforms) that occur in some
> precomposed character. And then add additional ones on a case by case
> basis (e.g. stroke overlay).
>
> Or, start with all gc=Mn from the 0300 and 1DC0 blocks (
Peter Kirk scripsit:
> Anyway, is Yiddish in fact never written completely unpointed? That
> would surprise me.
It might have happened at some point, but the standard (YIVO) Yiddish
orthography would become illegible if points were stripped.
--
Principles. You can't say A is John Cowa
On 18/07/2004 22:15, Asmus Freytag wrote:
At 05:25 AM 7/18/2004, Peter Kirk wrote:
I accept that there might be some script-specific cases in which
particular accents should not be removed. The breve in Cyrillic i
kratkoe might be an example; but then this might be rather too
language-specific a
At 05:25 AM 7/18/2004, Peter Kirk wrote:
I accept that there might be some script-specific cases in which
particular accents should not be removed. The breve in Cyrillic i kratkoe
might be an example; but then this might be rather too language-specific
as well. But these should be clearly define
At 05:28 AM 7/18/2004, Peter Kirk wrote:
I can see that there might be cases when the Hebrew folding should be
invoked without other scripts being affected. But I think that anyone
applying a general accent or diacritic folding would expect this to
include all Hebrew (and Arabic, Syriac etc) com
At 10:43 AM 7/18/2004, Jony Rosenne wrote:
If folding is not suitable for Yiddish texts or Biblical texts or ancient
Greek texts or any other text then I suggest that the user of said text
seriously considers not using folding.
Only very few foldings make sense to apply on a permanent basis. Think
At 20:43 +0300 2004-07-18, Jony Rosenne wrote:
> In the Hebrew language, perhaps. But in other languages, like
Yiddish, which use the Hebrew script, at least some points are NOT
optional, and "dropping" them causes textual corruption and loss of
data.
Dropping them always causes loss of data. T
> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Michael Everson
> Sent: Sunday, July 18, 2004 2:51 PM
> To: 'Unicode List'
> Subject: RE: Folding algorithm and canonical equivalence
>
>
> At 13:00
On 18/07/2004 12:51, Michael Everson wrote:
At 13:00 +0300 2004-07-18, Jony Rosenne wrote:
> Jony is arguing to extend AccentFolding to Hebrew (fold to
unpointed). His
suggestion is to fold *all* combining marks used with Hebrew
in that case.
I want to double check that he really means all com
On 18/07/2004 08:56, Asmus Freytag wrote:
At 11:17 PM 7/17/2004, John Cowan wrote:
Peter Kirk scripsit:
> But I think the best thing to do is to drop *all* Hebrew
> combining marks; the result of this is valid unpointed Hebrew.
I agree.
OK, in my last message I was confused, this was Peter's sugges
On 18/07/2004 08:52, Asmus Freytag wrote:
At 11:15 PM 7/17/2004, John Cowan wrote:
I agree that in the TR#30 context, the Right Thing is to remove the
character pair mappings altogether, and all of the single-character
mappings that have canonical decompositions
In other words, in your opinion, th
At 13:00 +0300 2004-07-18, Jony Rosenne wrote:
> Jony is arguing to extend AccentFolding to Hebrew (fold to
unpointed). His
suggestion is to fold *all* combining marks used with Hebrew
in that case.
I want to double check that he really means all combining
marks in the
> Hebrew block, or jus
> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Asmus Freytag
> Sent: Sunday, July 18, 2004 10:53 AM
> To: John Cowan
> Cc: Peter Kirk; Unicode List; jony Rosenne
> Subject: Re: Folding algorithm and canonical equivalence
>
On Sat, 17-07-2004, at 16:46 -0700, Asmus Freytag wrote:
> I wonder whether that's truly intended, or whether it could be replaced
> by a combination of
>
> AccentFolding
> OtherDiacriticFolding
>
> where AccentFolding removes *all* nonspacing marks following Latin, Greek
> or Cyri
At 11:17 PM 7/17/2004, John Cowan wrote:
Peter Kirk scripsit:
> But I think the best thing to do is to drop *all* Hebrew
> combining marks; the result of this is valid unpointed Hebrew.
I agree.
OK, in my last message I was confused, this was Peter's suggestion and Jony
had seconded it.
I take it
At 11:15 PM 7/17/2004, John Cowan wrote:
I agree that in the TR#30 context, the Right Thing is to remove the
character pair mappings altogether, and all of the single-character
mappings that have canonical decompositions
In other words, in your opinion, the reasonable thing to do would be for
some
Asmus Freytag scripsit:
> John, you proposed the initial set. Do you have any suggestion here?
My original submission had only the single-character mappings, not the
character pair mappings, which are just the result of decomposing the
precomposed set and don't IMHO make much sense: they are too
Peter Kirk scripsit:
> But I think the best thing to do is to drop *all* Hebrew
> combining marks; the result of this is valid unpointed Hebrew.
I agree.
--
Schlingt dreifach einen Kreis vom dies!John Cowan <[EMAIL PROTECTED]>
Schliesst euer Aug vor heiliger Schau, http://www.reuters
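The suggestion to drop *all* Hebrew combining marks could look roughly like the Python sketch below. It rests on assumptions that are not from the thread: "Hebrew combining marks" is taken to mean characters with gc=Mn in the Hebrew block (U+0590..U+05FF), and the function name `strip_hebrew_marks` is hypothetical. Precomposed presentation forms are handled by decomposing first.

```python
import unicodedata

# Assumption: only marks encoded in the Hebrew block are in scope.
HEBREW_BLOCK = range(0x0590, 0x0600)

def strip_hebrew_marks(text: str) -> str:
    """Remove nonspacing marks (gc=Mn) from the Hebrew block.

    Decomposing first means presentation forms such as U+FB2A
    (shin with shin dot) lose their points as well.
    """
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(
        ch for ch in decomposed
        if not (ord(ch) in HEBREW_BLOCK and unicodedata.category(ch) == "Mn")
    )
    # Recompose so that marks on other scripts come back as they were.
    return unicodedata.normalize("NFC", kept)

pointed = "\u05E9\u05B8\u05C1\u05DC\u05D5\u05B9\u05DD"  # shalom, pointed
print(strip_hebrew_marks(pointed))                      # consonants only
```

Punctuation such as maqaf (U+05BE) is not gc=Mn and so survives. As other messages in the thread point out, for Yiddish the same operation loses data, so a sketch like this suits search keys rather than stored text.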
> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Asmus Freytag
> Sent: Sunday, July 18, 2004 2:46 AM
> To: Peter Kirk; Unicode List
> Cc: [EMAIL PROTECTED]
> Subject: Re: Folding algorithm and canonical equivalence
>
>
On 18/07/2004 00:46, Asmus Freytag wrote:
Thank you for reviewing this.
DiacriticFolding (unlike AccentFolding) is selective about which
combining marks it removes for which base character. I wonder whether
that's truly intended, or whether it could be replaced by a
combination of
AccentFolding
Thank you for reviewing this.
DiacriticFolding (unlike AccentFolding) is selective about which combining
marks it removes for which base character. I wonder whether that's truly
intended, or whether it could be replaced by a combination of
AccentFolding
OtherDiacriticFolding
where AccentFolding
I was just reviewing the UTR #30 draft in response to Rick's notice
about it. And I believe I may have found a point in which the folding
algorithm as given may violate the principle of canonical equivalence.
But I would like some clarification from list members before providing
formal input on
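The snippet above is cut off before the specific conflict is stated, so the following Python sketch only illustrates the general kind of problem, under assumptions of its own: a hypothetical one-entry folding table (`TABLE`) that maps a precomposed character, applied without normalizing the input first. Two canonically equivalent strings then fold to results that are no longer equivalent.

```python
import unicodedata

# Hypothetical folding table with a single precomposed entry: é -> e
TABLE = {"\u00E9": "e"}

def naive_fold(text: str) -> str:
    """Apply the table character by character, with no normalization."""
    return "".join(TABLE.get(ch, ch) for ch in text)

def canon_equal(a: str, b: str) -> bool:
    """Canonical equivalence: identical after NFD normalization."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)

def normalized_fold(text: str) -> str:
    """One possible repair (an assumption): normalize to NFC, then fold."""
    return naive_fold(unicodedata.normalize("NFC", text))

composed = "\u00E9"     # é as one code point
decomposed = "e\u0301"  # e + combining acute accent

print(canon_equal(composed, decomposed))
print(canon_equal(naive_fold(composed), naive_fold(decomposed)))
print(canon_equal(normalized_fold(composed), normalized_fold(decomposed)))
```

A folding respects canonical equivalence only if equivalent inputs always give equivalent outputs; the naive version fails that test here, while the version that normalizes first passes it.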
39 matches