Thank you, although the word break does still affect things like
double-clicking to select.

And people do seem to want to use U+02BC for this reason (and I'm trying to
articulate why that isn't what U+02BC is meant for).
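
To make the difference concrete, here is a rough sketch in Python. It is not a UAX #29 implementation: it assumes that the `\w` class in Python's `re` module is a good enough stand-in for "word character", and the function names are mine, purely for illustration. It shows that U+2019 is punctuation while U+02BC is a letter, that a naive tokenizer therefore drops the former and keeps the latter, and what a tailoring that keeps U+2019 in the word token might look like.

```python
import re
import unicodedata

RIGHT_SINGLE_QUOTE = "\u2019"   # ’ RIGHT SINGLE QUOTATION MARK
MODIFIER_APOSTROPHE = "\u02bc"  # ʼ MODIFIER LETTER APOSTROPHE


def general_category(ch):
    """Return the Unicode general category of a single character."""
    return unicodedata.category(ch)


def default_tokens(text):
    """Naive word tokens: runs of \\w only (an approximation, not UAX #29)."""
    return re.findall(r"\w+", text)


def tailored_tokens(text):
    """Hypothetical tailoring: keep U+2019 when it directly follows word characters."""
    return re.findall(r"\w+(?:\u2019\w*)*", text)


print(general_category(RIGHT_SINGLE_QUOTE))   # Pf (final punctuation)
print(general_category(MODIFIER_APOSTROPHE))  # Lm (modifier letter)

print(default_tokens("δ\u2019 αρχαια"))   # ['δ', 'αρχαια']
print(default_tokens("δ\u02bc αρχαια"))   # ['δʼ', 'αρχαια']
print(tailored_tokens("δ\u2019 αρχαια"))  # ['δ’', 'αρχαια']
```

The tailored pattern is only safe under the assumption made earlier in this thread, namely that U+2019 is not also being used as a closing quotation mark in the Ancient Greek text.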

James

On Fri, Jan 25, 2019 at 12:34 PM Mark Davis ☕️ <m...@macchiato.com> wrote:

> U+2019 is normally the character used, except where the ’ is considered a
> letter. When it is between letters it doesn't cause a word break, but
> because it is also a right single quote, at the end of words there is a
> break. Thus in a phrase like «tryin’ to go» there is a word break after the
> n, because one can't tell whether the ’ is an apostrophe or a closing quote.
>
> So something like "δ’ αρχαια" (picking a phrase at random) would have a
> word break after the delta.
>
> Word break:
> δ’ αρχαια
>
> However, there is no *line break* between them (which is the more
> important operation in normal usage). Probably not worth tailoring the word
> break.
>
> Line break:
> δ’ αρχαια
>
> Mark
>
>
> On Fri, Jan 25, 2019 at 1:10 PM James Tauber via Unicode <
> unicode@unicode.org> wrote:
>
>> There seems to be some debate amongst digital classicists about whether to
>> use U+2019 or U+02BC to represent the apostrophe in Ancient Greek when
>> marking elision (e.g. δ’ for δέ preceding a word starting with a vowel).
>>
>> It seems to me that U+2019 is the technically correct choice per the
>> Unicode Standard but it is not without at least one problem: default word
>> breaking rules.
>>
>> I'm trying to provide guidelines for digital classicists in this regard.
>>
>> Is it correct to say the following:
>>
>> 1) U+2019 is the correct character to use for the apostrophe in Ancient
>> Greek when marking elision.
>> 2) Using U+02BC for this purpose is a misuse of a modifier letter
>> 3) However, use of U+2019 (unlike U+02BC) means the default Word Boundary
>> Rules in UAX#29 will (incorrectly) exclude the apostrophe from the word
>> token
>> 4) And use of U+02BC (unlike U+2019) means Grapheme Cluster Boundary Rules
>> in UAX#29 will (incorrectly) include the apostrophe as part of a grapheme
>> cluster with the previous letter
>> 5) The correct solution is to tailor the Word Boundary Rules in the case
>> of Ancient Greek to treat U+2019 as not breaking a word (which shouldn't
>> have the same ambiguity problems with the single quotation mark as in
>> English as it should not be used as a quotation mark in Ancient Greek)
>>
>> Many thanks in advance.
>>
>> James
>>
>

-- 
*James Tauber*
Greek Linguistics: https://jktauber.com/
Music Theory: https://modelling-music.com/
Digital Tolkien: https://digitaltolkien.com/

Twitter: @jtauber
