On 2015-05-07 16:02, Jonathan Kew wrote:
Would it be feasible to define this negatively instead --
something like "a sigma is final if it is NOT followed by another
letter"?
A possible refinement is that a lone sigma, neither preceded nor
followed by another letter, should probably be lowercased as σ
rather than ς.
I have used this Perl regular expression substitution to change σ into
ς for some years, with satisfactory results so far:
s/(?<= \p{Script=Greek} ) (?<= \pL ) σ (?! \pL | - )/ς/gx
That is: change a σ into ς if it is preceded by a Greek letter
and not followed by a letter or a hyphen. Note that this
substitution, as written above, only works with NFC text. In NFD
text a combining mark may stand between the preceding letter and
the σ, and since the Perl regex engine doesn't support
variable-length lookbehind, you would need to use the following:
s/(?<= \p{Script=Greek} ) (?<= \pL ) ( \pM* ) σ (?! \pL | - )/$1ς/gx
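To make the difference between the two substitutions easier to try
out, here is a minimal sketch that wraps each one in a small
function. The function names final_sigma_nfc and final_sigma_nfd are
made up for this illustration, and the sample words are only test
input. Unicode::Normalize is a core module.

```perl
use strict;
use warnings;
use utf8;                               # this source contains Greek literals
use Unicode::Normalize qw(NFC NFD);

# Hypothetical helper: the substitution for NFC text.
sub final_sigma_nfc {
    my ($text) = @_;
    $text =~ s/(?<= \p{Script=Greek} ) (?<= \pL ) σ (?! \pL | - )/ς/gx;
    return $text;
}

# Hypothetical helper: the NFD variant. Combining marks between the
# preceding letter and the σ are captured and put back unchanged.
sub final_sigma_nfd {
    my ($text) = @_;
    $text =~ s/(?<= \p{Script=Greek} ) (?<= \pL ) ( \pM* ) σ (?! \pL | - )/$1ς/gx;
    return $text;
}

binmode STDOUT, ':encoding(UTF-8)';
print final_sigma_nfc("σοφοσ"), "\n";               # prints σοφος
print final_sigma_nfc("σ"), "\n";                   # lone sigma stays σ
print NFC(final_sigma_nfd(NFD("ὁδόσ"))), "\n";      # prints ὁδός
```

Since \pM* can also match nothing, the NFD variant should handle NFC
text as well, so it could serve as the only version if the extra
capture group is not a concern.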
I guess that once intersection character classes are available, one
should change the negative lookahead into "when not followed by a
Greek letter or a hyphen":
(?! (?[ \p{Script=Greek} & \pL ]) | - )
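A sketch of the full substitution with that lookahead, assuming a
Perl recent enough to have the (?[ ... ]) extended bracketed
character class (5.18 or later, where it was initially experimental,
hence the "no warnings" line). The function name
final_sigma_greek_only is made up for this illustration, and \p{L}
is just the long form of \pL.

```perl
use strict;
use warnings;
no warnings 'experimental::regex_sets';  # (?[ ]) was experimental in older Perls
use utf8;                                # this source contains Greek literals

# Hypothetical helper: σ becomes ς unless a *Greek* letter or a hyphen follows.
sub final_sigma_greek_only {
    my ($text) = @_;
    $text =~ s/(?<= \p{Script=Greek} ) (?<= \pL ) σ
               (?! (?[ \p{Script=Greek} & \p{L} ]) | - )/ς/gx;
    return $text;
}

binmode STDOUT, ':encoding(UTF-8)';
print final_sigma_greek_only("κοσμοσ"), "\n";    # prints κοσμος
print final_sigma_greek_only("λογοσx"), "\n";    # prints λογοςx
```

The second call shows the only change in behaviour: a following
non-Greek letter no longer prevents the conversion.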
/bpj