On Wednesday, 22 June 2016 at 01:32, Laurentiu Iancu wrote:
> Re #1, the ^ symbol indeed denotes a start-of-line anchor, in usual regex
> notation, and the corresponding rules could use sot instead.
By the way, it seems to me that an equivalent formulation of GB12/GB13 and
WB15/WB16 would be the following sequence of rules:
RI RI ÷ RI RI
RI × RI
This fits particularly well in the case of word breaking, since rules
WB{6,7,11,12} already require this much context. It also avoids regexps and
negation.
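
To make the comparison concrete, here is a minimal sketch in Python that
applies both formulations to a list of property values. The function names
and the use of plain strings such as "RI" are assumptions made for
illustration. The second function is one literal reading of the two rules
above, where each boundary is checked independently against the two symbols
on either side of it. Only boundaries between two RI symbols are reported.

```python
def gb12_gb13_breaks(props):
    """Pairing rule: between two RIs, break only when an even number
    of consecutive RIs precedes the boundary."""
    breaks = []
    run = 0  # consecutive RI symbols ending just before the boundary
    for i in range(1, len(props)):
        run = run + 1 if props[i - 1] == "RI" else 0
        if props[i] == "RI" and props[i - 1] == "RI" and run % 2 == 0:
            breaks.append(i)
    return breaks


def proposed_breaks(props):
    """Literal reading (an assumption) of: RI RI ÷ RI RI, then RI × RI."""
    def is_ri(k):
        return 0 <= k < len(props) and props[k] == "RI"

    breaks = []
    for i in range(1, len(props)):
        if is_ri(i - 1) and is_ri(i):
            # RI RI ÷ RI RI: two RIs on each side of the boundary
            if is_ri(i - 2) and is_ri(i + 1):
                breaks.append(i)
            # otherwise RI × RI applies: no break
    return breaks


if __name__ == "__main__":
    for n in range(1, 7):
        run = ["RI"] * n
        print(n, gb12_gb13_breaks(run), proposed_breaks(run))
```

On runs of two and four RI symbols the two functions agree. On a run of
three, this reading gives no break where the pairing rule breaks after the
second RI, so the equivalence may depend on how the sequence of rules is
meant to be applied.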
Best,
Daniel