It also fixed a minor bug in combine(): if a sub-pattern is found twice (or
more) in the main pattern, then all occurrences were changed instead of only
the last occurrence, which is the correct one. The only example in hyphen.us
is 'tanta3'.
I'm not that familiar with the algorithm, so: does that have an effect on the
final result, i.e. the way hyphenation works?
There are two differences:
1) The output file is sorted (the Perl output wasn't).
2) 'tant3a' + '1ta' gets converted to 'tan1t3a' instead of '1tan1t3a'. As
the algorithm tries to find a right-side match, this seems to be the
correct solution. The Perl code found the right-side 'ta' and then
upgraded _all_ 'ta's in the main expression.
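
To make difference 2) concrete, here is a minimal Python sketch of the two
behaviours. It is not the actual combine() from either the Perl script or the
rewrite; the helper names parse(), render() and the all_matches flag are made
up for illustration, and it assumes single-digit levels and that "right-side
match" means the last occurrence of the sub-pattern's letters.

```python
def parse(pattern):
    """Split a pattern such as 'tant3a' into letters and inter-letter levels."""
    letters = []
    levels = [0]              # levels[i] is the value in front of letters[i]
    for ch in pattern:
        if ch.isdigit():
            levels[-1] = int(ch)
        else:
            letters.append(ch)
            levels.append(0)  # slot after this letter, 0 until a digit is seen
    return "".join(letters), levels


def render(letters, levels):
    """Inverse of parse(): write the non-zero levels back between the letters."""
    out = []
    for i, lvl in enumerate(levels):
        if lvl:
            out.append(str(lvl))
        if i < len(letters):
            out.append(letters[i])
    return "".join(out)


def combine(main, sub, all_matches=False):
    """Merge the levels of `sub` into `main`, keeping the higher value per slot.

    all_matches=False: only the last (right-most) occurrence is upgraded.
    all_matches=True:  every occurrence is upgraded (the old behaviour
                       described above).
    """
    m_letters, m_levels = parse(main)
    s_letters, s_levels = parse(sub)

    # Collect every position where the sub-pattern's letters occur.
    starts = []
    pos = m_letters.find(s_letters)
    while pos != -1:
        starts.append(pos)
        pos = m_letters.find(s_letters, pos + 1)

    if not all_matches:
        starts = starts[-1:]  # keep only the right-most match, if any

    for start in starts:
        for j, lvl in enumerate(s_levels):
            m_levels[start + j] = max(m_levels[start + j], lvl)
    return render(m_letters, m_levels)


print(combine("tant3a", "1ta"))                    # tan1t3a
print(combine("tant3a", "1ta", all_matches=True))  # 1tan1t3a
```

With all_matches=True the leading 'ta' is upgraded as well, which is where the
extra '1' at the start comes from.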
I just discovered that it is not supposed to work with UTF-8, but with
8-bit character sets. I will fix the code so that it works with both (it
is a pity that the OO code is 8-bit). The speed will not change much.
The reason I rewrote it is that we're using the OO hyphenation code for
a new TeX version, which will be utf-8/unicode based.