Re: [lingu-dev] C replacement for substrings.pl

Nanning Buitenhuis Wed, 26 Jul 2006 13:24:46 -0700

It also fixed a minor bug in combine(): if a sub-pattern is found twice (or 
more) in the main pattern, then all occurrences were changed instead of (the 
correct) last occurrence. The only example in hyphen.us is 'tanta3'.


I'm not that familiar with the algorithm, so: does that have an effect on the 
final result, i.e. the way hyphenation works?

There are two differences:
1) the output file is sorted (the perl output wasn't)

2) 'tant3a' + '1ta' gets converted to 'tan1t3a' instead of '1tan1t3a'. As the algorithm tries to find a right-side match, this seems to be the correct solution. The Perl code found the right-side 'ta' and then upgraded _all_ 'ta's in the main expression.
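
To make the difference concrete, here is a rough sketch of the "last occurrence only" merge. It is not the code from the actual C replacement: the names split_pattern, combine_last and MAXLEN are made up for illustration, and it assumes single-digit values, 8-bit characters, and that merging keeps the larger value at each position.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAXLEN 64   /* hypothetical upper bound on pattern length */

/* Split a pattern such as "tant3a" into its letters ("tanta") and the
   value found before each letter (vals[i]) or after the last one (vals[n]).
   Positions without a digit get the value 0. Returns the letter count. */
static size_t split_pattern(const char *pat, char *letters, int *vals)
{
    size_t n = 0;
    vals[0] = 0;
    for (; *pat != '\0' && n < MAXLEN - 1; pat++) {
        if (isdigit((unsigned char)*pat)) {
            vals[n] = *pat - '0';
        } else {
            letters[n++] = *pat;
            vals[n] = 0;
        }
    }
    letters[n] = '\0';
    return n;
}

/* Merge sub into main_pat at the LAST (rightmost) occurrence of sub's
   letters only, keeping the larger value at each position.
   Returns 0 on success, -1 if sub's letters do not occur in main_pat. */
static int combine_last(const char *main_pat, const char *sub,
                        char *out, size_t outsz)
{
    char ml[MAXLEN], sl[MAXLEN];
    int mv[MAXLEN], sv[MAXLEN];
    size_t mn = split_pattern(main_pat, ml, mv);
    size_t sn = split_pattern(sub, sl, sv);
    size_t start, i, o = 0;
    int found = 0;

    if (outsz == 0 || sn == 0 || sn > mn)
        return -1;

    /* Scan candidate start positions from the right end towards 0. */
    for (start = mn - sn + 1; start-- > 0; ) {
        if (strncmp(ml + start, sl, sn) == 0) {
            found = 1;
            break;
        }
    }
    if (!found)
        return -1;

    /* Upgrade values only inside the one matched window. */
    for (i = 0; i <= sn; i++) {
        if (sv[i] > mv[start + i])
            mv[start + i] = sv[i];
    }

    /* Rebuild the textual pattern, omitting zero values. */
    for (i = 0; i <= mn; i++) {
        if (mv[i] > 0 && o + 1 < outsz)
            out[o++] = (char)('0' + mv[i]);
        if (i < mn && o + 1 < outsz)
            out[o++] = ml[i];
    }
    out[o] = '\0';
    return 0;
}
```

Calling combine_last("tant3a", "1ta", out, sizeof out) leaves "tan1t3a" in out. Upgrading every 'ta' instead, as the Perl code did, would also put the 1 in front of the first 't' and give '1tan1t3a'.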

I just discovered that it is not supposed to work with utf-8, but with 8-bit character sets. I will fix the code so that it works with both (it is a pity that the OO code is 8 bit). The speed will not change much. The reason I rewrote it is that we're using the OO hyphenation code for a new TeX version, which will be utf-8/unicode based.




