To comment on the following update, log in, then open the issue: http://www.openoffice.org/issues/show_bug.cgi?id=60584
User cloph changed the following:

                  What    |Old value                 |New value
================================================================================
                    Status|RESOLVED                  |REOPENED
--------------------------------------------------------------------------------
                Resolution|FIXED                     |
--------------------------------------------------------------------------------

------- Additional comments from [EMAIL PROTECTED] Mon Jan 16 16:30:29 -0800 2006 -------

> Documentation has fixed at
> http://lingucomponent.openoffice.org/hyphenator.html.

Hmm. I cannot find anything describing the algorithm there, but at least the
instructions on how to create a hyphenation dictionary have been clarified.

> (There was good documentation only in the standalone
> hyphenator:http://lingucomponent.openoffice.org/altlinux_Hyph.zip.)

Hmm - I cannot find an updated description of the algorithm there either, or
rather of how it differs from the one described in the README. It still reads
"The hyphenation algorithm is basically the same as Knuth's TeX algorithm.
However, the implementation is quite a bit faster." It states that it needs a
preprocessing step, but doesn't explain why. And it still shows the
pattern-matching example of the TeX algorithm.

> Levien's implementation uses prepared TeX hyphenation pattern, where every
> embedded subpattern are expanded to left:

Hmm. I guess that is exactly what I want to have added to the documentation,
although I don't quite understand it yet. Maybe you could try to explain in
other words what is meant by "expanded to left".

> a3cke is embedded i jacken7tasche, expanded to left: ja3cke.

I think this is one of the things that start to confuse me. If it is already
embedded, why does it have to be added explicitly again? Does the algorithm
only use the "biggest chunk" that fits (and only one single pattern), starting
from the left? ja3cke matches one more letter on the left of the pattern (two,
the "j" and the "a"), whereas a3cke only matches one ("a")?

When there is no ja3cke but only a3cke and jacken7tasche, would Jacke then
still be compared against the "jacken7tasche" pattern *only* (since it matches
more letters beginning from the left), and, since that part of the pattern
doesn't contain a digit, not be hyphenated? On the other hand, when I have the
patterns a3cke and cken7tasche, would everything be hyphenated as expected,
since now a3cke is applied because it matches "earlier" in the string (second
letter instead of third)? If I have both a3cke and acken7tasche, are both
applied like in the original implementation, or what is the case here? Is the
crucial part the number of matched letters before the digit, or from the left?

From a test with the pattern:

    ISO8859-1
    ack7entasche
    cken7taschen

    $ ./example hyphtestpat
    Jacke
    Jack-entasche
    Jack-en-taschen

compared to the pattern

    ISO8859-1
    ack7entasche
    cken7tasche

    $ ./example hyphtestpat
    Jacke
    Jack-entasche
    Jack-entaschen

it seems that not only the left side matters, but also the right side. So is
my sketch of the algorithm completely wrong, and both the left-hand and
right-hand letters count? I.e. if one pattern matches 7 letters and the other
one 5, then only the one with 7 matching letters is applied? If both have 5
matches, both are applied? But again this cannot be the case, since otherwise
"Jacken" would not have been hyphenated with the pattern

    ISO8859-1
    a3cke
    jacken7tasche
    ja3cke

(jacken7tasche still has one match more) - why is "jacken" not hyphenated with
the very first pattern (without "ja3cke")? Does it really match the longer one?

> See documentation of the libhnj library (or standalone version of AltLinux
> libhnj hyphenator).

Please point me to the file. I couldn't find that description in the zip.
(I'm not looking for a description of how to create the hyphenation
dictionary, but of how the algorithm uses the patterns.) Therefore I am
reopening this one.

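For reference while reading the questions above, here is a minimal Python
sketch of the classic TeX (Liang) pattern matching that the README refers to.
It is not the libhnj code and does not model the preprocessing step; the names
parse_pattern and hyphenate are made up for illustration, and minimum-length
limits at the word edges are ignored. In this plain scheme every pattern that
matches any substring of the word is applied, the highest digit at each
position wins, and odd digits mark hyphenation points.

```python
def parse_pattern(pattern):
    """Split a pattern such as 'a3cke' into its letters and per-gap digits.

    weights[i] is the digit for the gap in front of letters[i]
    (0 where the pattern has no digit).
    """
    letters = "".join(ch for ch in pattern if not ch.isdigit())
    weights = [0] * (len(letters) + 1)
    pos = 0
    for ch in pattern:
        if ch.isdigit():
            weights[pos] = int(ch)
        else:
            pos += 1
    return letters, weights


def hyphenate(word, patterns):
    """Classic TeX-style matching (illustrative, not the libhnj variant)."""
    table = dict(parse_pattern(p) for p in patterns)
    padded = "." + word.lower() + "."      # '.' marks the word boundaries
    levels = [0] * (len(padded) + 1)       # levels[i]: gap before padded[i]

    # Try every substring; each matching pattern contributes its digits,
    # and the maximum at each gap is kept.
    for start in range(len(padded)):
        for end in range(start + 1, len(padded) + 1):
            weights = table.get(padded[start:end])
            if weights is not None:
                for offset, weight in enumerate(weights):
                    i = start + offset
                    levels[i] = max(levels[i], weight)

    # The gap before word[k] is levels[k + 1]; an odd value allows a hyphen.
    pieces, last = [], 0
    for k in range(1, len(word)):
        if levels[k + 1] % 2 == 1:
            pieces.append(word[last:k])
            last = k
    pieces.append(word[last:])
    return "-".join(pieces)


if __name__ == "__main__":
    first = ["ack7entasche", "cken7taschen"]
    second = ["ack7entasche", "cken7tasche"]
    for word in ("Jacke", "Jackentasche", "Jackentaschen"):
        print(word, hyphenate(word, first), hyphenate(word, second))
```

With the first test dictionary this sketch produces the same three results as
the ./example run above. With the second one it produces Jack-en-tasche and
Jack-en-taschen, unlike the Jack-entasche and Jack-entaschen reported above,
so the behaviour being asked about is not explained by the plain scheme alone.
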
---------------------------------------------------------------------
Please do not reply to this automatically generated notification from
Issue Tracker. Please log onto the website and enter your comments.
http://qa.openoffice.org/issue_handling/project_issues.html#notification
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]