Hello Nathalie,

Interesting subject you chose! The reference and, in my opinion, best complete explanation of TeX’s hyphenation algorithm is Appendix H of the TeXbook (https://www.worldcat.org/oclc/826569026). You’ll find there everything you need to get a basic understanding. For a more in-depth analysis, see Frank Liang’s PhD thesis at http://tug.org/docs/liang/. I thought the English Wikipedia’s article on “Hyphenation Algorithm” offered a summary of Appendix H, but I can’t find it, so in short:
In order to hyphenate a word in a given language, you need a list of patterns for that language. Let’s say the word is “hyphenation” and the patterns are Knuth and Liang’s file hyphen.tex (available from CTAN: http://mirror.ctan.org/systems/knuth/dist/lib/hyphen.tex). You start by finding all the patterns that, ignoring the digits, match the word:

    hy3ph he2n hena4 hen5at 1na n2at 1tio 2io o2n

That is to say, “hyph” matches “hyphenation” (because you ignore the 3); so does “hen”, etc. Once you’ve got that list, you take the digits from each pattern and insert them into the original word at the place where the pattern matches, taking the maximum if there are several possible digits. In this case you get:

    hy3phe2n5a4t2io2n

There are three places where two digits would be possible: after the ‘e’, you could insert either 2 (from “he2n”) or 1 (from “1na”), so you take 2, the maximum of the two; before the ‘a’ you have 5 from “hen5at” and 2 from “n2at”, so here you get 5; and after the ‘a’ you have “hena4” that produces 4, and “1tio” that produces 1, so you take 4.

Then, you may hyphenate the word wherever the digit is odd, and nowhere else. Hence: hy-phen-ation.

That’s all. Hope this helps!

Best,
Arthur
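
P.S. In case it helps to see the steps above as code, here is a minimal Python sketch. It is my own illustration, not TeX’s actual implementation: the names PATTERNS, parse, gap_values, annotate and hyphenate are made up, it only knows the nine patterns listed above rather than the whole of hyphen.tex, and it ignores anything not described in this message.

```python
import re

# Only the nine patterns quoted above; hyphen.tex contains many more.
PATTERNS = ["hy3ph", "he2n", "hena4", "hen5at", "1na", "n2at", "1tio", "2io", "o2n"]


def parse(pattern):
    """Return the pattern's letters and the digit sitting in each gap.

    digits[i] is the value in the gap before letters[i];
    the last entry is the gap after the final letter.
    """
    letters = re.sub(r"\d", "", pattern)
    digits = [0] * (len(letters) + 1)
    pos = 0
    for ch in pattern:
        if ch.isdigit():
            digits[pos] = int(ch)
        else:
            pos += 1
    return letters, digits


def gap_values(word, patterns=PATTERNS):
    """For every gap in the word, keep the maximum digit of all matching patterns."""
    values = [0] * (len(word) + 1)
    for pattern in patterns:
        letters, digits = parse(pattern)
        start = word.find(letters)
        while start != -1:  # a pattern may match at several places
            for offset, digit in enumerate(digits):
                values[start + offset] = max(values[start + offset], digit)
            start = word.find(letters, start + 1)
    return values


def annotate(word, patterns=PATTERNS):
    """Interleave letters and non-zero gap values, e.g. 'hy3phe2n...'."""
    values = gap_values(word, patterns)
    out = []
    for i, ch in enumerate(word):
        if values[i]:
            out.append(str(values[i]))
        out.append(ch)
    if values[len(word)]:
        out.append(str(values[len(word)]))
    return "".join(out)


def hyphenate(word, patterns=PATTERNS):
    """Break the word at every inner gap whose value is odd."""
    values = gap_values(word, patterns)
    pieces, last = [], 0
    for i in range(1, len(word)):
        if values[i] % 2 == 1:
            pieces.append(word[last:i])
            last = i
    pieces.append(word[last:])
    return "-".join(pieces)


print(annotate("hyphenation"))   # hy3phe2n5a4t2io2n
print(hyphenate("hyphenation"))  # hy-phen-ation
```

With so few patterns the sketch is only meaningful for “hyphenation” itself; for other words you would have to load the full pattern list.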
