(I'm adding the TeX hyphenation mailing list to the recipients; I apologise for cross-posting. Discussion related to hyphenation patterns may continue on the hyphenation list (or off-list if needed). XeLaTeX issues, in particular "how not to start a line with a word or sentence separator", may stay on the XeTeX list, since they are more or less engine- and polyglossia-related.)
On Thu, Nov 4, 2010 at 15:53, Gareth Hughes wrote:
> Dear Adam,
>
> Line 7 of gloss-amharic.ldf in the polyglossia package has
>
> hyphennames={amharic,nohyphenation},
>
> which I take to mean that you'll get no hyphenation wherever 'amharic'
> is active. The next line is commented out
>
> %hyphenmins={2,2},
>
> so I presume that some rules were intended (François?). If the rules are
> that hyphenation can occur anywhere, I'm sure this would be fairly
> easy to implement.

An example of hyphenation patterns is attached. I do not claim that the patterns work perfectly (they probably don't), but they might be a starting point. I simply added the digit 1 after each valid Unicode character between U+1200 and U+135A (without removing characters that do not exist in Amharic and without using those from Unicode 6, 2D80–2DDF).

1.) Put the file hyph-am.tex into
    /usr/local/texlive/2010/texmf-dist/tex/generic/hyph-utf8/patterns/tex/hyph-am.tex
2.) Put loadhyph-am.tex into
    /usr/local/texlive/2010/texmf-dist/tex/generic/hyph-utf8/loadhyph/loadhyph-am.tex
3.) Add the line
        amharic loadhyph-am.tex
    to /usr/local/texlive/2010/texmf-var/tex/generic/config/language.dat
4.) Change "%hyphenmins={2,2}," into "hyphenmins={1,1}," in
    /usr/local/texlive/2010/texmf-dist/tex/xelatex/polyglossia/gloss-amharic.ldf
5.) Run
        sudo fmtutil-sys --byfmt xelatex

You can also test with the following (keep the rest of the document unchanged):

    % Typeset the argument at \hsize=1pt so that every permitted break is taken
    \newdimen\savehsize
    \savehsize\hsize
    \def\test#1{\endgraf\hsize=1pt\noindent #1\endgraf\hsize=\savehsize}

    \begin{amharic}
    \test{እስመ ፡ አግዚአብሔር ፡ አምላክ ፡ ማእምር ፡ ውእቱ ። እግዚአብሔር ፡ አስተደወ ፡ መንብሮ ። ወአድከመ ፡ ቅሥተ ፡ ኀያላን ። ወአቅነቶሙ ፡ ኀይለ ፡ ለድኩማን ። ጽጉማን ፡ እክል ፡ ርኅቡ ። ወርኁባን ፡ ጸግቡ ። እስመ ፡ መካን ፡ ወለደት ፡ ሰብዐተ ፡ ወወለድሰ ፡ ስእነት ፡ ወሊደ ፡ እግዚአብሔር ፡ ይቀትል ፡ ወየሐዩ ። ያወርድኒ ፡ ውስተ ፡ ሲእል ፡ ወየዐርግ ። እግዚአብሔር ፡ ያነዲ ፡ ወያብዕል ። ያኀስርሂ ፡ ወያከብር ፡ ዘያነሥኦ ፡ እምድር ፡ ለነዳይ ። ከመ ፡ ያንብሮ ፡ ምስለ ፡ ዓበይ[ተ] ።}
    \end{amharic}

The problem of word dividers (the colon-like signs) that must not start a new line has to be solved on a different level.
You could tie each divider to the preceding word by hand, like this:

    እስመ~፡ አግዚአብሔር~፡ አምላክ~፡ ማእምር~፡ ውእቱ~። እግዚአብሔር~፡ አስተደወ~፡ መንብሮ~። ወአድከመ~፡ ቅሥተ~፡ ኀያላን~። ወአቅነቶሙ~፡ ኀይለ~፡ ለድኩማን~። ጽጉማን~፡ እክል~፡ ርኅቡ~። ወርኁባን~፡ ጸግቡ~። እስመ~፡ መካን~፡ ወለደት~፡ ሰብዐተ~፡ ወወለድሰ~፡ ስእነት~፡ ወሊደ~፡ እግዚአብሔር~፡ ይቀትል~፡ ወየሐዩ~። ያወርድኒ~፡ ውስተ~፡ ሲእል~፡ ወየዐርግ~። እግዚአብሔር~፡ ያነዲ~፡ ወያብዕል~። ያኀስርሂ~፡ ወያከብር~፡ ዘያነሥኦ~፡ እምድር~፡ ለነዳይ~። ከመ~፡ ያንብሮ~፡ ምስለ~፡ ዓበይ[ተ]~።

This works perfectly fine, but you probably don't want to type your text that way. I leave it up to others to solve that problem. The hyphenchar can easily be changed to "nothing" though.

Mojca

> Adam McCollum wrote:
>> Dear list members,
>>
>> I've recently drawn up a short document in Ge`ez (classical Ethiopic) using
>> Polyglossia and I see that the hyphenation is wrong. As some of you know,
>> languages that use the Ethiopic script, including Ge`ez and Amharic, place a
>> word divider (it looks somewhat like a thick colon) between each word and two
>> of these dividers side by side between sentences; see some Amharic examples
>> here<http://books.google.com/books?id=r87yh5z66TEC&printsec=frontcover&dq=amharic&hl=en&ei=U7TSTIX-Ds2r8AaT6LxF&sa=X&oi=book_result&ct=book-thumbnail&resnum=6&ved=0CEwQ6wEwBQ#v=onepage&q&f=false>.
>> That being the case, a word may be broken at any syllable (the script is a
>> syllabary, not an alphabet) at the end of a line, but there is nothing
>> corresponding to a hyphen. An additional matter of importance is that no
>> line should begin with the single or double word divider. How should this be
>> fixed?
>>
>> Here is a minimal example:
>>
>> \documentclass[12pt]{article}
>>
>> \usepackage{fontspec}
>> \usepackage{polyglossia}
>>
>> \setmainlanguage{english}
>> \setotherlanguage{amharic}
>>
>> \newfontfamily\amharicfont[Script = Ethiopic, Scale = 1.3]{Abyssinica SIL}
>>
>> \begin{document}
>>
>> \title{Sample in Gǝ`ǝz}
>> \maketitle
>>
>> \begin{amharic}
>> እስመ ፡ አግዚአብሔር ፡ አምላክ ፡ ማእምር ፡ ውእቱ ። እግዚአብሔር ፡ አስተደወ ፡ መንብሮ ። ወአድከመ ፡ ቅሥተ ፡
>> ኀያላን ። ወአቅነቶሙ ፡ ኀይለ ፡ ለድኩማን ። ጽጉማን ፡ እክል ፡ ርኅቡ ። ወርኁባን ፡ ጸግቡ ። እስመ ፡ መካን ፡
>> ወለደት ፡ ሰብዐተ ፡ ወወለድሰ ፡ ስእነት ፡ ወሊደ ፡ እግዚአብሔር ፡ ይቀትል ፡ ወየሐዩ ። ያወርድኒ ፡ ውስተ ፡ ሲእል
>> ፡ ወየዐርግ ። እግዚአብሔር ፡ ያነዲ ፡ ወያብዕል ። ያኀስርሂ ፡ ወያከብር ፡ ዘያነሥኦ ፡ እምድር ፡ ለነዳይ ። ከመ ፡
>> ያንብሮ ፡ ምስለ ፡ ዓበይ[ተ] ።
>> \end{amharic}
>>
>> \end{document}
>>
>> With many thanks in advance for the help,
>>
>> Adam McCollum, Ph.D.
>> Lead Cataloger, Eastern Christian Manuscripts
>> Hill Museum & Manuscript Library
>> Saint John's University
>> P.O. Box 7300
>> Collegeville, MN 56321
>>
>> (320) 363-2075 (phone)
>> (320) 363-3222 (fax)
>> www.hmml.org
> --
> Gareth Hughes
> Doctoral candidate in Syriac studies
>
> Department of Eastern Christianity
> Oriental Institute
> Pusey Lane
> Oxford
> OX1 2LE
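
For readers without the attachments, the pattern construction described in the reply (the digit 1 after each Ethiopic syllable) can be sketched roughly as below. This is an illustration only, not the contents of the attached hyph-am.tex: it assumes the usual \patterns{...} wrapper of a TeX pattern file and lists just the first three code points from U+1200 onward; a real file would continue in the same way through the rest of the range.

```latex
% Illustrative sketch only (assumption: not copied from the attached hyph-am.tex).
% Each entry is one Ethiopic syllable followed by the digit 1,
% which permits a line break immediately after that syllable.
\patterns{
ሀ1
ሁ1
ሂ1
}
```

In TeX hyphenation patterns an odd digit marks a permitted break at that position, so entries of this shape allow a break after every listed syllable; together with hyphenmins={1,1} this lets a word be divided after a single syllable at either end.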
Attachments: hyph-am.tex (TeX document), loadhyph-am.tex (TeX document)