Re: [NTG-context] Hyphenation patterns
Denis’ latest question reminded me of an earlier query he had about hyphenation, asking why “applicable” and “obligated” were hyphenated by ConTeXt as ap-plic-a-ble and ob-lig-at-ed, and not ap-pli-ca-ble and ob-li-ga-te(d) as in Merriam-Webster (the discussion started at https://mailman.ntg.nl/pipermail/ntg-context/2020/099695.html).

First of all, I note that while Webster’s dictionary is a useful guide, and indeed a major reference for any American typographer, there’s no absolute rule that we have to follow it. The break applic-able, for example, does look acceptable to me; oblig-ated, less so.

Taco reminded us that when producing a set of hyphenation patterns from a list of hyphenated words, we’re essentially compressing information, and that some minor deviations are to be expected. However, in my experience, unexpected breakpoints are almost never due to chance, but to a deliberate decision. Then Hraban said:

On Fri, Oct 09, 2020 at 10:15:17AM +0200, Henning Hraban Ramm wrote:
> Usually Arthur’s (hail the emperor of hyphenation and protector of the
> patterns) patterns are flawless, so I guess it’s not a bug but an exception
> of the rules.

I see that my self-appointed title is catching on, nice :-) Unfortunately the patterns are just as likely to contain errors as anything else, and in this particular case we’ll probably never know for sure, because the original hyphenated word list was never published (all the word lists from which patterns were produced in the 80s and 90s have been lost, for all languages). We’re thus reduced to guessing the intent of those who compiled the lists.

We can get hints from looking at the patterns involved in the debatable breaks. Hans has a useful script:

    $ mtxrun --script patterns --language=us --left=2 --right=2 --hyphenate applicable
    hyphenator  |
    hyphenator  | . a p p l i c a b l e .    . a p p l i c a b l e .
    hyphenator  |    4p1p0                    0 4 1 0 0 0 0 0 0 0 0
    hyphenator  |      1p2l2                  0 4 1 2 2 0 0 0 0 0 0
    hyphenator  |      0p0l0i2c1a0b0          0 4 1 2 2 2 1 0 0 0 0
    hyphenator  |            1c0a0            0 4 1 2 2 2 1 0 0 0 0
    hyphenator  |            0c0a1b0l0        0 4 1 2 2 2 1 1 0 0 0
    hyphenator  |                0b2l2        0 4 1 2 2 2 1 1 2 2 0
    hyphenator  |                0b4l0e0.0    0 4 1 2 2 2 1 1 4 2 0
    hyphenator  | .0a4p1p2l2i2c1a1b4l2e0.    . a p-p l i c-a-b l e .
    hyphenator  |
    mtx-patterns| us 2 2 : applicable : ap-plic-a-ble

That tells us that there are seven patterns involved in hyphenating the word applicable: 4p1p, 1p2l2, pli2c1ab, 1ca, ca1bl, b2l2, and b4le. (the final dot is part of that last pattern). The pattern responsible for the break applic-able is pli2c1ab.

If we now refer to the source repository for hyphenation patterns (since comments are stripped in the ConTeXt sources), at https://github.com/hyphenation/tex-hyphen/blob/master/hyph-utf8/tex/generic/hyph-utf8/patterns/tex/hyph-en-us.tex, we can see at line 4508

    hyphen.tex patterns end here, and additional patterns begin:

which means that the pattern pli2c1ab, line 4817, is an “additional pattern”. The background story is that hyphen.tex, the original hyphenation pattern file for American English, produced in 1982-1983 from a list of hyphenated words (following mostly Webster’s), was later augmented with more patterns that were supposed to improve hyphenation for many words. The person who added these new patterns apparently had a list of words hyphenated incorrectly (according to him) by hyphen.tex, but both that list and the one used to produce hyphen.tex are, as mentioned above, now lost, probably forever.

In any case, the pattern that causes the break applic-able was clearly added intentionally; and as I said, that break seems quite reasonable to me. Not so for the one in oblig-ated, so let’s have a look at that:

    $ mtxrun --script patterns --language=us --left=2 --right=2 --hyphenate obligated
    hyphenator  |
    hyphenator  | . o b l i g a t e d .    . o b l i g a t e d .
    hyphenator  |  0o0b0l0i2g1              0 0 0 0 2 1 0 0 0 0
    hyphenator  |    0b2l2                  0 0 2 2 2 1 0 0 0 0
    hyphenator  |      5l0i0g0a0t0e0        0 0 5 2 2 1 0 0 0 0
    hyphenator  |        2i0g0              0 0 5 2 2 1 0 0 0 0
    hyphenator  |          1g0a0            0 0 5 2 2 1 0 0 0 0
    hyphenator  |              2t1e0d0      0 0 5 2 2 1 2 1 0 0
    hyphenator  | .0o0b5l2i2g1a2t1e0d0.    . o b-l i g-a t-e d .
    hyphenator  |
    mtx-patterns| us 2 2 : obligated : ob-lig-at-ed

Here we see that the dubious break is caused by the pattern obli2g1, also an “additional pattern” (line 4783), and here it’s not hard to guess where
Re: [NTG-context] Hyphenation patterns
On 09.10.2020 at 14:48, Hans Hagen wrote:
> On 10/9/2020 9:01 AM, Denis Maier wrote:
>> [...]
>> I see. I've noticed lang-us.lua has a list of exceptions in it:
>>
>>     ["exceptions"]={
>>      ["characters"]="abcdefghijlmnoprstuyz",
>>      ["data"]="as-so-ciate as-so-ciates dec-li-na-tion oblig-a-tory phil-an-thropic present presents project projects reci-procity re-cog-ni-zance ref-or-ma-tion ret-ri-bu-tion ta-ble",
>>      ["length"]=168,
>>      ["n"]=14,
>>     },
>>
>> Would it be possible to add more exceptions to that list as they come up? Or is that inappropriate?
>
> you can add your own runtime in a style:
>
>     \hyphenation {fo-ob-ar} \hsize 1mm foobar

Sure. I use \startexceptions[en] for that. I just thought everyone might benefit...

Denis
___
If your question is of interest to others as well, please add an entry to the Wiki!

maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : http://www.pragma-ade.nl / http://context.aanhet.net
archive  : https://bitbucket.org/phg/context-mirror/commits/
wiki     : http://contextgarden.net
___
Re: [NTG-context] Hyphenation patterns
On 10/9/2020 9:01 AM, Denis Maier wrote:
> On 09.10.2020 at 08:57, Taco Hoekwater wrote:
>> On 9 Oct 2020, at 08:52, Denis Maier wrote:
>>> On 08.10.2020 at 19:05, Henning Hraban Ramm wrote:
>>>> \starttext
>>>>
>>>> {EN: \en\hyphenatedcoloredword{applicable}}
>>>>
>>>> {DE: \de\hyphenatedcoloredword{applicable}}
>>>>
>>>> \stoptext
>>>
>>> Wow, that's super helpful. The English pattern seems to be "ap-plic-a-ble"
>>> According to Merriam-Webster it should just be "ap·pli·ca·ble".
>>>
>>> {EN: \en\hyphenatedcoloredword{obligate}} gives me "ob-lig-ate"
>>> According to Merriam-Webster it should be "ob·li·gate".
>>>
>>> I've had a look at the files mentioned by Tomáš, but as these are not just
>>> wordlists I cannot really tell what is happening.
>>>
>>> So, is that a bug?
>>
>> Not really. Hyphenation patterns are a bit like applying JPEG compression to a dictionary. It makes the data size smaller by recognising patterns while ignoring outliers. Occasional errors are to be expected, which is why \hyphenation exists.
>
> I see. I've noticed lang-us.lua has a list of exceptions in it:
>
>     ["exceptions"]={
>      ["characters"]="abcdefghijlmnoprstuyz",
>      ["data"]="as-so-ciate as-so-ciates dec-li-na-tion oblig-a-tory phil-an-thropic present presents project projects reci-procity re-cog-ni-zance ref-or-ma-tion ret-ri-bu-tion ta-ble",
>      ["length"]=168,
>      ["n"]=14,
>     },
>
> Would it be possible to add more exceptions to that list as they come up? Or is that inappropriate?

you can add your own at runtime in a style:

    \hyphenation {fo-ob-ar} \hsize 1mm foobar

-
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-
Re: [NTG-context] Hyphenation patterns
On 08.10.2020 at 19:05, Henning Hraban Ramm wrote:
> \starttext
>
> {EN: \en\hyphenatedcoloredword{applicable}}
>
> {DE: \de\hyphenatedcoloredword{applicable}}
>
> \stoptext

Wow, that's super helpful. The English pattern seems to be "ap-plic-a-ble"
According to Merriam-Webster it should just be "ap·pli·ca·ble".

{EN: \en\hyphenatedcoloredword{obligate}} gives me "ob-lig-ate"
According to Merriam-Webster it should be "ob·li·gate".

I've had a look at the files mentioned by Tomáš, but as these are not just wordlists I cannot really tell what is happening.

So, is that a bug?

Best,
Denis
Re: [NTG-context] Hyphenation patterns
On 10/9/2020 10:15 AM, Henning Hraban Ramm wrote:
> On 09.10.2020 at 08:52, Denis Maier wrote:
>> On 08.10.2020 at 19:05, Henning Hraban Ramm wrote:
>>> \starttext
>>>
>>> {EN: \en\hyphenatedcoloredword{applicable}}
>>>
>>> {DE: \de\hyphenatedcoloredword{applicable}}
>>>
>>> \stoptext
>>
>> Wow, that's super helpful.
>
> BTW \hyphenatedword works the same. I didn’t see anything colored.
> There are some more commands like this, even \hyphenatedfile, see
> https://source.contextgarden.net/tex/context/base/mkiv/supp-box.mkiv?search=hyphenated
>
> Usually Arthur’s (hail the emperor of hyphenation and protector of the
> patterns) patterns are flawless, so I guess it’s not a bug but an exception
> of the rules.

ancient secret features:

    >mtxrun --script patterns --hyphenate applicable --language=gb
    hyphenator  |
    hyphenator  | . a p p l i c a b l e .    . a p p l i c a b l e .
    hyphenator  |  2a0p0                      2 0 0 0 0 0 0 0 0 0 0
    hyphenator  |    4p1p2                    2 4 1 2 0 0 0 0 0 0 0
    hyphenator  |      0p2l2                  2 4 1 2 2 0 0 0 0 0 0
    hyphenator  |              1a0b0          2 4 1 2 2 0 1 0 0 0 0
    hyphenator  |                2b0l2        2 4 1 2 2 0 1 2 0 2 0
    hyphenator  |                  4l0e0.0    2 4 1 2 2 0 1 2 4 2 0
    hyphenator  | .2a4p1p2l2i0c1a2b4l2e0.    . a p-p l i c-a b l e .
    hyphenator  |
    mtx-patterns| gb 3 3 : applicable : applic-able

    >mtxrun --script patterns --hyphenate applicable --language=us
    hyphenator  |
    hyphenator  | . a p p l i c a b l e .    . a p p l i c a b l e .
    hyphenator  |    4p1p0                    0 4 1 0 0 0 0 0 0 0 0
    hyphenator  |      1p2l2                  0 4 1 2 2 0 0 0 0 0 0
    hyphenator  |      0p0l0i2c1a0b0          0 4 1 2 2 2 1 0 0 0 0
    hyphenator  |            1c0a0            0 4 1 2 2 2 1 0 0 0 0
    hyphenator  |            0c0a1b0l0        0 4 1 2 2 2 1 1 0 0 0
    hyphenator  |                0b2l2        0 4 1 2 2 2 1 1 2 2 0
    hyphenator  |                0b4l0e0.0    0 4 1 2 2 2 1 1 4 2 0
    hyphenator  | .0a4p1p2l2i2c1a1b4l2e0.    . a p-p l i c-a-b l e .
    hyphenator  |
    mtx-patterns| us 3 3 : applicable : applic-a-ble

not the kind of stuff one wants to expose a new user to

Hans
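For anyone reading these traces for the first time, the following is a minimal Python sketch of how such patterns combine. It is not the mtx-patterns code and makes no claim about how ConTeXt implements it: the names `parse_pattern` and `hyphenate` are made up for illustration, and the pattern lists contain only the handful of us patterns visible in the traces quoted in this thread (the real file has thousands, so other words will not come out right). The idea, as the traces suggest, is that each matching pattern contributes digits between letters, the highest digit at each position wins, and an odd result allows a break, subject to the left and right minimums.

```python
import re

def parse_pattern(pattern):
    """Split a pattern such as 'pli2c1ab' into its letters and the
    values that sit before, between and after those letters."""
    letters = re.sub(r"\d", "", pattern)
    values = [0] * (len(letters) + 1)
    pos = 0
    for ch in pattern:
        if ch.isdigit():
            values[pos] = int(ch)
        else:
            pos += 1
    return letters, values

def hyphenate(word, patterns, leftmin=2, rightmin=2):
    """Apply the given patterns to one word and return it with hyphens."""
    dotted = "." + word + "."          # dots mark the word boundaries
    scores = [0] * (len(dotted) + 1)   # scores[k] sits before dotted[k]
    for letters, values in map(parse_pattern, patterns):
        start = dotted.find(letters)
        while start != -1:
            for offset, value in enumerate(values):
                # the highest value seen at a position wins
                scores[start + offset] = max(scores[start + offset], value)
            start = dotted.find(letters, start + 1)
    parts, last = [], 0
    for i in range(leftmin, len(word) - rightmin + 1):
        # scores[i + 1] is the value between word[i - 1] and word[i];
        # an odd value permits a break there
        if scores[i + 1] % 2 == 1:
            parts.append(word[last:i])
            last = i
    parts.append(word[last:])
    return "-".join(parts)

# Only the patterns that show up in the traces in this thread (assumption:
# nothing else in the full pattern set would match these two words).
US_APPLICABLE = ["4p1p", "1p2l2", "pli2c1ab", "1ca", "ca1bl", "b2l2", "b4le."]
US_OBLIGATED = ["obli2g1", "b2l2", "5ligate", "2ig", "1ga", "2t1ed"]

print(hyphenate("applicable", US_APPLICABLE, 2, 2))
print(hyphenate("applicable", US_APPLICABLE, 3, 3))
print(hyphenate("obligated", US_OBLIGATED, 2, 2))
```

With the left and right minimums set to 3, the break after "ap" is simply not considered, which is why the same patterns report ap-plic-a-ble in one trace and applic-a-ble in the other.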
Re: [NTG-context] Hyphenation patterns
On 10/8/2020 7:05 PM, Henning Hraban Ramm wrote:
> On 08.10.2020 at 17:41, Denis Maier wrote:
>> where can I find the hyphenation patterns used by ConTeXt? I have two wrongly hyphenated words, and I want to check whether this is due to incorrect patterns. (I tried the source browser... not much luck so far.) The words are:
>> 1. applicable => hyphenated as applic-able
>> 2. obligated => hyphenated as oblig-ated
>>
>> I know I can use \hyphenation to correct that, but I wanted to check the patterns nevertheless.
>
> I guess it’s just a valid option.
> You can check possible hyphenations like this:
>
> \starttext
>
> {EN: \en\hyphenatedcoloredword{applicable}}
>
> {DE: \de\hyphenatedcoloredword{applicable}}
>
> \stoptext

americans and brits hyphenate differently

    \starttext
    {\language[usenglish] {\tt US \number\normallanguage}: \hyphenatedcoloredword{applicable}}\par
    {\language[ukenglish] {\tt UK \number\normallanguage}: \hyphenatedcoloredword{applicable}}\par
    \stoptext

syllable vs stem (but I bet Arthur can explain better)

Hans
Re: [NTG-context] Hyphenation patterns
> On 09.10.2020 at 08:52, Denis Maier wrote:
>
> On 08.10.2020 at 19:05, Henning Hraban Ramm wrote:
>> \starttext
>>
>> {EN: \en\hyphenatedcoloredword{applicable}}
>>
>> {DE: \de\hyphenatedcoloredword{applicable}}
>>
>> \stoptext
>>
> Wow, that's super helpful.

BTW \hyphenatedword works the same. I didn’t see anything colored.
There are some more commands like this, even \hyphenatedfile, see
https://source.contextgarden.net/tex/context/base/mkiv/supp-box.mkiv?search=hyphenated

Usually Arthur’s (hail the emperor of hyphenation and protector of the patterns) patterns are flawless, so I guess it’s not a bug but an exception of the rules.

Hraban
Re: [NTG-context] Hyphenation patterns
On 09.10.2020 at 08:57, Taco Hoekwater wrote:
> On 9 Oct 2020, at 08:52, Denis Maier wrote:
>> On 08.10.2020 at 19:05, Henning Hraban Ramm wrote:
>>> \starttext
>>>
>>> {EN: \en\hyphenatedcoloredword{applicable}}
>>>
>>> {DE: \de\hyphenatedcoloredword{applicable}}
>>>
>>> \stoptext
>>
>> Wow, that's super helpful. The English pattern seems to be "ap-plic-a-ble"
>> According to Merriam-Webster it should just be "ap·pli·ca·ble".
>>
>> {EN: \en\hyphenatedcoloredword{obligate}} gives me "ob-lig-ate"
>> According to Merriam-Webster it should be "ob·li·gate".
>>
>> I've had a look at the files mentioned by Tomáš, but as these are not just
>> wordlists I cannot really tell what is happening.
>>
>> So, is that a bug?
>
> Not really. Hyphenation patterns are a bit like applying JPEG compression to a dictionary. It makes the data size smaller by recognising patterns while ignoring outliers. Occasional errors are to be expected, which is why \hyphenation exists.

I see. I've noticed lang-us.lua has a list of exceptions in it:

    ["exceptions"]={
     ["characters"]="abcdefghijlmnoprstuyz",
     ["data"]="as-so-ciate as-so-ciates dec-li-na-tion oblig-a-tory phil-an-thropic present presents project projects reci-procity re-cog-ni-zance ref-or-ma-tion ret-ri-bu-tion ta-ble",
     ["length"]=168,
     ["n"]=14,
    },

Would it be possible to add more exceptions to that list as they come up? Or is that inappropriate?

Denis
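As a rough illustration of what that exceptions table seems to encode, here is a small Python sketch. It only reads the "data" string quoted in this thread; the helper names (`build_exceptions`, `hyphenate_with_exceptions`) and the idea of a fallback function are assumptions made for the example and say nothing about how ConTeXt handles exceptions internally.

```python
# The "data" string as quoted from lang-us.lua in this thread.
EXCEPTIONS_DATA = (
    "as-so-ciate as-so-ciates dec-li-na-tion oblig-a-tory phil-an-thropic "
    "present presents project projects reci-procity re-cog-ni-zance "
    "ref-or-ma-tion ret-ri-bu-tion ta-ble"
)

def build_exceptions(data):
    """Map each plain word to its hyphenated form, e.g. 'table' -> 'ta-ble'."""
    return {entry.replace("-", ""): entry for entry in data.split()}

def hyphenate_with_exceptions(word, exceptions, fallback):
    """Hypothetical lookup order: an exception wins, otherwise use the
    pattern-based result supplied by `fallback`."""
    key = word.lower()
    if key in exceptions:
        return exceptions[key]
    return fallback(word)

exceptions = build_exceptions(EXCEPTIONS_DATA)

# The other fields of the table can be recomputed from the data string.
print(len(exceptions), len(EXCEPTIONS_DATA))
print("".join(sorted(set(EXCEPTIONS_DATA) - {"-", " "})))

# A user-level addition, in the spirit of \hyphenation{ap-pli-ca-ble}.
exceptions["applicable"] = "ap-pli-ca-ble"
no_patterns = lambda w: w  # stand-in for a real pattern-based hyphenator
print(hyphenate_with_exceptions("applicable", exceptions, no_patterns))
print(hyphenate_with_exceptions("present", exceptions, no_patterns))
```

Entries without any hyphen, such as "present" and "project", read as words that must not be broken at all, presumably because the right break depends on whether the word is a noun or a verb.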
Re: [NTG-context] Hyphenation patterns
> On 9 Oct 2020, at 08:52, Denis Maier wrote:
>
> On 08.10.2020 at 19:05, Henning Hraban Ramm wrote:
>> \starttext
>>
>> {EN: \en\hyphenatedcoloredword{applicable}}
>>
>> {DE: \de\hyphenatedcoloredword{applicable}}
>>
>> \stoptext
>>
> Wow, that's super helpful. The English pattern seems to be "ap-plic-a-ble"
> According to Merriam-Webster it should just be "ap·pli·ca·ble".
>
> {EN: \en\hyphenatedcoloredword{obligate}} gives me "ob-lig-ate"
> According to Merriam-Webster it should be "ob·li·gate".
>
> I've had a look at the files mentioned by Tomáš, but as these are not just
> wordlists I cannot really tell what is happening.
>
> So, is that a bug?

Not really. Hyphenation patterns are a bit like applying JPEG compression to a dictionary. It makes the data size smaller by recognising patterns while ignoring outliers. Occasional errors are to be expected, which is why \hyphenation exists.

Best wishes,
Taco
[NTG-context] Hyphenation patterns
Hi,

where can I find the hyphenation patterns used by ConTeXt? I have two wrongly hyphenated words, and I want to check whether this is due to incorrect patterns. (I tried the source browser... not much luck so far.) The words are:
1. applicable => hyphenated as applic-able
2. obligated => hyphenated as oblig-ated

I know I can use \hyphenation to correct that, but I wanted to check the patterns nevertheless.

Best,
Denis
Re: [NTG-context] Hyphenation patterns
> On 08.10.2020 at 17:41, Denis Maier wrote:
>
> where can I find the hyphenation patterns used by ConTeXt? I have two wrongly
> hyphenated words, and I want to check whether this is due to incorrect
> patterns. (I tried the source browser... not much luck so far.) The words are:
> 1. applicable => hyphenated as applic-able
> 2. obligated => hyphenated as oblig-ated
>
> I know I can use \hyphenation to correct that, but I wanted to check the
> patterns nevertheless.

I guess it’s just a valid option.
You can check possible hyphenations like this:

    \starttext

    {EN: \en\hyphenatedcoloredword{applicable}}

    {DE: \de\hyphenatedcoloredword{applicable}}

    \stoptext

Hraban
Re: [NTG-context] Hyphenation patterns
Hi,

you can find the patterns in this directory:

    texlive/2020/texmf-dist/tex/context/patterns/mkiv/

Best wishes,
Tomáš

On Thu, Oct 08, 2020 at 05:41:09PM +0200, Denis Maier wrote:
# Hi,
#
# where can I find the hyphenation patterns used by ConTeXt? I have
# two wrongly hyphenated words, and I want to check whether this is
# due to incorrect patterns. (I tried the source browser... not much
# luck so far.) The words are:
# 1. applicable => hyphenated as applic-able
# 2. obligated => hyphenated as oblig-ated
#
# I know I can use \hyphenation to correct that, but I wanted to check
# the patterns nevertheless.
#
# Best,
# Denis

Tomáš Hála
Mendelova univerzita, Provozně ekonomická fakulta, ústav informatiky
Zemědělská 1, CZ-613 00 Brno, tel. +420 545 13 22 28
http://akela.mendelu.cz/~thala
Re: [NTG-context] Hyphenation patterns and adjusted kerning: ConTeXt vs. LuaTeX
On Fri, Feb 25, 2011 at 04:45:31PM +0100, Ulrike Fischer wrote:
> > Ah, yes, the transcript of my first example clearly shows fontspec
> > operating in node mode.
>
> > Please excuse my naive asking: Is there any way to continue using
> > fontspec's setmainfont command (it is convenient for someone
> > inexperienced like me) and at the same time force luaotfload into
> > using base mode?
>
> The following seems to work:
>
> \documentclass{article}
> \usepackage[ngerman]{babel}
> \usepackage{fontspec}
> \setmainfont[RawFeature={mode=base},FeatureFile=bonum.fea]{TeX Gyre Bonum}

Better "Renderer=Basic".

--
Khaled Hosny
Egyptian
Arab
___
If your question is of interest to others as well, please add an entry to the Wiki!

maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : http://www.pragma-ade.nl / http://tex.aanhet.net
archive  : http://foundry.supelec.fr/projects/contextrev/
wiki     : http://contextgarden.net
___
Re: [NTG-context] Hyphenation patterns and adjusted kerning: ConTeXt vs. LuaTeX
On Fri, Feb 25, 2011 at 03:41:10PM +0100, Ulrike Fischer wrote:
> So I think it isn't true that the manual of luaotfload claims "By
> default mode=base is used".

It used to be like that but we changed it a while ago, looks like I didn't update the manual.

Regards,
Khaled

--
Khaled Hosny
Egyptian
Arab
[NTG-context] Hyphenation patterns and adjusted kerning: ConTeXt vs. LuaTeX
On Fri, 25 Feb 2011 16:45:31 +0100, Ulrike Fischer wrote:

>> Please excuse my naive asking: Is there any way to continue using
>> fontspec's setmainfont command (it is convenient for someone
>> inexperienced like me) and at the same time force luaotfload into
>> using base mode?
>
> The following seems to work:
>
> \documentclass{article}
> \usepackage[ngerman]{babel}
> \usepackage{fontspec}
> \setmainfont[RawFeature={mode=base},FeatureFile=bonum.fea]{TeX Gyre Bonum}
> \begin{document}
> fh aufhalten
> \end{document}

Oh joy! Thank you all, once again, for your help. It is much appreciated.

(The downside: Now that I can, I have no choice but to actually fix all the terrible kerning mistakes in TeX Gyre Bonum. Hours and hours of work. Sigh...)

- Till
Re: [NTG-context] Hyphenation patterns and adjusted kerning: ConTeXt vs. LuaTeX
On Fri, 25 Feb 2011 16:45:31 +0100, Ulrike Fischer wrote:

>> Ah, yes, the transcript of my first example clearly shows fontspec operating
>> in node mode.

Yes, but I could also reproduce the problem without fontspec (only with luaotfload).

>> Please excuse my naive asking: Is there any way to continue using
>> fontspec's setmainfont command (it is convenient for someone
>> inexperienced like me) and at the same time force luaotfload into
>> using base mode?
>
> The following seems to work:
>
> \documentclass{article}
> \usepackage[ngerman]{babel}
> \usepackage{fontspec}
> \setmainfont[RawFeature={mode=base},FeatureFile=bonum.fea]{TeX Gyre Bonum}
> \begin{document}
> fh aufhalten
> \end{document}

And after a look in the fontspec code:

    \setmainfont[Renderer=Basic,FeatureFile=bonum.fea]{TeX Gyre Bonum}

--
Ulrike Fischer
Re: [NTG-context] Hyphenation patterns and adjusted kerning: ConTeXt vs. LuaTeX
On Fri, 25 Feb 2011 16:37:26 +0100, Heilmann, Till A. wrote:

> On Fri, 25 Feb 2011 14:41:10 +0100, Ulrike Fischer wrote:
>
>>> In base mode kerning and hyphenation
>>> happen in the traditional tex way, so there is not much extra trickery
>>> taking place.
>>
>> Well, as you mention "base mode": This reminded me that I had to
>> force base mode to get my reencoding to work in latex. So I tried in
>> context + latex/luaotfload (with german hyphenation patterns):
>>
>> [...]
>>
>> And bingo: with mode=base it works in both formats, with mode=node
>> the kern disappears. Without mode declaration the kern disappears in
>> latex.
>
> Ah, yes, the transcript of my first example clearly shows fontspec operating
> in node mode.

> Please excuse my naive asking: Is there any way to continue using
> fontspec's setmainfont command (it is convenient for someone
> inexperienced like me) and at the same time force luaotfload into
> using base mode?

The following seems to work:

    \documentclass{article}
    \usepackage[ngerman]{babel}
    \usepackage{fontspec}
    \setmainfont[RawFeature={mode=base},FeatureFile=bonum.fea]{TeX Gyre Bonum}
    \begin{document}
    fh aufhalten
    \end{document}

--
Ulrike Fischer
[NTG-context] Hyphenation patterns and adjusted kerning: ConTeXt vs. LuaTeX
On Fri, 25 Feb 2011 14:41:10 +0100, Ulrike Fischer wrote:

>> In base mode kerning and hyphenation
>> happen in the traditional tex way, so there is not much extra trickery
>> taking place.
>
> Well, as you mention "base mode": This reminded me that I had to
> force base mode to get my reencoding to work in latex. So I tried in
> context + latex/luaotfload (with german hyphenation patterns):
>
> [...]
>
> And bingo: with mode=base it works in both formats, with mode=node
> the kern disappears. Without mode declaration the kern disappears in
> latex.

Ah, yes, the transcript of my first example clearly shows fontspec operating in node mode.

Please excuse my naive asking: Is there any way to continue using fontspec's setmainfont command (it is convenient for someone inexperienced like me) and at the same time force luaotfload into using base mode?

Thanks,

- Till
Re: [NTG-context] Hyphenation patterns and adjusted kerning: ConTeXt vs. LuaTeX
On Fri, 25 Feb 2011 14:35:10 +0100, Hans Hagen wrote:

>> As a new LuaTeX user, I came across the following problem: Using
>> Lua(La)TeX, customized kerning of letter pairs (via the
>> FeatureFile capability of fontspec) is ignored when it coincides
>> with a possible hyphenation of a word (e.g. between 'f' and 'h'
>> in German words like 'aufhalten'; see first minimal example
>> below).

>> 1. Lua(La)TeX

> I cannot test that (I only have the context minimals installed) but I
> don't know anything about latex internals so it would be a wild guess.
> Maybe babel is interfering?

No, the problem exists also if you only load the german patterns.

> In base mode kerning and hyphenation
> happen in the traditional tex way, so there is not much extra trickery
> taking place.

Well, as you mention "base mode": This reminded me that I had to force base mode to get my reencoding to work in latex. So I tried in context + latex/luaotfload (with german hyphenation patterns):

    \font\test="name:TeX Gyre Bonum:mode=base:featurefile=bonum.fea;+kern"

and

    \font\test="name:TeX Gyre Bonum:mode=node:featurefile=bonum.fea;+kern"

And bingo: with mode=base it works in both formats, with mode=node the kern disappears. Without mode declaration the kern disappears in latex.

So I think it isn't true that the manual of luaotfload claims "By default mode=base is used".

--
Ulrike Fischer
Re: [NTG-context] Hyphenation patterns and adjusted kerning: ConTeXt vs. LuaTeX
On 25-2-2011 1:18, Heilmann, Till A. wrote:
> Maybe the ConTeXt community can be of assistance to the LuaTeX bunch ...
>
> As a new LuaTeX user, I came across the following problem: Using Lua(La)TeX, customized kerning of letter pairs (via the FeatureFile capability of fontspec) is ignored when it coincides with a possible hyphenation of a word (e.g. between 'f' and 'h' in German words like 'aufhalten'; see first minimal example below).
>
> Ulrike Fischer was so kind to point out two things (http://tug.org/pipermail/luatex/2011-February/002569.html): First, the problem seems to be the break points between the adjusted kerning pairs. Second, ConTeXt seems to handle this case correctly (see second minimal example below; feature file bonum.fea from first example required).
>
> I am no expert in either LuaTeX or ConTeXt, but Ulrike suggested I post here and ask if the (typographically correct) ConTeXt behavior or solution can be reproduced with Lua(La)TeX.
>
> Thanks,
>
> - Till
>
> 1. Lua(La)TeX
>
> \begin{filecontents*}{bonum.fea}
> languagesystem DFLT dflt;
> languagesystem latn dflt;
> feature kern {
>   pos f h 100;
> } kern;
> \end{filecontents*}
> \documentclass{article}
> \usepackage[ngerman]{babel}
> \usepackage{fontspec}
> \setmainfont[FeatureFile=bonum.fea]{TeX Gyre Bonum}
> \begin{document}
> fh aufhalten
> \end{document}

I cannot test that (I only have the context minimals installed) but I don't know anything about latex internals so it would be a wild guess. Maybe babel is interfering? In base mode kerning and hyphenation happen in the traditional tex way, so there is not much extra trickery taking place.

> 2. ConTeXt
>
> \mainlanguage [de]
> \definefontfeature[test][featurefile=bonum,kern=yes]
> \definefont[test][name:texgyrebonum*test]
> \starttext
> \test fh aufhalten
> \stoptext

Indeed I see a kern.

Hans

-
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | voip: 087 875 68 74 | www.pragma-ade.com | www.pragma-pod.nl
-
[NTG-context] Hyphenation patterns and adjusted kerning: ConTeXt vs. LuaTeX
Maybe the ConTeXt community can be of assistance to the LuaTeX bunch ...

As a new LuaTeX user, I came across the following problem: Using Lua(La)TeX, customized kerning of letter pairs (via the FeatureFile capability of fontspec) is ignored when it coincides with a possible hyphenation of a word (e.g. between 'f' and 'h' in German words like 'aufhalten'; see first minimal example below).

Ulrike Fischer was so kind to point out two things (http://tug.org/pipermail/luatex/2011-February/002569.html): First, the problem seems to be the break points between the adjusted kerning pairs. Second, ConTeXt seems to handle this case correctly (see second minimal example below; feature file bonum.fea from first example required).

I am no expert in either LuaTeX or ConTeXt, but Ulrike suggested I post here and ask if the (typographically correct) ConTeXt behavior or solution can be reproduced with Lua(La)TeX.

Thanks,

- Till

1. Lua(La)TeX

    \begin{filecontents*}{bonum.fea}
    languagesystem DFLT dflt;
    languagesystem latn dflt;
    feature kern {
      pos f h 100;
    } kern;
    \end{filecontents*}
    \documentclass{article}
    \usepackage[ngerman]{babel}
    \usepackage{fontspec}
    \setmainfont[FeatureFile=bonum.fea]{TeX Gyre Bonum}
    \begin{document}
    fh aufhalten
    \end{document}

2. ConTeXt

    \mainlanguage [de]
    \definefontfeature[test][featurefile=bonum,kern=yes]
    \definefont[test][name:texgyrebonum*test]
    \starttext
    \test fh aufhalten
    \stoptext
Re: [NTG-context] hyphenation patterns
Mojca Miklavec (2010-05-24 02:16):
> Dear Claudio,
>
> Thanks a lot for your prompt reply.
>
> On Mon, May 24, 2010 at 00:39, Claudio Beccari wrote:
> > Dear Mojca,
> > no proper Italian word ends in ch (this digraph in normal Italian words is
> > pronounced as k, not as č or ć).
> > Nevertheless there are a number of surnames dating back to the old times
> > (150 years ago) when North East Italy was under Austro-Hungarian rule,
> > when Istrian names, mainly Croatian and Slovenian, were transliterated in
> > such a way that the typical patronymic ending -ič or -ić (I don't know the
> > exact spelling in Latin letters of the Croatian/Slovenian names) was
> > transliterated for the Empire bureaucracy with -ich.
>
> Thanks a lot for some more insight. I admit that I didn't know the
> details (I should be ashamed) and in my area they were more radical
> with surname changes (mine was Michelazzi and I think that most
> surnames here were "properly Romanized", for example Filipčič ->
> Filippi, so again no problems with hyphenation :) :) :).
>
> > This spelling remained
> > when North East Italy and Istria were annexed to the Kingdom of Italy at the
> > end of WW1. After WW2 most of Istria returned mainly to Croatia and a small
> > part to Slovenia, but the Slovenians and Croatians that had moved to NE
> > Italy and had become Italian citizens maintained their surnames with the
> > Austro-Hungarian spelling.
> >
> > When I prepared the hyphen patterns for Italian and Latin I did think of
> > this particular spelling, but I concluded that it was not so important; I
> > was wrong, and I apologize.
>
> There's no need to apologize. First, there's an "infinite" number of
> foreign names, so that one simply cannot get all of them right. I
> guess that Lju-bl-ja-na is not properly hyphenated either (Lu-bia-na
> is ok), but in my opinion it's a valid argument that one should change
> the language when writing foreign names if they are to be hyphenated
> properly. I can also easily imagine Slovenian patterns that would
> hyphenate:
>   Fis-cher, Aac-hen, Go-ethe
> when not knowing that those letters represent a single "letter"/sound
> in foreign words.
>
> Second, I have no idea, but I think it was a pure coincidence that the
> "problem" reported by Rogutės Sparnuotos is the same as that for
> surnames of a group of people in the North-East (I think that the name in
> question comes from Russia with transliteration done via English). On
> the other hand if it's just a tiny pattern that solves them all ...

Thank you Mojca and Claudio for your replies. Mojca has guessed correctly: I merely noticed that the surname Manovich is hyphenated wrongly in the three languages I've tested. And I don't mind using \hyphenation{} or switching language for foreign names.

I don't know how hyphenation patterns are made, so I was surprised to see the main rule of at least Latin/Italian/Lithuanian hyphenation broken (a syllable must contain a vowel). From your explanations it seems that hyphenation patterns are a kind of case-by-case rules, so this problem is not surprising, since no common words end with '-ch' in these languages.

I wonder if I'll find a maintainer of the Lithuanian patterns...

--
-- Rogutės Sparnuotos
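The rule mentioned above (a syllable must contain a vowel) can be turned into a quick sanity check on hyphenated output. The following is only a minimal sketch: the function name `fragments_without_vowel` and the vowel set are assumptions made for illustration, and the vowel set is deliberately simplified (it ignores accented vowels such as the Lithuanian ą, ė or ū).

```python
# Simplified vowel set; an assumption for illustration, not a linguistic reference.
VOWELS = set("aeiouy")


def fragments_without_vowel(hyphenated, vowels=VOWELS):
    """Return the fragments of a hyphenated word that contain no vowel."""
    return [
        piece
        for piece in hyphenated.split("-")
        if not any(letter in vowels for letter in piece.lower())
    ]


print(fragments_without_vowel("Ma-no-vi-ch"))  # ['ch']
print(fragments_without_vowel("Ma-no-vich"))   # []
```

A check of this kind only flags suspicious fragments; it says nothing about which pattern produced them.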
Re: [NTG-context] hyphenation patterns
On Sun, May 23, 2010 at 11:38 PM, Mojca Miklavec wrote:
> hyphenate properly in Italian. Italian is a
> what-you-see-is-what-you-pronounce language (in contrast to English)

Apart from some traps, like
  glicine vs tagliare, where the syllable 'gli' is pronounced in a completely different way,
  or anno (year) vs hanno (have, as in "they have"), where the sound is the same,
  or àncora (anchor) vs ancóra (again), which we usually write ancora vs ancora (yes, no difference: only the sound is different),
  or péro (pear tree) vs però (but),
and so on.

--
luigi
Re: [NTG-context] hyphenation patterns
On 24-5-2010 2:16, Mojca Miklavec wrote:
> There's no need to apologize. First, there's an "infinite" number of
> foreign names, so that one simply cannot get all of them right. I
> guess that Lju-bl-ja-na is not properly hyphenated either (Lu-bia-na

Why not just use hyphenmin values of 3 to prevent such cases?

-
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | voip: 087 875 68 74 | www.pragma-ade.com | www.pragma-pod.nl
-
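To see what minimum values of 3 would do, here is a small sketch of the edge rule behind TeX's \lefthyphenmin and \righthyphenmin: a break is kept only if at least that many letters remain before and after it. The helper `apply_hyphenmin` is a hypothetical name, and it works on an already hyphenated string rather than on real patterns, so it is a toy model and not how ConTeXt implements the setting.

```python
def apply_hyphenmin(hyphenated, left_min=2, right_min=2):
    """Drop breaks that leave fewer than left_min / right_min letters at the word edges."""
    pieces = hyphenated.split("-")
    word_length = sum(len(piece) for piece in pieces)
    kept = []      # fragments that survive the edge rule
    current = ""   # letters collected since the last accepted break
    seen = 0       # number of letters before the candidate break
    for piece in pieces[:-1]:
        current += piece
        seen += len(piece)
        if seen >= left_min and word_length - seen >= right_min:
            kept.append(current)
            current = ""
    kept.append(current + pieces[-1])
    return "-".join(kept)


print(apply_hyphenmin("Ma-no-vi-ch", 2, 2))   # Ma-no-vi-ch
print(apply_hyphenmin("Ma-no-vi-ch", 3, 3))   # Mano-vich
print(apply_hyphenmin("Lju-bl-ja-na", 3, 3))  # Lju-bl-jana
```

In this model, minima of 3 remove the stray final "ch" of Manovich, at the price of also losing the break after "Ma". An interior fragment such as the "bl" in Lju-bl-ja-na is untouched, because the minima only constrain the two ends of the word.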
Re: [NTG-context] hyphenation patterns
Dear Claudio,

Thanks a lot for your prompt reply.

On Mon, May 24, 2010 at 00:39, Claudio Beccari wrote:
> Dear Mojca,
> no proper Italian word ends in ch (this digraph in normal Italian words is
> pronounced as k, not as č or ć).
> Nevertheless there are a number of surnames dating back to the old times
> (150 years ago) when North East Italy was under Austro-Hungarian rule,
> when Istrian names, mainly Croatian and Slovenian, were transliterated in
> such a way that the typical patronymic ending -ič or -ić (I don't know the
> exact spelling in Latin letters of the Croatian/Slovenian names) was
> transliterated for the Empire bureaucracy with -ich.

Thanks a lot for some more insight. I admit that I didn't know the details (I should be ashamed) and in my area they were more radical with surname changes (mine was Michelazzi and I think that most surnames here were "properly Romanized", for example Filipčič -> Filippi, so again no problems with hyphenation :) :) :).

> This spelling remained
> when North East Italy and Istria were annexed to the Kingdom of Italy at the
> end of WW1. After WW2 most of Istria returned mainly to Croatia and a small
> part to Slovenia, but the Slovenians and Croatians that had moved to NE
> Italy and had become Italian citizens maintained their surnames with the
> Austro-Hungarian spelling.
>
> When I prepared the hyphen patterns for Italian and Latin I did think of
> this particular spelling, but I concluded that it was not so important; I
> was wrong, and I apologize.

There's no need to apologize. First, there's an "infinite" number of foreign names, so that one simply cannot get all of them right. I guess that Lju-bl-ja-na is not properly hyphenated either (Lu-bia-na is ok), but in my opinion it's a valid argument that one should change the language when writing foreign names if they are to be hyphenated properly. I can also easily imagine Slovenian patterns that would hyphenate:
    Fis-cher, Aac-hen, Go-ethe
when not knowing that those letters represent a single "letter"/sound in foreign words.

Second, I have no idea, but I think it was a pure coincidence that the "problem" reported by Rogutės Sparnuotos is the same as that for surnames of a group of people in the North-East (I think that the name in question comes from Russia with transliteration done via English). On the other hand if it's just a tiny pattern that solves them all ...

> I will submit, at least for Italian, a revised
> pattern file. I doubt I should do it also for Latin, although it does not
> cost anything...

In case you do submit any updates, I would be extremely grateful if you submitted an update to
    http://www.ctan.org/tex-archive/language/hyph-utf8/tex/generic/hyph-utf8/patterns/hyph-it.tex
instead of (or at least in addition to) the original file (you may remove the initial comments).

Also, if you happen to have the original of
    http://www.tug.org/TUGboat/Articles/tb13-1/tb34becc.pdf
it would be nice to include it in the repository as documentation about Italian hyphenation (but that's all too off-topic for the ConTeXt mailing list).

Thanks again,
    Mojca
Re: [NTG-context] hyphenation patterns
On Mon, May 24, 2010 at 01:22, Rogutės Sparnuotos wrote:
>
> \setuplayout[textwidth=0.2cm]
> \starttext
> \language[la] Manovich.
> \stoptext
>
> hyphenates 'Manovich' into Ma-no-vi-ch, while it should be Ma-no-vich. The
> same applies for Italian and Lithuanian languages (in LaTeX as well).
>
> Could there be such an omission in the hyphenation patterns? Or am I
> missing something?

Both Italian and Latin have the pattern "1c", meaning "break in front of any letter c unless another pattern prohibits that". Lithuanian patterns contain "i1c", which means "break between i and c".

Nothing in ConTeXt can or will be fixed, but here's a short answer with four options of what you can do:

1. Use \hyphenation{Ma-no-vich} at the top of your document.
2. Use "Manovič" instead of Manovich (it then hyphenates properly in Latin at least, I didn't try the others); or "Манович" :)
3. Use
       \mainlanguage[la]
       bla bla bla {\language[en] Manovich}
4. Complain to the authors of the Italian/Latin/Lithuanian patterns and ask them for a fix.

Some explanation:

I assume that this is not a native Latin, Italian or Lithuanian word. If you are talking about the artist's name (Lev Manovich), then you are using the English transliteration of a Russian word and expecting it to hyphenate properly in Italian. Italian is a what-you-see-is-what-you-pronounce language (in contrast to English) and you cannot expect that it will properly hyphenate all the foreign names that are not even transliterated "properly". An Italian word would most probably never end with "ch", so there's currently no pattern present that would prohibit that behaviour.

I don't know Russian well enough, but I would blindly guess that the right transliteration would be Manovič anyway (of course everyone would then have a problem with getting the right accent and with proper pronunciation), and the German Wikipedia somehow confirms that:
    Lev Manowitsch (russ. Лев Манович, wiss. Transliteration Lev Manovič; * 1960 in Moskau)

Note that Germans transliterate the name differently, and Italians could transliterate it in a different way as well. Since Lithuanian contains the letter "č", I would assume that they would transliterate the name with č anyway (disclaimer: my knowledge of Lithuanian is zero, so I'm not even sure how they pronounce that letter).

One particular example: Serbian will never have a problem with hyphenation of foreign names:
    http://sr.wikipedia.org/sr-el/Алберт_Ајнштајн
    Albert Ajnštajn (nem. Albert Einstein) je bio teorijski fizičar ...

The question is always: how many different foreign names do you want to hyphenate properly in any given language?

On the other hand, even with Italian pronunciation, I guess that ch is considered to be a "single consonant" (I may be wrong about that, but it's not too relevant either), so adding an additional pattern "2ch." (or "4ch.", not sure which one is needed) cannot hurt.

Mojca
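The interplay between "1c" and a proposed "2ch." can be tried out with a toy version of the pattern matching that TeX uses (Liang's algorithm): every matching pattern contributes its digits, the highest digit wins at each position, and an odd result allows a break. This is only a sketch under stated assumptions: `parse_pattern` and `hyphenate` are hypothetical helper names, and the pattern list is a toy set in which "1n" and "1v" are made up for illustration; it is not the real Italian or Latin pattern file.

```python
import re


def parse_pattern(pattern):
    """Split a TeX pattern such as '2ch.' into its letters and inter-letter digits."""
    letters = re.sub(r"\d", "", pattern)
    digits = [0] * (len(letters) + 1)
    position = 0
    for char in pattern:
        if char.isdigit():
            digits[position] = int(char)
        else:
            position += 1
    return letters, digits


def hyphenate(word, patterns, left_min=2, right_min=2):
    """Hyphenate `word`: the highest digit wins, odd values allow a break."""
    dotted = "." + word.lower() + "."       # dots mark the word boundaries
    scores = [0] * (len(dotted) + 1)        # scores[k] sits just before dotted[k]
    for pattern in patterns:
        letters, digits = parse_pattern(pattern)
        start = dotted.find(letters)
        while start != -1:
            for offset, value in enumerate(digits):
                scores[start + offset] = max(scores[start + offset], value)
            start = dotted.find(letters, start + 1)
    pieces, last = [], 0
    for i in range(left_min, len(word) - right_min + 1):
        if scores[i + 1] % 2 == 1:          # position between word[i-1] and word[i]
            pieces.append(word[last:i])
            last = i
    pieces.append(word[last:])
    return "-".join(pieces)


toy_patterns = ["1n", "1v", "1c"]           # toy set; only "1c" is taken from the thread
print(hyphenate("Manovich", toy_patterns))             # Ma-no-vi-ch
print(hyphenate("Manovich", toy_patterns + ["2ch."]))  # Ma-no-vich
```

In this toy set, "2ch." and "4ch." give the same result, since either even value outranks the 1 contributed by "1c". Whether 2 is high enough against the real pattern files depends on the other patterns that match at that position, which this sketch does not model.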
[NTG-context] hyphenation patterns
Is there anyone here who understands hyphenation patterns?

Such a document:

    \setuplayout[textwidth=0.2cm]
    \starttext
    \language[la] Manovich.
    \stoptext

hyphenates 'Manovich' into Ma-no-vi-ch, while it should be Ma-no-vich. The same applies for Italian and Lithuanian languages (in LaTeX as well).

Could there be such an omission in the hyphenation patterns? Or am I missing something?

Thanks,

--
-- Rogutės Sparnuotos