Re: [NTG-context] Greek hyphenation patterns

2006-06-30 Thread Peter Heslin
Hans Hagen <[EMAIL PROTECTED]> writes:

> i prefer the rules, so if you can sort that out with peter

In that case, you can examine the internals of my Perl script
elhyph-utf8 and translate its logic to Ruby in ctxtools.  But that is a
non-trivial effort, and I cannot do it.  A better alternative may be to
have ctxtools simply call elhyph-utf8 as an external script.  Does
Context still have a dependency on Perl?  If so, it would be much easier
just to call the Perl script.  I would be happy to ensure that
elhyph-utf8 remains format-neutral.

[A footnote: the original patterns are not, as you said, LaTeX-specific;
they are specific to the LGR encoding, which LaTeX's Babel happens to
use.  That Greek encoding is, I think, older than Babel, and it is also
used elsewhere in the TeX world.]

> since there is no infrastructure for patterns, and since i want to 
> independent of anything happening in that area (keep in mind that we've 
> been bitten by that too often: renaming, disappearing, funny internals, 
> latex specific, limited encodings, etc)

I can appreciate your pain, but I'm sure that you are aware that there
is also a danger in having Context fork its own patterns: that you may
introduce bugs (as happened in this case), or that you may not pick up
on upstream bug-fixes.  Jonathan Kew has suggested that it might be
desirable to have a set of general-purpose utf-8 hyphenation patterns in
the texmf tree, which could be used by various TeX applications.  From
your comments it is clear that, in order for the Context community to
buy into such a scheme, it would be necessary for this collection of
patterns to be managed carefully, by consensus, and in a format-neutral
manner, with good advance communication of any changes.  If this were to
happen, the advantage for Context is that the dangers I mentioned above
could be minimized.  But it is up to you to balance the potential risks
and benefits for Context.

-- 
Peter Heslin (http://www.dur.ac.uk/p.j.heslin)

___
ntg-context mailing list
ntg-context@ntg.nl
http://www.ntg.nl/mailman/listinfo/ntg-context


[NTG-context] Greek hyphenation patterns

2006-06-29 Thread Peter Heslin

A few weeks ago, I looked at Context, because I wanted utf-8 hyphenation
patterns for ancient Greek, but then I saw that the patterns shipped
with Context have serious bugs.  I had hoped to patch ctxtools, but the
required changes went beyond my knowledge of Ruby.

I recently posted a Perl script to the xetex mailing list that should
perform the conversion to utf-8 correctly.  I would be happy to modify
the script to make the output more useful to Context users, but I don't
use Context myself.  Feedback is welcome.

The essential problem with the patterns shipped with Context is that they
are the result of a simple conversion, but the hyphenation rules in Greek
are based on the definition of vowels and consonants, and the way an
accented vowel is represented changes in utf-8.  The original 8-bit
patterns of Dimitrios Filippou depend on the fact that in the Babel (LGR)
encoding accents come before the vowel (except for iota subscript),
whereas in Unicode they are either combined with the vowel or come after
it, depending on whether you use precomposed characters or not.
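
To make the ordering difference concrete, here is a small Python sketch.
It is not taken from elhyph-utf8 or ctxtools.  The LGR-style string
">'a" (smooth breathing, acute, alpha) is written from my understanding
of that input convention, and the helper name base_index is made up for
the illustration.  It only shows where the base vowel sits relative to
its accents in each representation.

```python
import unicodedata

# Assumed LGR-style input: breathing and accent precede the vowel.
LGR = ">'a"
# Unicode, precomposed: alpha with psili and oxia as one code point.
NFC = "\u1f04"
# Unicode, decomposed: the vowel first, then the combining marks.
NFD = unicodedata.normalize("NFD", NFC)


def base_index(text):
    """Return the position of the first non-combining letter, or -1."""
    for i, ch in enumerate(text):
        if ch.isalpha() and not unicodedata.combining(ch):
            return i
    return -1


for label, text in (("LGR", LGR), ("NFC", NFC), ("NFD", NFD)):
    print(label, len(text), "characters, base letter at index", base_index(text))
```

In the LGR-style string the vowel comes last, while in both Unicode
forms it comes first, so a pattern that expects "accent, then vowel"
no longer matches after a simple character substitution.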

-- 
Peter Heslin (http://www.dur.ac.uk/p.j.heslin)
