
I've tried to find something relevant about the terrible Czech sorting :-)

The first thing to note is that there is a standard (from 1970 or so) that
is in fact not implementable: it requires such stupid orderings as Karel IV <
Karel III, because numerals are supposed to be sorted as if they were
written out in words (ctvrty, treti) :-)
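
To make the Karel example concrete, here is a minimal sketch of what "sort the numerals as words" does. The lookup table and the function name are hypothetical, the ordinals are written without accents as above, and plain string comparison stands in for real Czech collation:

```python
# Hypothetical lookup: Roman numeral -> Czech ordinal, accents dropped.
ORDINAL_WORDS = {
    "I": "prvni",
    "II": "druhy",
    "III": "treti",
    "IV": "ctvrty",
}

def spelled_out_key(name):
    """Sort key that replaces a trailing Roman numeral by its ordinal word."""
    base, numeral = name.rsplit(" ", 1)
    return (base, ORDINAL_WORDS[numeral])

names = ["Karel III", "Karel IV"]
# "ctvrty" compares lower than "treti", so Karel IV lands before Karel III.
print(sorted(names, key=spelled_out_key))
```

A general implementation would have to recognise and spell out every numeral in the text, which is what makes the rule impractical.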

So in practice there are only more-or-less accurate approximations. A quite
good intro is http://www.vitsoft.info/sortkit.htm

A sorting is considered very reasonable if it conforms to the order given
in http://www.fi.muni.cz/~adelton/l10n/cssort/cssort.table . Characters on
a single line in the table are considered equivalent. Note the `ch'
character, which is sorted between h and i. The table also contains accented
letters that are not used in Czech (like crossed l, z dot above). It should
IMHO be also completely OK for Slovak (as they, I hope, inherited the

I think that it would be completely OK to sort according to that table,
taking chars on a single line as equivalent. The module the table comes from
implements a four-pass sorting algorithm that reflects those pretty damn
rules, see http://www.fi.muni.cz/~adelton/l10n/cssort/csort.c .
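
Sorting by such a table, with characters on one line treated as equivalent and `ch' taken as a single letter, could be sketched roughly as follows. This is only a single-pass illustration, not the four-pass algorithm of csort.c; the TABLE below is a hypothetical excerpt that I wrote by hand, not a copy of cssort.table, and the names are mine:

```python
# Hypothetical ordering table: one inner list per table line; all spellings
# on one line compare as equal in this (single) pass.
TABLE = [
    ["a", "á"], ["b"], ["c"], ["č"], ["d", "ď"], ["e", "é", "ě"], ["f"],
    ["g"], ["h"], ["ch"], ["i", "í"], ["j"], ["k"], ["l"], ["m"],
    ["n", "ň"], ["o", "ó"], ["p"], ["q"], ["r"], ["ř"], ["s"], ["š"],
    ["t", "ť"], ["u", "ú", "ů"], ["v"], ["w"], ["x"], ["y", "ý"], ["z"],
    ["ž"],
]

# Map every spelling to the number of the line it stands on.
RANK = {spelling: rank for rank, line in enumerate(TABLE) for spelling in line}

def primary_key(word):
    """Turn a word into a tuple of table-line numbers."""
    text = word.lower()
    key = []
    i = 0
    while i < len(text):
        if text.startswith("ch", i):
            # The digraph counts as one letter, placed between h and i.
            key.append(RANK["ch"])
            i += 2
        elif text[i] in RANK:
            key.append(RANK[text[i]])
            i += 1
        else:
            # Simplification: characters missing from the table are skipped.
            i += 1
    return tuple(key)

words = ["chata", "cihla", "had", "ihned", "hrad"]
print(sorted(words, key=primary_key))
```

Words that differ only in characters from the same line (leto, léto) get identical keys here; breaking such ties is presumably what the further passes are for.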

An example of sorted sequences is
http://www.fi.muni.cz/~adelton/l10n/cssort/sort.tab .

The question is whether it is reasonable to implement it internally in
ConTeXt or to use an external module. An external Perl module was once
prepared by Tom Hudec (he even modified the sorting table: he preferred all
letters with a `hacek (\v{})' to be greater than those without \v). If you
consider an internal ConTeXt implementation feasible, I'd be happy if you
commented the sorting macros a bit, so that I could contact native Czech
users and fine-tune it. I'd like to discuss it with our Czech TeX friends; I
don't consider myself a sorting expert (it's quite tricky, isn't it?).


Early to rise, early to bed, makes a man healthy, wealthy and dead.
-- Terry Pratchett, "The Light Fantastic"
ntg-context mailing list
