Thanks again, Jirka. OK, then here's my understanding of the status of using autoidx.xsl with non-English indexes (assuming Saxon). Please correct me if I've got anything wrong.

To use autoidx.xsl for non-English languages (in addition to using the classes for Saxon mentioned below), I have to modify autoidx.xsl in two ways:

1) Supply the upper- and lowercase letters of the alphabet, which autoidx.xsl uses to create indexdivs. For languages that lack the distinction between upper and lower case, I just put the alphabet in both places so that indexdivs are created. Any word beginning with a character not in the alphabet provided here ends up in the symbol category.

2) Add an appropriate lang attribute to each xsl:sort in autoidx.xsl, either hard-coded or taken from a @lang somewhere in the input document, so that Saxon will sort using the right Collator.

For languages with accented characters, my choices are:

a) Add the accented characters to &uppercase; and &lowercase;, so that words beginning with an accented character end up in their own indexdivs, or

b) Don't add these characters to &uppercase; and &lowercase;, so that words beginning with those characters end up in the Symbols indexdiv, or

c) Don't use words as indexterms if the first letter of the term has a diacritical mark of some kind :)

For Traditional Chinese, where I understand indexdivs are based on the number of strokes rather than the initial character in the word, autoidx.xsl doesn't support automatically generated indexdivs. To do that, the stylesheet would have to be rewritten (and the number of strokes would have to be included in an attribute on the <primary> element).

I understand that currently there is no way to have the stylesheets store multiple alphabets for &uppercase; and &lowercase; and use the appropriate one without the intervention of a processing system.

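
The two modifications listed above (supplying the alphabets and adding lang to xsl:sort) might look roughly like the following. This is only a minimal standalone sketch, not a copy of autoidx.xsl: the index.lang parameter, the choice of accented letters, and the template itself are assumptions made for illustration, and it presumes that Saxon can find a Compare class for the chosen language.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE xsl:stylesheet [
  <!-- Assumption: a French-flavoured alphabet with a few accented letters
       appended (choice a). Both strings must list corresponding letters
       in the same order, because translate() maps them position by position. -->
  <!ENTITY lowercase "'abcdefghijklmnopqrstuvwxyzàçéè'">
  <!ENTITY uppercase "'ABCDEFGHIJKLMNOPQRSTUVWXYZÀÇÉÈ'">
]>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <xsl:output method="text" encoding="UTF-8"/>

  <!-- Hypothetical parameter: the language code could also be hard-coded
       or read from a @lang attribute in the input document. -->
  <xsl:param name="index.lang" select="'fr'"/>

  <xsl:template match="/">
    <!-- List every primary index term, sorted by the rules of $index.lang. -->
    <xsl:for-each select="//indexterm/primary">
      <!-- lang is an attribute value template in XSLT 1.0, so {...} works here. -->
      <xsl:sort select="translate(., &lowercase;, &uppercase;)"
                lang="{$index.lang}"/>
      <xsl:value-of select="."/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

The sort key here only upper-cases the term. How autoidx.xsl actually builds its keys and groups terms into indexdivs should be checked against the stylesheet version in use before copying anything from this sketch.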
I'm thinking of something along the lines of storing the declarations for uppercase and lowercase in files (en.ent, fr.ent), including parameter entity declarations that point to these files along with a reference to one of them, and then having the processing system munge my customization of autoidx.xsl so that it includes the correct entity reference before using the XSL to process the document. The alternative is to have a separate customization layer (with its own autoidx.xsl) for each target language.

Some of these things I'll understand better as we get further in our experimentation, but it's helpful to know what behavior to expect, since it saves you from debugging something that's really working as designed :)

Once I've got this figured out, I'll write something up that we can include somewhere in the docs or FAQ.

Thanks,
David

-----Original Message-----
From: Jirka Kosek [mailto:[EMAIL PROTECTED]]
Sent: Friday, September 20, 2002 1:24 PM
To: David Cramer
Cc: [EMAIL PROTECTED]
Subject: Re: DOCBOOK-APPS: Sorting and non-en_US indexes

David Cramer wrote:
>
> Thanks Jirka. Just to make sure I understand how to use this: Once I
> compile one class for each target language following the naming
> convention Compare_<replaceable>language code</replaceable> (Compare_ja,
> etc), I run saxon with the appropriate language code?
>
>
> java -Duser.language=<replaceable>language-code</replaceable>
> com.icl.saxon.StyleSheet...
>
> ...and it should use the Compare class for the user.language?

I have never used user.language before, so I have no idea what it is good for. I used

<xsl:sort ... lang="<replaceable>language code</replaceable>"/>

> In the case of Japanese, where there's no notion of upper/lowercase, I
> shouldn't have to edit the declarations for &uppercase; and &lowercase;,
> correct?

I think that you should, because each letter creates a separate division of the index.
As there are currently only English letters, Japanese index terms will all show up in the symbol division.

Jirka

--
-----------------------------------------------------------------
Jirka Kosek
e-mail: [EMAIL PROTECTED]
http://www.kosek.cz