At 10:40 PM 11/12/01 +0100, Corinna Hischke wrote:
>Hi,
>
>> I'm trying to generate a traditional end-of-book index using FOP (version
>> 0.20.2).
>> The idea is that in the XML document source the author can specify
>something
>> like:
>> ...
>>
>> Anyway, I've got most of it worked out in my head (and it hurts (:-)
>> especially the bit about multiple page-number-citations referencing a
>single
>> index entry). It all is so godawful gnarly and I thought I'd ask to see if
>> anybody has figured out an easier way to do this in XSL:FO.
>>
>> Any tips or references would be appreciated.
>
>I also thought of something like 'multiple page-number-citations' and came
>to the
>conclusion that markers could be used for that. I didn't try yet, but am I
>wrong?
>
>Corinna

Markers (fo:retrieve-marker) put content (as determined by the "best" 
qualifying fo:marker) in the static content. Also, each fo:retrieve-marker 
retrieves only one marker. So you can see they are not intended for indexes: 
the spec indicates that markers are suited for manufacturing header (or 
footer, or sidebar) content that is somewhat dependent on current context, 
e.g. what is the current chapter title, what is the current section title, etc.

I think you might be able to do something with straight XSL, but it would be 
ugly, and I think you would normally have redundancies (3 occurrences of an 
indexed word on one page - what do you do?). Perhaps the best solution is a 
2-stage one: consider the possibility of doing one formatting pass that 
generates the XML area tree. Use a Perl or Python script to generate an 
index from this data. After all, it _is_ paginated. Then write an extra 
fo:page-sequence that creates the index, and re-run FOP to produce the final 
PDF document.
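
To make the redundancy point concrete, here is a minimal Python sketch of the 
middle step, assuming the script has already pulled (term, page) pairs out of 
the paginated output. How you extract those pairs depends on the area tree 
format and is not shown; the function names are made up for illustration.

```python
from collections import defaultdict


def collapse_pages(pages):
    """Return sorted, de-duplicated pages as a list of (first, last) runs."""
    runs = []
    for page in sorted(set(pages)):          # set() drops repeats on one page
        if runs and page == runs[-1][1] + 1:
            runs[-1] = (runs[-1][0], page)   # extend the current run
        else:
            runs.append((page, page))        # start a new run
    return runs


def format_runs(runs):
    """Render runs as index text, e.g. [(3, 5), (9, 9)] becomes '3-5, 9'."""
    return ", ".join(str(a) if a == b else f"{a}-{b}" for a, b in runs)


def build_index(pairs):
    """Group (term, page) pairs by term and format each term's page list."""
    by_term = defaultdict(list)
    for term, page in pairs:
        by_term[term].append(page)
    return {
        term: format_runs(collapse_pages(by_term[term]))
        for term in sorted(by_term, key=str.lower)
    }
```

Three hits for one term on page 12 come out as a single "12". Whether 
consecutive pages should really be merged into a range is an editorial call, 
so treat that part as optional.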

This is the kind of thing you have to do with LaTeX (well, with makeindex, 
not Perl scripts), for good reason. It's tough to do well any other way. :-)

I should add, LaTeX \index entries go right into the formatted text. There 
is an advantage to doing this with XSL also, as the decision-making remains 
with the original XML. In this case the XSL/FOP procedure for index 
generation could be identical to LaTeX (no need to use the XMLRenderer any 
more):

1) Place <index entry="index_text"/> in those spots in your original XML 
where you have content that you wish to index under "index_text";
2) Run your XSLT, and have the <index.../> tags converted into some 
<fox:index.../> construct. These elements have meaning only to the indexer, 
which can be invoked when FOP is run; its effect is to open up an index 
file and record entries by page number;
3) Review the index file. Edit it, OR edit the original XML and rerun FOP, 
or both, until the index file is satisfactory;
4) Run a Perl or Python script (I admit grudgingly that it could be Java 
also) to take the index file and produce an XML file that the XSLT will 
convert into a page-sequence (the XSLT needs to be ready for this); this 
file can be added into the original XML with a reference;
5) Rerun XSLT and FOP, and voila.
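
Step 4 could look something like the Python sketch below. Everything about 
the formats is an assumption on my part: I pretend the index file holds one 
"term<TAB>page" record per line, and the output element names (index, entry, 
page) are invented, so adjust both to whatever your indexer and XSLT really 
use.

```python
import xml.etree.ElementTree as ET


def index_file_to_xml(lines):
    """Convert 'term<TAB>page' records into a small XML index document.

    The input layout and the element names are hypothetical.
    """
    entries = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue                          # skip blank lines
        term, _, page = line.rpartition("\t")
        # A set keeps one reference per page, however many hits it had.
        entries.setdefault(term, set()).add(int(page))

    root = ET.Element("index")
    for term in sorted(entries, key=str.lower):
        entry = ET.SubElement(root, "entry", term=term)
        for page in sorted(entries[term]):
            ET.SubElement(entry, "page").text = str(page)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = ["marker\t12\n", "area tree\t7\n", "marker\t12\n", "marker\t30\n"]
    print(index_file_to_xml(sample))
```

The XSLT then needs a template that turns each entry into a line of the 
index page-sequence. A malformed record (no tab, or a non-numeric page) 
raises an error here, which is probably what you want while you are still 
reviewing the index file in step 3.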

I think that an index will require this much work in general, no more and no 
less. It is an art form to produce a good, useful index and it is just not 
going to happen with a simple, automated pass. I also want to stress that 
indexes are derivative - they represent new content, and have parallels with 
footnotes. Some of the discussion so far has seemingly treated indexes as 
being more like word search indexes, and that is not what we are talking
about.

Just some thoughts.

AHS


