Re: [docbook-apps] Japanese index

Tony Graham Tue, 24 Apr 2018 12:54:56 -0700

On 24/04/2018 19:39, Jan Tosovsky wrote:

Does anybody have any experience with generating a Japanese back-of-the-book
index from DocBook source?


More than 20 years ago.

I am facing the same issues discussed in this old thread (all entries end up
in the Symbols section):
https://lists.oasis-open.org/archives/docbook-apps/200605/msg00063.html

If I understand correctly, indices in Japanese should be grouped
phonetically:
https://www.slideshare.net/k16shikano/imybp-light

I've found the promising Kuromoji library https://github.com/atilika/kuromoji
I can imagine it could pre-process all index entries and generate
values for the 'sortas' attribute.


Slide 35 of that deck shows a corner case that a morphological
analyzer could get wrong. (I'm not able to test it myself.)

If you were using 'kuromoji', you could concatenate the values of the
'Reading' feature for all of the tokens that it finds in an index entry
and use that as the 'sortas' value.
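
As a rough sketch of that concatenation step only (not of kuromoji itself):
the (surface, reading) pairs below are hand-written stand-ins for analyzer
output, the function name is made up for illustration, and treating an empty
or "*" reading as "no reading available" is an assumption about how unknown
words are reported.

```python
def sortas_from_tokens(tokens):
    """Build a 'sortas' value by concatenating per-token readings.

    `tokens` is a list of (surface, reading) pairs. This shape is a
    hypothetical stand-in for whatever the analyzer actually returns.
    """
    parts = []
    for surface, reading in tokens:
        # Assumption: a missing reading shows up as None, "" or "*".
        # In that case keep the surface form so nothing is lost.
        if reading in (None, "", "*"):
            parts.append(surface)
        else:
            parts.append(reading)
    return "".join(parts)


# Hand-written example data, not real analyzer output.
entry = [("日本語", "ニホンゴ"), ("索引", "サクイン")]
print(sortas_from_tokens(entry))  # ニホンゴサクイン
```

The result would still need a human check for entries like the corner case
mentioned above, since the analyzer can pick the wrong reading.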

But it is still unclear how to tweak the index code to generate groups from
non-Latin characters.


I don't know, either.
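
Leaving the stylesheets aside, the grouping decision itself might look
something like the sketch below. It assumes the 'sortas' value is already in
kana and that groups follow the ten gojuon rows (あ, か, さ, ...); the function
name, the row boundaries, and the "Symbols" fallback label are all
illustrative choices, not anything taken from the DocBook index code.

```python
import bisect
import unicodedata

# Group labels and the first hiragana that belongs to each row.
# Small kana (ぁ, ゃ, ゎ) precede their full-size forms in Unicode,
# so they are used as the row boundaries where relevant.
LABELS = list("あかさたなはまやらわ")
STARTS = list("ぁかさたなはまゃらゎ")


def group_for(sortas):
    """Return the gojuon row label for a kana 'sortas' value."""
    if not sortas:
        return "Symbols"
    ch = sortas[0]
    # Fold katakana (U+30A1..U+30F6) onto hiragana, 0x60 code points lower.
    if "\u30a1" <= ch <= "\u30f6":
        ch = chr(ord(ch) - 0x60)
    # NFD splits voiced kana such as が into か plus a combining mark;
    # keeping only the first character drops the mark.
    base = unicodedata.normalize("NFD", ch)[0]
    # Anything outside ぁ..ん is left in the catch-all group.
    if not ("\u3041" <= base <= "\u3093"):
        return "Symbols"
    return LABELS[bisect.bisect_right(STARTS, base) - 1]


print(group_for("ニホンゴ"))  # な
print(group_for("がっこう"))  # か
print(group_for("XML"))       # Symbols
```

How to get a key like that into the index-generation templates is the part
that remains open.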

Or are there better ways?


It's probably not what you want to hear, but Antenna House does have a
commercial product for doing DocBook indexes:

https://www.antennahouse.com/antenna1/i18n-index-library/

Regards,


Tony Graham.
--
Senior Architect
XML Division
Antenna House, Inc.
----
Skerries, Ireland
tgra...@antenna.co.jp
