On Mon, Apr 30, 2012 at 12:25 PM, DM Smith <dmsm...@crosswire.org> wrote:
> On 04/30/2012 10:36 AM, Jonathan Morgan wrote:
>
> Hi DM,
>
> On Tue, May 1, 2012 at 12:00 AM, DM Smith <dmsm...@crosswire.org> wrote:
>>
>> On 04/30/2012 09:37 AM, Daniel Owens wrote:
>>>
>>> On 04/30/2012 06:54 AM, Chris Little wrote:
>>>>
>>>> On 4/30/2012 4:39 AM, David Troidl wrote:
>>>>>
>>>>> Hi Chris,
>>>>>
>>>>> I'm certainly no expert on your TEI dictionaries, but wouldn't it make
>>>>> sense to have the first key be one that would sort properly, and present
>>>>> the dictionary in true alphabetical order? I'm thinking of Middle
>>>>> Liddell, as well as the Hebrew. This key wouldn't even necessarily have
>>>>> to be shown to the user. The second key, the title, could then maintain
>>>>> the proper accents for display, without hindering sorting, searching or
>>>>> navigation.
>>>>
>>>> I confess, I don't understand what you're proposing this as an
>>>> alternative to.
>>>>
>>>> In the example Karl cites, there's just one actual key per entry. It is
>>>> an uppercased version of the entryFree's n attribute. This is the key that
>>>> is sorted.
>>>>
>>>> The un-uppercased version from the n attribute is being rendered as part
>>>> of the entry text via the TEI filters. This is the part I'm proposing we
>>>> retain, but render somewhere else, e.g. right-justified at the bottom of
>>>> the entry.
>>>>
>>>> We also render all the text of the entry, which in these cases includes
>>>> the text from a title element.
>>>>
>>>> I don't know what 'true alphabetical order' means, but if you mean
>>>> localized sort order, it's not possible with the current implementation of
>>>> this module type.
>>>>
>>>> --Chris
>>>>
>>>
>>> I think David's concern is something that needs to be dealt with. A
>>> number of possibilities could be pursued, some of them together:
>>>
>>> 1. The current implementation is to sort by unicode code points.
>>> This works particularly well with numeric keys. A quick solution for
>>> languages for which such sorting is not alphabetical would be to follow
>>> David's suggestion of using keys that the user does not even see. This has
>>> the advantage of providing a workable solution right away, but there are
>>> some problems with this. First, we could create a new "strongs" standard
>>> because the current implementation does not actually hide keys. That could
>>> be solved by making the keys so obscure that no one would remember them.
>>> Second, any future, more robust solution would require reworking all
>>> modules keyed to it. I have toyed with this solution, and it might be the
>>> pragmatic way forward, but it is not ideal.
>>>
>>> 2. A localized sort order, which I think is what David means by
>>> true alphabetical order, would be a better long-term solution.
>>>
>>> 3. In addition, using genbooks for lexica would work for lexica that
>>> are sorted by root, with subentries nested in a hierarchy, just like in the
>>> Hesychius module and BDB. I have been working with Troy on this.
>>> Unfortunately, front-ends do not recognize the Feature=HebrewDef option in
>>> the conf file or allow genbooks as lexica. I can send anyone an example
>>> lexicon if you are interested in working on this. In that case, instead of
>>> @n as the key, */x-entry/@osisID would be the key.
>>>
>>> Any thoughts?
>>
>> I think there is a problem with the sorting of entries in dictionaries
>> where the keys are not ASCII. I don't remember the details, but I seem to
>> remember it having been discussed here.
>>
>> For JSword, we'll be building a Lucene search index for the key, the term
>> and the whole entry. A user lookup will be normalized and the search will
>> return the key with which lookup will proceed internally as it does today.
>> ICU provides the ability to create a localized sort key (not at all suitable
>> for display) that can be used to sort dictionary entries for the end user's
>> locale. I'm thinking that for TEI dictionaries the representation of the key
>> should not be shown at all.
>
> BPBible, and I believe some other frontends as well, use binary search on the
> original module order to locate a key in a virtual list. This provides very
> noticeable speedups on large dictionaries like ISBE. I think this would
> require the original module creation to place a module in localised key
> order if we really wanted to order by that, not just have a lookup, which as
> I understand it would only be done when actually looking for a key? It also
> really means that a module can be sorted in one and only one way.
>
> Then again, I'm not even sure we can guarantee any kind of binary search on
> localised keys.
>
> A related issue for English dictionaries is allowing mixed-case dictionary
> keys (and I think I have heard similar comments about Greek and maybe other
> languages). At the moment I think SWORD requires dictionary keys to be
> upper-case to ensure that they sort correctly, but really "Aaron's Rod"
> looks much better than "AARON'S ROD". BPBible now attempts to automatically
> and heuristically turn keys to mixed case, which I think looks a lot better,
> but ideally this would be done in the same way as for other languages:
> separating sort order from codepoint order in some way.
>
>
> The idea given above is to have an index to the SWORD index. It can be built
> to be ordered and accessed in whatever way is needed to solve the problems.
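
For anyone who wants to see the "index to the SWORD index" idea in concrete
terms, here is a rough sketch in Python. It is not SWORD, JSword or BibleTime
code. The names sort_key, build_index and lookup are made up for illustration,
and the sort key is only a stand-in (accents stripped, case folded) for a real
ICU collation key. The point is just that the module keeps its own order while
a side index supplies the display order and still allows binary search.

```python
import bisect
import unicodedata


def sort_key(term):
    """Assumed stand-in for an ICU collation key: strip accents, fold case."""
    decomposed = unicodedata.normalize("NFD", term)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return base.casefold()


def build_index(module_keys):
    """Build a side index of (sort_key, module_position, display_key).

    module_keys is the key list in whatever order the module stores it;
    the module itself is left untouched.
    """
    entries = [(sort_key(k), pos, k) for pos, k in enumerate(module_keys)]
    entries.sort()
    return entries


def lookup(index, term):
    """Binary-search the side index; return the module position or None."""
    wanted = sort_key(term)
    # A 1-tuple sorts before any longer tuple with the same first element,
    # so this finds the first entry whose sort key is >= wanted.
    i = bisect.bisect_left(index, (wanted,))
    if i < len(index) and index[i][0] == wanted:
        return index[i][1]
    return None


if __name__ == "__main__":
    keys = ["ZION", "Aaron's Rod", "Éden", "abba"]  # module order
    index = build_index(keys)
    print([display for _, _, display in index])  # display order
    print(lookup(index, "EDEN"))  # position of "Éden" in the module
```

Because the display key is carried alongside the sort key, mixed-case keys like
"Aaron's Rod" can be shown as written without affecting the ordering. A real
implementation would presumably swap sort_key for a locale-specific collation
key, which is where cases like the Turkish 'i' get handled properly.
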
Last time I checked, this is what BibleTime does - creates a cache of the
entries in a dictionary or such and updates them when it detects a version
change in the installed module. I could be wrong, but that's how it used to
work.

--Greg

>
> As you note, the problem is that SWORD makes severe assumptions about the
> order and nature of the keys. Unless care is taken uppercasing is not always
> appropriate. For example in Turkish the uppercase of 'i' is not 'I'.
>
> In Him,
> DM
>
> _______________________________________________
> sword-devel mailing list: sword-devel@crosswire.org
> http://www.crosswire.org/mailman/listinfo/sword-devel
> Instructions to unsubscribe/change your settings at above page