Hi all,

@Caolan, Petr: I have cross-posted this answer to lingucomponent.dev as well. Since it is about lingucomponent issues, it would be nice to continue the discussion there.

@lingucomponent readers: This mail is a reply to a posting in the openoffice.dev list.

> On Fri, 2008-02-08 at 14:05 +0100, Petr Mladek wrote:
>>
>> I think that the best solution would be to get rid of share/dict/ooo and
>> look for the dictionaries into a common place, for example /usr/share/myspell.
>>
>> It would be nice get rid of share/dict/ooo/dictionary.lst. The dictionaries
>> have well defined names. It is possible to create symlinks for compatible
>> languages, ... Well, there might be problems with symlinks on Windows but it
>> would be very useful on Linux.
>
> Specifically wrt dictionaries, as you probably know that's precisely
> what we do on fedora where we've done away with dictionary.lst (well it
> still works if you want to use it) and just auto-detect them and the
> language/locale they service based on their names and add looking in a
> system /usr/share/myspell location as well the shared OOo one and then
> the per-user one.
>
> If there's any interest in it, then I could try and perhaps upstream
> this work and co-opt the existing --without-myspell-dicts or whatever
> its called into a sort of --with-system-dicts=LOCATION and bind the code
> off that, or something of that nature.

It seems you guys have your own way in Fedora of getting rid of dictionary.lst. Since we are currently in the same process, I'd like to briefly describe what we are doing. From what I have understood so far, our concept is different, but both should be usable concurrently. At least, that is, if we sort out some issues of precedence when dictionaries for the same language and purpose are installed in various places and identified by various means.

Our planned, and by now mostly implemented, idea is to allow dictionaries to be installed and distributed as extensions. Thus our approach needs several new configuration entries. BTW, as of OOo 3.0 we want to get rid of the way those things currently work in OOo.
In the meantime, once my CWS tl41 is finished and integrated, the old and the new behaviour will both work for a while. For OOo 3.0 a proper migration from the old way of working to the new one using configuration entries is planned. After that the old code should be removed.

Now on to what we currently do or did in the CWS:

- The path settings for 'Linguistic' and 'Dictionary' have been changed to be multi-paths. The new 'Dictionary' path is now dedicated to the personal user-dictionaries, as it always should have been. The 'Linguistic' path is for data etc. that is to be used and found by an actual spell checker, hyphenator, ... implementation.

Thus those configuration settings will soon look like this:

    <node oor:name="Linguistic" oor:op="fuse" oor:mandatory="true">
      <node oor:name="InternalPaths">
        <node oor:name="$(insturl)/share/dict" oor:op="fuse"/>
        <node oor:name="$(insturl)/share/dict/ooo" oor:op="fuse"/>
      </node>
      <prop oor:name="UserPaths">
        <value>$(userurl)/wordbook</value>
      </prop>
    </node>
    <node oor:name="Dictionary" oor:op="fuse" oor:mandatory="true">
      <node oor:name="InternalPaths">
        <node oor:name="$(insturl)/share/wordbook/$(vlang)" oor:op="fuse"/>
      </node>
      <prop oor:name="WritePath">
        <value>$(userurl)/wordbook</value>
      </prop>
    </node>

As you can see, the 'Linguistic' path covers all places where data files for the linguistic might previously have been installed. The 'UserPaths' entry is actually a string list and thus can also hold more than one path.

The next thing we did is:

- Spell checkers, hyphenators, ... need to make configuration entries that describe what types of dictionary they may make use of.

Such an entry will look like this:

    <node oor:name="SpellCheckers">
      <node oor:name="org.openoffice.lingu.MySpellSpellChecker" oor:op="fuse">
        <prop oor:name="SupportedDictionaryFormats" oor:type="oor:string-list">
          <value>DICT_SPELL MySpell_old</value>
        </prop>
      </node>
    </node>

The component has to specify its implementation name and a list of dictionary formats it may make use of. We don't yet have implementations that use more than one format at the same time, but we want to be flexible and future-safe with our new configuration entries. For example, in the future we could have a dictionary format named DICT_SPELL_EXCEPT that is used to identify exception dictionaries. That is something Hunspell currently does not implement, but hopefully will at some point. Then it would be normal to support the two formats DICT_SPELL and DICT_SPELL_EXCEPT at the same time.

On the other side of the line we now have the new entries for dictionaries:

- Dictionaries need to make entries in the configuration that state what they are to be used for.

It may look like this:

    <node oor:name="Dictionaries">
      <node oor:name="HunSpellDic_de_CH" oor:op="fuse">
        <prop oor:name="Locations" oor:type="oor:string-list">
          <value>%origin%/dictionaries/de_CH.aff %origin%/dictionaries/de_CH.dic</value>
        </prop>
        <prop oor:name="Format" oor:type="xs:string">
          <value>DICT_SPELL</value>
        </prop>
        <prop oor:name="Locales" oor:type="oor:string-list">
          <value>de-CH</value>
        </prop>
      </node>
      <node oor:name="HunSpellDic_en_US" oor:op="fuse">
        <prop oor:name="Locations" oor:type="oor:string-list">
          <value>%origin%/dictionaries/en_US.aff %origin%/dictionaries/en_US.dic</value>
        </prop>
        <prop oor:name="Format" oor:type="xs:string">
          <value>DICT_SPELL</value>
        </prop>
        <prop oor:name="Locales" oor:type="oor:string-list">
          <value>en-US</value>
        </prop>
      </node>
    </node>

In particular, this makes it easy to use the very same dictionary for more than one language, since 'Locales' is a list. A dictionary can support only one single format, though.

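To make the shape of such a 'Dictionaries' entry more concrete, here is a minimal Python sketch that reads a fragment in that style and groups the entries by locale. It is only an illustration and not code from the CWS: the function names are made up, the xmlns declarations are added only so that a standalone XML parser accepts the fragment, and the second locale (de-LI) on the de_CH entry is a hypothetical addition to show one dictionary serving more than one language.

```python
import xml.etree.ElementTree as ET

# Namespace URI bound to the oor: prefix in OOo registry files.
OOR = "http://openoffice.org/2001/registry"
NAME = "{%s}name" % OOR

# Fragment in the style of the mail. The xmlns declarations are added for the
# standalone parser; the extra locale "de-LI" is a hypothetical example value.
FRAGMENT = """
<node xmlns:oor="http://openoffice.org/2001/registry"
      xmlns:xs="http://www.w3.org/2001/XMLSchema"
      oor:name="Dictionaries">
  <node oor:name="HunSpellDic_de_CH" oor:op="fuse">
    <prop oor:name="Locations" oor:type="oor:string-list">
      <value>%origin%/dictionaries/de_CH.aff %origin%/dictionaries/de_CH.dic</value>
    </prop>
    <prop oor:name="Format" oor:type="xs:string">
      <value>DICT_SPELL</value>
    </prop>
    <prop oor:name="Locales" oor:type="oor:string-list">
      <value>de-CH de-LI</value>
    </prop>
  </node>
  <node oor:name="HunSpellDic_en_US" oor:op="fuse">
    <prop oor:name="Locations" oor:type="oor:string-list">
      <value>%origin%/dictionaries/en_US.aff %origin%/dictionaries/en_US.dic</value>
    </prop>
    <prop oor:name="Format" oor:type="xs:string">
      <value>DICT_SPELL</value>
    </prop>
    <prop oor:name="Locales" oor:type="oor:string-list">
      <value>en-US</value>
    </prop>
  </node>
</node>
"""


def read_dictionary_entries(xml_text):
    """Map each dictionary node name to its Locations, Format and Locales."""
    root = ET.fromstring(xml_text)
    entries = {}
    for node in root.findall("node"):
        # Collect the raw text of every <prop>, keyed by its oor:name.
        props = {}
        for prop in node.findall("prop"):
            props[prop.get(NAME)] = prop.findtext("value", default="")
        entries[node.get(NAME)] = {
            # String lists are whitespace-separated in these fragments.
            "Locations": props.get("Locations", "").split(),
            "Format": props.get("Format", "").strip(),
            "Locales": props.get("Locales", "").split(),
        }
    return entries


def dictionaries_by_locale(entries):
    """Invert the entries: locale -> names of the dictionaries that serve it."""
    by_locale = {}
    for name, entry in entries.items():
        for locale in entry["Locales"]:
            by_locale.setdefault(locale, []).append(name)
    return by_locale
```

Calling dictionaries_by_locale(read_dictionary_entries(FRAGMENT)) lists HunSpellDic_de_CH under both de-CH and de-LI, while each entry still carries exactly one 'Format' value.
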
The 'Locations' entry specifies where to find the files. What the entry has to look like may depend on the actual spell checker implementation, though. It might not be necessary to list all the files needed, but doing so would probably be safe in the odd case that more than one spell checker implementation is going to use the same dictionary.

You may have noticed by now that there is actually no direct connection from the dictionaries to the spell checker. The only link is the indirect connection by the format name. Thus code has already been added to SvtLinguConfig that a spell checker can use to get the list of all dictionaries (within all paths listed for the 'Linguistic' path) that implement a specific format. By calling the respective function the spell checker can immediately get the list of dictionaries it can make use of.

Thus I think Caolan's approach of auto-detecting the available dictionaries can easily be joined with our planned setup. There are only a limited number of things to take care of:

- The paths in which to auto-detect installed dictionaries need to be added to the list of 'Linguistic' paths.
- We should not mess up a single path with different content, as was already done in */user/wordbook, where originally only the personal dictionaries belonged and later on the downloaded dictionaries for the linguistic were placed as well. And even worse, dictionaries with different content now had the same extensions and were placed into the same directory. *ouch*
- We need to define an order of precedence in case a dictionary (or rather different versions of it) is installed in different places. Only one should be used...

So what do you think, Caolan? Can both of our solutions be joined? I think it should easily be possible.

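The lookup by format name and the open precedence question can be sketched in a few lines of Python. This is not the SvtLinguConfig code: the names DictEntry, dictionaries_for_format and pick_by_precedence are hypothetical, as are the source labels "extension" and "system" and the format name DICT_HYPH. The rule "the source listed first wins" is just one possible precedence order, since none has been agreed on yet.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class DictEntry:
    name: str                  # configuration node name or auto-detected file stem
    fmt: str                   # exactly one format per dictionary
    locales: Tuple[str, ...]   # one dictionary may serve several locales
    source: str                # hypothetical label for where the entry was found


def dictionaries_for_format(entries: List[DictEntry], fmt: str) -> List[DictEntry]:
    """Return the dictionaries a checker can use, linked only by format name."""
    return [e for e in entries if e.fmt == fmt]


def pick_by_precedence(
    entries: List[DictEntry], source_order: List[str]
) -> Dict[Tuple[str, str], DictEntry]:
    """Keep one dictionary per (locale, format); earlier sources win (assumption)."""
    rank = {source: i for i, source in enumerate(source_order)}
    lowest = len(rank)  # unknown sources sort after all listed ones
    chosen: Dict[Tuple[str, str], DictEntry] = {}
    for entry in entries:
        for locale in entry.locales:
            key = (locale, entry.fmt)
            current = chosen.get(key)
            if current is None or rank.get(entry.source, lowest) < rank.get(
                current.source, lowest
            ):
                chosen[key] = entry
    return chosen


# Example data: one de-CH spell dictionary registered through configuration
# entries and another one found by auto-detection in a system path.
ENTRIES = [
    DictEntry("HunSpellDic_de_CH", "DICT_SPELL", ("de-CH",), "extension"),
    DictEntry("de_CH", "DICT_SPELL", ("de-CH",), "system"),
    DictEntry("HunSpellDic_en_US", "DICT_SPELL", ("en-US",), "extension"),
    DictEntry("hyph_de_CH", "DICT_HYPH", ("de-CH",), "system"),  # hypothetical format
]

spell = dictionaries_for_format(ENTRIES, "DICT_SPELL")
system_first = pick_by_precedence(spell, ["system", "extension"])
extension_first = pick_by_precedence(spell, ["extension", "system"])
```

With "system" listed first, the auto-detected de_CH entry is used for de-CH; with "extension" first, HunSpellDic_de_CH is used instead. Either way only one dictionary per locale and format survives, which is the property the third point above asks for.
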
There is one thing I wonder about though: when you auto-detect those dictionaries, aren't you indirectly making use of the code that maintains the 'DataFilesChangedCheckValue' value in the configuration (and of course the old/current linguistic entries)? We actually wanted to remove those code parts after OOo 3.0, since with configuration entries we no longer need to check which files are actually installed on the hard disk. We wanted to save even that occasionally required amount of time to scan for dictionaries...

I think I have not missed anything important about the changes on our side. Thus I'm eager to hear the thoughts of both of you about joining forces here.

Thomas