Thank you. That is a very clear answer. It does pose a challenge for us from a rollout point of view (at the moment the compiled dictionaries are packaged as part of the deployment). I do see a solution, but it requires that the default metafile can refer to external sources. I found the following passage in the documentation:

    A Composite Dictionary appears as a simple text file beginning with the magic line "@multilink:", followed by lines containing the URL of sub-dictionaries. URL are generally relative to the composite dictionary, but can also be absolute. Referenced dictionaries can in turn be Composite.

and I'm hoping that such a URL can also point to a compiled dictionary located outside the composite dictionary, let's say on a server.

Is this going to work? What is the downside of this approach (assuming that it is possible)? If it does not work, what are the alternatives?

Best regards / Mit freundlichen Grüßen / Sincères salutations

Rudolf de Grijs


Hussein Shafie <hussein at xmlmind.com>
Sent by: xmleditor-support-bounces at xmlmind.com
21-01-2009 17:25
Please respond to: "xmleditor-support at xmlmind.com" <xmleditor-support at xmlmind.com>
To: Rudolf de Grijs <rdegrijs at epo.org>
cc: "xmleditor-support at xmlmind.com" <xmleditor-support at xmlmind.com>
Subject: Re: [XXE] Spellchecking derived words

Rudolf de Grijs wrote:
>
> What I did notice is that for the Dutch language there is a list with
> base words and a list with derived words.

There is no concept of derived words in our spell-checker. Our spell-checker is designed to efficiently crunch flat word lists. Note that a word list may be really huge (*millions* of words).

> If I would check the German word
> /Anmeldungsgegenstand/ with the default included dictionary then this
> word is not recognized. But the words /Anmeldung/ and /Gegenstand/ are
> known.

That's right. You need to add words like Anmeldungsgegenstand to your word list, because there is an "s" between Anmeldung and Gegenstand.

Note that you don't have to do that for "true compound words". German example: In "Obama beginnt mit Krisengesprächen", Krisengesprächen is found to be OK by our spell-checker even though it has not been explicitly added to the German word list from which the corresponding dictionary is built. However, the German word list does contain Krisen and Gesprächen.

> So I do get the feeling that a list is required for those (most
> frequently used) derived words (like I have included for the Dutch
> dictionary). My question is: is my assumption correct or is there some
> algorithm that can perform the spellcheck on these derived words?

There is no such algorithm. Feel free to add all the derived words you want. But there is no need to add "true compound words" and no need to add words starting with common prefixes (e.g. auto).

See "-prefixes word_list" in http://www.xmlmind.com/_dictbuilder/doc/using_builder.html.

See also "%compoundmin length" in http://www.xmlmind.com/_dictbuilder/doc/hints_file.html

---
PS: For the reasons explained above, our spell-checker does not detect any error in Anmeldunggegenstand (without the "s"). The German language is very, very difficult to spell-check. The fact that our spell-checker has severe flaws in the case of German is one of the reasons why we have retired our spell-checker as a commercial product.
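
The compound-word behaviour described in the reply can be illustrated with a small sketch. This is not XMLmind's implementation; it only assumes that a word is accepted when it is in the flat word list or is a plain concatenation of listed words. The names `is_accepted`, `WORDS` and `min_part_len` are hypothetical, and `min_part_len` is only loosely modelled on the "%compoundmin length" hint, whose exact semantics are not given in this thread.

```python
def is_accepted(word, words, min_part_len=3):
    """Return True if `word` is listed or is a concatenation of listed words.

    Hypothetical sketch: `min_part_len` is an assumed lower bound on the
    length of each compound part, not a documented XMLmind parameter.
    """
    w = word.lower()
    if w in words:
        return True
    # Try every split point: the head must be a listed word and the
    # tail must itself be accepted (listed, or a further compound).
    for i in range(min_part_len, len(w) - min_part_len + 1):
        if w[:i] in words and is_accepted(w[i:], words, min_part_len):
            return True
    return False


# Tiny stand-in for a German word list (an assumption for illustration).
WORDS = {"anmeldung", "gegenstand", "krisen", "gesprächen"}

print(is_accepted("Krisengesprächen", WORDS))      # True: Krisen + Gesprächen
print(is_accepted("Anmeldunggegenstand", WORDS))   # True: misspelling slips through
print(is_accepted("Anmeldungsgegenstand", WORDS))  # False: the linking "s" breaks the split
```

Under these assumptions the sketch reproduces both effects mentioned above: the correct word with the linking "s" is rejected until it is added to the word list explicitly, while the misspelled form without the "s" is accepted.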

