Thank you. That is a very clear answer.

It does pose a challenge for us from a rollout point of view (the compiled
dictionaries are currently packaged as part of the deployment). I do
see a solution, but it requires that the default metafile can refer to
external sources. I found the following passage in the
documentation:

A Composite Dictionary appears as a simple text file beginning with the
magic line "@multilink:", followed by lines containing the URL of
sub-dictionaries. URL are generally relative to the composite dictionary,
but can also be absolute. Referenced dictionaries can in turn be Composite.

and I'm hoping that such a URL can also point to a compiled dictionary 
outside the composite dictionary, let's say on a server.

Is this going to work? What is the downside of this approach (assuming 
that this is possible)? If this does not work, what are the alternatives?
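
For illustration only, the resolution rule in the quoted passage could be
sketched in Python as below. This is not XXE's actual loader: the function
name, the host names and the file names (de.cdi, de-base.dar, de-extra.dar)
are made up for the example, and it simply assumes that relative entries are
resolved against the URL of the composite dictionary itself.

```python
from urllib.parse import urljoin


def resolve_sub_dictionaries(composite_url, text):
    """Return absolute URLs for the entries of a composite dictionary.

    Hypothetical helper: assumes the first non-blank line is the magic
    line "@multilink:" and every following non-blank line is one URL.
    """
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines or lines[0] != "@multilink:":
        raise ValueError("not a composite dictionary")
    # urljoin leaves absolute URLs untouched and resolves relative ones
    # against the location of the composite dictionary.
    return [urljoin(composite_url, entry) for entry in lines[1:]]


if __name__ == "__main__":
    # Made-up example: one packaged sub-dictionary, one on a server.
    example = (
        "@multilink:\n"
        "de-base.dar\n"
        "http://dict.example.com/de-extra.dar\n"
    )
    base = "http://intranet.example.com/dicts/de.cdi"
    for url in resolve_sub_dictionaries(base, example):
        print(url)
```

In this reading, the packaged dictionary keeps a relative entry while the
frequently updated word list is referenced by an absolute URL.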

Best regards / Mit freundlichen Grüßen / Sincères salutations

Rudolf de Grijs





Hussein Shafie <hussein at xmlmind.com> 
Sent by: xmleditor-support-bounces at xmlmind.com
21-01-2009 17:25
Please respond to
"xmleditor-support at xmlmind.com" <xmleditor-support at xmlmind.com>


To
Rudolf de Grijs <rdegrijs at epo.org>
cc
"xmleditor-support at xmlmind.com" <xmleditor-support at xmlmind.com>
Subject
Re: [XXE] Spellchecking derived words






Rudolf de Grijs wrote:
> 
> What I did notice is that for the Dutch language there is a list with
> base words and a list with derived words. 

There is no concept of derived words in our spell-checker. Our
spell-checker is designed to efficiently crunch flat word lists. Note
that a word list may be really huge (*millions* of words).



> If I check the German word
> /Anmeldungsgegenstand/ with the default included dictionary then this
> word is not recognized. But the words /Anmeldung/ and /Gegenstand/ are
> known.

That's right. You need to add words like Anmeldungsgegenstand to your
word list, because there is an "s" between Anmeldung and Gegenstand.

Note that you don't have to do that for ``true compound words''. German
example: In "Obama beginnt mit Krisengesprächen", Krisengesprächen is
found to be OK by our spell-checker though it has not been explicitly
added to the German word list from which the corresponding dictionary
is built. However, the German word list does contain Krisen and
Gesprächen.



> So I do get the feeling that a list is required for those (most
> frequently used) derived words (like I have included for the Dutch
> dictionary). My question is: is my assumption correct or is there some
> algorithm that can perform the spellcheck on these derived words?

There is no such algorithm. Feel free to add all the derived words you
want. But no need to add ``true compound words'' and no need to add
words starting with common prefixes (e.g. auto). See "-prefixes
word_list" in http://www.xmlmind.com/_dictbuilder/doc/using_builder.html.

See also "%compoundmin length" in
http://www.xmlmind.com/_dictbuilder/doc/hints_file.html



---
PS: For the reasons explained before, our spell-checker does not detect
any error for Anmeldunggegenstand (without the "s"). The German language
is very, very difficult to spell-check. The fact that our spell-checker
has severe flaws in the case of German is one of the reasons why we have
retired our spell-checker as a commercial product.
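
The behaviour described in this message can be illustrated with a small
Python sketch. It is not the algorithm used by the XMLmind spell-checker,
only a naive model of "flat word list plus true compound words": a word is
accepted if it is in the list or if it is a plain concatenation of listed
words. The function name, the tiny word list and the compound_min parameter
(loosely inspired by the "%compoundmin length" hint, whose exact semantics
are not assumed here) are all hypothetical.

```python
def is_accepted(word, words, compound_min=4):
    """Naive compound check against a flat, lower-cased word list.

    Accepts a word that is listed, or that splits into listed parts of at
    least compound_min characters each, with nothing in between.
    """
    w = word.lower()
    if w in words:
        return True
    # Try every split point that leaves both parts long enough.
    for i in range(compound_min, len(w) - compound_min + 1):
        if w[:i] in words and is_accepted(w[i:], words, compound_min):
            return True
    return False


# Tiny illustrative word list (lower-cased), not a real dictionary.
WORDS = {"anmeldung", "gegenstand", "krisen", "gesprächen"}

if __name__ == "__main__":
    for candidate in (
        "Krisengesprächen",      # plain concatenation of two listed words
        "Anmeldungsgegenstand",  # linking "s" between the two parts
        "Anmeldunggegenstand",   # misspelling without the linking "s"
    ):
        print(candidate, is_accepted(candidate, WORDS))
```

Under this model the word with the linking "s" is rejected unless it is
added to the word list, while the misspelling without the "s" slips through,
which matches the limitation described above.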


 
--
XMLmind XML Editor Support List
xmleditor-support at xmlmind.com
http://www.xmlmind.com/mailman/listinfo/xmleditor-support
