On Friday 06 August 2004 13:28, Magnus Johansson wrote:
> Splitting compound words can be done quite effectively simply by using
> a large wordlist. I have done this for Swedish.

It is, however, difficult to get right for German. On the one hand, German compounds can have more than two parts; on the other hand, some compounds have extra linking characters in the middle (e.g. Arbeit + Aufwand = Arbeitsaufwand, with a linking "s"). Also, compounds take inflectional endings, e.g. the plural of Bergbahn is Bergbahnen.

At http://lemmi.intrafind.org you can see a demo that deals with almost all cases, even things like "dazugekauftes" (but it's not freely available).

Regards
 Daniel
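
To illustrate the three complications mentioned above (more than two parts, linking characters, inflectional endings), here is a minimal wordlist-based sketch in Python. The tiny lexicon, the list of linking elements, and the list of endings are all made-up assumptions for the example; a real splitter would need a large wordlist, and this is not how the demo at the URL above works.

```python
# Toy lexicon (assumption): a real splitter would load a large wordlist.
WORDS = {"arbeit", "aufwand", "berg", "bahn", "station"}

# Assumed linking elements allowed between parts ("" means none).
LINKERS = ("", "s", "es", "n", "en")

# Assumed inflectional endings on the whole compound ("" is tried first).
ENDINGS = ("", "en", "n", "e", "s", "es")


def _split(stem, words):
    """Recursively split stem into lexicon words, longest head first."""
    if stem in words:
        return [stem]
    for i in range(len(stem) - 1, 0, -1):
        head, rest = stem[:i], stem[i:]
        if head not in words:
            continue
        # Allow an optional linking element between head and the remainder.
        for linker in LINKERS:
            if rest.startswith(linker):
                tail = _split(rest[len(linker):], words)
                if tail:
                    return [head] + tail
    return None


def split_compound(word, words=WORDS):
    """Return the list of parts, or None if no split is found."""
    word = word.lower()
    for ending in ENDINGS:
        if not word.endswith(ending):
            continue
        # Strip the candidate ending, then try to split what is left.
        stem = word[: len(word) - len(ending)]
        parts = _split(stem, words)
        if parts:
            return parts
    return None


print(split_compound("Arbeitsaufwand"))   # ['arbeit', 'aufwand']
print(split_compound("Bergbahnen"))       # ['berg', 'bahn']
print(split_compound("Bergbahnstation"))  # ['berg', 'bahn', 'station']
```

The sketch returns the first split it finds and makes no attempt to rank ambiguous splits or to handle forms such as "dazugekauftes", which is part of why getting German right takes more than a wordlist.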