Maybe the key to compound words is in the hyphen.

Languages with many compound words also frequently use a special form
of hyphenation.  In English, one example of this case would be "wooden
and brick buildings", which in German is written "Holz- und
Backsteingebäude". Note that this hyphen has nothing to do with line
breaks.  This kind of hyphen most often appears before "and" or "or",
or (in lists) before the comma (Holz-, Backstein- und Stahlgebäude).
(There are a few more cases, e.g. "wooden rather than brick
buildings", where "rather than" takes the place of "and".)
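
For a spell checker to see such forms at all, the tokenizer has to keep
the trailing hyphen attached to the word.  A minimal sketch in Python,
assuming a regex-based tokenizer (the function name "tokenize" is made
up for illustration and is not part of aspell):

```python
import re

# A token is a run of word characters, optionally with inner hyphens,
# optionally followed by ONE trailing hyphen (the form discussed above).
TOKEN = re.compile(r"\w+(?:-\w+)*-?")

def tokenize(text):
    """Split text into tokens, keeping a trailing hyphen on the word."""
    return TOKEN.findall(text)

print(tokenize("Holz-, Backstein- und Stahlgebäude"))
```

With this, "Holz-" and "Backstein-" reach the dictionary lookup as
words of their own instead of being reduced to "Holz" and "Backstein".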

In compound words where a glue letter is used, the glue letter appears
before the hyphen.  In compound words where a special form of the
first word is used, this special form appears before the hyphen even
though it would not be allowed as a word by itself.

In Swedish, a typical noun (such as "girl") has eight different
forms:

  flicka       - singular, nominative, indefinite   =     girl
  flickan      - singular, nominative, definite     = the girl
  flickas      - singular, genitive, indefinite     =     girl's
  flickans     - singular, genitive, definite       = the girl's
  flickor      - plural, nominative, indefinite     =     girls
  flickorna    - plural, nominative, definite       = the girls
  flickors     - plural, genitive, indefinite       =     girls'
  flickornas   - plural, genitive, definite         = the girls'

In addition, however, there is the form used in compound words,
"flick-", as in flick-cykel (girl's bicycle) and flick-aktig
(girl-ish).  A shop can advertise new models of "flick- och
pojkcyklar" (girls' and boys' bicycles).

To cover Swedish (and Danish and Norwegian, and probably German), it
would be sufficient to distinguish the hyphenated form (flick-) as a
legal word of its own and the only legal prefix for compound words.
Thus, the typical noun would have nine different forms rather than
eight.

Just like English "sheep" (plural: sheep), there are many words in
these languages where some of the nine forms coincide.  In many cases,
the hyphenated form coincides with the singular-nominative-indefinite
form (often called "the basic form").  Still, when adding a word to
the dictionary, a form with nine fields would be generally applicable
for Swedish, Danish, and Norwegian.
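
As a rough illustration of such a nine-field form, here is a sketch in
Python.  The class name, the field names and the word_list() helper are
my own inventions for this example, not anything that exists in aspell:

```python
from dataclasses import dataclass, astuple

@dataclass(frozen=True)
class NounEntry:
    """One noun: the eight inflected forms plus the compound prefix."""
    sg_nom_indef: str
    sg_nom_def: str
    sg_gen_indef: str
    sg_gen_def: str
    pl_nom_indef: str
    pl_nom_def: str
    pl_gen_indef: str
    pl_gen_def: str
    compound_prefix: str  # stored with its trailing hyphen, e.g. "flick-"

    def word_list(self):
        """Return the distinct forms in field order (forms may coincide)."""
        seen = []
        for form in astuple(self):
            if form not in seen:
                seen.append(form)
        return seen

FLICKA = NounEntry("flicka", "flickan", "flickas", "flickans",
                   "flickor", "flickorna", "flickors", "flickornas",
                   "flick-")
print(FLICKA.word_list())
```

Dropping duplicates in word_list() covers the "sheep" case, where
several of the nine fields hold the same string.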

In the old aspell dictionary format (all words listed), it would
suffice to list the nine forms:

  flicka
  flickas
  flickan
  flickans
  flickor
  flickors
  flickorna
  flickornas
  flick-

and the "flick-" form could be freely used as a prefix in compound
words.  Any word could be used as a suffix in compound words.
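
The rule could be sketched roughly as follows, in Python.  This is only
an illustration of the idea under my own assumptions, not how aspell
works today: the function name is_accepted is hypothetical, and the
entries "pojk-", "cykel" and "cyklar" are added to the list above just
to have something to combine with.

```python
WORDS = {
    "flicka", "flickas", "flickan", "flickans",
    "flickor", "flickors", "flickorna", "flickornas",
    "flick-",                      # hyphenated form: legal compound prefix
    "pojk-", "cykel", "cyklar",    # assumed extra entries for the example
}

def is_accepted(word, words=WORDS):
    """True if word is listed, or is a listed prefix plus an accepted rest."""
    if word in words:
        return True
    for i in range(1, len(word)):
        head, tail = word[:i], word[i:]
        # Only forms listed with a trailing hyphen may start a compound;
        # whatever follows must itself be accepted.
        if head + "-" in words and is_accepted(tail, words):
            return True
    return False

for w in ("flickcykel", "pojkcyklar", "flick-", "flickacykel"):
    print(w, is_accepted(w))
```

Here "flickcykel" is accepted because "flick-" is listed, while
"flickacykel" is rejected because "flicka-" is not.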

Could we have this support for "dictionary words ending in hyphen"
implemented?  It would be a great help in designing better
dictionaries for these languages.


-- 
  Lars Aronsson ([EMAIL PROTECTED])
  Aronsson Datateknik - http://aronsson.se/



_______________________________________________
Aspell-devel mailing list
[EMAIL PROTECTED]
http://mail.gnu.org/mailman/listinfo/aspell-devel
