If a regexp is good enough, you could do that in disambiguation. But there is
no certainty that the word as a whole is correct.
Ruud
On 06-02-13 13:05, Jaume Ortolà i Font wrote:
2013/2/6 Ruud Baars <baar...@xs4all.nl>
Jaume,
I was not working on compounding; that has been addressed quite well
in Hunspell already (though not as fast as we would like, I know).
It is about uncompounding. Suppose you found an unknown word, and
want to know the postag. It would then be possible to disassemble
the word, and use the last part to get the postag.
'lange/termijn/plan' would get the postag of 'plan'.
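A minimal sketch of that idea, assuming the word has already been disassembled into parts separated by slashes. The tag dictionary, the tag names and the function name are hypothetical and only serve as illustration; they are not LanguageTool's tagset or API.

```python
# Hypothetical tag dictionary; the tag names are placeholders only.
TAGS = {"lange": "ADJ", "termijn": "NOUN", "plan": "NOUN"}


def postag_from_last_part(disassembled, tags, sep="/"):
    """Return the tag of the last part of a disassembled word, or None if unknown."""
    parts = [p for p in disassembled.split(sep) if p]
    if not parts:
        return None
    # Only the last part decides the tag; earlier parts are treated as specifiers.
    return tags.get(parts[-1])


print(postag_from_last_part("lange/termijn/plan", TAGS))
```

If the last part is not in the dictionary, the function returns None, so the word simply stays untagged.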
I was thinking in the same process: disassemble an unknown word to
find the postag. I will try to implement something useful for affixes.
Jaume
On 06-02-13 11:55, Jaume Ortolà i Font wrote:
2013/2/6 Ruud Baars <baar...@xs4all.nl>
Thanks for the info. I did not mean affixes as compounding
parts though, just compounding of full words.
Compounding of full words is not as common as derivation with affixes.
Is americanoplaça an acceptable word?
I would say that it is well formed, but not used. With americano-
(which is not a word we have in the tagger dictionary, but a
prefix you can find in common dictionaries or grammar books) we
would build only words like these:
With some suffixes: americanòfil, americanofília, americanòfob,
americanofòbia...
Or with some adjectives: americanocatòlic (Catholic American),
americanoirlandès (Irish American), americanofrancès (French
American)...
"Americanoliteratura" (American literature) or "Americanocotxe"
(American car) are not used.
Is the alteration from o to ó (and maybe others) controlled
by rules, or is it 'random'?
There are rules. In this case it comes from the suffix.
Then, in conclusion, would different approaches be needed for
derivation with affixes and compounding?
Regards,
Jaume
On 06-02-13 10:25, Jaume Ortolà i Font wrote:
In Catalan new words are created by compounding and
derivation. It would suffice to have a list of common
prefixes and suffixes, to know the class of words to which
every affix can be united (i.e. noun, adjective, verb,
another affix), and a few rules of orthographic change at
the concatenation point. Rules like this:
ç before a becomes c before e
c before a becomes qu before e
g before a becomes gu before e
j before a becomes g before e
and a few more
For example, "plaça" (square) becomes "placeta" (little
square). In the dictionary we have "plaça" as a noun. And we
should have -et/-eta/-ets/-etes as a suffix that creates a
diminutive and can be added to nouns and adjectives.
In a few cases, a diacritic can be added:
americano- + -fil becomes americanòfil
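The spelling rules above can be sketched roughly as follows. This is only an illustration under simplifying assumptions: the base word ends in "a", the suffix starts with "e", and only the four listed changes are handled (no diacritics, none of the "few more" rules). The function name is hypothetical, and the "vaca" example is an extra illustration of the c/qu rule that is not from this thread.

```python
# Stem-final spellings before "a" and their counterparts before "e",
# taken from the four rules listed above.
CHANGES_BEFORE_E = [("ç", "c"), ("c", "qu"), ("g", "gu"), ("j", "g")]


def attach_suffix(word, suffix):
    """Attach a suffix starting with 'e' to a word ending in 'a'."""
    if not word.endswith("a") or not suffix.startswith("e"):
        raise ValueError("this sketch only handles -a words and e- suffixes")
    stem = word[:-1]  # drop the final "a"
    for before_a, before_e in CHANGES_BEFORE_E:
        if stem.endswith(before_a):
            # Apply at most one change at the concatenation point.
            stem = stem[: -len(before_a)] + before_e
            break
    return stem + suffix


print(attach_suffix("plaça", "eta"))  # placeta
print(attach_suffix("vaca", "eta"))   # vaqueta
```

A real implementation would also need to check that the suffix is allowed for the word class of the base (noun or adjective for -et/-eta), which this sketch leaves out.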
Regards,
Jaume Ortolà
2013/2/6 Ruud Baars <baar...@xs4all.nl>
A long time ago I prototyped a word uncompounder for Dutch.
Though it worked, it was far from elegant and supported
only Dutch.
Earlier this week I found a more elegant solution, able
to uncompound words like
'langetermijnplanning' into 'lange termijn planning'.
In Dutch there are 4 possible compounding insertions:
none (word+word),
an s (word+s+word), a dash (word+-+word) and the
combination (word+s-+word).
The number of parts in the compound is not limited in
any way
(theoretically).
Generally, uncompounding works well with parts of at
least 5 chars.
Shorter parts lead to wrongly uncompounded words. Some
parts of shorter
length are still safe to use though (e.g. jazz).
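As a rough illustration of the description above (not the prototype itself, which is not shown in this thread), an uncompounder could be sketched like this. It assumes a plain set of known words as lexicon, tries the four insertions in a fixed order, and returns the first split it finds. All names and the toy lexicon are hypothetical; 'stationsgebouw' is an extra example for the s insertion that is not from the thread.

```python
LINKS = ["", "s", "-", "s-"]  # the four insertions: none, s, dash, s + dash
MIN_LEN = 5                   # parts shorter than this are rejected...
SHORT_OK = {"jazz"}           # ...unless explicitly listed as safe


def uncompound(word, lexicon, min_len=MIN_LEN, short_ok=SHORT_OK):
    """Return the first split of word into known parts, or None if there is none."""

    def usable(part):
        return part in lexicon and (len(part) >= min_len or part in short_ok)

    def split(rest):
        if usable(rest):
            return [rest]
        for end in range(1, len(rest)):
            head = rest[:end]
            if not usable(head):
                continue
            for link in LINKS:
                # The insertion must follow the head directly.
                if rest.startswith(link, end):
                    tail = split(rest[end + len(link):])
                    if tail:
                        return [head] + tail
        return None

    return split(word)


# Toy lexicon for illustration only.
LEXICON = {"lange", "termijn", "planning", "jazz", "festival",
           "station", "gebouw", "stad"}

print(uncompound("langetermijnplanning", LEXICON))
print(uncompound("jazzfestival", LEXICON))
print(uncompound("stationsgebouw", LEXICON))
```

The number of parts is not limited, since the function recurses on the remainder. It returns the first split found rather than ranking alternatives, which a real uncompounder would probably need to do.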
Now my question: What about other languages?
- Is your language compounding or not?
- Are there special situations when compounding, like changing the
letters at the concatenation point?
- Which concatenation insertions are there for your language?
- Which part of the compound is semantically the essence of the word?
(langetermijnplanning, long term plan, is mostly a plan; term and long
are specifiers)
When I know a bit more, I could try to adjust the
prototype code to
support multiple languages by design.
Thanks in advance,
Ruud
------------------------------------------------------------------------------
Free Next-Gen Firewall Hardware Offer
Buy your Sophos next-gen firewall before the end March 2013
and get the hardware for free! Learn more.
http://p.sf.net/sfu/sophos-d2d-feb
_______________________________________________
Languagetool-devel mailing list
Languagetool-devel@lists.sourceforge.net
<mailto:Languagetool-devel@lists.sourceforge.net>
https://lists.sourceforge.net/lists/listinfo/languagetool-devel