If a regexp is good enough, you could do that in disambiguation. But there is no certainty that the word as a whole is correct.

Ruud

On 06-02-13 13:05, Jaume Ortolà i Font wrote:
2013/2/6 Ruud Baars <baar...@xs4all.nl>

    Jaume,

    I was not working on compounding; that has been addressed quite well
    in Hunspell already (though not as fast as we would like, I know).
    It is about uncompounding. Suppose you find an unknown word and
    want to know its postag. It would then be possible to disassemble
    the word and use the last part to get the postag.
    'lange/termijn/plan' would get the postag of 'plan'.
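
A minimal sketch of that last step, assuming the word has already been split and written with '/' between the parts. The `TAGS` lexicon, the tag names and the function `guess_postag` are hypothetical and only serve as an illustration; they are not taken from LanguageTool.

```python
# Hypothetical mini-lexicon: known word -> POS tag (tag names are illustrative).
TAGS = {"lange": "ADJ", "termijn": "NOUN", "plan": "NOUN"}


def guess_postag(split_word, lexicon=TAGS):
    """Return the tag of the last part of a '/'-separated compound.

    Returns None when the last part is not in the lexicon.
    """
    last_part = split_word.split("/")[-1]
    return lexicon.get(last_part)


print(guess_postag("lange/termijn/plan"))
```

This relies on the last part being the head of the compound, as in the Dutch example; other languages may need a different choice.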


I was thinking of the same process: disassembling an unknown word to find the postag. I will try to implement something useful for affixes.

Jaume

    On 06-02-13 11:55, Jaume Ortolà i Font wrote:
    2013/2/6 Ruud Baars <baar...@xs4all.nl>

        Thanks for the info. I did not mean affixes as compounding
        parts though, just compounding of full words.


    Compounding of full words is not as common as derivation with affixes.

        Is americanoplaça an acceptable word?


    I would say that it is well formed, but not used. With americano-
    (which is not a word we have in the tagger dictionary, but a
    prefix you can find in common dictionaries or grammar books) we
    would build only words like these:

    With some suffixes: americanòfil, americanofília, americanòfob,
    americanofòbia...
    Or with some adjectives: americanocatòlic (Catholic American),
    americanoirlandès (Irish American), americanofrancès (French
    American)...

    "Americanoliteratura" (American literature) or "Americanocotxe"
    (American car) are not used.

        Is the alteration from o to ó (and maybe others) controlled
        by rules, or is it 'random'?


    There are rules. In this case it comes from the suffix.

    So, in conclusion, would different approaches be needed for
    derivation with affixes and for compounding?

    Regards,
    Jaume


        On 06-02-13 10:25, Jaume Ortolà i Font wrote:
        In Catalan new words are created by compounding and
        derivation. It would suffice to have a list of common
        prefixes and suffixes, to know the class of words to which
        every affix can be united (i.e. noun, adjective, verb,
        another affix), and a few rules of orthographic change at
        the concatenation point. Rules like this:

        ç before a becomes c before e
        c before a becomes qu before e
        g before a becomes gu before e
        j before a becomes g before e
        and a few more

        For example, "plaça" (square) becomes "placeta" (little
        square). In the dictionary we have "plaça" as a noun. And we
        should have -et/-eta/-ets/-etes as a suffix that creates a
        diminutive and can be added to nouns and adjectives.
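
A rough sketch of how the four listed rules could be applied when attaching the diminutive suffix. The function name `add_diminutive`, the table `BEFORE_E` and the step of dropping the final "a" of the base word are assumptions made for this illustration; only the four letter changes and the plaça/placeta example come from the message.

```python
# Spelling of the stem-final consonant before "e", for a consonant that was
# written before "a" in the base word (the four rules listed in the message).
BEFORE_E = {"ç": "c", "c": "qu", "g": "gu", "j": "g"}


def add_diminutive(word, suffix="eta"):
    """Attach a suffix starting with 'e' to a base word ending in 'a'."""
    # Assumption: the final "a" of the base word is dropped before the suffix.
    stem = word[:-1] if word.endswith("a") else word
    # Apply the orthographic change at the concatenation point.
    if stem and suffix.startswith("e") and stem[-1] in BEFORE_E:
        stem = stem[:-1] + BEFORE_E[stem[-1]]
    return stem + suffix


for base in ("plaça", "vaca", "amiga", "platja"):
    print(base, "->", add_diminutive(base))
```

The "few more" rules mentioned in the message, and the diacritic changes, are not covered by this sketch.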

        In a few cases, a diacritic can be added:
        americano- + -fil becomes americanòfil

        Regards,
        Jaume Ortolà




        2013/2/6 Ruud Baars <baar...@xs4all.nl>

            A long time ago I prototyped a word uncompounder for Dutch.
            Though it worked, it was far from elegant and supported
            only Dutch.

            Earlier this week I found a more elegant solution, able
            to uncompound words like 'langetermijnplanning' into
            'lange termijn planning'.

            In Dutch there are 4 possible compounding insertions:
            none (word+word), an s (word+s+word), a dash (word+-+word)
            and the combination (word+s-+word).
            The number of parts in the compound is not limited in
            any way (theoretically).
            Generally, uncompounding works well with parts of at
            least 5 chars.
            Shorter parts lead to wrongly uncompounded words. Some
            parts of shorter
            length are still safe to use though (e.g. jazz).
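
The description above can be sketched as a small recursive splitter. This is not the prototype mentioned in the message (its code does not appear in the thread); `LEXICON`, `SHORT_OK`, `is_part` and `split_compound` are hypothetical names, and the word list is a toy stand-in for a real dictionary.

```python
# Toy lexicon (assumption); a real system would use the full dictionary.
LEXICON = {"lange", "termijn", "planning", "arbeid", "markt", "jazz", "muziek"}
LINKERS = ("", "s", "-", "s-")  # the four insertions named in the message
MIN_LEN = 5                     # shorter parts tend to give wrong splits
SHORT_OK = {"jazz"}             # short parts considered safe anyway


def is_part(s):
    """A usable part is a known word that is long enough or whitelisted."""
    return s in LEXICON and (len(s) >= MIN_LEN or s in SHORT_OK)


def split_compound(word):
    """Return the first split found as a list of parts, or None."""
    if is_part(word):
        return [word]
    for i in range(1, len(word)):
        head = word[:i]
        if not is_part(head):
            continue
        rest = word[i:]
        # Try each insertion between the head and the remainder.
        for linker in LINKERS:
            if rest.startswith(linker):
                tail = split_compound(rest[len(linker):])
                if tail:
                    return [head] + tail
    return None


for w in ("langetermijnplanning", "arbeidsmarkt", "jazz-muziek"):
    print(w, "->", split_compound(w))
```

The sketch returns the first split it finds and does not rank alternatives, so it gives no guarantee that the word as a whole is correct.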

            Now my question: What about other languages?
            - Is your language compounding or not?
            - Are there special situations when compounding, like
              changing the letters at the concatenation point?
            - Which concatenation insertions are there for your
              language?
            - Which part of the compound is semantically the essence
              of the word? (langetermijnplanning, long term plan, is
              mostly a plan; term and long are specifiers.)

            When I know a bit more, I could try to adjust the
            prototype code to
            support multiple languages by design.

            Thanks in advance,

            Ruud

            
------------------------------------------------------------------------------
            Free Next-Gen Firewall Hardware Offer
            Buy your Sophos next-gen firewall before the end March 2013
            and get the hardware for free! Learn more.
            http://p.sf.net/sfu/sophos-d2d-feb
            _______________________________________________
            Languagetool-devel mailing list
            Languagetool-devel@lists.sourceforge.net
            <mailto:Languagetool-devel@lists.sourceforge.net>
            https://lists.sourceforge.net/lists/listinfo/languagetool-devel




        




    




