Hello,

>> I would like to add a new language to LanguageTool, namely Walloon 
>> language (code "wa").
>
> that's great. As we have issues with language support becoming 
> unmaintained, I'd only like to add new languages when I can be sure they 
> are maintained in the long run. For you, this means: please keep working 

Walloon doesn't have many computer-savvy speakers, so work on the
tool will not be at the same level as for other languages.
On the other hand, the grammar rules don't change that often :)

However, I plan to keep an eye on it and not abandon it.


I have already implemented a few rules (regexp-based; I don't yet understand
how to add POS tags. I tried putting a few lines in the added.txt file,
but it seems to have no effect).

I also found what seems to be a bug (or at least strange behaviour).

In French, to say "here I am" you say "me voilà".
However, in Walloon "mi vola" is wrong; it should be "vo m' la".
That is, the pronoun (or pronouns) is inserted in the middle of the word "vola".

So I wrote a rule like this:

 <pattern>
   <token regexp="yes">([mtl]i|è[mt]|el)</token>
   <token regexp="yes">vo?(la|ci|cial|chal)</token>
 </pattern>
 <message> ... <suggestion>vo <match no="1" case_conversion="startlower"
        regexp_match="[èeÈE]?([mtlMTL])i?$" 
        regexp_replace="$1'"></match><match no="2" 
        regexp_match="vo?(la|ci|cial|chal)$"
        regexp_replace="$1"></match></suggestion></message>

The suggestion writes "vo ", then the pronoun in its apostrophe form
(mi/èm -> m', ti/èt -> t', li/el -> l'), and then the ending of
"vola" or its variants.
If the matched text starts in uppercase (e.g. "Mi vla"), the suggestion
inherits that casing, so the leading "vo " becomes "Vo ".
I therefore set case_conversion="startlower" on the pronoun match (to get
"Vo m' la" and not "Vo M' la").

That works just as I want.
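
To make the intended transformation concrete, here is a small Python sketch
that mimics the two <match> elements above. It is only an illustration, not
LanguageTool code: the function name is hypothetical, the capitalisation step
is an assumption modelled on the behaviour described above, and the space
between the pronoun and the ending follows the output quoted above ("Vo m' la").

```python
import re


def walloon_suggestion(pronoun: str, presentative: str) -> str:
    """Hypothetical helper mimicking the suggestion of the token-based rule."""
    # <match no="1">: keep only the consonant of the pronoun and add an apostrophe
    clitic = re.sub(r"[èeÈE]?([mtlMTL])i?$", r"\1'", pronoun)
    # case_conversion="startlower": force the first character to lowercase
    clitic = clitic[:1].lower() + clitic[1:]
    # <match no="2">: keep only the ending of "vola" and its variants
    ending = re.sub(r"vo?(la|ci|cial|chal)$", r"\1", presentative)
    # Assumption: a space separates pronoun and ending, as in "vo m' la"
    suggestion = f"vo {clitic} {ending}"
    # Assumption: the suggestion is capitalised when the matched text was
    if pronoun[:1].isupper():
        suggestion = suggestion[:1].upper() + suggestion[1:]
    return suggestion


print(walloon_suggestion("Mi", "vla"))
print(walloon_suggestion("el", "voci"))
```

With these assumptions, "Mi vla" gives "Vo m' la" and "el voci" gives "vo l' ci".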

Then I wrote a rule using <regexp>
(it is for things like "nel vola-t i nén" --> "vo nel la-t i nén"):

 <regexp>[Nn](' el|el|i l') (vo?(la|ci|cial|chal))(-t i|-t ele|) nén</regexp>
 <message> ... <suggestion>vo <match ..... ></message>

Here the "vo " part is ignored; that is, text inside <suggestion> but
outside <match> seems to be ignored in this case.

So I rewrote the suggestion like this:
  <suggestion><match no="1"
     regexp_match="[Nn].* vo?(la|ci|cial|chal)(-t i|-t ele|) nén"
     regexp_replace="vo nel $1$2 nén"/></suggestion></message>

But then I lose the case preservation feature (e.g. for "Nel vola..."
I only get "vo nel..." and not "Vo nel...").
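
The difference can be reproduced outside LanguageTool with a short Python
sketch. It applies the same regexp_match / regexp_replace pair (with "$1$2"
written as "\1\2" in Python syntax) and then shows the behaviour I would like
to get. Both function names are hypothetical, and the second function only
illustrates the desired result; it says nothing about how LanguageTool
handles this internally.

```python
import re

# Same expression as the regexp_match attribute above
PATTERN = re.compile(r"[Nn].* vo?(la|ci|cial|chal)(-t i|-t ele|) nén")
REPLACEMENT = r"vo nel \1\2 nén"


def rewrite(text: str) -> str:
    """Plain regex replacement: the case of the original is lost."""
    return PATTERN.sub(REPLACEMENT, text)


def rewrite_keep_case(text: str) -> str:
    """Desired behaviour: capitalise the result if the match was capitalised."""
    m = PATTERN.search(text)
    if m is None:
        return text
    replacement = m.expand(REPLACEMENT)
    if m.group(0)[:1].isupper():
        replacement = replacement[:1].upper() + replacement[1:]
    return text[:m.start()] + replacement + text[m.end():]


print(rewrite("Nel vola-t i nén"))
print(rewrite_keep_case("Nel vola-t i nén"))
```

The first call returns the lowercase "vo nel la-t i nén" that I get now; the
second returns "Vo nel la-t i nén", which is what I would expect.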

Thanks


-- 
Ki ça vos våye bén,
Pablo Saratxaga

http://chanae.walon.org/pablo/          PGP Key available, key ID: 0xD9B85466

_______________________________________________
Languagetool-devel mailing list
Languagetool-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/languagetool-devel