Francis Tyers <fty...@prompsit.com> writes:

> On Sun, 28 Feb 2010 at 21:40 +0200, Harri Pitkänen wrote:
>> On Sunday 28 February 2010, Francis Tyers wrote:
>> > > I don't know Icelandic at all and therefore can't tell whether some of
>> > > the words are accepted or rejected incorrectly.
>> > 
>> > Nice, it looks good. Some of the capitalised words should be recognised
>> > correctly, at least 'Bretlandi' and 'Norðmenn'.
>> 
>> I tried to fix the checking of capitalized words but started to run into
>> problems. It seems that the library API works in somewhat surprising (at
>> least to me) ways when you enter a word that starts with a capital letter
>> and ends with garbage.
>> 
>> The implementation is here
>> http://voikko.svn.sourceforge.net/viewvc/voikko/trunk/libvoikko/src/morphology/LttoolboxAnalyzer.cpp?revision=3182&view=markup
>> 
>> and test cases here
>> http://voikko.svn.sourceforge.net/viewvc/voikko/trunk/libvoikko/python/ApertiumIcelandicTest.py?revision=3183&view=markup
>> 
>> I was able to get all test cases to pass except the one with TODO in the
>> method name. How would you suggest fixing the code so that all tests would
>> pass? Of course a patch would be most welcome :)
>
> Hmm, strangely enough, when I try an unknown word I get similar strange
> output:
>
> $ ./test mor.bin 
> ^Reykjanghfghesi$ -->
> ^Reykja<vblex><actv><inf>/Reykja<vblex><actv><pri><p3><pl>/Reykur<n><m><pl><gen><ind>$

Seems to be a bug with partly-matching regexes in the biltrans
functions.

Testing the different functions, I get:

    biltransWithQueue: ^Reykja<vblex><actv><inf>/Reykja<vblex><actv><pri><p3><pl>/Reykur<n><m><pl><gen><ind>$ qSize: 0
    biltransWithoutQueue: ^Reykja<vblex><actv><inf>/Reykja<vblex><actv><pri><p3><pl>/Reykur<n><m><pl><gen><ind>$
    biltrans: ^Reykja<vblex><actv><inf>/Reykja<vblex><actv><pri><p3><pl>/Reykur<n><m><pl><gen><ind>$
    biltransfull: ^$

But, if I comment out the two regex entries

    <e>                      <par n="persons"/></e>
    <e>                      <par n="organisations"/></e>

at the end of apertium-is-en.is.dix, I get

    biltransWithQueue: @Reykjanghfghesi qSize: 0
    biltransWithoutQueue: @Reykjanghfghesi
    biltrans: @Reykjanghfghesi
    biltransfull: @Reykjanghfghesi

I see the same thing on the command line with lt-proc -b (while regular
lt-proc -a returns unknown, as it should; the persons/organisations
regexes don't fully match either).
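
To make the suspicion concrete, here is a toy Python sketch of the two
behaviours. It is not lttoolbox code and I have not traced the real cause;
the one-entry lexicon and both function names are made up for illustration.
It only contrasts "keep the analysis of the longest matching prefix and drop
the rest" (what the biltrans output above looks like) with "only accept a
match that consumes the whole word" (what lt-proc -a appears to do).

```python
# Toy model only: NOT lttoolbox code. LEXICON and both function names
# are hypothetical and exist purely to illustrate the two behaviours.
LEXICON = {"Reykja": ["Reykja<vblex><actv><inf>"]}


def lookup_prefix(word):
    """Suspected buggy behaviour: analyse the longest known prefix
    and silently discard the unmatched remainder of the word."""
    for end in range(len(word), 0, -1):
        prefix = word[:end]
        if prefix in LEXICON:
            return "^" + "/".join(LEXICON[prefix]) + "$"
    return "@" + word  # nothing matched at all: mark as unknown


def lookup_whole(word):
    """Expected behaviour: accept an analysis only if the whole word
    is in the lexicon, otherwise mark the word as unknown."""
    if word in LEXICON:
        return "^" + "/".join(LEXICON[word]) + "$"
    return "@" + word


if __name__ == "__main__":
    for w in ("Reykja", "Reykjanghfghesi"):
        print(w, lookup_prefix(w), lookup_whole(w))
```

With the garbage word, lookup_prefix returns an analysis for "Reykja" while
lookup_whole returns the word marked with "@", which is the difference
between the two sets of outputs above.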


-- 
Kevin Brubeck Unhammer

_______________________________________________
Apertium-stuff mailing list
Apertium-stuff@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/apertium-stuff