On 17:14 Mon 26 Sep, [EMAIL PROTECTED] wrote:
> > For each flag in the affix file, it applies
> > the rules under that flag *in reverse*
> > (i.e. strips affixes) to all of
> > the words it sees in the corpus and looks for
> > common "root words".
> 
> Can it only strip? The condition and 
> modification (addition) are important.

It does both stripping and adding ("strip" was a bad choice
of words).  It applies the rules in the affix file in reverse:
whatever a rule adds is removed, and whatever it strips is
put back.
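
To make "in reverse" concrete, here is a minimal Python sketch of undoing
one suffix rule.  It assumes myspell-style rules with three parts (strip,
add, condition); the names SuffixRule and reverse_suffix are made up for
this illustration and are not taken from the actual tool.

```python
import re
from typing import NamedTuple, Optional


class SuffixRule(NamedTuple):
    strip: str      # characters the rule removes from the root ("" if none)
    add: str        # characters the rule appends
    condition: str  # regex the end of the root must match ("." = anything)


def reverse_suffix(word: str, rule: SuffixRule) -> Optional[str]:
    """Undo a suffix rule: remove `add`, restore `strip`, check the condition."""
    if not word.endswith(rule.add):
        return None
    stem = word[: len(word) - len(rule.add)]
    if not stem:
        return None
    root = stem + rule.strip
    # The condition is tested against the end of the reconstructed root.
    if not re.search("(?:" + rule.condition + ")$", root):
        return None
    return root


if __name__ == "__main__":
    print(reverse_suffix("creation", SuffixRule("e", "ion", "e")))
    print(reverse_suffix("directed", SuffixRule("", "ed", "[^ey]")))
```

A result of None means the rule could not have produced the word; any
other result is only a candidate root, which still has to be confirmed
by other rules or by hand.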

> > such that 5 out of 6 of the flag A rules apply
> > to give the root "grok".  Then it might be safe to add
> > "grok/A" to the word list.  
> 
> Why would it be safe if you could not find the sixth
> rule in the corpus?

It *might* be safe.  In the real-world cases I've tried
this on (Irish, Swahili, Basque, Hiligaynon, Tagalog,
Finnish) this is what happens more often than not:
most, but not all, of the rules under a flag are
attested in real corpora, so you have to check the
missing forms manually.  That, or flag them as
"rarely occurring" and not worry about them.

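The "5 out of 6" idea can be sketched in Python as follows.  Everything
here is an assumption for illustration: the six rules under flag A are
invented, the 0.8 threshold is arbitrary, and the function names are
hypothetical rather than anything in the real tool.

```python
import re
from collections import defaultdict

# Hypothetical flag "A": six suffix rules as (strip, add, condition).
RULES_A = [
    ("", "s", "."),
    ("", "ed", "."),
    ("", "ing", "."),
    ("", "er", "."),
    ("", "ers", "."),
    ("", "able", "."),
]


def undo(word, strip, add, condition):
    """Apply one suffix rule in reverse; return the candidate root or None."""
    if not word.endswith(add) or len(word) <= len(add):
        return None
    root = word[: len(word) - len(add)] + strip
    return root if re.search("(?:" + condition + ")$", root) else None


def candidates(corpus_words, rules, min_fraction=0.8):
    """Return {root: missing_forms} for roots attested by enough rules."""
    hits = defaultdict(set)  # root -> indices of rules seen in the corpus
    for word in set(corpus_words):
        for i, (strip, add, cond) in enumerate(rules):
            root = undo(word, strip, add, cond)
            if root is not None:
                hits[root].add(i)
    result = {}
    for root, seen in hits.items():
        if len(seen) / len(rules) >= min_fraction:
            missing = []
            for i, (strip, add, _cond) in enumerate(rules):
                if i not in seen:
                    base = root[: len(root) - len(strip)] if strip else root
                    missing.append(base + add)
            result[root] = sorted(missing)  # forms to check by hand
    return result


if __name__ == "__main__":
    corpus = ["groks", "groked", "groking", "groker", "grokers"]
    for root, missing in candidates(corpus, RULES_A).items():
        print(root + "/A", "check manually:", missing)
```

The list of missing forms is the part that matters: those are the words
to verify by hand, or to mark as rarely occurring.
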
> directed 36
> direction 31
> directional 36
> directions 32
> 
> What do 31, 32, etc mean?
> Are they groups of flags in myspell?
> Where are these flags documented?
> Where is their connection with the affixes documented?

These represent parts of speech: 31=N, 32=plural N, etc.
They are just an example (there is no real working English
version of Gramadóir, just this illustration).
You specify a mapping between these codes and XML tags
in the input file pos-en.txt; something like this:

31 <N>
32 <N p="y">
33 <V>
34 <V t="y">
35 <V t="n">
36 <A>
37 <R>
38 <C>

Gramadóir doesn't use the myspell affix code or file format directly,
so there is no connection between these tags and myspell flags.
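
For illustration only, here is a small Python sketch of reading a
code-to-tag mapping like the one above and wrapping a word in its tag.
The exact pos-en.txt format is assumed to be "number, whitespace, opening
tag" as in the example, and the function names are hypothetical; this is
not Gramadóir's own code.

```python
def parse_pos_map(text):
    """Parse lines such as '32 <N p="y">' into {32: '<N p="y">'}."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        code, tag = line.split(None, 1)
        mapping[int(code)] = tag
    return mapping


def tag_word(word, code, mapping):
    """Wrap `word` in the opening tag for `code` plus a matching close tag."""
    open_tag = mapping[code]
    # Element name is what follows '<', up to the first space or '>'.
    name = open_tag[1:].split(None, 1)[0].rstrip(">")
    return f"{open_tag}{word}</{name}>"


if __name__ == "__main__":
    sample = '31 <N>\n32 <N p="y">\n36 <A>\n'
    pos = parse_pos_map(sample)
    print(tag_word("direction", 31, pos))
    print(tag_word("directions", 32, pos))
```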

> For example the error in Hungarian:
> I see two boys
> if we write boys it is an error, because 
> if the number is there, the noun must be 
> singular.
> 
> In human language: after a verb if there is a number, or words
> that express quantites( many, several, some), the subsequential noun
> must be singular. 
> 
> How to formulate this in gramadoir?

It depends on how you've chosen the part of speech tags;
"simple" tags like the ones above might give:

<V>ANYTHING</V> <A num="y">ANYTHING</A> <N p="y">ANYTHING</N>:SINGULAR

ANYTHING is a macro for a regular expression matching any word at all.
And here I'm assuming you've tagged numbers with <A num="y">. 
You could also tag "many, several", etc. the same way and this rule
would work without modification.  Or just add another:

<V>ANYTHING</V> <A>(?:many|several|some)</A> <N p="y">ANYTHING</N>:SINGULAR

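To show how a rule like these behaves on tagged text, here is a rough
Python sketch.  It assumes ANYTHING can be approximated by "any run of
characters without whitespace or angle brackets" and treats the part
after the last colon as an error label; the real rule compiler is not
shown in this message and may well differ.

```python
import re

# Assumption: a stand-in for the ANYTHING macro (any single word).
ANYTHING = r"[^<>\s]+"


def parse_rule(line):
    """Split 'PATTERN:LABEL' at the last colon and compile the pattern."""
    pattern, _, label = line.rpartition(":")
    body = pattern.replace("ANYTHING", ANYTHING)
    # Let any run of whitespace in the rule match any run in the text.
    body = re.sub(r"\s+", lambda m: r"\s+", body)
    return re.compile(body), label


def check(tagged_text, rules):
    """Return (label, matched_text) for every rule match in the text."""
    errors = []
    for regex, label in rules:
        for m in regex.finditer(tagged_text):
            errors.append((label, m.group(0)))
    return errors


RULES = [
    parse_rule('<V>ANYTHING</V> <A num="y">ANYTHING</A> '
               '<N p="y">ANYTHING</N>:SINGULAR'),
    parse_rule('<V>ANYTHING</V> <A>(?:many|several|some)</A> '
               '<N p="y">ANYTHING</N>:SINGULAR'),
]

if __name__ == "__main__":
    print(check('<V>see</V> <A num="y">two</A> <N p="y">boys</N>', RULES))
    print(check('<V>see</V> <A num="y">two</A> <N>boy</N>', RULES))
```

The first sentence is flagged because a plural noun follows the number;
the second, with a singular noun, produces no match.
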
Kevin

