On Wed, 22 Dec 2021 at 16:35, Grzegorz Kulik <gregorykku...@gmail.com> wrote:

> I have another question. In Polish "ich" can mean both "their" and "them".
> The tagger always chooses the "their" meaning, so I want to create a
> negative match, so that if the word is not followed by a noun, it would
> choose "them". I'm trying to put it together based on the documentation but
> I must be missing something.
>
> REMOVE ICH IF (0 ICH LINK NOT 1 NOUN);
>
> If this worked, it would tag the wordform as an inflected personal
> pronoun. What am I doing wrong?
>

Not necessarily. I don't know exactly what "ICH" is supposed to be. If it
is the lemma, and both readings ("their" and "them") share the lemma "ich",
the rule has to target a tag that tells the two readings apart, e.g.:

REMOVE prn IF (0 ICH LINK NOT 1 NOUN);

Hèctor

>
> Best
> Greg
>
> On Tue, 21 Dec 2021 at 17:59, Grzegorz Kulik (gregorykku...@gmail.com) wrote:
>
> Now I get it. Thank you very much!
>
> Best
> Greg
>
> On Tue, 21 Dec 2021 at 10:27, Daniel Swanson (awesomeevildu...@gmail.com) wrote:
>
> (1C NOUN) would match ^a/b<n>/c<n>$ or ^a/b<n>$ but it would not match
> ^a/b<n>/c<adj>$ - you can read it as "if the next word can only be a
> noun".
>
>
> On Tue, Dec 21, 2021 at 10:10 AM Grzegorz Kulik <gregorykku...@gmail.com> wrote:
>
>
> Thank you both for the suggestions. I never considered CG because it looked
> complicated, but I actually got the hang of it right away. I went with:
>
>
> REMOVE NOUN IF (0 DET) (0 NOUN) (1 (n mp));
>
>
> and it works perfectly. It did not work with 1C there. I looked up the C
> symbol in the documentation and it says "Every reading this position must
> match the pattern (normally only 1 has to)". I don't know what this sentence
> means. Every time this position is read, it must match the pattern? Can I
> find any elaboration on this anywhere? I checked
> http://beta.visl.sdu.dk/cg3/single/ but can't seem to find anything about it
> there.
>
>
> Thank you!
>
> Greg
>
>
> On Tue, 21 Dec 2021 at 09:25, Hèctor Alòs i Font (hectora...@gmail.com) wrote:
>
>
>
>
> On Tue, 21 Dec 2021 at 7:57, Daniel Swanson <awesomeevildu...@gmail.com> wrote:
>
>
> Hi Greg,
>
>
> The file where you want to write rules for this is
>
> https://github.com/apertium/apertium-pol/blob/master/apertium-pol.pol.rlx
>
>
>
> If you want something like "tacy is <det> before <n>", you could get that with
>
>
> SELECT DET IF (0 DET) (0 NOUN) (1 NOUN) ;
>
>
>
> The problem with this rule is that (1 NOUN) is not necessarily a noun, but
> something that can be analysed as a noun at the moment this rule is executed.
> Similarly, the word at position 0 may be correctly analysed as something
> else, such as an adjective. So a more cautious rule could be, for instance:
>
> REMOVE NOUN IF (0 DET) (0 NOUN) (1C NOUN) ;
>
> The problem with this alternative is that it matches less often than the
> first one. It may not solve cases that Daniel's version solves, although it
> probably makes fewer wrong decisions. Your knowledge of the language, and
> testing on a corpus, should help you decide which is better, or you may
> choose something in between. You can tune this by adding a few rules, placed
> before the general one, for frequent words or cases.
>
>
> Hèctor
>
>
>
>
> Daniel
>
>
> On Mon, Dec 20, 2021 at 1:40 PM Grzegorz Kulik <gregorykku...@gmail.com> wrote:
>
>
> Hello all,
>
>
> I haven't contacted you for some time; I hope you are all well. I developed
> the pol-szl pair and although the translation is quite reasonable, I decided
> to make it better by improving the lexical selection. I've been reading the
> documentation and managed to write several rules for forms that need
> disambiguation and are the same part of speech. However, I cannot find any
> information anywhere about what to do if there is a form that can mean two
> completely different things. Example in Polish:
>
>
> tacy (such) = taki<det><dem><mp><pl><nom>
>
> tacy (of a tray) = taca<n><f><sg><gen>/taca<n><f><sg><dat>/taca<n><f><sg><loc>
>
>
> The first meaning is obviously much more frequent but the translator chooses 
> the second one, which is less than desirable.
>
>
> What can I do to remedy this? Can I write rules for that manually? Should I
> train the tagger? If so, what method would be best? There are multiple
> training methods and I don't know which one to choose for my pair. Could you
> recommend the best approach?
>
>
> Thank you in advance
>
> Greg
>
> _______________________________________________
>
> Apertium-stuff mailing list
>
> Apertium-stuff@lists.sourceforge.net
>
>
> https://lists.sourceforge.net/lists/listinfo/apertium-stuff