Re: [Apertium-stuff] Update about superblanks in transfer
Tanmai Khanna wrote:

> we no longer need the user to be worried about blank
> positions in transfer rules. The latest update to the apertium code makes
> it such that <b pos="X"/> is now the same as <b/>. You can change the
> <b pos="X"/> in your transfer rules to just <b/> and it'll work.
>
> Now, the only thing you need to worry about when writing transfer rules is
> whether you want a blank between the two LUs or not.

___
Apertium-stuff mailing list
Apertium-stuff@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
Re: [Apertium-stuff] Fixing Phonological Processes
Hi Zanga,

Given the highly agglutinative nature of Yao morphology, using dix to model it is probably not a great option. Also, as you and Hèctor have concluded, the morphophonology will be much easier to model using twol.

Given the extent to which the morphology involves prefixes, lexc (what we traditionally use with twol) is probably also a poor choice for modeling the morphology. However, lexd was designed as a replacement for lexc for languages like Yao (and works well with twol). I think this is the route you should take.

Documentation is available here:
https://github.com/apertium/lexd/blob/master/Usage.md

Some languages in Apertium whose morphologies are already implemented in lexd (none are entirely complete yet, but some are pretty far along):

Swahili: https://github.com/apertium/apertium-swa
Lingala: https://github.com/apertium/apertium-lin
Nivkh: https://github.com/apertium/apertium-niv
Wamesa: https://github.com/apertium/apertium-wad

I probably forgot a few, but these should provide good models (and two are related to Yao). There are also a couple of other languages being developed using lexd that aren't public (yet).

And of course you can message this list if you have trouble, or ask in real time in the IRC channel.

--
Jonathan

On Sat, Aug 29, 2020, 02:30 Zanga Chimombo wrote:

> Yes. I think I should be using twol
>
> On Fri, Aug 28, 2020 at 3:56 PM Hèctor Alòs i Font wrote:
> >
> > I don't think you have to do anything with the modes or the compilation
> > file. The problem is in the post-yao.dix file.
> > If you add <a/>, it works:
> >
> > nk
> > ng
> >
> > $ echo "~nka" | lt-proc -p yao.autopgen.bin
> > nga
> > $ echo "~nkb" | lt-proc -p yao.autopgen.bin
> > nkb
> >
> > I don't know why without <a/> there is no match, but in any case you
> > need to add <a/> to the relevant places (words, affixes, etc.) you want to
> > trigger this rule. If you want nk + vowel to always become ng, you
> > should do this in twol, not here.
[Apertium-stuff] Update about superblanks in transfer
Hey guys!

The wordbound blanks project handles blanks that are supposed to be reordered. Therefore, we no longer need the user to be worried about blank positions in transfer rules. The latest update to the apertium code makes it such that <b pos="X"/> is now the same as <b/>. You can change the <b pos="X"/> in your transfer rules to just <b/> and it'll work.

Now, the only thing you need to worry about when writing transfer rules is whether you want a blank between the two LUs or not.

*Input blanks will be stored as a queue and will be printed in order in all available spots in the rule output.*

*Note:*
- If the output rule has more blank spots than input blanks, then the remaining blank spots will be spaces.
- If the output rule has fewer blank spots than input blanks, then the remaining input blanks will be output after the rule output.
- If the input blank is an empty string, it is stored as a space.

In some transfer rules, there are input patterns which don't have a space between them. In the output section of these transfer rules, <b/> used to give an empty string, but it will now give a space. To remove the blank from the output, you will need to remove the <b/> from the transfer rule and it will be fine.

Here are some examples from the tests.

EXAMPLE 1:
Input: [blank1] ^worda/wordta$ ;[blank2]; ^wordb/wordtb$ [blank3]; ^hun/ho$ [blank4]
There's no <b/> in the rule output, so all blanks are flushed after the rule output.
Output: [blank1] ^test1{^wordta$^wordtb$^ho$}$ ;[blank2]; [blank3]; [blank4]

EXAMPLE 2:
Input: [blank1] ^wordb/wordtb$ ;[blank2]; ^worda/wordta$ [blank3]; ^hun/ho$ [blank4]
There's one <b/> in the rule output, so it prints one blank and flushes the rest.
Output: [blank1] ^test1{^wordta$ ;[blank2]; ^ho$}$ [blank3]; [blank4]

This has been implemented for the chunker, interchunk, and postchunk. If you have any questions, suggestions, comments, etc., I'll be happy to respond to them.
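For anyone who wants to play with the queue behaviour described above, here is a minimal Python sketch of it. It is only an illustration, not the actual apertium code: the function name `render_rule_output` and the `BLANK` sentinel are made up for this example, the rule output is modelled as a flat list of strings, and the chunk wrapper (`^test1{...}$`) is left out.

```python
from collections import deque

# Hypothetical sentinel standing in for a <b/> in the rule output.
BLANK = object()


def render_rule_output(input_blanks, output_items):
    """Fill the blank slots of a rule output from a queue of input blanks.

    input_blanks: blanks found between the matched input LUs, in order.
    output_items: strings (LUs) and BLANK sentinels, in output order.
    """
    # An empty input blank is stored as a single space.
    queue = deque(b if b != "" else " " for b in input_blanks)
    parts = []
    for item in output_items:
        if item is BLANK:
            # Use the next input blank; if none are left, emit a space.
            parts.append(queue.popleft() if queue else " ")
        else:
            parts.append(item)
    # Input blanks that found no slot are flushed after the rule output.
    parts.extend(queue)
    return "".join(parts)


# No blank slot in the output: both input blanks are flushed at the end.
print(repr(render_rule_output([" [b2] ", " [b3] "], ["^a$", "^b$", "^c$"])))
# One blank slot: the first blank fills it, the second is flushed.
print(repr(render_rule_output([" [b2] ", " [b3] "], ["^a$", BLANK, "^c$"])))
# More slots than input blanks: the extra slot becomes a space.
print(repr(render_rule_output([""], ["^a$", BLANK, "^b$", BLANK, "^c$"])))
```

The three calls correspond to the three cases in the note: no slots, fewer slots than blanks, and more slots than blanks.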
Thanks and Regards,
*तन्मय खन्ना*
*Tanmai Khanna*
Re: [Apertium-stuff] Fixing Phonological Processes
Yes. I think I should be using twol On Fri, Aug 28, 2020 at 3:56 PM Hèctor Alòs i Font wrote: > > I don't think you have to do anything with the modes or the compilation file. > The problem is in the post-yao.dix file. > If you add , it works: > > > > nk > ng > > > > > $ echo "~nka" | lt-proc -p yao.autopgen.bin > nga > $ echo "~nkb" | lt-proc -p yao.autopgen.bin > nkb > > I don't know why without there is no match, but in any case you need to > add to the relevant places (words, affixes, etc.) you want to trigger > this rule. If you want that always nk + vowel should be ng, you should this > in twol, not here. > > Hèctor > > Missatge de Zanga Chimombo del dia dv., 28 d’ag. 2020 > a les 15:41: >> >> I am still not getting anywhere and both modes.xml and the Makefile >> seem ok. My code is here: >> https://gitlab.com/zangaphee/CiBantu/-/tree/master/twoc/apertium-yao >> >> On Fri, Aug 28, 2020 at 7:36 AM Hèctor Alòs i Font >> wrote: >> > >> > The relevant files are modes.xml and Makefile.am I recommend taking a look >> > to them in e.g. apertium-fra and apertium-fra-cat (or any other released >> > pair using post-generation). In the first one you define the pipeline, so >> > copy and adapt the call to autopgen in the end. In the second one you have >> > the actual compilation of the programme. >> > >> > Missatge de Zanga Chimombo del dia dv., 28 d’ag. >> > 2020 a les 7:52: >> >> >> >> Hi again, I actually have: >> >> >> >> >> >> >> >> nk >> >> ng >> >> >> >> >> >> >> >> >> >> But it doesn't seem to get executed. Is there a missing flag/ switch >> >> that I was supposed to initialise/ build with? I am not seeing >> >> anything relating to building autopgen in the modes.xml file in the >> >> monolingual directory...? >> >> >> >> On Thu, Aug 27, 2020 at 2:57 PM Hèctor Alòs i Font >> >> wrote: >> >> > >> >> > Yes, it is in the monodix. It is just a mark put on the right side, e.g. 
>> >> >
>> >> > que
>> >> > que que n="itg"/>
>> >> >
>> >> > If you want, you may leave it out, but if you have in the post-dix file
>> >> > something like:
>> >> >
>> >> > nk
>> >> > ng
>> >> >
>> >> > ... then every nk will be substituted by ng. That is not what you want,
>> >> > for sure. So better to put a mark in the dictionary to know which "nk"
>> >> > may be changed (in some contexts) to ng.
>> >> >
>> >> > Message from Zanga Chimombo on Thu, 27 Aug 2020 at 15:18:
>> >> >>
>> >> >> Looking at the examples in apertium-fra.post-fra.dix it is clear that
>> >> >> the tilde/ ~/ <a/> is inserted as some sort of marker earlier in the
>> >> >> pipeline so that the PG recognises it and acts on it.
>> >> >>
>> >> >> Where in the pipeline is it inserted? Could you give me a line number
>> >> >> of the insertion within the monodix perhaps?
>> >> >>
>> >> >> On Thu, Aug 27, 2020 at 12:12 PM Hèctor Alòs i Font wrote:
>> >> >> >
>> >> >> > You can take a look, for instance, at
>> >> >> > https://github.com/apertium/apertium-fra/blob/master/apertium-fra.post-fra.dix
>> >> >> >
>> >> >> > For example (at line 633):
>> >> >> > nen'
>> >> >> >
>> >> >> > Message from Hèctor Alòs i Font on Thu, 27 Aug 2020 at 13:07:
>> >> >> >>
>> >> >> >> There are two things in:
>> >> >> >>
>> >> >> >> nk
>> >> >> >> ng
>> >> >> >>
>> >> >> >> First is the <a/> that must precede (that's the ~ Kevin mentioned,
>> >> >> >> because it is shown as a tilde in the output). If you don't have
>> >> >> >> it, there won't be any matching.
>> >> >> >>
>> >> >> >> Second is the <b/>, i.e. a space. So nk- will not match, but only
>> >> >> >> nk followed by a blank (a <b/> preceded by an <a/>). If matched, it will
>> >> >> >> be replaced by ng followed by a blank too.
>> >> >> >>
>> >> >> >> Hèctor
>> >> >> >>
>> >> >> >> Message from Zanga Chimombo on Thu, 27 Aug 2020 at 12:31:
>> >> >> >>>
>> >> >> >>> Not sure I know what you mean by "~"...? Sorry. I'm new to this
>> >> >> >>>
>> >> >> >>> The input is "nkutenda". Expected output: "ngutenda".
>> >> >> >>>
>> >> >> >>> On Thu, Aug 27, 2020 at 11:26 AM Kevin Brubeck Unhammer wrote:
>> >> >> >>> >
>> >> >> >>> > Zanga Chimombo wrote:
>> >> >> >>> >
>> >> >> >>> > > One of the processes that occurs in one of the languages I am
>> >> >> >>> > > dealing with is "nk-" becoming "ng-"
>> >> >> >>> > >
>> >> >> >>> > > I thought I would be able to fix this using the post generator
>> >> >> >>> > > here:
>> >> >> >>> > > https://gitlab.com/zangaphee/CiBantu/-/blob/master/twoc/apertium-yao/apertium-yao.post-yao.dix
>> >> >> >>> > >
>> >> >> >>> > > However, that doesn't fix it. Have I done it incorrectly?
>> >> >> >>> > > Should I