Gourab, you can also follow
https://github.com/RajarshiRoychoudhury/apertium-bn-hi

On Wed, 31 Mar 2021 at 01:21, Hèctor Alòs i Font <hectora...@gmail.com>
wrote:

> Hi, Gourab.
>
> I don't know if you already got other reviews in the IRC channel. Here are
> my five cents:
>
> 1) Did you do the coding challenge? This is a must.
>
> 2) It would be good to know more about the current state of the hin-ben
> pair. Because there isn't any information on this in your proposal, I've
> taken a look at the repositories on GitHub. I was surprised that no
> hin-ben pair has been created yet in the Apertium repository (although there is
> https://github.com/srj31/apertium-ben-hin). The hin monodix has 30,000+
> entries and the ben monodix some 8,000. Furthermore, as I imagined, the
> morphological disambiguator for Hindi has very few rules (I guess they are
> not very necessary for translating to Urdu).
>
> So there is quite a lot of work to do. It will be very hard to actually build a
> translator with a WER below 25% (unless srj31's project already
> contains quite a lot of work that can be reused).
>
> 3) Are there any free sources that can be used to fill the bidix (e.g.
> Wiktionary)? Or do you plan to translate at least 10,000 Hindi words by
> hand (or, much better, 12,000-14,000 words to get a WER below 30%)? How
> many words will you be able to translate per day? This alone would take most
> of your time. And, since there are only 8,000 words in the Bengali monodix,
> you'll need to add many of them to the Bengali monodix, which also takes
> quite a lot of time. Again the same question: will you need to create
> these words (and maybe the paradigms) in the monodix, or will you be able to
> get many new words (and their association to Apertium paradigms) from free
> electronic sources?
>
> 4) In fact, your targets seem to be more a wish than something achievable. I
> recommend that you try to create a week-by-week calendar, in order to better
> understand how much time you'll have to add words and to create transfer rules,
> morphological disambiguation rules and lexical selection rules. I don't
> know anything about Indo-Iranian languages, but all the Indo-European languages I
> know need quite a lot of work on morphological disambiguation and, despite
> this, it is one of the main sources of errors in the Apertium translators.
>
> You can take a look at these work plans:
> https://wiki.apertium.org/wiki/Grfro3d/proposal_apertium_cat-srd_and_ita-srd#Workplan
>
> https://wiki.apertium.org/wiki/User:Hectoralos/GSOC_2020_proposal:_French-Arpitan#Workplan
> (but take into account that in previous years the number of hours
> devoted to a GSoC project was twice as high as this year's)
>
> 5) Why do you have to improve the Bengali morphological analyser? Why
> add inflections for both Bangladeshi Bengali and Indian Bengali? The
> project is already too complex and overloaded to add the possibility of
> generating two flavours of Bengali (because it would be a matter of
> generating Bengali, not of parsing it for translating into Hindi). I would
> generate the Bengali that is currently in the Bengali monodix (the Indian
> one, I guess).
>
> Best,
> Hèctor
>
> Message from Gourab Chakraborty IIIT Dharwad <19bcs...@iiitdwd.ac.in> on
> Mon, 29 Mar 2021 at 20:20:
>
>> Hi all,
>> I am planning to create the Apertium Hindi-Bengali language pair as per
>> the suggestions I was given by the developers. The GSoC application window
>> would begin soon, so I request the mentors to kindly give a review of my
>> final proposal, for any last minute changes that are required.
>>
>> Thanks a lot!
>> --
>> Gourab Chakraborty
>> IRC: gourab337
>> _______________________________________________
>> Apertium-stuff mailing list
>> Apertium-stuff@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/apertium-stuff
>>
