Re: [Apertium-stuff] GSOC proposal draft - building a prototype MT system

2021-04-09 Thread Hèctor Alòs i Font
Yes, my own experience is that updating the dictionaries more or less
simultaneously is the quickest option.
I usually work on a spreadsheet with the words in decreasing order of
frequency, and I write a script that reads it and generates the XML code
to insert into the dictionaries. It's quick and it avoids lots of silly
errors.
Hèctor
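
A minimal sketch of the spreadsheet-to-XML workflow described above, not the actual script used. It assumes the spreadsheet is exported as CSV with hypothetical column names (freq, lemma, stem, paradigm, translation, pos) and emits entries in the usual monodix and bidix element layout. The sample rows are placeholders, not real Bhojpuri or Hindi data, and multiword lemmas are not handled.

```python
import csv
import io
from xml.sax.saxutils import escape, quoteattr


def monodix_entry(lemma, stem, paradigm):
    """Build one monodix <e> element: lemma, invariant stem, paradigm reference."""
    return f"<e lm={quoteattr(lemma)}><i>{escape(stem)}</i><par n={quoteattr(paradigm)}/></e>"


def bidix_entry(source, target, pos):
    """Build one bidix <e> element pairing a source lemma with a target lemma."""
    return (
        f"<e><p><l>{escape(source)}<s n={quoteattr(pos)}/></l>"
        f"<r>{escape(target)}<s n={quoteattr(pos)}/></r></p></e>"
    )


def generate(csv_text):
    """Read spreadsheet rows (CSV text) and return (monodix_lines, bidix_lines),
    with the most frequent words first."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    rows.sort(key=lambda r: int(r["freq"]), reverse=True)
    mono = [monodix_entry(r["lemma"], r["stem"], r["paradigm"]) for r in rows]
    bidix = [bidix_entry(r["lemma"], r["translation"], r["pos"]) for r in rows]
    return mono, bidix


# Placeholder rows (hypothetical column names and values).
SAMPLE = """freq,lemma,stem,paradigm,translation,pos
12,wordb,wordb,wordb__n,targetb,n
40,worda,word,worda__n,targeta,n
"""

if __name__ == "__main__":
    mono, bidix = generate(SAMPLE)
    print("\n".join(mono))
    print("\n".join(bidix))
```

Generating both files from the same row is what keeps the monodix and bidix in step: a word cannot end up in one dictionary without its counterpart in the other.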


Re: [Apertium-stuff] GSOC proposal draft - building a prototype MT system

2021-04-09 Thread Sevilay Bayatlı
Hi Anuradha,
You need to update your proposal based on what Hèctor suggested. Yes, it is
better to work on both the monodix and the bidix simultaneously, but for a
good lexicon you need to take a small corpus, analyse its sentences, and
add words.

Sevilay
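
One way to act on the corpus suggestion above is to count word frequencies in a small corpus and list the most frequent words that the dictionary does not yet cover. The sketch below is a naive illustration under stated assumptions: `known_words` is a hypothetical set of surface forms already handled, and tokenisation is plain whitespace splitting with punctuation stripped (including the danda). A real measurement would run the corpus through the compiled analyser rather than compare surface forms.

```python
from collections import Counter

# Punctuation stripped from token edges; includes the Devanagari danda marks.
PUNCT = ".,;:!?\"'()[]{}।॥-"


def tokenize(text):
    """Split on whitespace and strip surrounding punctuation."""
    tokens = (raw.strip(PUNCT).lower() for raw in text.split())
    return [t for t in tokens if t]


def coverage_report(corpus_text, known_words, top_n=10):
    """Return (token coverage, most frequent unknown words with counts)."""
    counts = Counter(tokenize(corpus_text))
    total = sum(counts.values())
    covered = sum(c for w, c in counts.items() if w in known_words)
    unknown = [(w, c) for w, c in counts.most_common() if w not in known_words]
    coverage = covered / total if total else 0.0
    return coverage, unknown[:top_n]


if __name__ == "__main__":
    # Placeholder corpus and word list, not real language data.
    cov, missing = coverage_report("a b a c. a b", {"a"})
    print(f"coverage: {cov:.0%}")
    for word, count in missing:
        print(count, word)
```

Working down the list of unknown words from the top adds the entries that raise coverage fastest.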


Re: [Apertium-stuff] GSOC proposal draft - building a prototype MT system

2021-04-08 Thread Anuradha Pandey
Thank you for your response, Hèctor. I read the proposal for the
Hindi-Bengali translator. There are no open-source dictionaries for the
Bhojpuri language (though there are resources for getting a Bhojpuri
corpus), so I have been using a hard copy of a BHO-HIN dictionary to add
the pairs manually. I did some rough calculations, and I should be able to
add at least 8,000 words to the monodix. Based on my experience with
Apertium, adding words to the bidix at the same time makes the work
easier, so I expect roughly the same number of words in the bidix too.
But I don't think I will be able to achieve a WER below 20% with 8,000
words. Should I aim for a WER of nearly 30% then?

Since the time for GSoC has been reduced, I am planning to modify my
proposal, and input from the mentors would be extremely helpful.
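
For reference, the WER (word error rate) targets discussed in this thread are usually understood as the word-level edit distance between the system output and a reference translation, divided by the number of words in the reference. The sketch below is a generic illustration of that definition, not the evaluation tool used by the project. The function name is hypothetical and tokenisation is plain whitespace splitting.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the processed reference prefix
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in output
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One wrong word out of five gives a WER of 20%.
    print(word_error_rate("a b c d e", "a x c d e"))
```

Under this definition a WER below 20% means that, on average, fewer than one word in five of the output has to be changed, inserted, or deleted to match the reference.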

___
Apertium-stuff mailing list
Apertium-stuff@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/apertium-stuff


Re: [Apertium-stuff] GSOC proposal draft - building a prototype MT system

2021-04-07 Thread Hèctor Alòs i Font
Hi, Anuradha.

Thanks for your proposal draft. First, I would like to say that Apertium
is a rule-based translation system because this paradigm still makes
sense for many languages (indeed, for the vast majority of them). If
Bhojpuri had extensive electronic language resources and, in particular,
bilingual corpora, then Apertium would probably not be the best approach.
But this is probably not the case. If it were, Bhojpuri would probably
already be on Google Translate.

As for the project, I would advise you to look at Gourab Chakraborty's
proposal for a Hindi-Bengali translator and the comments on it. Most of the
comments apply to your proposal as well. The following message, for
instance, would be useful to you:
https://sourceforge.net/p/apertium/mailman/message/37251899/

Your proposal seems unrealistic to me. I think 10,000 words in the monodix
(and how many in the bidix?) are not enough for a WER below 20%, except
maybe for two extremely closely related languages.

To evaluate your proposal better, I'd like answers to some basic
questions:

* What is the current state of Bhojpuri and, if it exists, of
the Bhojpuri-Hindi language pair in Apertium?
* Would you have to write a whole Bhojpuri morphological analyser from
scratch and then add some 10,000 words, manually assigning each of them
to a paradigm? How much time will you need for this?
* Where would you get the bilingual dictionary from? Would you have to
create it yourself? Are there freely available bilingual electronic
dictionaries (e.g. Wiktionary)?
* Would you work on a Bhojpuri-to-Hindi translator or on a
Hindi-to-Bhojpuri one? In either case there will be quite a lot of work on
morphological disambiguation. But for one of the sides it only has to be
done once: if both Hindi-to-Bhojpuri and Hindi-to-Bengali are chosen (which
is entirely possible), this work can be shared between the two projects.

There is nothing wrong with doing all this work by hand, if needed. It
depends on the state of the language resources for the given language. But
it is necessary to know to what extent you will have to do this
time-consuming work.

Even when we had twice the time, most projects did not manage to produce a
working translator for a new language pair. Under the current conditions,
it is even more difficult.

Hèctor






Re: [Apertium-stuff] GSOC proposal draft - building a prototype MT system

2021-04-07 Thread Kevin Brubeck Unhammer
Rajarshi Roychoudhury wrote:

> Bhojpuri and Hindi are very closely related languages.
> As far as I know (correct me if I am wrong), apart from some minor
> phonetic changes they can be considered identical.

Seems like a good fit for Apertium then :) considering one of the most
popular pairs in Apertium is Nynorsk–Bokmål. Here's a sentence in
Nynorsk:

- Dette språkparet er kjempepopulært, veldig rart når det er så likt.

And here's the same sentence translated into Bokmål:

- Dette språkparet er kjempepopulært, veldig rart når det er så likt.

I could give a tree structure but I think you get the point.

If people write or want to write things in Bhojpuri, then it would be
useful to have an MT system, and if it doesn't differ much from Hindi,
then it's more likely to succeed in a (short) Apertium GSoC project.




Re: [Apertium-stuff] GSOC proposal draft - building a prototype MT system

2021-04-07 Thread Rajarshi Roychoudhury
# in the grammar

On Wed, Apr 7, 2021, 19:34 Rajarshi Roychoudhury 
wrote:

> Please give an example where CFG vary significantly in the 2 languages


Re: [Apertium-stuff] GSOC proposal draft - building a prototype MT system

2021-04-07 Thread Rajarshi Roychoudhury
Please give an example where the CFG varies significantly between the 2
languages.



Re: [Apertium-stuff] GSOC proposal draft - building a prototype MT system

2021-04-07 Thread Anuradha Pandey
Yes, I did look into the constraint grammar, and the two languages vary
significantly, though lemmas in Bhojpuri are mostly extensions of those in
Hindi. So what would you suggest? Should I translate into Marathi instead?
In terms of languages, I am proficient in Hindi, English, Marathi, and
Bhojpuri.



Re: [Apertium-stuff] GSOC proposal draft - building a prototype MT system

2021-04-07 Thread Rajarshi Roychoudhury
Bhojpuri and Hindi are very closely related languages.
As far as I know (correct me if I am wrong), apart from some minor
phonetic changes they can be considered identical. Have you tried
building disambiguation rules? What is their structure?


On Wed, Apr 7, 2021, 18:57 Anuradha Pandey  wrote:

> Hello everyone,
> I am Anuradha Pandey, a sophomore student at BITS Pilani. I am interested
> in participating in GSoC 2021, on the project "*Develop a prototype MT
> system for a strategic language pair*".
>
> I have prepared a rough draft for it, and I am planning to build a
> Bhojpuri (BHO)-Hindi (HIN) MT pair. I am improving my translation system
> for the coding challenge, and I will update my work in the GitHub
> repository mentioned in the draft. It would be really helpful if I could
> get some feedback before I make the final submission.
>
> Link to the draft -
>
> https://docs.google.com/document/d/1U19gJ3TMKYkYsp-FRthrvXkCRJUnNYSYKi46XhvZGOE/edit?usp=sharing
>
> Thanks & Regards,
> Anuradha Pandey
> IRC: Anuradha_Pandey