Hi all,
I've completed the preliminary draft of my proposal and would really
appreciate your comments and suggestions on it:
http://wiki.apertium.org/wiki/Pmodi/GSOC_2020_proposal:_Hindi-Punjabi
Francis (firstly, sorry for CC'ing you personally), since you have been
managing the repo, could you
Hi Priyank,
Yes, I now see that the Hindi गलत__adj paradigm is like this, and the
Punjabi ਗਲਤ__adj seems to be a copy of it.
I can only say that we do it differently in the Romance languages I work
with. I can't say that the "Hindi method" is bad. It works for Hindi-Urdu,
doesn't it? This makes
>
> By the way, it seems strange that you have 9 analyses for this adjective.
> Usually in these cases we put only the first analysis in the dictionary.
> The others, if really needed, can be added as .
Regarding this, I found a number of such anomalies in the Hindi monodix,
and tried to resolve
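As a quick way to spot such anomalies, something like the sketch below could scan a monodix for lemmas with more than one entry. The XML here is a toy stand-in with made-up Romanized entries, not the real Hindi dictionary, and the helper name is mine:

```python
# Sketch: find lemmas with duplicate entries in an Apertium monodix.
# The XML below is a toy stand-in for a real .dix file; entry and
# paradigm names are illustrative, not taken from apertium-hin.
import xml.etree.ElementTree as ET
from collections import Counter

DIX = """
<dictionary>
  <section id="main" type="standard">
    <e lm="galat"><i>galat</i><par n="galat__adj"/></e>
    <e lm="galat"><i>galat</i><par n="galat__adj"/></e>
    <e lm="ghar"><i>ghar</i><par n="ghar__n"/></e>
  </section>
</dictionary>
"""

def duplicate_lemmas(dix_xml: str) -> dict:
    """Return {lemma: count} for lemmas appearing in more than one <e>."""
    root = ET.fromstring(dix_xml)
    counts = Counter(e.get("lm") for e in root.iter("e") if e.get("lm"))
    return {lm: n for lm, n in counts.items() if n > 1}

print(duplicate_lemmas(DIX))  # {'galat': 2}
```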
Hi Hector,
Thank you so much for taking the time to look at my challenge in detail and
provide feedback. I already understand this error and will work on
removing all '#' symbols in the final submission of my coding challenge. To
start with, the number of '#'s was at least 3-4 times what I
Hi Priyank,
I've been looking at your coding challenge. I can't understand anything, but
I see the symbol '#' relatively often. That is annoying. See:
http://wiki.apertium.org/wiki/Apertium_stream_format#Special
This happens, for instance, when in the bidix the target word has a given
gender and/or
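The stream-format page linked above explains these marks ('*' for unknown words, '@' for bidix/transfer gaps, '#' for forms the generator could not inflect). A minimal sketch of tallying them in Apertium's plain-text output; the sample words and the helper name are made up:

```python
# Sketch: tally the error marks Apertium leaves in its plain-text output.
# '*' = unknown word, '@' = lemma missing from the bidix/transfer output,
# '#' = surface form the generator could not produce (e.g. a gender or
# number tag mismatch between bidix and target monodix).
import re
from collections import Counter

def error_marks(translated_text: str) -> Counter:
    """Count words carrying each Apertium error symbol."""
    return Counter(m.group(1)
                   for m in re.finditer(r"([*@#])\S+", translated_text))

sample = "eh #galat hai ate *foobar vi"  # made-up sample output
print(dict(error_marks(sample)))  # {'#': 1, '*': 1}
```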
Hi Priyank,
I'll try to look at your coding challenge later, although I'm not sure I'll
be able to read anything :)
With regard to the use of mass word-matching techniques between the two
languages, I would strongly advise against it in the first phase of the
project and would use it very
Hi Hector, Francis;
I've made progress on the coding challenge and wanted your feedback on it
- https://github.com/priyankmodiPM/apertium-hin-pan_pmodi
(The bin files remained after a `make clean`, so I didn't remove them from
the repo; let me know if this is incorrect)
> I've attempted to
Hi Priyank,
I calculated the coverage on the Wikipedia dumps I got, and which I used
for getting the frequency lists. I think this is fair, since these corpora
are enormous. But I calculated WER on the basis of other texts. I
calculated it only a few times, at fixed project benchmarks, since I
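For reference, the two metrics mentioned here can be approximated roughly as follows. This is only a sketch: coverage is usually measured with Apertium's own scripts over the analyser output, and the token list and sentences below are illustrative:

```python
# Sketch: naive coverage and word error rate (WER).
# Coverage here = share of tokens the analyser recognised (no '*' mark);
# WER = word-level Levenshtein distance divided by reference length.
def coverage(analysed_tokens):
    """Fraction of tokens without the '*' unknown-word mark."""
    toks = list(analysed_tokens)
    known = sum(1 for t in toks if not t.startswith("*"))
    return known / len(toks) if toks else 0.0

def wer(reference: str, hypothesis: str) -> float:
    """Word-level edit distance between the strings, over reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    d = list(range(len(hyp) + 1))  # one row of the edit-distance table
    for i, r in enumerate(ref, 1):
        prev, d[0] = d[0], i
        for j, h in enumerate(hyp, 1):
            # deletion, insertion, substitution/match
            prev, d[j] = d[j], min(d[j] + 1, d[j - 1] + 1, prev + (r != h))
    return d[len(hyp)] / len(ref) if ref else 0.0

print(coverage(["ghar", "*foobar", "vich"]))   # 2 of 3 tokens known
print(wer("the cat sat", "the cat sat down"))  # one insertion over 3 words
```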
Hi Hector,
Thank you so much for the reply. The proposals were really helpful. I've
completed the coding challenge for a small set of 10 sentences (for now),
which I believe Francis has added to the repo as a test set. I'll include
the same in the proposal. For now, I'm working on building the
Hi Priyank,
Hindi-Punjabi seems to me a very nice pair for Apertium. Closely related
pairs usually give unsatisfactory results with Google Translate,
because most of the time there is an intermediate translation into
English. In any case, if you can give some data about the quality of
Hi,
I am trying to work towards developing the Hindi-Punjabi pair and needed
some guidance on how to go about it. I ran the test files and noticed
that the dictionary file for Punjabi needs work (even a lot of function
words could not be found by the translator). Should I start with that? Are