Hello Brother
I am impressed that you achieve 80% accuracy with this system.
I suppose the line "Ioann=Ioann" must be "Ioann=Ioánn"
I also suppose "hosti==hósti" has to be corrected into
"hosti=hósti"
In the recent Vatican editions, "Míchaël" is written "Míchael" but
in that case, after the replacement of "ae" by "æ" at the end of
the processing, some corrections will have to follow, like
"Míchæl" back into "Míchael".
I see your program foresees the "j" as well as the "i", like in
"eius" and "ejus". So it could work on the Vulgata and the Nova
Vulgata. Is that right?
Do you think the accuracy of the system can be improved
significantly, or do you think you have reached the limit?
Anyway, I am ready to help as far as I can. Some cases will always
require human intervention, like "ténere" (tenderly) and
"tenére" (to hold). You also pointed out "advenit".
I wonder what will happen with "coegit", for example, where "oe" cannot
be changed into "œ".
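To make the idea concrete, here is a minimal sketch of the ligature step followed by the corrections, covering the "Míchael" and "coegit" cases. The function name and the CORRECTIONS table are only my assumptions for illustration; they are not taken from Latin.ini.

```python
# Hypothetical correction table: wrongly ligated form -> form to restore.
# Only the two words mentioned in this thread are listed.
CORRECTIONS = {
    "Míchæl": "Míchael",
    "cœgit": "coegit",
}

def apply_ligatures(text):
    # Step 1: blanket replacement of the digraphs.
    for plain, ligature in (("ae", "æ"), ("oe", "œ"), ("Ae", "Æ"), ("Oe", "Œ")):
        text = text.replace(plain, ligature)
    # Step 2: undo the replacement where the two vowels are pronounced apart.
    for wrong, right in CORRECTIONS.items():
        text = text.replace(wrong, right)
    return text

print(apply_ligatures("Míchael coegit caelum poenam"))
# prints: Míchael coegit cælum pœnam
```

If the accent has already been placed on one of the two vowels, as in "coégit", the pair no longer reads "oe" and the blanket replacement leaves it alone.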
Anyway, we have to keep in touch for this matter. Kind regards.
Fr. Pierre
On 02/22/2014 03:13 AM, Brother Gabriel-Marie wrote:
Hello, y'all.
I've dabbled in this, and have an effective method that is about
80% accurate.
It does a sequential find and replace, replacing certain
combinations of letters first, then other combinations
afterward. The list of Latin words/particles is in a very
particular order, and still needs a good bit of tweaking.
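For anyone who wants to try the idea in another language, a rough sketch of an ordered find and replace might look like the following. It assumes each rule is a plain "find=replace" line, as in the entries quoted elsewhere in this thread; the function names and the handling of comments and section headers are assumptions, not a description of the actual program.

```python
def load_rules(lines):
    """Parse 'find=replace' lines into an ordered list of pairs."""
    rules = []
    for line in lines:
        line = line.strip()
        # Skip blanks, comments, and section headers such as [Latin].
        if not line or line.startswith((";", "#", "[")):
            continue
        find, sep, replace = line.partition("=")
        if sep and find:
            rules.append((find, replace))
    return rules

def accent(text, rules):
    # Rules run one after another, so a later rule sees the output
    # of every earlier rule. This is why the order of the list matters.
    for find, replace in rules:
        text = text.replace(find, replace)
    return text

rules = load_rules(["Ioann=Ioánn", "hosti=hósti"])
print(accent("Ioannes hostiam", rules))
# prints: Ioánnes hóstiam
```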
It is actually part of a program I have been writing in my free
time. Since it is set up in an ini file, you should be able to
easily reproduce the search and replace in whatever language you
like. If you improve it, however, I want to be involved, please!
I have attached the file: Latin.ini
-BGM
On 2/19/2014 8:23 AM, Benjamin Bloomfield wrote:
For some words, it is easy to tell that the penultimate
syllable is long and should therefore be accented (e.g.,
adventus, because -ven- ends in a consonant; a diphthong
(au, æ, œ) in the penultimate syllable would make it long
as well). The real trick would be to have a list of words
whose penultimate syllable is never long, and one of words
that always have a long vowel in the penultimate syllable.
Some words belong in neither list: advenit is ambiguous
because it has a long e in the perfect tense and a short e
in the present tense. If anyone could get such lists of
Latin words together, I could write a script to add accents
to all the words whose accent is unambiguous, and then list
all the 3+ syllable words whose accent would need to be
determined by the context.
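As a rough illustration of the "easy" cases, the sketch below marks a word as accented on the penultimate syllable when that syllable contains a diphthong or is closed by consonants, and otherwise reports that vowel length is needed. It assumes unaccented input that already uses æ and œ for true diphthongs, treats every i and u as a vowel except the u of "qu", and lets a stop plus l or r (as in tenebræ) not count as closing the syllable. All names are hypothetical and the rules are simplified.

```python
import re

# Syllable nuclei: the diphthongs named above (au, æ, œ) or a single vowel.
NUCLEUS = re.compile(r"au|[æœ]|[aeiouy]")
DIPHTHONGS = {"au", "æ", "œ"}
# A stop followed by l or r does not close the preceding syllable.
MUTE_LIQUID = re.compile(r"[bcdfgpt][lr]")

def classify(word):
    w = word.lower().replace("qu", "q")  # the u of "qu" is not a vowel
    nuclei = list(NUCLEUS.finditer(w))
    if len(nuclei) < 3:
        return "no mark needed"
    penult, last = nuclei[-2], nuclei[-1]
    # Consonants between the last two vowels; h never closes a syllable.
    between = w[penult.end():last.start()].replace("h", "")
    if penult.group() in DIPHTHONGS:
        return "penult"
    if between in ("x", "z"):  # double consonants written as one letter
        return "penult"
    if len(between) >= 2 and not MUTE_LIQUID.fullmatch(between):
        return "penult"
    return "needs vowel length"

for word in ("adventus", "thesaurus", "advenit", "tenere"):
    print(word, "->", classify(word))
```

Here adventus and thesaurus come out as "penult", while advenit and tenere come out as "needs vowel length", which is where the word lists would have to take over.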
Does anyone have an accented Latin word list of any kind, though?
Even if it were just a list of every Latin word with accent
marked, or with vowel lengths marked, I could write a script
to extract the 3+ syllable words into their proper lists
when they are not ambiguously accented words like advenit.
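A sketch of that sorting step, assuming the input is simply a list of words with the accent written as an acute mark: strip the accent, group by the bare spelling, and treat any spelling that occurs with more than one accentuation as ambiguous. The function names and the sample words are only illustrative.

```python
import unicodedata
from collections import defaultdict

def strip_acute(word):
    # Decompose, drop the combining acute accent (U+0301), recompose.
    decomposed = unicodedata.normalize("NFD", word)
    return unicodedata.normalize("NFC", decomposed.replace("\u0301", ""))

def split_by_ambiguity(accented_words):
    forms = defaultdict(set)
    for word in accented_words:
        forms[strip_acute(word)].add(unicodedata.normalize("NFC", word))
    # One accentuation per spelling: safe to apply automatically.
    unambiguous = {base: next(iter(f)) for base, f in forms.items() if len(f) == 1}
    # Several accentuations per spelling: must be decided from context.
    ambiguous = {base: sorted(f) for base, f in forms.items() if len(f) > 1}
    return unambiguous, ambiguous

words = ["ténere", "tenére", "advéntus", "ádvenit", "advénit"]
unambiguous, ambiguous = split_by_ambiguity(words)
print(unambiguous)
print(ambiguous)
```

With this sample, adventus lands in the first list, while tenere and advenit land in the second with both of their accentuations.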
I could probably figure out a way to download a list of all
the Latin words contained in Wiktionary, but I'm not sure
how accurate or complete that would be.
Benjamin Bloomfield
_______________________________________________
Gregorio-users mailing list
[email protected]
https://mail.gna.org/listinfo/gregorio-users
--
Father Pierre FRANÇOIS (http://www.romanliturgy.org)
Bosmanslei 16
B-2018 Antwerpen (Belgium)
mobile: +32 474 719 131
phone: +32 3 237 63 96