Hello, y'all.

I've dabbled in this, and have an effective method that is about 80% accurate.

It does a sequential find and replace, replacing certain combinations of letters first, then other combinations afterward. The list of Latin words/particles is in a very particular order, and still needs a good bit of tweaking.

It is actually part of a program I have been writing in my free time. Since it is set up in an ini file, you should be able to easily reproduce the search and replace in whatever language you like. If you improve it, however, I want to be involved, please! I have attached the file: Latin.ini
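For anyone who wants to try this outside my program, here is a minimal sketch in Python of the ordered search and replace. It is an illustration under assumptions, not the program's actual code: it assumes the [keys] section is read top to bottom, that ";" starts a comment, and that every rule is applied once as a plain substring replacement. It does not reproduce the sarmode=3 case handling, and the names load_rules and accent are hypothetical.

```python
def load_rules(ini_text):
    """Return (search, replace) pairs from the [keys] section, in file order."""
    rules = []
    in_keys = False
    for raw in ini_text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            in_keys = (line.lower() == "[keys]")
            continue
        if not in_keys or not line or line.startswith(";"):
            continue
        # Drop a trailing ";comment", then split on the first "=" only.
        line = line.split(";", 1)[0].strip()
        search, sep, replace = line.partition("=")
        search, replace = search.strip(), replace.strip()
        if sep and search and replace:
            rules.append((search, replace))
    return rules


def accent(text, rules):
    """Apply every rule once, in order; the output of one rule feeds the next."""
    for search, replace in rules:
        text = text.replace(search, replace)
    return text


# A small excerpt in the same layout as Latin.ini.
SAMPLE = """\
[settings]
title=Latin Accentuation
[keys]
;open case-endings
averit=áverit  ;must come before érit
erit=érit
orum=órum
;open last-place-replacements
ae=æ
"""

rules = load_rules(SAMPLE)
print(accent("saeculorum", rules))
```

Lines without an "=" (such as a comment that an email client has wrapped onto a second line) are skipped. Since the file is saved as UTF-8 with BOM, reading it in Python with encoding="utf-8-sig" should strip the BOM before parsing.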

-BGM


On 2/19/2014 8:23 AM, Benjamin Bloomfield wrote:
For some words, it is easy to tell that the penultimate syllable is long, and should therefore be accented (e.g., adventus, because -ven- ends in a consonant; if the penultimate vowel were a diphthong (au, æ, oe), that would make the syllable long as well). The real trick would be to have a list of words whose penultimate syllable is never long, and one of words that always have a long vowel in the penultimate syllable (e.g., advenit is ambiguous because it has a long e in the perfect tense and a short e in the present tense). If anyone could get such lists of Latin words together, I could write a script to add accents to all the words whose accent is unambiguous, and then list all the 3+ syllable words whose accent would need to be determined by the context.
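The easy case can be sketched in Python as a check on spelling alone. This is a rough illustration under assumptions, not a complete syllabifier: it assumes consonantal i and u are written as j and v, that ae, au, oe, æ and œ are always diphthongs, that "qu" counts as one consonant and "h" as none, and that a stop followed by l or r does not close the syllable. The function name is hypothetical.

```python
import re

# One vowel nucleus: a diphthong (assumed always a diphthong) or a single vowel.
NUCLEUS = re.compile(r"ae|au|oe|æ|œ|[aeiouy]")
DIPHTHONGS = {"ae", "au", "oe", "æ", "œ"}
STOPS = set("pbtdcg")
LIQUIDS = set("lr")


def penult_is_certainly_long(word):
    """Return True if the penult is long by diphthong or by position.

    Returns False when only vowel length (not spelling) could decide,
    and for words with fewer than three syllables.
    """
    w = word.lower().replace("qu", "q")  # "qu" counts as one consonant
    nuclei = list(NUCLEUS.finditer(w))
    if len(nuclei) < 3:
        return False
    penult, last = nuclei[-2], nuclei[-1]
    if penult.group() in DIPHTHONGS:
        return True
    # Consonants between the penultimate and the final vowel.
    between = w[penult.end():last.start()].replace("h", "")
    if "x" in between or "z" in between:
        return True  # x and z count as double consonants
    if len(between) >= 3:
        return True
    if len(between) == 2:
        # stop + liquid (e.g. "br", "tr") need not close the syllable
        return not (between[0] in STOPS and between[1] in LIQUIDS)
    return False


for w in ("adventus", "advenit"):
    print(w, penult_is_certainly_long(w))
```

A False result does not mean the penult is short, only that the spelling does not settle it, which is exactly where the word lists would be needed.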

Does anyone have an accented Latin word list of any kind, though? Even if it were just a list of every Latin word with accent marked, or with vowel lengths marked, I could write a script to extract the 3+ syllable words into their proper lists when they are not ambiguously accented words like advenit.
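As a sketch of what such an extraction script might look like in Python, assume a hypothetical input in which long vowels carry a macron (ā, ē, ī, ō, ū, ȳ) and every unmarked vowel is short. No such list exists yet in this thread, so the format, the sample words, and the function names are all assumptions. The positional check uses the same simplifications as any spelling-based rule (j and v written, "qu" as one consonant, stop + l/r not closing the syllable).

```python
import re
import unicodedata
from collections import defaultdict

MACRON = "\u0304"
# A diphthong (unless its second vowel is marked long), or a vowel with optional macron.
NUCLEUS = re.compile("(?:ae|au|oe)(?!" + MACRON + ")|[aeiouy]" + MACRON + "?")


def strip_marks(word):
    """Drop combining marks, e.g. 'advēnit' -> 'advenit'."""
    nfd = unicodedata.normalize("NFD", word)
    return "".join(c for c in nfd if not unicodedata.combining(c))


def penult_long(marked_word):
    """True/False for words of 3+ syllables, None for shorter ones."""
    w = unicodedata.normalize("NFD", marked_word.lower()).replace("qu", "q")
    nuclei = list(NUCLEUS.finditer(w))
    if len(nuclei) < 3:
        return None
    penult, last = nuclei[-2], nuclei[-1]
    if len(penult.group()) == 2:  # diphthong, or vowel + macron
        return True
    between = "".join(
        c for c in w[penult.end():last.start()] if c.isalpha() and c != "h"
    )
    if "x" in between or "z" in between or len(between) >= 3:
        return True
    if len(between) == 2:
        return not (between[0] in "pbtdcg" and between[1] in "lr")
    return False


def sort_words(marked_words):
    """Split unmarked spellings into (always_long, never_long, ambiguous)."""
    seen = defaultdict(set)
    for word in marked_words:
        length = penult_long(word)
        if length is not None:
            seen[strip_marks(word).lower()].add(length)
    always_long = {w for w, s in seen.items() if s == {True}}
    never_long = {w for w, s in seen.items() if s == {False}}
    ambiguous = {w for w, s in seen.items() if len(s) == 2}
    return always_long, never_long, ambiguous


always_long, never_long, ambiguous = sort_words(
    ["advēnit", "advenit", "adventus", "dominus", "amīcus"]
)
print(sorted(always_long), sorted(never_long), sorted(ambiguous))
```

A spelling lands in the ambiguous set only when the list contains both a long-penult and a short-penult form of it, so the result is only as complete as the input list.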

I could probably figure out a way to download a list of all the Latin words contained in Wiktionary, but I'm not sure how accurate or complete that would be.

/Benjamin Bloomfield/


On Wed, Feb 19, 2014 at 6:40 AM, Innocent Smith wrote:

    Dear Gregorio Users,

    I'm experimenting with using an OCR program to extract
    liturgical texts from a PDF of a Latin Missal, for
    various purposes including setting texts with
    Gregorio. With the software I have available, I am
    having difficulty doing the OCR in a way that
    preserves the accents accurately.

    Is anyone aware of automated ways to take a Latin text
    that does not have accents and to add them in?

    Yours,

    bro. Innocent, op

    _______________________________________________
    Gregorio-users mailing list
    [email protected]
    https://mail.gna.org/listinfo/gregorio-users





;Monday, March 18, 2013
[settings]
title=Latin Accentuation
sarmode=3
regex=0
;Be very careful whilst editing this file.  The keys and values are arranged in a very particular order.
;It starts by replacing the most common endings - which will catch MOST Latin words.
;If you want to add a word that doesn't get accented properly,
;make sure its ending isn't already present, and then just add the word's STEM in the ";open general-replacements" area
;At the very end we will replace the diphthong glyphs.
;Note that we left out the accented Æ and accented Œ because they are Unicode and don't always work with every application
;sar mode 3 will make replacements for EVERY possible case - this works on ALL CAPS as well as Sentences and Titles and lower case.
;In order for this script to work, all words must be in lower case format.  Only use capital letters for exceptions, such as proper names.
;File must be encoded as UTF-8 with BOM
[keys]
;open prefixes
ione=ióne
ioni=ióni
;close prefixes
;open case-endings
isti=ísti
abant=ábant
abilis=ábilis
abile=ábile
abis=ábis
abitur=ábitur
amur=ámur
amus=ámus
amul=ámul
amina=ámina
anda=ánda
andum=ándum
anti=ánti
antem=ántem
antur=ántur
anita=ánita
arent=árent
arium=árium
aris=áris
arata=aráta
arum=árum
asti=ásti
atici=átici
atio=átio
atura=atúra
atur=átur
atus=átus
atis=átis
averit=áverit  ;must come before érit
avit=ávit
arato=aráto
brata=bráta
buntur=búntur
ebit=ébit
ebis=ébis
emur=émur
emus=émus
endum=éndum
enda=énda
ende=énde
endo=éndo
ensa=énsa
enta=énta
enti=énti
ente=énte
entum=éntum
enturu=entúru
entur=éntur
erium=érium
eris=éris
erit=érit ; -érit, -éritis
iéri=íeri ;correct third declension
erii=érii
eria=éria
erunt=érunt
erserit=érserit
ervia=érvia
escens=éscens
etate=etáte
etur=étur
exui=éxui
ginta=gínta
ginte=gínte
ginto=gínto
;are=áre  ;this will affect dare
iari=iári
ibitur=íbitur
ibil=íbil
icias=ícias
iciat=íciat
ictio=íctio
icite=ícite
icere=ícere
icium=ícium
icator=icátor
ieti=iéti
iesc=iésc
ieri=íeri
ievi=iévi
iente=iénte
ifices=ífices
ificet=íficet
iliter=íliter
ilibus=ílibus
ilium=ílium
ilius=ílius
ilii=ílii
irat=írat
irant=írant
iseum=iséum
istant=ístant
istat=ístat
isti=ísti
itia=ítia
itiu=ítiu
itud=itúd
orate=oráte
orium=órium
oribus=óribus
orem=órem
orum=órum
ortuo=órtuo
tatem=tátem
triti=tríti
tura=túra
tute=túte
tutis=tútis
urum=úrum
unare=unáre
uere=úere
;-úerit, -úeris, -úerint, -ueristi
ueri=úeri
ustre=ústre
uria=úria
utia=útia
etua=étua
;close case-endings
;open general-replacements
acere=ácere
acti=ácti
alia=ália
anter=ánter
bique=bíque
acul=ácul
cipere=cípere
cepi=cépi
cipi=cípi
ceda=céda
cordia=córdia
ectu=éctu
egere=égere
egibus=égibus
empestat=empestát
ericors=éricors
estibus=éstibus
etita=etíta
etite=étite
icali=icáli
icante=icánte
itate=itáte
icord=icórd
istros=ístros
gante=gánte
iosa=iósa
irigi=írigi
itica=ítica
itati=itáti
titui=títui
lecto=lécto
onante=onánte
osque=ósque
esque=ésque
isque=ísque
umque=úmque
todi=tódi
tori=tóri
ugite=úgite
ugia=úgia
mmundi=mmúndi
ntasia=ntásia
nign=nígn
ripia=rípia
vitio=vítio
;close general-replacements
;open proper-names
Christi=Chrísti
Elias=Elías
Mari=Marí
Michael=Míchaël
Levit=Levít
Israel=Ísrael
Ioann=Ioánn
Joann=Joánn
Paracli=Parácli
;close proper-names
;open word-replacements
;add here until you find a more generalized replacement
abiten=ábiten
essemu=essému
accipe=áccipe
accende=accénde
adora=adóra
adesse=adésse
adesto=adésto
adjuvand=adjuvánd
adesti=adésti
adsumus=ádsumus
appona=appóna
agnosce=agnósce
aetern=aetérn
affligi=afflígi
agamus=agámus
altare=altáre
alea=álea
amice=amíce
amicus=amícus
amor=amór
ancillar=ancillár
ancilla=ancílla
anima=ánima
antia=ántia
apostol=apóstol
arcani=arcáni
archangel=archángel
aspera=áspera
asperu=ásperu
aspersa=aspérsa
aspersu=aspérsu
averte=avérte
espersa=espérsa
espersu=espérsu
baptist=baptíst
baptismu=baptísmu
beat=beát
edicat=edícat
advers=advérs
calice=cálice
cithara=cíthara
colere=cólere
concede=concéde
condigni=condígni
condigne=condígne
confiteb=confitéb
confit=confít
consequi=conséqui
corpor=córpor
cizo=cízo
cizat=cizát
creator=creátor
cupia=cúpia
cultibus=cúltibus
defensi=defénsi
devote=devóte
dextera=déxtera
diabolic=diabólic
diabol=diábol
edicta=edícta
dimissis=dimíssis
dirige=dírige
discerne=discérne
discipul=discípul
divina=divína
doloso=dolóso
domina=dómina
domibus=dómibus
eamen=eámen
ecclesi=ecclési
ejusdem=ejúsdem
eleison=eléison
futuro=futúro
gelic=gélic
angel=ángel
emitte=emítte
ement=emént
eodem=eódem
epistola=epístola
erue=érue
eumdem=eúndem   ;fix misspelling
eundem=eúndem
evangel=evangél
exaudi=exáudi
expugna=expúgna
faci=fáci
fect=féct
fideli=fidéli
fidei=fidéi
fili=fíli
flagella=flagélla
;return foedéris to foederis
foedér=foeder
fulgo=fulgó
gaudet=gaudét
genit=génit
genéris=géneris
glori=glóri
grati=gráti
homin=hómin
hosti=hósti
icumque=icúmque
idia=ídia
illius=illíus
iracundi=iracúndi
ministri=mínistri
modi=módi
lumina=lúmina
lumine=lúmine
habe=hábe               ;this will catch if the first declension case endings haven't been applied
imagines=imágines
immundi=immúndi
imper=impér
Inced=Incéd
inced=incéd
induca=indúca
indulgenti=indulgénti
induti=indúti
infunde=infúnde
infundi=infúndi
inimic=inimíc
iniquo=iníquo
initi=iníti
introib=introíb
invid=invíd
jejuniu=jejúniu
judica=júdica
judici=judíci
juventut=juventút
kyrie=kýrie
laetifica=laetífica
lecti=lécti
Liber=Líber
liber=líber
magnifi=magnífi
manea=mánea
mani=máni
mari=mári
maxim=máxim
media=média
memori=memóri
menta=ménta
mentes=méntes
metua=métua
miserere=miserére
misericors=miséricors
misericordi=misericórdi
missal=míssal
mixti=míxti
necessa=necessá
nocere=nócere
nomin=nómin
nosi=nósi
obtentu=obténtu
obtin=obtín
oculo=óculo
oculi=óculi
oculu=óculu
omnia=ómnia
omniu=ómniu
ómnip=omnip
omnib=ómnib
omnipotens=omnípotens
ostium=óstium
pericu=perícu
petim=pétim
pontifici=pontífici
populi=pópuli
populo=pópulo
potent=potént
omnisque=ómnisque
oper=óper
orare=oráre
orate=oráte
oratre=orátre
ostend=osténd
pariter=páriter
pecca=peccá
perduc=perdúc
pesti=pésti
petiti=pétiti
phator=phátor
praeesse=praeésse
praefati=praefáti
prefati=prefáti
principiu=princípiu
principio=princípio
principii=princípii
princip=príncip
propheta=prophéta
proprio=próprio
prospere=próspere
quicumque=quicúmque
quoniam=quóniam
quoti=quóti
quótidie=quotídie
repro=répro
reple=replé
reprimis=réprimis
respice=réspice
rilit=rílit
rumpens=rúmpens
sacerdot=sacerdót
sacerdótália=sacerdotália
sacerdos=sacérdos
salutem=salútem
saluti=salúti
sanguine=sánguine
sanguini=sánguini
saper=sáper
secund=secúnd
sempiter=sempitér
septiformis=septifórmis
sequenti=sequénti
serviens=sérviens
sidea=sídea
silia=sília
similis=símilis
spersu=spérsu
spersé=spérse
suscip=súscip
stringi=stríngi
solva=sólva
ritui=rítui
rituu=rítuu
spiritu=spíritu
supera=súpera
supplice=súpplice
sulte=súlte
tabernacul=tabernácul
tempor=témpor
terrorque=terrórque
totius=totíus
tribu=tríbu
tríbuí=tribuí
turbas=túrbas
ubri=úbri
ultera=últera
ultere=últere
unitas=únitas
unitat=unitát
uniti=úniti
utero=útero
venire=veníre
veni=véni
vestiri=vestíri
veritas=véritas
veritat=veritát
versa=vérsa
versu=vérsu
victae=víctae
virgin=vírgin
vivifica=vivificá
vobiscum=vobíscum
;close word-replacements
;open last-place-replacements
ástitá=astitá
évísti=evísti
ítiatió=itiatió
ávérit=áverit
éniéti=eniéti
aerí=æri
;close last-place-replacements
ae=æ
oe=œ
Ae=Æ
Oe=Œ
