It is suggested that Tesseract be used to help the Irish New Testament project. See http://www.crosswire.org/wiki/Non-CrossWire_Text-Development_Projects#Individual_Works
We should try to establish personal contact with Pastor Craig Ledbetter: http://www.biblebc.com/Projects/irish_new_testament_project.htm

I think CrossWire could provide some useful technical help. -- David

Peter von Kaehne wrote:
>
> Mike Hart wrote:
>> That's interesting, because ancle is one of the words I corrected in
>> JSFB -- the OCR had ancle, but the PDF itself, my paper KJV copy, and
>> my JPS complete Tanach (individual volumes) had ankle... I can't say
>> what verse it was, at the time I was hunting for e's that had been
>> OCR'd into c's (search for 'regular expression'
>> [bcdfghjklmnpqrstvwxy]c[bcdfgjklmnpqrstvwx] in kwrite)
>
> You should have a look at Troy's work with tesseract. Rather than search
> and replace a text badly ocred he seems to have figured out how to
> "educate" tesseract with one or two sample pages until it does the right
> thing. That might be way easier and with a better outcome in the long
> term for you too.
>
> Peter

--
View this message in context: http://www.nabble.com/Versification-Encoding-Issues-tp21341395p21368903.html
Sent from the SWORD Dev mailing list archive at Nabble.com.

_______________________________________________
sword-devel mailing list: sword-devel@crosswire.org
http://www.crosswire.org/mailman/listinfo/sword-devel
Instructions to unsubscribe/change your settings at above page
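
For anyone who wants to try the search Mike Hart describes outside kwrite, here is a minimal Python sketch. The regular expression is copied from his message. The function name `find_suspect_words` and the sample text are made up for illustration and are not part of any CrossWire tool. The pattern only flags candidates: ordinary words such as "function" also contain a "c" between two consonants, so every hit still needs a human check against the printed source.

```python
import re

# Pattern quoted from the message above: a 'c' with a consonant on each side.
SUSPECT_C = re.compile(r"[bcdfghjklmnpqrstvwxy]c[bcdfgjklmnpqrstvwx]")


def find_suspect_words(text):
    """Return (line_number, word) pairs for words that contain the pattern.

    Hypothetical helper for illustration; hits are only candidates for
    e-to-c OCR errors and must be reviewed by hand.
    """
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for word in re.findall(r"[A-Za-z']+", line):
            if SUSPECT_C.search(word.lower()):
                hits.append((line_no, word))
    return hits


if __name__ == "__main__":
    # Invented sample text, not taken from any real OCR output.
    sample = "He kcpt the ancle bone.\nThe ankle was kept which pleased them."
    for line_no, word in find_suspect_words(sample):
        print(line_no, word)
```

Reporting the line number with each word makes it easier to look the passage up in the page scan before changing anything.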