Thanks David and Peter.

This module, ERtr_en, has been around since 2002, and no one has complained. 
Yet every one of the first 26 examples is wrong (according to Google 
Translate), even the one for Christ! I’m of the opinion that the module should 
be taken down as bad and of no value. I don’t have the time or inclination to 
update it.

I found the Ergane website from which the data was obtained. The data is still 
there, but I can’t run the Ergane 8.0 Windows software under Windows 10 to see 
it, so I can’t tell whether it is a useful source. The latest date (2005) is 
more recent than our module. So, according to our current practices, the data 
should be taken from upstream, not downstream.

I went to wordgumbo.com <http://wordgumbo.com/> via the Internet Archive’s 
Wayback Machine (as the site is currently unavailable) and found the listing. 
There it shows aguustos: 1. August. When I view the source I see a<!u 
011f>g<sup>u</sup>ustos. The <!u 011f> is my browser debugger’s 
representation of the non-printing character.

I looked up 011F and it is the Unicode code point for ğ (U+011F). 011E is 
for Ğ (U+011E). Respectively these are byte values 219 (0xDB) and 218 (0xDA) in 
the MacTurkish encoding, not in UTF-8. I didn’t look at other encodings.
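
To make the relationship concrete, here is a minimal sketch, assuming Python 3 
and its built-in "mac_turkish" codec. The sample bytes are made up for 
illustration and are not taken from the module.

```python
# Show each letter's Unicode code point, its MacTurkish byte, and its UTF-8 bytes.
for ch in ("ğ", "Ğ"):
    mac_byte = ch.encode("mac_turkish")[0]   # single byte in MacTurkish
    utf8_hex = ch.encode("utf-8").hex()      # two bytes in UTF-8
    print(f"{ch}: U+{ord(ch):04X}  MacTurkish={mac_byte}  UTF-8={utf8_hex}")

# Hypothetical sample: the word as it would look if stored in MacTurkish.
raw = b"a\xdbustos"
decoded = raw.decode("mac_turkish")
print(decoded)
```

If the module data really is MacTurkish, decoding it with that codec and 
re-encoding as UTF-8 would be one way to repair it, but I have not confirmed 
that against the Ergane source.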

Here is a summary of how often this occurs, alongside the number of entries:
[ERtr_en] Ergane Turkish to English Glossary
        Entries=1029 Errors=47

The other LexDict modules exhibiting the same problem (invalid UTF-8 
characters):
[EReo_en] Ergane Esperanto to English Glossary
        Entries=16852 Errors=1118
[ERja_en] Ergane Japanese to English Glossary
        Entries=552 Errors=21
[ERpo_en] Ergane Polish to English Glossary
        Entries=2026 Errors=650
[ERro_en] Ergane Romanian to English Glossary
        Entries=1218 Errors=264
[ERru_en] Ergane Russian to English Glossary
        Entries=1225 Errors=100
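
For anyone who wants to run a similar check, here is a rough sketch in Python 3 
of one way to count entries that are not valid UTF-8. It is not the tool that 
produced the numbers above. The function name count_invalid_utf8 and the sample 
data are hypothetical, and it assumes the entries are already available as raw 
byte strings.

```python
def count_invalid_utf8(entries):
    """Return (total, errors); errors counts entries that fail strict UTF-8 decoding."""
    total = 0
    errors = 0
    for raw in entries:
        total += 1
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError:
            errors += 1
    return total, errors


# Hypothetical sample: one MacTurkish-encoded key, one UTF-8 key, one ASCII key.
sample = [b"a\xdbustos", "ağustos".encode("utf-8"), b"August"]
total, errors = count_invalid_utf8(sample)
print(f"Entries={total} Errors={errors}")
```

This counts entries with at least one bad byte sequence, not the number of bad 
characters, so the totals may differ depending on which is being counted.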

I’m also debugging a different set of problems I’m having with [ERde_en] Ergane 
German to English Glossary.

I’m suspicious about all the Ergane modules built from WordGumbo.

In Him,
        DM


> On Jan 5, 2016, at 10:33 AM, David Haslam <dfh...@googlemail.com> wrote:
> 
> For the word you cited as an example, the missing letter would be Ğ.
> 
> The word is AĞUSTOS = Ağustos = the name of the month August.
> 
> Given enough samples, it may be feasible to reconstruct the mapping table
> without external help.
> 
> This site may help.
> 
> http://en.bab.la/dictionary/turkish-english/a%C4%9Fustos
> 
> Regards,
> 
> David
> 
> 
> 
> --
> View this message in context: 
> http://sword-dev.350566.n4.nabble.com/Turkish-to-English-glossary-problem-tp4655585p4655588.html
> Sent from the SWORD Dev mailing list archive at Nabble.com.
> 
> _______________________________________________
> sword-devel mailing list: sword-devel@crosswire.org
> http://www.crosswire.org/mailman/listinfo/sword-devel
> Instructions to unsubscribe/change your settings at above page

