Just wanted to close the loop on this and let you know that I published a CPAN module for this extended DOS encoding of Hebrew (including vowels and dageshim). Here it is:
http://search.cpan.org/~tzadikv/Encode-DosHebrew-0.51/lib/Encode/DosHebrew.pm

The encoding table itself is at the end of the code, in hex. The first hex pair on each line is the 8-bit DOS Hebrew byte. The remaining hex pairs are the Unicode equivalent. Example: *e1 d1 bc* beis means that the byte *e1* (hex) in the DOS encoding converts to the two Unicode code points 05*d1* 05*bc* (the leading "05" of the Unicode Hebrew block is implied). The trailing word is just a comment saying that this represents a beis with a dagesh (oops, sorry for the Ashkenazi transliteration).

I am still looking for more information about the origin of this encoding and which programs used it, so that I can add that to the documentation.

Tzadik
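For anyone who wants to read the table without installing the module, here is a minimal Python sketch of the line format described above. It is not the module's own code. The function names (`parse_table_line`, `build_table`, `decode`) are made up for illustration, and the only table line used is the one quoted in this message. Two things are assumptions on my part: that the comment starts at the first token that is not a two-digit hex pair, and that bytes below 0x80 are plain ASCII.

```python
import re

# A table token counts as data only if it is exactly two hex digits.
HEX_PAIR = re.compile(r"[0-9a-fA-F]{2}")


def parse_table_line(line):
    """Parse one line such as 'e1 d1 bc beis' into (dos_byte, unicode_text).

    Assumption: everything from the first non-hex-pair token onward
    is a free-form comment and is ignored.
    """
    hex_tokens = []
    for token in line.split():
        if not HEX_PAIR.fullmatch(token):
            break
        hex_tokens.append(token)
    if len(hex_tokens) < 2:
        raise ValueError(f"need a DOS byte and at least one code point: {line!r}")
    dos_byte = int(hex_tokens[0], 16)
    # The leading 05 of the Hebrew block is implied, so 'd1' means U+05D1.
    text = "".join(chr(0x0500 | int(pair, 16)) for pair in hex_tokens[1:])
    return dos_byte, text


def build_table(lines):
    """Build a {dos_byte: unicode_text} mapping from table lines."""
    return dict(parse_table_line(line) for line in lines if line.strip())


def decode(data, table):
    """Decode DOS-encoded bytes using the mapping.

    Assumption: bytes below 0x80 are ASCII; unmapped high bytes
    become U+FFFD (the replacement character).
    """
    return "".join(
        chr(byte) if byte < 0x80 else table.get(byte, "\ufffd") for byte in data
    )


# The single example line quoted in this message.
table = build_table(["e1 d1 bc beis"])
decoded = decode(b"\xe1", table)
print([f"U+{ord(ch):04X}" for ch in decoded])  # ['U+05D1', 'U+05BC']
```

Note that one DOS byte can expand to more than one Unicode code point (a letter plus its dagesh or vowel), which is why the mapping stores strings rather than single characters.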
_______________________________________________
Perl mailing list
[email protected]
http://mail.perl.org.il/mailman/listinfo/perl
