CSets 2.1 released

2008-05-30 Thread Mark Leisher
ns there, and I will always notify these lists of updates as well. As always, corrections, new mapping tables, information about mappings, and even pointers to things like fonts or texts with odd encodings are gladly accepted. -- Mark Leisher

CSets 2.1 released

2007-06-14 Thread Mark Leisher
e mappings not typically found in character set conversion tools available today." As always, I am happy to accept mapping tables/conversion program source code for any other obscure or under-represented encodings. -- Mark Leisher

CSets 2.0 released

2005-10-28 Thread Mark Leisher
welcome. -- Mark Leisher, Computing Research Lab, New Mexico State University, Box 30001, MSC 3CRL, Las Cruces, NM 88003. "A sneer is the weapon of the weak." -- James Russell Lowell (1819-1891)

CSets 1.9 released

2005-07-25 Thread Mark Leisher
these encodings is not likely to be included in the most popular character set conversion tools (i.e. iconv), so this package was put together to ease conversion of text in obscure character encodings to Unicode and for historical curiosity. Mark Leisher

CSets 1.8 released

2005-05-04 Thread Mark Leisher
relevant information. http://crl.nmsu.edu/~mleisher/csets.html -- Mark Leisher, Computing Research Lab, New Mexico State University. "All political parties die at last of swallowing their own lies."

Re: Unicode::Collate 0.23 Released

2002-09-05 Thread Mark Leisher
Tomoyuki> Unicode::Collate 0.23 is released. Could you remind us where to find it again? Thanks! -- Mark Leisher, Computing Research Lab, New Mexico State University. "The mountain remains unmoved at

Re: [Encode] Farsi is Okay. The problem is in Indics!

2002-04-05 Thread Mark Leisher
he Persian Hamshahri visual encoding. -- Mark Leisher, Computing Research Lab, New Mexico State University, Box 30001, Dept. 3CRL. "Television has raised writing to a new low."

Re: [Encode] Farsi is Okay. The problem is in Indics!

2002-04-05 Thread Mark Leisher
the general problems that come up in Indic encodings. -- Mark Leisher

Re: ICU's uconv vs Linux iconv and UTF-8

2002-02-01 Thread Mark Leisher
irectly available from ftp://www.unicode.org/Public/3.1-Update1/Unihan-3.1.1.txt.gz. -- Mark Leisher, Computing Research Lab. "Orthodoxy, of whatever color, seems to demand a lifeless, imitati

Re: ICU's uconv vs Linux iconv and UTF-8

2002-02-01 Thread Mark Leisher
abase. August 1, 2001." -- Mark Leisher

Re: ICU's uconv vs Linux iconv and UTF-8

2002-02-01 Thread Mark Leisher
ill get an answer. -- Mark Leisher, Computing Research Lab, New Mexico State University, Box 30001, Dept. 3CRL. "Orthodoxy, of whatever color, seems to demand a lifeless, imitative style." -- Politics and the English Languag

RE: Source data for perl encodings

2001-01-08 Thread Mark Leisher
asonable conversion capability at about 1/16 the size of ICU. -- Mark Leisher, Computing Research Lab, New Mexico State University. "Cinema, radio, television, magazines are a school of inattention: people look with

Armenian encoding tables updated

2000-11-13 Thread Mark Leisher
web page. http://crl.nmsu.edu/~mleisher/csets.html These tables will be part of the CSets 1.8 distribution. -- Mark Leisher

NISO standards now free

2000-11-06 Thread Mark Leisher
ional Standards Institute (ANSI). -- Mark Leisher

Csets 1.7 released

2000-11-03 Thread Mark Leisher
/csets.tar.gz ftp://crl.nmsu.edu/CLR/multiling/character-sets/csets.zip -- Mark Leisher

Re: .enc docs comments [was Re: Encode's .enc files and a question]

2000-10-27 Thread Mark Leisher
Philip> On Thu, 26 Oct 2000, Mark Leisher wrote: >> Following the first page will be all the other pages, each in the same >> format as the first: one number identifying the page followed by 256 >> double-byte Unicode (UCS-2) characters. If a char

Re: .enc docs comments [was Re: Encode's .enc files and a question]

2000-10-26 Thread Mark Leisher
ly unrecognized Peter> characters? I don't know. I last used Tcl/Tk in the days of tcl7.?/tk4.? and haven't had time to play with anything newer. I do prefer Perl :-) -- Mark Leisher, Computing Research Lab

.enc docs comments [was Re: Encode's .enc files and a question]

2000-10-26 Thread Mark Leisher
unknown characters in the source text or change the 0x's to 0x. -- Mark Leisher

Re: Encode's .enc files and a question

2000-10-26 Thread Mark Leisher
Peter> Mark Leisher then replied: >> If the converted string contains 0x, it will be pretty clear the >> source text had bogus characters the moment you display it. Peter> According to Nick's translated doc the first character on the third Peter

Re: Encode's .enc files and a question

2000-10-26 Thread Mark Leisher
Philip> On Wed, 25 Oct 2000, Mark Leisher wrote: >> There may some day be a use for the Unicode codepoint 0x. It might >> be better to make this 0x, which is a guaranteed non-character in >> Unicode and probably in ISO10646. Philip> Isn'

Re: Encode's .enc files and a question

2000-10-25 Thread Mark Leisher
Unicode and probably in ISO10646. -- Mark Leisher

Re: Encode's .enc files and a question

2000-10-25 Thread Mark Leisher
e a while now. But like many of us, I've got a handful of critical projects with hard deadlines to meet. -- Mark Leisher

Re: Encode's .enc files and a question

2000-10-25 Thread Mark Leisher
happen as well. Although complicated on the surface, I highly recommend using Tech Report #22 on the Unicode website as a guideline for designing future mapping tables. -- Mark Leisher, Computing Research Lab

Re: UCS-2 and UTF-16 [was Re: Encode, take five]

2000-09-14 Thread Mark Leisher
ed, the term UTF-32 will be deprecated and the term UCS-4 will be used instead. -- Mark Leisher

UCS-2 and UTF-16 [was Re: Encode, take five]

2000-09-13 Thread Mark Leisher
e the Unicode Standard 3.0 page 19). Combining surrogates constitutes a UCS-4 encoding (or UTF-32 until unavailable 10646 private use regions are removed). -- Mark Leisher
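
As an illustration of what "combining surrogates" means in the snippet above, here is a minimal Python sketch of the standard UTF-16 surrogate-pair arithmetic. The function name `combine_surrogates` is hypothetical and does not come from the thread or from any Perl Encode API; it only shows how a high and a low surrogate map to a single code point above U+FFFF.

```python
def combine_surrogates(high: int, low: int) -> int:
    """Combine a UTF-16 surrogate pair into one Unicode code point.

    Hypothetical helper for illustration only; not taken from the thread.
    """
    # High (leading) surrogates occupy U+D800..U+DBFF.
    if not 0xD800 <= high <= 0xDBFF:
        raise ValueError(f"not a high surrogate: {high:#06x}")
    # Low (trailing) surrogates occupy U+DC00..U+DFFF.
    if not 0xDC00 <= low <= 0xDFFF:
        raise ValueError(f"not a low surrogate: {low:#06x}")
    # Each surrogate carries 10 bits; the result is offset by 0x10000.
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


if __name__ == "__main__":
    # The lowest possible pair maps to the first code point beyond the BMP.
    print(hex(combine_surrogates(0xD800, 0xDC00)))
```

The result is a plain integer code point, which is what a fixed-width 32-bit form (UCS-4 or UTF-32) would store directly.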

Re: Encode, take two

2000-09-13 Thread Mark Leisher
le answer :-) -- Mark Leisher, Computing Research Lab, New Mexico State University, Box 30001, Dept. 3CRL, Las Cruces. "Cinema, radio, television, magazines are a school of inattention: people look without seeing, listen without hearing."

Re: Encode, take two

2000-09-13 Thread Mark Leisher
ter. Then chars_to_utf8() and utf8_to_chars() don't need an encoding parameter because they simply convert between Unicode characters and UTF-8. Or is there some other factor I've missed in all the confusion?

Re: Encode, take three

2000-09-12 Thread Mark Leisher
; (an actual complaint I received more than once). -- Mark Leisher

Re: Encode, take three

2000-09-12 Thread Mark Leisher
My only comment would be that the functions which assume 8859-1 should be removed to avoid the inevitable confusion. Or, as someone else suggested earlier, changed to use the active system encoding. -- Mark Leisher

Re: Encode, take two

2000-09-12 Thread Mark Leisher
Perl is free to burp on us. Quite right. Sorry. -- Mark Leisher

Re: Encode, take two

2000-09-12 Thread Mark Leisher
xtraneous. -- Mark Leisher