Re: Cyrillic character mapping tables, HP MSL to Unicode
More precisely, try this file:
http://h27.www2.hp.com/bc/docs/support/SupportManual/bpl13206/bpl13206.pdf
which contains all the symbol set charts, cross-referenced with the MSL/Unicode codes and their assignments in other subsets. It is referred to within the downloadable reference CDROM for the PCL language.

The MSL index seems to be the Unicode code point, so the MSL is merely a subset of Unicode, as used in the HP implementation of the HP PCL - GL/2 symbol sets and fonts.

Philippe.
Unsolicited messages (spam) are not tolerated. Any abuse will be reported automatically to your service providers.

> - Original Message -
> From: "Neil J Geddes" <[EMAIL PROTECTED]>
> To: <[EMAIL PROTECTED]>
> Sent: Thursday, August 28, 2003 2:23 PM
> Subject: Cyrillic character mapping tables, HP MSL to Unicode
>
> > Hello,
> >
> > I'm looking for symbol set and character metric information for the two
> > Hewlett-Packard symbol sets "3R" (PC Cyrillic) and "9R" (Windows 3.1
> > Latin/Cyrillic). Specifically I'm after:-
> >
> > 1) .TFM files for Univers, CG Times, Courier and other common typefaces
> > that use Cyrillic.
> >
> > 2) A cross mapping table for HP MSL (Master Symbol List) to Unicode.
> >
> > Thanks for any help you can offer. It's appreciated!
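For the second request, a byte-to-Unicode table can be sketched without going through MSL numbers at all. The sketch below rests on an assumption that is not confirmed by the HP documents cited above: that symbol set "3R" uses the same byte assignments as code page 866 and "9R" the same as Windows code page 1251. It relies only on Python's built-in cp866 and cp1251 codecs; the names SYMBOL_SET_CODECS and symbol_set_table are hypothetical.

```python
# Assumed pairing of HP symbol set IDs with Python codec names.
# This pairing is an assumption for illustration; check it against HP's charts.
SYMBOL_SET_CODECS = {
    "3R": "cp866",   # PC Cyrillic (assumed equivalent)
    "9R": "cp1251",  # Windows 3.1 Latin/Cyrillic (assumed equivalent)
}


def symbol_set_table(symbol_set_id):
    """Return {byte value: Unicode code point} for one symbol set."""
    codec = SYMBOL_SET_CODECS[symbol_set_id]
    table = {}
    for byte in range(256):
        try:
            char = bytes([byte]).decode(codec)
        except UnicodeDecodeError:
            continue  # byte is unassigned in this code page
        table[byte] = ord(char)
    return table


if __name__ == "__main__":
    for set_id in SYMBOL_SET_CODECS:
        table = symbol_set_table(set_id)
        # Show where CYRILLIC CAPITAL LETTER A (U+0410) sits in each set.
        positions = [hex(b) for b, cp in table.items() if cp == 0x0410]
        print(set_id, positions)
```

A table built this way maps symbol set byte values, not MSL numbers, so it does not replace the cross-reference in the HP manual; it only gives a quick way to compare the two Cyrillic sets.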
Re: Cyrillic character mapping tables, HP MSL to Unicode
First start with this page:
http://www.hp.com/cposupport/printers/support_doc/bpl04568.html

You may want to buy this:
"Refer to the HP PCL5 Technical Reference Bundle. To order, call HP's driver/software distribution at 661-257-5565. The part number is 5961-0976."

You may also look at:
http://www.hp.com/cposupport/printers/support_doc/bpl02705.html
and refer to this:
"For further information about PCL commands, HP-GL/2, macros, or PJL commands, use the Technical Reference Manual set, part number 5021-0377. Order the manual set from HP's Support Materials Organization."

Or you may download this:
http://h2.www2.hp.com/bc/docs/support/SupportManual/bpl13210/bpl13210.pdf
"PCL 5 Printer Language Technical Reference Manual - ENWW - HP Part No. 5961-0509. Printed in USA. First Edition - October 1992 PCL 5 Printer Language Technical Reference Manual."
I have the same book, but dated September 1990 (this was really the first edition), HP part number 33459-90903.

Also:
http://h2.www2.hp.com/bizsupport/TechSupport/SoftwareDescription.jsp?locBasepartNum=5961-0976&lang=English%20%28US%29
"HP PCL Tech Reference Manual CD-ROM - The HP PCL Tech Reference Bundle CD-ROM includes the Technical Quick Reference Guide, Printer Job Language Technical Reference Manual, PCL 5 Color Technical Reference Manual, PCL 5 Printer Language Technical Reference Manual. In English, in PDF format."

Philippe.

- Original Message -
From: "Neil J Geddes" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Thursday, August 28, 2003 2:23 PM
Subject: Cyrillic character mapping tables, HP MSL to Unicode

> Hello,
>
> I'm looking for symbol set and character metric information for the two
> Hewlett-Packard symbol sets "3R" (PC Cyrillic) and "9R" (Windows 3.1
> Latin/Cyrillic). Specifically I'm after:-
>
> 1) .TFM files for Univers, CG Times, Courier and other common typefaces
> that use Cyrillic.
>
> 2) A cross mapping table for HP MSL (Master Symbol List) to Unicode.
>
> Thanks for any help you can offer. It's appreciated!
>
> Best regards,
>
> Neil Geddes
> [EMAIL PROTECTED]
Re: QBCS
From: "Asmus Freytag" <[EMAIL PROTECTED]>
> At 08:26 PM 9/1/03 -0700, Doug Ewell wrote:
> >Tex Texin wrote:
> >
> > > In most industry usages, MBCS refers to variable width encodings, not
> > > fixed width.
> >
> >Well, if variable-width encodings are referred to as both DBCS (see, for
> >example, http://czyborra.com/charsets/cjk.html#dbcs) and MBCS, then what
> >term is used to describe a fixed-width encoding of more than 1 byte? Or
> >was the concept not common enough to warrant a name until Unicode?
>
> The most common 'pure' DBCS was encountered in mainframe environments.
> All the other platforms used 'mixed' single and double-byte or other
> variable length encodings, so that 'DBCS' could stand in for a variable
> length encoding with maximum length 2 without confusion (except when
> talking to mainframe people).

In the late 80's, the acronym DBCS was also used to refer to user-defined characters, which could be assigned in a codepage, defined by a transferable bitmap, and accessed with an encoding sequence that remapped the upper half of the 8-bit character set.

In a 7-bit environment, these 8-bit "characters" (in fact relative positions in a 7-bit codepage) could be accessed using control sequences (such as SS2, which shifts into the upper subset for the next character only).

For these reasons, the characters assigned in the selected codepage for the upper half of the 8-bit encoding, and accessed by at least two 7-bit bytes, were called "double-byte characters", and the general encoding scheme was called "DBCS".

This inspired the ISO-2022 standards for East Asian languages, and also the European Teletext and Videotext standards, which were then restricted to a 7-bit encoding scheme. These systems are still used today. But in any case the "DBCS" usage referred to a complex encoding scheme with variable-length characters (sometimes varying with the encoding context, or exceeding the 2-byte limit).
You may find references to these character sets alongside references to the special escape sequences used to define and transport the bitmaps needed to represent "user-defined" characters (as they were defined notably to support Japanese or Chinese in the late 80's, or to create custom graphic characters, in fact bitmap glyphs, within interactive documents or applications).
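The single-shift mechanism described above can be sketched as follows. This is a minimal illustration, not an implementation of any particular Teletext or ISO-2022 profile: it assumes the 7-bit form of SS2 (ESC followed by "N") and uses the upper half of ISO 8859-1 as a stand-in for the shifted set. The names decode_7bit_with_ss2 and LATIN1_UPPER are hypothetical.

```python
ESC = 0x1B
SS2_FINAL = 0x4E  # "N": in a 7-bit environment SS2 is sent as ESC N

# Stand-in for the shifted (upper-half) set: position b maps to the
# ISO 8859-1 character at b | 0x80. The choice of set is an assumption.
LATIN1_UPPER = {b: chr(b | 0x80) for b in range(0x20, 0x80)}


def decode_7bit_with_ss2(data, g2_table):
    """Decode 7-bit bytes; ESC N shifts only the next byte into g2_table."""
    out = []
    i = 0
    while i < len(data):
        if data[i] == ESC and data[i + 1:i + 2] == bytes([SS2_FINAL]):
            if i + 2 >= len(data):
                raise ValueError("SS2 at end of input with no following byte")
            # The byte after ESC N selects a position in the shifted set.
            out.append(g2_table[data[i + 2]])
            i += 3
        else:
            # Any other byte is taken from the default 7-bit set (ASCII here).
            out.append(chr(data[i]))
            i += 1
    return "".join(out)


if __name__ == "__main__":
    print(decode_7bit_with_ss2(b"caf\x1bNi", LATIN1_UPPER))
```

Here one upper-half character costs three 7-bit bytes (ESC, N, position), which illustrates why such characters counted as multi-byte even though each one names a single position in an 8-bit table.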
Re: Last Resort Font
At 06:48 -0700 2003-09-02, [EMAIL PROTECTED] wrote:
> Michael Everson wrote on 08/19/2003 02:52:55 PM:
>
> > >p. 63 (Syloti Nagri): both top and bottom read "SILOTI NAGRI".
>
> > I will look into all of that, and thank you for it; but note that of
> > those only Thaana can be expected to display, as none of the others
> > have been encoded. So none of those could EVER be displayed; they are
> > just extra glyphs in the current font.
>
> Syloti Nagri has been approved by UTC and assigned to A800..A82F, though
> this is yet to be ratified by WG2 (presumably will happen in October) and
> published in a new version of Unicode (will be 4.1) or an amendment to
> ISO 10646 (I don't know what timetable is in place for publishing further
> amendments).

And it will be two years before the LR font has to be updated.
--
Michael Everson * * Everson Typography * * http://www.evertype.com
Re: Character codes for Egyptian transliteration
Peter Kirk wrote on 08/21/2003 09:33:27 AM:

> As for the requirement for distinct upper and lower case variants of
> ayin, I understood that there was a similar requirement in some minor
> Cyrillic languages, at least for apostrophe and double apostrophe.
> Earlier this year Peter Constable was gathering information for a
> possible proposal. But I never heard if it was proceeded with.

I was given charts reporting these things being used for various languages, but I don't think I ever got an explanation of what their purpose was, and I didn't get any confirmation of actual use, let alone samples from actual publications. If you can provide samples, that would be great.

Peter Constable
Re: Last Resort Font
Michael Everson wrote on 08/19/2003 02:52:55 PM:

> >p. 63 (Syloti Nagri): both top and bottom read "SILOTI NAGRI".

> I will look into all of that, and thank you for it; but note that of
> those only Thaana can be expected to display, as none of the others
> have been encoded. So none of those could EVER be displayed; they are
> just extra glyphs in the current font.

Syloti Nagri has been approved by UTC and assigned to A800..A82F, though this is yet to be ratified by WG2 (presumably will happen in October) and published in a new version of Unicode (will be 4.1) or an amendment to ISO 10646 (I don't know what timetable is in place for publishing further amendments).

Peter Constable
[OT]Re: Breaking free from UNICODE
Michael Everson wrote on 08/19/2003 03:14:47 PM:

> Golly, I was able to distinguish Latin and Georgian and Cyrillic on a
> Mac SE 30 in 1985. Or was it 1987. (Long before Worldscript I admit.)
> And years before that there was the Osborne with its dot-matrix
> miracles.

IIRC, the Mac SE did not exist in 1985; I was using a relatively new Fat Mac in the summer of that year.

BTW, the Osborne didn't particularly have dot-matrix miracles. That was the domain of printers like the Toshiba P321 and various Epson LQ models, and such printers could be connected to CP/M machines like the Osbornes and Kaypros, DOS machines like the IBM PC and Sharp PC 5000, and the Macs. But in 1985 I think the only dot matrix printers were the 9-pin variety, which weren't all that conducive to readable Latin with diacritics, let alone Chinese or Arabic typesetting. The P321 was one of the first 24-pin models, and I think it came out in 1987 or maybe late 1986.

Peter Constable
Re: QBCS
Tex Texin wrote:

> In most industry usages, MBCS refers to variable width encodings, not
> fixed width.

Well, if variable-width encodings are referred to as both DBCS (see, for example, http://czyborra.com/charsets/cjk.html#dbcs) and MBCS, then what term is used to describe a fixed-width encoding of more than 1 byte? Or was the concept not common enough to warrant a name until Unicode?

-Doug Ewell
Fullerton, California
http://users.adelphia.net/~dewell/
Re: QBCS
At 08:26 PM 9/1/03 -0700, Doug Ewell wrote:
>Tex Texin wrote:
>
> > In most industry usages, MBCS refers to variable width encodings, not
> > fixed width.
>
>Well, if variable-width encodings are referred to as both DBCS (see, for
>example, http://czyborra.com/charsets/cjk.html#dbcs) and MBCS, then what
>term is used to describe a fixed-width encoding of more than 1 byte? Or
>was the concept not common enough to warrant a name until Unicode?

The most common 'pure' DBCS was encountered in mainframe environments. All the other platforms used 'mixed' single and double-byte or other variable length encodings, so that 'DBCS' could stand in for a variable length encoding with maximum length 2 without confusion (except when talking to mainframe people).

A./
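To make the 'mixed' case concrete, here is a small sketch of how a reader of such an encoding decides, byte by byte, whether a character is one or two bytes long. It uses Shift_JIS as the example, which is an assumption of this sketch rather than something named in the thread, with lead-byte ranges 0x81-0x9F and 0xE0-0xFC (the wider range used by common vendor variants). The function names are hypothetical.

```python
def is_lead_byte(byte):
    """True if byte starts a two-byte character (Shift_JIS-style lead ranges)."""
    return 0x81 <= byte <= 0x9F or 0xE0 <= byte <= 0xFC


def split_mixed_dbcs(data):
    """Split a byte string into one-byte and two-byte character units."""
    units = []
    i = 0
    while i < len(data):
        # The first byte alone decides the width of the character.
        width = 2 if is_lead_byte(data[i]) else 1
        if i + width > len(data):
            raise ValueError("truncated double-byte character")
        units.append(data[i:i + width])
        i += width
    return units


if __name__ == "__main__":
    raw = "\u65e5\u672c\u8a9eabc".encode("shift_jis")
    units = split_mixed_dbcs(raw)
    # Byte count and character count differ in a mixed encoding.
    print(len(raw), len(units), [len(u) for u in units])
```

In a 'pure' DBCS of the mainframe kind, no such test is needed: every character occupies exactly two bytes, so the character count is simply half the byte count.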