Re: Cyrillic character mapping tables, HP MSL to Unicode

2003-09-02 Thread Philippe Verdy
More precisely, try this file:
http://h27.www2.hp.com/bc/docs/support/SupportManual/bpl13206/bpl13206.pdf
which contains all the symbol set charts and cross-references them with
the MSL/Unicode codes and their assignments in other subsets.
It is referenced in the downloadable reference CD-ROM for the PCL
language.

The MSL index seems to be the Unicode code point, so the MSL is merely a
subset of Unicode, as used in the HP implementation of the HP PCL - GL/2
symbol sets and fonts.

Philippe.
Unsolicited messages (spam) are not tolerated.
Any abuse will be automatically reported to your service providers.
> - Original Message - 
> From: "Neil J Geddes" <[EMAIL PROTECTED]>
> To: <[EMAIL PROTECTED]>
> Sent: Thursday, August 28, 2003 2:23 PM
> Subject: Cyrillic character mapping tables, HP MSL to Unicode
>
>
> > Hello,
> >
> > I'm looking for symbol set and character metric information for the
> > two Hewlett-Packard symbol sets "3R" (PC Cyrillic) and "9R" (Windows
> > 3.1 Latin/Cyrillic). Specifically I'm after:-
> >
> > 1) .TFM files for Univers, CG Times, Courier and other common
> > typefaces that use Cyrillic.
> >
> > 2) A cross mapping table for HP MSL (Master Symbol List) to Unicode.
> >
> > Thanks for any help you can offer. It's appreciated!





Re: Cyrillic character mapping tables, HP MSL to Unicode

2003-09-02 Thread Philippe Verdy
First start with this page:
http://www.hp.com/cposupport/printers/support_doc/bpl04568.html
You may want to buy this:
"Refer to the HP PCL5 Technical Reference Bundle. To order, call HP's
driver/software distribution at 661-257-5565. The part number is
5961-0976."

You may also look at:
http://www.hp.com/cposupport/printers/support_doc/bpl02705.html
and refer to this:
"For further information about PCL commands, HP-GL/2, macros, or PJL
commands, use the Technical Reference Manual set, part number 5021-0377.
Order the manual set from HP's Support Materials Organization."

Or you may download this:
http://h2.www2.hp.com/bc/docs/support/SupportManual/bpl13210/bpl13210.pdf
"PCL 5 Printer Language Technical Reference Manual - ENWW - HP Part No.
5961-0509. Printed in USA. First Edition - October 1992 PCL 5 Printer
Language Technical Reference Manual."
I have the same book, but dated September 1990 (this was really the
first edition), HP part number 33459-90903.

Also:
http://h2.www2.hp.com/bizsupport/TechSupport/SoftwareDescription.jsp?locBasepartNum=5961-0976&lang=English%20%28US%29
"HP PCL Tech Reference Manual CD-ROM - The HP PCL Tech Reference Bundle
CD-ROM includes the Technical Quick Reference Guide, Printer Job
Language Technical Reference Manual, PCL 5 Color Technical Reference
Manual, PCL 5 Printer Language Technical Reference Manual. In English in
PDF format."

Philippe.
- Original Message - 
From: "Neil J Geddes" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Thursday, August 28, 2003 2:23 PM
Subject: Cyrillic character mapping tables, HP MSL to Unicode


> Hello,
>
> I'm looking for symbol set and character metric information for the
> two Hewlett-Packard symbol sets "3R" (PC Cyrillic) and "9R" (Windows
> 3.1 Latin/Cyrillic). Specifically I'm after:-
>
> 1) .TFM files for Univers, CG Times, Courier and other common
> typefaces that use Cyrillic.
>
> 2) A cross mapping table for HP MSL (Master Symbol List) to Unicode.
>
> Thanks for any help you can offer. It's appreciated!
>
> Best regards,
>
> Neil Geddes
> [EMAIL PROTECTED]
>




Re: QBCS

2003-09-02 Thread Philippe Verdy
From: "Asmus Freytag" <[EMAIL PROTECTED]>
> At 08:26 PM 9/1/03 -0700, Doug Ewell wrote:
> >Tex Texin  wrote:
> >
> > > In most industry usages, MBCS refers to variable width encodings,
> > > not fixed width.
> >
> >Well, if variable-width encodings are referred to as both DBCS (see,
> >for example, http://czyborra.com/charsets/cjk.html#dbcs) and MBCS,
> >then what term is used to describe a fixed-width encoding of more
> >than 1 byte? Or was the concept not common enough to warrant a name
> >until Unicode?
>
> The most common 'pure' DBCS was encountered in mainframe environments.
> All the other platforms used 'mixed' single and double-byte or other
> variable length encodings, so that 'DBCS' could stand in for a
> variable length encoding with maximum length 2 without confusion
> (except when talking to mainframe people).

In the late 80's, the acronym DBCS was also used to refer to
user-defined characters that could be assigned in a codepage, defined by
a transferable bitmap, and accessed with an encoding sequence that
remapped the upper half of the 8-bit character set.

In a 7-bit environment, these 8-bit "characters" (in fact relative
positions in a 7-bit codepage) could be accessed using control sequences
(such as SS2, which shifts into the upper subset for the next character
only). For this reason, the characters assigned in the selected codepage
for the upper half of the 8-bit encoding, each accessed with at least
two 7-bit bytes, were described as "double-byte characters", and the
general encoding scheme was called "DBCS".
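
To make the single-shift idea above concrete, here is a minimal Python
sketch. It is only an illustration under stated assumptions: SS2 is taken
in its 7-bit form, ESC N (0x1B 0x4E), and the byte that follows is mapped
to the same position in the upper half by setting the high bit. The name
decode_single_shift is hypothetical, and the designation of actual
character sets (as in full ISO 2022) is ignored.

```python
ESC = 0x1B
SS2_FINAL = 0x4E  # 'N': ESC N is the 7-bit form of SINGLE SHIFT TWO


def decode_single_shift(data):
    """Turn a 7-bit byte stream into a list of 8-bit code positions.

    Simplified model (an assumption, not a full ISO 2022 decoder):
    ESC N affects only the next byte, which is moved to the upper
    half of the 8-bit table by setting bit 7.
    """
    positions = []
    i = 0
    while i < len(data):
        if data[i] == ESC and data[i + 1:i + 2] == bytes([SS2_FINAL]):
            if i + 2 >= len(data):
                raise ValueError("SS2 at end of stream, nothing to shift")
            positions.append(data[i + 2] | 0x80)  # shifted character
            i += 3  # three 7-bit bytes consumed for one character
        else:
            positions.append(data[i])  # ordinary single-byte character
            i += 1
    return positions


# 'A', then a shifted 'a' (0x61 -> 0xE1), then 'B'
print([hex(p) for p in decode_single_shift(b"A\x1bNaB")])
```

Note that in this model the shifted character costs three bytes on the
wire, which matches the point that such characters need "at least" two
7-bit bytes.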

This inspired the ISO-2022 standards for East Asian languages, and also
the European Teletext and Videotext standards, which were then
restricted to a 7-bit encoding scheme. These systems are still used
today. In any case, the "DBCS" usage referred to a complex encoding
scheme with variable-length characters (sometimes varying with the
encoding context or exceeding the 2-byte limit). You may find references
to these character sets that also mention the special escape sequences
used to define and transport the bitmaps needed to represent
"user-defined" characters (defined notably to support Japanese or
Chinese in the late 80's, or to create custom graphic characters, in
fact bitmap glyphs, within interactive documents or applications).




Re: Last Resort Font

2003-09-02 Thread Michael Everson
At 06:48 -0700 2003-09-02, [EMAIL PROTECTED] wrote:
>Michael Everson wrote on 08/19/2003 02:52:55 PM:
>
> > >p. 63 (Syloti Nagri): both top and bottom read "SILOTI NAGRI".
> >
> > I will look into all of that, and thank you for it; but note that of
> > those only Thaana can be expected to display, as none of the others
> > have been encoded. So none of those could EVER be displayed; they are
> > just extra glyphs in the current font.
>
>Syloti Nagri has been approved by UTC and assigned to A800..A82F, though
>this is yet to be ratified by WG2 (presumably will happen in October) and
>published in a new version of Unicode (will be 4.1) or an amendment to ISO
>10646 (I don't know what timetable is in place for publishing further
>amendments).

And it will be two years before the LR font has to be updated.
--
Michael Everson * * Everson Typography *  * http://www.evertype.com


Re: Character codes for Egyptian transliteration

2003-09-02 Thread Peter_Constable
Peter Kirk on 08/21/2003 09:33:27 AM:

> As for the requirement for distinct upper and lower case variants of 
> ayin, I understood that there was a similar requirement in some minor 
> Cyrillic languages, at least for apostrophe and double apostrophe. 
> Earlier this year Peter Constable was gathering information for a 
> possible proposal. But I never heard if it was proceeded with.

I was given charts reporting these things being used for various 
languages, but don't think I ever got an explanation of what the purpose 
for them was, and I didn't get any confirmation of actual use let alone 
samples from actual publications. If you can provide samples, that would 
be great.


Peter Constable





Re: Last Resort Font

2003-09-02 Thread Peter_Constable
Michael Everson wrote on 08/19/2003 02:52:55 PM:

> >p. 63 (Syloti Nagri): both top and bottom read "SILOTI NAGRI".

> I will look into all of that, and thank you for it; but note that of 
> those only Thaana can be expected to display, as none of the others 
> have been encoded. So none of those could EVER be displayed; they are 
> just extra glyphs in the current font.

Syloti Nagri has been approved by UTC and assigned to A800..A82F, though 
this is yet to be ratified by WG2 (presumably will happen in October) and 
published in a new version of Unicode (will be 4.1) or an amendment to ISO 
10646 (I don't know what timetable is in place for publishing further 
amendments).
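
For anyone who wants to test for that range in the meantime, a minimal
Python sketch follows. The range A800..A82F is the one quoted above; the
names SYLOTI_NAGRI and in_syloti_nagri are hypothetical, and the check
says nothing about which positions inside the range are actually
assigned.

```python
# A800..A82F inclusive, as quoted in the message above
SYLOTI_NAGRI = range(0xA800, 0xA82F + 1)


def in_syloti_nagri(ch):
    """Return True if the single character ch falls in A800..A82F."""
    return ord(ch) in SYLOTI_NAGRI


print(len(SYLOTI_NAGRI))            # 48 code positions in the range
print(in_syloti_nagri("\ua800"))    # True
print(in_syloti_nagri("\ua830"))    # False, one past the end
```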



Peter Constable



[OT]Re: Breaking free from UNICODE

2003-09-02 Thread Peter_Constable
Michael Everson wrote on 08/19/2003 03:14:47 PM:

> Golly, I was able to distinguish Latin and Georgian and Cyrillic on a 
> Mac SE 30 in 1985. Or was it 1987.(Long before Worldscript I admit.) 
> And years before that there was the Osborne with its dot-matrix 
> miracles.

IIRC, the Mac SE did not exist in 1985; I was using a relatively new Fat 
Mac in the summer of that year. 

BTW, the Osborne didn't particularly have dot-matrix miracles. That was 
the domain of printers like the Toshiba P321 and various Epson LQ models, 
and such printers could be connected to CP/M machines like the Osbornes 
and Kaypros, DOS machines like the IBM PC and Sharp PC 5000, and the Macs. 
But in 1985 I think the only dot matrix printers were the 9-pin variety, 
which weren't all that conducive to readable Latin with diacritics, let 
alone Chinese or Arabic typesetting. The P321 was one of the first 24-pin 
models, and I think it came out in 1987 or maybe late 1986.


Peter Constable



Re: QBCS

2003-09-02 Thread Doug Ewell
Tex Texin  wrote:

> In most industry usages, MBCS refers to variable width encodings, not
> fixed width.

Well, if variable-width encodings are referred to as both DBCS (see, for
example, http://czyborra.com/charsets/cjk.html#dbcs) and MBCS, then what
term is used to describe a fixed-width encoding of more than 1 byte?  Or
was the concept not common enough to warrant a name until Unicode?

-Doug Ewell
 Fullerton, California
 http://users.adelphia.net/~dewell/




Re: QBCS

2003-09-02 Thread Asmus Freytag
At 08:26 PM 9/1/03 -0700, Doug Ewell wrote:
>Tex Texin  wrote:
>
> > In most industry usages, MBCS refers to variable width encodings, not
> > fixed width.
>
>Well, if variable-width encodings are referred to as both DBCS (see, for
>example, http://czyborra.com/charsets/cjk.html#dbcs) and MBCS, then what
>term is used to describe a fixed-width encoding of more than 1 byte?  Or
>was the concept not common enough to warrant a name until Unicode?

The most common 'pure' DBCS was encountered in mainframe environments.
All the other platforms used 'mixed' single and double-byte or other
variable length encodings, so that 'DBCS' could stand in for a variable
length encoding with maximum length 2 without confusion (except when
talking to mainframe people).
A./
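
The difference between a 'mixed' encoding and a fixed-width one can be
seen with a small Python sketch. It assumes Shift_JIS as an example of a
mixed single/double-byte encoding (it is not a mainframe 'pure' DBCS),
and uses UTF-16 and UTF-32 as fixed-width comparisons; UTF-16 is only
fixed-width for characters in the BMP, which is all this sample text
uses. The function name char_widths is hypothetical.

```python
def char_widths(text, codec):
    """Return the number of bytes each character occupies in codec."""
    return [len(ch.encode(codec)) for ch in text]


sample = "A\u65e5\u672c"  # 'A' followed by two kanji (U+65E5, U+672C)

print(char_widths(sample, "shift_jis"))  # [1, 2, 2]  mixed widths
print(char_widths(sample, "utf-16-be"))  # [2, 2, 2]  two bytes each
print(char_widths(sample, "utf-32-be"))  # [4, 4, 4]  four bytes each
```

The "-be" codec variants are used so that no byte order mark is counted
in the widths.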