
Re: Punched tape (was: "Re: American English translation of character names")

2004-01-08 Thread Doug Ewell

Anto'nio Martins-Tuva'lkin  wrote:

> Anyway -- it had space for three holes, the small hole for the
> tractor wheel, and space for four more, IIRC.
>
> |O OoOO  |
> ...
>
> Any bells ringing? Wouldn't this be a nice "complete" set of chars to
> be encoded, a la Braille patterns?...

This is not so much a script as a UTF.  (In fact, Ken Whistler has
already done something similar as a joke; search the Unicode mailing
list archives for "BTF".)

The analogy with Braille is tempting, but Braille has mappings to many
other alphabets besides the commonly seen English/Latin mapping.  There
are Cyrillic Braille, Hebrew Braille, kana Braille, etc.  More
importantly, there is the concept of "Grade 2 Braille", in which a single
dot pattern or a combination of two or three is assigned a meaning that
varies depending on context, and is not always mnemonically derived from
the individual letters.  Punched-tape codes and card codes don't have
these characteristics.

You can find more codes for punched cards and tape, as well as internal
codes for early computers, at Dik Winter's site:

http://homepages.cwi.nl/~dik/english/codes/

or at Roman Czyborra's site, rumored to be at http://czyborra.com but
usually not available.

BTW, speaking of Roman, thanks to everyone who responded to my inquiries
about his whereabouts.  Actually, I admit I was primarily interested in
what had happened to his *site*, since it was (and still is) usually
unavailable.

-Doug Ewell
 Fullerton, California
 http://users.adelphia.net/~dewell/

RE: Punched tape (was: "Re: American English translation of character names")

2004-01-07 Thread Eric Scace

   Your 7-bit paper tape system was rather unusual, actually, and was not a Telex
system.

   Telex systems used what was then termed a "5-level code"; i.e., 5 bits.  The code
was often called "Baudot" but its formal name was International Telegraph Alphabet
#2 (ITA2).  It was standardized by the International Telegraph Union (now the
International Telecommunication Union, the second-oldest international treaty body
and now a specialized agency of the UN).  ITA2 provided 32 code points.  Several of
these were reserved for special functions: carriage return, line feed, "letters"
(forced a shift into letters case for subsequent characters), "figures" (forced a
shift into figures and punctuation for subsequent characters), space and "blank".
The remaining 26 code points represented A-Z (in letters case) and punctuation,
digits, and other special symbols (vulgar fractions, meteorological symbols, bell
signal, etc., depending on local conventions) in figures case.  These systems used
paper tape with 2 holes, a tractor hole for feeding the tape, and then 3 holes in a
column; each of the 5 holes represented one of the bits in the 5-bit encoding.
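
   The shift mechanism described above (5 bits give 32 code points, 6 are reserved,
and the remaining 26 each do double duty depending on the current case) can be
sketched roughly as follows.  This is only an illustration: the LTRS and FIGS
values are the ones usually quoted for ITA2, but bit-ordering conventions differ
between references, and every other assignment in the two small tables is a made-up
placeholder, not the real ITA2 chart.

```python
# Toy model of a 5-level shifted code.  LTRS/FIGS are assumed values;
# all other table entries are hypothetical placeholders, NOT real ITA2.
LTRS = 0b11111  # shift into letters case
FIGS = 0b11011  # shift into figures case

LETTERS_CASE = {0b00001: "A", 0b00010: "B", 0b00011: "C", 0b00100: " "}
FIGURES_CASE = {0b00001: "1", 0b00010: "2", 0b00011: "3", 0b00100: " "}


def decode(codes):
    """Decode a sequence of 5-bit code points, tracking the shift state."""
    table = LETTERS_CASE  # assumption: the receiver starts in letters case
    out = []
    for code in codes:
        if not 0 <= code < 32:
            raise ValueError(f"not a 5-bit code: {code}")
        if code == LTRS:
            table = LETTERS_CASE
        elif code == FIGS:
            table = FIGURES_CASE
        else:
            out.append(table.get(code, "?"))  # "?" marks unassigned codes
    return "".join(out)


print(decode([1, 2, FIGS, 1, 2, LTRS, 3]))
```

   The same 5-bit value yields a letter or a figure depending only on the most
recent shift code, which is why a single lost shift character garbles everything
after it.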

   Later, ASCII paper tape systems became common; many of these used 3 holes, a
tractor hole, and 5 holes to represent 8-bit encodings (including a parity bit).

   "7-level" tape systems were used in some special applications.  Some of the ones
that I encountered were based on ASCII without the 8th (parity) bit; others used
special encodings to control typesetting equipment.

   Two-level tape systems were used throughout the first six decades of the 20th
century to key submarine telegraph cables.

   All of these tape systems were mechanisms for storing information.  They aren't
alphabets.

-- Eric Scace

Punched tape (was: "Re: American English translation of character names")

2004-01-06 Thread Anto'nio Martins-Tuva'lkin

On 2003.12.19, 00:24, Carl W. Brown <[EMAIL PROTECTED]> wrote:

> Jill,
>
>> I'm a programmer, and I'm older than most programmers. I'm old enough
>> to remember punched paper tape ...
<...>
> Yes I worked with paper tape as well. I even worked on one machine
> that I would write programs on paper tape loops.

Well, I'm only 34 but I did work with one of these, on a telex machine
used as a terminal. I still keep some reels of tape, punched with some of
my high school stuff.

Anyway -- it had space for three holes, the small hole for the tractor
wheel, and space for four more, IIRC.

|O OoOO  |
|O  oOOO |
| OOo O O|
|OO oOO  |
| O o  OO|
|OO o|
|O Oo OO |
|O  o|
| OOo OOO|
|O OoO   |
| OOo OOO|
|O  o OOO|

Any bells ringing? Wouldn't this be a nice "complete" set of chars to be
encoded, a la Braille patterns?...
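
A rough sketch of how one row of the drawing above could be read as a number.
It rests on assumptions the message does not state: that 'O' is a punched data
hole, a blank is an unpunched position, 'o' is the sprocket hole, and the
leftmost data position is the most significant bit.  Rows whose spacing does
not fit the three-plus-four layout are rejected rather than guessed at.

```python
def parse_row(row: str) -> int:
    """Turn one row such as '|O OoOO  |' into a 7-bit integer.

    Assumed layout: '|', 3 data positions, sprocket 'o', 4 data
    positions, '|'.  Bit order (leftmost = most significant) is a guess.
    """
    text = row.strip()
    if not (text.startswith("|") and text.endswith("|")):
        raise ValueError(f"row is not framed by '|': {row!r}")
    body = text[1:-1]
    if len(body) != 8 or body[3] != "o":
        raise ValueError(f"unexpected row layout: {row!r}")
    data = body[:3] + body[4:]          # drop the sprocket hole
    if set(data) - {"O", " "}:
        raise ValueError(f"unexpected character in row: {row!r}")
    value = 0
    for ch in data:
        value = (value << 1) | (ch == "O")
    return value


print(format(parse_row("|O OoOO  |"), "07b"))
```

Seven data positions give 128 possible rows, which is the "complete set" the
question has in mind.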

--.
António MARTINS-Tuválkin |  ()|
<[EMAIL PROTECTED]>||
Rua Alberto Bramão, 8-1º d.to |
PT-1700-132 LISBOA   Não me invejo de quem tem|
+351 934 821 700 carros, parelhas e montes|
http://www.tuvalkin.web.pt/bandeira/ só me invejo de quem bebe|
http://pagina.de/bandeiras/  a água em todas as fontes|

RE: American English translation of character names

2003-12-19 Thread jarkko.hietaniemi

> particularly the 1930s, 40s, 50s, and 60s sections, and 
> follow the many
> links from each entry.  In particular, you can see the basic 
> character set
> of the IBM 360 (as generated by the IBM 29 Card Punch) here:
> 
>   http://www.columbia.edu/acis/history/029.html
> 
> (scroll down a bit after the photo).

http://www.unicode.org/Public/MAPPINGS/VENDORS/IBM/IBM360.TXT

404 Not Found

:-)

RE: American English translation of character names

2003-12-18 Thread Carl W. Brown

Jill,

>I'm a programmer, and I'm older than most
>programmers. I'm old enough to remember
>punched paper tape ... but not quite old
>enough to remember punched cards.

Don't feel bad.  My first job was for IBM, helping them set up a production
line for the 1401.  This was a computer that had no vacuum tubes.  All
transistors.  The plant manager was not happy because it was taking up
valuable floor space that could have been used to build time clocks.  After
all, this was IBM's real business, not computers.

It used punched cards; however, when the 360 came out and IBM switched to
EBCDIC, they changed the punch card encoding.  The special characters changed
to accommodate new characters like the not sign.  The 024 keypunches had to
be replaced.  Be careful, because even after the 360, other companies like
Control Data continued to use the old BCD punch card encoding.

Yes, I worked with paper tape as well.  I even worked on one machine for which
I would write programs on paper tape loops.  You were very limited in what you
could write.  Branching was difficult.

The first APL system that I worked on was the 5101.  It was a predecessor of
the PC (5150).  It had an APL keyboard.  They also offered Basic, but the
Basic was very limited.  For example, strings were limited to 18 characters.
Not bad for a $30,000+ machine.

Carl

RE: American English translation of character names

2003-12-18 Thread Philippe Verdy

Philippe Verdy
> Isn't a caron a model (or trademark?) for crochet hooks?
> When I look at some handwritten texts using hacek, it looks much
> more like a rounded and oblique crochet hook than a
> reversed circumflex (as seen in Unicode charts).
> 
> The handwritten hacek glyph looks approximately like this,
> it is completely rounded without the angular shape:
> (select a monospace font to view it)
> 
> ##
>   ###
>  ###
>###  ###
>    ###
>######
>     
>
>   ## 
> 
> It is easily distinguished from the breve and acute accents,
> and it's not even a mirrored comma above.
> The glyph is visibly drawn as a continuous stroke from the
> middle-left to the thinner upper-right.

I should also have noted that this handwritten glyph is consistent
with its possible notation on the right side of letters with
large ascenders, notably D, L, l and T.

That makes sense in that case, because this apostrophe is also
more or less interpreted as a variant of the acute accent, and
not as a simply reversed circumflex.

"Hacek" (pronounced hatchek, with the 'h' aspirated,
and with 'a' pronounced nearly like a short schwa) also means
"little hook" in Czech...

So the rounded "hook" glyph makes sense here, whereas the angular shape
in Unicode charts is suspect and may have come from a historical
misinterpretation of the Czech hatchek accent by other Latinists
and typographers, who may have just borrowed the same metal shape
used for the circumflex to print Czech texts.

If someone can find in a Czech library some old handwritten scripts
or even some source of Czech calligraphy, we could see whether the
angular modern form of the hacek corresponds to its initial shape.



Re: American English translation of character names

2003-12-18 Thread Edward H. Trager

On Thursday 2003.12.18 04:05:53 -0800, Peter Kirk wrote:
> On 18/12/2003 02:51, Arcane Jill wrote:
> 
> >...
> >In fact, until Kenneth Whistler's email about American English - I 
> >actually thought the Unicode character names /were/ in American 
> >English, because they are certainly not in my native dialect (although 
> >I did know that most Americans don't say "full stop"). Rest assured, 
> >Kenneth, we in Britain do /not/ refer to slash as "solidus", 
> >underscore as "low line",  backslash as "reverse solidus", paragraph 
> >sign as "pilcrow sign", and so on. I have no idea where these terms 
> >came from, but, take it from someone who lives here, they are not in 
> >common usage in Britain. (With the exceptions of "full stop" and 
> >"anticlockwise"). Curious -- I wonder where those "official" names 
> >came from?
> 
> 
> They are not the names used by British programmers. But they are perhaps 
> the names which were used by British typesetters, and maybe American 
> ones too, in the old days of hot metal.
> 
> >
> >I've never attached any importance to the "proper" names (and I'm also 
> >a programmer). In fact, I don't even see why a Unicode character /has/ 
> >to have a "proper name" at all. ASCII characters never had them. And, 
> >hey - the official names for CJK Unified Ideographs Extension A (for 

Hopefully most of you will agree that having official names for Unicode 
characters in ASCII-only English is very useful when various characters get 
discussed on mailing lists such as this one.  It saves having to look up hex values
endlessly, since many still don't have (or, as in my case, don't always 
have access to) Unicode-enabled email clients.

I personally think that it is an *interesting* omission that the CJK ideographs
do not have meaningful names.
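
To make the contrast concrete, here is a small sketch using Python's standard
unicodedata module (the describe helper is just an illustrative name).  It
prints the official name for a few of the characters discussed in this thread
and for one CJK ideograph, whose name carries nothing beyond its code point.
The characters are written as escapes so the snippet survives ASCII-only mail.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return 'U+XXXX OFFICIAL NAME' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for ch in ["/", "_", "\\", "\u00b6", "\u00ac", "\u7231"]:
    print(describe(ch))
```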

I'm probably going to be just opening up a can of worms by suggesting a meaningful 
CJK ideograph naming system (and I fully expect lots of comments back from the 
experts to the tune of "Yes, the CJK group considered all manner of things like 
this before, but it wouldn't work because of X, Y, and Z..." or "You really don't 
know what you are talking about").  But assuming that risk, I'm going to say it 
anyway and give some reasons for why I would do it this way:  A useful system for 
naming CJK ideographs would be to construct names by stringing together:

   (1) An indicator of whether the character is simplified (SIMPLIFIED) or
   traditional (TRADITIONAL), for ideographs originating in China which come
   in both traditional and simplified forms, or an indicator for a variant
   form (VARIANT) if it is an encoded variant of another, more commonly used
   glyph.  Omit the indicator if a character of Chinese origin comes in only
   one form.  If the character was "invented" by the Japanese, use "JAPANESE"
   as the indicator.  If invented by the Koreans, use "KOREAN" as the
   indicator.  If invented by the Vietnamese, use "VIETNAMESE" as the
   indicator.

   (2) If the character is used in Chinese, then the primary pronunciation of
   the ideograph in modern standard Mandarin Chinese using pinyin, followed by
   a digit 1-4 to indicate the tone of the primary pronunciation.  If the
   character does not appear in Chinese but rather was invented by the
   Japanese, Koreans, or historical Vietnamese, then provide the primary
   pronunciation in Japanese if used in Japan, Korean if used only in Korea,
   Vietnamese if used historically only in Vietnam.

   (3) The primary meaning of the character in English according to the
   primary language in which that character appears.

For example:

   爱 u7231 SIMPLIFIED AI4 LOVE
   愛 u611B TRADITIONAL AI4 LOVE
   戈 u6208 GE1 SPEAR
   為 u70BA TRADITIONAL WEI2 TO BE
   爲 u7232 VARIANT WEI2 TO BE
   圓 u5713 TRADITIONAL YUAN2 CIRCLE
   円 u5186 JAPANESE EN YEN
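
As a rough sketch of how such names could be assembled and queried, the
three-part scheme above might look like this in Python.  The Ideograph type and
the with_indicator helper are hypothetical names chosen for illustration; no
such fields exist in any real database, and the data is just the seven examples
from this message.

```python
from typing import NamedTuple, Optional


class Ideograph(NamedTuple):
    codepoint: int
    indicator: Optional[str]  # None: Chinese origin, only one form
    reading: str              # primary reading, e.g. pinyin + tone digit
    gloss: str                # primary meaning in English

    @property
    def name(self) -> str:
        """Join the parts that are present, in the order (1) (2) (3)."""
        parts = [self.indicator, self.reading, self.gloss]
        return " ".join(p for p in parts if p)


EXAMPLES = [
    Ideograph(0x7231, "SIMPLIFIED", "AI4", "LOVE"),
    Ideograph(0x611B, "TRADITIONAL", "AI4", "LOVE"),
    Ideograph(0x6208, None, "GE1", "SPEAR"),
    Ideograph(0x70BA, "TRADITIONAL", "WEI2", "TO BE"),
    Ideograph(0x7232, "VARIANT", "WEI2", "TO BE"),
    Ideograph(0x5713, "TRADITIONAL", "YUAN2", "CIRCLE"),
    Ideograph(0x5186, "JAPANESE", "EN", "YEN"),
]


def with_indicator(entries, indicator):
    """Select the subset carrying a given indicator (None = unchanged)."""
    return [e for e in entries if e.indicator == indicator]


for entry in with_indicator(EXAMPLES, "TRADITIONAL"):
    print(f"u{entry.codepoint:04X} {entry.name}")
```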

Standardized names such as these, at least for the BMP CJK characters,
would make it pretty clear to most knowledgeable readers what characters were being
discussed even when unable to see the glyphs for whatever reason.
Perhaps more importantly, if this were in the Unihan database, which is
the database that most developers are going to access first, it would be trivial to
query out various useful subsets of ideographs, such as the TRADITIONAL vs. SIMPLIFIED
(vs. the "doesn't change" subset), or those that are uniquely JAPANESE, etc.  I'm not
saying it would be the complete solution for everything -- of course not.  But it would
put this information "at one's fingertips", so to speak, in a prominent database that
many people look at.

> >example) tell me nothing more than the script and codepoint anyway. I 
> >tend to regard them as "comments".
> >
> Agreed. The names are useful for selecting a character from a drop-down 
> list. But they are only useful if they are accurate. I agree with Doug 
> that "As a programmer, I can't personally imagine designing a program 
> that relies on the Unicode names to identify characters uniquely". I 
> suspect that

RE: American English translation of character names

2003-12-18 Thread Frank da Cruz

>Yes, I did both cards and punched paper tape as a teenager.
>
I did them too.  Nothing to do with Unicode, but those who would like an
introduction to punched cards and early computing (mainly IBM oriented)
are welcome to take a look at this:

  http://www.columbia.edu/acis/history/

particularly the 1930s, 40s, 50s, and 60s sections, and follow the many
links from each entry.  In particular, you can see the basic character set
of the IBM 360 (as generated by the IBM 29 Card Punch) here:

  http://www.columbia.edu/acis/history/029.html

(scroll down a bit after the photo).  And for a fascinating (to some :-)
history of the early development of IBM and ASCII character sets, see:

  Mackenzie, Charles E., Character Sets, History and Development,
  Addison-Wesley (1980).

It might be surprising to learn that there was almost as much discussion,
argument, and compromise over the early 64- and 96-character and 8-bit
character sets as there is today over the worldwide Universal Character
Set.  Well, maybe not so surprising since the demand for including
characters was so great and the space so small.

- Frank

RE: American English translation of character names

2003-12-18 Thread Eric Scace

Hi Jill --

   I'll try to answer your questions.

   Yes, I did both cards and punched paper tape as a teenager.  In fact, I used
paper tape on Teletype Corp model 33 ASR teleprinter machines.  Sigh: didn't even
think I was dating myself that badly *grin*.  I was lucky: my father got involved
in computer programming very early on and he was thoughtful enough to teach me a
couple of languages, carry my card decks to work, bring print-outs home, etc.
Slow turnaround but...  Mechanical teleprinter machines were a special treat for
me.  I fell in love with them as a 14-year-old kid when I saw them at the local
meteorological office.  I got some of my own around age 16, and learned how to
maintain and repair them.  When I got to college, I earned a lot of spending money
as a free-lance repairman for all those Model 33 and 35 Teletype machines used as
computer consoles in laboratories around campus.

   As best as I recall, some versions of IBM 360 FORTRAN compilers and PL/C used
the not symbol.  It was represented on punched cards as an L-shaped character.
Imagine an uppercase L, rotated 90° clockwise, and then reflected around the
vertical axis so that the downward stroke is on the right.  I haven't looked at
the U+00AC glyph to see if it is the same.  If it is necessary to come up with
some historical references, I'll check my college programming course material.  I
think the keypunch machines that produced the cards were known as IBM 2714s.

   I think C chose "!" as the negation operator (to be precise) because it was a
widely available glyph on common keyboards which did not yet have a meaning
assigned to it.  But it's all a bit arbitrary, this assignment of programming
operators to glyphs.

   And... on the other aspects of the thread about keyboard layouts... laptop
keyboards are often laid out rather differently when it comes to the less
frequently used punctuation and diacritical marks.  This makes it quite
entertaining when one jumps between American, German, Finnish and French laptop
computers, as I was forced to do on a recent trip to Albania.

-- Eric Scace



RE: American English translation of character names

2003-12-18 Thread Jim Allan

Arcane Jill wrote:

> You are saying that ... in the
> days of punched cards ... there was an EBCDIC code whose meaning was
> LOGICAL NOT. So far so good - but how would such a character code have
> been /written/? Was it written like the U+00AC glyph is now?

Yes, exactly the same.

It appeared in original EBCDIC in 1964. See 
http://homepages.cwi.nl/~dik/english/codes/stand.html#ebcdic

It appeared on IBM mainframe terminal keyboards. It still appears on 
terminals in an EBCDIC environment.

Jim Allan

RE: American English translation of character names

2003-12-18 Thread Philippe Verdy

Michael Everson writes:
> >John Cowan wrote:
> >>  The most mysterious term is "caron" for the hacek accent: this word
> >>  seems to exist only in ISO standards, and nobody has any idea where it
> >>  came from.
>
> This doesn't make any sense to me, but in any case it does not 
> explain the origin of the word "caron". The most plausible suggestion 
> I've ever come up with is folk-etymological: It's a CARet that sits 
> ON the vowel. :-(

Isn't a caron a model (or trademark?) for crochet hooks?
When I look at some handwritten texts using hacek, it looks much
more like a rounded and oblique crochet hook than a
reversed circumflex (as seen in Unicode charts).

The handwritten hacek glyph looks approximately like this,
it is completely rounded without the angular shape:
(select a monospace font to view it)

##
  ###
 ###
   ###  ###
   ###
   ######

  ## 

It is easily distinguished from the breve and acute accents,
and it's not even a mirrored comma above.
The glyph is visibly drawn as a continuous stroke from the
middle-left to the thinner upper-right.


RE: American English translation of character names

2003-12-18 Thread Marco Cimarosti

John Cowan wrote:
> In the New York City subway system (of underground trains, that is,
> not underground pedestrian tunnels!), this letter has been 
> consistently avoided since 1967, when the system of distinguishing trains
> by letter or number was instituted.  The only other letters never used are
> I and O (presumably to avoid confusion with 1 and 0, though 0 has never
> been used either), and Y.  Why Y is a mystery to me: perhaps there has
> simply never been a need for it.

Probably, having to take train "Why?" to reach one's workplace could have a
negative effect on employees' attitude towards hard work.

_ Marco

RE: American English translation of character names

2003-12-18 Thread Arcane Jill





> -Original Message-
> From: Eric Scace [mailto:[EMAIL PROTECTED]]
> Sent: Thursday, December 18, 2003 3:57 PM
> To: John Cowan; Arcane Jill
> Cc: [EMAIL PROTECTED]
> Subject: RE: American English translation of character names
> 
> 
>    The logical "not" glyph got into EBCDIC because the 
> concept was needed in computer programming.

I'm a programmer, and I'm older than most programmers. I'm old enough
to remember punched paper tape ... but not quite old enough to remember
punched cards. I am interested in this, though. Could you
possibly clarify which computer language used (the EBCDIC
equivalent of) U+00AC? I only ask because I'm not aware of one, and I'm
intrigued.


>    In the late 1970s the C programming language was one of 
> the first to use the glyph "!" to mean logical "not"; e.g., 
> "!=".

"!" is used to mean "logical not" in contexts other than just "not
equal". As in, for example: bool b1 = ! b2; (although
there wasn't a bool type back then). I remember that BASIC used the
keyword "NOT" for the same purpose. C also uses "~" as a "bitwise not".
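
The distinction drawn above, between a logical "not" and a bitwise "not", can be
sketched in Python rather than C.  Python spells the logical operator as the
keyword "not" (much like the BASIC keyword mentioned) but keeps "!=" and "~".
The two helper names and the 8-bit width are illustrative choices only.

```python
def logical_not(flag) -> bool:
    """Logical negation: any truthy value becomes False, falsy becomes True."""
    return not flag


def bitwise_not(value: int, width: int = 8) -> int:
    """Flip each bit of value, masked to an assumed fixed width."""
    return ~value & ((1 << width) - 1)


print(logical_not(5))                      # 5 counts as "true"
print(format(bitwise_not(0b00001111), "08b"))
print(3 != 4)
```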

So ... let me see if I have understood you correctly, because this is a
tad confusing (but very interesting). You are saying that ... in the
days of punched cards ... there was an EBCDIC code whose meaning was
LOGICAL NOT. So far so good - but how would such a character code have
been written? Was it written like the U+00AC glyph is now? Or
did its visual appearance vary depending on who was writing it? Or ...
did it even have a visual appearance at all? I figure that, if
it didn't have the visual appearance of the U+00AC glyph then "logical
not" would map better to Unicode character U+223C TILDE OPERATOR (also
known as "not", according to the code charts) which at least looks
like the character mathematicians use. On the other hand, if it did
have U+00AC appearance then fair enough.


> etc).  Earlier keyboard languages used a different 
> workaround; e.g., "<>" for "not equal".

Yeah, I always wondered why C chose to deploy ! to mean "not". Weird.
Maybe they just picked a character at random and said "Ah yes - we'll
use that one - no-one else seems to be using it for anything"

Jill

RE: American English translation of character names

2003-12-18 Thread Jim Allan

Arcane Jill wrote:

> (Incidentally, the code charts for U+00AC (NOT SIGN) also say "= angled
> dash (in typography)." So I'm still a bit confused about in which
> discipline it is actually known as "not sign").

The not sign is often used in logical notation in Boolean algebra or 
sentential logic. See 
http://whatis.techtarget.com/definition/0,,sid9_gci843775,00.html

Other conventions are often used instead, especially use of the tilde. I 
believe, but could be mistaken, that use of tilde for "logical not" is 
the older usage and that the specific "logical not" sign was introduced as a 
substitution because the tilde most often suggests approximation in 
mathematical use.

The not sign is used on the IBM mainframe platform in some computer 
languages, notably REXX. See http://www.ilook.fsnet.co.uk/rexx/rexcmdc5.htm

The backslash was also given the meaning "logical not" in REXX at some 
stage as an alternate in environments where the "logical not" sign was 
not available.

Versions of REXX adapted to ASCII generally replace the "logical not" 
sign by either ~ or ^ or allow either as well as recognizing the backslash.

See also 
http://www.uwm.edu/IMT/Computing/sasdoc8/sashtml/mindex/sc-index.htm for 
its use in another computer language.

Use of ^ meaning "logical not" generally derives from the use of "^" as 
a translation of the proper not sign in text files from EBCDIC to ASCII 
where the two symbols are normally equated. For example, from 
http://www.printek.com/products/autoforms.html

<< The following commands use the logical not (¬) sign or a caret (^). 
IBM terminals generally have the logical not sign. PC's running a 
terminal emulation program have a caret. In either case, both characters 
are a shift 6 on the keyboard. >>
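
One way to see how the two symbols come to be equated is to decode the same
EBCDIC byte under two different EBCDIC code pages.  The sketch below uses the
cp037 and cp500 codecs from Python's standard library; whether either of them
corresponds to the translation tables used by any particular product mentioned
here is an assumption, and decode_byte is just an illustrative helper name.

```python
NOT_SIGN = "\u00ac"  # U+00AC NOT SIGN


def decode_byte(value: int, codec: str) -> str:
    """Decode a single byte value under the named codec."""
    return bytes([value]).decode(codec)


# The same byte, 0x5F, read under two EBCDIC variants.
for codec in ("cp037", "cp500"):
    ch = decode_byte(0x5F, codec)
    print(codec, f"U+{ord(ch):04X}")
```

Under cp037 the byte 0x5F is the not sign, while under cp500 it is the
circumflex, so a file written with one table and read with the other swaps the
two characters.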

Jim Allan

Re: American English translation of character names

2003-12-18 Thread Michael Everson

At 09:01 -0500 2003-12-18, John Cowan wrote:

> "Underscore" would suggest rather U+0332, the combining low line.  As
> for "pilcrow", it's probably descended from a perversion of "paragraph",
> but nobody knows for sure.

The OED gives other forms for it:

15th-century pylcraft(e), pilecrafte; 16th-century pilcrowe; 
17th-century pilkrow, pill-crow, peelcrow, pilgrow. Apparently for 
pilled crow, cf. pilcord, pilgarlic. The application of the word, 
with the form pylcraft, has suggested that it originated in a 
perversion of PARAGRAPH, through pargrafte, *parcrafte, etc.: cf 
quote c 1460 and 1617. But the history of the word is obscure, and 
evidence is wanting.
--
Michael Everson * * Everson Typography *  * http://www.evertype.com

RE: American English translation of character names

2003-12-18 Thread Francois Yergeau

Arcane Jill wrote:
> Or, indeed, why the "proper" name for a character must be in
> English, and spellable in ASCII, instead of, say, Japanese.

The names are in English in the English version of the standard.  The French
version of 10646 appropriately has French names, not restricted to ASCII but
to the repertoire of ISO 8859-15.  See
http://iquebec.ifrance.com/hapax/ListeDesNoms-4.0.0.txt (work in progress).

-- 
François

RE: American English translation of character names

2003-12-18 Thread Eric Scace

   The logical "not" glyph got into EBCDIC because the concept was needed in
computer programming.  An example is the instruction that if A does not equal B,
then do something.  IBM picked up the glyph and incorporated it into its punch
card systems.

   In the late 1970s the C programming language was one of the first to use the
glyph "!" to mean logical "not"; e.g., "!=".  This was a response to the use of
mechanisms other than punch cards to enter program instructions (keyboards and
CRTs, teletypewriters, etc).  Earlier keyboard languages used a different
workaround; e.g., "<>" for "not equal".

   (Apologies if this duplicates earlier information; I'm jumping into the thread
rather late.)

-- Eric Scace

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Behalf Of John Cowan
Sent: 2003 December 18 09:31
To: Arcane Jill
Cc: [EMAIL PROTECTED]
Subject: Re: American English translation of character names


Arcane Jill scripsit:

> For example, U+00AC (NOT
> SIGN) is something that most people I know would describe in terms like
> "oh - you know - that character to the left of 'one' on a keyboard, if
> you press shift".

On the standard U.S. keyboard, that gesture generates ~.
If I turn on the U.S.-International keyboard, then RightAlt-\ gives me the
NOT SIGN, where \ is the rightmost key in the QWERTYUIOP row.

> (And even then, the usual response is "Oh that one -
> I've never used it. What's it for?". Curiously, to a mathematician,
> tilde, overscore and prime are used in various contexts to mean "not
> sign"; wheras to a programmer, exclamation mark and tilde could mean
> "not sign".

In this case, it's logicians who use U+00AC for "not", or at least
some of them.  It got into EBCDIC, I don't know exactly how, and from
there into ISO 8859-1.

> Curiously, [the BBC] never had the same problem
> with the name of the letter P.

In the New York City subway system (of underground trains, that is,
not underground pedestrian tunnels!), this letter has been consistently
avoided since 1967, when the system of distinguishing trains by letter
or number was instituted.  The only other letters never used are I
and O (presumably to avoid confusion with 1 and 0, though 0 has never
been used either), and Y.  Why Y is a mystery to me: perhaps there has
simply never been a need for it.

--
The Imperials are decadent, 300 pound   John Cowan <[EMAIL PROTECTED]>
free-range chickens (except they have   http://www.reutershealth.com
teeth, arms instead of wings and        http://www.ccil.org/~cowan
dinosaurlike tails).  --Elyse Grasso

RE: American English translation of character names

2003-12-18 Thread Michael Everson

At 16:21 +0100 2003-12-18, Philippe Verdy wrote:
>John Cowan wrote:
>>  The most mysterious term is "caron" for the hacek accent: this word
>>  seems to exist only in ISO standards, and nobody has any idea where it
>>  came from.
>
>I think it may have occurred in some typographic terminology, because
>the initial glyph looked more like a crochet hook than a reversed
>circumflex, i.e. caron was not angular in handwritten form, as it is
>now in typeset fonts, but looked like a rounded and oblique check
>mark (a slight variation of the acute accent with a small rounded
>hook on its bottom end, but still quite distinct from the
>lower half-circle form used by breve).

This doesn't make any sense to me, but in any case it does not 
explain the origin of the word "caron". The most plausible suggestion 
I've ever come up with is folk-etymological: It's a CARet that sits 
ON the vowel. :-(
--
Michael Everson * * Everson Typography *  * http://www.evertype.com

RE: American English translation of character names

2003-12-18 Thread Philippe Verdy

John Cowan wrote:
> The most mysterious term is "caron" for the hacek accent: this word
> seems to exist only in ISO standards, and nobody has any idea where it
> came from.

I think it may have occurred in some typographic terminology, because
the initial glyph looked more like a crochet hook than a reversed
circumflex, i.e. caron was not angular in handwritten form,
as it is now in typeset fonts, but looked like a rounded and oblique
check mark (a slight variation of the acute accent with a small
rounded hook on its bottom end, but still much more distinct from
the lower half-circle form used by breve).


Re: American English translation of character names

2003-12-18 Thread John Wilcock

On Thu, 18 Dec 2003 09:30:42 -0500, John Cowan wrote:
> In this case, it's logicians who use U+00AC for "not", or at least
> some of them.  It got into EBCDIC, I don't know exactly how, and from
> there into ISO 8859-1.

Wasn't it used for that purpose in APL?

John.

-- 
-- Over 2000 webcams from ski resorts around the world - www.snoweye.com
-- Translate your technical documents and web pages- www.tradoc.fr

Re: American English translation of character names

2003-12-18 Thread John Cowan

Arcane Jill scripsit:

> In fact, until Kenneth Whistler's email about American English - I 
> actually thought the Unicode character names /were/ in American English, 
> because they are certainly not in my native dialect (although I did know 
> that most Americans don't say "full stop"). 

My father and I never could convince my mother (native German speaker
who immigrated to the U.S. at age 12) that the football (i.e. American
rugby) player she dated in high school was a "fullback" and not a
"full stop".

> Rest assured, Kenneth, we in
> Britain do /not/ refer to slash as "solidus", underscore as "low line",  
> backslash as "reverse solidus", paragraph sign as "pilcrow sign", and so 
> on. 

"Solidus" is probably the most interesting one: it's Latin for "shilling",
and until 1971 the usual way of writing "six shillings eightpence" was
6/8, i.e. "sex solidi octo denarii".  In this use, the / descends from
U+017F, the old "long s".

"Underscore" would suggest rather U+0332, the combining low line.  As
for "pilcrow", it's probably descended from a perversion of "paragraph",
but nobody knows for sure.

The most mysterious term is "caron" for the hacek accent: this word
seems to exist only in ISO standards, and nobody has any idea where it
came from.

-- 
John Cowan  [EMAIL PROTECTED]  www.reutershealth.com  www.ccil.org/~cowan
Original line from _The Warrior's Apprentice_ by Lois McMaster Bujold:
"Only on Barrayar would pulling a loaded needler start a stampede toward one."
English-to-Russian-to-English mangling thereof: "Only on Barrayar you risk to
lose support instead of finding it when you threat with the charged weapon."
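
For readers who want to see the formal names discussed in this message, here is a minimal sketch using Python's standard `unicodedata` module. The choice of characters is just the set mentioned in the thread; the variable names are illustrative only.

```python
import unicodedata

# Characters mentioned in the thread: slash, underscore, backslash,
# paragraph sign, full stop, the caron accent, the combining low line,
# and the old "long s".
chars = ["/", "_", "\\", "\u00b6", ".", "\u02c7", "\u0332", "\u017f"]

# Map each character to its formal Unicode character name.
formal_names = {ch: unicodedata.name(ch) for ch in chars}

for ch, name in formal_names.items():
    print(f"U+{ord(ch):04X}  {name}")
```

The output lists SOLIDUS, LOW LINE, REVERSE SOLIDUS and PILCROW SIGN for the first four characters, which are the names Arcane Jill says are not in common British use.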

RE: American English translation of character names

2003-12-18 Thread Arcane Jill

Thanks, that's interesting. It may well be the case that printers,
typesetters, etc., are the only people who actually need these
things to have names, so I guess their names should be respected. The
rest of us just seem to get by without them, somehow. For example,
U+00AC (NOT SIGN) is something that most people I know would describe
in terms like "oh - you know - that character to the left of 'one' on a
keyboard, if you press shift". (And even then, the usual response is
"Oh that one - I've never used it. What's it for?".) Curiously, to a
mathematician, tilde, overscore and prime are used in various contexts
to mean "not sign"; whereas to a programmer, exclamation mark and tilde
could mean "not sign". Do printers, typesetters, editors and publishers
use U+00AC to actually mean "not sign" then, or is it an
arbitrary name? (Incidentally, the code charts for U+00AC (NOT SIGN)
also say "= angled dash (in typography)." So I'm still a bit confused
about in which discipline it is actually known as "not sign".)

Going back to the American English point, our terms for things are
really not so far apart. "Counterclockwise" sounds just as acceptable
to my ears as "Anticlockwise". I confess that  "period" still sounds
weird to my ears, but every programmer calls that character "dot"
anyway. In short, Kenneth's "translation into American" is more
understandable to me, in Britain, than the original. Okay, so we now
have an explanation - they are typesetters' terms. (I don't know if
they are British or American, but don't think it really matters, now
that we've established that the majority of the population don't use
them).

As an amusing aside, when character names migrated from programmers to
the general public via BBC television (because TV presenters started
having to read out email addresses and URIs), they purposefully started
a new trend of referring to slash (solidus) as "right-slash" or
"forward-slash". Everyone else had called it "slash" for as
long as I could remember, but the BBC couldn't allow their presenters
to say "slash" because (in Britain, at least), the verb 'to slash' is a
slang term meaning 'to urinate'. Curiously, they never had the same
problem with the name of the letter P.

Jill

> -Original Message-
> From: Séamas Ó Brógáin [mailto:[EMAIL PROTECTED]]
> Sent: Thursday, December 18, 2003 12:05 PM
> To: Unicode-L
> Subject: RE: American English translation of character names
>
> Jill Ramonsky wrote:
>
> > . . . I have no idea where these terms came from, but, take it from
> > someone who lives here, they are not in common usage in Britain.
>
> If you were a printer, typesetter, editor or publisher---i.e. one of
> those who _use_ all these characters and therefore must have names for
> them---you would probably be more familiar with traditional
> terminology.
>
> Séamas Ó Brógáin

RE: American English translation of character names

2003-12-18 Thread D. Starner

> Or, indeed, why the "proper" name for a character must be in English, 
> and spellable in ASCII, instead of, say, Japanese.

Because it's an English character list; limiting the use of the list to
those who know 15 languages wouldn't be of much help. And ASCII, because
once you've restricted it to English, it's not much of a restriction, and
there are few channels where ASCII gets restricted, but many where arbitrary
UTF-8 isn't accepted.
 
> In fact, I don't even see why a Unicode character /has/ to 
> have a "proper name" at all. 

Because a great pain of Unicode is the lack of a standard JIS X0218-Unicode
mapping, and part of that reason is the fact that JIS X0218 is a glyph
standard without proper names and definitions of what the characters are.

> ASCII characters never had them. 

http://www.itscj.ipsj.or.jp/ISO-IR/006.pdf (ISO 646, USA Version X3.4 - 1968)
certainly seems to have them. 

> And, hey - 
> the official names for CJK Unified Ideographs Extension A (for example) 
> tell me nothing more than the script and codepoint anyway. 

And they are the exceptions to the rules. 
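
The points in this message about ASCII-only names and the CJK Extension A names can be illustrated with a short sketch, assuming Python's standard `unicodedata` module (and `str.isascii`, available from Python 3.7). The sample characters are arbitrary choices, not taken from the thread.

```python
import unicodedata

# First code point of CJK Unified Ideographs Extension A.
ext_a_first = "\u3400"

# Its formal name carries only the block label and the code point.
name = unicodedata.name(ext_a_first)
print(name)

# The name still maps back to the character it came from.
round_trip = unicodedata.lookup(name)

# A few sample characters (arbitrary picks): every name is plain ASCII,
# even when the character itself is not.
samples = "/\u00ac\u00e9\u3400"
names_are_ascii = all(unicodedata.name(c).isascii() for c in samples)
print(names_are_ascii)
```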



Re: American English translation of character names

2003-12-18 Thread Peter Kirk

On 18/12/2003 02:51, Arcane Jill wrote:

> ...
> In fact, until Kenneth Whistler's email about American English - I
> actually thought the Unicode character names /were/ in American
> English, because they are certainly not in my native dialect (although
> I did know that most Americans don't say "full stop"). Rest assured,
> Kenneth, we in Britain do /not/ refer to slash as "solidus",
> underscore as "low line",  backslash as "reverse solidus", paragraph
> sign as "pilcrow sign", and so on. I have no idea where these terms
> came from, but, take it from someone who lives here, they are not in
> common usage in Britain. (With the exceptions of "full stop" and
> "anticlockwise"). Curious -- I wonder where those "official" names
> came from?

They are not the names used by British programmers. But they are perhaps 
the names which were used by British typesetters, and maybe American 
ones too, in the old days of hot metal.

> I've never attached any importance to the "proper" names (and I'm also
> a programmer). In fact, I don't even see why a Unicode character /has/
> to have a "proper name" at all. ASCII characters never had them. And,
> hey - the official names for CJK Unified Ideographs Extension A (for
> example) tell me nothing more than the script and codepoint anyway. I
> tend to regard them as "comments".

Agreed. The names are useful for selecting a character from a drop-down 
list. But they are only useful if they are accurate. I agree with Doug 
that "As a programmer, I can't personally imagine designing a program 
that relies on the Unicode names to identify characters uniquely". I 
suspect that the issue is more that WG2 people who are not programmers 
decided on behalf of programmers, but without asking them, that 
stability of names would be a good thing. And maybe because they want to 
make sure their work lasts 1000 years.

Well, I don't want to be offensive to WG2 again, so I invite WG2 members 
to correct me on this and explain why stability of character names is 
considered so important. Don't just say "we promised stability so we 
must deliver", I want to know why the promise was made and to whom. If 
the people to whom the promise was made don't actually want it, then 
maybe WG2 can be released from its unwise commitment.

--
Peter Kirk
[EMAIL PROTECTED] (personal)
[EMAIL PROTECTED] (work)
http://www.qaya.org/
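
The use mentioned above, picking a character from a list by its name, might look something like the following sketch. The helper `find_by_name` and its `limit` parameter are hypothetical names introduced for illustration; only `unicodedata.name` is a real library call.

```python
import unicodedata

def find_by_name(fragment, limit=0x3000):
    """Return (character, name) pairs whose formal name contains fragment.

    Hypothetical helper: scans code points below `limit` only, to keep
    the sketch fast; unnamed code points (e.g. controls) are skipped.
    """
    fragment = fragment.upper()
    matches = []
    for cp in range(limit):
        # The second argument is the default returned for unnamed code points.
        name = unicodedata.name(chr(cp), "")
        if fragment in name:
            matches.append((chr(cp), name))
    return matches

# What a name-based character picker might offer for "solidus".
solidus_like = find_by_name("solidus")
for ch, name in solidus_like:
    print(f"U+{ord(ch):04X}  {name}")
```

A list like this is only as helpful as the names are accurate, which is the concern raised in the message.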

RE: American English translation of character names

2003-12-18 Thread Séamas Ó Brógáin

Jill Ramonsky wrote:

> . . . I have no idea where these terms came from, but, take it from
> someone who lives here, they are not in common usage in Britain.

If you were a printer, typesetter, editor or publisher---i.e. one of 
those who _use_ all these characters and therefore must have names for 
them---you would probably be more familiar with traditional 
terminology.

Séamas Ó Brógáin

RE: American English translation of character names

2003-12-18 Thread Arcane Jill





> From: Christopher John Fynn [mailto:[EMAIL PROTECTED]]
> There is plenty of disagreement about what the "proper" name for many
> characters should be

Or, indeed, why the "proper" name for a character must be in English,
and spellable in ASCII, instead of, say, Japanese.

> From: Kenneth Whistler [mailto:[EMAIL PROTECTED]]
> And, indeed, some of us have toyed around with the notion of
> publishing an American English translation of the Unicode
> names list, including such obvious improvements as:

In fact, until Kenneth Whistler's email about American English - I
actually thought the Unicode character names were in American
English, because they are certainly not in my native dialect (although
I did know that most Americans don't say "full stop"). Rest assured,
Kenneth, we in Britain do not refer to slash as "solidus",
underscore as "low line",  backslash as "reverse solidus", paragraph
sign as "pilcrow sign", and so on. I have no idea where these terms
came from, but, take it from someone who lives here, they are not in
common usage in Britain. (With the exceptions of "full stop" and
"anticlockwise"). Curious -- I wonder where those "official" names came
from?

I've never attached any importance to the "proper" names (and I'm also
a programmer). In fact, I don't even see why a Unicode character has
to have a "proper name" at all. ASCII characters never had them. And,
hey - the official names for CJK Unified Ideographs Extension A (for
example) tell me nothing more than the script and codepoint anyway. I
tend to regard them as "comments".

Jill
