[Patch] Re: Unicode Operators cheatsheet, please!

2005-06-02 Thread Kevin Puetz
Rob Kinyon wrote:

> On 5/31/05, Sam Vilain [EMAIL PROTECTED] wrote:
>> Rob Kinyon wrote:
>>> I would love to see a document (one per editor) that describes the
>>> Unicode characters in use and how to make them. The Set implementation
>>> in Pugs uses (at last count) 20 different Unicode characters as
>>> operators.
>>
>> I have updated the unicode quickref, and started a Perlmonks discussion
>> node for this to be explored - see
>> http://www.perlmonks.org/index.pl?node_id=462246
>
> As I replied on Perlmonks, it would be more helpful if the Compose
> keys were listed and not just the ASCII versions. Plus, a quick primer
> on how to enable Unicode in your favorite editor. I don't know about
> Emacs, but the Vim documentation on multibyte is difficult to work
> with, at best.

Well, :help digraph isn't particularly bad, though the included table only
covers latin-1. The canonical source is RFC1345. But I've attached a patch
for the set symbols that have them.
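
For anyone without Vim at hand, the digraphs can be tabulated and checked
against the Unicode character names. This is only a sketch: the table is
transcribed from the patch in this message rather than from RFC1345 itself,
it covers a subset of the symbols, and the helper name describe() is made up
for illustration.

```python
import unicodedata

# Vim digraphs (typed as ^k followed by the two characters), transcribed
# from the patch in this message. Illustrative subset, not the RFC1345 table.
DIGRAPHS = {
    "!=": "≠",
    "(U": "∩",
    ")U": "∪",
    "(C": "⊂",
    ")C": "⊃",
    "(_": "⊆",
    ")_": "⊇",
    "(-": "∈",
    "*X": "×",
    "NO": "¬",
    "?2": "≈",
    "RT": "√",
    "AN": "∧",
    "OR": "∨",
}


def describe(digraph):
    """Return 'U+XXXX NAME' for a digraph in the table (hypothetical helper)."""
    char = DIGRAPHS[digraph]
    return "U+%04X %s" % (ord(char), unicodedata.name(char))


if __name__ == "__main__":
    for keys in DIGRAPHS:
        print("^k %s %s  %s" % (keys[0], keys[1], describe(keys)))
```

Printing the official names next to the key sequence makes it easier to spot
a row of the quickref that points at the wrong look-alike symbol.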

> Thanks,
> Rob
Index: docs/quickref/unicode
===
--- docs/quickref/unicode	(revision 4305)
+++ docs/quickref/unicode	(working copy)
@@ -21,6 +21,10 @@
 Note that the compose combinations here are an X11R6 standard, and do not
 necessarily correspond to the compose combinations available when you use your
 compose key.
+
+The digraphs used in vim come from Character Mnemonics & Character Sets,
+RFC1345 (http://www.ietf.org/rfc/rfc1345.txt). After doing :set digraph,
+the digraph ^k A B may also be entered as A <BS> B.
 
 Unicode ASCII     key sequence
 char    fallback  Vim     Emacs   Unix Compose Key combination
@@ -30,22 +34,22 @@
 ¥   Y   ^k Y e C-x 8 Y Compose Y =
 
 Set.pm operators (included for reference):
-≠   !=
-∩   *
-∪   +
+≠   !=  ^k ! =
+∩   *   ^k ( U
+∪   +   ^k ) U
 ∖   -
-⊂   <
-⊃   >
-⊆   <=
-⊇   >=
-⊄  !( $a < $b ) 
+⊂   <   ^k ( C
+⊃   >   ^k ) C
+⊆   <=  ^k ( _
+⊇   >=  ^k ) _
+⊄  !( $a < $b ) 
 ⊅  !( $a > $b )
 ⊈  !( $a <= $b )
 ⊉  !( $a >= $b )
-⊊   
+⊊  
 ⊋   
-∋/∍   $a.includes($b)
-∈/∊   $b.includes($a)
+∋/∍   $a.includes($b)   ^k ) -
+∈/∊   $b.includes($a)   ^k ( -
 ∌   !$a.includes($b)
 ∉   !$b.includes($a)
 
@@ -58,20 +62,20 @@
 
 So, these *might* be considered not too awful;
 
-×   *
-¬   !
+×   *   ^k * X
+¬   !   ^k N O
 ∕   /
 ≡  =:=
 ≔   :=
   ⩴ or ≝   ::=
-  ≈ or ≊~~
+  ≈ or ≊~~  ^k ? 2
 …  ...
-√  sqrt()
-∧   &&
-∨   ||
+√  sqrt()   ^k R T
+∧   &&  ^k A N
+∨   ||  ^k O R
 ∣   mod  (? bit of a stretch, perhaps)
-   ⌈$x⌉    ceil($x)
-   ⌊$x⌋    floor($x)
+   ⌈$x⌉    ceil($x)    ^k < 7 / > 7
+   ⌊$x⌋    floor($x)   ^k 7 < / 7 >
 
 
 However I think it is a BAD idea that the following unicode characters


Re: Unicode Operators cheatsheet, please!

2005-06-01 Thread Rob Kinyon
On 5/31/05, Sam Vilain [EMAIL PROTECTED] wrote:
> Rob Kinyon wrote:
>> I would love to see a document (one per editor) that describes the
>> Unicode characters in use and how to make them. The Set implementation
>> in Pugs uses (at last count) 20 different Unicode characters as
>> operators.
>
> I have updated the unicode quickref, and started a Perlmonks discussion node
> for this to be explored - see http://www.perlmonks.org/index.pl?node_id=462246

As I replied on Perlmonks, it would be more helpful if the Compose
keys were listed and not just the ASCII versions. Plus, a quick primer
on how to enable Unicode in your favorite editor. I don't know about
Emacs, but the Vim documentation on multibyte is difficult to work
with, at best.

Thanks,
Rob


Re: Unicode Operators cheatsheet, please!

2005-05-31 Thread Sam Vilain

Rob Kinyon wrote:

I would love to see a document (one per editor) that describes the
Unicode characters in use and how to make them. The Set implementation
in Pugs uses (at last count) 20 different Unicode characters as
operators.


I have updated the unicode quickref, and started a Perlmonks discussion node
for this to be explored - see http://www.perlmonks.org/index.pl?node_id=462246

Sam.


Unicode Operators cheatsheet, please!

2005-05-27 Thread Rob Kinyon
I would love to see a document (one per editor) that describes the
Unicode characters in use and how to make them. The Set implementation
in Pugs uses (at last count) 20 different Unicode characters as
operators.

While I'm sure these documents exist on the web somewhere, since P6 is
the first time most of us will be using these operators, it'd be nice
if P6 provided a nice cheatsheet for them.

Thanks,
Rob


Re: Unicode Operators cheatsheet, please!

2005-05-27 Thread Gaal Yahas
On Fri, May 27, 2005 at 10:29:39AM -0400, Rob Kinyon wrote:
> I would love to see a document (one per editor) that describes the
> Unicode characters in use and how to make them. The Set implementation
> in Pugs uses (at last count) 20 different Unicode characters as
> operators.

Good idea. A modest start is at docs/quickref/unicode .

-- 
Gaal Yahas [EMAIL PROTECTED]
http://gaal.livejournal.com/


Re: Unicode operators

2002-11-07 Thread Brad Hughes
Flaviu Turean wrote:
[...]

5. if you want to wait for the computing platforms before programming in
p6, then there is quite a wait ahead. how about platforms which will never
catch up? VMS, anyone?


Not to start an OS war thread or anything, but why do people still have
this mistaken impression of VMS?  We have compilers and hard drives and
networking and everything.  We even have color monitors.  Sure, we lack
a decent c++ compiler, but we consider that a feature.  :-)

brad




Re: Unicode operators

2002-11-07 Thread Dan Sugalski
At 1:27 PM -0800 11/6/02, Brad Hughes wrote:

Flaviu Turean wrote:
[...]

5. if you want to wait for the computing platforms before programming in
p6, then there is quite a wait ahead. how about platforms which will never
catch up? VMS, anyone?


Not to start an OS war thread or anything, but why do people still have
this mistaken impression of VMS?  We have compilers and hard drives and
networking and everything.  We even have color monitors.  Sure, we lack
a decent c++ compiler, but we consider that a feature.  :-)


Lacking a decent C++ compiler isn't necessarily a strike against 
VMS--to be a strike against, there'd actually have to *be* a decent 
C++ compiler...
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: Unicode operators

2002-11-07 Thread Kurt D. Starsinic
On Nov 07, Dan Sugalski wrote:
> Lacking a decent C++ compiler isn't necessarily a strike against
> VMS--to be a strike against, there'd actually have to *be* a decent
> C++ compiler...

Doesn't VMS have a /bin/false?

- Kurt




vote no - Re: Unicode operators [Was: Re: UTF-8 and Unicode FAQ, demos]

2002-11-06 Thread David Dyck

The first message had many of the following characters viewable in my
telnet window, but the repost introduced a 0xC2 prefix to the 0xA7 character.
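
The 0xC2 prefix is what UTF-8 produces for this character: U+00A7 is one byte
in Latin-1 but two bytes in UTF-8. A minimal sketch in Python, assuming the
repost was UTF-8 encoded and the telnet window was treating the bytes as
Latin-1:

```python
# U+00A7 SECTION SIGN is a single byte in Latin-1 but two bytes in UTF-8.
section = "\u00a7"

latin1_bytes = section.encode("latin-1")
utf8_bytes = section.encode("utf-8")

print(latin1_bytes.hex())  # a7
print(utf8_bytes.hex())    # c2a7

# A Latin-1 display that is handed the UTF-8 bytes shows two characters:
# the stray 0xC2 (A with circumflex) followed by the section sign itself.
misread = utf8_bytes.decode("latin-1")
print(repr(misread))
```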

I have this feeling that many people would vote against posting all these
funny characters, as it does make reading the perl6 mailing lists difficult
in some contexts.  Ever since these UTF-8 > 127 characters were introduced
into this mailing list, I can never be sure of what the posting author
intended to send.  I'm all for supporting UTF-8 characters in strings,
and perhaps even in variable names, but do we really have to have
perl6 programs with core operators in UTF-8?  I'd like to see all
the perl6 code that has UTF-8 operators start with  use non_portable_utf8_operators.

As it stands now, I'm going to have to find new tools for my linux platform
that has been performing fine since 1995 (perl5.9 still supports libc5!),
and I don't yet know how I am
going to be able to telnet in from win98, and I'll bet that the dos kermit that I
use when I dial up won't support UTF-8 characters either.

 David

ps.

I just read how many people will need to upgrade their operating systems
if they want to upgrade to MS Word11.

Do we want to require operating system and/or many support tools to
be upgraded before we can share perl6 scripts via email?


On Tue, 5 Nov 2002 at 09:56 -0800, Michael Lazzaro [EMAIL PROTECTED]:

  Code Symbol  Comment
  167 §  Could be used
  169 ©  Could be used
  171 «  May well be used
  172 ¬  Not?
  174 ®  Could be used
  176 °  Could be used
  177 ±  Introduces an interesting level of uncertainty?  Useable
  181 µ  Could be used
  182 ¶  Could be used
  186 º  Could be used (but I dislike it as it is alphabetic)
  187 »  May well be used
  191 ¿  Could be used




Unicode operators [Was: Re: UTF-8 and Unicode FAQ, demos]

2002-11-05 Thread Dan Kogai
On Tuesday, Nov 5, 2002, at 04:58 Asia/Tokyo, Larry Wall wrote:
> It would be really funny to use cent ¢, pound £, or yen ¥ as a sigil,
> though...

Which 'yen' ?  I believe you already know \ (U+005c - REVERSE SOLIDUS)
is printed as a yen figure on most Japanese platforms so yen is
already everywhere :)

One big problem with introducing Unicode operators is that there are too
many symbols that look the same but have different code points (the Unicode
consortium has done so to make its capitalist members happy, so their
proprietary symbols in their legacy codes are preserved).  Therefore I
object to the idea of making Unicode operators "standard", however
advanced that particular operator would be.  At the same time, things
like "use (more) operators => taste;" are very welcome.  i.e.

	use operators => "smooth";
	$hashref = ♀%hash  # U+2640 FEMALE SIGN
	$value   = $hashref♂{key}; # U+2642 MALE SIGN

> People who believe slippery slope arguments should never go skiing.

I don't want perl6 to be as "tough" as skiing, though.

> On the other hand, even the useful slippery slopes have "beginner"
> slopes.  I think one advantage of using Unicode for advanced features
> is that it *looks* scary.  So in general we should try to keep the
> basic features in ASCII, and only use Unicode where there be dragons.

Heck.  We already have source filters in perl5 and I'm pretty much sure
someone will just invent yet another 'use operators => "ascii";' kind
of stuff in perl6.  I thought "use English" was already enough.

> It will certainly be possible to write APL in Perl, but if you do,
> you'll get what you deserve.

And even APL has j.  Methinks the question is now whether you make APL
out of j or j out of APL.

弾 the ♂ with Too Many Symbols to Deal With

P.S.  Here is an even wilder idea than Unicode operators.  Why don't we
just make perl6 XML-based and allow inline objects to be operators?

<perl>
$two = $one <operator src="plus.png"> $one;
</perl>

. Yuck!


Re: Unicode operators [Was: Re: UTF-8 and Unicode FAQ, demos]

2002-11-05 Thread Richard Proctor
This UTF discussion has got silly.

I am sitting at a computer that is operating in native Latin-1 and is
quite happy - there is no likelihood that UTF* will ever reach it.

The guillemets are coming through fine, but most of the other hieroglyphs
leave a lot to be desired.

Let's consider the coding comparisons.

Chars in the range 128-159 are not defined in Latin-1 (issue 1) and are
used differently by Windows (later issues), so should be avoided.

Chars in the range 160-191 (which include the guillemets) are coming through
fine if encoded by the sender as UTF8.

Anything in the range 192-255 is encoded differently and thus should be
avoided.

Therefore the only additional characters that could be used, that will work
under UTF8 and Latin-1 and Windows are:

Code Symbol  Comment
160 Non-breaking space (map to normal whitespace)
161 ¡   Could be used
162 ¢   Could be used
163 £   Could be used
164 ¤   Could be used
165 ¥   Could be used
166 ¦   Could be used
167 §   Could be used
168 ¨   Could be used though risks confusion with 
169 ©   Could be used
170 ª   Could be used (but I dislike it as it is alphabetic)
171 «   May well be used
172 ¬   Not?
173 ­   Nonbreaking - treat as the same
174 ®   Could be used
175 ¯   May cause confusion with _ and -
176 °   Could be used
177 ±   Introduces an interesting level of uncertainty?  Useable
178 ²   To the power of 2 (squaring ? ) Otherwise best avoided
179 ³   Cubing? Otherwise best avoided
180 ´   Too confusing with ' and `
181 µ   Could be used
182 ¶   Could be used
183 ·   Dot Product? though likely to be confused with .
184 ¸   treat as ,
185 ¹   To the power 1? Probably best avoided
186 º   Could be used (but I dislike it as it is alphabetic)
187 »   May well be used
188 ¼   Could be used
189 ½   Could be used
190 ¾   Could be used
191 ¿   Could be used
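
The byte ranges above can be checked mechanically. The sketch below (Python,
with a made-up helper name) assumes that "coming through fine" means the final
UTF-8 byte is the same as the Latin-1 byte, so a Latin-1 display shows the
right glyph after a stray prefix byte; that reading is an assumption, not
something stated in the message.

```python
def utf8_tail_matches_latin1(code):
    """True if the last UTF-8 byte of chr(code) equals its Latin-1 byte."""
    ch = chr(code)
    return ch.encode("utf-8")[-1:] == ch.encode("latin-1")


# 160-191 encode as 0xC2 followed by the unchanged byte; 192-255 encode as
# 0xC3 followed by a different byte, so only the first range survives.
same = [c for c in range(160, 256) if utf8_tail_matches_latin1(c)]
print(same[0], same[-1])

# 128-159: Latin-1 has control codes here, Windows-1252 has printable glyphs.
windows_0x80 = bytes([0x80]).decode("cp1252")
print(windows_0x80)
```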

Richard 

-- 
Personal [EMAIL PROTECTED]      http://www.waveney.org
Telecoms [EMAIL PROTECTED]      http://www.WaveneyConsulting.com
Web services [EMAIL PROTECTED]  http://www.wavwebs.com
Independent Telecomms Specialist, ATM expert, Web Analyst & Services




Re: Unicode operators [Was: Re: UTF-8 and Unicode FAQ, demos]

2002-11-05 Thread Michael Lazzaro
Thanks, I've been hoping for someone to post that list.  Taking it one 
step further, we can assume that the only chars that can be used are 
those which:

-- don't have an obvious meaning that needs to be reserved
-- appear decently on all platforms
-- are distinct and recognizable in the tiny font sizes
 used when programming

Comparing your list with mine, with some subjective editing based on my 
small courier font, that chops the list of usable operators down to 
only a handful:

Code	Symbol	Comment
167	§	Could be used
169	©	Could be used
171	«	May well be used
172	¬	Not?
174	®	Could be used
176	°	Could be used
177	±	Introduces an interesting level of uncertainty?  Useable
181	µ	Could be used
182	¶	Could be used
186	º	Could be used (but I dislike it as it is alphabetic)
187	»	May well be used
191	¿	Could be used


That's all.  A shame, because some of the others have very interesting 
possibilities:

   • ≠ ø † ∑ ∂ ƒ ∆ ≤ ≥ ∫ ≈ Ω ‡ ± ˇ ∏ Æ

But if Windows can't easily do them, that's a pretty big problem.  
Thanks for the list.

MikeL



Re: Unicode operators [Was: Re: UTF-8 and Unicode FAQ, demos]

2002-11-05 Thread Jonathan Scott Duff

I'm all for one or two unicode operators if they're chosen properly
(and I trust Larry to do that since he's done a stellar job so far),
but what's the mechanism to generate unicode operators if you don't
have access to a unicode-aware editor/terminal/font/etc.?  Is the only
recourse to use the named versions?  Or will there be some sort of
digraph/trigraph/whatever sequence that always gives us the operator
we need?  Something like \x[263a] but in regular code and not just
quote-ish contexts:  

$campers = $a \x[263a] $b   # make $a and $b happy
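
Such an escape could be handled by a preprocessing pass over the source. The
sketch below is Python rather than Perl 6 and is purely hypothetical: the
\x[HHHH] syntax outside quotes is only the suggestion made above, and
expand_escapes() is an invented name. It does no range checking on the code
point.

```python
import re

# Hypothetical escape syntax from the message above: \x[HHHH] in plain code.
ESCAPE = re.compile(r"\\x\[([0-9A-Fa-f]{1,6})\]")


def expand_escapes(source):
    """Replace every \\x[HHHH] escape with the character it names."""
    return ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), source)


line = r"$campers = $a \x[263a] $b"
print(expand_escapes(line))
```

The ASCII spelling stays typeable everywhere, and anyone with a Unicode-aware
editor can run the same pass once to see the intended operator.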

-Scott
-- 
Jonathan Scott Duff
[EMAIL PROTECTED]



Re: Unicode operators [Was: Re: UTF-8 and Unicode FAQ, demos]

2002-11-05 Thread Smylers
Dan Kogai wrote:

> We already have source filters in perl5 and I'm pretty much sure
> someone will just invent yet another 'use operators => "ascii";' kind
> of stuff in perl6.

I think that's backwards to have operators being funny characters by
default but requiring explicit declaration to use well-known Ascii
characters.

Doing it t'other way round would mean that you can always write fully
portable code fragments in pure Ascii, something that'd be helpful on
mailing lists and the like.

There could be an alias syntax for people in an environment where they'd
prefer to have a non-Ascii character in place of a conglomerate of Ascii
symbols, maybe:

  treat '»...«' as '[...]';

That has the documentational advantage that any non-Ascii character used
in code must be declared earlier in that file.  And even if the
non-Ascii character gets warped in the post and displays oddly for you,
you can still see what the author intended it to do.
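
To make the idea concrete, here is a rough sketch of a filter that reads such
declarations and rewrites the file to pure Ascii. It is written in Python, the
"treat ... as ..." statement is only the hypothetical syntax proposed above,
and to_ascii() is an invented name; real Perl 6 may end up with nothing of the
kind.

```python
import re

# Hypothetical declaration syntax from this message: treat 'X...Y' as 'A...B';
TREAT = re.compile(
    r"^\s*treat\s+'(.+?)\.\.\.(.+?)'\s+as\s+'(.+?)\.\.\.(.+?)';\s*$"
)


def to_ascii(source):
    """Rewrite declared non-Ascii delimiters to their Ascii spelling."""
    aliases = {}
    out = []
    for line in source.splitlines():
        m = TREAT.match(line)
        if m:
            left, right, ascii_left, ascii_right = m.groups()
            aliases[left] = ascii_left
            aliases[right] = ascii_right
            continue  # the declaration itself is dropped from the output
        for fancy, plain in aliases.items():
            line = line.replace(fancy, plain)
        out.append(line)
    return "\n".join(out)


example = "treat '»...«' as '[...]';\n@c = @a »+« @b;"
print(to_ascii(example))
```

Because the alias must be declared before use, the filter only needs one
forward pass, and a character that was never declared is left untouched.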

This has the risk that Damian described of everybody defining their own
operators, but I think that's unlikely.  There's likely to be a
convention used by many people, at least those who operate in a given
character set.  This way also permits those who live in a Latin 2 (or
whatever) world to have their own convention using characters that make
sense to them.

Smylers



Re: Unicode operators [Was: Re: UTF-8 and Unicode FAQ, demos]

2002-11-05 Thread Smylers
Richard Proctor wrote:

> I am sitting at a computer that is operating in native Latin-1 and is
> quite happy - there is no likelihood that UTF* is ever likely to reach
> it.
>
> ... Therefore the only additional characters that could be used, that
> will work under UTF8 and Latin-1 and Windows ...

What about people who don't use Latin-1, perhaps because their native
language uses Latin-2 or some other character set mutually exclusive
with Latin-1?

I don't have a Latin-2 ('Central and East European languages') typeface
handy, but its manpage includes:

  253   171   AB LATIN CAPITAL LETTER T WITH CARON
  273   187   BB LATIN SMALL LETTER T WITH CARON

Caron is sadly missing from my dictionary so I'm not sure what those
would look like, but I suspect they wouldn't be great symbols for vector
operators.
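
The three columns in that excerpt are the same byte in octal, decimal and hex,
and the names can be looked up directly. A small sketch in Python, using the
standard iso8859_1 and iso8859_2 codecs; name_in() is just an illustrative
helper:

```python
import unicodedata


def name_in(charset, code):
    """Unicode name of the character that byte `code` maps to in `charset`."""
    return unicodedata.name(bytes([code]).decode(charset))


# Columns in the manpage excerpt: octal, decimal, hex.
for octal in ("253", "273"):
    code = int(octal, 8)
    for charset in ("iso8859_1", "iso8859_2"):
        print(octal, code, "%02X" % code, charset, name_in(charset, code))
```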

> 171   «   May well be used

Also I wonder how similar to doubled less-than or greater-than signs
guillemets would look.  In this font they're fine, but I'm concerned at
my abilities to make them sufficiently distinguishable on a whiteboard,
and whether publishers will cope with them (compare a recent discussion
on 'use Perl' regarding curly quotes and fi ligatures appearing in
code samples).

Smylers



Re: Unicode operators [Was: Re: UTF-8 and Unicode FAQ, demos]

2002-11-05 Thread Richard Proctor
On Tue 05 Nov, Smylers wrote:
> Richard Proctor wrote:
>
>> I am sitting at a computer that is operating in native Latin-1 and is
>> quite happy - there is no likelihood that UTF* is ever likely to reach
>> it.
>>
>> ... Therefore the only additional characters that could be used, that
>> will work under UTF8 and Latin-1 and Windows ...
>
> What about people who don't use Latin-1, perhaps because their native
> language uses Latin-2 or some other character set mutually exclusive
> with Latin-1?


Once you go beyond Latin-1 there is nothing common anyway.  The guillemets
become T and t with inverted hats under Latin-2, oe and G with an inverted
hat under Latin-3, oe and G with a squiggle under it under Latin-4, no
meaning and a stylised K for Latin-5, (can't find Latin-6), guillemets under
Latin-7, nothing under Latin-8.
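
A survey like this can be reproduced by decoding the two guillemet bytes under
each ISO 8859 part. The sketch below uses Python's standard iso8859_N codecs.
Note the assumption baked into it: it loops over ISO 8859 part numbers, which
are not the same as the "Latin-N" names (part 5 is Cyrillic, part 7 is Greek,
and Latin-5 is part 9), so its rows may not line up one-to-one with the list
above.

```python
def glyphs(part):
    """Decode bytes 0xAB and 0xBB in ISO-8859-<part>.

    Bytes that are undefined in a given part come back as U+FFFD.
    """
    return bytes([0xAB, 0xBB]).decode("iso8859_%d" % part, errors="replace")


for part in range(1, 10):
    left, right = glyphs(part)
    print("ISO-8859-%d: %s %s" % (part, left, right))
```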

Richard

-- 
Personal [EMAIL PROTECTED]      http://www.waveney.org
Telecoms [EMAIL PROTECTED]      http://www.WaveneyConsulting.com
Web services [EMAIL PROTECTED]  http://www.wavwebs.com
Independent Telecomms Specialist, ATM expert, Web Analyst & Services




Re: Unicode operators

2002-11-05 Thread Flaviu Turean
one more data point from a person who lived, travelled and used computers
in a few countries (Romania, France, Germany, Belgium, UK, Canada, US,
Holland, Italy). paraphrasing:

rule 1: if it's not on my keyboard, it doesn't exist;
rule 2: if it's not on everybody's keyboard, it doesn't exist.


long, windy argument:

1. enter an internet cafe in Amsterdam, read your account in the web
browser. you get a window, it's hard to guess which OS is underneath. all
you get is a browser window, full screen. you are on the perl6-language
mailing list. before even contributing to the list you need to configure
your keyboard, and you have to figure out how. and you have to trust the
OS and browser installation to correctly transfer the funnies;

2. different keyboards have different symbols on them. did you know that
the UK keyboard is different from the US one? Belgium has two national
keyboards (Vallon and Flemish), the Vallon one is different from the one
used in France (and from the one used in Quebec), the Flemish one
different from the one used in Holland, and so on;

3. backquote is not on all keyboards, similarly the curlies. some have a
funny quote (oblique), which doesn't transfer/translate well, and which,
visually, seems fine until you run it through the interpreter;

4.  everybody is doing it! first one is free!
actually, it is like the other favourite pastime: everybody is doing it,
but the first time hurts the most (of the people ;-)

setting it up is difficult, afterwards yes, it may come up fine for more
symbols;

5. if you want to wait for the computing platforms before programming in
p6, then there is quite a wait ahead. how about platforms which will never
catch up? VMS, anyone?

6.  they'll catch up with p6 and employ Unicode, or they'll die
or the other way 'round;

7. I type this on a Solaris box, telnet'd into a Linux box, I run pine
(please _do_not_ ask people to change application so that they become
worthy of reading your messages!). accented letters don't go through;

8.  and  are not exactly common in non-Latin scripts. one more alien
symbol to learn for those who started their lives in scripts like Chinese,
Japanese, Hindi, Arabic, etc.;

9.  now you have the set-up of a six-year old Swiss
can the six-year old explain how he did it?

10. fearless leaders listen to their constituency and act accordingly,
this is the only way they can remain fearless.

still reading?
flaviu





Re: Unicode operators [Was: Re: UTF-8 and Unicode FAQ, demos]

2002-11-05 Thread Michael Lazzaro

As one of the instigators of this thread, I submit that we've probably 
argued about the Unicode stuff enough.  The basic issues are now known, 
and it's known that there's no general agreement on any of this stuff, 
nor will there ever be.  To wit:

-- Extended glyphs might be extremely useful in extending the operator 
table in non-ambiguous ways, especially for advanced things like «op»..

-- Many people loathe the idea, and predict newcomers will too.

-- Many mailers & older platforms tend to react badly for both viewing 
and inputting.

-- If extended characters are used at all, the decision needs to be 
made whether they shall be least-common-denominator Latin1, UTF-8, or 
full Unicode, and if there are backup spellings so that everyone can 
play.

It's up to Larry, and he knows where we're all coming from.  Unless 
anyone has any _new_ observations, I propose we pause the debate until 
a decision is reached?

MikeL